LLMs Are Refrigeration. The Coca-Cola of AI Has Not Been Built Yet.

Abhishek Gautam · 9 min read

Quick summary

Chamath Palihapitiya says LLM builders are the refrigerator inventors — the real money goes to the Coca-Colas that use AI as infrastructure. Here are the companies that have the extra ingredient.

The person who invented refrigeration made some money. Most of the money was made by Coca-Cola, which used refrigeration to build an empire.

That is Chamath Palihapitiya's analogy for large language models — and it is the most accurate framing of the AI economy that anyone has produced in plain language. Palihapitiya, the former Facebook VP turned venture capitalist behind Social Capital and the All-In Podcast, has been saying versions of this publicly since early 2026. He also said, separately, that AI model building is "a money trap" — an investment that compounds losses rather than value for the builders themselves.

Both things can be true: LLMs are transformational and the companies building them may not be the ones that get rich from the transformation.

This post breaks down why the analogy holds, which companies have the "one extra ingredient" Chamath describes, and what it means practically if you are a developer or founder building on top of AI right now.

Why the Refrigeration Analogy Is Exactly Right

Refrigeration was invented in the mid-1800s. The companies that pioneered industrial cooling equipment — Linde, Carrier, York — are real businesses that still exist. None of them became dominant in the way that Coca-Cola, Anheuser-Busch, or the entire global cold chain logistics industry did. Refrigeration became infrastructure. The value migrated to whoever used that infrastructure to do something nobody else could do.

The parallel to LLMs is precise on several dimensions.

The technology is rapidly commoditising. In 2023, GPT-4 had a meaningful capability lead over every other model. By 2026, GPT-4o, Claude 3.7 Sonnet, Gemini 2.0 Flash, Qwen 3.5, and Mistral Small 4 all perform competitively on most tasks at dramatically lower prices. The API cost for one million tokens of GPT-4-class intelligence has fallen roughly 97% since 2023. Infrastructure that drops 97% in cost in two years is on a clear path to commodity pricing.

The builders are spending more than they earn. OpenAI is not profitable. Anthropic is not profitable. Google DeepMind is a cost centre within Alphabet. The capital required to stay competitive at the frontier — training runs, inference infrastructure, talent — is measured in billions per year. Chamath called this out explicitly: "AI model building is a money trap. There is no bounding law like Moore's law where advances can be predictably expected." Without a predictable improvement curve, you cannot plan R&D spending, and without profitable unit economics, you are burning investor capital in a race that may have no finish line.

The infrastructure is already abstracted. Developers do not need to understand transformer architectures to call the Claude API. Just as a Coca-Cola bottling plant does not need to understand the thermodynamics of refrigeration, a startup building on LLMs does not need to train models. The abstraction is complete enough that the underlying technology has become, functionally, a utility.
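The abstraction really is that thin. As a minimal sketch (assuming the general shape of a hosted chat-completion API such as Anthropic's Messages endpoint — the model name and field layout here are illustrative, not authoritative), the whole "refrigeration plant" collapses into constructing one JSON payload:

```python
import json

# Endpoint and model name are illustrative assumptions.
API_URL = "https://api.anthropic.com/v1/messages"

def build_llm_request(prompt: str, model: str = "claude-sonnet-4-5") -> dict:
    """Return the JSON body for a single-turn chat completion.
    No transformer internals are involved -- just an HTTP payload."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_llm_request("Summarise this contract clause in one sentence.")
print(json.dumps(payload, indent=2))
```

Everything below this payload — training runs, inference clusters, model weights — is someone else's capital expenditure, which is precisely what makes it a utility.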

The One Extra Ingredient

Palihapitiya's second insight is the actionable one. Give Facebook, Microsoft, Google, and Amazon the same 1,000 inputs — they produce the same machine learning model. But give one company one extra thing that none of the others have, and the output can be markedly different. His analogy: two great chefs with three ingredients versus a third chef with four. The fourth ingredient is not additive — it is transformative.

That extra ingredient, in practice, is proprietary data that cannot be replicated by training on public sources.

This is not the same as "having lots of data." Every company has data. The question is whether your data is unique, time-stamped to real events, and reflects patterns that the public internet does not contain.

Here is what that actually looks like across sectors:

Stripe processes approximately $1.4 trillion in payments annually across millions of businesses. The fraud patterns, payment failure modes, dispute rates by merchant category, and cross-business revenue benchmarks that exist in Stripe's dataset do not exist anywhere else at that resolution. When Stripe trains models on that data to predict churn, flag fraud, or recommend pricing, it produces outputs that GPT-4o cannot replicate — not because GPT-4o is less capable, but because it has never seen that data.

Bloomberg has 40 years of financial time-series data, earnings call transcripts, and real-time market signals tied to actual trades. Bloomberg's terminal is $24,000 per year per seat in 2026 because the data behind it has compounding value that public sources cannot match. Bloomberg Intelligence, their AI layer on top of that data, is not replicable by a startup calling the Gemini API.

Epic Systems holds electronic health records for roughly 280 million patients in the United States. The longitudinal outcome data — what treatments worked, what failed, what dosing protocols correlate with readmission rates — is decades of structured clinical information that exists in no public dataset. A model trained only on PubMed is not competitive with an AI model trained on Epic's data and deployed inside Epic's ecosystem for clinical decision support. The data is the moat, not the model.

Palantir has operational data from defence, intelligence, and enterprise deployments going back to 2004. The edge cases, failure modes, and system-level behaviours that exist in classified and regulated environments are simply not in the training data of any public model. When Palantir integrates LLMs into its Gotham and AIP platforms, it wraps them in proprietary ontologies built from 20 years of hard-won operational context.

The pattern is the same in each case: a long-running data collection advantage, tied to real transactions or outcomes, in a domain where public alternatives are structurally absent.

The Collapse of Terminal Value

Palihapitiya has also written about a related concern he calls "the collapse of terminal value." The traditional way investors value a company is to project future earnings and discount them back to today. That projection relies on competitive moats lasting long enough to produce sustained earnings.

AI is compressing the time it takes to erode those moats. A feature that took three years to build in 2019 can be scaffolded in three months with AI assistance in 2026. Distribution advantages that depended on sales team size are less durable when AI agents can replace the top of the sales funnel. Brand advantages that depended on content volume are less durable when content production costs approach zero.

The companies whose moats survive this compression are exactly the ones with the extra ingredient Chamath describes — data that cannot be generated, only accumulated. You cannot use AI to manufacture 20 years of Stripe transaction data. You cannot use Claude to synthesise 40 years of Bloomberg time-series. The data advantage is durable precisely because it is historical and proprietary, not because it is large.

This creates a clear dividing line in the AI economy: companies with irreplaceable data assets are building durable value. Companies without them are building features that will be commoditised by the next model release.

What This Means for Developers and Founders Building on AI Today

The refrigeration analogy has a direct implication for anyone building a product on top of LLM APIs right now.

If your entire product value comes from wrapping an LLM, you are the refrigeration patent holder, not Coca-Cola. The next model release may render your differentiation irrelevant. This is not hypothetical — it has already happened to dozens of AI writing tools that were genuinely differentiated in 2023 and are indistinguishable from a GPT wrapper by 2026.

The durable play is to use AI to accumulate the data that becomes the moat. Every interaction your product has with users is potentially the extra ingredient — if you collect it, structure it, and use it to train or fine-tune models that new entrants cannot replicate. This is how Duolingo built a durable product: not because their language learning app is technically complex, but because 500 million learners have generated a dataset of language acquisition patterns that no competitor can reproduce from scratch.

Domain specificity is more valuable than general capability. A model fine-tuned on 10 years of legal contract disputes is more useful to a law firm than GPT-4o, not because it is smarter overall, but because its output is calibrated to the edge cases and risk patterns that matter in that domain. The investment thesis for AI in 2026 is: who has the domain data and the distribution to deploy a model that a general-purpose provider cannot match?
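The domain-specific play described above usually begins with converting proprietary records into supervised training pairs. A minimal sketch of that step, assuming hypothetical dispute records and the chat-style JSONL layout most fine-tuning APIs accept (the field names and example clauses are invented for illustration):

```python
import json

# Hypothetical proprietary records: past contract clauses and assessed outcomes.
# Field names and contents are illustrative assumptions, not a real schema.
disputes = [
    {
        "clause": "Supplier may terminate with 30 days notice.",
        "risk": "high",
        "rationale": "No cure period; termination for convenience favours the supplier.",
    },
    {
        "clause": "Liability capped at fees paid in the prior 12 months.",
        "risk": "medium",
        "rationale": "Standard cap, but lacks a data-breach carve-out.",
    },
]

def to_training_example(record: dict) -> dict:
    """Convert one dispute record into a chat-style fine-tuning example."""
    return {
        "messages": [
            {"role": "user",
             "content": f"Assess the risk of this clause: {record['clause']}"},
            {"role": "assistant",
             "content": f"Risk: {record['risk']}. {record['rationale']}"},
        ]
    }

# Fine-tuning APIs typically expect one JSON object per line (JSONL).
jsonl = "\n".join(json.dumps(to_training_example(r)) for r in disputes)
print(jsonl)
```

The value is entirely in the left-hand side of this conversion — the accumulated dispute records — not in the few lines of glue code, which any competitor can write.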

The "one extra ingredient" framework is a product question, not a technology question. Founders should ask: what data does my product generate that nobody else can get? What patterns exist in my user behaviour, transactions, or outcomes that are not in any public training set? That data, accumulated over time, is the business. The LLM is the refrigeration.
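Operationally, "collect it and structure it" can start as simply as logging every interaction with enough context to be trainable later. A minimal sketch, where the schema and field names are my assumptions rather than a prescribed standard:

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class InteractionEvent:
    """One structured record of a user interaction -- the raw material
    of a future proprietary dataset. Fields are illustrative."""
    user_id: str
    action: str          # what the user was doing
    model_output: str    # what the product suggested
    accepted: bool       # did the user accept the suggestion?
    ts: float            # Unix timestamp

def log_event(store: list, user_id: str, action: str,
              output: str, accepted: bool) -> None:
    """Append a structured event to the store (a list here; a database in practice)."""
    store.append(InteractionEvent(user_id, action, output, accepted, time.time()))

events: list = []
log_event(events, "u42", "draft_email", "Suggested subject line A", accepted=True)
log_event(events, "u42", "draft_email", "Suggested subject line B", accepted=False)

# Acceptance labels like these are exactly what public training sets lack.
print(json.dumps([asdict(e) for e in events], indent=2))
```

The accept/reject signal is the point: it records an outcome tied to a real user decision, which is the kind of pattern no public corpus contains.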

Which Sectors Have the Extra Ingredient

The sectors where the Coca-Cola dynamic is most likely to play out:

Healthcare and clinical AI — patient outcome data, diagnostic imaging, treatment protocols. Epic, Cerner, and the large health systems are best positioned. Startups like Abridge (clinical documentation) are accumulating real clinical conversation data that is inherently scarce.

Financial services — transaction patterns, risk signals, portfolio behaviour. Stripe, Plaid, and the major prime brokers. Bloomberg and FactSet on the data side. Trading firms with proprietary order flow data.

Legal and compliance — contract language, dispute outcomes, regulatory interpretation. Harvey AI is building this by ingesting the internal knowledge of major law firms. The data accumulated through paid engagements is not available to competitors.

Industrial and manufacturing — sensor data from physical equipment, failure modes, supply chain signals. Palantir, Samsara, and the industrial IoT players. The time-series data from a manufacturing line running for a decade is worth more than any synthetic dataset.

Logistics and supply chain — real-time routing data, demand signals, carrier behaviour. Flexport, FourKites, project44. The proprietary signals in live shipment data are not reproducible from public sources.

Key Takeaways

  • Chamath Palihapitiya's refrigeration analogy: LLMs are the infrastructure, like refrigeration — the companies building models will make some money, but the real wealth goes to the Coca-Colas that use LLMs to build something nobody else can replicate
  • "AI model building is a money trap" — his exact words: no predictable improvement curve, no profitable unit economics, capital requirements measured in billions per year
  • The one extra ingredient is proprietary data: give every major tech company the same inputs and they produce the same model — one extra data source no one else has produces markedly different output
  • Real examples: Stripe ($1.4T annual payments data), Bloomberg (40 years of financial time-series), Epic (280M patient records), Palantir (20 years of classified operational context)
  • The collapse of terminal value: AI is compressing the time it takes to erode competitive moats — only data that cannot be generated, only accumulated, survives the compression
  • For founders and developers: if your entire product value comes from wrapping an LLM, you are the refrigeration patent holder; the durable play is using AI to accumulate the proprietary data that becomes the moat
  • Sectors most likely to produce Coca-Cola outcomes: healthcare AI (clinical outcome data), financial services (transaction and risk data), legal (contract and dispute data), industrial IoT (equipment sensor data), logistics (real-time supply chain signals)


Written by

Abhishek Gautam

Software Engineer based in Delhi, India. Writes about AI models, semiconductor supply chains, and tech geopolitics — covering the intersection of infrastructure and global events. 355+ posts cited by ChatGPT, Perplexity, and Gemini. Read in 121 countries.