LLMs Are Refrigeration. The Coca-Cola of AI Has Not Been Built Yet.

Abhishek Gautam · 9 min read

Quick summary

Chamath Palihapitiya says LLM builders are the refrigerator inventors — the real money goes to the Coca-Colas that use AI as infrastructure. Here are the companies that have the extra ingredient.

The person who invented refrigeration made some money. Most of the money was made by Coca-Cola, which used refrigeration to build an empire.

That is Chamath Palihapitiya's analogy for large language models — and it is the most accurate framing of the AI economy that anyone has produced in plain language. Palihapitiya, the former Facebook VP turned venture capitalist behind Social Capital and the All-In Podcast, has been saying versions of this publicly since early 2026. He also said, separately, that AI model building is "a money trap" — an investment that compounds losses rather than value for the builders themselves.

Both things can be true: LLMs are transformational and the companies building them may not be the ones that get rich from the transformation.

This post breaks down why the analogy holds, which companies have the "one extra ingredient" Chamath describes, and what it means practically if you are a developer or founder building on top of AI right now.

Why the Refrigeration Analogy Is Exactly Right

Refrigeration was invented in the mid-1800s. The companies that pioneered industrial cooling equipment — Linde, Carrier, York — are real businesses that still exist. None of them became dominant in the way that Coca-Cola, Anheuser-Busch, or the entire global cold chain logistics industry did. Refrigeration became infrastructure. The value migrated to whoever used that infrastructure to do something nobody else could do.

The parallel to LLMs is precise on several dimensions.

The technology is rapidly commoditising. In 2023, GPT-4 had a meaningful capability lead over every other model. By 2026, GPT-4o, Claude 3.7 Sonnet, Gemini 2.0 Flash, Qwen 3.5, and Mistral Small 4 all perform competitively on most tasks at dramatically lower prices. The API cost for one million tokens of GPT-4-class intelligence has fallen roughly 97% since 2023. Infrastructure that drops 97% in cost in two years is on a clear path to commodity pricing.

The builders are spending more than they earn. OpenAI is not profitable. Anthropic is not profitable. Google DeepMind is a cost centre within Alphabet. The capital required to stay competitive at the frontier — training runs, inference infrastructure, talent — is measured in billions per year. Chamath called this out explicitly: "AI model building is a money trap. There is no bounding law like Moore's law where advances can be predictably expected." Without a predictable improvement curve, you cannot plan R&D spending, and without profitable unit economics, you are burning investor capital in a race that may have no finish line.

The infrastructure is already abstracted. Developers do not need to understand transformer architectures to call the Claude API. Just as a Coca-Cola bottling plant does not need to understand the thermodynamics of refrigeration, a startup building on LLMs does not need to train models. The abstraction is complete enough that the underlying technology has become, functionally, a utility.
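The abstraction really is that thin. As a minimal sketch (assuming the general shape of a hosted chat-completion API such as Anthropic's Messages endpoint — the model name and field layout here are illustrative, not authoritative), the whole "refrigeration plant" collapses into constructing one JSON payload:

```python
import json

# Endpoint and model name are illustrative assumptions.
API_URL = "https://api.anthropic.com/v1/messages"

def build_llm_request(prompt: str, model: str = "claude-sonnet-4-5") -> dict:
    """Return the JSON body for a single-turn chat completion.
    No transformer internals are involved -- just an HTTP payload."""
    return {
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_llm_request("Summarise this contract clause in one sentence.")
print(json.dumps(payload, indent=2))
```

Everything below this payload — training runs, inference clusters, model weights — is someone else's capital expenditure, which is precisely what makes it a utility.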

The One Extra Ingredient

Palihapitiya's second insight is the actionable one. Give Facebook, Microsoft, Google, and Amazon the same 1,000 inputs — they produce the same machine learning model. But give one company one extra thing that none of the others have, and the output can be markedly different. His analogy: two great chefs with three ingredients versus a third chef with four. The fourth ingredient is not additive — it is transformative.

That extra ingredient, in practice, is proprietary data that cannot be replicated by training on public sources.

This is not the same as "having lots of data." Every company has data. The question is whether your data is unique, time-stamped to real events, and reflects patterns that the public internet does not contain.

Here is what that actually looks like across sectors:

Stripe processes approximately $1.4 trillion in payments annually across millions of businesses. The fraud patterns, payment failure modes, dispute rates by merchant category, and cross-business revenue benchmarks that exist in Stripe's dataset do not exist anywhere else at that resolution. When Stripe trains models on that data to predict churn, flag fraud, or recommend pricing, it produces outputs that GPT-4o cannot replicate — not because GPT-4o is less capable, but because it has never seen that data.

Bloomberg has 40 years of financial time-series data, earnings call transcripts, and real-time market signals tied to actual trades. Bloomberg's terminal is $24,000 per year per seat in 2026 because the data behind it has compounding value that public sources cannot match. Bloomberg Intelligence, their AI layer on top of that data, is not replicable by a startup calling the Gemini API.

Epic Systems holds electronic health records for roughly 280 million patients in the United States. The longitudinal outcome data — what treatments worked, what failed, what dosing protocols correlate with readmission rates — is decades of structured clinical information that exists in no public dataset. A model trained only on PubMed is not competitive with an AI model trained on Epic's data and deployed inside Epic's ecosystem for clinical decision support. The data is the moat, not the model.

Palantir has operational data from defence, intelligence, and enterprise deployments going back to 2004. The edge cases, failure modes, and system-level behaviours that exist in classified and regulated environments are simply not in the training data of any public model. When Palantir integrates LLMs into its Gotham and AIP platforms, it wraps them in proprietary ontologies built from 20 years of hard-won operational context.

The pattern is the same in each case: a long-running data collection advantage, tied to real transactions or outcomes, in a domain where public alternatives are structurally absent.

The Collapse of Terminal Value

Palihapitiya has also written about a related concern he calls "the collapse of terminal value." The traditional way investors value a company is to project future earnings and discount them back to today. That projection relies on competitive moats lasting long enough to produce sustained earnings.

AI is compressing the time it takes to erode those moats. A feature that took three years to build in 2019 can be scaffolded in three months with AI assistance in 2026. Distribution advantages that depended on sales team size are less durable when AI agents can replace the top of the sales funnel. Brand advantages that depended on content volume are less durable when content production costs approach zero.

The companies whose moats survive this compression are exactly the ones with the extra ingredient Chamath describes — data that cannot be generated, only accumulated. You cannot use AI to manufacture 20 years of Stripe transaction data. You cannot use Claude to synthesise 40 years of Bloomberg time-series. The data advantage is durable precisely because it is historical and proprietary, not because it is large.

This creates a clear dividing line in the AI economy: companies with irreplaceable data assets are building durable value. Companies without them are building features that will be commoditised by the next model release.

What This Means for Developers and Founders Building on AI Today

The refrigeration analogy has a direct implication for anyone building a product on top of LLM APIs right now.

If your entire product value comes from wrapping an LLM, you are the refrigeration patent holder, not Coca-Cola. The next model release may render your differentiation irrelevant. This is not hypothetical — it has already happened to dozens of AI writing tools that were genuinely differentiated in 2023 and are indistinguishable from a GPT wrapper by 2026.

The durable play is to use AI to accumulate the data that becomes the moat. Every interaction your product has with users is potentially the extra ingredient — if you collect it, structure it, and use it to train or fine-tune models that new entrants cannot replicate. This is how Duolingo built a durable product: not because their language learning app is technically complex, but because 500 million learners have generated a dataset of language acquisition patterns that no competitor can reproduce from scratch.

Domain specificity is more valuable than general capability. A model fine-tuned on 10 years of legal contract disputes is more useful to a law firm than GPT-4o, not because it is smarter overall, but because its output is calibrated to the edge cases and risk patterns that matter in that domain. The investment thesis for AI in 2026 is: who has the domain data and the distribution to deploy a model that a general-purpose provider cannot match?
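The domain-specific play described above usually begins with converting proprietary records into supervised training pairs. A minimal sketch of that step, assuming hypothetical dispute records and the chat-style JSONL layout most fine-tuning APIs accept (the field names and example clauses are invented for illustration):

```python
import json

# Hypothetical proprietary records: past contract clauses and assessed outcomes.
# Field names and contents are illustrative assumptions, not a real schema.
disputes = [
    {
        "clause": "Supplier may terminate with 30 days notice.",
        "risk": "high",
        "rationale": "No cure period; termination for convenience favours the supplier.",
    },
    {
        "clause": "Liability capped at fees paid in the prior 12 months.",
        "risk": "medium",
        "rationale": "Standard cap, but lacks a data-breach carve-out.",
    },
]

def to_training_example(record: dict) -> dict:
    """Convert one dispute record into a chat-style fine-tuning example."""
    return {
        "messages": [
            {"role": "user",
             "content": f"Assess the risk of this clause: {record['clause']}"},
            {"role": "assistant",
             "content": f"Risk: {record['risk']}. {record['rationale']}"},
        ]
    }

# Fine-tuning APIs typically expect one JSON object per line (JSONL).
jsonl = "\n".join(json.dumps(to_training_example(r)) for r in disputes)
print(jsonl)
```

The value is entirely in the left-hand side of this conversion — the accumulated dispute records — not in the few lines of glue code, which any competitor can write.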

The "one extra ingredient" framework is a product question, not a technology question. Founders should ask: what data does my product generate that nobody else can get? What patterns exist in my user behaviour, transactions, or outcomes that are not in any public training set? That data, accumulated over time, is the business. The LLM is the refrigeration.
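Operationally, "collect it and structure it" can start as simply as logging every interaction with enough context to be trainable later. A minimal sketch, where the schema and field names are my assumptions rather than a prescribed standard:

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class InteractionEvent:
    """One structured record of a user interaction -- the raw material
    of a future proprietary dataset. Fields are illustrative."""
    user_id: str
    action: str          # what the user was doing
    model_output: str    # what the product suggested
    accepted: bool       # did the user accept the suggestion?
    ts: float            # Unix timestamp

def log_event(store: list, user_id: str, action: str,
              output: str, accepted: bool) -> None:
    """Append a structured event to the store (a list here; a database in practice)."""
    store.append(InteractionEvent(user_id, action, output, accepted, time.time()))

events: list = []
log_event(events, "u42", "draft_email", "Suggested subject line A", accepted=True)
log_event(events, "u42", "draft_email", "Suggested subject line B", accepted=False)

# Acceptance labels like these are exactly what public training sets lack.
print(json.dumps([asdict(e) for e in events], indent=2))
```

The accept/reject signal is the point: it records an outcome tied to a real user decision, which is the kind of pattern no public corpus contains.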

Which Sectors Have the Extra Ingredient

The sectors where the Coca-Cola dynamic is most likely to play out:

Healthcare and clinical AI — patient outcome data, diagnostic imaging, treatment protocols. Epic, Cerner, and the large health systems are best positioned. Startups like Abridge (clinical documentation) are accumulating real clinical conversation data that is inherently scarce.

Financial services — transaction patterns, risk signals, portfolio behaviour. Stripe, Plaid, and the major prime brokers. Bloomberg and FactSet on the data side. Trading firms with proprietary order flow data.

Legal and compliance — contract language, dispute outcomes, regulatory interpretation. Harvey AI is building this by ingesting the internal knowledge of major law firms. The data accumulated through paid engagements is not available to competitors.

Industrial and manufacturing — sensor data from physical equipment, failure modes, supply chain signals. Palantir, Samsara, and the industrial IoT players. The time-series data from a manufacturing line running for a decade is worth more than any synthetic dataset.

Logistics and supply chain — real-time routing data, demand signals, carrier behaviour. Flexport, FourKites, project44. The proprietary signals in live shipment data are not reproducible from public sources.

Key Takeaways

  • Chamath Palihapitiya's refrigeration analogy: LLMs are the infrastructure, like refrigeration — the companies building models will make some money, but the real wealth goes to the Coca-Colas that use LLMs to build something nobody else can replicate
  • "AI model building is a money trap" — his exact words: no predictable improvement curve, no profitable unit economics, capital requirements measured in billions per year
  • The one extra ingredient is proprietary data: give every major tech company the same inputs and they produce the same model — one extra data source no one else has produces markedly different output
  • Real examples: Stripe ($1.4T annual payments data), Bloomberg (40 years of financial time-series), Epic (280M patient records), Palantir (20 years of classified operational context)
  • The collapse of terminal value: AI is compressing the time it takes to erode competitive moats — only data that cannot be generated, only accumulated, survives the compression
  • For founders and developers: if your entire product value comes from wrapping an LLM, you are the refrigeration patent holder; the durable play is using AI to accumulate the proprietary data that becomes the moat
  • Sectors most likely to produce Coca-Cola outcomes: healthcare AI (clinical outcome data), financial services (transaction and risk data), legal (contract and dispute data), industrial IoT (equipment sensor data), logistics (real-time supply chain signals)


Written by

Abhishek Gautam

Software Engineer based in Delhi, India. Writes about AI models, semiconductor supply chains, and tech geopolitics — covering the intersection of infrastructure and global events. 355+ posts cited by ChatGPT, Perplexity, and Gemini. Read in 121 countries.