OpenAI Spud Day 4: Still Not Live — Polymarket at 78%

Abhishek Gautam | April 17, 2026 | 5 min read

Quick summary

OpenAI Spud has not launched as of April 17. Polymarket's same-day launch contract is at 78%. Here is what the delay reveals about OpenAI's GPT-5 release strategy and the competitive pressure around it.

What We Know About Spud

The Spud codename first appeared in reliable internal leak channels on April 13 — the same day the Hormuz blockade activated and dominated global news cycles. That timing may not be coincidental.

OpenAI has a well-documented pattern of holding major releases when geopolitical news cycles are saturated. The reasoning is straightforward: a GPT-5 class release on April 13 would have competed for attention with the Iran blockade activation across every developer and tech outlet simultaneously. A delayed release into a calmer news cycle captures more sustained coverage.

The Spud leaks described a multimodal model with an extended context window and significantly improved coding benchmarks compared to GPT-4o. Internal evaluation scores showed strong improvement on SWE-bench (the primary software engineering benchmark), the specific benchmark where Claude 3.7 Sonnet currently leads.

Why the Delay Pattern Matters

Day 1 delay: possible technical hold.

Day 2 delay: news cycle management.

Day 3 delay: strategic timing.

Day 4 delay: something changed.

Four days is outside OpenAI's normal pre-release staging window. When OpenAI has a model ready to ship, it typically executes within 48 hours of the internal green-light. A four-day delay for Spud suggests one of three scenarios.

Scenario A — Last-mile safety evaluation: OpenAI's safety team flagged something that required additional red-teaming. Given the improved coding capabilities in Spud, jailbreak evaluation for code generation takes time. This is the most procedurally normal scenario.

Scenario B — Competitive timing recalculation: Google DeepMind announced a Gemini update during the April 13-17 window. If Spud's benchmarks do not cleanly beat the updated Gemini on key metrics, releasing into a fresh Gemini announcement is suboptimal. OpenAI may be waiting for the Gemini coverage cycle to fade.

Scenario C — API infrastructure scaling: A GPT-5 class release requires significant API capacity pre-provisioning. The geopolitical uncertainty since April 13 may have delayed the final capacity allocation decision that had to happen before launch authorisation.

Polymarket pricing currently favours Scenario A or C, both of which would resolve today or tomorrow. Scenario B implies a longer delay, potentially into next week.

The Developer Impact Window

For developers, a four-day delay on a major model release creates a practical planning problem.

If you have been waiting to upgrade your API integration from GPT-4o to whatever Spud becomes — likely GPT-5 or GPT-4.5 in the public naming — you are now in the worst possible position: too late to plan around the old model, too early to build against the new one.

The practical advice: do not block any production deployment decisions on Spud's API availability. Build against GPT-4o or Claude 3.7 Sonnet, depending on your use case. When Spud lands, the migration path from GPT-4o should be API-compatible: OpenAI has maintained backward compatibility across every major model transition since GPT-3.5.
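One way to keep that later switch cheap is to read the model identifier from configuration instead of hard-coding it. The sketch below is a minimal illustration, not a prescribed setup: `LLM_MODEL`, `ModelConfig`, and `resolve_model` are hypothetical names, and `gpt-4o` is the only model id taken from this article. No Spud model id is assumed, because none has been published.

```python
import os
from dataclasses import dataclass

DEFAULT_MODEL = "gpt-4o"  # the current production target named in the article


@dataclass(frozen=True)
class ModelConfig:
    model: str        # model id to pass to whichever client you use
    is_default: bool  # True when no override was configured


def resolve_model(env=None) -> ModelConfig:
    """Pick the model id from the (hypothetical) LLM_MODEL variable.

    Falls back to DEFAULT_MODEL when the variable is unset or blank, so
    switching models later is a configuration change, not a code change.
    """
    env = os.environ if env is None else env
    override = env.get("LLM_MODEL", "").strip()
    if override:
        return ModelConfig(model=override, is_default=False)
    return ModelConfig(model=DEFAULT_MODEL, is_default=True)
```

With this shape, the application calls `resolve_model().model` wherever it needs a model id, and moving to a new model only requires setting `LLM_MODEL` in the deployment environment.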

The Polymarket Signal

Polymarket prediction markets on AI model releases have been surprisingly accurate over the past 18 months. The 78% probability for an April 17 Spud launch reflects aggregated market information from participants who have stronger signals than public leaks — some combination of OpenAI employee option holder behaviour, infrastructure monitoring, and API probe analysis.

The market moved from 65% to 78% between 6am and 9am IST today. That move suggests new information entered the market in the past 6 hours that increased the probability of same-day launch. The most common catalyst for that type of move: OpenAI's internal status page showing the model in "launch ready" state, which certain participants monitor through API probing.

If the Polymarket price drops below 50% without a launch by 6pm US Pacific time, the most likely launch window shifts to April 18-22.
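The 65% to 78% move and the 50% threshold are easier to reason about as arithmetic. The sketch below is illustrative only: the function names are hypothetical, it treats the contract price as an implied probability, and it does not call any Polymarket API.

```python
def implied_odds(probability: float) -> float:
    """Convert an implied probability (0-1) into odds in favour, p / (1 - p)."""
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return probability / (1.0 - probability)


def point_move(before: float, after: float) -> float:
    """Change between two implied probabilities, in percentage points."""
    return round((after - before) * 100, 2)


def likely_window(probability: float, launched: bool, past_cutoff: bool) -> str:
    """Apply the article's rule of thumb.

    Below 50% with no launch by the cutoff (6pm US Pacific) shifts the
    expected window to April 18-22; otherwise the same-day window stands.
    """
    if launched:
        return "launched"
    if past_cutoff and probability < 0.5:
        return "April 18-22"
    return "April 17"
```

Under these assumptions, the move from 65% to 78% is 13 percentage points, and the implied odds in favour rise from roughly 1.9 to 1 to roughly 3.5 to 1.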

What Spud Means for Claude and Gemini

The competitive dynamics are worth understanding clearly.

Claude 3.7 Sonnet currently leads on SWE-bench and extended context tasks. Anthropic's next release (internally expected in Q2 2026) is designed to maintain that lead. If Spud closes the SWE-bench gap significantly, Anthropic will likely accelerate its own release schedule.

For developers choosing between providers: the right time to evaluate Spud is after it has been live for 72 hours and independent benchmark replications have been published. First-party OpenAI benchmark numbers at launch should be treated as marketing materials, not engineering specifications. Wait for evals.ai, LMSYS, and independent replication before making infrastructure switching decisions.
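That waiting rule can be written down as a simple gate for teams that want it in a checklist or CI script. This is a sketch under stated assumptions: `ready_to_evaluate` and `min_replications` are hypothetical names, and how you count an "independent replication" is left to you.

```python
from datetime import datetime, timedelta

SOAK_PERIOD = timedelta(hours=72)  # the 72-hour wait suggested above


def ready_to_evaluate(
    launched_at: datetime,
    now: datetime,
    independent_replications: int,
    min_replications: int = 1,
) -> bool:
    """Return True once the model has been live for 72 hours AND at least
    `min_replications` independent benchmark replications exist."""
    if launched_at.tzinfo is None or now.tzinfo is None:
        raise ValueError("use timezone-aware datetimes to avoid offset mistakes")
    soaked = (now - launched_at) >= SOAK_PERIOD
    replicated = independent_replications >= min_replications
    return soaked and replicated
```

Both conditions must hold: elapsed time alone does not pass the gate if no independent replication has been published.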

Key Takeaways

OpenAI Spud still not launched through April 17, four days after first credible leaks — outside the normal 48-hour pre-release window
Polymarket sitting at 78% for same-day launch, up from 65% this morning — suggests internal "launch ready" signal may have been detected by market participants
Three scenarios for delay: last-mile safety evaluation (most likely), competitive timing vs Gemini, or API capacity scaling decision
Do not block production decisions on Spud: build against GPT-4o or Claude 3.7 Sonnet today; the GPT-4o-to-Spud migration should be API-compatible when Spud lands
Evaluate Spud 72 hours post-launch: wait for independent SWE-bench replications before making infrastructure switching decisions — first-party benchmarks are marketing

For the initial Spud delay analysis, read OpenAI Spud Not Launched: What the Delay Means for GPT-5 Competition. Compare current model capabilities at Claude vs ChatGPT. See live API pricing at LLM API Pricing.

Frequently Asked Questions

Why has OpenAI Spud not launched yet as of April 17 2026?

A four-day delay is outside OpenAI's normal 48-hour pre-release staging pattern, which suggests something changed after the initial green-light. The three most likely explanations: a last-mile safety evaluation that flagged an issue requiring additional red-teaming, a competitive timing recalculation while Gemini coverage fades, or API capacity scaling decisions delayed by geopolitical uncertainty. Polymarket at 78% suggests the market leans toward a same-day resolution, implying the safety or capacity scenario rather than a strategic multi-week hold.

What is the Polymarket probability for OpenAI Spud launching April 17?

Polymarket's "Spud launches today" contract for April 17 is at 78% as of the morning of April 17. That is up from 65% earlier in the day, suggesting a new positive signal entered the market in the past 6 hours, likely from infrastructure monitoring or API probe analysis by market participants. If the contract drops below 50% without a launch by 6pm US Pacific, the most likely launch window shifts to April 18-22.

Should I wait for OpenAI Spud before building my AI application?

No. Do not block production decisions on Spud's availability. Build against GPT-4o or Claude 3.7 Sonnet today — both are stable, well-documented, and production-proven. When Spud lands (expected as GPT-5 or similar), the migration from GPT-4o will be API-compatible. Evaluate Spud 72 hours after launch using independent benchmark replications, not first-party OpenAI numbers.

What are Spud's expected capabilities compared to GPT-4o and Claude?

Spud leaks describe a multimodal model with an extended context window and significantly improved coding benchmarks. Internal evaluations showed strong improvement on SWE-bench, the benchmark where Claude 3.7 Sonnet currently leads. Expect Spud to challenge Claude's SWE-bench position and improve on GPT-4o on agentic coding tasks. Independent replications of these benchmarks should be available within 72 hours of launch.
