Nvidia Nemotron 3 Ultra: 500B MoE Agent Model at Computex 2026
Quick summary
At Computex June 3, 2026 Nvidia launched Nemotron 3 Ultra, a 500B-parameter MoE model for agents, plus Agent Toolkit. Vera Rubin in full production; open models continue.
Read next
- Claude Opus 4.8 Ships With Dynamic Workflows — Same $5/$25 API Price
- Anthropic Files Confidential IPO at $965B Val, $47B Revenue Run Rate
At COMPUTEX on June 3, 2026, Nvidia announced Nemotron 3 Ultra — a ~500-billion-parameter mixture-of-experts (MoE) model aimed at enterprise agents — alongside an Agent Toolkit to build, deploy, and run agents across AI factories, while confirming Vera Rubin platforms in full production.
Huang positioned agents as the next business computing pattern after mobile and cloud — software to sell, not just silicon.
What Is Nemotron 3 Ultra?
Nemotron 3 Ultra is Nvidia's frontier open-weights-class MoE for agent workflows — routing experts for tool use, planning, and long-horizon tasks instead of single-shot chat.
Press summaries cite:
- ~500B parameters (MoE — not all active per token)
- Lower running cost claims vs comparable closed models (vendor benchmark — verify on your workload)
- Continued open Nemotron releases for fine-tune and on-prem deploys
Pair with Nemotron 3 Super SWE-bench for the coding-specific line.
What Is the Nvidia Agent Toolkit?
The Agent Toolkit is software to standardize agent lifecycles on Nvidia infrastructure:
| Layer | Function |
|---|---|
| Build | Templates for tool-calling agents |
| Deploy | Hooks into NIM, clusters, edge RTX Spark |
| Operate | Telemetry aligned with AI factory ops |
It competes with Microsoft Windows Agent Framework, OpenAI Codex plugins, and Anthropic enterprise agents — all announced within the same June 2026 window.
Vera Rubin and Hardware Context
Vera Rubin entered full production with Spectrum-X Ethernet Photonics networking shipping — CoreWeave, Lambda, OCI named as adopters in trade press.
Developers should read this as Nvidia owning the full stack narrative:
- Train/serve on Rubin + Vera
- Orchestrate with Agent Toolkit + Nemotron 3 Ultra
- Prototype on N1X laptops (N1X keynote)
For export-control context, see BIS Blackwell loophole closed.
Compare API costs at LLM API Pricing and Claude vs ChatGPT.
Key Takeaways
- June 3, 2026 (COMPUTEX): Nemotron 3 Ultra (~500B MoE) for enterprise agents
- Agent Toolkit ships for build/deploy/run on Nvidia AI factories
- Vera Rubin in full production; Spectrum-X photonics networking production
- Software push matches Intel Xeon 6+ and Microsoft Build agent platforms same week
- For developers: benchmark tool-call latency + $/task on MoE — active params matter more than 500B headline
Sources
FAQ
Frequently Asked Questions
What is Nvidia Nemotron 3 Ultra?
Nemotron 3 Ultra is a mixture-of-experts AI model of roughly 500 billion parameters that Nvidia announced at Computex on June 3, 2026, positioned for enterprise agent workloads with claims of competitive running cost versus other frontier models.
What is the Nvidia Agent Toolkit announced in June 2026?
The Agent Toolkit is Nvidia software to help enterprises build, deploy, and operate AI agents across Nvidia AI factory infrastructure, announced alongside Nemotron 3 Ultra at Computex 2026.
Is Vera Rubin in production as of Computex 2026?
Yes. Jensen Huang said at GTC Taipei during Computex week that Vera Rubin platforms entered full production, with networking products such as Spectrum-X Ethernet Photonics also moving into production.
How does Nemotron 3 Ultra relate to Nemotron 3 Super?
Nemotron 3 Super targets open-weight coding benchmarks such as SWE-bench, while Nemotron 3 Ultra is a larger MoE model aimed at enterprise agent orchestration announced at Computex June 2026.
Free Weekly Briefing
The AI & Dev Briefing
One honest email a week — what actually matters in AI and software engineering. No noise, no sponsored content. Read by developers across 30+ countries.
No spam. Unsubscribe anytime.
More on AI Models
All posts →Claude Opus 4.8 Ships With Dynamic Workflows — Same $5/$25 API Price
Anthropic released Claude Opus 4.8 on May 28, 2026: dynamic workflows for Claude Code, effort controls, faster fast mode. API ID claude-opus-4-8 at unchanged $5/$25 per million tokens.
Anthropic Files Confidential IPO at $965B Val, $47B Revenue Run Rate
Anthropic confidentially filed with the SEC on June 1, 2026 at $965B valuation and $47B annualized revenue — beating OpenAI to the IPO starting line as SpaceX roadshows this week.
OpenAI Codex Hits 5M Weekly Users: Sites, 6 Role Plugins, AWS
OpenAI Codex reached 5M weekly users June 2, 2026. New Sites hosting, Annotations, six role plugins (62 apps), 20% knowledge workers. Plus GA on AWS Bedrock.
NVIDIA Nemotron 3 Super: 60% SWE-bench, Best Open Model for Code
NVIDIA Nemotron 3 Super hits 60.47% on SWE-bench — highest open-weight score ever. 120B total, 12B active, 1M context, 5x throughput vs GPT-OSS. Already in CodeRabbit and Greptile.
Written by
Software Engineer based in Delhi, India. Writes about AI models, semiconductor supply chains, and tech geopolitics — covering the intersection of infrastructure and global events. 795+ posts cited by ChatGPT, Perplexity, and Gemini. Read in 164 countries.
