Jensen Huang Gave Karpathy the First NVIDIA DGX GB300: "You Were With Me Every Step"
Quick summary
NVIDIA CEO Jensen Huang personally delivered the world's first DGX Station GB300 to Andrej Karpathy with a handwritten note. 784GB of memory, 20 petaflops, and one hint: it needs 20 amps.
Read next
- NVIDIA GTC 2026: Everything We Know Before Jensen Huang Takes the Stage
- NVIDIA GTC 2026: Jensen Huang Keynote March 16 — Vera Rubin, Feynman Chips, and Why Developers Should Watch
Andrej Karpathy knew something significant was coming. He had been told he would receive a secret gift, and the only hint was this: it requires 20 amps. "So I knew it had to be good," he posted on X.
What arrived was the world's first NVIDIA DGX Station GB300 — hand-delivered to his Palo Alto lab by Jensen Huang personally, with a handwritten note: "You were with me every step of the way."
Who Karpathy Is and Why This Moment Matters
Karpathy is not a passive recipient of hardware gifts. He is a founding member of OpenAI, launched in 2015 alongside co-founders including Sam Altman, Ilya Sutskever, Greg Brockman, and Wojciech Zaremba. He built Tesla's Autopilot AI team from scratch as Director of AI. He left Tesla in 2022, rejoined OpenAI in 2023, then left again in 2024 to focus on independent research and education.
His YouTube tutorials — "Neural Networks: Zero to Hero" — have been watched by millions of developers globally. His AutoResearch paper, published in March 2026, outlined a framework for fully autonomous AI-driven research, earning coverage in Fortune and across the major AI publications.
Jensen Huang's note — "You were with me every step of the way" — is not a generic thank-you. It is an acknowledgment that Karpathy's work on language models and neural scaling was intellectually foundational to the transformer era that made NVIDIA a $3 trillion company. Without transformers, demand for high-end GPU compute would be a fraction of what it is today.
What the DGX Station GB300 Actually Is
The DGX Station GB300 is not a gaming rig or a consumer GPU. It is a liquid-cooled AI supercomputer in a workstation form factor — data centre performance without a data centre.
The hardware:
- GPU: NVIDIA Blackwell Ultra (GB300 architecture) — the same silicon generation that powers NVIDIA's GB300 NVL72 data-centre systems
- CPU: 72-core NVIDIA Grace — ARM-based, designed to pair with Blackwell GPUs
- Memory: 784GB of coherent unified memory — Grace CPU and Blackwell GPU share the same memory pool via NVLink-C2C, eliminating PCIe bandwidth bottlenecks
- AI compute: Up to 20 petaflops of FP4 AI performance
- Power: Requires a dedicated 20-amp circuit — which is exactly why the hint gave it away immediately
For context: 20 petaflops is more AI compute than the entire infrastructure of most mid-sized tech companies. A single DGX Station GB300 can run open-source models with up to 1 trillion parameters locally — models that previously required multi-GPU cloud clusters.
The NVLink-C2C Architecture: Why 748GB Unified Memory Is the Real Story
In a standard GPU workstation, the GPU has dedicated VRAM (80GB on an H100) and the CPU has separate system RAM. Data moves between them over PCIe at around 64 GB/s — a meaningful bottleneck for large model inference.
The DGX Station GB300 uses NVLink-C2C, connecting the Grace CPU and Blackwell Ultra GPU at 900 GB/s bidirectional bandwidth with a shared memory address space. From the software perspective, there is no "GPU memory" and "CPU memory" — there is one 784GB pool that both processors access at full speed.
Practically: a 70B-parameter model in BF16 occupies ~140GB. On a standard H100 workstation (80GB VRAM), running 70B requires quantisation or a multi-GPU setup. On the DGX GB300, a 70B model fits with roughly 644GB to spare. A 405B model fits at FP8 (~405GB). A 1T model fits at 4-bit quantisation (~500GB).
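The fit arithmetic above is just parameter count times bytes per parameter. A minimal sketch of that back-of-the-envelope maths (decimal gigabytes; activations, KV cache, and framework overhead are ignored and would add more):

```python
# Weight-memory footprint of a model: parameters * bytes per parameter.
# Decimal gigabytes (1 GB = 1e9 bytes); runtime overhead is ignored.

BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp8": 1.0, "int4": 0.5}

def weight_footprint_gb(params_billions: float, dtype: str) -> float:
    """Memory needed just to hold the weights, in GB."""
    return params_billions * BYTES_PER_PARAM[dtype]

print(weight_footprint_gb(70, "bf16"))    # 140.0 -> fits with room to spare
print(weight_footprint_gb(405, "fp8"))    # 405.0 -> fits at FP8
print(weight_footprint_gb(1000, "int4"))  # 500.0 -> a 1T model fits at 4-bit
```

The same calculation explains the H100 comparison: at BF16, even a 70B model overflows an 80GB card.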
For Karpathy's use case — autonomous agents running extended research loops — memory headroom is the critical spec. His AutoResearch framework involves agents maintaining long context windows, spawning sub-agents, and retaining state across multi-hour sessions. 784GB of unified memory means those sessions do not hit memory walls.
"Dobby the House Elf Claw" — What Karpathy Is Actually Building
In his thank-you post, Karpathy described the DGX Station as "a beautiful, spacious home for my Dobby the House Elf claw."
Dobby is the name of his autonomous AI agent project — named after the Harry Potter house elf who, when freed, could act independently rather than being bound to commands. The "claw" refers to the agentic system itself: an agent capable of reaching out, taking actions, running experiments, and operating autonomously.
Running a sovereign AI agent — one that does not route every call through OpenAI's API or Anthropic's API — requires local compute that until the DGX GB300 was only available in data centres. Karpathy now has that compute on his desk.
This is the direction his research is pointing: AI agents that run unsupervised research loops, identify hypotheses, design experiments, execute them computationally, interpret results, and generate new hypotheses without human intervention between steps.
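That loop can be sketched schematically. Everything below (the function names, the fixed iteration count) is a hypothetical illustration, not Karpathy's actual AutoResearch code; a real system would call a locally hosted model at each step:

```python
# Schematic autonomous research loop: hypothesize -> experiment -> interpret.
# All three step functions are placeholder stubs for illustration only.

def propose_hypothesis(findings: list[str]) -> str:
    # Stub: a real agent would prompt a local LLM with its prior findings.
    return f"hypothesis-{len(findings) + 1}"

def run_experiment(hypothesis: str) -> dict:
    # Stub: a real agent would design and execute a computational experiment.
    return {"hypothesis": hypothesis, "result": "supported"}

def interpret(outcome: dict) -> str:
    # Stub: a real agent would summarise the result and update its notes.
    return f"{outcome['hypothesis']}: {outcome['result']}"

def research_loop(max_iterations: int = 3) -> list[str]:
    """Run hypothesize -> experiment -> interpret with no human between steps."""
    findings: list[str] = []
    for _ in range(max_iterations):
        h = propose_hypothesis(findings)     # identify a hypothesis
        outcome = run_experiment(h)          # execute it computationally
        findings.append(interpret(outcome))  # interpret, feed back into the loop
    return findings

print(research_loop())
```

The design point is the feedback edge: each iteration's findings become input to the next hypothesis, which is why these sessions accumulate state and need memory headroom.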
Jensen Huang's Personal Delivery: The Signal Behind the Gesture
Huang did not ship this via FedEx. He delivered it personally. NVIDIA's developer account confirmed on X: "Andrej Karpathy's lab has received the first DGX Station GB300 — a Dell Pro Max with GB300. We can't wait to see what you'll create."
The system is manufactured by Dell Technologies — NVIDIA designs the chips and platform specification; Dell builds and integrates the workstation. NVIDIA calls it the "Dell Pro Max with GB300."
Giving Karpathy the world's first unit is NVIDIA making a statement: your intellectual contribution is part of why this hardware exists.
What Developers Can Build on a DGX Station GB300
The DGX Station GB300 is available for pre-order through Dell Technologies. Based on previous generations — the DGX Station A100 retailed at approximately $150,000 — the GB300 is expected in the $200,000-$300,000 range for commercial purchasers.
What it unlocks for developers and researchers:
Run any open-source model locally: Llama 3 70B, Qwen 2.5 72B, DeepSeek R1 671B (at 4-bit), and emerging 1T-class models, without cloud API dependency or per-token costs.
Long-context inference: 784GB of unified memory makes 128K+ token context windows economical, which is not practical on cloud GPU instances at production scale.
Multi-agent systems: Multiple agents running simultaneously, each with their own model and context, sharing one memory pool. This is the architecture Karpathy's AutoResearch framework requires.
Private fine-tuning: Fine-tune a 70B model on proprietary data locally, without sending training data to a cloud provider — a compliance requirement for enterprise and defence applications.
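For the long-context item above, the dominant cost is usually the KV cache rather than the weights. A rough sizing sketch, using the published Llama 3 70B shape (80 layers, 8 KV heads via grouped-query attention, head dimension 128) as an assumed example:

```python
# KV-cache size per sequence: 2 (K and V) * layers * kv_heads * head_dim
# * seq_len * bytes per value. The defaults below are the Llama 3 70B
# architecture; treat them as illustrative assumptions.

def kv_cache_gb(seq_len: int, layers: int = 80, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: int = 2) -> float:
    """KV-cache footprint for one sequence, in decimal gigabytes."""
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return seq_len * per_token / 1e9

print(round(kv_cache_gb(131_072), 1))  # ~42.9 GB for one 128K-token sequence
```

Multiply by the number of concurrent sequences to budget multi-agent workloads; several full 128K contexts fit alongside the weights in the unified pool, which is exactly what an 80GB card cannot do.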
Key Takeaways
- Jensen Huang personally delivered the world's first DGX Station GB300 to Karpathy in Palo Alto with a handwritten note: "You were with me every step of the way"
- The hint was 20 amps — the dedicated circuit requirement of the DGX GB300, which immediately told Karpathy it was serious hardware
- 784GB of unified coherent memory via NVLink-C2C at 900 GB/s — Grace CPU and Blackwell Ultra GPU share one memory pool, eliminating the PCIe bottlenecks that limit standard GPU workstations
- 20 petaflops of FP4 compute — enough to run 1-trillion-parameter models locally without cloud infrastructure
- "Dobby the House Elf claw" is Karpathy's autonomous agent project — the DGX is the substrate for agents running unsupervised research loops without API dependency
- Manufactured by Dell Technologies (Dell Pro Max with GB300); expected commercial price $200K-$300K range; pre-orders open
- Karpathy co-founded OpenAI, built Tesla Autopilot, published AutoResearch in March 2026 — Huang's delivery is NVIDIA acknowledging that Karpathy's intellectual work helped make the GPU era commercially possible
Written by
Abhishek Gautam
Full Stack Developer & Software Engineer based in Delhi, India. Building web applications and SaaS products with React, Next.js, Node.js, and TypeScript. 8+ projects deployed across 7+ countries.