Google I/O 2026 (May 19-20): Gemini 3.1, Android 17, What to Watch

By Abhishek Gautam · 6 min read

Quick summary

Google I/O 2026 is May 19-20 at Shoreline Amphitheater. Expect Gemini 3.1 Ultra, Android 17 developer preview, Project Astra updates, and TPU v7 Vertex AI access.

Google I/O 2026 is confirmed for May 19-20 at Shoreline Amphitheater in Mountain View. The keynote starts at 10 AM PT on May 19, with a livestream at io.google/2026. With 18 days until the event, the developer prep window is open now. The signals from Google's Q1 2026 earnings call (Google Cloud at roughly 50% growth, Gemini driving API consumption at scale) suggest this is the I/O where Google moves from "catching up on AI" to "setting the agenda."

The short version of what to expect: Gemini 3.1 Ultra will be announced or staged for release, the Android 17 developer preview will drop alongside the keynote, and Project Astra gets a production upgrade. TPU v7 (the Trillium successor) access via Vertex AI is the infrastructure announcement likely to get the least press coverage and have the most practical impact on developer cost structures.

What Google I/O 2026 Is and Why This One Matters

Google I/O is Google's annual developer conference. The keynote covers Google Search, Android, Chrome, Google Cloud, and increasingly AI infrastructure. Sessions run through May 20 covering Android development, Kubernetes, Firebase, and Gemini API integrations.

Why 2026's I/O is different from recent years: Google Cloud posted approximately 50% year-on-year growth in Q1 2026 — its fastest growth in years — driven specifically by Gemini API consumption and TPU v6 (Trillium) inference economics. That means Google is no longer defending a position; it's pressing an advantage. Announcements at this I/O will reflect a company that believes it has cost and capability superiority in AI infrastructure and is making moves accordingly.

For developers, this I/O will shape decisions about Google Cloud vs. AWS vs. Azure for AI workloads in 2026-2027, and the announcements will have multi-year implications.

Gemini 3.1: What to Expect

Gemini 3.1 Flash-Lite is already in production at $0.25 per million tokens with a 1 million token context window. The I/O announcement will almost certainly be Gemini 3.1 Ultra — the frontier model in the Gemini family.

Based on Google's typical announcement cadence and the signals from CEO Sundar Pichai's Q1 call language ("significant capability improvements coming in H1 2026"), here's what Gemini 3.1 Ultra is expected to bring:

Extended context window: Gemini 3.1 Pro already supports 1 million tokens. Ultra is expected to push toward 2-4 million tokens. For context, at roughly 1,500 pages of text per million tokens, a 4 million token context window can hold approximately 6,000 pages of text or an entire mid-size codebase. This changes what's possible for long-context reasoning, legal document analysis, and full-repository code review tasks.
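Page estimates for context windows depend heavily on the ratios you assume. The sketch below converts a token budget into pages of English prose. The ratios of 0.75 words per token and 500 words per page are rule-of-thumb assumptions, not figures published for Gemini 3.1, and code or non-English text will tokenize differently.

```python
# Rule-of-thumb conversion from a token budget to pages of English prose.
# Both ratios are assumptions for illustration, not published Gemini figures.
WORDS_PER_TOKEN = 0.75   # assumed average for English text
WORDS_PER_PAGE = 500     # assumed dense, single-spaced page


def tokens_to_pages(tokens: int,
                    words_per_token: float = WORDS_PER_TOKEN,
                    words_per_page: int = WORDS_PER_PAGE) -> int:
    """Estimate how many pages of prose fit in a given token budget."""
    words = tokens * words_per_token
    return round(words / words_per_page)


if __name__ == "__main__":
    for budget in (1_000_000, 2_000_000, 4_000_000):
        print(f"{budget:>9,} tokens ~ {tokens_to_pages(budget):,} pages")
```

Changing `words_per_page` to match your own documents shifts the result proportionally, so treat the output as an order-of-magnitude guide.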

Native multimodal improvement: Project Astra (Google's multimodal agent) has been in preview. Gemini 3.1 Ultra is expected to bring Astra's real-time audio/video understanding into production API access. This means developers can build applications that process live video feeds, real-time audio transcription with contextual understanding, and multi-turn visual conversations without a separate vision model call.

SWE-bench and reasoning benchmarks: Google has been aggressive on coding benchmarks. Gemini 3.1 Ultra is expected to push SWE-bench scores above 75%, competing directly with Claude Opus 4.7 and GPT-5.5 in the agentic coding category that developers actually care about.

Pricing: Google has used each Gemini generation to cut prices. Flash-Lite at $0.25/M tokens is already aggressively priced for its tier. Expect Ultra pricing to be positioned competitively against Claude Opus and GPT-5.5, likely in the range of $12-20 per million output tokens.
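To see what per-million-token prices mean for a budget, here is a minimal sketch that turns them into a monthly figure. The $0.25 price and the $12-20 range are the numbers quoted in this article, with the Ultra range being an expectation and not confirmed pricing. The 50 million token workload is a hypothetical, and the sketch ignores any split between input and output rates.

```python
# Estimate monthly spend from per-million-token prices.
# Prices are this article's figures and expectations, not confirmed Google pricing.
def token_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for `tokens` at a price quoted per one million tokens."""
    return tokens / 1_000_000 * usd_per_million


MONTHLY_OUTPUT_TOKENS = 50_000_000  # hypothetical workload size

SCENARIOS = {
    "Flash-Lite at $0.25/M": 0.25,
    "Ultra, low estimate $12/M": 12.0,
    "Ultra, high estimate $20/M": 20.0,
}

if __name__ == "__main__":
    for label, price in SCENARIOS.items():
        cost = token_cost(MONTHLY_OUTPUT_TOKENS, price)
        print(f"{label}: ${cost:,.2f} per month")
```

Swapping in your own token volume is usually more informative than comparing list prices alone, since the gap between tiers scales linearly with usage.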

Android 17: Developer Preview Alongside the Keynote

Android 17 developer preview is expected to drop alongside or immediately after the keynote. Key development-relevant changes based on pre-release signals:

On-device AI improvements: Android 17 expands Gemini Nano capabilities on-device. The practical impact for app developers: more on-device inference tasks can be handled without an API call, reducing latency and cost for common operations (text summarization, image captioning, voice-to-text, translation).

Predictive back navigation improvements: The predictive back animation API that shipped in Android 14/15 gets system-level support for custom transitions in Android 17. Apps that haven't adopted the API yet will face stronger system enforcement.

Large screen and foldable APIs: Google has been steadily improving the large-screen / foldable support story. Android 17 is expected to add more stable APIs for adaptive layouts, making the dual-screen development experience less fragmented.

Privacy and permissions: Android 17 introduces finer-grained media permissions (photo/video access scoped to specific albums, not the full gallery) and likely additional restrictions on background location access.

For developers: the priority before I/O is checking your target SDK against Android 17 preview behaviors. Apps targeting SDK 35+ will need to verify predictive back handling and the new media permission model.

TPU v7 and Vertex AI: The Infrastructure Announcement That Matters

The announcement most developers will skip in the keynote but should watch for: access to TPU v7 (Ironwood, the seventh-generation TPU Google unveiled at Cloud Next in April 2025) via Vertex AI.

TPU v6 (Trillium) gave Google vertically integrated inference economics that AWS and Azure cannot match without their own silicon, and this was the stated driver of Google Cloud's 50% growth in Q1. TPU v7 is expected to double v6's memory bandwidth, and memory bandwidth is the primary constraint on large-model inference throughput.

For developers building on Google Cloud:

  • TPU v7 access via Vertex AI would give Gemini inference on the most capable available silicon
  • The cost-per-token economics should improve again (each TPU generation has brought price reductions)
  • JAX and PyTorch/XLA support for v7 is expected to be confirmed at I/O

The practical action if you're evaluating Google Cloud for AI workloads: the combination of the Gemini 3.1 Ultra announcement and TPU v7 availability is likely the point where the economics become compelling enough to justify a serious evaluation. Before I/O is the time to set up a GCP account, understand Vertex AI quotas, and have a test workload ready to benchmark on the new hardware.
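One way to have a test workload ready is to write the timing harness now and plug in the real client later. The sketch below is a minimal latency benchmark. `fake_model` is a hypothetical stand-in and does not call Vertex AI or any Google API, so you would replace it with your own client function once the endpoint you want to test exists.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def benchmark(call: Callable[[str], str], prompts: List[str], runs: int = 3) -> Dict[str, float]:
    """Time `call` on every prompt `runs` times and summarize latency in seconds."""
    latencies = []
    for _ in range(runs):
        for prompt in prompts:
            start = time.perf_counter()
            call(prompt)
            latencies.append(time.perf_counter() - start)
    latencies.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "samples": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
    }


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model client; replace with your own call."""
    time.sleep(0.001)
    return prompt.upper()


if __name__ == "__main__":
    print(benchmark(fake_model, ["summarize this", "classify that"], runs=5))
```

Keeping the prompts and the harness fixed means the only thing that changes between runs is the backend, which makes before-and-after comparisons easier to trust.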

Project Astra: From Preview to Production

Project Astra is Google's multimodal AI agent — capable of understanding live video, audio, and text simultaneously in a continuous conversation. It has been in limited preview since Google I/O 2024. The 2026 I/O is almost certainly where Astra moves into production availability.

What Astra enables that current APIs cannot do in a single call:

  • Real-time video stream analysis with conversational follow-up (describe what you see, answer questions about it, track changes over time)
  • Multi-modal live agent interactions (see + hear + respond in real time, not transcribe-then-process)
  • Computer use from a video input rather than screenshot-by-screenshot

For developers building AI assistants, the current model is: capture frame/screenshot → send to vision API → get response → next frame. Astra collapses this into a streaming continuous context. The latency improvement is meaningful for applications where real-time understanding matters.
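The difference between the two patterns can be sketched as a toy latency model. Every timing below is a made-up placeholder chosen only to show the shape of the trade-off, not a measurement of any Google API, and the function names are illustrative.

```python
# Toy latency model contrasting per-frame requests with one streaming session.
# All timings are invented placeholders; measure your own before drawing conclusions.
REQUEST_OVERHEAD_MS = 150   # assumed connection + auth + round trip per request
FRAME_PROCESSING_MS = 50    # assumed model time per frame
STREAM_SETUP_MS = 300       # assumed one-time cost to open a stream


def per_frame_total_ms(frames: int) -> int:
    """Every frame is its own request, so each one pays the full overhead."""
    return frames * (REQUEST_OVERHEAD_MS + FRAME_PROCESSING_MS)


def streaming_total_ms(frames: int) -> int:
    """One session: setup is paid once, then only per-frame processing."""
    return STREAM_SETUP_MS + frames * FRAME_PROCESSING_MS


if __name__ == "__main__":
    for frames in (1, 10, 100):
        print(f"{frames:>3} frames: per-frame {per_frame_total_ms(frames)} ms, "
              f"streaming {streaming_total_ms(frames)} ms")
```

Under these assumptions a single frame is cheaper as a one-off request, while the streaming session pulls ahead as soon as the setup cost is spread over more than a couple of frames.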

The API surface for production Astra is expected to be a streaming WebSocket or gRPC endpoint on Vertex AI, compatible with existing Gemini API patterns.

Firebase and Developer Tooling Updates

I/O is always accompanied by Firebase updates. Signals from developer feedback and Firebase blog pre-announcements suggest:

Firestore vector search general availability: Firestore's native vector search (for semantic similarity queries without a separate vector database) is expected to hit GA at I/O. This matters for developers building RAG (retrieval-augmented generation) applications — you can store and query embeddings in the same Firestore database as your application data, without managing a separate Pinecone or Weaviate instance.
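For readers new to the idea, a semantic similarity query boils down to ranking stored embeddings by their closeness to a query embedding. The sketch below shows that ranking step in plain Python using cosine similarity. It is a conceptual illustration only and does not use the Firestore API. The document IDs and three-dimensional vectors are invented, and real embeddings have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query: List[float], docs: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k documents whose embeddings are most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Invented toy embeddings standing in for stored application documents.
DOCS = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.0],
    "api-changelog": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    for doc_id, score in nearest([0.8, 0.2, 0.0], DOCS):
        print(f"{doc_id}: {score:.3f}")
```

A managed vector index does the same ranking with approximate or indexed search so it stays fast at scale, which is the part you avoid building when the database handles it natively.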

Gemini in Firebase Studio: Firebase Studio (the AI-assisted development environment) is expected to get Gemini 3.1 integration. The practical change: code completion, debugging suggestions, and deployment generation in the Firebase development environment using frontier model capabilities.

Firebase App Hosting stability: Firebase App Hosting (the framework-aware static and dynamic hosting for Next.js, Angular, etc.) launched in preview in 2024. Expect a GA announcement at I/O 2026 with SLA commitments.

How to Watch: I/O 2026 Schedule

  • May 19, 10:00 AM PT: Opening keynote (Sundar Pichai)
  • May 19, 1:00 PM PT: Developer keynote (technical deep-dives on Android, Google Cloud, Firebase, AI)
  • May 19-20: Sessions available on demand at io.google/2026
  • Livestream: io.google/2026/watch — no registration required for streaming

Sessions most relevant to developers building AI applications:

  • "What's new in Gemini API" — day 1, morning
  • "Building with Vertex AI and TPU v7" — day 1, afternoon
  • "Android 17 for developers" — day 1
  • "Firestore vector search and Firebase AI extensions" — day 2
  • "Project Astra API: building multimodal agents" — day 2

Key Takeaways

  • Google I/O 2026: May 19-20, Shoreline Amphitheater Mountain View; keynote 10 AM PT May 19; livestream at io.google/2026 (no registration required)
  • Gemini 3.1 Ultra expected: 2-4M token context window, extended multimodal (Project Astra production), improved SWE-bench coding scores; pricing likely $12-20/M tokens output
  • Android 17 developer preview drops at I/O: On-device Gemini Nano expansion, predictive back enforcement, finer-grained media permissions; developers should check SDK 35 target compatibility before the preview
  • TPU v7 (Ironwood) on Vertex AI: Doubles v6 memory bandwidth; expected to reduce Gemini inference cost further; the infrastructure announcement with the most practical developer impact
  • Project Astra production: Real-time multimodal streaming API on Vertex AI; collapses screenshot-by-screenshot vision workflows into continuous video context
  • Firebase: Firestore vector search GA, Gemini 3.1 in Firebase Studio, Firebase App Hosting GA — all expected at I/O

For the Q1 earnings context showing why Google is pressing this advantage, read "Big Tech Q1 2026: Meta +31%, Google Cloud +50%, Amazon Chips $20B." For the TSMC supply chain behind TPU v7 production, read "TSMC Q1 2026: 58% Profit Jump, 4.17M Wafers, HBM4 Sold Out."

Frequently Asked Questions

When is Google I/O 2026 and how can I watch it?

Google I/O 2026 is May 19-20, 2026 at Shoreline Amphitheater in Mountain View, California. The opening keynote starts at 10:00 AM PT on May 19. The developer keynote follows at approximately 1:00 PM PT. All sessions are available for free livestream at io.google/2026 — no registration required to watch. Sessions on Android 17, Gemini API, Vertex AI, and Firebase continue through May 20 and are available on demand after the event.

What will Google announce at I/O 2026?

Based on Q1 2026 earnings signals and Google's typical announcement cadence, the major expected announcements are: Gemini 3.1 Ultra (2-4 million token context window, production multimodal via Project Astra, competitive pricing against Claude and GPT-5.5), Android 17 developer preview (on-device Gemini Nano expansion, new media permission model, predictive back enforcement), TPU v7 (Ironwood) access on Vertex AI with improved inference economics, and Firebase updates including Firestore vector search general availability. The developer keynote on May 19 afternoon will have the most technically detailed content.

What is Project Astra and will it be available at Google I/O 2026?

Project Astra is Google's multimodal AI agent that can understand live video, audio, and text simultaneously in a continuous conversation — rather than processing individual screenshots or audio clips in isolation. It has been in limited preview since Google I/O 2024. At I/O 2026, it is expected to move into production availability as a streaming API on Vertex AI, likely a WebSocket or gRPC endpoint compatible with existing Gemini API patterns. For developers, the practical impact is replacing screenshot-by-screenshot vision workflows with continuous video context understanding.

How does TPU v7 affect developer costs on Google Cloud?

TPU v7 (Ironwood) is Google's next-generation AI chip, and access through Vertex AI is expected to be announced at I/O 2026. It is expected to double the memory bandwidth of TPU v6 (Trillium), and memory bandwidth is the primary constraint on large-model inference throughput. TPU v6 was already the driver behind Google Cloud's approximately 50% growth in Q1 2026, because its cost-per-token economics beat equivalent Nvidia GPU instances for compatible workloads. TPU v7 access via Vertex AI would improve those economics further. For developers building on Google Cloud, the announcement is the signal to benchmark Gemini inference costs against AWS Bedrock and Azure OpenAI Service prices.

Written by Abhishek Gautam

Software Engineer based in Delhi, India. Writes about AI models, semiconductor supply chains, and tech geopolitics — covering the intersection of infrastructure and global events. 931+ posts cited by ChatGPT, Perplexity, and Gemini. Read in 167 countries.