Google Gemini Nano on Android: What Developers Can Actually Build With It in 2026

Abhishek Gautam · 9 min read

Quick summary

Gemini Nano is Google's on-device model for Pixel and Samsung Galaxy devices. Unlike Apple, Google has opened it via the Android AI Edge SDK. Here is what it can do, what it cannot, which devices support it, and where it actually makes sense in a real app.

Google took a different approach to on-device AI than Apple. Instead of keeping the foundation model closed, Google opened Gemini Nano to third-party developers through the Android AI Edge SDK. If you build Android apps and have not looked at this yet, here is what is actually available.

What Is Gemini Nano?

Gemini Nano is Google's smallest Gemini model — designed specifically to run on mobile hardware without a network connection. It shipped first on Pixel 8 Pro, then Pixel 9 series, and has been rolling out to Samsung Galaxy S24 and S25 series devices via Android's AICore system service.

It is not the same as the Gemini you access via the API or in the Gemini app. Those are larger cloud-hosted models (the Gemini Pro, Ultra, 1.5, and 2.0 families). Nano is smaller, quantised, and runs entirely on the device's NPU/GPU.

The Android AI Edge SDK

Google released the Android AI Edge SDK to give developers access to Gemini Nano and other on-device models via a consistent API. The key components:

Gemini Nano via AICore

AICore is a system-level service on supported Android devices that manages the Gemini Nano model. Your app requests inference through the Google Play Services AI APIs — you do not bundle the model yourself; AICore downloads, stores, and updates it. This keeps your APK small (the model weighs in at several gigabytes; you do not want to ship that in your app).

The API is intentionally similar to the Gemini cloud API, so switching between on-device and cloud is mostly a model name swap.
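As an illustration, a summarisation call through AICore might look like the sketch below. This assumes the experimental AI Edge SDK surface (`GenerativeModel` and the `generationConfig` builder from `com.google.ai.edge.aicore`); the SDK is still evolving, so treat names and parameters as indicative rather than definitive.

```kotlin
import android.content.Context
import com.google.ai.edge.aicore.GenerativeModel
import com.google.ai.edge.aicore.generationConfig

class NanoSummariser(appContext: Context) {
    private val model = GenerativeModel(
        generationConfig = generationConfig {
            context = appContext      // app context, required by AICore
            temperature = 0.2f        // low temperature for factual summaries
            topK = 16
            maxOutputTokens = 256     // keep outputs well-bounded
        }
    )

    // generateContent is a suspend function — call it from a coroutine,
    // never from the UI thread.
    suspend fun summarise(text: String): String? =
        model.generateContent("Summarise the following in three bullet points:\n$text").text
}
```

Because the request shape mirrors the cloud SDK, moving this call behind an interface makes the on-device/cloud swap a one-line change later.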

MediaPipe LLM Inference

For more control — including running models other than Gemini Nano — MediaPipe's LLM Inference API lets you load any compatible model (Gemma 2B, Phi-2, Falcon, etc.) and run inference locally. You bundle the model file or download it on first run. This works on a wider range of devices since it does not depend on AICore.
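A minimal MediaPipe sketch, assuming the `com.google.mediapipe.tasks.genai` artifact and a Gemma model file already present on the device (the path below is a placeholder, not a real location your app can rely on):

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.genai.llminference.LlmInference

fun createLocalLlm(context: Context): LlmInference {
    val options = LlmInference.LlmInferenceOptions.builder()
        .setModelPath("/data/local/tmp/llm/gemma-2b-it-cpu-int4.bin") // placeholder path
        .setMaxTokens(1024) // combined prompt + response budget
        .build()
    return LlmInference.createFromOptions(context, options)
}

// generateResponse is synchronous — run it off the main thread:
// val reply = createLocalLlm(appContext).generateResponse("Summarise: ...")
```

The trade-off versus AICore: you control the model and it works on more devices, but you pay for it in download size and memory management.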

What Gemini Nano Can Do Well

Based on developer testing and Google's own documentation, Gemini Nano performs well at:

  • Summarisation: Condensing long articles, emails, or documents into key points. This is the flagship use case and works reliably.
  • Reply suggestions: Generating short contextual replies to messages. Used in Gboard and Google Messages.
  • Text rewriting and tone adjustment: Paraphrasing, simplifying, or changing the register of text.
  • Simple classification and labelling: Categorising text into predefined categories, spam detection, sentiment.
  • Proofreading: Grammar and spelling correction with explanations.
  • Short generation tasks: Writing short descriptions, alt text, tags — tasks with well-bounded outputs.
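For the classification case in particular, small models are most reliable when you constrain the output to a closed label set and parse the response defensively. A minimal sketch (the labels and prompt wording are illustrative, not from any SDK):

```kotlin
enum class Sentiment { POSITIVE, NEGATIVE, NEUTRAL, UNKNOWN }

// Constrain the model to a fixed label set in the prompt itself.
fun sentimentPrompt(text: String): String =
    "Classify the sentiment of the text as exactly one of: " +
    "POSITIVE, NEGATIVE, NEUTRAL.\nText: $text\nLabel:"

// Parse defensively: small models sometimes add punctuation or extra words
// around the label, so match by containment and fall back to UNKNOWN.
fun parseSentiment(raw: String): Sentiment {
    val cleaned = raw.trim().uppercase()
    return Sentiment.values()
        .firstOrNull { it != Sentiment.UNKNOWN && cleaned.contains(it.name) }
        ?: Sentiment.UNKNOWN
}
```

Treating "the model said something unexpected" as a first-class outcome (here, `UNKNOWN`) keeps a flaky classification from corrupting app state.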

What It Cannot Do (Honestly)

Nano is a small model. Compared to cloud-hosted Gemini or GPT-4 class models, it has real limitations:

  • No internet access or real-time knowledge: Nano has a training cutoff and no retrieval capability.
  • Weak on complex reasoning: Multi-step logic problems, code generation beyond simple snippets, and structured data extraction from long documents are unreliable.
  • Context window is limited: Current on-device context is much smaller than cloud models — roughly 2K–4K tokens depending on device.
  • Device support is narrow: AICore with Gemini Nano requires a Pixel 8 series device or later, or a Samsung Galaxy S24/S25. If your app targets a broad Android audience, you cannot assume Nano is available.
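The small context window is the limitation you hit first in practice, so it is worth guarding inputs before they reach the model. A sketch of a budget check, assuming a rough four-characters-per-token heuristic for English text (a crude approximation, not a real tokenizer):

```kotlin
// Crude English-text approximation; real token counts vary by tokenizer.
const val APPROX_CHARS_PER_TOKEN = 4

fun approxTokens(text: String): Int =
    (text.length + APPROX_CHARS_PER_TOKEN - 1) / APPROX_CHARS_PER_TOKEN

// Truncate input so prompt plus expected output fit a ~2K-token budget,
// matching the lower end of current on-device context sizes.
fun fitToBudget(
    text: String,
    budgetTokens: Int = 2048,
    reservedForOutput: Int = 256
): String {
    val maxChars = (budgetTokens - reservedForOutput) * APPROX_CHARS_PER_TOKEN
    return if (text.length <= maxChars) text else text.take(maxChars)
}
```

For summarisation of long documents, a better strategy than blunt truncation is chunking and summarising the summaries, but the budget check above is the safety net either way.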

Practical Architecture Pattern

The sensible pattern for production apps is a hybrid with graceful fallback:

  • Check availability first: Verify whether Nano is accessible on the current device before calling inference.
  • Use Nano for latency-sensitive or offline tasks: Summaries, reply suggestions, local classification.
  • Fall back to cloud API for complex tasks or unsupported devices: Same Gemini API, just swap the model endpoint.
  • Never block the UI on inference: Nano inference takes 1–5 seconds depending on task length and device. Always run it asynchronously and show a loading state.
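The routing decision at the heart of this pattern is plain logic, and keeping it in a pure function makes it trivially testable. A minimal sketch (the function name and thresholds are illustrative, not from any SDK):

```kotlin
enum class Route { ON_DEVICE, CLOUD }

// Decide which backend serves a request, given device capability
// and the shape of the task. Thresholds are illustrative defaults.
fun chooseRoute(
    nanoAvailable: Boolean,
    needsComplexReasoning: Boolean,
    approxInputTokens: Int,
    onDeviceContextLimit: Int = 2048
): Route = when {
    !nanoAvailable -> Route.CLOUD                        // unsupported device
    needsComplexReasoning -> Route.CLOUD                 // beyond Nano's strengths
    approxInputTokens > onDeviceContextLimit -> Route.CLOUD // won't fit on-device
    else -> Route.ON_DEVICE
}
```

Because the on-device and cloud request shapes are intentionally similar, the caller can hide `Route` behind one function and keep the rest of the app oblivious to where inference ran.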

Gemini Nano vs Apple On-Device AI: The Key Difference

Apple's approach: closed, high quality, consistent experience, zero developer access to the foundation model.

Google's approach: open API, more device fragmentation, less polished integration, but third-party developers can actually call it.

For pure user experience in Apple's apps, Apple's approach wins. For developers who want to build AI features into their own apps without routing everything through a cloud API, Google's approach is more useful right now.

Should You Use It?

Use Gemini Nano (via Android AI Edge SDK) if:

  • Your app has summarisation, reply suggestion, or text rewriting features
  • You need these features to work offline or with low latency
  • Your primary audience is on recent Pixel or Samsung devices
  • You want to avoid cloud API costs for lightweight inference tasks

Use the Gemini cloud API (or another provider) if:

  • You need complex reasoning, code generation, or large context windows
  • You need to support a wide range of Android devices
  • Latency of a network call is acceptable

The on-device and cloud APIs are intentionally similar, so building with Gemini Nano first and falling back to cloud is a sensible architecture for most use cases.


Written by

Abhishek Gautam

Full Stack Developer & Software Engineer based in Delhi, India. Building web applications and SaaS products with React, Next.js, Node.js, and TypeScript. 8+ projects deployed across 7+ countries.
