
Google's Gemini 3.1 Flash Lite Model Now Available on OpenRouter

By Harsh Desai

TL;DR

Google's lightweight Gemini 3.1 Flash Lite lands on OpenRouter at $0.25 per million input tokens with a roughly 1M-token context window.

What changed

Google added Gemini 3.1 Flash Lite to OpenRouter. The model offers a roughly 1M-token context window at $0.25 per million input tokens and $1.50 per million output tokens. Existing OpenRouter integrations only need to switch the model ID to google/gemini-3.1-flash-lite to start using it.
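Since OpenRouter exposes an OpenAI-compatible chat-completions endpoint, swapping in the new model is a one-line change. A minimal sketch (the prompt and helper name are illustrative; sending the request requires a real OPENROUTER_API_KEY):

```python
import json
import os
import urllib.request

def build_request(prompt: str) -> dict:
    """Build an OpenRouter chat-completions payload for Flash Lite."""
    return {
        # The model ID is the only change vs. an existing OpenRouter integration.
        "model": "google/gemini-3.1-flash-lite",
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_request("Classify this support ticket: 'My invoice is wrong.'")
print(json.dumps(payload, indent=2))

# To actually send it (needs an API key in the environment):
# req = urllib.request.Request(
#     "https://openrouter.ai/api/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={
#         "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
#         "Content-Type": "application/json",
#     },
# )
# print(urllib.request.urlopen(req).read().decode())
```

Because the endpoint is OpenAI-compatible, the same swap works if you use the OpenAI SDK pointed at the OpenRouter base URL.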

Specs

  • Context window 1,048,576 tokens (~1M)
  • Pricing input $0.25 per million tokens
  • Pricing output $1.50 per million tokens
  • License proprietary
  • Model ID google/gemini-3.1-flash-lite
  • Vendor docs https://openrouter.ai/google/gemini-3.1-flash-lite

Why it matters

At $0.25/M input, Flash Lite costs more per token than GPT-4o-mini (~$0.15/M) but offers a far larger context, so it wins on cost per context token, and it undercuts Claude Haiku ($0.80/M) outright. The 1M window lets you stuff an entire codebase, a multi-hundred-page PDF, or a year of customer-support transcripts into a single prompt without chunking. For high-volume tasks like classification, RAG, content scoring, and routing, the cost math now favours cheap-and-massive over premium-and-small.
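The cost math above is easy to sanity-check yourself. A back-of-envelope sketch using the prices quoted in this article (competitor figures are the article's, not independently verified):

```python
# Flash Lite prices from the article, USD per million tokens.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the quoted Flash Lite prices."""
    return (input_tokens / 1e6) * INPUT_PRICE_PER_M + \
           (output_tokens / 1e6) * OUTPUT_PRICE_PER_M

# Stuffing a full ~1M-token prompt with a short classification answer:
cost = request_cost(1_000_000, 200)
print(f"${cost:.4f}")  # prints $0.2503 — roughly a quarter per full-context call
```

At these prices, even a prompt that uses the entire window costs about $0.25 of input, which is the whole pitch for the chunk-free workflows described above.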

What to watch for

Benchmark Flash Lite against Pro on your actual workload before swapping; lite tier trades raw reasoning for speed. Track OpenRouter rate limits as adoption climbs. Compare price-vs-context against DeepSeek V3.2, Qwen 3.5 Plus, and Kimi K2 before locking in a single provider for production.

Who this matters for

  • Vibe Builders: Feed an entire project history (codebase + chat transcripts + design notes) into one prompt for genuinely coherent output across a build session.
  • Basic Users: Run high-volume tasks (summarising long docs, classifying support tickets, batch translation) at a fraction of the cost of premium models.

Harsh's take

Google is aggressively dumping cheap, high-context models into the ecosystem to choke out smaller competitors. At 25 cents per million tokens, Flash Lite forces a race to the bottom that turns proprietary model hosting into a commodity. This release is a clear signal Google wants to own the infrastructure layer for every developer building on OpenRouter.

If you ignore the cost efficiency here, you are burning capital on inferior alternatives. The 1M context window is the only real differentiator worth noting; most developers hit token-limit pain when building complex automation, and this model solves that for pennies.

Stop overpaying for bloated models on simple tasks. Slot Flash Lite into classification, RAG, and routing today, or accept that your margins will keep shrinking against competitors who do.


Source: openrouter.ai
