7 OpenRouter Leaderboards Reshuffle with Hy3, Kimi K2.6, Claude Sonnet 4.6 Top
TL;DR
OpenRouter reshuffled 7 leaderboards for the week of 2026-05-11. Hy3 preview (free), Kimi K2.6, and Claude Sonnet 4.6 hold the top three across rankings.
A volatile week on OpenRouter. 6 rankings saw their #1 model change, including gpt-oss-safeguard-20b, Tencent's Hy3 Preview (free), Tencent's Hy3 preview. The field is churning fast: see per-area breakdowns.
LLM Leaderboard
The big picture: which models AI builders are paying to run this week.
- Hy3 preview (free) by Tencent: 3.24T tokens ↑25%
- Kimi K2.6 by Moonshot: 1.68T tokens ↑7%
- Claude Sonnet 4.6 by Anthropic: 1.45T tokens ↑7%
- Claude Opus 4.7 by Anthropic: 1.2T tokens ↑31%
- Gemini 3 Flash Preview by Google: 1.07T tokens ↑11%
Benchmark Leaders (AA Index)
Independent benchmark scores via Artificial Analysis. High AA Index = stronger reasoning, but cost matters too.
- GPT-5.5 (xhigh) by OpenAI: 60.2 AA Index
- Claude Opus 4.7 (Adaptive Reasoning, Max Effort) by Anthropic: 57.3 AA Index
- Gemini 3.1 Pro Preview by Google: 57.2 AA Index
- GPT-5.4 (xhigh) by OpenAI: 56.8 AA Index
- Kimi K2.6 by Moonshot: 53.9 AA Index
Fastest Models
Throughput champs: pick these for latency-sensitive apps where speed beats raw quality.
- gpt-oss-safeguard-20b
- gpt-oss-120b
- Qwen3 32B
- gpt-oss-20b
- Qwen3 235B A22B Instruct 2,507
Top Coding Models
If you're shipping coding workflows on OpenRouter, this is what other builders chose this week.
- Hy3 Preview (free) by Tencent: 2.31T tokens ↑23.6%
- Kimi K2.6 by Moonshot: 1.81T tokens ↑18.5%
- Claude Opus 4.7 by Anthropic: 500B tokens ↑5.1%
- Step 3.5 Flash by StepFun: 448B tokens ↑4.6%
- DeepSeek V4 Pro by DeepSeek: 417B tokens ↑4.3%
Top Models for English
Most-used models for English content this week: the multilingual leaders.
- Hy3 preview by Tencent: 336B tokens ↑8.7%
- Kimi K2.6 by Moonshot: 258B tokens ↑6.7%
- DeepSeek V4 Flash by DeepSeek: 216B tokens ↑5.6%
- Claude Sonnet 4.6 by Anthropic: 197B tokens ↑5.1%
- DeepSeek V3.2 by DeepSeek: 184B tokens ↑4.8%
Top Models for Python
Which models are getting picked for Python work right now.
- Hy3 preview by Tencent: 145B tokens ↑17.6%
- DeepSeek V4 Flash by DeepSeek: 49.9B tokens ↑6.1%
- Kimi K2.6 by Moonshot: 43.8B tokens ↑5.3%
- Claude Opus 4.7 by Anthropic: 40.6B tokens ↑4.9%
- DeepSeek V3.2 by DeepSeek: 40.3B tokens ↑4.9%
Top Models for short prompts (1K-10K tokens)
For short prompts (1K-10K tokens, the bulk of typical traffic), here's what builders chose.
- Gemini 2.5 Flash Lite by Google: 111M requests ↑9.4%
- Gemini 2.5 Flash by Google: 87.4M requests ↑7.4%
- Grok 4.1 Fast by X-ai: 83.6M requests ↑7.1%
- Gemini 3 Flash Preview by Google: 66.4M requests ↑5.6%
- gpt-oss-120b by OpenAI: 50.2M requests ↑4.3%
Top Models for Tool Calls
If your stack uses tool calls / function calling, these models are getting the most invocations.
- Hy3 Preview (free) by Tencent: 35.5M tokens ↑11.3%
- Gemini 3 Flash Preview by Google: 17.9M tokens ↑5.7%
- Kimi K2.6 by Moonshot: 16.9M tokens ↑5.4%
- Claude Sonnet 4.6 by Anthropic: 14.1M tokens ↑4.5%
- Gemini 2.5 Flash by Google: 13.7M tokens ↑4.4%
Top Image Models
Image-generation through OpenRouter: most-served models this week.
- Gemini 2.5 Flash Lite by Google: 187M images ↑32.6%
- Gemini 3 Flash Preview by Google: 55.1M images ↑9.6%
- Qwen3.5 397B A17B by Qwen: 42.5M images ↑7.4%
- Gemini 2.5 Flash by Google: 41.9M images ↑7.3%
- GPT-4.1 Mini by OpenAI: 26.4M images ↑4.6%
Top Audio-Input Models
Audio-input (transcription, voice-in) leaders.
- Gemini 3.1 Flash Lite Preview by Google: 2.58M prompts ↑45.4%
- Gemini 2.5 Flash by Google: 1.04M prompts ↑18.2%
- Gemini 3 Flash Preview by Google: 770K prompts ↑13.5%
- Gemini 3.1 Pro Preview by Google: 309K prompts ↑5.4%
- Gemini 2.0 Flash Lite by Google: 157K prompts ↑2.8%
Top Apps on OpenRouter
Useful as social proof when picking your stack: the largest public apps and agents that opt into OpenRouter usage tracking.
Most Popular
- OpenClaw (8.96T tokens): OpenClaw is an open-source AI agent that connects to your messaging apps and takes real actions on your behalf, from running commands and browsing the web to managing files and sending emails.
- Hermes Agent (6.47T tokens): Hermes Agent is an open-source, self-improving AI agent by Nous Research that runs persistently with memory across sessions, and builds reusable skills from experience. It comes with 40+ built-in tools, including web search, browser automation, and vision, plus scheduled automations and subagents.
- Kilo Code (5.41T tokens): Kilo Code is an open-source AI coding agent that works across VS Code, JetBrains, and CLI to help developers ship code faster with agentic workflows.
- Claude Code (3.05T tokens): Claude Code is Anthropic's agentic coding tool that reads your entire codebase, plans and executes changes across files, runs tests, and iterates on failures, all from natural language prompts.
Trending
Fastest-growing apps on OpenRouter this week: early signals of which builder workflows are breaking out.
- pi (312B tokens) ↑242%
- Hermes Agent (1.69T tokens) ↑6%
- Lemonade (172B tokens) ↑32%
- MinutaIA (39.7B tokens) ↑1,103%
- Roo Code (151B tokens) ↑25%
- One API (32.2B tokens) ↑357%
- PaperGen Terminus-2 Agent (19.7B tokens) ↑415,330%
- Portkey AI (31.5B tokens) ↑109%
What this means for builders
The OpenRouter rankings are the cleanest signal of what AI builders pay to run, not what vendor marketing claims. Watch the top three week-on-week: when they reshuffle, the field is volatile and your model choice has a short half-life.
Who this matters for
- Vibe Builders: Pick winners by real-world usage on OpenRouter, not vendor marketing.
- Developers: Token-share ratios are the cleanest production-fit signal for model selection.
Harsh’s take
OpenRouter rankings are the realest signal we have for what AI builders actually pay to run. Marketing pages claim everything; this table reflects production fit. Watch leadership changes for early indicators of where the field is consolidating.
by Harsh Desai
About OpenRouter
View the full OpenRouter page →All OpenRouter updatesMore from OpenRouter
- LaunchOpenRouter adds Tencent Hunyuan model with 262K context window
OpenRouter adds Tencent Hunyuan Hy3 preview, a mixture-of-experts model with 262K context window. Pricing is $0.07/M input tokens and $0.26/M output tokens with configurable reasoning levels.
- LaunchInclusionAI launches free Ring-2.6-1T (262k context) on OpenRouter
InclusionAI launches free Ring-2.6-1T on OpenRouter. The 1T-parameter-scale model uses 63B active parameters and supports coding agents and tool use.