
OpenRouter launches Perceptron Mk1 with 33k context at $0.15/M input, $1.50/M output

By Harsh Desai

TL;DR

OpenRouter launched Perceptron Mk1, a vision-language model for video and embodied reasoning. It processes images and videos with 33k context at $0.15/M input and $1.50/M output.

What changed

Perceptron Mk1 is now available on OpenRouter. This vision-language model handles video and embodied reasoning with 33k context length. It processes image and video inputs alongside natural language queries for detailed visual understanding, priced at $0.15 per million input tokens and $1.50 per million output tokens.
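Since OpenRouter exposes an OpenAI-compatible chat completions endpoint, a request to the model can be sketched as below. This is a hypothetical example: the model slug `perceptron-ai/perceptron-mk1` and the image URL are assumptions, so check the actual slug on the model's OpenRouter page before use. The sketch only builds the request payload; sending it requires an OpenRouter API key.

```python
# Hypothetical sketch of a Perceptron Mk1 request via OpenRouter's
# OpenAI-compatible chat completions API. The model slug below is an
# assumption -- verify it on the model's OpenRouter page.
import json

API_URL = "https://openrouter.ai/api/v1/chat/completions"
MODEL_SLUG = "perceptron-ai/perceptron-mk1"  # assumed slug

def build_request(question: str, image_url: str) -> dict:
    """Assemble a chat-completions payload that pairs a natural-language
    query with an image (or extracted video frame) URL."""
    return {
        "model": MODEL_SLUG,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_request(
    "What is happening in this frame?",
    "https://example.com/frame_001.jpg",  # placeholder URL
)
print(json.dumps(payload, indent=2))
```

To actually send it, POST the payload to `API_URL` with an `Authorization: Bearer <key>` header; the response follows the standard chat-completions shape.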

Why it matters

Developers integrating video analysis gain a cost-effective option at $0.15/M input versus GPT-4V's higher rates on the same platform. Vibe Builders can generate detailed insights from video clips for creative projects. The 33k context supports longer sequences in embodied reasoning tasks.
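At the listed rates the per-request arithmetic is cheap even for context-filling inputs; a minimal sketch (the 30k/1k token counts are illustrative, not from the announcement):

```python
# Cost arithmetic at the listed rates: $0.15 per million input tokens,
# $1.50 per million output tokens.
INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 1.50 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost of a single request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Illustrative request: most of the 33k context filled, 1k-token answer.
cost = request_cost(input_tokens=30_000, output_tokens=1_000)
print(f"${cost:.4f}")  # 30,000 * $0.15/M + 1,000 * $1.50/M = $0.0060
```

So a near-maximal-context video query costs well under a cent, which is what makes high-volume workflows the interesting use case.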

What to watch for

Compare output quality against LLaVA models on OpenRouter for video queries. Test Perceptron Mk1 directly via the OpenRouter playground with a sample video input. Monitor usage limits and rate card updates on the model page.

Who this matters for

  • Vibe Builders: Use Perceptron Mk1 to extract detailed visual insights from video clips for your creative projects.
  • Developers: Integrate this cost-effective vision model for video analysis tasks at $0.15 per million input tokens.

Harsh's take

Perceptron Mk1 enters a crowded vision-language market with a clear pricing advantage. By undercutting premium models on OpenRouter, it forces a shift in how teams approach video-heavy workflows. The 33k context window is sufficient for most short-form video reasoning tasks, making it a practical choice for developers who need to balance performance with operational costs.

Success depends on the model's ability to maintain accuracy during complex visual reasoning. Builders should prioritize benchmarking this model against existing LLaVA variants to determine if the cost savings justify a migration. If the reasoning quality holds up, this becomes a primary tool for high-volume visual data processing.

Focus on testing specific video input types to verify if the model meets your production requirements.


Source: openrouter.ai

