Hermes Agent v0.4.0 Adds OpenAI API, 6 Messaging Platforms
TL;DR
Hermes Agent v0.4.0 (March 24, 2026) adds an OpenAI-compatible /v1/chat/completions endpoint, six new messaging adapters (Signal, DingTalk, SMS, Mattermost, Matrix, Webhook), @file/@url context, and gateway prompt caching.
What changed
Hermes Agent v0.4.0 (The Platform Expansion Release) shipped March 24, 2026. Hermes now exposes a drop-in OpenAI-compatible /v1/chat/completions endpoint plus a /api/jobs REST API for cron job management. Six new messaging adapters (Signal, DingTalk, SMS via Twilio, Mattermost, Matrix, generic Webhook) bring the channel count to nine. Claude Code-style @file and @url context references with CLI tab completion inject content into the prompt automatically. Four new providers land: GitHub Copilot via OAuth, Alibaba DashScope, Kilo Code, and OpenCode Zen/Go. Per-session AIAgent gateway cache preserves Anthropic prompt cache across turns; streaming is on by default; 200+ bug fixes ship alongside.
Why it matters
The OpenAI-compatible endpoint converts Hermes from a standalone agent into a backend for any tool or SDK that already speaks OpenAI. Treating Copilot subscriptions as an inference provider lets you reuse credits you are already paying for. Gateway prompt caching across turns materially reduces the cost of long debugging or research sessions where the same context is referenced repeatedly. The expanded messaging surface (now nine channels) means the chat client your team already uses is almost certainly covered.
What to watch for
The OpenAI-compatible API server is opt-in: hermes api serve starts it on a configurable port; existing direct API usage continues unchanged. Validate that any existing code repointed to Hermes still receives the response shape it expects, especially around streaming and tool-call deltas. Confirm that the OAuth flow for GitHub Copilot, DashScope, Kilo Code, and OpenCode Zen completes cleanly in your environment before relying on them as fallbacks.
Who this matters for
- Developers: OpenAI-compatible /v1/chat/completions endpoint makes Hermes a drop-in replacement for any OpenAI call. GitHub Copilot OAuth, Alibaba DashScope, Kilo Code, and OpenCode Zen added as inference providers. Gateway prompt caching preserves Anthropic cache across turns.
Harsh’s take
v0.4.0 is the release where Hermes's strategic positioning becomes obvious. Exposing an OpenAI-compatible API endpoint is not just a convenience feature: every tool, library, SDK, and agent framework that speaks OpenAI can now use Hermes as its backend. Your Hermes install becomes a local AI platform that any existing OpenAI-compatible tool can adopt without code changes beyond a base URL.
The @file and @url context pattern borrowed from Claude Code is a smart ecosystem move; muscle memory transfers automatically. Gateway prompt caching preserving Anthropic cache across turns is the economics win that matters most for long debugging sessions: instead of paying full prompt cost for 50 turns over the same codebase, you pay incremental context. The GitHub Copilot OAuth provider is the detail worth calling out: routing agent work through your Copilot Pro subscription is effectively free additional model access for anyone already on that plan.
by Harsh Desai
About Hermes Agent
View the full Hermes Agent page →All Hermes Agent updatesMore AI news
- Daily RoundupVercel Sandbox drives, Gemma 4 QAT, and NVIDIA models trending on Hugging Face (agent tools today)
Vendors added persistent storage and quantized models while new text-to-speech and image models appeared on Hugging Face and Fal, with fresh agent tools listed on Product Hunt.
- Weekly DigestCursor enterprise orgs, Claude Code Bedrock auto mode, Codex Sites preview for quick deploys
Cursor added team management and design tools while Claude Code and Codex rolled out cloud integrations, safety checks, and deployment features across CLI and apps from 29 May to 5 June 2026.
- Daily RoundupGemini for Education hits Utah K-12, Nemotron 3 Ultra on Vercel, and agent workspaces from Product Hunt
Google expanded Gemini access in schools and infrastructure while Nvidia and Vercel enabled heavier agent models, and Product Hunt surfaced new local and collaborative agent tools.