
Hermes Agent v0.4.0 adds OpenAI-compatible API, 6 messaging platforms, and @file context

By Harsh Desai

TL;DR

Hermes Agent v0.4.0 (The Platform Expansion Release) shipped March 24, 2026. It exposes Hermes as a drop-in OpenAI-compatible /v1/chat/completions endpoint, adds 6 new messaging adapters (Signal, DingTalk, SMS via Twilio, Mattermost, Matrix, Webhook) bringing the count to 9, introduces @file and @url context injection, adds 4 new inference providers, and enables gateway prompt caching by default.

What changed

What shipped

Hermes Agent v0.4.0 (The Platform Expansion Release) shipped on March 24, 2026. This is the release where Hermes stops being "an AI agent" and becomes "a platform."

OpenAI-compatible API server

Hermes now exposes itself as a drop-in OpenAI-compatible /v1/chat/completions endpoint. Any tool that speaks OpenAI can point at Hermes instead. A companion /api/jobs REST API covers cron job management. If you have existing code that calls OpenAI, you can swap the base URL and point at your Hermes instance.
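In practice the swap is just re-aiming a standard chat completions request at the Hermes base URL. A minimal stdlib sketch, assuming a local Hermes instance on a placeholder port (the real port is configurable) and a placeholder model name:

```python
import json
import urllib.request

# Placeholder address for a local Hermes instance; the actual port is configurable.
HERMES_BASE_URL = "http://localhost:8000"

def chat_request(messages, model="hermes", base_url=HERMES_BASE_URL):
    """Build an OpenAI-style chat completions request aimed at Hermes.

    Any OpenAI-compatible client works the same way: keep the payload
    shape, swap the base URL.
    """
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request([{"role": "user", "content": "Hello, Hermes"}])
# Send with urllib.request.urlopen(req) once the server is running.
```

With the official OpenAI SDK the same swap is passing the Hermes address as base_url to the client constructor; no other code changes.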

6 new messaging adapters

v0.4.0 brings the messaging channel count to 9:

  1. Signal
  2. DingTalk
  3. SMS via Twilio
  4. Mattermost
  5. Matrix
  6. Webhook (generic HTTP)

Combined with pre-existing Telegram, Discord, and WhatsApp, Hermes now reaches almost every significant chat ecosystem.

@ context references

Claude Code-style @file and @url context injection with CLI tab completion. Type @ and hit tab to reference a file in your project or a URL; Hermes injects the content into the prompt automatically. The syntax mirrors Claude Code's so the muscle memory transfers.
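The release notes do not show Hermes's resolver, but the injection pattern itself is simple. A rough illustration covering file references only (function name and regex are hypothetical; the real feature also resolves @url and drives tab completion):

```python
import re
from pathlib import Path

def expand_file_refs(prompt: str) -> str:
    """Replace @path tokens in a prompt with the referenced file's contents.

    Illustrative only: Hermes's actual resolver also handles @url references
    and CLI tab completion, neither of which is sketched here.
    """
    def substitute(match: re.Match) -> str:
        path = Path(match.group(1))
        if path.is_file():
            return path.read_text()
        return match.group(0)  # leave unresolvable references untouched
    return re.sub(r"@(\S+)", substitute, prompt)
```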

4 new inference providers

  • GitHub Copilot via OAuth (use your Copilot subscription as a provider).
  • Alibaba Cloud / DashScope for access to Qwen and other Alibaba models.
  • Kilo Code for the Kilo agent family.
  • OpenCode Zen/Go for OpenCode ecosystem models.

Gateway prompt caching

A per-session AIAgent cache preserves the Anthropic prompt cache across turns. For long conversations (debugging sessions, multi-turn research), this is a meaningful cost reduction. Streaming is now enabled by default, and the release includes 200+ bug fixes.
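On the Anthropic side, the mechanism being preserved is the cache_control marker on large, stable prompt blocks. A sketch of such a payload (the model name is a placeholder, and Hermes's internal gateway wiring is not shown):

```python
def cached_payload(system_text: str, user_text: str) -> dict:
    """Anthropic Messages-style payload with a cacheable system block.

    The gateway's job is to keep this cache warm across turns so that
    later requests re-read the system block at the cheaper cached rate.
    """
    return {
        "model": "claude-model-placeholder",  # placeholder, not a real model id
        "max_tokens": 1024,
        "system": [
            {
                "type": "text",
                "text": system_text,
                # Marks this block as cacheable on Anthropic's API.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }
```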

Availability

Standard upgrade path. OpenAI-compatible API server is opt-in: hermes api serve starts it on a configurable port. Existing direct API usage continues to work unchanged.

Who this matters for

  • Vibe Builder: @file and @url context references work the same way as in Claude Code, so muscle memory transfers. Signal, DingTalk, and Matrix are all reachable.
  • Basic User: Hermes now reaches you on 9 messaging platforms. Pick the one you already use; no new app required.
  • Developer: OpenAI-compatible /v1/chat/completions endpoint makes Hermes a drop-in replacement for any OpenAI call. GitHub Copilot OAuth provider, Alibaba DashScope, Kilo Code, and OpenCode Zen added.

What to watch next

v0.4.0 is the release where Hermes's strategic positioning becomes obvious. Exposing an OpenAI-compatible API endpoint is not just a convenience feature. It means every tool, library, SDK, and agent framework that speaks OpenAI can now use Hermes as its backend. Your Hermes install becomes a local AI platform that every existing OpenAI-compatible tool can adopt.

The @file and @url context pattern borrowed from Claude Code is a smart ecosystem move: anyone who has spent time in Claude Code will find Hermes instantly familiar for context injection.

Gateway prompt caching preserving Anthropic cache across turns is the economics win that matters most for long conversations. Debugging sessions with Claude that touch the same codebase across 50 turns used to pay the full prompt cost every time. Now the cache persists and you pay for the incremental context, not the full base.
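As rough arithmetic, using multipliers that are assumptions based on Anthropic's published cache pricing (about 1.25x base input price for a cache write, about 0.1x for a cache read; substitute real rates):

```python
def conversation_cost(turns: int, base_tokens: int, incr_tokens: int,
                      price_per_mtok: float,
                      write_mult: float = 1.25, read_mult: float = 0.1):
    """Compare input-token cost with and without prompt caching.

    Model: every turn resends a large base context plus a small increment.
    Without caching the base is paid at full price each turn; with caching
    it is written once, then read at the discounted rate on later turns.
    Multipliers are assumptions, not quoted from the release notes.
    """
    per_tok = price_per_mtok / 1_000_000
    no_cache = turns * (base_tokens + incr_tokens) * per_tok
    with_cache = (base_tokens * write_mult * per_tok          # one cache write
                  + (turns - 1) * base_tokens * read_mult * per_tok  # cheap reads
                  + turns * incr_tokens * per_tok)            # increments, uncached
    return no_cache, with_cache

# 50-turn debugging session over a 100k-token codebase context at $3/Mtok input.
nc, wc = conversation_cost(turns=50, base_tokens=100_000, incr_tokens=1_000,
                           price_per_mtok=3.0)
```

Under these assumed rates the cached conversation costs a fraction of the uncached one; the exact ratio depends on the provider's actual cache pricing.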

9 messaging channels is the coverage point where Hermes stops missing any significant ecosystem. Signal, DingTalk, SMS, Mattermost, Matrix, and Webhook, combined with the pre-existing Telegram, Discord, and WhatsApp, mean there is almost no user who cannot reach Hermes on the chat tool they already use. That reach is what makes "your AI lives wherever you are" a real claim rather than marketing.

The GitHub Copilot OAuth provider is the detail worth calling out. Treating a Copilot subscription as an inference provider lets you route agent work through credits you are already paying for. For developers already on Copilot Pro, this is additional model access at no extra cost.


Source: github.com

