Skip to content
Improve OpenAI streaming usage reporting | My AI Guide
App UpdateOpenClawv2026.4.19-beta.2

Improve OpenAI streaming usage reporting

By Harsh Desai
Share

TL;DR

Always send `stream_options.include_usage` on streaming requests to ensure local and custom OpenAI-compatible backends report accurate context usage instead of defaulting to 0%.

## What changed OpenClaw now forces stream_options.include_usage on every streaming request to OpenAI and OpenAI-compatible endpoints. The update shipped May 19 2026. Local and custom backends that previously returned zero context usage now report actual token counts.

No user configuration is required. The flag is added automatically in the request body for all streaming calls.

## Why it matters Accurate usage data removes a common blind spot for builders who route traffic between paid APIs and self-hosted models. Without the flag, token tracking stayed broken and bills could jump without clear signals.

The change strengthens OpenClaw's position as a reliable daily driver for mixed-provider setups. It pressures other agent tools to fix the same reporting gap or accept higher support load from confused users.

## How to use it Pull the latest OpenClaw release through the existing CLI command. Restart the agent and send a test streaming prompt to any connected backend. Check the response metadata or provider dashboard for non-zero usage values.

The fix works on current stable versions and requires no YAML edits or new environment variables.

## Watch for Stable usage numbers across Ollama, LM Studio, and custom proxies will confirm the fix holds. A provider that still ignores the flag or returns inflated counts would break the improvement. Expect similar default flags for other streaming options in the next release cycle.

Harshs take

This patch removes one source of hidden cost for anyone running OpenClaw with multiple backends. The real trade-off is that every streaming call now carries a small extra payload, which adds up if you keep high-frequency heartbeats active.

Solo operators who treat OpenClaw as a 24/7 assistant should treat the update as a reminder that token visibility is still their responsibility, not the tool's. Set provider-side hard limits today rather than waiting for the next surprise invoice.

Do this now: review the last 30 days of usage logs from every connected model and add alerts at 70 percent of your monthly budget.

by Harsh Desai

Source:myaiguide.co

About OpenClaw

View the full OpenClaw page →All OpenClaw updates

More from OpenClaw

  • Feature
    Expand QA-Lab with runtime parity scenarios

    Added comprehensive runtime parity tiers and token-efficiency artifacts to the QA-Lab, including specific checks for Codex-vs-Pi compatibility and tool fixture coverage.

  • App Update
    Update Node.js requirement and Pi packages

    Raised the minimum supported Node.js version to 22.19 and updated Pi packages to version 0.75.1 to ensure compatibility with the latest runtime features.

  • App Update
    Optimize Gateway startup and restart latency

    Reduced restart ready latency by overlapping startup logging and plugin-service initialization with channel sidecars while maintaining strict readiness gating.

Everything AI. One email.
Every Monday.

New tools. Model launches. Plugins. Repos. Tactics. The moves the sharpest builders are making right now, before everyone else.

No spam. Unsubscribe anytime.