Add Gemini text-to-speech support
TL;DR
Integrate Gemini TTS into the bundled Google plugin, supporting voice selection, WAV/PCM output, and automated setup guidance.
## What changed
OpenClaw added Gemini text-to-speech support to its bundled Google plugin on 18 May 2026. The update lets users pick from Gemini voices and output audio as WAV or PCM files. Automated setup guidance appears in the CLI during configuration.
The change ships as a free update for all self-hosted installs. No new pricing tier is required.
## Why it matters
Voice output turns OpenClaw from a text-only responder into an agent that can deliver briefings or alerts as spoken audio inside the same messaging apps users already check. This reduces context switching for Vibe Builders who run the agent on WhatsApp or Telegram.
The move pressures closed cloud agents to match multimodal output quickly. It also bets that local control over voice generation will matter more than raw model quality for daily personal-assistant tasks.
## How to use it
Pull the latest OpenClaw release through the existing CLI command. Run the Google plugin setup again and select Gemini as the TTS provider when prompted. Choose a voice name and set the output format to WAV or PCM in the YAML file.
Restart the agent. Test by sending a message that triggers a spoken reply. The feature works on any plan that already connects to Google models.
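The choice between WAV and PCM comes down to whether the audio carries a header. Raw PCM is just the samples, so a player has to be told the sample rate and bit depth, while WAV wraps the same samples in a container that records them. The sketch below is not OpenClaw code; it uses Python's standard `wave` module to show the difference, and the 24 kHz, 16-bit, mono defaults are assumptions that should be checked against the provider's documentation.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container.

    The defaults (24 kHz, mono, 16-bit) are assumptions for illustration.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


# Half a second of 16-bit mono silence stands in for real TTS output.
silence = b"\x00\x00" * 12000
wav_bytes = pcm_to_wav(silence)
```

If the messaging app or player on the receiving end expects a self-describing file, WAV is the safer pick; PCM suits pipelines that already know the audio parameters.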
## Watch for
The bet holds if users report reliable voice check-ins without extra token spend. Watch for quality drops or sudden API rate limits that break scheduled audio briefings. Expect a follow-up integration with at least one non-Google TTS option within the next quarter.
Harsh’s take
Gemini TTS inside OpenClaw gives a solo operator spoken reminders without opening another app, yet it still routes every word through Google's paid endpoint. The real trade-off is added latency and cost versus the simplicity of text-only heartbeats that already work.
Most Vibe Builders will hit the same wall they always do with self-hosted agents: setup friction plus unpredictable bills once audio generation starts running daily. Skip the feature if your current text flow already covers the job.
Update the Google plugin config this week and cap daily TTS usage at a fixed token budget before you turn it on.
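A daily cap can be as simple as a counter that resets each day and refuses requests once the budget is spent. The sketch below is a hypothetical guard, not part of OpenClaw or the Google plugin: the class name `DailyTtsBudget` and its method are made up for illustration, and it counts characters as a rough stand-in for tokens, which is an assumption.

```python
from datetime import date
from typing import Optional


class DailyTtsBudget:
    """Hypothetical guard that caps text sent to TTS per calendar day."""

    def __init__(self, max_chars_per_day: int) -> None:
        self.max_chars_per_day = max_chars_per_day
        self._day = date.today()
        self._used = 0

    def try_spend(self, text: str, today: Optional[date] = None) -> bool:
        """Return True and record the spend if the text fits today's budget."""
        today = today or date.today()
        if today != self._day:
            # A new day started, so the counter resets.
            self._day = today
            self._used = 0
        cost = len(text)  # assumption: characters approximate token cost
        if self._used + cost > self.max_chars_per_day:
            return False
        self._used += cost
        return True
```

A caller would check `try_spend(reply_text)` before requesting audio and fall back to a plain text reply when it returns `False`.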
by Harsh Desai