Gemini 3.5 Flash on Replicate, GLM-5.2 trends on Hugging Face, and agent tools for daily use
TL;DR
Google released multiple Gemini models on Replicate while Hugging Face saw five new text and speech models climb the charts. Product Hunt featured dictation, ad agents, and visible agent builders alongside an Amazon OpenAI deal update and an IEEE training course.
What shipped
On 19 June several model releases and platform updates reached builders and business users at once. Google placed four Gemini variants on Replicate for immediate API calls. Trending models on Hugging Face, new Product Hunt tools, and one industry partnership change rounded out the day.
Hugging Face trending
Five models rose on Hugging Face Hub today, led by three text-generation entries and one text-to-speech model. The GGUF and FP8 formats make them ready for local runs or fine-tuning. One research paper on policy-adherent agents also appeared in the feed.
- •Gemma-4-12B GGUF model yuxinlu1 placed a 12B text-generation model in GGUF format on the Hub where it now trends. Vibe Builders can download it for offline inference or quick fine-tuning runs.
- •GLM-5.2-FP8 model zai-org released an FP8 version of GLM-5.2 that is climbing the text-generation charts. Developers can pull it through the transformers library for faster local testing.
- •Qwen3.6-27B GGUF model bytkim uploaded a 27B Qwen variant in GGUF format that is trending for text tasks. SMB owners gain a ready option for document or chat workloads without cloud calls.
- •Inflect-Nano-v1 model owensong posted a PyTorch text-to-speech model that is gaining traction on the Hub. Teams can test it for voice output in apps that need multiple languages.
- •LedgerAgent paper A new paper describes how to keep task states consistent in policy-bound customer-service agents. Builders can review the approach before adding tool-calling logic to their own agents.
Product Hunt picks
Seven tools appeared on Product Hunt, with four aimed at faster content or agent creation. Google Ads and Claude entries target ad decisions and live code previews. Two entries focus on visible agent operation and large file transfer.
- •Mutter AI Dictation The tool turns spoken thoughts into polished text without extra editing steps. SMB owners can dictate meeting notes or emails and receive clean drafts in one pass.
- •Ask Ad Manager Google Ads released a Gemini agent that answers performance questions and suggests next moves. Ad teams can query campaign data in plain language instead of building reports.
- •Claude Code Artifacts The feature lets users preview and share code changes live while they edit. Developers can show working demos to clients without deploying separate environments.
- •Unreal Engine 5.8 The update adds AI agents that help build game levels and behaviors. Studios can test agent-driven design loops inside the same editor they already use.
- •Foglamp The platform makes agent actions visible during runs so teams can watch and adjust them. Operators gain a dashboard view instead of tracing logs after failures.
Industry news
IEEE launched a virtual course on large language models for working engineers. MoonshotAI placed a free long-context coding model on OpenRouter that supports 262k tokens at no cost.
- •IEEE LLM course The new training program teaches engineers how to use large language models for code review and specification writing. Participants gain practical workflows they can apply to existing projects.
- •Kimi K2.7 Code model MoonshotAI released a free coding model on OpenRouter with 262k context and zero per-token fees. Developers can run extended programming tasks without paying for context length.
Other
Google added four Gemini models to Replicate in a single day, giving builders direct HTTP access to fast and reasoning-focused variants. The releases cover text, speech, and multimodal tasks. Existing Replicate token holders can call them immediately.
- •gemini-3.5-flash model Google released its fast multimodal model on Replicate for agent, coding, and long-context work. Vibe Builders can swap it into existing Replicate calls for quicker responses.
- •gemini-3.1-flash-tts model The text-to-speech version offers 30 voices across more than 70 languages on Replicate. Teams can add voice output to apps without managing separate speech services.
- •gemini-3.1-pro model Google placed its strongest reasoning model on Replicate with a new medium thinking level. Developers can test deeper analysis tasks against their current model stack.
- •gemini-3-flash model The speed-focused model with strong search and grounding landed on Replicate today. Builders gain a low-latency option for grounded answers inside production flows.
Replicate new models
- •gemini-3.5-flash dropped on Replicate today google/gemini-3.5-flash dropped on Replicate. Google's fast multimodal model with frontier reasoning across agents, coding, and long-context tasks. Vibe Builders can call this model directly via Replicate's HTTP API or the existing token in their stack.
- •gemini-3.1-flash-tts dropped on Replicate today google/gemini-3.1-flash-tts dropped on Replicate. Google's fast, expressive text-to-speech model with 30 voices and 70+ language support. Vibe Builders can call this model directly via Replicate's HTTP API or the existing token in their stack.
- •gemini-3.1-pro dropped on Replicate today google/gemini-3.1-pro dropped on Replicate. Google's most intelligent model, with improved reasoning and a new medium thinking level. Vibe Builders can call this model directly via Replicate's HTTP API or the existing token in their stack.
- •gemini-3-flash dropped on Replicate today google/gemini-3-flash dropped on Replicate. Google's most intelligent model built for speed with frontier intelligence, superior search, and grounding. Vibe Builders can call this model directly via Replicate's HTTP API or the existing token in their stack.
What this means for you
For Vibe Builders: You can now call Gemini 3.5 Flash and the TTS variant directly through Replicate without new accounts. Trending GGUF models on Hugging Face give you local text and voice options to test in the same week. Product Hunt tools like Mutter and Foglamp let you add dictation or visible agents to client projects today.
For Non-techies: Dictation tools on Product Hunt turn spoken notes into clean text for daily business writing. Google Ads agent answers campaign questions in plain language so you skip report building. Free long-context models on OpenRouter remove cost barriers for document or chat tasks.
For Developers: Four Gemini models on Replicate let you benchmark speed and reasoning against your current stack this week. GGUF and FP8 entries on Hugging Face plus the IEEE course supply concrete local and training paths. The LedgerAgent paper and Kimi K2.7 Code release give specific signals for policy tools and long-context coding before you integrate.
What to watch next
Track whether the new Gemini models on Replicate keep top spots in usage logs. Watch Hugging Face for fine-tunes of the trending GLM and Qwen entries. Check Product Hunt for follow-up launches from the visible-agent and dictation tools.
Harsh’s take
The day showed a clear split between hosted frontier access and local or low-cost options. Google pushed multiple Gemini variants to Replicate while smaller GGUF models climbed charts on Hugging Face, giving builders parallel paths instead of one dominant route. The Amazon film cancellation underlines how commercial deals can quietly shape public narratives around the same companies releasing the models.
The practical signal is to test one hosted Gemini call and one local GGUF download in the same workflow this week. Measure latency and cost on a single task before committing further resources.
by Harsh Desai
Sources
Hugging Face trending
- •gemma-4-12B-agentic-fable5-composer2.5-v2-3.5x-tau2-GGUF by yuxinlu1 trends on HuggingFace
- •GLM-5.2-FP8 by zai-org trends on HuggingFace
- •Qwen3.6-27B-MTP-pi-tune-GGUF by bytkim trends on HuggingFace
- •Inflect-Nano-v1 by owensong trends on HuggingFace
- •LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
Product Hunt picks
- •Mutter AI Dictation
- •QuackScreen
- •Ask Ad Manager by Google Ads
- •Claude Code Artifacts
- •Unreal Engine 5.8
- •Foglamp
- •just f
Industry news
Other
- •IEEE Rolls Out Large Language Models Virtual Training Course
- •MoonshotAI: Kimi K2.7 Code (free) now available on OpenRouter (262k context, $0.00/M in, $0.00/M out)
Replicate new models
More AI news
- Weekly DigestClaude Fable 5 model, Cursor cloud agents, and Codex CLI relays for daily builds
Claude Code rolled out the Fable 5 model and deep sub-agent nesting while Cursor and Codex added cloud execution and CLI handoffs across the week.
- FeaturePerplexity adds self-improving memory capabilities to its AI agents
Perplexity introduces self-improving memory capabilities for AI agents that learn and adapt from past interactions.
- FeatureLovable Build with URL links now reference public web pages
Lovable updated Build with URL links to reference public web pages alongside images. The feature uses the page layout, content, and styling to recreate or iterate on it.