Gemini 3.5 Flash computer use, GLM 5.2 Fast on Wafer, and Hugging Face model trends

What shipped

On 24 June, model releases and platform updates focused on practical agent capabilities and faster inference. Vendors emphasized speed and cost benchmarks over broad claims. Industry coverage highlighted competitive pressure from lower-priced models and researcher departures.

Hugging Face trending

Three models and one research paper gained traction on Hugging Face across image, text, and action tasks. The model releases are a 27B coder model from Jackrong, a 35B agent model from Qwen, and a turbo image generator from Krea. The paper covers steerable vision-language-action models.

•Qwopus3.6-27B-Coder-Compat-MTP-GGUF: Jackrong released a 27B image-text-to-text model that supports fine-tuning and inference on Hugging Face for coding workflows with visual inputs.
•Qwen-AgentWorld-35B-A3B: Qwen released a 35B text-generation model on the Hub that targets agent-style tasks and runs via the transformers library for direct testing.
•Krea-2-Turbo: Krea released a text-to-image model on the Hub built with diffusers that supports quick fine-tuning for custom image generation pipelines.
•InSight VLA framework: A new paper describes a steerable vision-language-action model that learns manipulation skills beyond its training data for robotics use cases.

Vendor launches

Vercel added GLM 5.2 Fast to its AI Gateway with measured speed gains over other providers. Google introduced computer-use tools inside Gemini 3.5 Flash. NVIDIA reported infrastructure wins across supercomputers and AWS partnerships for scaled inference.

•GLM 5.2 Fast on AI Gateway: Wafer made GLM 5.2 Fast available on Vercel at 170-200 tokens per second in both small- and large-context tests, giving builders a faster option for tool-calling workloads.
•Computer use in Gemini 3.5 Flash: Google added native computer-use tools to Gemini 3.5 Flash that let the model control screens and run tasks directly, which suits automation scripts.
•NVIDIA TOP500 share: NVIDIA systems now power 400 of the 500 fastest supercomputers, showing continued dominance in large-scale training clusters.
•NVIDIA and AWS collaboration: NVIDIA and AWS expanded joint offerings for low-latency inference and vector search on OpenSearch and EC2, aimed at production AI deployments.
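To put the 170-200 tokens per second figure in context, the sketch below converts throughput into approximate generation time. The 1,000-token response size is a hypothetical example, and the calculation assumes a steady decode rate while ignoring network latency and time to first token.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Approximate time to stream `output_tokens` at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


# Reported range for GLM 5.2 Fast on Wafer; the response size is hypothetical.
slow, fast = 170.0, 200.0
response_tokens = 1_000

print(f"{generation_seconds(response_tokens, slow):.2f} s at {slow:.0f} tok/s")
print(f"{generation_seconds(response_tokens, fast):.2f} s at {fast:.0f} tok/s")
```

Under these assumptions, the gap between the two ends of the reported range is a little under one second per 1,000 output tokens.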

Replicate new models

Replicate added a lower-cost video model from ByteDance and an object detection model from Ultralytics. The video model handles multimodal inputs and native audio for high-volume generation.

•seedance-2.0-mini: ByteDance released a lower-cost video generation model on Replicate that accepts multimodal inputs and produces native audio for batch content creation. It is available for direct calls through Replicate, so builders can integrate it into existing stacks without new infrastructure.
•yolo26: Ultralytics released YOLO26 on Replicate for object detection, with adjustable IoU and other parameters, callable via the HTTP API.
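IoU (intersection over union) measures how much two bounding boxes overlap, and detectors commonly use an IoU threshold to decide when two boxes refer to the same object. A minimal sketch of the calculation follows, assuming boxes in `(x1, y1, x2, y2)` corner format. The function name and sample boxes are illustrative, and this is not Ultralytics' implementation.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed corner format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: two 10x10 squares offset by 5 units horizontally.
# The overlap area is 50 and the union area is 150.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

A higher IoU threshold tolerates more overlap between boxes before they are treated as duplicates; a lower one merges them more aggressively.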

Product Hunt picks

Three AI tools appeared on Product Hunt for search, fitness, and coding topics. They target Slack integration, Apple Watch workouts, and agentic coding discussion.

•Nimt: An AI search coworker that works inside Slack, letting teams retrieve knowledge without leaving the chat.
•Off Autopilot: A curated newsletter that collects human-written articles on agentic coding practices.

Industry news

Figma added code layers and AI plug-in support. Snowflake benchmarked GLM-5.2 against Claude Opus 4.7 at lower cost. OpenAI updated GPT-5.5 Instant for better intent handling. Reports noted researcher departures from Google and company efforts to control AI token spend.

•Facebook AI companion app: Facebook began testing an AI companion app for creators that embeds its recent creator assistant for content planning.
•Figma update at Config 2026: Figma released code layers, animation support, and AI plug-in creation tools that turn the canvas into a fuller workspace.
•GLM-5.2 Snowflake benchmark: Zhipu AI's GLM-5.2 matched Claude Opus 4.7 on 103 coding tasks at one-fifth the cost, though it used more tokens per task.
•Figma AI dependency: Figma's new AI features rely on external API providers while one of those providers builds competing design tools.
•OpenAI GPT-5.5 Instant update: OpenAI improved GPT-5.5 Instant with stronger multi-turn context and complex prompt handling for everyday ChatGPT use.
•Token spend controls: Companies began limiting employee AI usage after small tasks drove up token budgets beyond forecasts.
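The Snowflake result combines two effects that pull in opposite directions: a lower price per token and a higher token count per task. The sketch below shows how they can net out. All prices and token counts are hypothetical placeholders chosen only to illustrate the arithmetic, not figures from the benchmark.

```python
def cost_per_task(tokens_per_task: float, usd_per_million_tokens: float) -> float:
    """Cost of one task given its token usage and a blended per-token price."""
    return tokens_per_task * usd_per_million_tokens / 1_000_000


# Hypothetical inputs, not benchmark data.
baseline = cost_per_task(tokens_per_task=20_000, usd_per_million_tokens=15.0)
cheaper = cost_per_task(tokens_per_task=30_000, usd_per_million_tokens=2.0)

print(f"baseline: ${baseline:.3f} per task")
print(f"cheaper:  ${cheaper:.3f} per task")
print(f"ratio:    {cheaper / baseline:.2f}")
```

With these placeholder numbers, the cheaper model uses 1.5 times the tokens yet costs one-fifth as much per task, because its per-token price is 7.5 times lower. The same function can be used to forecast a token budget by multiplying the per-task cost by expected task volume.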

Other

Databricks and AWS posted case studies on AI data work and agents. Klarna shared results from its LangGraph-based support assistant. Huntington Bank described redacting sensitive data across hundreds of millions of documents.

•Kythera Labs on Databricks: Kythera Labs described an AI-native approach that finds answers already present in enterprise data without new model training.
•Huntington Bank redaction: Huntington Bank used AWS tools to redact sensitive data from more than 400 million documents at scale.
•Healthcare appointment agent: AWS showed how to build a healthcare appointment agent with Amazon Nova 2 Sonic for automated scheduling.
•Daikin data pipelines: Daikin Applied Americas used Genie Code on Databricks to build consistent data pipelines with agentic engineering methods.
•Klarna AI assistant: Klarna reported 80 percent faster resolution times after deploying a LangGraph and LangSmith support agent for 85 million users.

What this means for you

For Vibe Builders: You can now test GLM 5.2 Fast and Gemini 3.5 Flash computer-use tools directly through existing gateways and playgrounds. Trending Hugging Face models and the new Replicate video option give quick ways to add image and video generation without new infrastructure. Figma's code layers and plug-in tools let you ship design workflows that mix AI with manual control.

For Non-techies: Gemini 3.5 Flash now handles screen tasks while GLM 5.2 Fast offers faster responses at lower cost on Vercel. Figma's update adds motion and AI features that simplify everyday design work. Klarna-style assistants and Replicate video tools show how AI is moving into customer support and content creation for small teams.

For Developers: GLM 5.2 Fast delivered 170-200 tokens per second on Wafer, giving a concrete benchmark against other providers for tool-calling workloads. NVIDIA's AWS partnership and Replicate model drops provide direct paths to test scaled inference and video generation. Watch researcher moves and token-control policies as signals that production stacks will face tighter cost and reliability constraints this month.

What to watch next

Watch for production benchmarks on GLM 5.2 Fast versus Claude Opus 4.7. Track Figma plugin adoption and any new agent tools from Google or Vercel. Monitor Replicate run counts on seedance-2.0-mini for adoption signals.

Harsh’s take

The day showed clear price pressure from Chinese models and infrastructure wins for NVIDIA, yet most launches still rely on rented APIs rather than owned intelligence. This creates margin risk for platforms like Figma while giving builders faster options for testing. The second-order effect is tighter internal budgets on token spend and more selective model choices.

Builders should run a direct throughput test of GLM 5.2 Fast against their current stack this week and log any cost or latency gains before committing to new agent features.
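A minimal sketch of such a throughput test follows, assuming a hypothetical `generate` callable that wraps whatever client you already use and returns the number of output tokens it produced. The `fake_generate` stub stands in for a real provider call so the harness runs on its own; swap in your own wrapper to compare providers.

```python
import time
from statistics import mean
from typing import Callable, List


def measure_tokens_per_second(
    generate: Callable[[str], int], prompt: str, runs: int = 3
) -> float:
    """Average output tokens per second over several timed calls.

    `generate` is a hypothetical wrapper around your own client; it must
    return the number of output tokens produced for `prompt`.
    """
    rates: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        tokens = generate(prompt)
        elapsed = time.perf_counter() - start
        rates.append(tokens / elapsed)
    return mean(rates)


def fake_generate(prompt: str) -> int:
    """Stub provider: pretends to emit 50 tokens in about 0.05 seconds."""
    time.sleep(0.05)
    return 50


rate = measure_tokens_per_second(fake_generate, "Summarize this ticket.")
print(f"{rate:.0f} tokens/s (stub provider)")
```

This measures end-to-end time, so network latency and time to first token are included in the result. Use the same prompts and output lengths for each provider so the numbers are comparable.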

by Harsh Desai
