Gemini 3.5 Flash computer use, GLM 5.2 Fast on Wafer, and Hugging Face model trends
TL;DR
On 24 June, Gemini 3.5 Flash gained built-in computer use tools while GLM 5.2 Fast arrived on Vercel AI Gateway with strong throughput benchmarks. Hugging Face saw new trending models for coding and image tasks, Replicate added video and detection models, and Figma shipped code layers plus AI features.
What shipped
On 24 June, model releases and platform updates focused on practical agent capabilities and faster inference. Vendors emphasized speed and cost benchmarks over broad claims. Industry coverage highlighted competitive pressure from lower-priced models and researcher departures from Google.
Hugging Face trending
Three models gained traction on the Hub: a 27B coder model from Jackrong, a 35B agent model from Qwen, and a turbo image generator from Krea. A research paper on steerable vision-language-action models also appeared.
- Qwopus3.6-27B-Coder-Compat-MTP-GGUF: Jackrong released a 27B image-text-to-text model that supports fine-tuning and inference on Hugging Face for coding workflows with visual inputs.
- Qwen-AgentWorld-35B-A3B: Qwen released a 35B text-generation model on the Hub that targets agent-style tasks and runs via the transformers library for direct testing.
- Krea-2-Turbo: Krea released a text-to-image model on the Hub built with diffusers that supports quick fine-tuning for custom image generation pipelines.
- InSight VLA framework: A new paper describes a steerable vision-language-action model that learns manipulation skills beyond its training data for robotics use cases.
Vendor launches
Vercel added GLM 5.2 Fast to its AI Gateway with measured speed gains over other providers. Google introduced computer-use tools inside Gemini 3.5 Flash. NVIDIA reported infrastructure wins across supercomputers and AWS partnerships for scaled inference.
- GLM 5.2 Fast on AI Gateway: Wafer made GLM 5.2 Fast available on Vercel with 170-200 tokens per second in small and large context tests, giving builders a faster option for tool-calling workloads.
- Computer use in Gemini 3.5 Flash: Google added native computer-use tools to Gemini 3.5 Flash that let the model control screens and run tasks directly for automation scripts.
- NVIDIA TOP500 share: NVIDIA systems now power more than 400 of the 500 fastest supercomputers, showing continued dominance in large-scale training clusters.
- NVIDIA and AWS collaboration: NVIDIA and AWS expanded joint offerings for low-latency inference and vector search on OpenSearch and EC2, aimed at production AI deployments.
Replicate new models
Replicate added a lower-cost video model from Bytedance and an object detection model from Ultralytics. The video model handles multimodal inputs and native audio for high-volume generation.
- seedance-2.0-mini: Bytedance released a lower-cost video generation model on Replicate that accepts multimodal inputs and produces native audio for batch content creation. It is available for direct calls through Replicate, so builders can integrate it into existing stacks without new infrastructure.
- yolo26: Ultralytics released YOLO26 on Replicate for image detection tasks, with an adjustable IoU threshold and other parameters, callable via the HTTP API.
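The IoU setting mentioned for yolo26 refers to intersection over union, the standard overlap measure between two bounding boxes. How the Replicate model applies its threshold is not covered here, so the sketch below only shows the general definition. The `Box` layout, the function names, and the 0.5 default are illustrative assumptions, not part of the Replicate or Ultralytics API.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])

    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def same_object(a: Box, b: Box, iou_threshold: float = 0.5) -> bool:
    """Hypothetical helper: treat two boxes as one object above the threshold."""
    return iou(a, b) >= iou_threshold
```

A higher threshold keeps more overlapping boxes as separate detections; a lower one merges them more aggressively.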
Product Hunt picks
Three AI tools appeared on Product Hunt for search, fitness, and coding topics. They target Slack integration, Apple Watch workouts, and agentic coding discussion.
- Nimt: An AI search coworker launched that works inside Slack for team knowledge retrieval without leaving the chat.
- Off Autopilot: A curated newsletter launched that collects human-written articles on agentic coding practices.
Industry news
Figma added code layers and AI plug-in support. Snowflake benchmarked GLM-5.2 against Claude Opus 4.7 and found it competitive at lower cost. OpenAI updated GPT-5.5 Instant for better intent handling. Reports noted researcher departures from Google and company efforts to control AI token spend.
- Facebook AI companion app: Facebook began testing an AI companion app for creators that embeds its recent creator assistant for content planning.
- Figma update at Config 2026: Figma released code layers, animation support, and AI plug-in creation tools that turn the canvas into a fuller workspace.
- GLM-5.2 Snowflake benchmark: Zhipu AI's GLM-5.2 matched Claude Opus 4.7 on 103 coding tasks at one-fifth the cost, though it used more tokens per task.
- Figma AI dependency: Figma's new AI features rely on external API providers while one of those providers builds competing design tools.
- OpenAI GPT-5.5 Instant update: OpenAI improved GPT-5.5 Instant with stronger multi-turn context and complex prompt handling for everyday ChatGPT use.
- Token spend controls: Companies began limiting employee AI usage after small tasks drove up token budgets beyond forecasts.
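The GLM-5.2 result combines two effects: a lower per-token price and a higher token count per task. A rough way to reason about it is that the cost ratio equals the token ratio times the price ratio. The sketch below works through that arithmetic. Every number in it is hypothetical and chosen only to land on a one-fifth ratio; none of them come from the Snowflake benchmark or from any vendor's price list.

```python
def cost_per_task(tokens_per_task: float, usd_per_million_tokens: float) -> float:
    """Cost in USD of one task, given its token count and a per-million-token price."""
    return tokens_per_task * usd_per_million_tokens / 1_000_000


def cost_ratio(token_ratio: float, price_ratio: float) -> float:
    """Relative cost of model B vs model A: (tokens B / tokens A) * (price B / price A)."""
    return token_ratio * price_ratio


# Hypothetical figures for illustration only.
baseline = cost_per_task(tokens_per_task=100_000, usd_per_million_tokens=10.0)
challenger = cost_per_task(tokens_per_task=200_000, usd_per_million_tokens=1.0)

print(f"baseline:   ${baseline:.2f} per task")    # $1.00
print(f"challenger: ${challenger:.2f} per task")  # $0.20
print(f"ratio:      {challenger / baseline:.2f}")  # 0.20
```

In this made-up case the challenger uses twice the tokens but still costs one-fifth as much, because its per-token price is one-tenth. The same two inputs are what a token budget forecast needs: tokens per task and price per token, multiplied by task volume.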
Other
Databricks and AWS posted case studies on AI data work and agents. Klarna shared results from its LangGraph-based support assistant. Huntington Bank described redacting sensitive data across hundreds of millions of documents.
- Kythera Labs on Databricks: Kythera Labs described an AI-native approach that finds answers already present in enterprise data without new model training.
- Huntington Bank redaction: Huntington Bank used AWS tools to redact sensitive data from more than 400 million documents at scale.
- Healthcare appointment agent: AWS showed how to build a healthcare appointment agent with Amazon Nova 2 Sonic for automated scheduling.
- Daikin data pipelines: Daikin Applied Americas used Genie Code on Databricks to build consistent data pipelines with agentic engineering methods.
- Klarna AI assistant: Klarna reported 80 percent faster resolution times after deploying a LangGraph and LangSmith support agent for 85 million users.
What this means for you
For Vibe Builders: You can now test GLM 5.2 Fast and Gemini 3.5 Flash computer-use tools directly through existing gateways and playgrounds. Trending Hugging Face models and the new Replicate video option give quick ways to add image and video generation without new infrastructure. Figma's code layers and plug-in tools let you ship design workflows that mix AI with manual control.
For Non-techies: Gemini 3.5 Flash now handles screen tasks while GLM 5.2 Fast offers faster responses at lower cost on Vercel. Figma's update adds motion and AI features that simplify everyday design work. Klarna-style assistants and Replicate video tools show how AI is moving into customer support and content creation for small teams.
For Developers: GLM 5.2 Fast delivered 170-200 tokens per second on Wafer, giving a concrete benchmark against other providers for tool-calling workloads. NVIDIA's AWS partnership and Replicate model drops provide direct paths to test scaled inference and video generation. Watch researcher moves and token-control policies as signals that production stacks will face tighter cost and reliability constraints this month.
What to watch next
Watch for production benchmarks on GLM 5.2 Fast versus Claude Opus 4.7. Track Figma plugin adoption and any new agent tools from Google or Vercel. Monitor Replicate run counts on seedance-2.0-mini for adoption signals.
Harsh’s take
The day showed clear price pressure from Chinese models and infrastructure wins for NVIDIA, yet most launches still rely on rented APIs rather than owned intelligence. This creates margin risk for platforms like Figma while giving builders faster options for testing. The second-order effect is tighter internal budgets on token spend and more selective model choices.
Builders should run a direct throughput test of GLM 5.2 Fast against their current stack this week and log any cost or latency gains before committing to new agent features.
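One way to run that throughput test is a small provider-agnostic harness, sketched below. It assumes you can wrap your client in a function that yields output pieces as they stream in; `stream_tokens` and `measure_throughput` are hypothetical names, and no specific gateway or SDK call is implied.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def measure_throughput(
    stream_tokens: Callable[[], Iterable[str]],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Optional[float]]:
    """Time one streamed response and report latency and pieces per second.

    `stream_tokens` is a hypothetical zero-argument wrapper around your own
    client that yields output pieces as they arrive.
    """
    start = clock()
    first_piece_at: Optional[float] = None
    count = 0

    for _ in stream_tokens():
        if first_piece_at is None:
            first_piece_at = clock()  # time to first piece
        count += 1

    total = clock() - start
    return {
        "tokens": count,
        "time_to_first_token_s": None if first_piece_at is None else first_piece_at - start,
        "total_s": total,
        "tokens_per_s": count / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    # Stand-in stream so the sketch runs without any network access.
    def fake_stream() -> Iterable[str]:
        for word in "a quick stand-in response".split():
            time.sleep(0.01)
            yield word

    print(measure_throughput(fake_stream))
```

Streamed chunks are not always single tokens, so where a provider reports token usage, prefer that count over the number of chunks. Run the same prompts against each stack several times and log the results side by side before drawing conclusions.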
by Harsh Desai
Sources
Hugging Face trending
- Qwopus3.6-27B-Coder-Compat-MTP-GGUF by Jackrong trends on HuggingFace
- Qwen-AgentWorld-35B-A3B by Qwen trends on HuggingFace
- Krea-2-Turbo by krea trends on HuggingFace
- InSight: Self-Guided Skill Acquisition via Steerable VLAs
Vendor launches
- GLM 5.2 Fast via Wafer now available on AI Gateway
- Introducing computer use in Gemini 3.5 Flash
- Google Wallet makes TSA PreCheck Touchless ID available for more travelers
- NVIDIA Powers Over 400 of the World’s 500 Fastest Supercomputers
- NVIDIA and AWS Collaborate to Bring AI to Production at Scale
Industry news
- Facebook rolls out an AI companion app for creators
- Figma adds code layers, support for animations, more AI features in new update
- Snowflake CEO finds GLM-5.2 competitive with Opus 4.7 at a fraction of the cost
- Figma bets on human judgment at Config 2026 while the AI powering its canvas belongs to someone else
- I Met With China’s Top AI Experts. They’re Freaking Out, Too
- OpenAI says ChatGPT Instant now better understands what users actually want
- Quoting Tom MacWright
- AI was supposed to kill engineering jobs, but new data suggests they’re the most resilient
- AI researchers continue to leave Google for its rivals
- Companies are scrambling to stop employees from maxing out AI budgets with small tasks
- A24 Knows You’re Mad About the Google AI Collab
Other
- What if the answer was already in your data?
- How IEEE Awardee Karen Panetta Became Bewitched by Engineering
- Huntington Bank: Redacting sensitive data from 400M+ documents with AWS
- Build a healthcare appointment agent with Amazon Nova 2 Sonic
- How Daikin Applied Americas builds consistent data pipelines at scale with Genie Code
- How Klarna's AI assistant redefined customer support at scale for 85 million active users