limbo | Global Foundation-Model Progress News Site

Global foundation-model progress briefing

Today / Thursday, June 25, 2026

Data updated: Jun 25, 09:51 AM

Gemini is Google DeepMind's multimodal model family, covering text, images, audio, video, code, and search-related tasks. It is deeply integrated with Google Search, Android, Workspace, Chrome, YouTube, and Google Cloud, so it functions less as a standalone model line and more as AI infrastructure spanning consumer products and cloud services.

When tracking Gemini, the key areas to watch are video understanding, real-time multimodal interaction, mobile experiences, productivity-suite integration, and enterprise cloud deployment. Google's distribution reach is unusually strong, so capability improvements can spread quickly once they ship in existing products.

Multimodal interaction

Working with text, images, voice, and video so AI can move beyond plain chat.

Video capability

Understanding video content, actions, scenes, and timelines for education, creation, and analysis.

Google ecosystem

Search, Android, Chrome, Workspace, and cloud services help model capabilities reach users quickly.

Cloud deployment

Using cloud APIs, hosted inference, and enterprise platforms to bring models into applications.

Latest News

Frontier AI Desk | Jun 23, 02:00 AM | Heat 99

Frontier multimodal model release puts video understanding and live voice in focus

The model shows stronger cross-modal reasoning, pushing real-time assistants, education, and creator tools into the next phase.

OpenAI, Google DeepMind, Anthropic, GPT

United States | Original

arXiv Watch | Jun 22, 06:00 PM | Heat 85

New reasoning benchmark adds long-horizon planning and tool-verification tasks

Researchers argue that multiple-choice tests no longer capture what agentic systems can do, and that the new tasks are closer to real workflows.

Stanford, MIT, GPT, Claude

Global | Original