BriefGPT - AI Paper Digest

Embodied Evaluation: Assessing Multimodal Large Language Models as Embodied Agents

💡 Original in English, about 100 words, roughly a 1-minute read.

📝 Summary

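This study introduces EmbodiedEval, an evaluation benchmark comprising 328 tasks across 125 3D scenes. It broadens the diversity of evaluation for multimodal large language models and reveals their shortcomings on embodied tasks.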
