BriefGPT - AI Paper Digest

Can Large Language Models Be Trusted for Evaluating Retrieval-Augmented Generation Systems? A Survey of Methods and Datasets

💡 Original text in English, about 100 words, roughly a 1-minute read.

Summary

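This study examines evaluation methods for retrieval-augmented generation (RAG) systems. It analyzes 63 academic articles, proposes a novel automated evaluation method, and stresses the importance of domain-specific datasets in benchmarking, offering more rigorous guidance for evaluating RAG systems.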

