BriefGPT - AI Paper Digest

IHEval: Evaluating Language Models on Following the Instruction Hierarchy


Summary

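This study addresses the lack of an evaluation benchmark for how well language models follow the instruction hierarchy. It introduces IHEval, a new benchmark of 3,538 examples across nine tasks, covering cases where instructions of different priorities either align or conflict. The study finds that existing language models perform markedly worse when instructions conflict: the best open-source model reaches only 48% accuracy in that setting, which underscores the need for further optimization.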

