BriefGPT - AI Paper Express

CodeRepoQA: A Large-scale Benchmark for Software Engineering Question Answering

Original in English, about 100 words; roughly a 1-minute read.

Summary

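This study introduces CodeRepoQA, a large-scale benchmark for evaluating repository-level question answering in software engineering. It contains 585,687 question-answer entries drawn from 30 well-known GitHub repositories, spans five programming languages, and reveals the limitations of large language models in this domain.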

Key Points

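This study introduces CodeRepoQA, a large-scale benchmark for evaluating repository-level question answering in software engineering.
The benchmark dataset contains 585,687 question-answer entries from 30 well-known GitHub repositories.
CodeRepoQA covers five programming languages and a diverse range of scenarios.
The study reveals the limitations of large language models on software engineering question answering.
Medium-length contexts are more conducive to model performance.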

Tags

CodeRepoQA, GitHub, engineering, large language models, software engineering, question-answering ability
