BriefGPT - AI Paper Express

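Coordinating Ride-Pooling with Public Transit Using Reward-Guided Conservative Q-Learning: An Offline Training and Online Fine-Tuning Reinforcement Learning Framework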

Summary

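This study addresses the coordination of ride-pooling with public transit in multimodal transportation networks and proposes a novel reinforcement learning framework, RG-CQL (Reward-Guided Conservative Q-Learning). By combining offline training with online fine-tuning, the framework significantly improves data efficiency. In a realistic case study, it achieves a system reward 17% to 22% higher than conventional methods when ride-pooling is coordinated with public transit.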
