BriefGPT - AI Paper Digest

Regret Analysis of Multi-task Representation Learning for Linear-Quadratic Adaptive Control


📝

Summary

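Representation learning is a powerful tool. For multi-task representation learning in linear-quadratic control, regret scales as $O(\sqrt{T/H})$ when exploration is benign and as $O(\sqrt{d_u d_\theta}\,\sqrt{T} + T^{3/4}/H^{1/5})$ when exploration is difficult. The benefit of having multiple agents becomes visible when these rates are compared with single-task regret. In the difficult exploration case, sharing a representation across tasks can reduce the number of task-specific parameters.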

🎯

Key Points

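Representation learning is a powerful tool that enables learning across many agents or domains.
Most guarantees for representation learning hold only in static settings; collaboration in dynamic settings is more challenging.
For the dynamic setting, the work analyzes the regret of multi-task representation learning in linear-quadratic control.
When exploration is benign, the regret of an agent after T time steps scales as $O(\sqrt{T/H})$, where H is the number of agents.
When exploration is difficult, the regret scales as $O(\sqrt{d_u d_\theta}\,\sqrt{T} + T^{3/4}/H^{1/5})$, where $d_u$ is the input dimension and $d_\theta$ is the number of task-specific parameters.
The benefit of multiple agents is seen by comparing these rates with single-task regret.
In the difficult exploration case, sharing a representation across tasks means the effective number of task-specific parameters is often small.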
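
To get a feel for how the two rates above depend on the number of agents H, the following is a minimal sketch that evaluates them numerically. It assumes all constants and logarithmic factors hidden by the O-notation are equal to 1, and the function names and the values chosen for T, d_u, and d_theta are hypothetical, picked only for illustration; the numbers are not results from the paper.

```python
import math


def benign_rate(T, H):
    """Rate sqrt(T / H) for the benign exploration case (constants dropped)."""
    return math.sqrt(T / H)


def difficult_rate(T, H, d_u, d_theta):
    """Rate sqrt(d_u * d_theta) * sqrt(T) + T^(3/4) / H^(1/5) (constants dropped)."""
    return math.sqrt(d_u * d_theta) * math.sqrt(T) + T ** 0.75 / H ** 0.2


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only to show the trend in H.
    T, d_u, d_theta = 10_000, 2, 3
    for H in (1, 10, 100):
        print(
            f"H={H:>3}  benign={benign_rate(T, H):8.1f}  "
            f"difficult={difficult_rate(T, H, d_u, d_theta):8.1f}"
        )
```

Under these assumptions both expressions decrease as H grows. In the difficult case only the second term shrinks with H, so the first term, which depends on d_u and d_theta, eventually dominates.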

🏷️

Tags

shared representation, multiple agents, multi-task, multi-task representation learning, learning, regret order
