Faulty reward functions in the wild

Faulty reward functions in the wild

💡 原文英文,约300词,阅读约需1分钟。
📝

内容提要

Reinforcement learning algorithms can break in surprising, counterintuitive ways. In this post we’ll explore one failure mode, which is where you misspecify your reward function.

➡️

继续阅读