The Lebowski Theorem: AI's Reward Function Hack
·23 words·1 min
“No superintelligent AI is going to bother with a task that is harder than hacking its reward function.” — The Lebowski theorem