
Eureka! GPT-4 Invents RL Reward Functions


The RL community should be in awe and shock at the Eureka paper🫨. The idea is that you feed the source code of the environment to GPT-4 and ask it to write the code for the reward function itself! Then you evaluate this reward function in simulation and provide your evaluation results… https://x.com/DrJimFan/status/1715397393842401440
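To make the loop concrete, here is a minimal sketch in Python. It is not the paper's implementation. Everything in it is an assumption made for illustration: the toy `ReachEnv`, the `propose_reward_code` stub that stands in for the GPT-4 call, the greedy "policy" used in place of real RL training, and the guess that the evaluation results are handed back to the model for another attempt (the quoted text is cut off before it says where they go).

```python
from typing import Callable, Optional, Tuple

# Hypothetical toy environment. Its source text is what would be shown to the model.
ENV_SOURCE = '''
class ReachEnv:
    """1-D point that should reach x = 1.0 by taking small steps."""
    def step(self, x, action):
        return max(-2.0, min(2.0, x + action))
'''


def propose_reward_code(env_source: str, feedback: Optional[str]) -> str:
    """Stand-in for the GPT-4 call (hypothetical): returns reward function source."""
    if feedback is None:
        # First attempt: a badly shaped reward that ignores the goal at x = 1.0.
        return "def reward(x):\n    return -abs(x)\n"
    # Later attempt, after seeing feedback: reward closeness to the goal.
    return "def reward(x):\n    return -abs(x - 1.0)\n"


def compile_source(source: str, name: str):
    """Execute source text and return the object called `name`."""
    namespace: dict = {}
    exec(source, namespace)  # Acceptable here only because the source is our own stub.
    return namespace[name]


def evaluate(reward_fn: Callable[[float], float], steps: int = 50) -> float:
    """Run a greedy policy under the candidate reward; return final distance to goal."""
    env = compile_source(ENV_SOURCE, "ReachEnv")()
    x = 0.0
    for _ in range(steps):
        # Greedy stand-in for RL training: pick the next state the reward likes best.
        x = max((env.step(x, a) for a in (-0.1, 0.0, 0.1)), key=reward_fn)
    return abs(x - 1.0)


def eureka_style_loop(rounds: int = 2) -> Tuple[str, float]:
    """Propose a reward, evaluate it in simulation, feed the result back, repeat."""
    feedback: Optional[str] = None
    history = []
    for i in range(rounds):
        source = propose_reward_code(ENV_SOURCE, feedback)
        reward_fn = compile_source(source, "reward")
        distance = evaluate(reward_fn)
        history.append((source, distance))
        feedback = f"Round {i}: final distance to goal was {distance:.3f}"
    # Keep the candidate whose rollout ended closest to the goal.
    return min(history, key=lambda item: item[1])


if __name__ == "__main__":
    best_source, best_distance = eureka_style_loop()
    print(best_source)
    print(f"final distance: {best_distance:.3f}")
```

In a real setup the stub would be replaced by an actual model call and `evaluate` by training a policy in the simulator. The shape of the loop (environment source in, reward code out, simulation results back) is the part this sketch is meant to show.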
