Sherlock GPT: Solving Mysteries One Word at a Time
@ilyasut had a great argument for why next-word prediction works so well as a training objective:
Assume you describe a murder mystery in the prompt and end it by asking for the name of the murderer. The next-word completion should require the model to understand the whole story, analyze… https://x.com/francoisfleuret/status/1659688067853066240
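
To make the argument concrete, here is a toy sketch of what the training objective measures at that final word. It is only an illustration: the suspect names and the probabilities are invented, and `next_word_loss` is a hypothetical helper, not part of any library. The loss for a single next-word prediction is the negative log of the probability the model gave to the word that actually came next.

```python
import math


def next_word_loss(predicted_probs, true_word):
    """Cross-entropy loss for one next-word prediction: -log p(true_word)."""
    return -math.log(predicted_probs[true_word])


# Hypothetical suspects, invented purely for illustration.
suspects = ["butler", "gardener", "heiress", "doctor"]

# A model that ignores the story can only guess uniformly among the suspects.
guessing_model = {name: 1 / len(suspects) for name in suspects}

# A model that followed the clues concentrates probability on the culprit.
# These numbers are made up; they are not measurements of any real model.
reasoning_model = {"butler": 0.91, "gardener": 0.03, "heiress": 0.03, "doctor": 0.03}

true_murderer = "butler"
loss_guessing = next_word_loss(guessing_model, true_murderer)
loss_reasoning = next_word_loss(reasoning_model, true_murderer)

print(f"guessing:  {loss_guessing:.3f}")
print(f"reasoning: {loss_reasoning:.3f}")
```

Under these assumed numbers, the only way to push the loss on that one word below the uniform-guess level is to use the information in the story, which is the point of the argument.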