Power of Inference Scaling
How much would you need to scale up a single end-to-end Go model for it to match the performance of a smaller model that uses MCTS (Monte Carlo tree search)?
Ans: By ~100,000x.
This was a super interesting insight from @polynoamial.
Noam Brown knew early on that winning poker was about scaling up training…