OpenWebMath: 14B Tokens That Add Up
    
    
  
OpenWebMath is a dataset of 14B tokens of high-quality math documents extracted from Common Crawl. According to the announcement, training on it produces far better results on math benchmarks than training on datasets such as the Pile. https://x.com/keirp1/status/1711918424866361610