Shital Shah’s Chain of Thought
Home
Blog
About
Blog
Robots Now Speak Human: Scaled's GRID Launch
18 October 2023 · 46 words · 1 min
Build AI Foundations at Microsoft Research!
17 October 2023 · 29 words · 1 min
Gradients Gone Wild: Loss Scaling in FP16 Training
15 October 2023 · 179 words · 1 min
OpenWebMath: 14B Tokens That Add Up
11 October 2023 · 29 words · 1 min
AGI's Pop Quiz: TheoremQA Stumps AI
10 October 2023 · 45 words · 1 min
One Ring to Train Them All: Scaling Transformers with Ring Attention
5 October 2023 · 42 words · 1 min
Training Tidbits: Behind Stability's New 3B Model with WandB
2 October 2023 · 40 words · 1 min
Strange PyTorch Bug Foils My Debugging Hack
30 September 2023 · 28 words · 1 min
Bard's Unfiltered Image Chats: No Cherry-Picking!
26 September 2023 · 67 words · 1 min
No Excuses: Triple Your LLM Training Speed with SwiGLU, ALiBi & μP
22 September 2023 · 149 words · 1 min