Shital Shah’s Chain of Thought
Home
Blog
About
Twitter Post
MathVista Benchmark
18 October 2023 · 41 words · 1 min
Stability's 3B Model Training Report
2 October 2023 · 40 words · 1 min
Architecture vs Auto-regressive Objective
17 September 2023 · 42 words · 1 min
Reasoning AWOL? Paper Claims Only ICL Emerged
7 September 2023 · 50 words · 1 min
Weight Averaging: Have Your Cake and Save Your Model
4 September 2023 · 44 words · 1 min
Weaving Longer Contexts: Yarn Scaling for LLMs
2 September 2023 · 25 words · 1 min
Tensor Shape-Shifting: See Shapes Before Values
1 September 2023 · 33 words · 1 min
Conquering FP16 Frustrations: One-Liner to FP8 Gold on H100s
24 August 2023 · 54 words · 1 min
Creativity on Sale, Reasoning Still Full Price
14 August 2023 · 12 words · 1 min
Out-Knuthing Knuth: The AGI Test
13 August 2023 · 37 words · 1 min