Shital Shah’s Chain of Thought
Blog
Phi-nomenal Code Gen: Phi-1-Base Beats August Models
3 September 2023 · 172 words · 1 min
Weaving Longer Contexts: Yarn Scaling for LLMs
2 September 2023 · 25 words · 1 min
Tensor Shape-Shifting: See Shapes Before Values
1 September 2023 · 33 words · 1 min
Tiny Weights, Mighty Models: The Mystery of Weight Decay
29 August 2023 · 308 words · 2 mins
Conquering FP16 Frustrations: One-Liner to FP8 Gold on H100s
24 August 2023 · 54 words · 1 min
Creativity on Sale, Reasoning Still Full Price
14 August 2023 · 12 words · 1 min
Out-Knuthing Knuth: The AGI Test
13 August 2023 · 37 words · 1 min
The Great 5G Disguise
13 August 2023 · 40 words · 1 min
Learning Rates Without Fortune Telling
6 August 2023 · 415 words · 2 mins
LLAMA's Math Leap: From 11% to 49% with Scaling
6 August 2023 · 424 words · 2 mins