Shital Shah’s Chain of Thought
Blog
Thankuritto
9 November 2023 · 16 words · 1 min
MathVista Benchmark
18 October 2023 · 41 words · 1 min
Loss Scaling in FP16 Training
15 October 2023 · 179 words · 1 min
Stability's 3B Model Training Report
2 October 2023 · 40 words · 1 min
Architecture vs Auto-regressive Objective
17 September 2023 · 42 words · 1 min
Phi-1: The Tiny Model Outsmarting Giants
12 September 2023 · 255 words · 2 mins
Reasoning AWOL? Paper Claims Only ICL Emerged
7 September 2023 · 50 words · 1 min
Weight Averaging: Have Your Cake and Save Your Model
4 September 2023 · 44 words · 1 min
Phi-nomenal Code Gen: Phi-1-Base Beats August Models
3 September 2023 · 172 words · 1 min
Weaving Longer Contexts: Yarn Scaling for LLMs
2 September 2023 · 25 words · 1 min