
Weight Averaging: Have Your Cake and Save Your Model


If I am reading this paper right, you can avoid the drop in the base model's performance on general tasks that usually follows fine-tuning, simply by averaging the weights of the base model and the fine-tuned model!

Aka have your cake and eat it too!

https://arxiv.org/abs/2109.01903
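
To make the idea concrete, here is a minimal sketch of what "averaging the weights" could look like. It is my own toy illustration, not code from the paper: the `interpolate_weights` helper, the `alpha` mixing parameter, and the dict-of-lists stand-in for model parameters are all assumptions made for the example. With `alpha=0.5` it is a plain average; other values lean toward one model or the other.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's parameters: name -> flat list of floats.
Weights = Dict[str, List[float]]


def interpolate_weights(base: Weights, finetuned: Weights, alpha: float = 0.5) -> Weights:
    """Return (1 - alpha) * base + alpha * finetuned for every parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if base.keys() != finetuned.keys():
        raise ValueError("models must have the same parameter names")

    merged: Weights = {}
    for name, base_values in base.items():
        tuned_values = finetuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise linear interpolation between the two models.
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy example with made-up numbers, not real model weights.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
finetuned = {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]}
averaged = interpolate_weights(base, finetuned, alpha=0.5)
```

The only hard requirement is that both models share the same architecture, so every parameter in one has a same-shaped counterpart in the other. With a real framework you would apply the same per-parameter formula to the tensors in each model's saved weights.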
