Weight Averaging: Have Your Cake and Save Your Model
If I am reading this paper right, you can avoid the usual degradation of the base model's performance on general tasks after fine-tuning by simply averaging the weights of the base model and the fine-tuned model!
In other words, you get to have your cake and eat it too!
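To make the idea concrete, here is a minimal sketch of what that averaging could look like. It is my own illustration, not code from the paper: it assumes both checkpoints share the same architecture and parameter names, uses plain Python lists as stand-ins for weight tensors, and adds a hypothetical `alpha` mixing knob (0.5 gives the plain average).

```python
def average_weights(base, tuned, alpha=0.5):
    """Interpolate two parameter dicts: (1 - alpha) * base + alpha * tuned.

    alpha=0.5 is the plain average. The alpha knob is an assumption of
    this sketch, not something taken from the paper.
    """
    if base.keys() != tuned.keys():
        raise ValueError("models must have the same parameter names")
    merged = {}
    for name, base_values in base.items():
        tuned_values = tuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"size mismatch for {name}")
        # Element-wise interpolation between the two checkpoints.
        merged[name] = [
            (1 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy "checkpoints": flat lists of floats standing in for weight tensors.
base_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
tuned_model = {"layer.weight": [3.0, 0.0], "layer.bias": [1.0]}

merged_model = average_weights(base_model, tuned_model)
print(merged_model)
```

With a real framework the same loop would presumably run over the two models' parameter dictionaries instead of toy lists, and the merged result would be loaded back into a model of the same architecture.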