Deep Learning's Log-Linear Scaling Secrets

Deep Learning Scaling is Predictable, Empirically - https://arxiv.org/abs/1712.00409

  1. Test loss falls as a power law in training data size, so it traces a straight line on a log-log plot.

  2. The number of model parameters needs to grow sublinearly with training data size, following a power law (a straight line on a log-log plot).
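
In symbols, both observations can be sketched as power laws that become straight lines after taking logs. The notation is an assumption for illustration, loosely following the linked paper: $m$ is the training set size, $\varepsilon$ the test loss, $s$ the parameter count, and $\alpha$, $\beta_g$, $\alpha_p$, $\beta_p$ are fitted constants.

```latex
\begin{align}
\varepsilon(m) &\approx \alpha\, m^{\beta_g}, \quad \beta_g < 0
  &&\Longrightarrow\quad \log \varepsilon(m) \approx \log \alpha + \beta_g \log m \\
s(m) &\approx \alpha_p\, m^{\beta_p}, \quad 0 < \beta_p < 1
  &&\Longrightarrow\quad \log s(m) \approx \log \alpha_p + \beta_p \log m
\end{align}
```

The slope of each line on the log-log plot is the exponent, so a square-root relationship between parameters and data corresponds to $\beta_p = 0.5$.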

Roughly, ResNet parameters in millions = sqrt(data size)*2.6
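
As a minimal sketch, the rule of thumb above can be written as a small Python helper. The function name `resnet_params_millions` and the sample data sizes are hypothetical, and the unit of "data size" is left as in the note, which does not specify it.

```python
import math


def resnet_params_millions(data_size: float, coeff: float = 2.6) -> float:
    """Rule of thumb from the note: params (millions) = sqrt(data size) * 2.6.

    `data_size` uses whatever unit the note intends (not specified there);
    `coeff` defaults to the 2.6 quoted in the note.
    """
    if data_size < 0:
        raise ValueError("data_size must be non-negative")
    return coeff * math.sqrt(data_size)


if __name__ == "__main__":
    # Hypothetical data sizes, each 4x the previous one.
    for n in (100, 400, 1600):
        print(f"data size {n}: ~{resnet_params_millions(n):.1f}M parameters")
```

Because the relationship is a square root, each 4x increase in data size only doubles the suggested parameter count.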
