PyTorch Gold Rush: Unearthing Distributed Debugging Tips
A literal goldmine of tips for debugging PyTorch distributed training:
https://github.com/stas00/ml-engineering/blob/master/debug/pytorch.md