DeepSpeed Profiler: The Fast and the FLOPs
DeepSpeed Profiler looks pretty amazing. It prints per-layer MACs, forward and backward pass latency, FLOPs, throughput, and a wealth of other information! https://github.com/microsoft/DeepSpeed/tree/master/deepspeed/profiling/flops_profiler
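
To make those numbers a bit more concrete, here is a minimal hand-rolled sketch of how per-layer MACs, FLOPs, and throughput relate for a stack of dense layers. It is not the DeepSpeed API: the helper names (`linear_macs`, `profile_mlp`, `throughput`) and the layer sizes are made up for illustration, biases are ignored, and it assumes the common convention of counting one multiply-accumulate as two FLOPs.

```python
def linear_macs(batch, in_features, out_features):
    # One multiply-accumulate per (input, output) pair, per sample (bias ignored).
    return batch * in_features * out_features


def profile_mlp(layer_sizes, batch):
    """Return (layer_name, macs, flops) for each dense layer in the stack."""
    rows = []
    for i, (n_in, n_out) in enumerate(zip(layer_sizes, layer_sizes[1:])):
        macs = linear_macs(batch, n_in, n_out)
        # Assumed convention: 1 MAC = 1 multiply + 1 add = 2 FLOPs.
        rows.append((f"linear_{i}", macs, 2 * macs))
    return rows


def throughput(flops, latency_seconds):
    # FLOPs per second for a pass that took latency_seconds.
    return flops / latency_seconds


# Hypothetical 784 -> 256 -> 10 MLP with a batch of 32.
rows = profile_mlp([784, 256, 10], batch=32)
total_macs = sum(macs for _, macs, _ in rows)
total_flops = sum(flops for _, _, flops in rows)

for name, macs, flops in rows:
    print(f"{name}: {macs:,} MACs, {flops:,} FLOPs")
print(f"total: {total_macs:,} MACs, {total_flops:,} FLOPs")
```

A real profiler measures latency on the actual hardware and covers many more layer types; the sketch only shows where the MAC and FLOP counts for a dense layer come from.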