
Lights, Camera, AIction: Tiny Model Makes Big Movies


The generated 2-minute example video is pretty amazing. More surprising: the model has only 1.8B parameters and was trained on ∼15M text-video pairs plus ∼50M text-image pairs in just 5 days! Imagine scaling the model size and dataset by 30x. https://x.com/_akhaliq/status/1575546841533497344
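To put the 30x thought experiment in numbers, here is a rough back-of-the-envelope sketch. It assumes a naive linear multiplication of the approximate figures quoted above; the constant and helper names are made up for illustration, and the result says nothing about how training time or output quality would actually change.

```python
# Approximate figures quoted in the post.
PARAMS = 1.8e9            # model parameters
TEXT_VIDEO_PAIRS = 15e6   # text-video training pairs
TEXT_IMAGE_PAIRS = 50e6   # text-image training pairs
SCALE = 30                # hypothetical scale-up factor from the post


def human(n):
    """Format a count with an M or B suffix (illustrative helper)."""
    for unit, size in (("B", 1e9), ("M", 1e6)):
        if n >= size:
            return f"{n / size:g}{unit}"
    return f"{n:g}"


# Naive assumption: every quantity is simply multiplied by SCALE.
summary = {
    "params": human(PARAMS * SCALE),
    "text_video_pairs": human(TEXT_VIDEO_PAIRS * SCALE),
    "text_image_pairs": human(TEXT_IMAGE_PAIRS * SCALE),
}

for name, value in summary.items():
    print(f"{name}: ~{value}")
```

Under that assumption, a 30x version would be roughly a 54B-parameter model trained on about 450M text-video pairs and 1.5B text-image pairs.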
