What is LongCat-Video-Avatar?
LongCat-Video-Avatar is an open-source unified model from Meituan’s LongCat team for expressive audio-driven character animation. It supports Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and video continuation, with natural lip-sync and dynamics.
When was LongCat-Video-Avatar released?
It was released on December 16, 2025, with model weights, code, and technical report made public on Hugging Face and GitHub.
Is LongCat-Video-Avatar free to use?
Yes, it’s completely free and open-source under the MIT license, with full model weights and inference code available for download and modification.
What tasks does LongCat-Video-Avatar support?
It natively handles Audio-Text-to-Video (AT2V), Audio-Text-Image-to-Video (ATI2V), and video continuation, in both single-person and multi-person scenarios; see the sketch below.
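Conceptually, the three tasks differ only in which conditioning inputs are supplied. Here is a minimal sketch of those input combinations; the `GenerationRequest` structure is a hypothetical illustration, not the repo’s actual API:

```python
# Illustrative only: GenerationRequest is a hypothetical stand-in for the
# repo's real inference inputs; field names are assumptions.
from dataclasses import dataclass

@dataclass
class GenerationRequest:
    prompt: str                     # text description of scene/character
    audio_path: str                 # driving speech audio
    image_path: str | None = None   # reference image (ATI2V only)
    prior_video: str | None = None  # existing clip (video continuation only)

# AT2V: audio + text -> video; the character is synthesized from the prompt
at2v = GenerationRequest(prompt="a news anchor at a desk",
                         audio_path="speech.wav")

# ATI2V: audio + text + reference image -> video; animates the pictured person
ati2v = GenerationRequest(prompt="she speaks calmly",
                          audio_path="speech.wav",
                          image_path="portrait.png")

# Video continuation: extend an existing clip driven by new audio
cont = GenerationRequest(prompt="continue the monologue",
                         audio_path="next.wav",
                         prior_video="clip_000.mp4")
```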
What hardware is needed for LongCat-Video-Avatar?
It requires a powerful multi-GPU setup (e.g., NVIDIA A100 or H100) with PyTorch 2.6+ and FlashAttention, plus high VRAM for efficient inference, especially for long videos.
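A quick sanity check of that environment can save a failed run. This sketch verifies the requirements listed above (PyTorch 2.6+, FlashAttention installed, CUDA GPUs present) and reports per-GPU VRAM; it is illustrative, not an official script:

```python
# Check the environment against the stated requirements.
import importlib.util
import torch

# PyTorch version strings look like "2.6.0+cu124"; compare major.minor.
major, minor = (int(x) for x in torch.__version__.split("+")[0].split(".")[:2])
assert (major, minor) >= (2, 6), f"PyTorch 2.6+ expected, found {torch.__version__}"

assert torch.cuda.is_available(), "No CUDA devices visible"
assert importlib.util.find_spec("flash_attn") is not None, "FlashAttention not installed"

# Report available VRAM per GPU.
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GB VRAM")
```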
Does LongCat-Video-Avatar support multi-person generation?
Yes, it handles both single-person and multi-person scenarios, keeping each character’s identity consistent while producing natural interactions.
Where can I download LongCat-Video-Avatar?
Model weights are on Hugging Face at meituan-longcat/LongCat-Video-Avatar; the code and technical report are on GitHub at meituan-longcat/LongCat-Video.
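One common way to fetch the weights is with the huggingface_hub client; the repo ID comes from the answer above, while the local directory name is an arbitrary choice:

```python
# Download the published weights from Hugging Face.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meituan-longcat/LongCat-Video-Avatar",
    local_dir="./LongCat-Video-Avatar",  # arbitrary local path
)
```

The inference code itself lives in the GitHub repository and can be fetched separately with `git clone`.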
What license does LongCat-Video-Avatar use?
It is released under the MIT License, allowing free use, modification, and commercial applications; note that the license itself grants no trademark or patent rights.


