What is LongCat-Video-Avatar?
LongCat-Video-Avatar is an open-source AI model from Meituan that generates expressive, lip-synchronized avatar videos from audio, text, and reference images, supporting long sequences and multi-character scenes.
Is LongCat-Video-Avatar free and open-source?
Yes, it’s completely free under the MIT License, with model weights downloadable from Hugging Face for local use and modification.
How do I use LongCat-Video-Avatar?
Clone the GitHub repo, set up a Conda environment with PyTorch and the project dependencies, download the model weights via the Hugging Face CLI, and run the inference scripts with torchrun (one or more high-end GPUs are required).
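The steps above can be sketched as a shell session. Note that the repository URL, Hugging Face repo ID, script name, and flags below are illustrative assumptions, not confirmed values; check the project README for the exact commands.

```shell
# Sketch of the setup steps. The repo URL, model ID, script name, and
# flags are assumptions for illustration -- consult the README for exact values.
git clone https://github.com/meituan-longcat/LongCat-Video-Avatar.git
cd LongCat-Video-Avatar

# Create an isolated Conda environment with PyTorch and dependencies
conda create -n longcat-avatar python=3.10 -y
conda activate longcat-avatar
pip install torch torchvision
pip install -r requirements.txt

# Download the model weights from Hugging Face (repo ID assumed)
huggingface-cli download meituan-longcat/LongCat-Video-Avatar \
    --local-dir ./weights

# Run inference with torchrun; set --nproc_per_node to your GPU count
torchrun --nproc_per_node=1 run_inference.py \
    --checkpoint ./weights \
    --audio input.wav \
    --image reference.png \
    --output out.mp4
```

Multi-GPU machines can raise `--nproc_per_node` so torchrun spawns one process per GPU.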
What makes it stand out for long videos?
It uses techniques such as Cross Chunk Latent Stitching and reference skip attention to maintain quality, identity, and natural motion over extended durations without repetition or drift.