Read our Quickstart Guide to Mochi on the Civitai Education Hub!
If you don't want to run it locally, you can try it out now on the Civitai Generator! Read the Guide to Video Generation in the Civitai Generator!
Mochi 1 (preview), created by Genmo (https://www.genmo.ai), is an open, state-of-the-art video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluations.
This model dramatically closes the gap between closed and open video generation systems.
The model is released under a permissive Apache 2.0 license.
To get started with ComfyUI:
- Update to the latest version of ComfyUI
- Download the Mochi model weights into the models/diffusion_models folder
- Make sure a text encoder [1][2] is in your models/clip folder
- Download the VAE to ComfyUI/models/vae
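If you'd rather script the downloads, here's a minimal sketch using huggingface_hub. The filenames below are assumptions for illustration; substitute the actual files from this page or the HuggingFace repo linked below.

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

COMFY = "ComfyUI"  # path to your ComfyUI install

# Filenames are assumptions -- check the repo for the actual names.
hf_hub_download(
    repo_id="genmo/mochi-1-preview",
    filename="dit.safetensors",                   # hypothetical weights filename
    local_dir=f"{COMFY}/models/diffusion_models",
)
hf_hub_download(
    repo_id="genmo/mochi-1-preview",
    filename="decoder.safetensors",               # hypothetical VAE filename
    local_dir=f"{COMFY}/models/vae",
)
# The text encoder (e.g. t5xxl_fp8_e4m3fn_scaled.safetensors) goes in models/clip.
```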
Mochi has native ComfyUI support and will run on 12GB+ of VRAM.
Github: https://github.com/genmoai/models
HuggingFace: https://huggingface.co/genmo/mochi-1-preview
Description
Scaled text encoder for lower VRAM usage.
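For context, a "scaled" fp8 checkpoint stores weights in float8 together with a scale factor so that large values survive the narrower range. A rough PyTorch illustration of the idea (not Genmo's or ComfyUI's actual quantization code):

```python
import torch

w = torch.randn(4096, 4096) * 3.0             # stand-in for a text-encoder weight
scale = w.abs().max() / 448.0                 # 448 = max normal value of e4m3fn
w_fp8 = (w / scale).to(torch.float8_e4m3fn)   # 1 byte per weight instead of 2-4
w_restored = w_fp8.to(torch.float32) * scale  # dequantize at load time
print((w - w_restored).abs().max())           # quantization error stays small
```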
Comments (27)
Want to try this one! I'll see how it runs on my machines.
Could you let me know how long a single generation takes on your computer? And which graphics card you're using?
On an RTX 4090 with native ComfyUI it's about 3 minutes for me, but you can get faster with some of the new Mochi wrappers out there - check out the guide. On a 3060 with 12GB, you can expect about 15 minutes per generation, but it's amazing that it even runs on 12GB, and it will get faster as things develop!
Hollywood is so dead. This is OUR world now.
I just watched the Genmo video... and wept.
A dream of 40yrs is coming true. What a time to be alive.
Two Minute Papers - What a time to be alive!
Unfortunately I'm getting an out-of-memory error at the VAE decode step with a 12GB VRAM Nvidia 3060.
I also get the same message, but the generation is still successful
@neznajka_na_lune For me the VAE Decode node turns purple and it stops.
@skechtup Same for me; I reduced the resolution to 1/2 and now it's working.
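Halving resolution helps because decode memory scales with width × height, so 1/2 resolution cuts it roughly 4x. A back-of-the-envelope sketch (my own numbers, not from the thread) for the decoded frames alone:

```python
# Size of the raw decoded-frames tensor; the VAE's intermediate activations
# add a large multiple on top of this, which is what actually overflows VRAM.
def decode_bytes(frames, height, width, channels=3, bytes_per=2):  # fp16
    return frames * height * width * channels * bytes_per

for w, h in [(848, 480), (424, 240)]:
    gib = decode_bytes(43, h, w) / 2**30
    print(f"{w}x{h}: ~{gib:.2f} GiB raw frames (activations scale the same way)")
```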
Can I run this with Forge or Auto?
Did you find out?
To my knowledge, no, not at this time.
Thank you very much, your description helped me a lot in understanding how Mochi works. Currently trying a couple of things.
This is what I've been waiting for! Awesome!
Is this only a text2video model? Is there a way to do img2video?
Not officially, but it can be done in ComfyUI. Genmo are apparently working on official img2video support.
A ComfyUI workflow for img2vid would be great!
@theally Any hint on the "not official" workflow?
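One common unofficial approach is the img2img-style trick: VAE-encode the still image, tile it across the latent frames, then denoise from partial noise instead of from scratch. A sketch of the idea only; every helper name here is hypothetical, not a real Genmo or ComfyUI API:

```python
import torch

# Hypothetical helpers -- stand-ins for whatever your nodes/pipeline expose.
def img2vid_latents(vae, mochi_model, image, num_frames=43, strength=0.7):
    z0 = vae.encode(image)                     # encode still image to latent
    latent_frames = (num_frames - 1) // 6 + 1  # Mochi's ~6x temporal compression
    z = z0.repeat(1, latent_frames, 1, 1)      # tile image latent across time
    noise = torch.randn_like(z)
    # Partial noising, img2img style: lower strength = closer to the input.
    z_noisy = (1 - strength) * z + strength * noise
    # Run only the remaining denoising steps so image content is preserved.
    return mochi_model.denoise(z_noisy, start_step=int(strength * 64))
```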
I tried LTX img2vid. I even pulled a workflow from a tutorial; the guy in the video got it working, but I got very bad results. Frankly, it didn't seem to use the image at all. Could just be that I don't have a $3,000+ video card, or that I actually used a unique image rather than an image of a human female. Using human females as the example of what AI img/vid models can do is a terrible benchmark, since most visual AI models were trained on something like 50% human-female images. That's like challenging Einstein with a 1+1 math question.
I guess it's time to also upgrade my RAM 🙃 32GB of RAM froze my PC for 10 mins, then the GPU kicked in smoothly with 24GB VRAM.
25 mins total render.
It's super RAM intensive too, yup - but your final output was great! Worth it :)
Very cool so far. I am running it locally and wondering if anyone has some ideal settings they can share? Looking for how to increase duration and quality. Right now, I see that length is set to 43 in the default workflow, which equals a loop of about 3 seconds. Is there some kind of equation for how length maps to actual seconds?
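Roughly, duration = length / fps, where fps is whatever the save/combine node writes; Genmo quotes 30 fps for Mochi's native output, though workflows often save at other rates. Valid length values also appear to follow 6k + 1 (37, 43, 49, ...) because of the VAE's ~6x temporal compression. A quick sketch with illustrative fps values:

```python
# Duration from the workflow's "length" (frame count); fps depends on the
# save/combine node settings, so try your workflow's actual value here.
def duration_seconds(length: int, fps: float) -> float:
    return length / fps

for fps in (14, 24, 30):
    print(f"length=43 at {fps:>2} fps -> {duration_seconds(43, fps):.1f} s")
# "43 frames ≈ 3 s" implies that workflow saved at roughly 14 fps.
```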
Amazing that we're seeing competitive open-weight txt2video models! Now just need to find where I put those H100s to start training LoRAs... 😭
@theally OP, is there a node similar to Stable Video Diffusion's "SVD_img2vid_Conditioning" node that would let us do text+img2vid?
I made a short animation on 4GB of VRAM 🤐
What did it cost you?
"If you don't want to run it locally,"
Only an idiot wouldn't 'want' to run it locally. If people don't, it's either because they haven't been taught how, or because, thanks to NVIDIA's proprietary gatekeeping BS, they can't afford a card good enough to do it locally.
Just wanted to clarify that people really should try to run it locally if they have the hardware, because eventually, anything that can be locked behind insurmountable paywalls will be.
So, get into it locally while you still can.
Details
Files
mochi1PreviewVideo_t5xxlFP8E4m3fnScaled.safetensors
Mirrors
mochi1PreviewVideo_t5xxlFP8E4m3fnScaled.safetensors
t5xxl_fp8_e4m3fn_scaled.safetensors
t5xxl_fp8_e4m3fn.safetensors
t5xxl_fp8_scaled.safetensors
t5xxlFP8E4m3fnScaled.safetensors
flux_t5xxl_fp8_e4m3fn_scaled.safetensors
model.safetensors