    Mochi 1 Preview - Video Model - T5XXL FP8 e4m3fn Scaled

    Read our Quickstart Guide to Mochi on the Civitai Education Hub!

    If you don't want to run it locally, you can try it out now on the Civitai Generator! Read the Guide to Video Generation in the Civitai Generator!

    Mochi 1 preview, from Genmo (https://www.genmo.ai), is an open, state-of-the-art video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluation.

    This model dramatically closes the gap between closed and open video generation systems.

    The model is released under a permissive Apache 2.0 license.

    To get started with ComfyUI:

    1. Update to the latest version of ComfyUI

    2. Download the Mochi model weights into the models/diffusion_models folder

    3. Make sure a text encoder [1][2] is in your models/clip folder

    4. Download the VAE into the models/vae folder


    Mochi has native ComfyUI support and will run on 12GB+ of VRAM.
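
    If you'd rather script the setup, here's a minimal Python sketch that fetches the three files with huggingface_hub and copies them into the folders listed above. The repo and file names are assumptions based on the Comfy-Org repackaged release - verify them against the HuggingFace links below before running.

        # Sketch: fetch Mochi weights into a local ComfyUI install.
        # Requires: pip install huggingface_hub
        # NOTE: the repo_id and file paths below are assumptions -- check the
        # HuggingFace page for the current layout.
        import shutil
        from pathlib import Path
        from huggingface_hub import hf_hub_download

        COMFY_MODELS = Path("ComfyUI/models")  # adjust to your install

        FILES = [
            # (repo_id, path_in_repo, ComfyUI models subfolder)
            ("Comfy-Org/mochi_preview_repackaged",
             "split_files/diffusion_models/mochi_preview_bf16.safetensors",
             "diffusion_models"),
            ("Comfy-Org/mochi_preview_repackaged",
             "split_files/text_encoders/t5xxl_fp8_e4m3fn_scaled.safetensors",
             "clip"),
            ("Comfy-Org/mochi_preview_repackaged",
             "split_files/vae/mochi_vae.safetensors",
             "vae"),
        ]

        for repo_id, path_in_repo, subfolder in FILES:
            dest = COMFY_MODELS / subfolder
            dest.mkdir(parents=True, exist_ok=True)
            cached = hf_hub_download(repo_id=repo_id, filename=path_in_repo)
            shutil.copy(cached, dest / Path(path_in_repo).name)
            print(f"{path_in_repo} -> {dest}")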

    Github: https://github.com/genmoai/models

    HuggingFace: https://huggingface.co/genmo/mochi-1-preview

    Description

    Scaled text encoder for lower VRAM usage.
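
    For context on what "scaled" means here: plain FP8 e4m3fn has a small dynamic range (max normal value 448), so scaled checkpoints store each weight tensor in float8 together with a scale factor that maps the tensor into that range, halving memory versus FP16. A rough PyTorch sketch of the idea - illustrative only, not the actual checkpoint layout or ComfyUI's loader:

        # Sketch of per-tensor "scaled" FP8 (e4m3fn) storage -- illustrative
        # only, not the actual checkpoint format or ComfyUI's loader.
        import torch

        def quantize_fp8_scaled(w: torch.Tensor):
            """Store a weight tensor as float8_e4m3fn plus a per-tensor scale."""
            amax = w.abs().max().clamp(min=1e-12)
            scale = 448.0 / amax  # 448 is the largest normal e4m3fn value
            w_fp8 = (w * scale).to(torch.float8_e4m3fn)
            return w_fp8, scale

        def dequantize_fp8_scaled(w_fp8: torch.Tensor, scale: torch.Tensor):
            """Recover an approximate full-precision tensor at load time."""
            return w_fp8.to(torch.float32) / scale

        w = torch.randn(4096, 4096)
        w_fp8, scale = quantize_fp8_scaled(w)
        w_hat = dequantize_fp8_scaled(w_fp8, scale)
        print("max abs error:", (w - w_hat).abs().max().item())
        print("fp8 bytes:", w_fp8.element_size() * w_fp8.numel())  # half of fp16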


    Comments (27)

    nai00digitalart · Nov 8, 2024 · 1 reaction

    Want to try this one! I'll see how it runs on my machines.

    Anselmo · Nov 8, 2024 · 1 reaction

    Could you let me know how long a single generation takes on your computer, and which graphics card you're using?

    theally · Nov 8, 2024 · 2 reactions

    On an RTX 4090 with native ComfyUI it's about 3 minutes for me, but you can get faster with some of the new Mochi wrappers out there - check out the guide. On a 3060 with 12GB, you can expect about 15 minutes per video, but it's amazing that it even runs on 12GB, and it will get faster as things develop!

    Davros666 · Nov 9, 2024 · 14 reactions

    Hollywood is so dead. This is OUR world now.

    I just watched the Genmo video... and wept.

    A dream of 40 years is coming true. What a time to be alive.

    Catz · Nov 12, 2024

    Two Minute Papers - What a time to be alive!

    skechtup · Nov 9, 2024 · 1 reaction

    Unfortunately I'm getting an out-of-memory error at the VAE decode step with a 12GB VRAM NVIDIA 3060.

    neznajka_na_lune · Nov 10, 2024

    I also get the same message, but the generation is still successful

    skechtup · Nov 11, 2024

    @neznajka_na_lune For me, the VAE Decode node turns purple and it stops.

    rubca · Nov 11, 2024 · 1 reaction

    @skechtup Same for me; I reduced the resolution by half and now it's working.

    elderscrollswiza3521 · Nov 10, 2024 · 1 reaction

    Can I run this with Forge or Auto?

    radiantreachx · Nov 11, 2024

    Did you find out?

    theally · Nov 11, 2024

    To my knowledge, no, not at this time.

    P0L0 · Nov 11, 2024

    Thank you very much - your description helped me a lot in understanding how Mochi works. Currently trying a couple of things.

    Mr_fries1111 · Nov 12, 2024 · 1 reaction

    This is what I've been waiting for! Awesome!

    nomoreplay · Nov 12, 2024 · 5 reactions

    Is this only the text2video model? Is there a way to do img2video?

    theally · Nov 12, 2024

    Not officially, but it can be done in ComfyUI. Genmo are apparently working on official img2video support.

    AIArtsChannel · Nov 15, 2024

    A ComfyUI workflow for img2vid would be great!

    mikeraft · Nov 16, 2024 · 8 reactions

    @theally Any hint on the "not official" workflow?

    Lazman · Mar 18, 2025

    I tried LTX img2video. I even pulled a workflow from a tutorial. The guy in the video got it working, but I got very bad results; frankly, it didn't seem to use the image at all. Could just be that I don't have a $3,000++ video card, though, or that I actually used a unique image and not an image of a human female. Using human females as an example of what AI img/vid models can do is a terrible example, since most visual AI models were trained on something like 50% human female images. So that's like trying to challenge Einstein with a 1+1 math question.

    Catz · Nov 12, 2024 · 3 reactions

    I guess it's time to also upgrade my RAM 🙃 32GB of RAM froze my PC for 10 minutes, then the GPU kicked in smoothly with 24GB of VRAM.
    25 minutes total render.

    theally · Nov 13, 2024 · 1 reaction

    It's super RAM-intensive too, yup - but your final output was great! Worth it :)

    Bleedy · Nov 13, 2024 · 2 reactions

    Very cool so far. I'm running it locally and wondering if anyone has ideal settings they can share? Looking for how to increase duration and quality. Right now, length is set to 43 in the default workflow, which equals a loop of about 3 seconds. Is there some kind of equation for how much length = actual seconds?
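
    The relationship is just duration = frames / fps. Genmo quotes 30 fps for Mochi 1, but the fps actually written to the file depends on the workflow's video-save node, so treat the numbers in this sketch as an assumption:

        # Back-of-envelope: video duration = frames / fps.
        # 30 fps is Genmo's quoted rate for Mochi 1; your save node may differ.
        def duration_seconds(length_frames: int, fps: float = 30.0) -> float:
            return length_frames / fps

        print(duration_seconds(43))        # ~1.43 s at 30 fps
        print(duration_seconds(43, 24.0))  # ~1.79 s at 24 fps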

    WeBeJavn · Nov 13, 2024 · 6 reactions

    Amazing that we're seeing competitive open-weight txt2video models! Now just need to find where I put those H100s to start training LoRAs... 😭

    Tsterbta · Nov 17, 2024 · 1 reaction

    @theally OP, is there a node we can use that's similar to Stable Video Diffusion's "SVD_img2vid_Conditioning" node that will allow us to do text+img2vid?

    mistporyvaev · Jan 18, 2025 · 1 reaction

    I made a short animation with 4GB of VRAM 🤐

    Zorkgrey · Jun 26, 2025

    What did it cost you?

    Lazman · Mar 18, 2025 · 4 reactions

    "If you don't want to run it locally,"

    Only an idiot wouldn't 'want' to run it locally. If people don't, it's either cuz they haven't been taught how, or cuz, thanks to NVIDIA's proprietary gatekeeping BS, they can't afford a card good enough to do it locally.

    Just wanted to clarify that people really should try to run it locally if they have the hardware, cuz eventually, anything that can be locked behind insurmountable paywalls will be.

    So, get into it locally while you still can.

    Checkpoint
    Mochi

    Details

    Downloads: 1,151
    Platform: CivitAI
    Platform Status: Available
    Created: 11/7/2024
    Updated: 5/13/2026
    Deleted: -

    Files

    mochi1PreviewVideo_t5xxlFP8E4m3fnScaled.safetensors

    Mirrors

    HuggingFace (51 mirrors)