CivArchive
    Optimised Hunyuan/Skyreels/Wan 2.1 GGUF I2V + Upscale (Hunyuan LORA Compatible) (3060 12GBVRAM + 32gbRAM) - Wan2.1 I2V
    NSFW

    If you run into any problems, feel free to PM me on Civitai/Discord

    Hunyuan 720p I2V

    1316.72s 73F 688x800 22steps dpmpp_2m simple

    Hunyuan720pI2V Q6_K gguf (adjust as needed)
    https://huggingface.co/city96/HunyuanVideo-I2V-gguf/tree/main

    llava_llama3_vision
    https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/blob/main/split_files/clip_vision/llava_llama3_vision.safetensors

    clip_l (renamed to clip_hunyuan)
    https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/tree/main/split_files/text_encoders

    hunyuan_video_vae_bf16
    https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/tree/main/split_files/vae

    Python version 3.12.7 Cuda 12.6 Torch 2.6.0+cu126
    Triton windows: https://github.com/woct0rdho/triton-windows/releases
    Once you've downloaded the wheel that matches your Python version, open a command prompt in the directory containing the file and run the following commands.

    From the python_embeded folder:

    python.exe -m pip install triton-3.2.0-(filename)
    python.exe -m pip install sageattention==1.0.6
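    After installing, a quick stdlib-only sanity check (run it with python_embeded\python.exe so it inspects the embedded environment, not a system Python) confirms the interpreter can actually see both packages; this is a generic check, not something the workflow requires:

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    return importlib.util.find_spec(module_name) is not None

# Check the two packages installed above
for name in ("triton", "sageattention"):
    status = "OK" if is_installed(name) else "MISSING"
    print(f"{name}: {status}")
```

    If either line prints MISSING, the wheel was installed into a different Python than the one ComfyUI uses.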

    ------------------------------------------------------------------------------------------------------
    Wan2.1


    562.51s 512x512 uni_pc simple 33F
    12-step & 8-step split schedules work as intended
    81F 1018.89s!
    81F 573.99s!
    8-step split, 161F/10s (16fps), 512x512, uni_pc simple: 6760.70 seconds, but it works! (metadata-baked PNG posted)
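    The frame counts above map to clip length by a simple ratio; a minimal sketch, assuming the usual ComfyUI convention of frame count divided by playback fps (so 161 frames at 16 fps is roughly the 10 s clip quoted above):

```python
def clip_seconds(frames: int, fps: int = 16) -> float:
    """Approximate output clip length: frame count / playback fps."""
    return frames / fps

print(round(clip_seconds(161), 2))  # ~10 s at 16 fps
print(round(clip_seconds(81), 2))   # ~5 s at 16 fps
```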


    I've got Buzz to tip, so post your creations to the workflow gallery or add the resource to your posts. Have fun!
    Wan2.1 I2V update published!
    49F
    512x512
    12 steps (2-stage, 6+6)
    Uni_pc
    Simple
    Each LoRA I add seems to cost an extra 200-400s of inference time
    33F 700-900s
    49F 1000-1500s
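    Those figures suggest a rough budgeting rule: take a base time for the frame count and add 200-400 s per LoRA. A back-of-the-envelope estimator; the base times and per-LoRA penalty are just the numbers quoted above from my 3060, not measurements of your system:

```python
def estimate_runtime_s(base_s: float, num_loras: int,
                       per_lora_s: tuple = (200.0, 400.0)) -> tuple:
    """Return a (low, high) runtime estimate in seconds.

    base_s:     runtime with no extra LoRAs (e.g. ~700-900 s for 33F above)
    per_lora_s: observed added cost range per LoRA (from the note above)
    """
    low = base_s + num_loras * per_lora_s[0]
    high = base_s + num_loras * per_lora_s[1]
    return (low, high)

# e.g. a ~1000 s 49F base run with two LoRAs stacked:
print(estimate_runtime_s(1000.0, 2))  # (1400.0, 1800.0)
```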


    Wan2.1 480p I2V /unet (Adjust as needed)
    https://huggingface.co/city96/Wan2.1-I2V-14B-480P-gguf/blob/main/wan2.1-i2v-14b-480p-Q6_K.gguf

    Clip vision /clip_vision
    https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/clip_vision/clip_vision_h.safetensors

    Vae /vae
    https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/vae/wan_2.1_vae.safetensors

    Text encoder /clip or /text_encoders
    https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/blob/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors

    (Optional) Upscale /upscale_models

    https://huggingface.co/lokCX/4x-Ultrasharp/blob/main/4x-UltraSharp.pth

    -----------------------------------------------------------------------------------------------------

    Skyreels

    Final barebones + text-weighted Hunyuan LoRA compatibility update published
    831.61 seconds (no upscale)
    932.07 seconds (no upscale)
    Published vids in the showcase
    Could potentially work on 8GB VRAM or lower if you tinker with virtual_vram_gb on the UnetLoaderGGUFDisTorchMultiGPU custom node (given sufficient system RAM)

    Stage 1 415.369s, Stage 2 315.937s, VAE 70.838s, total 837.93 seconds. Q6 + 6-step LoRA + Smooth LoRA + Dolly LoRA
    (I now default to DPM++2M / Beta + Smooth LoRA, dropping the Smooth LoRA for human-centric scenes. Average runtime: 700-900s, 73F, no upscale)

    Comfyui_MultiGPU = UnetLoaderGGUFDisTorchMultiGPU (image latent batch 4 flux-finetune Q8, replace gguf loader in txt2img workflow)
    Comfyui_KJNodes = TorchCompileModelHyVideo, Patch Sage Attention KJ, Patch Model Patcher Order (Add nodes>KJNodes>Experimental)

    ∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨∨
    https://huggingface.co/spacepxl/skyreels-i2v-smooth-lora
    ∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧∧

    Fine-tune virtual_vram_gb to fit your setup (I suggest checking the ComfyUI console for the DisTorch allocation values that appear after the model loads into SamplerCustom), or use the normal Unet Loader (GGUF) with skyreels-hunyuan-I2V-Q?_
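    As a starting point for that tuning, here is a rough rule of thumb: offload whatever the model needs beyond what your card can spare. This is my own heuristic, not part of the node's documentation, and the reserved-VRAM figure is a guess you should adjust from the DisTorch console output:

```python
def suggest_virtual_vram_gb(model_gb: float, vram_gb: float,
                            reserved_gb: float = 3.0) -> float:
    """Rough starting value for virtual_vram_gb (heuristic, not official).

    model_gb:    size of the GGUF file on disk (approximates load size)
    vram_gb:     total VRAM on the card
    reserved_gb: VRAM kept free for latents/VAE/overhead (assumed value)
    """
    spare = vram_gb - reserved_gb
    return max(0.0, round(model_gb - spare, 1))

# e.g. a ~10 GB Q6_K model on a 12 GB card -> offload ~1 GB to system RAM
print(suggest_virtual_vram_gb(10.0, 12.0))  # 1.0
```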


    1st load
    Prompt executed in 1662.22 seconds - 587.365 seconds for upscale = ~1075 seconds
    640x864
    73 frames (stable/generation time)
    Steps: 6-12 (Stage 1 6 steps + Stage 2 6 steps)
    cfg: 4.0
    Sampler: Euler
    Scheduler: Simple
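    The upscale-subtracted figure above is just the total minus the upscaler pass; checking the arithmetic with the numbers quoted:

```python
def base_gen_seconds(total_s: float, upscale_s: float) -> float:
    """Generation time excluding the upscale pass."""
    return total_s - upscale_s

# First-load run above: 1662.22 s total, of which 587.365 s was upscaling
print(round(base_gen_seconds(1662.22, 587.365)))  # 1075
```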

    (Original Kijai WF https://huggingface.co/Kijai/SkyReels-V1-Hunyuan_comfy/blob/main/skyreels_hunyuan_I2V_native_example_01.json)


    Barebones I2V workflow with upscaler, optimised on a 3060 (12GB VRAM) + 32GB RAM
    Make sure you update comfyui, torch & cuda

    Run the update_comfyui.bat from the update folder

    Go back to your python_embeded folder

    Click on the file directory bar at the top, type cmd then hit enter

    In cmd type "python.exe -m pip install --upgrade torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu126"



    ∨∨ May ruin older workflows ∨∨

    Run the other update .bat if it still isn't working: update_comfyui_and_python_dependencies.bat

    ∧∧ May ruin older workflows ∧∧



    Workflow Resources:
    Fast_Hunyuan Lora (models/lora): https://huggingface.co/Kijai/HunyuanVideo_comfy/blob/main/hyvideo_FastVideo_LoRA-fp8.safetensors

    GGUF Model (Switch the models to fit your requirements) (models/unet):

    https://huggingface.co/Kijai/SkyReels-V1-Hunyuan_comfy/blob/main/skyreels-hunyuan-I2V-Q6_K.gguf

    VAE model (models/vae): https://huggingface.co/Kijai/HunyuanVideo_comfy/blob/main/hunyuan_video_vae_bf16.safetensors

    Clip_l model (I renamed it to clip_hunyuan) (models/clip):

    https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/blob/main/split_files/text_encoders/clip_l.safetensors

    llava_llama3 model (models/clip):

    https://huggingface.co/calcuis/hunyuan-gguf/blob/main/llava_llama3_fp8_scaled.safetensors

    Upscale Model (models/upscale_models):

    https://huggingface.co/uwg/upscaler/blob/main/ESRGAN/4x-UltraSharp.pth

    Personal Generation Times

    After first load, base generation runtimes (2-stage + VAE decode):
    758.173 seconds
    704.589 seconds

    with suggested lora after 1st:
    779.494


    169F tests after 1st (No Load Test):
    OOM

    121F test after 1st+6stepLORA+smoothLORA (No Load Test):
    1st stage
    525.14s 1st iteration
    729.66s 2nd
    736.19s 3rd
    645.15s 4th
    665.55s 5th
    764.12s 6th/Average
    2nd stage
    81.90s 1st+2nd iteration
    OOM
    An instant requeue after the OOM resumes from stage 2:
    6.17s 1st Iteration
    113.74s 2nd+3rd
    222.92s 4th
    327.62s 5th
    282.29s 6th/Average
    VAE 128.309s

    97F tests, I2V + 6-step LoRA (posted in gallery) (no OOM yet)
    1123s
    1013s

    Description

    Optimised Wan 2.1 480P GGUF I2V + Upscale (3060 12GBVRAM + 32gbRAM)


    Comments (16)

    dkain76 · Feb 27, 2025 · 4 reactions

    Trying to find the TorchCompileModelWanVideo.

    Reinstalled KJ Nodes nightly and still missing.

    tsolful
    Author
    Feb 27, 2025

    Try cloning the repository again:
    cd custom_nodes
    git clone https://github.com/kijai/ComfyUI-KJNodes
    cd ComfyUI-KJNodes
    pip install -r requirements.txt

    tsolful
    Author
    Feb 27, 2025

    or go into the ComfyUI-KJNodes folder, type 'cmd' in the directory bar at the top, then run:
    pip install -r requirements.txt

    blakerabbit · Mar 24, 2025

    I got that node but it won't work, refuses to compile model

    tsolful
    Author
    Mar 24, 2025

    @blakerabbit try bypassing Patch Model Patcher Order

    CoCatgirl · Feb 28, 2025 · 2 reactions

    I can't find these nodes: UnetLoaderGGUFAdvancedDisTorchMultiGPU

    WanImageToVideo

    tsolful
    Author
    Feb 28, 2025

    Update KJNodes

    UnvisualStudio · Mar 1, 2025 · 3 reactions

    you need to install ComfyUI-GGUF and ComfyUI-MultiGPU
    from the Manager ;-)

    pumaai487 · Apr 8, 2025 · 1 reaction

    @UnvisualStudio comfyui-gguf wasn't listed under missing nodes, but installing it fixed this issue, thanks

    art365959589 · Mar 4, 2025 · 3 reactions

    Thank you. Everything works well, but LoRAs don't... I tried a Hunyuan LoRA with WAN; should that work?

    I always get errors like:

    lora key not loaded: transformer.single_blocks.8.linear2.lora_B.weight

    etc.

    How do I fix this, please?

    dirtysem · Mar 28, 2025 · 1 reaction

    It's the same for me, although the same LoRAs work in other projects.

    Airafindiel · Mar 5, 2025 · 6 reactions

    No module named 'sageattention'

    Catz · Mar 10, 2025

    Bypass the Sage node. You need Triton installed to use it, and that's a real pain. It does speed things up by about 30%, though.

    relinquished · Mar 19, 2025 · 4 reactions

    Here is a pretty good tutorial on how to install Triton, Sage and more: https://www.patreon.com/posts/easy-guide-sage-124253103

    gambikules858 · Mar 12, 2025 · 3 reactions

    The "Patch Model Patcher Order" node = 99% VRAM and 15 min for 1 step -_- why? Without it, 1 step = 20 sec

    tsolful
    Author
    Mar 21, 2025

    Patch Model Patcher Order = LoRA compiling

    Workflows
    Wan Video

    Details

    Downloads
    1,895
    Platform
    CivitAI
    Platform Status
    Available
    Created
    2/27/2025
    Updated
    5/14/2026
    Deleted
    -

    Files

    optimisedHunyuanSkyreelsWan21GGUF_wan21I2V.zip

    optimisedSkyreelsWan21GGUFI2V_wan21.zip

    Mirrors