🆕 VELOPILOT v2 UNLIMITED - AI Character Replacement Video Pipeline — Changelog
Unlimited video length is here. This version removes the ~5 second cap: a new chained video extender (Wan Context Windows, window 81 / overlap 30, pyramid fusion) with EXTEND 10 SEC / 15 SEC groups continues generation seamlessly from the last frames of each segment. Per-frame ColorTransfer (Reinhard LAB) keeps color grading consistent across all segments, so long videos no longer drift.
Also new: a second LLM pass that auto-writes SCAIL-2-optimized animation prompts straight from your driving video frames (motion, object interactions, environment, camera — toggleable); a ControlNet subgraph replacing DWPreprocessor for pose-driven mode; separate prompt fields for FLUX reference generation and animation; one-click Fast Groups Bypasser toggles for the LLM and extender; and a built-in resolution cheat sheet (512p / 704p official variants).
Same core pipeline as before: SAM3 tracking → FLUX.2 Klein 9B character reference → Wan 2.1 SCAIL-2 replacement → Wan 2.2 T2V detail refine → RIFE 32 FPS. Works with any character LoRA. ~20–24 GB VRAM recommended.
VELOPILOT v1 PILOT - AI Character Replacement Video Pipeline
================================================================================
Created by VeloPilot
This is my first public workflow release! Please be kind :)
Feedback and suggestions are very welcome.
WHAT IT DOES
================================================================================
A complete multi-stage pipeline for AI-driven character replacement in video:
1. Load any driving video and extract a reference frame
2. Generate a consistent character photo using FLUX.2 Klein 9B FP8
3. Animate via Wan 2.1 SCAIL-2 (motion transfer)
4. Refine details with Wan 2.2 T2V Low Noise
5. Interpolate to 32 FPS with RIFE frame interpolation
The result is a seamless video of your character performing the original
subject's movements, with consistent appearance throughout all frames.
KEY FEATURES
================================================================================
[OK] Full character replacement in video with motion preservation
[OK] Pose/motion transfer to any character
[OK] Dual Wan pass: SCAIL-2 for animation + T2V for detail refinement
[OK] Automatic prompt generation via LLM (Gemma 4) from reference image
[OK] RIFE frame interpolation to 32 FPS
[OK] SAM3 segmentation for precise masking
SYSTEM REQUIREMENTS
================================================================================
GPU: NVIDIA RTX 4090 / 5090 (20-24 GB VRAM recommended)
Minimum: ~16 GB VRAM (reduce length and disable LLM)
ComfyUI: v0.26.0 or newer
HOW TO USE
================================================================================
1. Install all custom nodes via ComfyUI Manager first
2. Download all required models (links below)
3. Load the workflow JSON in ComfyUI
4. Load your driving video in the VHS_LoadVideo node
5. Generate a reference character image (or load your own)
6. Click "Queue Prompt" and wait
REQUIRED CUSTOM NODES
================================================================================
1. ComfyUI-KJNodes
https://github.com/kijai/ComfyUI-KJNodes
2. rgthree-comfy
https://github.com/rgthree/rgthree-comfy
3. ComfyUI-Easy-Use
https://github.com/yolain/Comfyui-Easy-Use
4. ComfyUI-Custom-Scripts
https://github.com/pythongosssss/ComfyUI-Custom-Scripts
5. ComfyUI-VideoHelperSuite
https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite
6. ComfyUI-Frame-Interpolation
https://github.com/Fannovel16/ComfyUI-Frame-Interpolation
7. ComfyUI-SAM3
https://github.com/PozzettiAndrea/ComfyUI-SAM3
8. ComfyUI-ControlNet-Aux
https://github.com/Fannovel16/comfyui_controlnet_aux
9. ComfyUI-Workflow-Encrypt
https://github.com/jtydhr88/ComfyUI-Workflow-Encrypt
Note: SCAIL-2 nodes (WanSCAILToVideo, SCAIL2ColoredMask) have been built
into ComfyUI core since v0.26.0 - no separate install needed.
REQUIRED MODELS
================================================================================
diffusion_models (place in ComfyUI/models/diffusion_models/):
[1] flux-2-klein-9b-fp8.safetensors
[2] wan2.1_14B_SCAIL_2_fp16.safetensors
[3] wan2.2_t2v_low_noise_14B_fp16.safetensors
checkpoints (place in ComfyUI/models/checkpoints/):
[4] sam3.1_multiplex_fp16.safetensors
https://huggingface.co/Comfy-Org/sam3.1/resolve/main/checkpoints/sam3.1_multiplex_fp16.safetensors
vae (place in ComfyUI/models/vae/):
[5] flux2-vae.safetensors
https://huggingface.co/Comfy-Org/flux2-dev/resolve/main/split_files/vae/flux2-vae.safetensors
[6] wan_2.1_vae.safetensors
text_encoders (place in ComfyUI/models/text_encoders/):
[7] qwen_3_8b_fp8mixed.safetensors
[8] nsfw_wan_umt5-xxl_bf16.safetensors
[9] gemma-4-E4B-it-ultra-uncensored-heretic-fp8.safetensors
clip_vision (place in ComfyUI/models/clip_vision/):
[10] clip_vision_h.safetensors
loras (place in ComfyUI/models/loras/):
[11] Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors
[12] wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise.safetensors
Auto-downloaded models (no manual download needed):
rife49.pth - auto-downloaded by ComfyUI-Frame-Interpolation
yolox_l.onnx - auto-downloaded by comfyui_controlnet_aux
dw-ll_ucoco_384_bs5.torchscript.pt - auto-downloaded by comfyui_controlnet_aux
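A quick way to confirm the manual downloads landed in the right folders is a small script like the one below. It is only a sketch: the function name `find_missing_models` is made up for this example, and the `ComfyUI/models` root is assumed to match the folder layout listed above.

```python
from pathlib import Path

# File names copied from the REQUIRED MODELS list, keyed by their
# folder under ComfyUI/models/.
REQUIRED_MODELS = {
    "diffusion_models": [
        "flux-2-klein-9b-fp8.safetensors",
        "wan2.1_14B_SCAIL_2_fp16.safetensors",
        "wan2.2_t2v_low_noise_14B_fp16.safetensors",
    ],
    "checkpoints": ["sam3.1_multiplex_fp16.safetensors"],
    "vae": ["flux2-vae.safetensors", "wan_2.1_vae.safetensors"],
    "text_encoders": [
        "qwen_3_8b_fp8mixed.safetensors",
        "nsfw_wan_umt5-xxl_bf16.safetensors",
        "gemma-4-E4B-it-ultra-uncensored-heretic-fp8.safetensors",
    ],
    "clip_vision": ["clip_vision_h.safetensors"],
    "loras": [
        "Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64.safetensors",
        "wan2.2_t2v_lightx2v_4steps_lora_v1.1_low_noise.safetensors",
    ],
}


def find_missing_models(models_root, required=REQUIRED_MODELS):
    """Return 'folder/file' entries that do not exist under models_root."""
    root = Path(models_root)
    missing = []
    for folder, names in required.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(f"{folder}/{name}")
    return missing
```

Calling `find_missing_models("ComfyUI/models")` from the directory that contains your ComfyUI install returns a list of anything still missing; an empty list means all twelve files were found.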
IMPORTANT NOTES
================================================================================
[*] Custom LoRAs: The workflow includes private character LoRAs
(AL1TA_LowNoise for Wan, FLUX 2 AL1TA* for Flux). If you don't
have them, simply bypass the "YOUR CHARACTER LORA / BYPASS"
nodes in Steps 2 and 3. The workflow works with any LoRA.
[*] LLM Pass: The "ENABLE LLM DESCRIPTION" group uses Gemma 4 to
auto-generate prompts from your reference image. Disable it if
you're low on VRAM or prefer writing prompts manually.
[*] SCAIL-2 Modes:
replacement_mode = true -> Full character replacement
replacement_mode = false -> Motion/pose transfer only
[*] Frame Interpolation: RIFE interpolates the output to 32 FPS. Adjust
the multiplier in the RIFE nodes for a different output frame rate.
[*] VRAM: ~20-24 GB on RTX 5090 32GB. For lower VRAM reduce
"length" in WanSCAILToVideo and disable the LLM group.
TROUBLESHOOTING
================================================================================
[1] Update ComfyUI to the latest version
https://docs.comfy.org/installation/update_comfyui
[2] Update all custom nodes via ComfyUI Manager
[3] Verify all model files are in the correct directories
[4] Out of VRAM? Reduce length in WanSCAILToVideo (node 143)
and disable the LLM group
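When reducing "length", it helps to pick a value the Wan models accept directly. The sketch below assumes two things that are not stated outright here: that the base frame rate is 16 FPS (inferred from 81 frames being about 5 seconds and RIFE doubling to 32 FPS), and that valid lengths have the form 4n + 1, as with the default of 81. The helper name `wan_length` is hypothetical.

```python
import math


def wan_length(seconds, fps=16):
    """Smallest frame count of the form 4n + 1 that covers `seconds` at `fps`.

    Both the 16 FPS base rate and the 4n + 1 rule are assumptions of this sketch.
    """
    frames = math.ceil(seconds * fps)
    n = math.ceil(max(frames - 1, 0) / 4)
    return 4 * n + 1


print(wan_length(5))  # 81
print(wan_length(3))  # 49
```

Dropping from 5 seconds to 3 seconds cuts the frame count from 81 to 49, which is usually the quickest way to get under a VRAM limit.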
DISCLAIMER
================================================================================
This workflow is provided as-is for educational and creative purposes.
The author assumes no responsibility for any content generated with it.
Users are solely responsible for ensuring their use of all included
models, LoRAs, and custom nodes complies with their respective licenses
and terms of distribution. Third-party models retain their original
licenses - refer to each model's page on Hugging Face or GitHub for
full terms.
================================================================================
Made by VeloPilot - AI Content Creator
CivitAI: https://civarchive.com/user/Velopilot
================================================================================
Description
🎬 ANIMATION by VELOPILOT v2 — UNLIMITED
AI character replacement in video — now with unlimited duration.
Swap any person in any video with YOUR character (LoRA-based), keeping the original motion, environment, and camera work. Built on Wan 2.1 SCAIL-2 (end-to-end character animation by zai-org) + FLUX.2 Klein 9B for reference generation.
---
🆕 What's new in v2 UNLIMITED
⏱️ Unlimited video length
The biggest one. v1 was capped at ~5 seconds (81 frames). v2 adds a chained video extender built on Wan Context Windows (window 81 / overlap 30, pyramid fusion):
* Dedicated EXTEND 10 SEC and EXTEND 15 SEC groups: each new segment continues seamlessly from the last frames of the previous one
* One-click VIDEO EXTENDER toggle (Fast Groups Bypasser): turn extension on or off without rewiring anything
* Smart batch stitching with overlap trimming: no duplicated or frozen frames at segment joints
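To get a feel for what overlap trimming means, here is a minimal sketch. It assumes each extension segment is 81 frames long and re-renders the last 30 frames of the previous segment, so those 30 frames are dropped before concatenation. The actual nodes in the workflow may trim differently; `stitch_segments` and `total_frames` are names invented for this example.

```python
def stitch_segments(segments, overlap=30):
    """Concatenate frame lists, dropping the first `overlap` frames
    of every segment after the first (assumed to repeat the previous tail)."""
    if not segments:
        return []
    stitched = list(segments[0])
    for segment in segments[1:]:
        stitched.extend(segment[overlap:])
    return stitched


def total_frames(num_segments, window=81, overlap=30):
    """Frame count after stitching `num_segments` windows with trimmed overlap."""
    if num_segments <= 0:
        return 0
    return window + (num_segments - 1) * (window - overlap)


# Two 81-frame segments where the second starts 30 frames before the first ends.
first = list(range(0, 81))
second = list(range(51, 132))
video = stitch_segments([first, second])
print(len(video), total_frames(2))  # 132 132
```

Under these assumptions every extra segment adds 51 new frames rather than 81, which is why the overlap has to be trimmed to avoid repeated frames at the joints.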
🎨 Color drift correction
Long AI videos slowly shift in color. v2 runs per-frame ColorTransfer (Reinhard LAB) on every extended segment, matching it back to the first segment — consistent color grading across the entire video, no matter how long.
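The core of a Reinhard-style transfer is simple: for each color channel, shift and scale the frame so its mean and standard deviation match the reference. The sketch below shows only that statistics-matching step on values that are assumed to be already converted to LAB; the RGB to LAB conversion is left out, and the function names and the dict-of-lists layout are choices made for this example, not the ColorTransfer node's API.

```python
from statistics import fmean, pstdev


def match_channel(values, ref_values, eps=1e-6):
    """Shift and scale `values` so their mean and std match `ref_values`."""
    mean_src, std_src = fmean(values), pstdev(values)
    mean_ref, std_ref = fmean(ref_values), pstdev(ref_values)
    scale = std_ref / max(std_src, eps)  # eps guards against flat channels
    return [(v - mean_src) * scale + mean_ref for v in values]


def reinhard_transfer(frame_lab, ref_lab):
    """Apply the match to each LAB channel.

    frame_lab / ref_lab: dict mapping 'L', 'a', 'b' to flat lists of floats.
    """
    return {ch: match_channel(frame_lab[ch], ref_lab[ch]) for ch in ("L", "a", "b")}
```

Running this per frame against statistics taken from the first segment is one way to pull later segments back toward the original grade.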
🤖 Second LLM pass — auto animation prompts
v1 used an LLM only to describe the reference photo for FLUX. v2 adds a dedicated video-analysis LLM pass: it reads frames from your driving video and writes a full SCAIL-2-optimized prompt — motion sequence, object interactions, environment, lighting, camera behavior — following the official SCAIL-2 prompting guidelines (describe the final video, never the original person). Just add your LoRA trigger + outfit line on top. Toggleable if you prefer manual prompts or need the VRAM.
🦴 ControlNet pass (pose-driven mode)
DWPreprocessor is replaced with a clean ControlNet subgraph. Use end-to-end mode (default, recommended) or switch to pose-driven for extremely challenging inputs — per the official SCAIL-2 docs, pose-driven works best at 704p.
📐 Built-in resolution cheat sheet
A note right inside the workflow with all officially supported SCAIL-2 resolutions:
* 512p: 512×896 / 896×512 (default — faster, less VRAM)
* 704p: 704×1280 / 1280×704 (better face detail)
* Custom sizes allowed if divisible by 32
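If you want a custom size, a tiny helper can check the divisible-by-32 rule from the cheat sheet and snap an arbitrary dimension down to the nearest valid value. This is just an illustration; the function names are made up for this example and are not part of the workflow.

```python
# Officially supported sizes from the cheat sheet (width, height).
OFFICIAL_SIZES = {(512, 896), (896, 512), (704, 1280), (1280, 704)}


def is_valid_size(width, height, multiple=32):
    """True if both dimensions are positive and divisible by `multiple`."""
    return (
        width > 0
        and height > 0
        and width % multiple == 0
        and height % multiple == 0
    )


def snap_down(value, multiple=32):
    """Round a dimension down to the nearest multiple (at least one multiple)."""
    return max(multiple, (value // multiple) * multiple)


print(is_valid_size(720, 1280))          # False, 720 is not divisible by 32
print(snap_down(720), snap_down(1080))   # 704 1056
```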
---
🔄 Pipeline overview
1. Reference extraction — load driving video, SAM3 tracks & masks the person to replace
2. Character creation — FLUX.2 Klein 9B + your character LoRA generates a matching reference photo
3. Motion transfer — Wan 2.1 SCAIL-2 (replacement mode) animates your character with the original motion
4. Extension (new) — optional +10s / +15s segments with seamless continuation & color matching
5. Detail refine — Wan 2.2 T2V Low Noise pass (denoise 0.15) sharpens skin & face
6. Interpolation — RIFE 4.9 doubles output to 32 FPS
⚙️ Quick specs
* Works with any character LoRA (Wan + FLUX.2 versions): just swap the loaders or bypass them
* ~20–24 GB VRAM recommended; for less: disable LLM passes, keep 512×896, reduce frame length
💡 Tips
* Pick a reference photo pose close to the first frame of your driving video
* Put your LoRA trigger word + character outfit description at the start of the animate prompt
* If masks flicker, tweak the SAM3 threshold (default 0.5) or increase mask blur
* replacement_mode = true for full swap, false for classic motion transfer
---
Made by VeloPilot — AI Content Creator. Enjoy, and share your results! ❤️




