Quickly turns a reference photo's identity and style into long, seamless motion videos.
Who it's for: creators who want this pipeline in ComfyUI without assembling nodes from scratch. Not for: anyone expecting one-click results with zero tuning - you still choose inputs, prompts, and settings.
Open preloaded workflow on RunComfy
Why RunComfy first
- Fewer missing-node surprises - run the graph in a managed environment before you mirror it locally.
- Quick GPU tryout - useful if your local VRAM or install time is the bottleneck.
- Matches the published JSON - the zip follows the same runnable workflow you can open on RunComfy.
When downloading for local ComfyUI makes sense - you want full control over models on disk, batch scripting, or offline runs.
How to use (local ComfyUI)
1. Load inputs (images/video/audio) in the marked loader nodes.
2. Set prompts, resolution, and seeds; start with a short test run.
3. Export from the Save / Write nodes shown in the graph.
Expectations - First run may pull large weights; cloud runs may require a free RunComfy account.
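For batch scripting against a local install, the steps above can be sketched in Python. This is a minimal sketch, not part of the published workflow: it assumes the graph has been exported in ComfyUI's API format, that a local server listens on the default http://127.0.0.1:8188, and that the sampler nodes expose a literal `seed` or `noise_seed` input. The helper names and the file name in the usage note are hypothetical.

```python
import copy
import json
import urllib.request


def with_seed(workflow: dict, seed: int) -> dict:
    """Return a copy of an API-format workflow with `seed` applied.

    Assumption: seeds live in inputs named "seed" or "noise_seed".
    Inputs wired from another node appear as [node_id, output_index]
    lists in API format and are left untouched.
    """
    patched = copy.deepcopy(workflow)
    for node in patched.values():
        inputs = node.get("inputs", {})
        for key in ("seed", "noise_seed"):
            if key in inputs and not isinstance(inputs[key], list):
                inputs[key] = seed
    return patched


def build_request(workflow: dict,
                  server: str = "http://127.0.0.1:8188") -> urllib.request.Request:
    """Build (but do not send) a POST to ComfyUI's /prompt endpoint."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{server}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
    )
```

To queue a run, load the exported JSON (for example a hypothetical `scail2_api.json`) with `json.load`, pass it through `with_seed`, and send the result with `urllib.request.urlopen(build_request(patched))`. Start with a short clip so a wrong path or missing model surfaces quickly.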
Overview
This workflow transforms a single reference photo into extended videos that retain the character's identity and style. It aligns your image to the motion footage and keeps detail consistent from frame to frame. You control lighting, masking, and scene adaptation using video conditioning and LightX2V acceleration. It suits fashion motion tests, editorial prototypes, and identity-focused animation reels, and is aimed at creators who want detailed control with minimal manual retouching.
Key nodes in the ComfyUI SCAIL-2 character motion transfer workflow (reference image to long video)
WanSCAILToVideo (#114)
Generates the initial latent segment by fusing pose frames, subject masks, and CLIP Vision identity embeddings from the reference image. Adjust pose_strength to trade off between copying exact motion and allowing subtle style adaptation. Use length to match your segment size so the sampler processes a predictable chunk each pass. If you are strictly replacing the on-screen person, set replacement_mode to favor identity over background styling. Backed by SCAIL-2 on Wan 2.1 14B as packaged in Comfy-Org/SCAIL-2 with method context from zai-org/SCAIL.
WanSCAILToVideo (#115)
Runs during the loop to cover the remainder of the timeline with improved temporal stability. Provide previous_frames from the prior segment to help the model keep clothing details and facial identity steady across boundaries. video_frame_offset and previous_frame_count keep segments in sync with the driving clip. When relight guidance is enabled via the LoRA, push style matching slightly stronger in this pass to harmonize global lighting.
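To make the segment bookkeeping concrete, here is a minimal sketch of how a long driving clip could be split into passes. It illustrates the idea only and is not the node's implementation: it assumes each pass handles `length` frames, that every pass after the first reuses `previous_frame_count` frames from the prior segment, and that `video_frame_offset` marks the first newly generated frame. The `Segment` type and `plan_segments` helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    video_frame_offset: int    # first driving-clip frame newly generated in this pass
    length: int                # frames handled in this pass, including carried ones
    previous_frame_count: int  # frames carried over from the prior segment


def plan_segments(total_frames: int, length: int,
                  previous_frame_count: int) -> list[Segment]:
    """Split a timeline into overlapping passes (illustrative semantics only)."""
    if length <= previous_frame_count:
        raise ValueError("length must exceed previous_frame_count")
    segments: list[Segment] = []
    offset = 0
    while offset < total_frames:
        # The first pass has nothing to carry over.
        carry = previous_frame_count if segments else 0
        new_frames = min(length - carry, total_frames - offset)
        segments.append(Segment(offset, new_frames + carry, carry))
        offset += new_frames
    return segments
```

Under these assumptions, a 200-frame clip with `length=81` and `previous_frame_count=5` needs three passes starting at frames 0, 81, and 157. A larger overlap gives the model more context at each boundary but adds passes for the same clip length.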
SAM3_VideoTrack (#85, #91)
Detects and tracks the person in both the pose video and the reference image. The “person” text conditioning improves robustness when multiple objects are present. If the tracker drifts, raise detection confidence or limit max_objects so the same subject is selected throughout. The tracking concept follows the Segment Anything family; see facebookresearch/segment-anything for background.
…
Notes
SCAIL-2 Motion Transfer in ComfyUI | Reference Image to Video - see the RunComfy page for the latest node requirements.
Description
Initial release - SCAIL-2-MotionTransfer.
