SCAIL-2 Single-Person Reference Editing Long-Video Workflow

Watch the full video first if you want to understand how this SCAIL-2 single-person reference editing workflow works in practice. The video shows how one reference character can replace the selected person in a driving video, while the workflow preserves pose motion, original scene structure, identity consistency, lighting coherence, and long-video continuity.

This ComfyUI workflow is designed for SCAIL-2 single-person biological reference editing. Its main purpose is to place one reference character onto the target person in a single-person driving video. Unlike the single-person driving workflow, this version is not only about making a reference character follow motion. It is explicitly configured for character replacement, using skeleton guidance, subject tracking, colored masks, reference identity encoding, and long-video continuation.

The workflow is built around wan2.1_14B_SCAIL_2_fp8_scaled.safetensors as the main SCAIL-2 model. It also uses WAN VAE, UMT5 XXL WAN text encoding, CLIP Vision, SAM3, SCAIL2ColoredMask, WanSCAILToVideo, SamplerCustom, VAEDecode, ForLoop continuation, overlap-frame trimming, ColorTransfer, final video combining, and original audio restoration. A multi-LoRA enhancement chain is also preserved to improve motion quality, visual stability, and final rendering consistency.

The most important switch in this workflow is replacement_mode=true, which tells the SCAIL route to perform single-person skeleton guidance with character replacement. The reference image provides the replacement character identity, while the driving video provides the target motion and scene structure. The positive prompt asks the model to replace the selected single target person, follow one-person pose guidance, keep the original scene structure, and preserve consistent identity, natural motion, coherent lighting, and smooth temporal consistency.

The negative prompt is also designed for this task. It suppresses bad video quality, flicker, wrong-area replacement, identity drift, deformed bodies, distorted faces, extra limbs, missing hands, warped hands, broken anatomy, blur, and low-quality output. This is important because single-person replacement often fails when the mask is inaccurate, the reference image is unclear, or the driving video contains heavy occlusion.

The workflow uses strict 512×896 alignment. Both the reference image and the driving video are resized to the same canvas before entering SAM3, CLIP Vision, and SCAIL. This reduces mismatch between pose masks, reference masks, and generated frames.
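The geometry of fitting an arbitrary source onto the 512×896 canvas can be sketched in a few lines. This is only an illustration: the description does not say whether the workflow crops, pads, or stretches, so the cover-and-center-crop strategy and the function name cover_resize_plan are assumptions, not the workflow's actual resize node.

```python
def cover_resize_plan(src_w, src_h, dst_w=512, dst_h=896):
    """Return (scaled_w, scaled_h, crop_left, crop_top) for a cover-style fit.

    Assumption: the source is scaled until it covers the canvas, then
    center-cropped. The real workflow may pad or stretch instead.
    """
    # Pick the larger ratio so both dimensions reach at least the canvas size.
    scale = max(dst_w / src_w, dst_h / src_h)
    scaled_w = max(dst_w, round(src_w * scale))
    scaled_h = max(dst_h, round(src_h * scale))
    # Center the crop window inside the scaled image.
    crop_left = (scaled_w - dst_w) // 2
    crop_top = (scaled_h - dst_h) // 2
    return scaled_w, scaled_h, crop_left, crop_top


if __name__ == "__main__":
    print(cover_resize_plan(1080, 1920))
```

Applying the same plan to both the reference image and the driving video is what keeps masks and frames on one shared canvas.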

SAM3 is configured with max_objects=1. SCAIL2ColoredMask uses object_indices=0, sort_by=left_to_right, and replacement_mode=true. This means the workflow tracks one target person and uses the reference character as the replacement identity. This structure is suitable for single-person dance videos, digital human replacement, character cosplay edits, mascot video generation, anime character motion editing, stylized biological character edits, and short-form AI video production.
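To make the target selection settings more concrete, here is a minimal sketch of what picking index 0 after a left-to-right sort could look like. It is an assumption about the semantics, not the code of the SCAIL2ColoredMask node: the function select_targets, the box format, and the use of the horizontal box center as the sort key are all hypothetical.

```python
def select_targets(boxes, object_indices=(0,), sort_by="left_to_right"):
    """Pick tracked subjects by index after ordering them.

    boxes: list of (x_min, y_min, x_max, y_max) tuples, one per detected person.
    Assumption: left_to_right orders subjects by horizontal box center.
    """
    if sort_by != "left_to_right":
        raise ValueError("this sketch only models left_to_right ordering")
    ordered = sorted(boxes, key=lambda b: (b[0] + b[2]) / 2)
    # Ignore indices that point past the number of detected subjects.
    return [ordered[i] for i in object_indices if i < len(ordered)]


if __name__ == "__main__":
    detections = [(300, 0, 400, 100), (10, 0, 110, 100)]
    print(select_targets(detections))
```

With max_objects=1 only one subject is tracked, so index 0 always refers to that subject and the sort order has no visible effect.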

The long-video system follows the same continuation logic as the other SCAIL-2 long-video workflows. The first segment is 65 frames and establishes the replacement relationship, pose guidance, mask structure, and visual direction. The continuation segment is 81 frames. Each loop removes 5 overlapping frames, so every loop effectively adds 76 new frames. The loop count is calculated as max(1, ceil((F - 65) / 76)), where F is the loaded driving video frame count.
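The segment arithmetic above can be checked with a short script. The constants come from the description; the function names are hypothetical, and the stitch helper assumes the 5 trimmed frames are the leading frames of each continuation segment, which the description does not state.

```python
FIRST_SEGMENT = 65   # frames in the initial segment
CONTINUATION = 81    # frames generated by each continuation loop
OVERLAP = 5          # overlapping frames trimmed per loop
NEW_PER_LOOP = CONTINUATION - OVERLAP  # 76 new frames per loop


def loop_count(total_frames):
    """max(1, ceil((F - 65) / 76)) using integer arithmetic."""
    remaining = total_frames - FIRST_SEGMENT
    return max(1, -(-remaining // NEW_PER_LOOP))


def generated_frames(loops):
    """Frames available after the first segment plus the given loops."""
    return FIRST_SEGMENT + loops * NEW_PER_LOOP


def stitch(first, continuations):
    """Join segments, dropping the overlap from each continuation.

    Assumption: the overlap sits at the start of each continuation.
    """
    frames = list(first)
    for segment in continuations:
        frames.extend(segment[OVERLAP:])
    return frames


if __name__ == "__main__":
    for f in (65, 141, 142, 200):
        n = loop_count(f)
        print(f, n, generated_frames(n))
```

Because of the max(1, ...) floor, a driving video of 65 frames or fewer still runs one continuation loop.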

The final output does not rely on a separate ImageCompositeMasked stage. The generated frames from the loop output are sent directly into the final video combine node. The audio is taken from the original driving video, and the frame rate is controlled by the unified FPS node, making the final result easier to match with the source rhythm.
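A quick way to see why one shared frame rate matters for the restored audio is to compare clip durations. This is a minimal sketch with hypothetical helper names; the 24 fps default is taken from the feature list, and the source frame rate is whatever the driving video actually uses.

```python
def clip_duration(frame_count, fps):
    """Duration in seconds of frame_count frames played at fps."""
    return frame_count / fps


def audio_drift(frame_count, source_fps, output_fps=24.0):
    """Seconds by which the output video outlasts the original audio.

    Zero means the generated frames and the original audio stay in step.
    """
    return clip_duration(frame_count, output_fps) - clip_duration(frame_count, source_fps)


if __name__ == "__main__":
    print(audio_drift(240, source_fps=24.0))
    print(audio_drift(240, source_fps=30.0))
```

If the drift is not zero, the motion will slowly fall out of step with the original soundtrack.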

Main features:

SCAIL-2 single-person reference editing workflow
One reference character replaces one target person
Single-person skeleton-guided video editing
replacement_mode=true for character replacement
512×896 unified input alignment
SAM3 max_objects=1 subject tracking
SCAIL2ColoredMask single-target mask control
object_indices=0 target selection
CLIP Vision reference identity encoding
WanSCAILToVideo first-segment replacement
65-frame initial segment
81-frame continuation segment
5-frame overlap trimming
ForLoop long-video continuation
Direct loop-frame final output
Original driving video audio restored
Unified 24fps output control
Multi-LoRA enhancement chain

Suggested workflow:

Prepare one clear reference character image and one clean single-person driving video.
The reference image should show the character clearly, with a readable face, outfit, body shape, and silhouette.
The driving video should contain one main target person with stable framing, visible body movement, and limited occlusion.
Keep the default 512×896 setting at first.
Confirm that SAM3 tracks the correct single subject.
If the workflow replaces the wrong area, check the tracking result and simplify the driving video.
If the identity drifts, use a cleaner reference image and reduce unnecessary prompt complexity.
Run a short test first, then enable the long-video loop once the replacement relationship is stable.

⚙️ RunningHub Workflow

Try the workflow online right now — no installation required.
👉 Workflow: https://www.runninghub.ai/post/2065060329721253889?inviteCode=rh-v1111

If the results meet your expectations, you can later deploy it locally for customization.

🎁 Fan Benefits: Register to get 1000 points, plus 100 points for each daily login, and enjoy 4090 performance with 48 GB of memory!

📺 Bilibili Updates (Mainland China & Asia-Pacific)

If you’re in Mainland China or the Asia-Pacific region, you can watch the video below to see the workflow demonstration and creative breakdown.
📺 Bilibili Video: https://www.bilibili.com/video/BV1w2Ei6pEsJ/

☕ Support Me on Ko-fi

If you find my content helpful and want to support future creations, you can buy me a coffee ☕.
Every bit of support helps me keep creating — just like a spark that can ignite a blazing flame.
👉 Ko-fi: https://ko-fi.com/aiksk

💼 Business Contact

For collaboration or inquiries, please contact aiksk95 on WeChat.

I keep the model resources updated on Quark Netdisk:
👉 https://pan.quark.cn/s/20c6f6f8d87b
These resources are mainly intended for local users, for creation and learning.

Files

scail2SinglePerson_v10.json
