Generates 164 photorealistic identity-consistent images from 3 reference photos.
Built for high-quality synthetic training datasets for Wan 2.2 / HunyuanVideo character
LoRAs — but outputs are also great standalone character content. Developed across 16
test iterations for maximum photorealism, identity consistency, and anatomical accuracy.
━━━ WHAT IT DOES ━━━
Combines four components that each exist individually in the community but have not been
documented together in a single production-ready batch pipeline:
• Flux Kontext DUAL reference chain — body ref + face ref simultaneously, stronger than single ref
• PuLID Flux face identity injection — biometric-level face locking via InsightFace embeddings
• Detail Daemon sampler augmentation — forces skin pores, hair strands, iris texture to resolve
• XLabs Flux Realism LoRA + FluxRealSkin v2 stacked for photographic skin rendering
━━━ OUTPUT ━━━
82 prompts × 2 seeds = 164 images:
• 10 extreme close-ups (frontal, 90° profile left/right, dutch tilt, high angle, low angle, head down)
• 15 portraits head+shoulders (laughing, looking away, over-shoulder, near-profile, editorial, candid)
• 15 medium waist-up shots
• 12 full body — 768×1152 PORTRAIT canvas so subjects look tall, not short
• 12 revealing/fashion outfits
• 12 lifestyle/candid
• 6 lighting moods (golden hour, dramatic shadow, neon night, overcast, backlit, morning window)
━━━ KEY TECHNICAL FINDINGS ━━━
1. PuLID must be applied AFTER LoRAs — LoRAs define rendering style first, PuLID locks face into it
2. Detail Daemon wraps the SAMPLER output of KSamplerSelect, NOT the sigmas — wrong wiring = 400 error
3. SamplerCustomAdvanced + BasicGuider is REQUIRED when using PuLID (standard KSampler won't work)
4. 768×1152 portrait canvas for full body — square 1024×1024 makes subjects look short/wide
━━━ COMPATIBILITY WARNING — READ BEFORE INSTALLING ━━━
The standard ComfyUI-PuLID-Flux node is NOT compatible with Flux Kontext out of the box.
Running Kontext ReferenceLatent conditioning through standard PuLID throws an error about
unexpected conditioning format. Fix: patch pulidflux.py to add timestep_zero_index kwarg
support and generation-token-only injection mode. Full patch instructions in the linked Article.
━━━ HARDWARE REQUIREMENTS ━━━
• Minimum: 16GB VRAM (will be slow, may need to reduce steps/resolution)
• Recommended: 24GB+ VRAM (RTX 3090/4090)
• Ideal: 48–96GB VRAM (A100, H100, RTX 6000 Ada)
• RAM: 32GB+ system RAM
• Storage: ~22GB for all models + 5–10GB per batch run output
━━━ REQUIRED MODELS ━━━
Place in the specified ComfyUI subdirectory:
• models/unet/ — flux1-dev-kontext_fp8_scaled.safetensors (~17GB) [HuggingFace: black-forest-labs]
• models/vae/ — ae.safetensors (~335MB) [HuggingFace: black-forest-labs/FLUX.1-dev]
• models/clip/ — clip_l.safetensors (~246MB) + t5xxl_fp8_e4m3fn_scaled.safetensors (~4.7GB)
• models/loras/ — flux_realism_lora.safetensors (XLabs — CivitAI)
• models/loras/ — fluxRealSkin_v2.safetensors (CivitAI)
• models/pulid/ — pulid_flux_v0.9.0.safetensors
InsightFace weights download automatically on first run (internet required for first execution only).
━━━ REQUIRED CUSTOM NODES ━━━
• ComfyUI-PuLID-Flux (with pulidflux.py patch — see Article)
• ComfyUI-Detail-Daemon
• Kontext nodes (ReferenceLatent + FluxKontextImageScale): built into recent ComfyUI, no separate install
━━━ TO USE ━━━
1. Place 3 reference images in ComfyUI/input/:
— Node 4: full/3-quarter body reference
— Node 30: head+shoulders face reference
— Node 20: tight face crop for PuLID (face fills 70%+ of frame, near-frontal angle)
2. Edit CLIPTextEncode (Node 7) with your character description
3. Run a single image first to verify setup, then queue the full batch
━━━ KEY PARAMETERS ━━━
PuLID — Node 24:
weight 0.73 (0.35–0.75 useful range) — higher = stronger identity, lower = more variation
start_at 0.05 — leave at 0.05; lets structure form before identity locks
end_at 0.72 — lower if skin looks plastic; releases final texture passes from PuLID constraint
Detail Daemon — Node 50:
detail_amount 0.40 (0.20–0.55 range) — sweet spot; 0.50+ = over-sharpened
start 0.25 / end 0.80 — active across mid-sampling where detail resolves
LoRA strengths:
XLabs Realism (Node 40): 0.60 (0.35–0.75 useful range)
FluxRealSkin v2 (Node 41): 0.45 (0.25–0.55 useful range)
Note: these two compound — if you raise one, slightly lower the other
Sampling: 25 steps, euler_ancestral, simple scheduler, guidance 2.5
━━━ FILES INCLUDED ━━━
• workflow_ui_format.json — import directly into ComfyUI
━━━ WHAT CHANGED IN V2 ━━━
V1 used two chained Kontext ReferenceLatent nodes (body ref + face ref simultaneously).
V2 drops the second Kontext chain entirely and relies on a single body reference + higher-weight
PuLID for all face identity. Why: the dual Kontext chain locked every generated image to the
exact studio pose from the reference photos, regardless of what angle the text prompt described.
It also bled hand positions from reference images into close-up face shots (hands-in-hair appearing
on "no hands in frame" prompts). Removing the face Kontext chain while raising PuLID weight from
0.73 to 0.78 gives equivalent identity strength with dramatically better pose variety.
Additional V2 changes:
• Canvas: portrait and medium shots now use 768×1024 instead of 1024×1024 — subjects no longer
look compressed/short on non-full-body shots
• FluxRealSkin strength lowered 0.45→0.35, Detail Daemon amount 0.40→0.20 — skin now reads as
real photography rather than stylized AI texture
• PuLID end_at lowered 0.72→0.65 — releases identity constraint earlier, allowing final texture
passes to resolve naturally without being locked by PuLID
• ID prompt updated: removed "subsurface scattering at cheekbones" (caused smooth oily look),
replaced "healthy matte complexion" with "natural matte skin, visible pores, real skin texture"
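The V1-to-V2 setting changes listed above can be summarized as a small diff. This is a minimal sketch: the dictionary keys are hypothetical labels chosen for illustration (they are not names from the workflow file), and the values are the ones quoted in this description.

```python
# Settings quoted in this description; the keys are illustrative labels only.
V1 = {
    "pulid_weight": 0.73,
    "pulid_end_at": 0.72,
    "realskin_strength": 0.45,
    "detail_amount": 0.40,
    "portrait_canvas": (1024, 1024),
}
V2 = {
    "pulid_weight": 0.78,
    "pulid_end_at": 0.65,
    "realskin_strength": 0.35,
    "detail_amount": 0.20,
    "portrait_canvas": (768, 1024),
}


def changed_settings(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


for name, (before, after) in changed_settings(V1, V2).items():
    print(f"{name}: {before} -> {after}")
```

If you are upgrading a customized V1 setup, the same comparison against your own values shows which of your settings differ from the V2 defaults.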
━━━ WHAT IT DOES ━━━
Generates photorealistic, identity-consistent images of a specific person from just 2 reference photos:
one body/appearance reference and one tight face crop for PuLID. The same person appears across
varied scenarios — different camera angles, outfits, lighting, expressions, shot distances.
Primary use case: building synthetic training datasets for character LoRAs (Wan 2.2, HunyuanVideo).
The output is also usable as standalone content, character sheets, or avatar generation.
━━━ OUTPUT FORMAT ━━━
The included batch runner script generates images across these categories:
• 10 extreme close-ups — frontal, 90° profile L/R, dutch tilt, chin-down, high-angle, etc.
• 15 portraits head+shoulders — laughing, over-shoulder, editorial, looking away, alluring
• 15 medium waist-up shots — various outfits, angles, lighting
• 12 full body — walking, sitting, pool, street, park, night (768×1152 portrait canvas)
• 12 revealing/fashion — bralette, bikini, slip dress, bodycon, lingerie
• 12 lifestyle/candid — coffee shop, laughing in park, reading, dancing, rooftop, selfie angle
• 6 lighting moods — golden hour, dramatic shadow, neon night, overcast, backlit, morning window
The batch runner runs 82 prompts × 2 seeds = 164 images by default. You can change the seed list,
add more prompts, or reduce categories to match your use case.
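As a quick check of the totals above, the category sizes can be summed and multiplied by the number of seeds. A minimal sketch; the category keys are illustrative labels, not identifiers from the batch runner.

```python
# Category sizes as listed above; the keys are illustrative labels.
CATEGORY_COUNTS = {
    "extreme_closeup": 10,
    "portrait": 15,
    "medium": 15,
    "full_body": 12,
    "revealing_fashion": 12,
    "lifestyle_candid": 12,
    "lighting_mood": 6,
}


def total_images(category_counts, seeds_per_prompt):
    """Images produced when every prompt is rendered once per seed."""
    return sum(category_counts.values()) * seeds_per_prompt


print(sum(CATEGORY_COUNTS.values()))      # 82
print(total_images(CATEGORY_COUNTS, 2))   # 164

# Dropping a category shrinks the run accordingly.
trimmed = {k: v for k, v in CATEGORY_COUNTS.items() if k != "revealing_fashion"}
print(total_images(trimmed, 2))           # 140
```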
━━━ HOW TO USE — ADAPTING TO YOUR OWN CHARACTER ━━━
This workflow generates images of any person. You supply 2 reference photos and edit 3 variables.
No training required. Everything runs at inference time.
─── Step 1: Prepare your reference images ───
You need exactly 2 images in your ComfyUI input/ folder:
reference_body.png — Full body or 3/4-length photo of the person
• Subject should fill 60%+ of the frame
• Arms ideally at sides or relaxed (raised arms/hands near face will bleed into outputs)
• Clean, well-lit background preferred (busy backgrounds can bleed into generations)
• This anchors overall appearance: skin tone, hair, body shape, usual clothing style
reference_face_tight.png — Tight face crop for PuLID identity injection
• Face should fill 70-80% of the frame
• Near-frontal angle (0-30° works best; >45° reduces accuracy)
• Eyes clearly visible, no heavy shadows or obstructions
• This is what locks the specific facial geometry using InsightFace embeddings
Rename your files to these names, OR update the filenames in the script (lines marked REF_BODY
and REF_FACE_TIGHT).
─── Step 2: Edit the ID block ───
Open batch_runner_local.py and find the ID variable. Replace the placeholder text with a
description of your character's appearance:
ID = (
    "same person as reference, [ethnicity] woman, "
    "natural [hair color] hair with soft texture and loose natural waves, "
    "wispy flyaways catching light, "
    "beautiful [face description], high sharp cheekbones, soft full lips, "
    "natural [eye color] eyes with realistic iris texture, natural limbal ring, "
    "subtle natural catchlight, natural eye moisture, realistic eyelashes, "
    "willowy slim figure, flat stomach, no muscle definition, slim straight body, "
    "long slender legs, slim arms, slim narrow shoulders, "
    "100mm portrait lens f/2.0, natural matte skin, visible pores, real skin texture, "
    "unretouched RAW photograph, real person"
)
RULES for the ID block:
✅ Include: ethnicity, hair texture and color, face structure words (not expression),
skin tone descriptors, body shape, photographic style anchors
✅ Use: "natural waves, soft texture, wispy" for hair — NOT "lustrous/shiny/glossy" (= oily look)
✅ Use: "flat stomach, no muscle definition, willowy" — NOT "toned/defined/athletic" (= visible abs)
✅ Use: "natural matte skin, visible pores" — NOT "natural luminosity/glowing skin" (= AI sheen)
❌ Do NOT include: expressions, poses, outfits, settings — those go in individual prompts
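The wording rules above can be checked mechanically before a long batch run. The following is a rough sketch, not part of the batch runner: `lint_id_block` and the `DISCOURAGED` table are hypothetical helpers, and the phrase list covers only the terms named in the rules.

```python
import re

# Phrases the rules above advise against, mapped to the side effect they cause.
DISCOURAGED = {
    "lustrous": "oily look",
    "shiny": "oily look",
    "glossy": "oily look",
    "toned": "visible abs",
    "defined": "visible abs",
    "athletic": "visible abs",
    "natural luminosity": "AI sheen",
    "glowing skin": "AI sheen",
}


def lint_id_block(text):
    """Return (phrase, side_effect) pairs for each discouraged phrase in text."""
    lowered = text.lower()
    hits = []
    for phrase, effect in DISCOURAGED.items():
        # Whole-word match, so "definition" does not trigger "defined".
        if re.search(rf"\b{re.escape(phrase)}\b", lowered):
            hits.append((phrase, effect))
    return hits


print(lint_id_block("lustrous hair, toned stomach"))
print(lint_id_block("flat stomach, no muscle definition, natural matte skin"))
```

The first call reports both problem words; the second returns an empty list because the recommended phrasing contains none of them.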
─── Step 3: Review and edit the PROMPTS list ───
The 82 prompts are in the PROMPTS list as tuples:
("label_name", guidance_float, width_int, height_int, "full prompt text")
Canvas sizes by shot type:
Close-ups (face only): 1024 × 1024
Portraits and medium shots: 768 × 1024 ← taller canvas prevents compressed body look
Full body shots: 768 × 1152 ← 2:3 portrait ratio for proper height
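One way to keep canvas sizes consistent is to look them up by shot type when building a prompt tuple. This is a sketch under assumptions: the `CANVAS` table and `make_prompt` helper are hypothetical and do not appear in batch_runner_local.py; only the sizes and the tuple layout come from the text above.

```python
from math import gcd

# Canvas sizes from the table above, keyed by illustrative shot-type labels.
CANVAS = {
    "closeup": (1024, 1024),
    "portrait": (768, 1024),
    "medium": (768, 1024),
    "full_body": (768, 1152),
}


def aspect_ratio(width, height):
    """Reduce width:height to lowest terms, e.g. 768x1152 -> (2, 3)."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


def make_prompt(label, shot_type, text, guidance=2.5):
    """Build a (label, guidance, width, height, text) tuple for the PROMPTS list."""
    width, height = CANVAS[shot_type]
    return (label, guidance, width, height, text)


print(aspect_ratio(*CANVAS["full_body"]))   # (2, 3)
print(aspect_ratio(*CANVAS["portrait"]))    # (3, 4)
print(make_prompt("park_walk", "full_body", "placeholder prompt text"))
```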
To change an outfit in a prompt, find it and edit the wearing clause:
BEFORE: f"{ID}, wearing a nude lace bralette, extreme close-up..."
AFTER: f"{ID}, wearing a light blue sundress, extreme close-up..."
To add a new prompt:
("your_label", 2.5, 768, 1024,
f"{ID}, wearing a [outfit], portrait, [angle description], [expression], [lighting], {AR}, {IE}"),
Angle description guide — use explicit degrees, not vague terms:
"face turned 90 degrees left, true side profile, right ear fully visible" ← explicit ✅
"side profile" ← too vague, model may interpret loosely ❌
"body facing away, face turned 160 degrees back over right shoulder" ← explicit ✅
"looking over shoulder" ← vague ❌
"camera at knee level looking steeply upward, face tilted slightly down toward camera" ← explicit ✅
"low angle shot" ← vague ❌
─── Step 4: Run ───
# Start ComfyUI first (must be running on port 8188)
python batch_runner_local.py
The script queues all jobs into ComfyUI's internal queue (~30 seconds), then monitors completion.
ComfyUI owns the queue after that — you can close the terminal or disconnect SSH after queuing
and generation continues. Output images appear in ComfyUI/output/character_dataset/ (or whatever
you set OUTPUT_DIR to).
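For readers curious how a script can hand jobs to ComfyUI, the sketch below builds a request for ComfyUI's HTTP `/prompt` endpoint using only the standard library. It is an illustration, not the contents of batch_runner_local.py: the host address, the helper names, and the assumption that the runner uses this endpoint are all mine. Note that `/prompt` expects a workflow exported in API format, which differs from the UI-format JSON used for drag-and-drop import.

```python
import json
import urllib.request

# Port 8188 is stated above; the localhost address is an assumption.
COMFY_URL = "http://127.0.0.1:8188"


def build_queue_request(workflow, base_url=COMFY_URL):
    """Build (but do not send) a POST request for ComfyUI's /prompt endpoint."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_job(workflow, base_url=COMFY_URL):
    """Send the request to a running ComfyUI server and return its JSON reply."""
    with urllib.request.urlopen(build_queue_request(workflow, base_url)) as resp:
        return json.loads(resp.read())


# Tiny stand-in graph: node "7" is the CLIPTextEncode node in this workflow.
# A real call needs the complete API-format graph, not this fragment.
demo = {"7": {"class_type": "CLIPTextEncode", "inputs": {"text": "placeholder"}}}
request = build_queue_request(demo)
print(request.get_method(), request.full_url)
```

Because the server keeps the queue, a script built this way only needs to stay alive long enough to submit the requests.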
To run more seeds (more images per prompt):
SEEDS = [42, 9999, 12345, 77777]  # 2 extra seeds: 82 prompts × 4 seeds = 328 images
To run a subset test first:
# At the bottom of the script, change:
for label, guidance, width, height, text in PROMPTS:
# to:
for label, guidance, width, height, text in PROMPTS[:10]: # first 10 only
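The effect of the seed list and the subset slice on job count can be sketched as follows. `build_jobs` is a hypothetical helper, the `PROMPTS` list here is a placeholder standing in for the real 82 entries, and the default seeds `[42, 9999]` are inferred from the four-seed example above rather than confirmed by the script.

```python
from itertools import product


def build_jobs(prompts, seeds, limit=None):
    """Pair every prompt tuple with every seed; limit keeps only the first N prompts."""
    selected = prompts if limit is None else prompts[:limit]
    return [
        (label, seed, guidance, width, height, text)
        for (label, guidance, width, height, text), seed in product(selected, seeds)
    ]


# Placeholder for the real 82-entry PROMPTS list.
PROMPTS = [(f"prompt_{i:02d}", 2.5, 768, 1024, "placeholder") for i in range(82)]

print(len(build_jobs(PROMPTS, [42, 9999, 12345, 77777])))   # 328
print(len(build_jobs(PROMPTS, [42, 9999], limit=10)))       # 20
```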
━━━ KEY PARAMETERS — WHERE TO TUNE ━━━
PuLID (Node 24 — ApplyPulidFlux):
weight 0.78 — identity strength. Higher = stronger face lock, lower = more variation
Range 0.65-0.85 useful. Below 0.60 = identity loss. Above 0.85 = AI doll look
start_at 0.05 — when PuLID begins. Leave at 0.05 (lets structure form first)
end_at 0.65 — when PuLID stops. Lower = more natural texture in final steps
If skin looks plastic: lower to 0.60. If identity too weak: raise to 0.70
Detail Daemon (Node 50 — DetailDaemonSamplerNode):
detail_amount 0.20 — skin/hair micro-texture boost. Range 0.15-0.40
If skin still smooth: raise to 0.28-0.35. If over-textured: lower to 0.15
start 0.25 / end 0.75 — active window. Leave as-is unless troubleshooting
LoRA strengths:
XLabs Realism (Node 40): 0.60 (range 0.40-0.70)
FluxRealSkin v2 (Node 41): 0.35 (range 0.25-0.50)
Note: these compound — if you raise one, consider slightly lowering the other
Sampling:
25 steps, euler_ancestral, simple scheduler, guidance 2.5
euler_ancestral adds stochasticity between seeds for dataset variety
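When experimenting with these values, a small range check can catch settings that drift outside the useful ranges quoted above. A minimal sketch; the `TUNING` table and its keys are illustrative and are not read from the workflow file.

```python
# (default, low, high) as quoted in the V2 parameter list above.
TUNING = {
    "pulid_weight": (0.78, 0.65, 0.85),
    "detail_amount": (0.20, 0.15, 0.40),
    "xlabs_realism_strength": (0.60, 0.40, 0.70),
    "realskin_strength": (0.35, 0.25, 0.50),
}


def out_of_range(settings):
    """Return {name: value} for every setting outside its useful range."""
    flagged = {}
    for name, value in settings.items():
        _default, low, high = TUNING[name]
        if not low <= value <= high:
            flagged[name] = value
    return flagged


defaults = {name: spec[0] for name, spec in TUNING.items()}
print(out_of_range(defaults))                                        # {}
print(out_of_range({"pulid_weight": 0.90, "detail_amount": 0.30}))   # {'pulid_weight': 0.9}
```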
━━━ WHAT MAKES A GOOD BODY REFERENCE ━━━
The body reference image (Node 4) is the main identity anchor for Kontext. It has the most
impact on what the model considers the "default" appearance of the character.
GOOD reference characteristics:
• Subject fills most of the frame — waist-up or 3/4 length ideal
• Arms relaxed at sides: hands near the face will bleed into outputs despite NHF ("no hands in frame") prompts
• Clean background — busy backgrounds can appear in some generated images
• Natural, well-lit — avoid heavy filters, very dark conditions, dramatic shadows on face
• Outfit you DON'T mind occasionally appearing in outputs (Kontext may reference it)
If your reference has hands near face: add the NHF variable to more prompts or switch to a
reference with arms down. The NHF phrase ("no hands in frame, hands not visible...") is a
prompt-level countermeasure but is sometimes overridden by strong Kontext reference pose.
━━━ COMPATIBILITY WARNING ━━━
The standard ComfyUI-PuLID-Flux install is NOT compatible with Flux Kontext ReferenceLatent
conditioning out of the box. Running Kontext conditioning through standard PuLID throws an
error about an unexpected conditioning format / timestep_zero_index.
Fix: patch pulidflux.py to add timestep_zero_index kwarg support and generation-token-only
injection mode. Full patch in the companion documentation article.
━━━ HARDWARE REQUIREMENTS ━━━
Minimum: 16GB VRAM (slow, may need to reduce steps)
Recommended: 24GB+ VRAM (RTX 3090/4090)
Ideal: 48-96GB VRAM (A100, H100, RTX 6000 Ada) — full speed
━━━ REQUIRED MODELS ━━━
models/unet/ — flux1-dev-kontext_fp8_scaled.safetensors (~17GB)
models/vae/ — ae.safetensors (~335MB)
models/clip/ — clip_l.safetensors + t5xxl_fp8_e4m3fn_scaled.safetensors
models/loras/ — flux_realism_lora.safetensors (XLabs, on CivitAI)
models/loras/ — fluxRealSkin_v2.safetensors (on CivitAI)
models/pulid/ — pulid_flux_v0.9.0.safetensors
InsightFace weights download automatically on first run (requires internet)
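Before the first run it can help to confirm that every file listed above is in place. The sketch below is a hypothetical helper, not part of the download: it assumes you pass the path to your ComfyUI root folder, and it only checks that the files exist, not that they are complete or uncorrupted.

```python
from pathlib import Path

# Relative paths taken from the model list above.
REQUIRED_MODELS = [
    "models/unet/flux1-dev-kontext_fp8_scaled.safetensors",
    "models/vae/ae.safetensors",
    "models/clip/clip_l.safetensors",
    "models/clip/t5xxl_fp8_e4m3fn_scaled.safetensors",
    "models/loras/flux_realism_lora.safetensors",
    "models/loras/fluxRealSkin_v2.safetensors",
    "models/pulid/pulid_flux_v0.9.0.safetensors",
]


def missing_models(comfy_root):
    """Return the required model paths that do not exist under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in REQUIRED_MODELS if not (root / rel).is_file()]


# "ComfyUI" is an assumed install location; adjust it to your setup.
for rel in missing_models("ComfyUI"):
    print(f"missing: {rel}")
```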
━━━ REQUIRED CUSTOM NODES ━━━
ComfyUI-PuLID-Flux (with pulidflux.py patch)
ComfyUI-Detail-Daemon
Kontext nodes (ReferenceLatent + FluxKontextImageScale): built into recent ComfyUI, no separate install






