This workflow provides a complete, production-ready pipeline for generating high-fidelity NSFW images and animating them with realistic motion. It is built on a "separation of concerns" principle: a depth map locks in composition and anatomy during the image generation phase, and the finished image is then passed to a video model for animation. This method helps avoid the common pitfalls of direct text-to-video, such as anatomical drift and inconsistent lighting.
Key Stages:
Depth Extraction: Uses a depth estimator (e.g., Depth Anything) to capture the spatial geometry of a reference image, creating a "cage" for the diffusion process.
Guided Diffusion: Uses a depth ControlNet with an SDXL base model (highly compatible with photorealistic checkpoints like Lustify!) to generate the final image. The depth map keeps the pose and structure consistent, even with radical prompt changes.
High-Res Refinement: Runs a dedicated upscaling model followed by a low-denoise refinement pass to sharpen photorealistic textures and reduce artifacts.
Image-to-Video Integration: Outputs a still frame ready for use in image-to-video (I2V) models (Kling, WAN, etc.), so the final animation starts from the full fidelity of the original generation.
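To make the Depth Extraction stage more concrete, here is a minimal sketch of how a raw depth map might be turned into the 8-bit grayscale control image that a depth ControlNet typically takes as input. This is not taken from the workflow itself: the function name depth_to_control_image and the invert flag are hypothetical, and the min-max normalization is an assumption about what the depth preprocessor node does internally. Whether near surfaces should be bright or dark depends on the estimator and ControlNet pairing, so the flag is only there to illustrate that choice.

```python
from typing import List


def depth_to_control_image(
    depth: List[List[float]], invert: bool = False
) -> List[List[int]]:
    """Min-max normalize a raw depth map to 8-bit grayscale (0-255).

    Hypothetical helper for illustration; the workflow's actual
    preprocessor node may scale or clip values differently.
    """
    flat = [value for row in depth for value in row]
    if not flat:
        raise ValueError("depth map is empty")

    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant depth map carries no geometry; return all black.
        return [[0 for _ in row] for row in depth]

    result = []
    for row in depth:
        out_row = []
        for value in row:
            t = (value - lo) / span  # scale into 0.0 .. 1.0
            if invert:
                t = 1.0 - t          # flip the near/far convention
            out_row.append(round(t * 255))
        result.append(out_row)
    return result


if __name__ == "__main__":
    raw = [[0.0, 2.0], [4.0, 10.0]]
    print(depth_to_control_image(raw))
    print(depth_to_control_image(raw, invert=True))
```

Because only relative values survive the normalization, the control image describes the layout of the scene rather than absolute distances, which is why the prompt can change freely while the pose stays fixed.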
Changelog:
v2026.04.03: Added JoyCaption integration for automatic prompt generation.
v2026.04.01: Reorganized node grouping for improved readability.