    Bernini-R Three-Image Reference Cinematic Video Workflow - v1.0

    Watch the full video first if you want to understand how this Bernini-R three-image reference video workflow works in practice. The video shows how multiple reference images can be combined into one cinematic video generation pipeline, how the prompt enhancement system rewrites the visual concept, and how the final video can be generated online without rebuilding the full ComfyUI environment locally.

    This ComfyUI workflow is designed for Bernini-R three-image reference video generation. Its main purpose is to take three visual references and use them together as the foundation for a finished short video. Compared with a single-image image-to-video workflow, this graph gives the model more visual material to understand subject identity, supporting elements, atmosphere, scene structure, and cinematic direction.

    The workflow is built around the Bernini-R high-noise and low-noise model route. It uses Bernini_HIGH_fp8_e4m3fn_scaled.safetensors and Bernini_LOW_fp8_e4m3fn_scaled.safetensors as the dual model branches. It also uses UMT5 XXL fp8 text encoding, Wan 2.1 VAE, BerniniConditioning, KSamplerAdvanced, VAEDecode, CreateVideo, SaveVideo, and PathchSageAttentionKJ. The model chain includes both LightX2V acceleration LoRA and UnifiedReward-Flex LoRA for the high-noise and low-noise branches, helping the workflow stay more efficient while improving visual quality and coherence.

    The reference section is the core of this workflow. Three LoadImage nodes provide three separate image references. Each image is processed through image_scale_pixel_v2, then combined through BatchImagesNode. These batched images enter BerniniPromptEnhancer and BerniniConditioning as the multi-reference visual condition. This allows the workflow to treat the first image as the main subject, the second image as a secondary presence or object, and the third image as another visual element, environment cue, or story component.
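    The reference preparation step can be sketched roughly as follows. This is a minimal illustration, assuming image_scale_pixel_v2 scales each image toward a total pixel budget while keeping its aspect ratio. The function name `fit_to_pixel_budget`, the multiple-of-16 rounding, the sample image sizes, and the budget value are all assumptions for illustration, not the node's actual implementation.

```python
import math


def fit_to_pixel_budget(width, height, max_pixels, multiple=16):
    """Return (new_width, new_height) near max_pixels total, keeping aspect ratio.

    Assumption: each side is rounded down to a multiple of `multiple`;
    the real node may round differently.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = math.sqrt(max_pixels / (width * height))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h


# Hypothetical reference sizes; the role order follows the description above.
references = [
    ("main subject", 1024, 1536),
    ("secondary presence", 1920, 1080),
    ("environment cue", 768, 768),
]
budget = 480 * 848  # assumed budget, taken from the workflow's output resolution

# Keep the three references in a fixed order, as the batching step would.
batch = [(role, *fit_to_pixel_budget(w, h, budget)) for role, w, h in references]
for role, w, h in batch:
    print(f"{role}: {w}x{h} ({w * h} px)")
```

    Keeping the list in a fixed order matters here because the workflow assigns meaning to each reference by position.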

    The prompt system is also important. BerniniPromptEnhancer is used to build a Bernini-specific instruction with r2v reference-to-video logic. Then RHLLMChatNode rewrites the instruction into a more complete video prompt. The output is cleaned through StringReplace nodes, removing the JSON wrapper before sending the rewritten prompt into CLIPTextEncode. This makes the workflow more practical because the user can start from a rough idea and let the system expand it into a detailed cinematic generation prompt.
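    The cleanup step can be approximated in plain Python. This is only a sketch of the idea: the workflow itself uses StringReplace nodes, and the wrapper shape assumed here (an optional Markdown code fence around a one-field JSON object with a hypothetical "prompt" key) may not match what RHLLMChatNode actually returns.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
FENCE_PATTERN = re.compile(r"^" + FENCE + r"(?:json)?\s*|\s*" + FENCE + r"$")


def clean_llm_output(raw, key="prompt"):
    """Strip a code fence and a one-field JSON wrapper from LLM text.

    Assumption: the wrapper looks like {"prompt": "..."}; the key name is
    hypothetical. Text that is not valid JSON is returned with only the
    fence removed.
    """
    text = FENCE_PATTERN.sub("", raw.strip()).strip()
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        return text
    if isinstance(payload, dict) and isinstance(payload.get(key), str):
        return payload[key].strip()
    return text


sample = FENCE + 'json\n{"prompt": "A guardian stands on a high platform."}\n' + FENCE
print(clean_llm_output(sample))
```

    Parsing the JSON instead of deleting fixed substrings also handles escaped quotes inside the prompt, which a pure string-replace chain can miss.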

    The uploaded example focuses on an epic fantasy scene: a guardian on a high platform, a colossal dark presence, a cracked sky, a collapsing floating holy city, broken bridges, glowing waterfalls of light, black energy tides, and final-battle atmosphere. This shows the intended strength of the workflow: combining multiple image references into one coherent cinematic story scene instead of simply animating one static picture.

    The generation path uses BerniniConditioning with a vertical video setup around 480×848 and 129 frames. The first KSamplerAdvanced stage handles the main high-noise construction, while the second stage refines the output through the low-noise route. The final latent is decoded with Wan 2.1 VAE and exported through CreateVideo and SaveVideo.
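    The numbers in this path can be checked with a small sketch. It assumes the usual Wan 2.1 VAE compression (8x spatial, 4x temporal, 16 latent channels) and shows how two KSamplerAdvanced stages can share one step schedule. The helper names, the total step count, and the switch step are assumptions, since the description does not state the workflow's actual sampler values.

```python
def wan_latent_shape(width, height, frames, channels=16):
    """Latent shape (C, T, H, W) for a Wan 2.1-style VAE.

    Uses 8x spatial and 4x temporal compression, so frame counts of the
    form 4n + 1 map cleanly onto latent frames.
    """
    if (frames - 1) % 4 != 0:
        raise ValueError("frame count should be 4n + 1")
    if width % 8 or height % 8:
        raise ValueError("width and height should be multiples of 8")
    return (channels, (frames - 1) // 4 + 1, height // 8, width // 8)


def two_stage_schedule(total_steps, switch_step):
    """KSamplerAdvanced-style settings for a high-noise pass, then a low-noise pass."""
    if not 0 < switch_step < total_steps:
        raise ValueError("switch_step must fall strictly inside the schedule")
    high = {
        "add_noise": "enable",
        "steps": total_steps,
        "start_at_step": 0,
        "end_at_step": switch_step,
        "return_with_leftover_noise": "enable",  # hand a still-noisy latent to stage two
    }
    low = {
        "add_noise": "disable",  # stage two continues from the leftover noise
        "steps": total_steps,
        "start_at_step": switch_step,
        "end_at_step": total_steps,
        "return_with_leftover_noise": "disable",
    }
    return high, low


print(wan_latent_shape(480, 848, 129))  # (16, 33, 106, 60)

# Assumed values for illustration only; check the workflow for the real ones.
high, low = two_stage_schedule(total_steps=4, switch_step=2)
print(high)
print(low)
```

    The 129-frame setting fits the 4n + 1 pattern (n = 32), which is why it maps to exactly 33 latent frames.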

    Compared with ordinary image-to-video workflows, this Bernini-R three-image workflow is stronger for concept-driven cinematic generation. It is suitable for fantasy scenes, character-and-creature shots, multi-reference story clips, short-form vertical videos, game-style cinematic previews, AI trailers, Bilibili showcases, YouTube tutorials, RunningHub releases, and Civitai workflow publishing.

    Main features:

    • Bernini-R three-image reference video workflow

    • Three reference images combined into one video condition

    • Reference-to-video / image-to-video generation logic

    • Bernini HIGH / LOW fp8 dual-model route

    • UMT5 XXL fp8 text encoder

    • Wan 2.1 VAE decoding

    • image_scale_pixel_v2 reference image preparation

    • BatchImagesNode multi-image batching

    • BerniniPromptEnhancer prompt creation

    • RHLLMChatNode automatic prompt rewriting

    • JSON cleanup chain for LLM output

    • BerniniConditioning i2v / r2v control

    • PathchSageAttentionKJ optimization

    • LightX2V high / low noise LoRA support

    • UnifiedReward-Flex high / low noise LoRA support

    • KSamplerAdvanced two-stage generation

    • Vertical 480×848 / 129-frame video setup

    • CreateVideo and SaveVideo final output

    Suggested workflow:

    • Prepare three clear reference images first. Use the first image for the main subject, the second for the main supporting object or character, and the third for the environment, atmosphere, or an additional story element.

    • Keep the images visually readable and avoid references that conflict too heavily with each other.

    • Load all three images into the workflow, then write a direct scene prompt describing the final video concept, camera movement, action, lighting, environment, and story direction.

    • Let BerniniPromptEnhancer and RHLLMChatNode expand the prompt into a more complete Bernini instruction, and check the cleaned prompt before rendering.

    • If one reference dominates too strongly, simplify the prompt or adjust the reference set.

    • If the result lacks motion, describe camera pullback, subject movement, environmental motion, and scene progression more explicitly.
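    One way to keep the scene prompt complete is to fill in each of the elements named above separately and join them. The `ScenePrompt` helper below is hypothetical and not part of the workflow. It is a small sketch for drafting the text that goes into the prompt input, with example wording taken from the fantasy scene described earlier.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    """Hypothetical helper: one field per element of the scene prompt."""

    concept: str
    camera: str = ""
    action: str = ""
    lighting: str = ""
    environment: str = ""
    story: str = ""

    def render(self):
        """Join the non-empty fields into one sentence-per-field prompt."""
        parts = [
            self.concept,
            self.camera,
            self.action,
            self.lighting,
            self.environment,
            self.story,
        ]
        cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
        return ". ".join(cleaned) + "." if cleaned else ""


prompt = ScenePrompt(
    concept="A guardian stands on a high platform facing a colossal dark presence",
    camera="Slow camera pullback revealing the cracked sky",
    action="The floating holy city collapses as bridges break apart",
    lighting="Glowing waterfalls of light against black energy tides",
    story="Final-battle atmosphere",
)
print(prompt.render())
```

    An empty field, such as `environment` in this example, is simply skipped. That makes it easy to see which element is missing when a result lacks motion or atmosphere.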

    ⚙️ RunningHub Workflow

    Try the workflow online right now — no installation required.
    👉 Workflow: https://www.runninghub.ai/post/2062503680565403649?inviteCode=rh-v1111

    If the results meet your expectations, you can later deploy it locally for customization.

    🎁 Fan Benefits: Register to get 1000 points, plus 100 points for each daily login, and enjoy 48 GB 4090 performance!

    📺 Bilibili Updates (Mainland China & Asia-Pacific)

    If you’re in the Asia-Pacific region, you can watch the video below to see the workflow demonstration and creative breakdown.
    📺 Bilibili Video: https://www.bilibili.com/video/BV1yLEc6dEJc/

    ☕ Support Me on Ko-fi

    If you find my content helpful and want to support future creations, you can buy me a coffee ☕.
    Every bit of support helps me keep creating — just like a spark that can ignite a blazing flame.
    👉 Ko-fi: https://ko-fi.com/aiksk

    💼 Business Contact

    For collaboration or inquiries, please contact aiksk95 on WeChat.


    I keep model resources updated on Quark Drive (夸克网盘):
    👉 https://pan.quark.cn/s/20c6f6f8d87b
    These resources are mainly intended for local users, to support creation and learning.

    Workflows
    Wan Video 2.2 T2V-A14B

    Details

    Downloads: 146
    Platform: CivitAI
    Platform Status: Available
    Created: 6/6/2026
    Updated: 6/29/2026
    Deleted: -

    Files

    berniniRThreeImage_v10.zip

    Mirrors

    CivitAI (1 mirror)