Bernini-R Three-Image Reference Cinematic Video Workflow

Watch the full video first if you want to understand how this Bernini-R three-image reference video workflow works in practice. The video shows how multiple reference images can be combined into one cinematic video generation pipeline, how the prompt enhancement system rewrites the visual concept, and how the final video can be generated online without rebuilding the full ComfyUI environment locally.

This ComfyUI workflow is designed for Bernini-R three-image reference video generation. Its main purpose is to take three visual references and use them together as the foundation for a finished short video. Compared with a single-image image-to-video workflow, this graph gives the model more visual material to understand subject identity, supporting elements, atmosphere, scene structure, and cinematic direction.

The workflow is built around the Bernini-R high-noise and low-noise model route. It uses Bernini_HIGH_fp8_e4m3fn_scaled.safetensors and Bernini_LOW_fp8_e4m3fn_scaled.safetensors as the dual model branches. It also uses UMT5 XXL fp8 text encoding, Wan 2.1 VAE, BerniniConditioning, KSamplerAdvanced, VAEDecode, CreateVideo, SaveVideo, and PathchSageAttentionKJ. The model chain includes both LightX2V acceleration LoRA and UnifiedReward-Flex LoRA for the high-noise and low-noise branches, helping the workflow stay more efficient while improving visual quality and coherence.
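For a local deployment, a quick check that the two Bernini checkpoints are in place can save a failed first run. The sketch below is only an illustration: the two filenames come from the description above, but the `diffusion_models` subfolder and the `missing_models` helper are assumptions about a typical ComfyUI layout, not something the workflow ships.

```python
from pathlib import Path

# Only these two filenames come from the workflow description. The text
# encoder, VAE, and LoRA filenames are not listed there, so they are omitted.
BERNINI_MODELS = (
    "Bernini_HIGH_fp8_e4m3fn_scaled.safetensors",
    "Bernini_LOW_fp8_e4m3fn_scaled.safetensors",
)


def missing_models(models_root, subfolder="diffusion_models", names=BERNINI_MODELS):
    """Return the expected filenames that are not present on disk.

    `subfolder` is an assumed location; adjust it to match your install.
    """
    folder = Path(models_root) / subfolder
    return [name for name in names if not (folder / name).is_file()]
```

Calling `missing_models("ComfyUI/models")` returns an empty list when both branches are installed in the assumed folder, and otherwise the names still to be downloaded.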

The reference section is the core of this workflow. Three LoadImage nodes provide three separate image references. Each image is processed through image_scale_pixel_v2, then combined through BatchImagesNode. These batched images enter BerniniPromptEnhancer and BerniniConditioning as the multi-reference visual condition. This allows the workflow to treat the first image as the main subject, the second image as a secondary presence or object, and the third image as another visual element, environment cue, or story component.
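To make the reference preparation more concrete, here is a minimal sketch of the two ideas in this section: a fixed role order for the batch, and rescaling each image toward a common pixel budget. The internals of image_scale_pixel_v2 are not described here, so the pixel-budget rule, the rounding to multiples of 16, and both helper names are assumptions for illustration only.

```python
import math

# Batch order described above: main subject, secondary presence, third element.
REFERENCE_ROLES = ("main_subject", "secondary", "environment_or_story")


def order_references(paths_by_role):
    """Return image paths in batch order, requiring exactly the three roles."""
    if set(paths_by_role) != set(REFERENCE_ROLES):
        raise ValueError(f"expected roles {REFERENCE_ROLES}")
    return [paths_by_role[role] for role in REFERENCE_ROLES]


def scale_to_pixel_budget(width, height, target_pixels=480 * 848, multiple=16):
    """Resize dimensions to roughly `target_pixels`, keeping the aspect ratio.

    Assumed behavior, not the node's documented algorithm: scale both sides
    by the same factor, then snap each side to the nearest `multiple`.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    factor = math.sqrt(target_pixels / (width * height))

    def snap(value):
        return max(multiple, round(value * factor / multiple) * multiple)

    return snap(width), snap(height)
```

Under these assumptions, a 1080×1920 reference maps to 480×848, which matches the vertical setup used later in the graph.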

The prompt system is also important. BerniniPromptEnhancer is used to build a Bernini-specific instruction with r2v reference-to-video logic. Then RHLLMChatNode rewrites the instruction into a more complete video prompt. The output is cleaned through StringReplace nodes, removing the JSON wrapper before sending the rewritten prompt into CLIPTextEncode. This makes the workflow more practical because the user can start from a rough idea and let the system expand it into a detailed cinematic generation prompt.
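The cleanup step can be pictured outside the graph as well. The workflow itself uses StringReplace nodes; the sketch below swaps in a JSON parse with a plain-text fallback, which is a different technique with the same goal. The wrapper shape (a single object with a `prompt` key) is hypothetical, since the exact LLM output format is not shown here.

```python
import json


def extract_prompt(raw, key="prompt"):
    """Return the rewritten prompt from an LLM reply.

    Assumes the reply is either plain text or a JSON object such as
    {"prompt": "..."}; the key name is a hypothetical example.
    """
    text = raw.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Not JSON at all: treat the reply as the prompt itself.
        return text
    if isinstance(data, dict) and isinstance(data.get(key), str):
        return data[key].strip()
    # Valid JSON but not the expected shape: fall back to the raw text.
    return text
```

Parsing is more tolerant of escaped quotes and line breaks than fixed string replacement, but either approach should leave a prompt that reads as normal sentences before it reaches CLIPTextEncode.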

The uploaded example focuses on an epic fantasy scene: a guardian on a high platform, a colossal dark presence, a cracked sky, a collapsing floating holy city, broken bridges, glowing waterfalls of light, black energy tides, and final-battle atmosphere. This shows the intended strength of the workflow: combining multiple image references into one coherent cinematic story scene instead of simply animating one static picture.

The generation path uses BerniniConditioning with a vertical video setup around 480×848 and 129 frames. The first KSamplerAdvanced stage handles the main high-noise construction, while the second stage refines the output through the low-noise route. The final latent is decoded with Wan 2.1 VAE and exported through CreateVideo and SaveVideo.
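The numbers in this path can be sanity-checked with a short sketch. It assumes the usual Wan 2.1 VAE compression of 8× spatially and 4× temporally, and it expresses the high/low hand-off through KSamplerAdvanced's `start_at_step` and `end_at_step` inputs. The step counts in the usage note are hypothetical; the values in the actual graph are not stated here.

```python
def wan_latent_shape(width, height, frames, spatial=8, temporal=4):
    """Return (latent_frames, latent_height, latent_width).

    Assumes 8x spatial and 4x temporal compression, with frame counts
    of the form 4k + 1.
    """
    if (frames - 1) % temporal != 0:
        raise ValueError("frame count should be a multiple of 4, plus 1")
    return ((frames - 1) // temporal + 1, height // spatial, width // spatial)


def two_stage_plan(total_steps, switch_step):
    """Split one denoising schedule across two KSamplerAdvanced nodes."""
    if not 0 < switch_step < total_steps:
        raise ValueError("switch_step must fall inside the schedule")
    high_noise = {
        "add_noise": "enable",
        "steps": total_steps,
        "start_at_step": 0,
        "end_at_step": switch_step,
        # Hand the partially denoised latent to the second stage.
        "return_with_leftover_noise": "enable",
    }
    low_noise = {
        "add_noise": "disable",
        "steps": total_steps,
        "start_at_step": switch_step,
        "end_at_step": total_steps,
        "return_with_leftover_noise": "disable",
    }
    return high_noise, low_noise
```

With these assumptions, `wan_latent_shape(480, 848, 129)` gives `(33, 106, 60)`, and `two_stage_plan(8, 4)` would run steps 0 to 4 on the high-noise branch and steps 4 to 8 on the low-noise branch.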

Compared with ordinary image-to-video workflows, this Bernini-R three-image workflow is stronger for concept-driven cinematic generation. It is suitable for fantasy scenes, character-and-creature shots, multi-reference story clips, short-form vertical videos, game-style cinematic previews, AI trailers, Bilibili showcases, YouTube tutorials, RunningHub releases, and Civitai workflow publishing.

Main features:

Bernini-R three-image reference video workflow
Three reference images combined into one video condition
Reference-to-video / image-to-video generation logic
Bernini HIGH / LOW fp8 dual-model route
UMT5 XXL fp8 text encoder
Wan 2.1 VAE decoding
image_scale_pixel_v2 reference image preparation
BatchImagesNode multi-image batching
BerniniPromptEnhancer prompt creation
RHLLMChatNode automatic prompt rewriting
JSON cleanup chain for LLM output
BerniniConditioning i2v / r2v control
PathchSageAttentionKJ optimization
LightX2V high / low noise LoRA support
UnifiedReward-Flex high / low noise LoRA support
KSamplerAdvanced two-stage generation
Vertical 480×848 / 129-frame video setup
CreateVideo and SaveVideo final output

Suggested workflow:

Prepare three clear reference images first. Use the first image for the main subject, the second image for the main supporting object or character, and the third image for the environment, atmosphere, or additional story element. Keep the images visually readable and avoid references that conflict too heavily with each other.

Load all three images into the workflow, then write a direct scene prompt describing the final video concept, camera movement, action, lighting, environment, and story direction. Let BerniniPromptEnhancer and RHLLMChatNode expand the prompt into a more complete Bernini instruction. Check the cleaned prompt before rendering.

If one reference dominates too strongly, simplify the prompt or adjust the reference set. If the result lacks motion, describe camera pullback, subject movement, environmental motion, and scene progression more explicitly.
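One way to keep the starting prompt direct is to fill in the same fields every time. The sketch below is a hypothetical helper, not part of the workflow: the `ScenePrompt` class and its field names simply mirror the elements listed above, and the result is the rough text you would hand to BerniniPromptEnhancer.

```python
from dataclasses import dataclass


@dataclass
class ScenePrompt:
    """Hypothetical container for the rough scene prompt."""

    concept: str
    camera: str = ""
    action: str = ""
    lighting: str = ""
    environment: str = ""
    story: str = ""

    def render(self):
        """Join the non-empty fields into one short paragraph."""
        if not self.concept.strip():
            raise ValueError("concept is required")
        parts = [self.concept, self.camera, self.action,
                 self.lighting, self.environment, self.story]
        cleaned = [p.strip().rstrip(".") for p in parts if p.strip()]
        return ". ".join(cleaned) + "."
```

If a render lacks motion, the `camera` and `action` fields are the natural places to spell out the pullback, subject movement, and scene progression.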

⚙️ RunningHub Workflow

Try the workflow online right now — no installation required.
👉 Workflow: https://www.runninghub.ai/post/2062503680565403649?inviteCode=rh-v1111

If the results meet your expectations, you can later deploy it locally for customization.

🎁 Fan Benefits: Register to get 1000 points, plus 100 points for each daily login, and run the workflow on 4090 hardware with 48 GB.

📺 Bilibili Updates (Mainland China & Asia-Pacific)

If you’re in mainland China or the Asia-Pacific region, you can watch the video below to see the workflow demonstration and creative breakdown.
📺 Bilibili Video: https://www.bilibili.com/video/BV1yLEc6dEJc/

☕ Support Me on Ko-fi

If you find my content helpful and want to support future creations, you can buy me a coffee ☕.
Every bit of support helps me keep creating — just like a spark that can ignite a blazing flame.
👉 Ko-fi: https://ko-fi.com/aiksk

💼 Business Contact

For collaboration or inquiries, please contact aiksk95 on WeChat.

I keep updating model resources on Quark Netdisk:
👉 https://pan.quark.cn/s/20c6f6f8d87b
These resources are mainly intended for local users, for creation and learning.

Files

berniniRThreeImage_v10.zip
