Special thanks to:
@boinobin730 for initiating, driving and supporting this project in many ways: providing links, running tests, sharing knowledge and inspiring discussions.
@Urabewe for publishing the original, perfectly running 12 GB VRAM LTX-2 workflows on which this workflow is mainly based.
Features:
Simple-to-use all-in-one LTX-2 workflow with options for:
Text to Video + Audio
Image to Video + Audio
Video to Video + Audio
Image + Audio to Video + Audio
easy switching between all options with the Fast Groups Bypasser,
all steps highly automated: no manual frame count or aspect ratio calculations necessary,
inputs are easy to set with predefined sliders (no risk of entering wrong values, such as invalid aspect ratios or frame counts),
brilliant audio generation (speech/sound) with LTX-2.
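For readers curious what kind of calculation the sliders take off your hands, here is a minimal Python sketch. It is not taken from the workflow itself. It assumes the constraints documented for earlier LTX-Video models (frame counts of the form 8*n + 1, width and height divisible by 32) and an assumed frame rate of 24 fps; the function names snap_frame_count and snap_resolution are hypothetical.

```python
def snap_frame_count(seconds: float, fps: int = 24) -> int:
    """Round a duration to the nearest frame count of the form 8*n + 1.

    The 8*n + 1 rule and the default of 24 fps are assumptions,
    not values read from this workflow.
    """
    raw_frames = seconds * fps
    n = max(1, round((raw_frames - 1) / 8))
    return 8 * n + 1


def snap_resolution(width: int, aspect_w: int, aspect_h: int,
                    multiple: int = 32) -> tuple[int, int]:
    """Snap width and the derived height to a multiple of `multiple`.

    Divisibility by 32 is an assumption based on earlier LTX-Video models.
    """
    w = max(multiple, round(width / multiple) * multiple)
    h = max(multiple, round(w * aspect_h / aspect_w / multiple) * multiple)
    return w, h


if __name__ == "__main__":
    print(snap_frame_count(5))           # 121 frames for 5 seconds at 24 fps
    print(snap_resolution(1280, 3, 2))   # (1280, 864)
```

Under these assumptions, a 3:2 request at width 1280 lands on 1280 x 864, the same size used in the speed test in this description.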
Requirements:
GPU with 12 GB VRAM (it may work with less, but this has not been tested),
32 GB RAM,
Swap file size: 64 - 128 GB.
Speed and video length:
Runs very fast: a 5-second video (1280 x 864) takes less than 10 minutes.
Long, high-quality videos can be generated in one run: 10 - 20 seconds without any issues.
Test run: a 30-second video (1024 x 704) took around 40 minutes without any OOM errors. Longer videos might be possible, but have not been tested yet.
Important:
This workflow is intended for advanced ComfyUI users who know how to install and operate the system and are able to resolve basic errors themselves, such as node conflicts or general system issues.
What comes next?
My intention is to add the following parts:
- integrating Urabewe's text + audio to video workflow,
- adding different audio engines for more complex, integrated audio generation.
About this workflow:
This workflow is mainly based on the fantastic LTX-2 workflows of @Urabewe.
As far as I know, those were the first workflows running LTX-2 with 12 GB VRAM. All credit goes to the original creator.
My job was only to combine and organise the different workflows in a hopefully simple-to-use all-in-one design, adding some simple calculations and sliders to make inputs as simple as possible, and running a lot of tests to make some minor optimisations, find the limits and hopefully kill all bugs.
Description
First "beta" version - should run with all options:
Text 2 Video
Image 2 Video
Video 2 Video
Image + Audio 2 Video