🚀 Singularity LTX-2.3 OmniCine V1 (Official Release)
Try It Online
Experience the full potential of this model via the optimized workflow on RunningHub: 👉 https://www.runninghub.ai/post/2062051326342815746?inviteCode=sdhs0trb
This is not just a standard fine-tune; it is a fundamental restructuring of the LTX-Video (2.3) generation logic.
I am thrilled to present the official release of LTX2.3 Singularity to the Civitai community! This comprehensive optimization framework focuses heavily on Image-to-Video (I2V), First & Last Frame Control, and Reference-to-Video generation. Although it has so far been trained for only about 100,000 steps (calculated with gradient accumulation), its enhancements in physical consistency, dynamic motion, and cinematic expression have already far exceeded expectations.
🌟 Key Improvements & Features
🦴 Limbs & Anatomy Evolution: Specifically optimized to fix the common degradation of fingers and toes, drastically reducing anatomy warping and artifacts during fast movements.
🎬 Injecting Shot Continuity: Shots and camera cuts can be placed precisely on a timeline, controlled directly via text prompts (0-5s logical segments), replacing erratic, randomized framing.
🗣️ Elimination of "AI Stiffness": Significantly enhanced facial expressiveness during speech, deeply optimized lip-syncing, and natively eliminated the rigid, burned-in subtitles frequently generated by the base model.
⚖️ Physical Consistency: Improved the structural integrity of characters and environments during high-speed actions, suppressing chaotic "twisting/morphing" and aligning motions with real-world physics.
🎨 Flawless Anime Compatibility: Integrated a high-quality Anime training dataset, allowing the model to seamlessly adapt across diverse styles including 2D anime, 3D CGI, and hyper-realism.
🌪️ Extreme Dynamic Range: Delivers stellar performance in high-action sequences like running and combat sports. Simultaneously, visual effects for cyberpunk themes, transformations, magic casting, and monster rendering have been massively amplified.
🖼️ Revolutionary Reference Image Control: Upgraded the "Reference-to-Video" capability. No longer bound to rigid first-frame constraints, the model intelligently extracts character features and artistic styles from the reference image, generating entirely new angles and compositions based on your prompts.
📊 Current Limitation Note: While it meets the vast majority of movement demands, slight motion blur may still occur during extreme, highly complex actions. This is currently being addressed via optimized post-processing workflows—stay tuned!
⚙️ Generation & Usage Guide
To get the absolute best results from this LoRA, please follow these recommendations:
Recommended Base Model: ltx-2.3-22b-distilled-1.1_transformer_only_fp8_scaled.safetensors
ComfyUI Workflow: Available in the files/post section. Using it in First & Last Frame Mode is highly recommended for ultimate scene control.
LoRA Weight: Recommended to start at 0.8 - 1.0 and adjust based on your specific prompt intensity.
📝 Exclusive: Singularity Prompting Framework
This model follows a strict prompt structure to unlock its full cinematic potential. Please adhere closely to the "Cinematic Timeline Structure" below.
💡 Core Rule: Keep visual descriptions, timestamps, actions, and dialogue strictly formatted in English as shown below.
📐 Prompt Template Structure
[Scene & Style]: Core visual description in one sentence (e.g., Cinematic wuxia style, dim lighting, Anime, 3D).
[Action Timeline]: 0-X seconds, [action / emotional description].
[Camera Timeline]: 0-X seconds, [camera movement / composition parameters].
[Environment]: Lighting source, contrast, and color grading details.
[Dialogue]: 0-X seconds, [Character] says: "[Dialogue text]".
[Audio & Technical]: Background sounds, film grain, subtitle exclusion commands, etc.
🎬 Example Prompt
Cinematic wuxia style, indoor dim lighting, mysterious mood. 0-10 seconds, young man in ancient white robes looks down with a confused expression. 0-10 seconds, tight close-up, static camera with slight handheld movement. Dark stone background, warm candlelight bokeh. 0-10 seconds, man says: "What on earth is this? I've never heard of it before.". Voice: low and confused, Pace: slow. Precise lip-sync, film grain, cinematic bokeh, no subtitles.
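The template and example above can be assembled programmatically. The following is a minimal sketch, not part of the official workflow: the helper names `timed` and `build_prompt` and their parameters are hypothetical, and the sentence order (scene, action, camera, environment, dialogue, audio/technical) is simply inferred from the example prompt.

```python
from typing import Iterable, Tuple

Timed = Tuple[int, int, str]        # (start_s, end_s, description)
Spoken = Tuple[int, int, str, str]  # (start_s, end_s, speaker, line)


def timed(start: int, end: int, text: str) -> str:
    """Format one timeline entry, e.g. '0-5 seconds, man looks up'."""
    if not 0 <= start < end:
        raise ValueError(f"expected 0 <= start < end, got {start}-{end}")
    return f"{start}-{end} seconds, {text.strip().rstrip('.')}"


def build_prompt(
    scene: str,
    actions: Iterable[Timed],
    cameras: Iterable[Timed],
    environment: str,
    dialogue: Iterable[Spoken],
    audio_technical: str,
) -> str:
    """Join the six template sections into one prompt string.

    Hypothetical helper: the section order mirrors the example prompt.
    """
    sentences = [scene]
    sentences += [timed(*entry) for entry in actions]
    sentences += [timed(*entry) for entry in cameras]
    sentences.append(environment)
    # The period after the closing quote mirrors the example prompt.
    sentences += [
        timed(start, end, f'{who} says: "{line}"')
        for start, end, who, line in dialogue
    ]
    sentences.append(audio_technical)
    # End every non-empty section with exactly one period.
    return " ".join(
        part.strip().rstrip(".") + "." for part in sentences if part.strip()
    )


prompt = build_prompt(
    scene="Cinematic wuxia style, indoor dim lighting, mysterious mood",
    actions=[(0, 10, "young man in ancient white robes looks down "
                     "with a confused expression")],
    cameras=[(0, 10, "tight close-up, static camera with slight "
                     "handheld movement")],
    environment="Dark stone background, warm candlelight bokeh",
    dialogue=[(0, 10, "man",
               "What on earth is this? I've never heard of it before.")],
    audio_technical="Voice: low and confused, Pace: slow. Precise lip-sync, "
                    "film grain, cinematic bokeh, no subtitles",
)
print(prompt)
```

For multi-shot prompts, pass several entries per timeline, for example `actions=[(0, 5, "..."), (5, 10, "...")]`. The helper only checks that each window is well-formed; it does not verify that segments are contiguous or that the total length matches the generated clip.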
🛠️ Dev Log (Behind the Scenes)
LTX2.3 is an architecture with immense latent potential, but I believe it requires more structured guidance to truly understand complex motion.
In this fine-tuning run, I abandoned brute-force stacking of action datasets. Instead, I shifted towards high-quality dialogue scenes and clean, easily digestible action sequences for the model to fit. Furthermore, I deliberately reduced the ratio of real-world video footage. Real-world clips often carry heavy native motion blur. When combined with LTX2.3's high latent compression ratio, the model easily loses temporal attention during high-velocity sequences, causing character consistency to collapse. By filtering out this noise, character feature retention has been massively reinforced.
What's Next? This run highlighted a few minor limitations that I plan to iterate on in the next version. However, given the intensity of this development cycle, I need to take a quick break before diving back in.
❤️ Support the Project: If you enjoy utilizing this model, please leave a 5-star review, drop a ❤️, and post your generations below! Your feedback and buzz directly shape the training set for the next phase. Enjoy the visual revolution of Singularity!
If you have questions, feedback, or want to collaborate on AI video workflows, feel free to reach out:
WeChat (微信): aigctyd
QQ Group (QQ社群): 1058747239
Email: [email protected]
I'm actively looking for community feedback to refine the full version of this LoRA. Let's push the boundaries of LTX2.3 together!
Comments (15)
Reference video? Is it IC Lora
The audio is really muffled and full of electrical noise.
do you think making a smaller sized lora would be possible?
does someone have a concrete explanation on what this actually does? the LLM slop description doesn't help at all.
A wild guess. Since OmniNFT was improving video generation, it would match with this Omni name.
Honestly, if devs can't write a simple explanation of what the model actually is, it's safe to say it's just bullshit. One does not have to be a native speaker. They can just input whatever they have done into an LLM and ask it to make a nice simplified description, rather than posting this slop description.
Using it with the Eros model, I barely see any difference at best; otherwise the gens come out worse than without it.
Too many bullshitters these days.
@gsffff I don't wanna be a hater, but the amount of CN accounts vagueposting, for their files to do absolutely nothing, is getting kind of absurd.
It's not made for Eros/Sulphur. Those require separate LoRAs.
The 0.1 version is good with Eros at 0.3-0.5 strength, but the 1.0 sfw and nsfw LoRAs have a negative effect.
For those experiencing muffled audio: do not connect the LoRA in the second upscale stage, and set the LoRA strength to 0.5. After that you can have a lot of fun with it.
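In this workflow, is the 😶🌫️llama智能对话@炮老师的小课堂 prompt reverse-inference node unable to generate NSFW content? It responds with something like "as a responsible AI, ethics..."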