🚀 Z-Image AIO Collection
⚡ Base & Turbo • All-in-One • Bilingual Text • Qwen3-4B
⚠️ IMPORTANT: Requires ComfyUI v0.11.0+
📥 Download ComfyUI
✨ What is Z-Image AIO?
Z-Image AIO is an All-in-One repackage of Alibaba Tongyi Lab's 6B parameter image generation models.
Everything integrated:
✅ VAE already built-in
✅ Qwen3-4B Text Encoder integrated
✅ Just download and generate!
🎯 Available Versions
🔥 Z-Image-Turbo-AIO (8 Steps • CFG 1.0)
Ultra-fast generation for production & daily use
⚫ NVFP4-AIO (7.8 GB) 🚀
🎯 ONLY for NVIDIA Blackwell GPUs (RTX 50xx)!
⚡ Maximum speed optimized
💾 Smallest file size
🚀 FP4 precision - blazing fast
Perfect for: RTX 5070, 5080, 5090 owners who want maximum speed
🟡 FP8-AIO (10 GB) ⭐ RECOMMENDED
✅ Best balance of size & quality
✅ Works on 8GB VRAM
✅ Fast downloads
✅ Ideal for most users
Perfect for: Daily use, testing, RTX 3060/4060/4070
🔵 FP16-AIO (20 GB)
💾 Same file size as BF16
🔄 ComfyUI auto-casts to BF16 for compute
⚠️ Does NOT enable FP16 compute mode
📦 Alternative download option
Note: Z-Image does not support FP16 compute - activation values exceed FP16's max range, causing NaN/black images. Weights are cast to BF16 during inference regardless of file format.
Perfect for: Alternative to BF16 download (identical inference behavior)
🟢 BF16-AIO (20 GB) ⭐ RECOMMENDED FOR FULL PRECISION
✅ BFloat16 full precision
✅ Absolute best quality
✅ Professional projects
✅ Also works on 8GB VRAM
Perfect for: Professional work, maximum quality
🎨 Z-Image-Base-AIO (28-50 Steps • CFG 3-5)
Full creative control for pros & LoRA training
🟡 FP8-AIO (10 GB)
✅ Efficient for daily use
✅ Full CFG control
✅ Negative prompts supported
✅ 8GB VRAM compatible
Perfect for: Daily work with full control
🔵 FP16-AIO (20 GB)
💾 Same file size as BF16
🔄 ComfyUI auto-casts to BF16 for compute
⚠️ Does NOT enable FP16 compute mode
📦 Alternative download option
Note: See technical explanation in FAQ below.
Perfect for: Alternative to BF16 download (identical inference behavior)
🟢 BF16-AIO (20 GB) ⭐ RECOMMENDED FOR FULL PRECISION
✅ Maximum quality
✅ Ideal for LoRA training
✅ Professional projects
✅ Highest precision
Perfect for: LoRA training, professional work
🔄 Turbo vs Base - When to Use?
⚡ Use TURBO when:
⚡ Speed is priority → 8 steps = 3-10 seconds
📸 Production workflows → Consistent high quality
💾 Quick iterations → Rapid prototyping
🎯 Simple prompts → Less complex scenes
🎨 Use BASE when:
🎨 Creative exploration → Higher diversity
🔧 LoRA/ControlNet dev → Undistilled foundation
📝 Complex prompting → Full CFG control
🚫 Negative prompts needed → Remove unwanted elements
⚙️ Recommended Settings
⚡ Turbo Settings (incl. NVFP4)
📊 Steps: 8
🎛️ CFG: 1.0 (don't change!)
🎲 Sampler: res_multistep OR euler_ancestral
📅 Scheduler: simple OR beta
📐 Resolution: 1920×1088 (recommended)
🚫 Negative Prompt: ❌ Not used!
🎨 Base Settings
📊 Steps: 28-50
🎛️ CFG: 3.0-5.0 (start with 4.0)
🎲 Sampler: euler ✅ OR dpmpp_2m
📅 Scheduler: normal ✅ OR karras
📐 Resolution: 512×512 to 2048×2048
🚫 Negative Prompt: ✅ Fully supported!
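For scripting or note-keeping, the settings above can be collected into a small Python lookup table. This is an illustrative sketch — the key names loosely mirror ComfyUI's KSampler fields but are not an official API:

```python
# Recommended sampler settings from this README, as plain data.
# Illustrative only: key names loosely mirror ComfyUI's KSampler
# fields, they are not an official ComfyUI API.
SETTINGS = {
    "turbo": {  # also applies to NVFP4
        "steps": 8,
        "cfg": 1.0,                  # don't change!
        "sampler": "res_multistep",  # or "euler_ancestral"
        "scheduler": "simple",       # or "beta"
        "resolution": (1920, 1088),
        "negative_prompt": False,    # not used by Turbo
    },
    "base": {
        "steps": 36,                 # anywhere in 28-50
        "cfg": 4.0,                  # start here; range 3.0-5.0
        "sampler": "euler",          # or "dpmpp_2m"
        "scheduler": "normal",       # or "karras"
        "resolution": (1024, 1024),  # 512x512 up to 2048x2048
        "negative_prompt": True,     # fully supported
    },
}

def settings_for(model: str) -> dict:
    """Return the recommended settings for 'turbo' or 'base'."""
    return SETTINGS[model]

print(settings_for("turbo")["cfg"])  # -> 1.0
```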
📊 Quick Overview

Turbo Versions

| Version | Size | VRAM / GPU | Note |
|---------|------|------------|------|
| ⚫ NVFP4 | 7.8 GB | RTX 50xx only | Max Speed 🚀 |
| 🟡 FP8 | 10 GB | 8GB VRAM | Recommended ⭐ |
| 🔵 FP16 | 20 GB | → BF16 compute | See FAQ ⚠️ |
| 🟢 BF16 | 20 GB | 8GB VRAM | Max Quality ⭐ |

Base Versions

| Version | Size | VRAM / GPU | Note |
|---------|------|------------|------|
| 🟡 FP8 | 10 GB | 8GB VRAM | Efficient |
| 🔵 FP16 | 20 GB | → BF16 compute | See FAQ ⚠️ |
| 🟢 BF16 | 20 GB | 8GB VRAM | LoRA Training ⭐ |
💡 Prompting Guide
✅ Good Example:
Professional food photography of artisan breakfast plate.
Golden poached eggs on sourdough toast, crispy bacon, fresh
avocado slices. Morning sunlight creating warm glow. Shallow
depth of field, magazine-quality presentation.
❌ Bad Example:
breakfast, eggs, bacon, toast, food, morning, plate
📝 Tips
DO:
✅ Use natural language
✅ Be detailed (100-300 words)
✅ Describe lighting & mood
✅ Specify camera angle
✅ English OR Chinese (or both!)
DON'T:
❌ Tag-style prompts (tag1, tag2, tag3)
❌ Very short prompts (under 50 words)
❌ Negative prompts with Turbo
🌏 Bilingual Text Rendering
English:
Neon sign reading "OPEN 24/7" in bright blue letters
above entrance. Modern sans-serif font, glowing effect.
Chinese:
Traditional tea house entrance with sign reading
"古韵茶坊" in elegant gold Chinese calligraphy.
Both:
Modern cafe with bilingual sign. "Morning Brew" in
white script above, "晨曦咖啡" in Chinese below.
📥 Installation
Step 1: Download
Choose your version based on:
GPU: RTX 50xx → NVFP4 possible
VRAM: 8GB → FP8 recommended
Purpose: LoRA Training → Base BF16
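The decision points above can be sketched as a tiny helper function. `pick_version` and its arguments are hypothetical — it only encodes the rules of thumb from this section:

```python
def pick_version(gpu_series: int, lora_training: bool = False) -> str:
    """Suggest an AIO checkpoint per the decision points above.

    Hypothetical helper: gpu_series is e.g. 50 for RTX 50xx, 40 for RTX 40xx.
    """
    if lora_training:
        return "Z-Image-Base-BF16-AIO"    # undistilled base, highest precision
    if gpu_series >= 50:
        return "Z-Image-Turbo-NVFP4-AIO"  # FP4 is Blackwell-only
    return "Z-Image-Turbo-FP8-AIO"        # 8GB-VRAM-friendly default

print(pick_version(40))  # -> Z-Image-Turbo-FP8-AIO
```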
Step 2: Place File
ComfyUI/models/checkpoints/
└── Z-Image-Turbo-FP8-AIO.safetensors
Step 3: Load & Generate
Open ComfyUI (v0.11.0+!)
Use "Load Checkpoint" node
Select your AIO version
Generate!
No separate VAE or Text Encoder needed!
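One way to confirm that the VAE and text encoder really are bundled is to peek at the checkpoint's tensor names. The safetensors format begins with an 8-byte little-endian header length followed by a JSON header, so the top-level name prefixes can be listed without loading any weights. The sketch below builds a tiny stand-in file; the prefix names are illustrative, not the actual tensor names inside the Z-Image AIO checkpoints:

```python
import json
import os
import struct
import tempfile

def tensor_prefixes(path: str) -> set:
    """Return top-level tensor-name prefixes from a .safetensors file.

    Layout: 8-byte little-endian header length, then a JSON header
    mapping tensor names to dtype/shape/offsets -- no weights needed.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {name.split(".")[0] for name in header if name != "__metadata__"}

# Build a minimal stand-in file. These prefix names are illustrative,
# NOT the actual tensor names inside the Z-Image AIO checkpoints.
header = {
    "model.layer0.weight": {"dtype": "F32", "shape": [1], "data_offsets": [0, 4]},
    "vae.decoder.weight": {"dtype": "F32", "shape": [1], "data_offsets": [4, 8]},
    "text_encoder.embed": {"dtype": "F32", "shape": [1], "data_offsets": [8, 12]},
}
blob = json.dumps(header).encode()
with tempfile.NamedTemporaryFile(suffix=".safetensors", delete=False) as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 12)
    path = f.name

prefixes = sorted(tensor_prefixes(path))
os.remove(path)
print(prefixes)  # -> ['model', 'text_encoder', 'vae']
```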
🙏 Credits
Original Model
👨‍💻 Developer: Tongyi Lab (Alibaba Group)
🏗️ Architecture: Single-Stream DiT (6B parameters)
📄 License: Apache 2.0
Links
🔗 Z-Image Base: https://huggingface.co/Tongyi-MAI/Z-Image
🔗 Z-Image Turbo: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
🧠 Text Encoder: https://huggingface.co/Qwen/Qwen3-4B
📅 Version History
v2.2 - FP16 Clarification
📝 Updated FP16 descriptions for technical accuracy
⚠️ Clarified: FP16 weights ≠ FP16 compute
🔄 FP16 files are cast to BF16 during inference
v2.1 - NVFP4 Release 🚀
✅ Z-Image-Turbo-NVFP4-AIO (7.8 GB)
⚡ Optimized for NVIDIA Blackwell (RTX 50xx)
🚀 Maximum speed generation
v2.0 - Base AIO Release
✅ Z-Image-Base-BF16-AIO
✅ Z-Image-Base-FP16-AIO
✅ Z-Image-Base-FP8-AIO
🔄 ComfyUI v0.11.0+ support
🧠 Qwen3-4B Text Encoder
v1.1 - FP16 Added
✅ Z-Image-Turbo-FP16-AIO
🔧 Wider GPU compatibility
v1.0 - Initial Release
✅ Z-Image-Turbo-FP8-AIO
✅ Z-Image-Turbo-BF16-AIO
✅ Integrated VAE + Text Encoder
❓ FAQ
Q: Which version should I choose?
RTX 50xx + Speed → NVFP4 🚀
Most users → Turbo FP8 ⭐
Full precision → BF16 ⭐
LoRA Training → Base BF16
Q: Turbo or Base?
Fast & simple → Turbo ⚡
Full control → Base 🎨
Q: Will NVFP4 work on my RTX 4090?
❌ No! NVFP4 is only for RTX 50xx (Blackwell architecture).
Use FP8 instead for RTX 40xx and older.
Q: Do I need separate VAE/Text Encoder?
✅ No! Everything is already integrated.
Just Load Checkpoint and go!
Q: Works on 8GB VRAM?
✅ Yes! All versions work on 8GB VRAM.
(NVFP4 requires RTX 50xx regardless of VRAM)
⚠️ Q: What about FP16 for older GPUs (RTX 2000/3000)?
Important technical clarification:
Z-Image does NOT support the FP16 compute type. Here's why:
📊 Technical reason:
- FP16 max value: ~65,504
- BF16 max value: ~3.39e+38 (same exponent range as FP32)
- Z-Image's activation values exceed FP16's range
- Result: Overflow → NaN → Black images
What actually happens:
ComfyUI automatically casts weights to BF16 for computation.
You can see this in the logs: "model weight dtype X, manual cast: torch.bfloat16"
"Weight dtype" (file format) ≠ "Compute dtype" (actual calculation)
For RTX 20xx users (no native BF16):
BF16 is emulated via FP32 = slower but works.
There is no way to run Z-Image in true FP16 compute.
FP8 with CPU offload may be a better option for limited VRAM.
TL;DR: FP16 and BF16 files behave identically during inference. Choose based on download preference, not GPU compatibility.
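The range difference can be reproduced with the standard library alone: Python's `struct` "e" format is IEEE binary16 (FP16), and since BF16 keeps FP32's 8-bit exponent, truncating an FP32 value to its top 16 bits stands in for a BF16 cast. This is a sketch of the numeric ranges, not of ComfyUI's casting code:

```python
import struct

FP16_MAX = 65504.0                     # largest finite IEEE half-precision value
BF16_MAX = (2 - 2 ** -7) * 2.0 ** 127  # ~3.39e38, same exponent range as FP32

def fits_fp16(x: float) -> bool:
    """True if x is representable as a finite IEEE binary16 value."""
    try:
        struct.pack("<e", x)  # "e" is the half-precision (FP16) format
        return True
    except (OverflowError, struct.error):
        return False

def to_bf16(x: float) -> float:
    """Cast to bfloat16 by keeping only the top 16 bits of the FP32 value."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))
    return y

activation = 1.0e5  # past FP16's range, trivial for BF16
print(fits_fp16(FP16_MAX), fits_fp16(activation))  # True False
print(to_bf16(activation))  # 99840.0 -- finite, close to 1e5
```

In a real tensor pipeline the overflow does not raise; it silently produces inf, which then propagates to NaN — the black-image failure mode described above.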
🚀 Get Started Now!
Download → Load Checkpoint → Generate!
Recommended versions:
🟡 FP8 for most users (best size/quality balance)
🟢 BF16 for maximum quality
⚫ NVFP4 for RTX 50xx speed
All versions work on 8GB VRAM
Happy generating! 🎨