Rouwei-16channel - CivArchive (CivitAI Archive)

Rouwei-16channel - v0.1_alpha

NSFW

Experimental conversion of SDXL architecture to 16 channel latent space

This is an experimental pretrain on top of Rouwei-0.8 that works with 16 channel latent space and uses Flux ae.

Goals:

Achieve better details while maintaining low compute requirements and all existing knowledge and performance
Possibility of joint sampling with Flux/Chroma/Lumina and other models with same latent space

Current state:

Early alpha version, still pretty raw. Images may contain extra noise and artefacts in small details, at a level that varies from negligible to significant. Upscaling, samplers/schedulers, styles, and even the prompt all affect it.

Using GAN upscale models in pixel space instead of latent upscale gives much smoother results; raising the base resolution helps too.

The model currently uses epsilon prediction; it can be converted to vpred or anything else in the future.

Usage:

Comfyui

Workflow example (Or just pick any image from showcase)

Download the checkpoint (FP32 and Unet-only versions can be found in the HF repo)
Download these nodes (or just install missing nodes using Comfy Manager)
Use the SDXL 16ch loader node to load it, then work just as you would with SDXL
DO NOT REMOVE the Latent multiply nodes: latents should be scaled before and after processing, just as in regular SDXL inference. This step just isn't hidden yet.

If you're getting the error `mat1 and mat2 shapes cannot be multiplied (_x16 and 4x3)`, disable the preview option for KSampler. It happens because the preview uses the taesd VAE, which is designed for 4-channel latents.
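To see where that message comes from, here is a minimal sketch in plain Python. It assumes the preview flattens the latents to (pixels, channels) and projects each latent pixel to RGB with a fixed (channels, 3) matrix, which is what the `4x3` in the message suggests. The function names are hypothetical and are not ComfyUI's actual code.

```python
def matmul_shape(a_shape, b_shape):
    """Return the result shape of a 2-D matrix product, or fail like a tensor library would."""
    (rows, inner_a), (inner_b, cols) = a_shape, b_shape
    if inner_a != inner_b:
        raise ValueError(
            "mat1 and mat2 shapes cannot be multiplied "
            f"({rows}x{inner_a} and {inner_b}x{cols})"
        )
    return (rows, cols)


# Hypothetical preview projection built for 4-channel SDXL latents.
PREVIEW_MATRIX_4CH = (4, 3)


def preview_shape(height, width, channels, projection=PREVIEW_MATRIX_4CH):
    """Shape of the RGB preview for a latent of size height x width x channels."""
    return matmul_shape((height * width, channels), projection)


if __name__ == "__main__":
    print(preview_shape(104, 152, 4))    # 4-channel latents fit the 4x3 matrix
    try:
        preview_shape(104, 152, 16)      # 16-channel latents do not
    except ValueError as err:
        print(err)
```

A 104x152 latent has 15808 positions, so the 16-channel case reproduces the `15808x16 and 4x3` shapes reported in the comments below.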

Other UI

Since the main differences are just the tensor shapes, the VAE used, and the latent scaling factor, it should be easy to implement support in any other UI.

LoRA adapters, ControlNet, IP-Adapters, and other things are untested.

Joint sampling:

Since the model operates in a 16-channel latent space similar to Flux, Chroma, Lumina-image and some others, you can implement complex workflows (if you have enough memory). This lets you use all of RouWei's knowledge of characters, styles, and concepts along with the performance of bigger models.

Here is an example workflow. Using just a few (1..4) steps from Flux, you create a rough basic composition. Then the latents go to the 16-channel SDXL model, where they are denoised (skipping the initial high-noise timesteps).

It is the simplest approach: since you don't need to reconvert latents through a series of VAEs or adapters, you can change models on every denoising step without any performance impact.

Just don't forget to apply Latent multiply nodes between transitions.
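The handoff described above can be sketched as a simple step plan. This is only an illustration in plain Python: the model labels and the `rescale_before` flag are hypothetical names, and the actual multiplier values used by the Latent multiply nodes are not given here, so they are left out.

```python
def plan_joint_sampling(total_steps, first_model_steps,
                        first_model="flux", second_model="sdxl_16ch"):
    """Assign each denoising step to a model and mark where latents must be rescaled."""
    if not 0 <= first_model_steps <= total_steps:
        raise ValueError("first_model_steps must be between 0 and total_steps")
    plan = []
    for step in range(total_steps):
        model = first_model if step < first_model_steps else second_model
        # A rescale is needed only on the step where the model changes.
        rescale = step > 0 and plan[-1]["model"] != model
        plan.append({"step": step, "model": model, "rescale_before": rescale})
    return plan


if __name__ == "__main__":
    # Two rough-composition steps on the first model, the rest on the 16-channel SDXL model.
    for entry in plan_joint_sampling(total_steps=6, first_model_steps=2):
        print(entry)
```

Because both models share the latent space, the only extra work at the transition is the rescale; no VAE decode/encode round trip appears in the plan.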

How it's made

Basically, there are no changes to the default architecture: just re-initializing the input and output layers to the new size, then training with gradual unfreezing of blocks towards the middle.
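As a rough illustration of those two ideas, the sketch below computes the weight shapes of the resized input and output convolutions and one possible outside-in unfreezing order. It assumes the standard SDXL UNet layout (3x3 convolutions and a first block width of 320); the block names and the exact unfreezing order are hypothetical, since the actual training schedule is not described.

```python
def conv_weight_shape(in_channels, out_channels, kernel=3):
    """Weight shape of a 2-D convolution in (out, in, kH, kW) layout."""
    return (out_channels, in_channels, kernel, kernel)


def resized_io_layers(latent_channels, base_width=320):
    """Shapes of the UNet input and output convolutions for a given latent size."""
    return {
        "conv_in": conv_weight_shape(latent_channels, base_width),
        "conv_out": conv_weight_shape(base_width, latent_channels),
    }


def unfreeze_order(block_names):
    """Order blocks from the outermost pair inward, ending at the middle."""
    names = list(block_names)
    order = []
    while names:
        order.append(names.pop(0))      # outermost remaining block on the input side
        if names:
            order.append(names.pop())   # outermost remaining block on the output side
    return order


if __name__ == "__main__":
    print(resized_io_layers(4))     # original 4-channel layers
    print(resized_io_layers(16))    # re-initialized 16-channel layers
    print(unfreeze_order(["down0", "down1", "mid", "up0", "up1"]))
```

Only the two outer layers change shape, which is why the rest of the pretrained weights can be reused and unfrozen gradually.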

Default SDXL latent scale factor of 0.13025 doesn't work well here, 0.6 is used for this release.
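A minimal sketch of what the scale factor does, assuming the same convention as regular SDXL inference (multiply after VAE encoding, divide before VAE decoding) and no additional shift term. Latents are represented as a flat list of floats to keep the example dependency-free; the function names are hypothetical.

```python
SDXL_SCALE = 0.13025       # default SDXL latent scale factor (from the text)
ROUWEI_16CH_SCALE = 0.6    # value used for this release (from the text)


def scale_latents(raw, factor=ROUWEI_16CH_SCALE):
    """Scale raw VAE-encoder latents before they go to the UNet."""
    return [x * factor for x in raw]


def unscale_latents(scaled, factor=ROUWEI_16CH_SCALE):
    """Undo the scaling before latents go to the VAE decoder."""
    return [x / factor for x in scaled]


if __name__ == "__main__":
    raw = [1.0, -2.0, 0.5]
    scaled = scale_latents(raw)
    print(scaled)                     # what the UNet sees
    print(unscale_latents(scaled))    # what the VAE decoder sees
```

Using the SDXL default here instead would feed the UNet latents at roughly a fifth of the magnitude it was trained on.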

This is not the optimal approach. Some changes to the outer layers of the model, instead of using them 'as is', should give improvements in future. If you have any thoughts or ideas about it, please share them.

Training:

To train it (in the current version), all you need is to change the number of in/out channels in the UNet config and set the scale factor to 0.6 instead of 0.13025. You should probably also check that the VAE part works properly.
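Until the author's own code is available, here is a hedged sketch of the two config edits. The keys `in_channels`, `out_channels`, and `scaling_factor` follow diffusers-style naming and are an assumption; your trainer may name them differently (a comment below mentions `VaeScaleFactor`, for example). The helper `patch_for_16ch` is hypothetical.

```python
import copy


def patch_for_16ch(unet_config, vae_scale_key="scaling_factor",
                   channels=16, scale_factor=0.6):
    """Return a patched copy of a UNet config plus the matching scale-factor setting."""
    patched = copy.deepcopy(unet_config)   # leave the caller's config untouched
    patched["in_channels"] = channels
    patched["out_channels"] = channels
    trainer_settings = {vae_scale_key: scale_factor}
    return patched, trainer_settings


if __name__ == "__main__":
    # Assumed minimal SDXL-like config; real configs carry many more keys.
    sdxl_config = {"in_channels": 4, "out_channels": 4, "sample_size": 128}
    unet_config, settings = patch_for_16ch(sdxl_config)
    print(unet_config)
    print(settings)
```

This covers only the config side; code that creates latents from images still needs to use the matching 16-channel VAE.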

(Code examples later)

I'm willing to help/cooperate:

Join the Discord server, where you can share your thoughts, proposals, requests, etc. Write to me directly here or DM me on Discord.

Thanks:

Part of the training was performed using Google TPU and sponsored by OpenRoot-Compute

Personal: NeuroSenko

And many thanks to all fellow brothers who supported me before.

Donations:

BTC bc1qwv83ggq8rvv07uk6dv4njs0j3yygj3aax4wg6c

ETH/USDT(e) 0x04C8a749F49aE8a56CB84cF0C99CD9E92eDB17db

XMR 47F7JAyKP8tMBtzwxpoZsUVB8wzg2VrbtDKBice9FAS1FikbHEXXPof4PAb42CQ5ch8p8Hs4RvJuzPHDtaVSdQzD6ZbA5TZ

License:

Same viral license as for the Illustrious base.

Description

First release

FAQ

Comments (18)

RC0N · Nov 2, 2025 · 28 reactions

CivitAI

Always appreciate people trying the things nobody else has the patience or know-how to do. I'll be watching this experiment with great interest.

Awesome work! 👏

arkbird · Nov 2, 2025

CivitAI

HELP PLZ! I'm using Flux.1_Krea_Dev FP8 SCALED for the basic latent, but when it comes to rouwei16channel's k-sampler, an error occurs：“mat1 and mat2 shapes cannot be multiplied (15808x16 and 4x3)”

Minthybasis

Author

Nov 2, 2025

It looks like an issue related to loading of 16ch checkpoint. Does regular workflow without flux work or it gives the same error?

Minthybasis

Author

Nov 3, 2025

The issue comes from preview option for Ksampler because it tries to use taesd vae designed for 4channel. Just turn off the preview option.

reptilekiller · Nov 2, 2025

CivitAI

It will be necessary to perform full retraining for this architecture to be successful (or equivalent finetuning). But, I believe this is a good future, with balance between SDXL and FLUX.

Minthybasis

Author

Nov 2, 2025· 1 reaction

It actually is a full retraining to a different latent space, just an early release. Compute requirements for it are way higher than for something like vpred conversion. The heaviest part is done; now it needs some minor changes in the outer layers and polishing.

But this is just another part of a puzzle (like TE replacement) for the future large training. I don't think it makes sense to spend significant money and time to make yet another sdxl tune. At the same time, training of new dit-based models looks like dark forest due to reasons. So, fixing the main issues of SDXL and training a modified architecture seems to be a good option. Even if we see development of new-gen models for anime arts, it will still be useful in joint workflows due to very high inference speed and style flexibility.

Phatcat · Nov 3, 2025 · 6 reactions

CivitAI

This along with Rouwei-T5Gemma is incredibly fascinating and interesting - Any plans to incorporate them both in the same model? Make a proper rouwei SDXL-Flux? :D

Minthybasis

Author

Nov 3, 2025· 4 reactions

Yes, these are experiments to be implemented together in future model. Maybe also there will be a replacement of few unet blocks with dit, but only very small part to keep high inference speed and low hardware requirements.

Phatcat · Nov 4, 2025 · 1 reaction

@Minthybasis currently playing around with rouwei 16 channel using t5gemma alone, clips alone and t5gemma and clips in concat comparing results. atm concat of both clips and t5gemma seems to produce the most coherent result. this also goes for using other models with t5gemma; concat with clips produces better results.
Also getting intermittent runtime errors with rouwei-16channel, it seems like it's a toss-up for me if a workflow will run initially or not; running the same workflow without making any changes at all sometimes it will work, other times it won't. That one I don't understand.
Also I tried using the unet model alone but for some reason I couldn't get clips to produce anything besides noise so I had to get full model and load clips from there; is the full model not simply flux vae and sdxl clips (g and l) baked in? I've been using external flux vae and that seems to work just fine.

Phatcat · Nov 8, 2025 · 1 reaction

@Minthybasis Well, the error seems like it very well could be related to preview; but when it does run with preview on, it will produce something and the progress can be followed in the preview. I dunno..
Using my own clips with the u-net only is where it seems to run, but only noise is produced; it could be there's an issue with my clips or how I loaded them. I'll just use the full model with baked in clips it's no issue.
But yeah I would love to show you some of the results, especially the clips vs t5gemma vs clips+t5gemma. If I could send the pictures the workflow is embedded in the metadata.

Minthybasis

Author

Nov 11, 2025

@Phatcat Yes, if it produces only a noise - it can be related to clips. Do they work with 4channel version?

You can upload pictures with workflow to any image hosting that doesn't cut metadata, for example catbox.moe

Phatcat · Nov 12, 2025 · 1 reaction

@Minthybasis No.. They only work with sdxl based models apparently; so anything illustrious based will result in garbage output.. Apparently the clips are not quite the same..
Somewhere in the pipeline it seems to have the clips unfrozen during training, perhaps even pre-illustrious.

On a related note, NoobAI is suffering from broken clips and there's a write-up on it you may or may not have seen:

https://www.reddit.com/r/StableDiffusion/comments/1o1u2zm/text_encoders_in_noobai_are_dramatically_flawed_a/
https://www.reddit.com/r/StableDiffusion/comments/1o25x9t/text_encoders_in_noobai_are_part_2/

IJDEIH · Nov 7, 2025 · 5 reactions

CivitAI

Thank you for sharing your work.

xikin2135558 · Nov 8, 2025

CivitAI

How do I train a lora for this model?

Minthybasis

Author

Nov 10, 2025· 1 reaction

A few edits in trainer code need to be made:

1. Change channels count from 4 to 16 in model config

2. Change VaeScaleFactor from 0.13025 to 0.6

3. Adjust part of code that is related to latents creation.

The first two can be done pretty easily and are enough if you're using precomputed latents. The last part is more complicated; I'm going to upload some code by the weekend or after it.

bl4ckfuture107 · Nov 9, 2025 · 5 reactions

CivitAI

MinthyBased does it again!

Phatcat · Dec 16, 2025

CivitAI

@Minthybasis Hey.. So... Flux 2 VAE is out..
So is Z-Image-Turbo, with more Z-Image models to come.

What does that mean for this project (16-channel; rouwei with flux vae), the t5gemma as sdxl encoder project and rouwei in general?

You moving rouwei to zit? Moving focus from flux 1 vae to flux 2 vae? Just keep working on getting rouwei-sdxl to play nice with flux 1 vae? Or something completely different?

qek · Dec 20, 2025

CivitAI

Doesn't work well. I thought it was at least 50% standalone

Checkpoint

Illustrious

by Minthybasis


anime

base model

Details

Downloads

233

Platform

CivitAI

Platform Status

Available

Created

11/2/2025

Updated

6/28/2026

Deleted

Files

rouwei16channel_v01Alpha.safetensors

Size:

6.46 GB

SHA256:

78a5fc69d6489d31a2eaf830513f4f2266c2fda299509b018a4e0833408b6b45

Mirrors

HuggingFace (1 mirrors)

rouwei16channel_v01Alpha.safetensors

CivitAI (1 mirrors)

rouwei16channel_v01Alpha.safetensors

Available On (1 platform)

Same model published on other platforms. May have additional downloads or version variants.

SeaArt

