CivArchive
    NoobAI-Flux2VAE-RectifiedFlow - v0.1
    NSFW

    Experimental Conversion of our NoobAI-RF model to Flux2 VAE.

    We have observed the model's ability to adapt to the Flux2 VAE, and current trends suggest that significant improvements are possible with more training, which could allow it to compete with larger models.
    By supporting us you could make it a reality.

    More info on supporting us: click me

    Model Description

    This is a native training of the SDXL Unet in combination with the Flux2 VAE. Essentially, we've adapted a previously 4-channel model to work with the 32 complex channels of Flux 2. No adapters or tricks, fully native.
    The NoobAI Danbooru dataset was used for this.

    Due to limited compute, we were not able to fully converge it; expect output on the level of very early anime models. We hope the community will find this interesting enough to support us. We observed steady convergence throughout the whole training process, and believe that further training will result in a new standard for fast local anime generation.

    Please take this model as a proof of concept, not as a final product.

    We used Rectified Flow for training, with a staged approach to adapting to the Flux2 VAE.
    Most of the knowledge seems to be preserved, but it is significantly weakened by the completely new latent space.
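
    As a rough illustration of what a Rectified Flow objective looks like, here is a minimal sketch in plain Python. It assumes the common convention where t=0 is the clean latent and t=1 is pure noise, and the function name rf_training_pair is hypothetical; the actual training code in our SD-Scripts fork may differ in details.

```python
import random

def rf_training_pair(latent, noise, t):
    """Hypothetical helper: build one rectified-flow training example.

    Assumed convention: t=0 is the clean latent, t=1 is pure noise.
    Returns the interpolated input x_t and the velocity target (noise - latent).
    """
    x_t = [(1.0 - t) * x + t * n for x, n in zip(latent, noise)]
    velocity = [n - x for x, n in zip(latent, noise)]
    return x_t, velocity

# Toy 3-element "latent"; real latents are 32-channel tensors.
latent = [0.5, -1.0, 2.0]
noise = [random.gauss(0.0, 1.0) for _ in latent]
x_t, velocity = rf_training_pair(latent, noise, t=0.3)
print(x_t, velocity)
```

    The network is trained to predict the velocity from x_t and t, which is why samplers and schedules must be the RF-aware variants at inference time.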

    Bias and Limitations

    Once again, we are limited in budget for this fundamental task. We have adapted the model enough for it to output somewhat acceptable images (closer to the knowledge of a theoretical NoobAI 0.1 using the Flux 2 VAE), but further progress would require large compute: we are in territory where the model is simply seeing a new level of detail for the first time (as well as the old level of detail in a new way), and that is hard.

    Most biases of the official dataset will apply (Blue Archive, etc.).

    Expect noise, fuzzy details, low performance in landscape aspect ratios, bad hands, and general issues with composition as a whole.

    Model Output Examples

    One of the benefits we have achieved is color:

    As a native flow model, it achieves strong colors without making them acidic or otherwise unstable.

    Generally, as already stated, expect at least some grain and fuzziness in all gens, as we have not converged to the juicy details yet.

    Recommended Parameters:
    Sampler: Euler, Euler A, DPM++ SDE, etc.
    Steps: 20-28
    CFG: 6-9
    Schedule: Normal/Simple/SGM Uniform/Quadratic
    Positive Quality Tags: masterpiece, best quality
    Negative Tags: worst quality, normal quality, bad anatomy

    A1111 WebUI

    (All screenshots are repeated from our RF release, as there is no difference in setup)

    Recommended WebUI: ReForge. It has native support for Flow models, and we've PR'd native support for our Flux2 VAE-based SDXL modification.

    How to use in ReForge:

    (Ignore the Sigma max field at the top; it is not used in RF.)

    Support for RF in ReForge is being implemented through a built-in extension:

    Set parameters to that, and you're good to go.

    Flux2VAE does not currently have an appropriate high-quality preview method. Please use the Approx Cheap option, which lets you see a simple PCA projection (ReForge).

    Recommended Parameters:
    Sampler: Euler A Comfy RF, Euler, DPM++ SDE Comfy, etc. ALL VARIANTS MUST BE RF OR COMFY, IF AVAILABLE. In ComfyUI routing is automatic, but in WebUI it is not.
    Steps: 20-28
    CFG: 6-9
    Schedule: Normal/Simple/SGM Uniform
    Positive Quality Tags: masterpiece, best quality
    Negative Tags: worst quality, normal quality, bad anatomy

    ADETAILER FIX FOR RF: By default, Adetailer discards the Advanced Model Sampling (AMS) extension, which breaks RF. You need to add AMS to this part of the settings:

    Add advanced_model_sampling_script,advanced_model_sampling_script_backported there.

    If that does not work, go into the adetailer extension, find args.py, open it, and replace builtinscripts like this:

    Training

    Model Composition

    (Relative to base it's trained from)

    Unet: Same
    CLIP L: Same, Frozen
    CLIP G: Same, Frozen
    VAE: Flux2 VAE

    Training Details

    (Main Stage Training)

    Samples seen (unbatched steps): ~18.5 million
    Learning Rate: 5e-5
    Effective Batch Size: 1472 (92 batch size x 2 accumulation x 8 GPUs)
    Precision: Full BF16
    Optimizer: AdamW8bit with Kahan Summation
    Weight Decay: 0.01
    Schedule: Constant with warmup
    Timestep Sampling Strategy: Logit-Normal -0.2 1.5 (sometimes referred to as Lognorm), Shift 2.5
    Text Encoders: Frozen
    Keep Token: False
    Tag Dropout: 10%
    Uncond Dropout: 10%
    Shuffle: True
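
    To make the timestep sampling line more concrete, here is a minimal sketch. It assumes that "-0.2 1.5" are the mean and standard deviation of the underlying normal distribution, and that "Shift 2.5" is the usual flow shift t' = s*t / (1 + (s - 1)*t). Both are our reading of the settings, and the function names are hypothetical rather than taken from the training code.

```python
import math
import random

def shift_timestep(t, shift=2.5):
    """Assumed flow shift: pushes timesteps toward the noisy end when shift > 1."""
    return shift * t / (1.0 + (shift - 1.0) * t)

def sample_timestep(mean=-0.2, std=1.5, shift=2.5, rng=random):
    """Logit-normal sample: draw from a normal, squash with a sigmoid, then shift."""
    u = rng.gauss(mean, std)
    t = 1.0 / (1.0 + math.exp(-u))
    return shift_timestep(t, shift)

random.seed(0)
samples = [sample_timestep() for _ in range(10_000)]
print(sum(samples) / len(samples))  # average timestep after shifting
```

    With a shift above 1, more training samples land at higher noise levels than the plain logit-normal distribution would give.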

    VAE Conv Padding: False
    VAE Shift: 0.0760
    VAE Scale: 0.6043
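
    For anyone wiring these constants into their own tooling, the sketch below shows one common way shift and scale are applied to latents: (latent - shift) * scale before training, and the inverse before decoding. This ordering is an assumption based on how other flow models handle it, not something confirmed here, and the helper names are hypothetical.

```python
VAE_SHIFT = 0.0760  # from the training details above
VAE_SCALE = 0.6043  # from the training details above

def normalize_latent(raw):
    """Assumed convention: subtract the shift, then multiply by the scale."""
    return [(x - VAE_SHIFT) * VAE_SCALE for x in raw]

def denormalize_latent(z):
    """Inverse of normalize_latent, applied before VAE decoding."""
    return [x / VAE_SCALE + VAE_SHIFT for x in z]

raw = [0.0760, 1.0, -2.5]  # toy values standing in for encoder output
z = normalize_latent(raw)
print(z)
print(denormalize_latent(z))
```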

    Additional Features used: Protected Tags, Cosine Optimal Transport.
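
    One way to read "Cosine Optimal Transport" is as a minibatch pairing step: within each batch, noise samples are reassigned to latents so that the total cosine distance between pairs is minimal. The sketch below illustrates that idea by brute force on a tiny batch. It is an assumption about the general technique, not the implementation used in training, and all names are hypothetical; a real implementation would use a proper assignment solver on tensors.

```python
import math
from itertools import permutations

def cosine_distance(a, b):
    """1 - cosine similarity between two flat vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def pair_noise_to_latents(latents, noises):
    """Return perm so that noises[perm[i]] is paired with latents[i].

    Brute force over all permutations: only usable for very small batches.
    """
    n = len(latents)

    def total_cost(perm):
        return sum(cosine_distance(latents[i], noises[perm[i]]) for i in range(n))

    return list(min(permutations(range(n)), key=total_cost))

latents = [[1.0, 0.0], [0.0, 1.0]]
noises = [[0.0, 2.0], [3.0, 0.0]]
print(pair_noise_to_latents(latents, noises))
```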

    Training Data

    2 epochs of the original NoobAI dataset, including images up to October 2024, minus screencap data (which was not shared).
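
    The batch and sample figures from the training details can be cross-checked with a few lines of arithmetic. The variable names are ours, and the step and per-epoch counts are derived estimates rather than logged values.

```python
samples_seen = 18_500_000  # "~18.5 million" from the training details
per_gpu_batch = 92
grad_accum = 2
gpus = 8
epochs = 2

effective_batch = per_gpu_batch * grad_accum * gpus
optimizer_steps = samples_seen / effective_batch   # approximate
samples_per_epoch = samples_seen / epochs          # approximate

print(effective_batch)          # 1472
print(round(optimizer_steps))   # about 12568 optimizer steps
print(int(samples_per_epoch))   # about 9.25 million samples per epoch
```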

    LoRA Training

    The current stage is trainable, but accurate reproduction is hard to achieve if the subject/content depends on small details, as the base model has not converged to them yet. My current style training settings (Anzhc):

    Learning Rate: tested up to 7.5e-4
    Batch Size: 144 (6 real * 24 accum), using SGA (Stochastic Gradient Accumulation). Without SGA I would probably lower accum to 4-8.
    Optimizer: AdamW8bit with Kahan Summation
    Schedule: ReREX (Use REX for simplicity, or Cosine annealing)
    Precision: Full BF16
    Weight Decay: 0.02
    Timestep Sampling Strategy: Logit-Normal (either 0.0 1.0, or -0.2 1.5), Shift 2.5

    Dim/Alpha/Conv/Alpha: 24/24/24/24 (Lycoris/Locon)

    Text Encoders: Frozen

    Optimal Transport: True

    Expected Dataset Size: 100 images (Can be even 10, but balance with repeats to roughly this target.)
    Epochs: 50
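
    The "balance with repeats" advice can be worked through numerically. The helper below is a hypothetical sketch that picks a repeat count so that images times repeats lands near the 100-image target; it is not part of any trainer.

```python
TARGET_IMAGES_PER_EPOCH = 100  # "Expected Dataset Size" above
EPOCHS = 50
EFFECTIVE_BATCH = 6 * 24       # 6 real * 24 accum = 144

def repeats_for(num_images, target=TARGET_IMAGES_PER_EPOCH):
    """Hypothetical helper: repeats so that num_images * repeats is near target."""
    return max(1, round(target / num_images))

def total_samples(num_images, epochs=EPOCHS):
    """Samples seen over the whole run with the chosen repeat count."""
    return num_images * repeats_for(num_images) * epochs

for n in (10, 25, 100):
    print(n, repeats_for(n), total_samples(n))
```

    In each of these cases the run sees about 5000 samples in total, regardless of how many unique images the dataset has.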

    Hardware

    The model was trained on a cloud 8xH200 node.

    Software

    Custom fork of SD-Scripts (maintained by Bluvoll)

    Acknowledgements

    Special Thanks

    To a special supporter who single-handedly sponsored the whole run and preferred to stay anonymous.


    Support

    If you wish to support our continuous effort of making waifus 0.2% better, you can do it here:

    https://ko-fi.com/bluvoll

    Crypto link pending.

    Potential future

    Expected Compute Needed: We theorize that the model needs at the very least 20 epochs on the full data, ideally 35. Each epoch cost about 460 USD with the provider we use. Each time donations reach enough to train 2 epochs, we will resume and train more. If we have enough donations, we will update the dataset to the most recent data.
    Why not do this now? Caching with the Flux 2 VAE takes a whopping 15 hours and roughly 20 TB, since each latent is 2 MB, which in itself costs 180 USD of compute time.

    We are working on further improvements to the pipeline and components at the time of this model's release, and plan to upgrade this architecture further.
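
    Putting the numbers above together gives a rough budget. This is back-of-the-envelope arithmetic using only the figures stated here; it assumes the per-epoch price stays at 460 USD and uses decimal units (1 TB = 1,000,000 MB).

```python
cost_per_epoch_usd = 460
min_epochs, ideal_epochs = 20, 35
resume_chunk_epochs = 2

cache_hours = 15
cache_cost_usd = 180
cache_size_tb = 20
latent_size_mb = 2

min_cost = min_epochs * cost_per_epoch_usd                    # 9200 USD
ideal_cost = ideal_epochs * cost_per_epoch_usd                # 16100 USD
resume_threshold = resume_chunk_epochs * cost_per_epoch_usd   # 920 USD per 2-epoch resume
approx_latents = cache_size_tb * 1_000_000 // latent_size_mb  # about 10 million latents
cache_hourly_usd = cache_cost_usd / cache_hours               # 12 USD per hour

print(min_cost, ideal_cost, resume_threshold, approx_latents, cache_hourly_usd)
```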

    Description

    Checkpoint
    NoobAI

    Details

    Downloads: 12
    Platform: SeaArt
    Platform Status: Available
    Created: 12/19/2025
    Updated: 12/19/2025
    Deleted: -
