CivArchive
    Shiho a realistic Japanese woman z-image-turbo - v1.0-17RC
    NSFW

    Z-Image-Turbo version of Shiho a realistic Japanese woman trained with AI-Toolkit.

    Comments (5)

    MajinVegeta
    Dec 11, 2025
    CivitAI

    Great Lora! Really captures the character, and very good with her anatomy.

    blackestcurse93
    Dec 11, 2025
    CivitAI

    The LoRA looks very good and avoids that generic AI-model sameface problem. Can you tell us more about the training process? I can see that you trained for 185,000 steps. That seems like a lot! Do you gain more precision and quality with more steps? What are the other specs, like learning rate, resolutions, number of images in the dataset, and caption style?

    It would be a really great help to learn from your experience training this model!

    blackestcurse93
    Dec 11, 2025

    Also, the LoRA's file size is huge! Does the rank affect the quality a lot?

    TakOkada
    Author
    Dec 12, 2025 · 2 reactions

    @blackestcurse93 Thank you, blackestcurse93, for your excellent feedback and insightful questions! I'm delighted you noticed that the LoRA avoids the "sameface problem," as my primary goal was to achieve true photorealism that is indistinguishable from a photograph, prioritizing unique details over generic ideals.

    Here are the details about the complex training process:

    Training Methodology: Incremental and Adaptive

    Instead of a fixed schedule, I adopted an iterative approach, treating every 5,000 steps as one training turn. The learning rate started at 1.0 × 10^-4 (1e-4) and was then gradually lowered based on the results of each turn.

    Dataset and Iteration:

    My total dataset comprised 30K images. However, to ensure maximum feature diversity and prevent mode collapse, the images were not used all at once. I employed an incremental system in which 4,000 to 5,000 images were fed in at a time and rotated out (replaced) every 5,000 steps. This frequent rotation, combined with the adaptive learning rate, was crucial for capturing the subtle nuances that define realistic details.
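
    The turn structure described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the author's actual tooling: the function name `plan_turns`, the fixed subset size of 5,000, and the round-robin choice of images are all assumptions (the comment only says that 4,000 to 5,000 images were rotated every 5,000 steps). The manual learning-rate adjustments are not modeled.

```python
def plan_turns(total_steps, steps_per_turn, dataset_size, subset_size):
    """Return a list of (turn_index, start_step, image_indices) tuples.

    Each turn trains on a different slice of the dataset. Slices are taken
    round-robin over image indices, which is an assumption made for this
    sketch; the original comment does not say how subsets were chosen.
    """
    if total_steps % steps_per_turn != 0:
        raise ValueError("total_steps must be a multiple of steps_per_turn")
    turns = []
    cursor = 0  # index of the first image in the next subset
    for turn in range(total_steps // steps_per_turn):
        indices = [(cursor + i) % dataset_size for i in range(subset_size)]
        turns.append((turn, turn * steps_per_turn, indices))
        cursor = (cursor + subset_size) % dataset_size
    return turns


# Figures from the comment: 185,000 steps, 5,000 steps per turn, 30K images.
turns = plan_turns(185_000, 5_000, 30_000, 5_000)
print(len(turns))    # number of training turns
print(turns[-1][1])  # step at which the final turn begins
```

    With these numbers the run splits into 37 turns, so under the round-robin assumption each image would be part of roughly six turns over the whole run.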

    Dataset Diversity and Captioning:

    The images included a wide range of poses, clothing, expressions, locations, and hairstyles, which prevents the model from generalizing into a single "AI look."

    The captions for the entire dataset were generated using Qwen2.5VL, ensuring deep and precise tagging of every element.

    On 185,000 Steps:

    Yes, 185,000 steps is a large number, but it was necessary for my goal. This extensive training time was required to deeply embed the subtle, non-ideal features (like minor skin imperfections, natural expressions, etc.) that distinguish photorealism from typical AI generation. This precision is what allows the model to overcome the sameface issue and achieve higher overall quality.

    I hope this sheds light on the process! I'm happy to share my experience if it helps others push the boundaries of realism.

    TakOkada
    Author
    Dec 12, 2025

    @blackestcurse93 I forgot to mention the LoRA Rank (Dimension)!

    The LoRA Rank was set to 128 (DIM=128).

    This higher rank was absolutely critical. It allowed the model to effectively store the vast amount of micro-details and subtle variations captured through the high step count (185,000) and the iterative dataset process. DIM=128 helps prevent the loss of those realistic, non-ideal features that distinguish photorealism from generic AI faces.
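
    On the file-size question, a rough sketch of why rank drives the size: for a linear layer with `d_in` inputs and `d_out` outputs, a LoRA of rank `r` stores two matrices with `r * (d_in + d_out)` parameters in total, so size grows linearly with rank. The layer shapes below are hypothetical placeholders, not Z-Image-Turbo's real architecture, and 2 bytes per parameter assumes fp16/bf16 storage.

```python
def lora_params(d_in, d_out, rank):
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def lora_size_mib(layer_shapes, rank, bytes_per_param=2):
    """Approximate LoRA weight size in MiB for a list of (d_in, d_out) layers."""
    total = sum(lora_params(d_in, d_out, rank) for d_in, d_out in layer_shapes)
    return total * bytes_per_param / 2**20


# Hypothetical model: 100 adapted linear layers of shape 3072 x 3072.
shapes = [(3072, 3072)] * 100

size_16 = lora_size_mib(shapes, rank=16)
size_128 = lora_size_mib(shapes, rank=128)
print(size_16, size_128, size_128 / size_16)
```

    Whatever the real layer shapes are, going from rank 16 to rank 128 multiplies the stored weights by eight, which is consistent with the large file noted above.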

    LORA
    ZImageTurbo

    Details

    Downloads
    83
    Platform
    CivitAI
    Platform Status
    Available
    Created
    12/11/2025
    Updated
    6/24/2026
    Deleted
    -
    Trigger Words:
    shiho

    Files

    zimageShiho128v01 (17) RC.safetensors