    Foot Anime Yolo11m - v3
    NSFW

    Anime foot detector (YOLO11m) — ADetailer / Impact Pack


    Also on Hugging Face, with the same files plus ONNX exports (for some reason I can't upload ONNX files to this page): https://huggingface.co/Claquasse/foot_anime_yolo

    A small YOLO11m detector that finds feet in anime and illustration images (single class: foot). Drop it into an ADetailer or Impact Pack pass to auto-fix feet, which diffusion models often render badly. It was built for the Anima model, but it works on anime-style art in general, so it should transfer to other anime or illustration generators.


    Three versions are provided. v3 is the one to use — best box accuracy and the widest coverage. v2 is the previous best and a little better at plain, clearly visible feet. v1 is the first and weakest, kept for reference.


    Files


    Each version ships as .pt (load directly in ComfyUI or Ultralytics) and .onnx (non-pickle, for ONNX Runtime). The .onnx files are only on the Hugging Face mirror; this page carries the .pt files listed below.


    - foot_anime_yolo11m_v3.pt — production (recommended)

    - foot_anime_yolo11m_v2.pt — previous production

    - foot_anime_yolo11m_v1.pt — reference
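As a rough illustration of loading a .pt file outside a UI, the sketch below uses the Ultralytics Python package. The function names `keep_confident` and `detect_feet`, the image path you pass in, and the 0.05 pre-filter are assumptions made for this example. The 0.45 cutoff simply reuses the ComfyUI default suggested under Install, and the weights file is assumed to sit in the working directory.

```python
from typing import List, Tuple

# One detection: x1, y1, x2, y2 in pixels, then confidence.
Box = Tuple[float, float, float, float, float]


def keep_confident(boxes: List[Box], min_conf: float = 0.45) -> List[Box]:
    """Drop detections below min_conf and sort the rest, most confident first."""
    kept = [b for b in boxes if b[4] >= min_conf]
    return sorted(kept, key=lambda b: b[4], reverse=True)


def detect_feet(image_path: str,
                weights: str = "foot_anime_yolo11m_v3.pt",
                min_conf: float = 0.45) -> List[Box]:
    """Run the detector on one image and return confident pixel-space boxes."""
    # Imported lazily so keep_confident() works without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    # A low floor here (an arbitrary choice) leaves the real thresholding
    # to keep_confident().
    result = model.predict(image_path, conf=0.05, verbose=False)[0]
    xyxy = result.boxes.xyxy.tolist()
    confs = result.boxes.conf.tolist()
    boxes = [(x1, y1, x2, y2, c) for (x1, y1, x2, y2), c in zip(xyxy, confs)]
    return keep_confident(boxes, min_conf)
```

Calling `detect_feet("sample.png")` would return a list of boxes that can be handed to whatever crop or inpaint step follows; `sample.png` is a placeholder name.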


    Install


    ComfyUI (Impact Pack): put the .pt in ComfyUI/models/ultralytics/bbox/, load it with UltralyticsDetectorProvider, and feed the bounding box into a detail or inpaint pass. A bbox threshold near 0.45 is a sensible default.


    A1111 / Forge (ADetailer): put the .pt in stable-diffusion-webui/models/adetailer/ and select it as the ADetailer model.
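The two install locations above can be captured in a small copy helper. This is only a sketch: `install_weights`, the `MODEL_DIRS` keys, and the idea of passing the UI's install root as an argument are hypothetical conveniences, while the folder paths are the ones given above.

```python
import shutil
from pathlib import Path

# Detector folders relative to each UI's install root, as listed above.
MODEL_DIRS = {
    "comfyui": Path("models/ultralytics/bbox"),
    "adetailer": Path("models/adetailer"),
}


def install_weights(weights: Path, ui_root: Path, ui: str) -> Path:
    """Copy a .pt file into the detector folder of the given UI install."""
    if ui not in MODEL_DIRS:
        raise ValueError(f"unknown ui {ui!r}; expected one of {sorted(MODEL_DIRS)}")
    dest_dir = ui_root / MODEL_DIRS[ui]
    # Create the folder if the UI has not created it yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(weights, dest_dir / weights.name))
```

For example, `install_weights(Path("foot_anime_yolo11m_v3.pt"), Path("ComfyUI"), "comfyui")` would place the file under ComfyUI/models/ultralytics/bbox/.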


    Benchmark


    Held-out set of 100 generated anime images (185 feet), none of which the models were trained on. Scores are mAP50 / mAP50-95.


    | Model | mAP50 | mAP50-95 |
    |---|---|---|
    | v1 | 0.28 | 0.08 |
    | v2 | 0.81 | 0.50 |
    | v3 | 0.81 | 0.59 |


    v3 has the most accurately placed boxes in every image type (hence its higher mAP50-95) and matches or beats v2 at finding feet (the two tie on overall mAP50). Open-toe footwear is the hardest case for all of them.
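To make the two metrics concrete: mAP50 counts a predicted box as correct when its intersection-over-union (IoU) with a labeled box is at least 0.50, while mAP50-95 averages over ten IoU thresholds from 0.50 to 0.95, so it rewards boxes that fit more closely. The sketch below shows only the IoU part, not a full mAP computation, and the helper names are made up for this example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # x1, y1, x2, y2

# The ten IoU thresholds averaged by mAP50-95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholds_passed(pred: Box, truth: Box) -> int:
    """How many of the ten mAP50-95 thresholds this prediction clears."""
    overlap = iou(pred, truth)
    return sum(overlap >= t for t in IOU_THRESHOLDS)
```

A prediction that covers 81% of the union with the label clears the 0.50 bar easily but only seven of the ten thresholds, which is why two models can tie on mAP50 and still differ on mAP50-95.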


    The preview images show all three versions plus a generic YOLOv8x foot detector run on the same frame at once (red = v3, green = v2, blue = v1, yellow = generic YOLOv8x reference), so you can see how they compare.


    Notes and scope


    Trained on bare anime feet, mined from Danbooru and labeled with DWPose keypoints, plus the public-domain ANFDet set, a few hundred hand-labeled images, and feet-free images as hard negatives. v3 was trained on roughly 286k images. Footwear, sandals, and stockings sit outside the primary case, though v3 generalizes to them noticeably better than v1 or v2. Tuned for anime and illustration, not photographs.
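The labeling pipeline itself is not described here, so the following is only one plausible way keypoints could be turned into a detection label, not the author's method. It assumes each foot comes with a few (x, y, score) keypoints such as the toe and heel points a whole-body pose model provides, grows their extent by a margin, and writes a line in the normalized `class x_center y_center width height` format that Ultralytics detection datasets use. The function name, `margin`, and `min_score` are hypothetical.

```python
from typing import List, Optional, Tuple

Keypoint = Tuple[float, float, float]  # x, y in pixels, then confidence score


def keypoints_to_yolo_label(keypoints: List[Keypoint],
                            image_size: Tuple[int, int],
                            margin: float = 0.25,
                            min_score: float = 0.5,
                            class_id: int = 0) -> Optional[str]:
    """Build one YOLO label line from a foot's keypoints, or None if unusable."""
    visible = [(x, y) for x, y, score in keypoints if score >= min_score]
    if len(visible) < 2:
        return None  # too few reliable points to define a box
    xs = [x for x, _ in visible]
    ys = [y for _, y in visible]
    # Keypoints sit on the foot, not on its outline, so pad the extent.
    pad = margin * max(max(xs) - min(xs), max(ys) - min(ys))
    width, height = image_size
    x1, y1 = max(0.0, min(xs) - pad), max(0.0, min(ys) - pad)
    x2, y2 = min(float(width), max(xs) + pad), min(float(height), max(ys) + pad)
    cx, cy = (x1 + x2) / 2 / width, (y1 + y2) / 2 / height
    bw, bh = (x2 - x1) / width, (y2 - y1) / height
    return f"{class_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}"
```

In that dataset format, a feet-free hard negative is simply an image whose label file is empty or absent.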


    The boxes are meant to feed a refiner, not to stand alone. v2 and v3 draw slightly looser boxes that wrap the whole foot, which is what you want for an inpaint pass.
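Feeding a box to a refiner usually means cropping a somewhat larger region around it so the inpaint pass sees some context. ADetailer and Impact Pack handle this with their own settings; the sketch below only illustrates the idea, and `expand_box` and its `pad_ratio` default are assumptions for this example rather than values either tool uses.

```python
import math
from typing import Tuple


def expand_box(box: Tuple[float, float, float, float],
               image_size: Tuple[int, int],
               pad_ratio: float = 0.25) -> Tuple[int, int, int, int]:
    """Grow a box by pad_ratio of its size on each side, clamped to the image."""
    x1, y1, x2, y2 = box
    width, height = image_size
    pad_x = (x2 - x1) * pad_ratio
    pad_y = (y2 - y1) * pad_ratio
    # Round outward so the crop never cuts into the detected box.
    left = max(0, math.floor(x1 - pad_x))
    top = max(0, math.floor(y1 - pad_y))
    right = min(width, math.ceil(x2 + pad_x))
    bottom = min(height, math.ceil(y2 + pad_y))
    return left, top, right, bottom
```

With an 80 by 60 pixel detection and the default ratio, the crop grows by 20 pixels horizontally and 15 vertically on each side, unless it hits the image border first.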


    License: AGPL-3.0 (inherited from Ultralytics YOLO). If you serve these weights over a network, AGPL's source-availability terms apply. The AGPL license is the authoritative one regardless of the toggles on this page.


    Support

    Building these means mining and labeling hundreds of thousands of images and renting GPUs to train on them, which takes real time and money. If the models are useful to you and you want to chip in, it is appreciated and never expected: https://ko-fi.com/claquasse

    Description

    Detection
    Other

    Details

    Downloads: 61
    Platform: CivitAI
    Platform Status: Available
    Created: 6/15/2026
    Updated: 6/18/2026
    Deleted: -

    Files

    footAnimeYolo11m_v3.pt

    Mirrors

    HuggingFace (1 mirrors)
    CivitAI (1 mirrors)