I developed this LORA Data Tool as a simple, all-in-one standalone application for preparing high-quality image and caption datasets for training custom LORA models (for SDXL 1.0, Flux, Z-Image, Chroma & QWEN Image) on a base-level Apple Silicon Mac or on a Linux computer (it works on both Nvidia and AMD GPUs/eGPUs, even with 8GB VRAM). It uses the Florence-2 AI model for automated captioning and provides a gallery view for review and editing. The Florence-2 model is very compact and works well on low-end GPUs compared to the much heavier JoyCaption model.
I don't use any Windows computer, so I didn't build a setup script for Windows, but because the LORA Data Tool is written in Python it can also be run on Windows if you know how to set up and run a Python virtual environment there.
The tool has an easy-to-follow user interface with two sections (Dataset builder & Dataset auditor), and each section has its own dedicated tab. To start, either paste the path to the folder containing the images you wish to process or select the folder with the built-in file browser on the relevant tab. These are the two tabs -
I. Data Builder (Automate)
---------------------------
The main tab for bulk processing your dataset.
Key Features:
- Image Scaling: Resizes images so the shortest side is 1024 pixels. New images are saved to a subfolder named '1024_scaled'.
- Caption Generation: Uses the Florence-2 AI model to automatically generate a detailed caption (saved as a .txt file) for each image.
- Caption Styles: Supports Short, Medium, and Long (Civitai Max) caption styles.
- Bulk Trigger Word: Adds a specified trigger word to the start or end of all generated/existing captions.
- Bulk Search & Replace: Replaces all occurrences of a search term with a replacement term across all caption files.
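The features above can be illustrated with a minimal Python sketch. This is not the tool's actual code: the function names are hypothetical, the comma separator for the trigger word is an assumption, and the sketch only computes the target size (the resize itself would be done by an imaging library). Whether the tool upscales images whose shortest side is below 1024 pixels is not stated, so the sketch simply scales in either direction.

```python
from pathlib import Path


def scaled_size(width: int, height: int, target: int = 1024) -> tuple[int, int]:
    """Return (width, height) scaled so the shortest side equals `target`."""
    if width <= height:
        return target, round(height * target / width)
    return round(width * target / height), target


def add_trigger_word(caption: str, trigger: str, at_start: bool = True) -> str:
    """Prepend or append a trigger word (comma separator is an assumption)."""
    caption = caption.strip()
    if not caption:
        return trigger
    return f"{trigger}, {caption}" if at_start else f"{caption}, {trigger}"


def replace_in_captions(folder: Path, search: str, replacement: str) -> int:
    """Replace `search` in every .txt file in `folder`; return files changed."""
    changed = 0
    for txt in sorted(folder.glob("*.txt")):
        old = txt.read_text(encoding="utf-8")
        new = old.replace(search, replacement)
        if new != old:
            txt.write_text(new, encoding="utf-8")
            changed += 1
    return changed
```

For example, a 4000x3000 photo has its 3000-pixel side brought to 1024, and the long side follows proportionally.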
II. Data Auditor (Review)
--------------------------
The quality control tab for reviewing and manually adjusting the training data, since AI captioning is very good but not always accurate.
Key Features:
- Page-Based Gallery: Loads and displays images and their corresponding captions in batches (10 items per page).
- Live Editing: Allows direct editing of the caption text next to the image preview.
- Save: Saves the edited captions for the current page.
- Delete: Permanently deletes both the image and its caption file.
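A rough sketch of how page-based loading and pair deletion could work is shown here. It is an illustration only: the helper names and the list of image extensions are assumptions, and pairing an image with a .txt file of the same name is inferred from the description above.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}  # assumed set of extensions
PAGE_SIZE = 10  # matches the 10 items per page described above


def list_pairs(folder: Path) -> list[tuple[Path, Path]]:
    """Pair each image with the .txt caption that shares its file name."""
    images = sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_EXTS)
    return [(img, img.with_suffix(".txt")) for img in images]


def get_page(pairs: list, page: int, page_size: int = PAGE_SIZE) -> list:
    """Return the items for a zero-based page number."""
    start = page * page_size
    return pairs[start:start + page_size]


def page_count(total: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed to show `total` items (ceiling division)."""
    return (total + page_size - 1) // page_size


def delete_pair(image: Path, caption: Path) -> None:
    """Permanently remove an image and its caption file, if they exist."""
    image.unlink(missing_ok=True)
    caption.unlink(missing_ok=True)
```

Loading one page at a time keeps only a handful of previews in memory, which is likely why a gallery like this batches items instead of loading the whole folder.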
III. Setup and Launch (Default AMD GPU Linux Setup or Apple Silicon Macs - M2, M4 etc.)
----------------------------------------------------------------------------------------
The tool relies on the 'run_linux.sh' or 'run_mac.sh' script for environment and model management.
1. Launch: Run the script appropriate for your operating system.
2. Dependencies: The script automatically creates a Python virtual environment ('venv') and installs required libraries (transformers, etc.).
3. Model: The script downloads the 'MiaoshouAI/Florence-2-base-PromptGen-v1.5' model (~1.3 GB) into a 'model' subfolder during the first run.
4. Clean Operation: To remove the environment and model, run the script with the '--clean' argument:
- Linux: ./run_linux.sh --clean
- macOS: ./run_mac.sh --clean
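The real setup logic lives in the shell scripts, which are not reproduced here. As a hedged illustration of the behaviour described above, the following Python sketch checks which managed folders are still missing and removes them on a clean. The function names are hypothetical, and the assumption is that 'venv' and 'model' sit directly inside the tool's folder.

```python
import shutil
from pathlib import Path

# Folder names taken from the description above; the layout is an assumption.
MANAGED_DIRS = ("venv", "model")


def missing_parts(root: Path) -> list[str]:
    """Return which managed folders a first run would still need to create."""
    return [name for name in MANAGED_DIRS if not (root / name).is_dir()]


def clean(root: Path) -> list[str]:
    """Delete the environment and model folders; return the names removed."""
    removed = []
    for name in MANAGED_DIRS:
        target = root / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

After a clean, the next launch behaves like a first run: the environment is rebuilt and the model is downloaded again.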
** NVIDIA / CUDA Setup (Linux) modifications (Very Important - without this the setup will fail)
------------------------------------------------------------------------------------------------
By default, 'run_linux.sh' is configured for AMD (ROCm). To use an NVIDIA GPU:
1. Open 'run_linux.sh' in a text editor.
2. Find the line that installs torch (usually Step 3).
3. Replace the pip3 install command with the following:
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/cu121
4. Save the file and run it. This will ensure the tool uses your NVIDIA GPU for lightning-fast caption generation.
5. Clean Operation: To remove the environment and model, run the script with the '--clean' argument:
- Linux: ./run_linux.sh --clean
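After changing the install command, it can help to confirm which backend the installed PyTorch build actually sees. The snippet below is a small check to run inside the tool's virtual environment. It is not part of the tool, and the function name is hypothetical.

```python
def describe_torch_backend() -> str:
    """Report which GPU backend the installed PyTorch build can use."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__}: no CUDA/ROCm GPU visible"
    # ROCm builds also report through torch.cuda; torch.version.hip is set on them.
    backend = "ROCm" if getattr(torch.version, "hip", None) else "CUDA"
    name = torch.cuda.get_device_name(0)
    return f"PyTorch {torch.__version__}: {backend} GPU '{name}'"


print(describe_torch_backend())
```

If this reports ROCm or no GPU on an NVIDIA machine, the environment was most likely built with the default AMD wheels, and running the script with '--clean' before relaunching should rebuild it.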
** With this tool I have also provided 60 royalty-free images from Unsplash as a free sample dataset to test the tool's capabilities. If you are looking for good uncaptioned datasets to train on but find it difficult to collect your own, you can also look up my CivitAI profile ( https://civarchive.com/user/sarcastictofu ), where I have uploaded some decent datasets. Feel free to download them, and if you like any of my work please send me some Buzz.
Comments (14)
Feel free to review my scripts, tweak and fork this toolset.. and share with me if you want.. hope this will help you a lot... I built this when CivitAI's LORA trainer was misbehaving pretty badly a few days ago.
Some of you Windows users have asked me how to use this... if you can't code you can upload the Linux or Mac setup script I have provided to either Claude or Gemini AI and ask it to convert it to a Windows setup script... be sure to give the AI details about your GPU (AMD or Nvidia, how much VRAM it has, etc.) so it can convert my setup script for your specific Windows system.. it should be an easy conversion..
Hi, I know practically nothing about programming or anything else, but with the help of gemini, I managed to get it running in a portable Python environment. I had to install all the necessary dependencies, modify the main lora_data_builder.py script, and even modify the modeling_florence2.py script to bypass flash_attn!
It works great, it's very fast, and it's very useful. Thanks for sharing.
Damn, I didn't read your message! Maybe it would have been much easier to convert the installation file!
@jazara930 So regular Python can't run it with just a converted Windows setup script? I am curious about that.. I personally like my Mac a lot and on another machine I use Linux, so I am wondering why a plain and simple Python GUI won't run.
Hi! It’s great to hear from you. Following your suggestion, I’ve been working on the Windows conversion and managed to get everything running perfectly, including full GPU support!
Summary of the Windows & NVIDIA Optimization
To make the tool fully functional on Windows with a dedicated NVIDIA GPU (like the RTX 3090/4090), I performed the following key steps:
1. Portable Environment with Tkinter Support: I moved away from a system-wide Python installation. I created a Portable Python environment that includes the Tkinter libraries (often missing in minimal Python embeds), allowing the GUI to run without requiring the user to install Python on their OS.
2. Dynamic Path Management (Portability): I replaced absolute paths with dynamic batch scripting using %~dp0. This allows the entire folder to be moved between different drives (e.g., from C: to D: or an external USB) without breaking the link between the Batch launcher, the Python interpreter, and the script.
3. The Half-Precision Fix (the most important step): The original code often defaults to float32 or generic CPU tensors. On Windows with CUDA, Florence-2 and similar models can trigger a "RuntimeError: mat1 and mat2 shapes cannot be multiplied" or precision mismatches. I modified lora_data_builder.py to:
- Force the use of device='cuda'.
- Convert inputs and model weights to float16 (Half-Precision). This is crucial for NVIDIA 30-series/40-series cards to ensure compatibility and significantly speed up the captioning process.
4. Dependency Isolation: Instead of a global pip install, I used the python -m pip command within the batch file to install torch, torchvision (with CUDA 12.1 wheels), transformers, timm, and einops directly into the portable folder. This ensures that the tool doesn't conflict with other AI software on the user's PC.
5. Automatic Model Downloader: I've integrated a small Python snippet using huggingface_hub within the batch script to automatically download the Florence-2 model into a local folder only if it's missing, making the setup "one-click" for the end user.
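Points 2 and 5 can be sketched in Python roughly as follows. This is not the fork's actual code: the helper names are hypothetical, and treating an empty folder as "missing" is an assumption. The default downloader uses snapshot_download from huggingface_hub and needs network access, so the check itself accepts any downloader function.

```python
from pathlib import Path
from typing import Callable

# Model named in the setup section of the description.
REPO_ID = "MiaoshouAI/Florence-2-base-PromptGen-v1.5"


def app_dir(script_file: str) -> Path:
    """Folder containing the script, the Python counterpart of %~dp0 in batch."""
    return Path(script_file).resolve().parent


def hub_download(repo_id: str, target: Path) -> None:
    """Default downloader; needs the huggingface_hub package and a network."""
    from huggingface_hub import snapshot_download
    snapshot_download(repo_id=repo_id, local_dir=str(target))


def ensure_model(model_dir: Path,
                 download: Callable[[str, Path], None] = hub_download) -> bool:
    """Download the model only if the folder is missing or empty.

    Returns True when a download was triggered.
    """
    if model_dir.is_dir() and any(model_dir.iterdir()):
        return False
    model_dir.mkdir(parents=True, exist_ok=True)
    download(REPO_ID, model_dir)
    return True
```

A launcher script could call `ensure_model(app_dir(__file__) / "model")` before loading the model, so every path is resolved relative to wherever the folder currently lives.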
And.... oh I forgot, thank you so much for all the loras you share for free, and sorry for my bad english! I'm from Italy. Good job!
@jazara930 great, you should share your modifications of this in CivitAI.. If you do so I would also recommend this to others.
@sarcastictofu That’s a great idea! I will work on a clean, portable package to upload it to CivitAI as soon as I can. I’ll make sure to credit you as the original author. Happy New Year and thanks for your support!
@jazara930 Happy new year to you too.
@sarcastictofu happy new year to you,
Hi SarcasticTofu! I wanted to let you know that I followed your advice and officially released the Portable Windows version of the Florence-2 LORA Data Builder on Civitai. I managed to fix the CUDA initialization and the model loading issues specifically for NVIDIA users.
I made sure to give you full credit for the core code and logic in the description. Thanks again for the suggestion and for the great tool!
@jazara930 Good, I am linking your fork in the "Suggested Resources" section.
@sarcastictofu Hi SarcasticTofu! Thank you so much for the shout-out and for linking my fork in your Suggested Resources! I've just reciprocated by adding your original tool to my Suggested Resources section as well. It’s an honor to contribute to your project for the Windows community. Keep up the great work!
@sarcastictofu Hey SarcasticTofu! Just wanted to let you know that v1.2 (Universal CPU Edition) is coming! This new version fixes the blank screen and loading issues by forcing a super-stable CPU mode.
It’s surprisingly fast and avoids all those GPU driver headaches. I’ll be uploading the new files between tomorrow and the day after. Happy New Year!
Just did a minor UI tweak and re-uploaded the python script.. now text visibility of all input fields should be good.
If anyone is looking for a Windows - Nvidia Fork of my Tool, there is one already out and available here -
Florence-2 LORA Data Builder - NVIDIA Optimized (Windows Portable) by jazara930 - https://civitai.com/models/2266157/florence-2-lora-data-builder-nvidia-optimized-windows-portable


