Clarity TagFlow
The intelligent, local-first image tagging and AI dataset curation powerhouse.
Clarity TagFlow is a modern desktop application designed to streamline the process of tagging images for Machine Learning datasets, Stable Diffusion training, and digital asset management. Built with a strict focus on privacy, speed, and optimization, it runs state-of-the-art AI models entirely on your local machine.
Project Status: ~85% Complete! All major bugs have been squashed, and video playback is now fully operational. We are currently optimizing performance before the next massive architectural leap.
We Need Your Feedback! Do you miss an old feature that was removed? Is there something you don't like, or an improvement you're dying to see? Leave a comment and let us know!
System Requirements
OS: Windows / Linux
Runtime: Java 21 (Note: Moving to native binaries soon!)
AI Acceleration: Microsoft Visual C++ Redistributable (for ONNX Runtime on Windows)
Tested Hardware: Tested and verified to run smoothly on laptop RTX 5080 configurations.
Key Features
Local AI Powerhouse
Privacy First: No images or data are ever uploaded to the cloud. All processing happens 100% offline.
Multi-Model Support: Seamlessly switch between JoyTag (SOTA) and various WD14 tagging models (ConvNext, SwinV2, Eva02) to get the most accurate results for your specific content.
Smart Thresholding: Fine-tune confidence thresholds to control exactly how strict the AI is when applying tags.
Accelerated Workflow
Batch Processing: Auto-tag entire folders of images in minutes. Choose to Append new tags or Overwrite existing ones completely.
Smart Autocomplete: Type faster with a context-aware autocomplete system that learns from your current dataset and standard tag libraries.
Sidecar Compatibility: Reads and writes standard .txt sidecar files, ensuring seamless compatibility with Kohya_ss, OneTrainer, and other major training tools.
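The Append and Overwrite modes for sidecar files could work roughly as sketched below. This is an illustration, not the application's code: the class SidecarTags, its method names, the comma-plus-space separator, and the de-duplication on append are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: merges newly generated tags into the text of a .txt sidecar.
final class SidecarTags {
    enum Mode { APPEND, OVERWRITE }

    // Splits "a, b, c" into trimmed, non-empty tags.
    static List<String> parse(String sidecarText) {
        List<String> tags = new ArrayList<>();
        for (String raw : sidecarText.split(",")) {
            String tag = raw.trim();
            if (!tag.isEmpty()) {
                tags.add(tag);
            }
        }
        return tags;
    }

    // APPEND keeps the existing tags first and skips duplicates;
    // OVERWRITE discards the existing tags entirely.
    static String merge(String existingText, List<String> newTags, Mode mode) {
        Set<String> merged = new LinkedHashSet<>();
        if (mode == Mode.APPEND) {
            merged.addAll(parse(existingText));
        }
        for (String raw : newTags) {
            String tag = raw.trim();
            if (!tag.isEmpty()) {
                merged.add(tag);
            }
        }
        return String.join(", ", merged);
    }
}
```

The merged string is what would be written back to the sidecar next to the image.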
Modern Experience
Sleek Dark UI: A refined, eye-friendly interface designed for long, focused dataset curation sessions.
Visual Feedback: Features a "breathing" AI status indicator to let you know exactly when the system is processing.
Drag & Drop: Easily reorder tags or drag new images directly into your active workflow.
Control & Safety
Global Blacklist: Automatically filter out unwanted tags (e.g., sensitive terms or specific formatting errors) across your entire dataset.
Session Memory: Newly added AI tags are highlighted in Cyan, making it incredibly easy to review changes before committing to a save.
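A confidence threshold and a global blacklist can be combined in one filtering pass. The sketch below assumes the tagger returns a map of tag to score and that the blacklist stores lower-case terms. TagFilter and its method are hypothetical names, not part of the application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: keeps tags that pass the threshold and are not blacklisted.
final class TagFilter {
    static List<String> select(Map<String, Float> scores, float threshold, Set<String> blacklist) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Float> entry : scores.entrySet()) {
            String tag = entry.getKey().trim();
            boolean confident = entry.getValue() >= threshold;
            // Assumption: blacklist entries are stored in lower case.
            boolean banned = blacklist.contains(tag.toLowerCase(Locale.ROOT));
            if (confident && !banned) {
                kept.add(tag);
            }
        }
        return kept;
    }
}
```

Raising the threshold makes the filter stricter. The blacklist applies regardless of how confident the model is.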
What's New & Improved
Fixed Video Playback: The underlying issues causing intermittent video playback failures have been completely resolved.
Total Bug Purge: Fixed all known memory leaks and stability quirks. The gallery navigation bug (where deleting an image reset you to page one) remains permanently fixed.
Smart Booru Downloader: Automatically fetches associated tags when downloading from Booru sites, establishing an immediate 80-90% accuracy baseline.
Seamless Generation Setup: The generation panel auto-starts in the background. No more manual configuration via .bat files.
In-App Image Generation: Direct support for generating images right inside your tagging workflow.
Current Known Issues & Immediate Fixes
Gemma 4 Support: We are aware of the current issues regarding Gemma 4 integration. A dedicated compatibility fix is actively being worked on and will be released shortly.
The Grand Blueprint: Moving to Rust & Beyond
We are officially planning a massive architectural evolution, moving away from Java and rewriting the application core in Rust. This transition is all about native optimization, blistering speed, and breaking down dependencies.
Here is what the Rust-powered future of Clarity TagFlow looks like:
Native Cross-OS Performance: Moving to Rust unlocks a highly optimized, beautifully integrated native UI with seamless compatibility across Windows and Linux.
Built-in Model & LoRA Training: Train your models and LoRAs directly within the app. Features a live visual training preview so you can monitor progress in real-time and know your model is turning out perfectly.
All-in-One Image & Video Generation: Full, native image and video generation support built directly into the UI. Say goodbye to configuring complex, bloated ComfyUI workflows.
Zero-Dependency Local LLM: We are eliminating the need for external tools like Ollama or LM Studio. The local LLM engine will be completely embedded out of the box.
In-App Civitai Integration: Browse, search, and download Civitai resources directly inside the application, with no more manual downloads or directory headaches.
Interactive VR Anime Companions: Introducing a built-in VR anime character for interactive roleplaying. The AI will dynamically control the avatar, express distinct emotions, and react in real-time.
External Engine Integration: Building the foundation for the embedded LLM to broadcast and connect with external 3D applications and game engines.
Description
================================================================
Clarity TagFlow - UPDATES & FIXES
Session date: 2026-05-23
================================================================
This file lists the features added and bugs fixed during the
development session. Grouped by area.
----------------------------------------------------------------
LEFT BROWSER PANEL / THUMBNAILS
----------------------------------------------------------------
- GIF badge: GIF tiles now show a "gif.svg" badge in the top-left
corner of the thumbnail (ThumbnailToggleButton).
- Video badge: video tiles show a "video.svg" badge in the
top-left corner. Removed the old centered play-button overlay
and deleted playbutton.svg (badge replaces it).
- Video thumbnails fixed: video previews no longer get stuck on a
blank "loading" tile after deleting an image / refreshing.
* Root cause: thumbnail requests for an already-in-flight
image dropped their callback, so rebuilt tiles never
received the image. Reworked ThumbnailService to COALESCE
callbacks (one decode notifies all waiters) and to stop
clearing in-flight loads on clearCache().
* Also: failed video grabs are no longer cached as blanks
(so they retry), and the video-decode permit wait was
raised so a burst of videos loads instead of timing out.
- Scroll-gap bug fixed: the spacing between tiles now stays a
consistent 16px while scrolling. Previously lazily-appended
chunks used 6px, so images got tighter further down the list.
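The callback-coalescing fix described above for ThumbnailService can be illustrated with a small generic loader. This is a sketch of the technique only. CoalescingLoader, its fields, and the convention that a failed decode returns null are assumptions, not the real ThumbnailService code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical loader: one decode per key notifies every waiting callback.
final class CoalescingLoader<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private final Map<K, List<Consumer<V>>> inFlight = new HashMap<>();
    private final Function<K, V> decoder; // assumed to return null on failure
    private final Executor executor;

    CoalescingLoader(Function<K, V> decoder, Executor executor) {
        this.decoder = decoder;
        this.executor = executor;
    }

    void request(K key, Consumer<V> callback) {
        V cached;
        boolean startDecode = false;
        synchronized (this) {
            cached = cache.get(key);
            if (cached == null) {
                List<Consumer<V>> waiters = inFlight.get(key);
                if (waiters == null) {
                    waiters = new ArrayList<>();
                    inFlight.put(key, waiters);
                    startDecode = true; // first request for this key
                }
                waiters.add(callback); // later requests join instead of being dropped
            }
        }
        if (cached != null) {
            callback.accept(cached);
        } else if (startDecode) {
            executor.execute(() -> complete(key, decode(key)));
        }
    }

    private V decode(K key) {
        try {
            return decoder.apply(key);
        } catch (RuntimeException e) {
            return null; // treated as a failed grab
        }
    }

    private void complete(K key, V value) {
        List<Consumer<V>> waiters;
        synchronized (this) {
            waiters = inFlight.remove(key);
            if (value != null) {
                cache.put(key, value); // failures are not cached, so they retry
            }
        }
        for (Consumer<V> waiter : waiters) {
            waiter.accept(value); // waiters receive null when the decode failed
        }
    }

    // Clears finished results only; in-flight loads and their waiters are kept.
    synchronized void clearCache() {
        cache.clear();
    }
}
```

With this shape, a tile rebuilt while its image is still decoding registers a second callback and receives the result of the same decode.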
----------------------------------------------------------------
VIDEO PLAYER (VLC)
----------------------------------------------------------------
- Modern seek bar: replaced the plain slider with a modern
scrubber (thin rounded track, accent-coloured played portion,
round white knob that grows on hover, click-to-seek).
- Play/Pause icons: now uses playbutton.svg / pausebutton.svg
(fixed the "3 dots" that appeared from missing text glyphs).
- Fullscreen: added a full-screen toggle button
(full_screen.svg / close_fullscreen.svg); Esc exits.
- Video Film Strip: when enabled in Settings, the seek bar
becomes a YouTube-style strip of frame thumbnails sampled
across the video, with a playhead and a played-progress bar.
Click/drag the strip to seek. (New VideoFilmStrip component +
VideoThumbnailer.snapshotAt for grabbing frames at a position.)
----------------------------------------------------------------
VIEWER PANEL
----------------------------------------------------------------
- Right-click "Crop Image": right-click an image -> Crop Image ->
drag to select an area -> release to crop. Saves a COPY
(<name>_crop.png) next to the original; the original is never
changed. Crops from the full-resolution original. Esc or
right-click cancels.
----------------------------------------------------------------
RIGHT DETAILS PANEL
----------------------------------------------------------------
- Tags / SD Metadata / Caption switch: a window_switch.svg button
in the top-right of the content box cycles between the views
that exist (only shows when more than one is available).
- Caption support: shows the image's .caption file; the Edit
button edits the .caption when in Caption view.
----------------------------------------------------------------
TAG MANAGER PANEL
----------------------------------------------------------------
- Long captions/tags now WRAP and stay fully visible (switched
the list cell renderer to a wrapping text area; the list tracks
the viewport width). Content scrolls inside the box.
- Fixed a regression where the tag-list box became tiny (a
max-height cap was removed so the box fills the panel again).
----------------------------------------------------------------
AI / LLM
----------------------------------------------------------------
- Captioning vs tagging: the AI now tells the two apart.
* "tag this" -> comma tags saved to <name>.txt
* "caption this" -> a prose description saved to <name>.caption
(New [CAPTION] block in the assistant; captions stored in a
separate .caption sidecar so tags and captions coexist.)
- LLM Right panel (.caption aware): if an image has no .txt but
has a .caption it loads the caption; if it has both it loads
.txt first. A window-switch button next to "LLM Suggestions /
Tags:" toggles between .txt and .caption (only shows when both
exist). Long captions wrap.
- LLM Left panel: thumbnail gallery now uses aspect-correct,
image-hugging tiles (like the main browser) instead of fixed
squares.
- Clear chat: the Clear button now also clears the AI's
conversation memory (both the normal and role-play assistants),
not just the visible messages.
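The .txt / .caption loading rule described above (load .txt first when both exist, fall back to .caption, show the switch only when both exist) might look like the sketch below. SidecarResolver and its methods are hypothetical names. The existence check is passed in as a predicate so the rule can be exercised without touching the disk.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper: decides which sidecar a panel loads by default.
final class SidecarResolver {
    // Swaps the image extension: "cat.png" with ".txt" becomes "cat.txt".
    static Path sibling(Path image, String extension) {
        String name = image.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return image.resolveSibling(base + extension);
    }

    // .txt wins when both exist; .caption is the fallback.
    static Optional<Path> defaultSidecar(Path image, Predicate<Path> exists) {
        Path txt = sibling(image, ".txt");
        Path caption = sibling(image, ".caption");
        if (exists.test(txt)) {
            return Optional.of(txt);
        }
        if (exists.test(caption)) {
            return Optional.of(caption);
        }
        return Optional.empty();
    }

    // The toggle button is only useful when there are two views to switch between.
    static boolean showSwitch(Path image, Predicate<Path> exists) {
        return exists.test(sibling(image, ".txt")) && exists.test(sibling(image, ".caption"));
    }

    // Convenience overload that checks the real file system.
    static Optional<Path> defaultSidecar(Path image) {
        return defaultSidecar(image, Files::exists);
    }
}
```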
----------------------------------------------------------------
DRAG & DROP + PROJECTS
----------------------------------------------------------------
- Drag & drop import: drop images / GIFs / videos (or folders of
them) anywhere in the app.
* If a folder is open, files are copied into it.
* If no folder is open, a new date-stamped project folder is
created (under data/projects) and opened.
- Live folder updates: the open folder is watched; new/changed
files refresh the browser automatically, and the currently
selected image stays selected (no need to re-click it).
- Projects (saved in the app): auto-created folders live under
data/projects. A "Projects" entry in the folder menu lists them;
selecting one opens it. The currently-open project is
highlighted.
- New project: right-click the folder icon (or "New Project..."
in the menu) to create a named project folder.
- Folder menu header: now shows which folder is currently
selected.
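The drop-target rule for drag & drop import (copy into the open folder, otherwise create a date-stamped project under data/projects) can be sketched as a single function. ImportTarget, the "project_" prefix, and the timestamp pattern are assumptions for illustration. The changelog does not state the actual folder naming scheme.

```java
import java.nio.file.Path;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;

// Hypothetical helper: picks the folder that dropped files are copied into.
final class ImportTarget {
    // Assumed stamp format; chosen to be sortable and valid in file names.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd_HH-mm-ss");

    static Path resolve(Optional<Path> openFolder, Path projectsRoot, LocalDateTime now) {
        // An open folder always wins; otherwise a new project folder is named by date.
        return openFolder.orElseGet(() -> projectsRoot.resolve("project_" + STAMP.format(now)));
    }
}
```

The caller would still need to create the returned folder and open it in the browser; that part is omitted here.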
----------------------------------------------------------------
ENCRYPTED FOLDERS (AES-256)
----------------------------------------------------------------
- Startup folder + encryption (Settings > General, top):
* Set a startup folder and "Load this folder when the app
opens".
* "Encrypt..." packs the folder into a password-protected
AES-256 .zip (offers to delete the unencrypted originals).
* On launch, an encrypted startup folder prompts for the
password and mounts it.
- Add to an encrypted folder: dropping files into an open
encrypted archive re-encrypts it (extract -> add -> re-zip ->
re-mount), keeping the current selection; offers to delete the
dropped originals so no plaintext copy remains.
- Delete / Edit tags in an encrypted folder: now work via the
same re-encrypt mechanism (Right Details panel + advanced Tag
Manager).
----------------------------------------------------------------
SETTINGS
----------------------------------------------------------------
- New "Info" tab (next to General): lists what the app can do and
a reference of keyboard & mouse controls. Scrolls inside the tab
so it doesn't make the dialog too tall.
- Fixed: couldn't change Thumbnail Width / Height (and other
spinners). The custom editor wrapper was breaking the spinner's
value binding, so typed values never committed. Now the spinner
paints its own rounded background and keeps the default editor.
- Default thumbnail size changed to Width 230 / Height 400.
----------------------------------------------------------------
CIVITAI INFO
----------------------------------------------------------------
- Added an "API Status" pill in the top-right corner (next to the
Civitai logo), like Danbooru's. Polls every 5s and shows
Checking / Online / Offline.
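A status pill like this one is typically driven by a small polling loop. The sketch below shows one way to do it with a scheduled executor. ApiStatusPoller is a hypothetical class, and the probe is an assumed caller-supplied check (for example an HTTP request that returns true on success), not the application's actual Civitai call.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical poller: reports Checking / Online / Offline transitions.
final class ApiStatusPoller {
    enum Status { CHECKING, ONLINE, OFFLINE }

    private final BooleanSupplier probe;      // assumed: true when the API answered
    private final Consumer<Status> onChange;  // called only when the status changes
    private volatile Status status = Status.CHECKING;
    private ScheduledExecutorService scheduler;

    ApiStatusPoller(BooleanSupplier probe, Consumer<Status> onChange) {
        this.probe = probe;
        this.onChange = onChange;
    }

    Status status() {
        return status;
    }

    // One check; a probe that throws counts as offline.
    void pollOnce() {
        Status next;
        try {
            next = probe.getAsBoolean() ? Status.ONLINE : Status.OFFLINE;
        } catch (RuntimeException e) {
            next = Status.OFFLINE;
        }
        if (next != status) {
            status = next;
            onChange.accept(next);
        }
    }

    // Polls immediately, then 5 seconds after each check completes.
    synchronized void start() {
        if (scheduler != null) {
            return;
        }
        scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "api-status-poller");
            thread.setDaemon(true); // never keeps the app alive on exit
            return thread;
        });
        scheduler.scheduleWithFixedDelay(this::pollOnce, 0, 5, TimeUnit.SECONDS);
    }

    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }
}
```

In a Swing UI the onChange callback would normally hand off to the event thread with SwingUtilities.invokeLater before repainting the pill.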
----------------------------------------------------------------
NEW FILES ADDED THIS SESSION
----------------------------------------------------------------
- InfoPanel.java (Settings "Info" tab)
- MediaImporter.java (drag & drop import)
- FolderWatcher.java (live folder updates)
- ProjectStore.java (projects under data/projects)
- FolderEncryptor.java (AES-256 folder encryption)
- Captions.java (.caption sidecar read/write)
- ModernSliderUI.java (modern video scrubber)
- VideoFilmStrip.java (video frame-strip scrubber)
- Captions / caption support across panels
(Icons used: gif.svg, video.svg, playbutton.svg, pausebutton.svg,
full_screen.svg, close_fullscreen.svg, window_switch.svg)
----------------------------------------------------------------
BUNDLED VLC (libVLC) + LICENSE NOTE
----------------------------------------------------------------
- The video player now bundles libVLC so users don't need to
install VLC:
src/main/resources/tools/vlc/
libvlc.dll
libvlccore.dll
plugins/ (~133 MB)
Loaded via the new VlcBundle.configure(), which points vlcj/JNA
at this folder before NativeDiscovery runs. Falls back to an
installed VLC if the bundle isn't present.
Note: native DLLs can't be loaded from inside a packaged .jar.
This works when running from the IDE / exploded classes; for a
fat-jar build, the tools/vlc folder must be extracted to disk at
startup first.
- LICENSE (important when distributing):
This app bundles and uses libVLC and VLC plugins from the
VLC media player project (c) the VideoLAN organization.
* libVLC is licensed under LGPL-2.1+.
* Some bundled VLC plugins are licensed under the GPL.
* The Java binding is vlcj (c) Caprica Software (LGPL/GPL).
When distributing the application you must include these license
notices. A summary was added to the Settings > Licenses tab, and
the full license / attribution texts now ship in the bundle folder:
src/main/resources/tools/vlc/
COPYING.txt (GNU GPL, as shipped with VLC)
AUTHORS.txt (VLC authors / contributors)
THANKS.txt (acknowledgements)
README.txt (VLC readme)
NOTICE.txt (our LGPL/GPL split + reference links)
References:
https://www.videolan.org/legal.html
https://www.gnu.org/licenses/lgpl-2.1.html
https://github.com/caprica/vlcj
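For the fat-jar case noted above, where the bundled tools/vlc folder has to be extracted to disk before the native libraries can load, a minimal extraction step might look like this. BundleExtractor is a hypothetical helper. It assumes the list of bundled file names is known up front (for example from a manifest file), since the contents of a jar folder cannot be listed as easily as a directory on disk.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper: copies bundled files into a fresh temp directory.
final class BundleExtractor {
    static Path extractToTemp(List<String> relativeNames, Function<String, InputStream> open) {
        try {
            Path dir = Files.createTempDirectory("vlc-bundle");
            for (String name : relativeNames) {
                Path target = dir.resolve(name).normalize();
                if (!target.startsWith(dir)) {
                    // Guards against names such as "../evil.dll".
                    throw new IllegalArgumentException("Unsafe entry: " + name);
                }
                Files.createDirectories(target.getParent());
                try (InputStream in = open.apply(name)) {
                    if (in == null) {
                        throw new IOException("Missing resource: " + name);
                    }
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
            }
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller could pass something like name -> SomeClass.class.getResourceAsStream("/tools/vlc/" + name) as the opener, where SomeClass stands for any class in the application jar, and then point the native discovery at the returned directory.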
================================================================
SESSION 2 - MORE FIXES & FEATURES
================================================================
----------------------------------------------------------------
AI ORB (replaces the static AI.svg everywhere)
----------------------------------------------------------------
- New AiOrb component: a "living" ring of glowing particles that
gently breathes when idle and speeds up / brightens / ripples
when the assistant is THINKING or TALKING. A lightweight Swing
stand-in for a WebGL particle ring (no 3D, same feel).
* One shared 60fps ticker drives every visible orb (cheap even
with many on screen, e.g. one per chat message); off-screen
orbs are skipped and the ticker stops when none are alive.
* Theme-accent coloured; states IDLE / THINKING / TALKING.
- Replaced the AI.svg icon in all 5 spots: LLM chat assistant
avatar (thinks pre-stream, "talks" while streaming), LLM tagging
panel header, Tag Manager header, Generate controls header, and
the image-viewer header (the two generation headers tint the orb
to the live status colour).
- Orb size tuned (36px) and the four panel headers were pinned to a
fixed height so the orb sits inside them without resizing them.
----------------------------------------------------------------
TEXT-TO-SPEECH (smoother + faster to start)
----------------------------------------------------------------
- Fixed the squeaky/"chipmunk" voice: removed the resampling pitch
shift (was +18%, which moved the formants up). Pitch is now 1.0
(natural timbre); voice is a warm blend (af_heart + a little
af_bella) at a calmer 0.96 speed.
- Much lower start-up delay: speech now STREAMS by sentence, so the
first sentence plays while the next is still being synthesised,
instead of waiting for the whole reply to render. The model also
warms up the moment TTS is enabled, so the first line is prompt.
- Clean cancellation across the new streaming pipeline (interrupting
or starting a new line cuts off immediately, no orphaned temp WAVs).
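The sentence-streaming idea above can be sketched as a one-step prefetch: while sentence N plays, sentence N+1 is synthesised on a worker thread. SentenceStreamer is a hypothetical class. The synth and play functions stand in for the real TTS engine and audio output, and sentence splitting uses the JDK's BreakIterator rather than whatever the application uses.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical pipeline: synthesise the next sentence while the current one plays.
final class SentenceStreamer {
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    // play is expected to block until the audio has finished.
    static void speak(String reply, Function<String, byte[]> synth, Consumer<byte[]> play) {
        List<String> sentences = split(reply);
        if (sentences.isEmpty()) {
            return;
        }
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            String first = sentences.get(0);
            Future<byte[]> next = worker.submit(() -> synth.apply(first));
            for (int i = 0; i < sentences.size(); i++) {
                byte[] audio = next.get(); // wait for the current sentence
                if (i + 1 < sentences.size()) {
                    String upcoming = sentences.get(i + 1);
                    next = worker.submit(() -> synth.apply(upcoming)); // prefetch
                }
                play.accept(audio); // the next sentence renders in the meantime
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // cancelled: stop speaking
        } catch (ExecutionException e) {
            throw new IllegalStateException("Synthesis failed", e.getCause());
        } finally {
            worker.shutdownNow(); // drops any pending synthesis
        }
    }
}
```

Interrupting the thread that calls speak is how this sketch models cancellation. Cleaning up temporary audio files is outside its scope.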
----------------------------------------------------------------
TAG MANAGER - NEW MODEL: PixAI Tagger v0.9
----------------------------------------------------------------
- Added PixAI Tagger v0.9 (EVA02, ~13.4k tags, great character /
series recognition) to "Get Models" and the model dropdown
(Tag Manager + LLM panel). Uses the DeepGHS ONNX export.
* New PixaiTagger class with the model's exact preprocessing
(448x448, RGB, mean/std 0.5) and per-category thresholds
(general from the spinner, character held to >= 0.85).
* Fixed initially-wrong tags: the ONNX has 3 outputs
(embedding / logits / prediction); we now read 'prediction'
by NAME instead of index 0 (which was the embedding).
- Model dropdown now lists ONLY downloaded models (plus the
"Select AI..." placeholder), and refreshes after Get Models
closes, so no more phantom models implying they're all installed.
Applied to both the Tag Manager and the LLM panel.
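The preprocessing figures given above (448x448, RGB, mean/std 0.5) and the per-category thresholds translate into a short routine. This is a sketch, not the actual PixaiTagger class. The channel-first (CHW) layout, the plain stretch to a square, and the bicubic resize are assumptions, and reading the named 'prediction' output through ONNX Runtime is not shown.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical preprocessing: image to normalised float tensor.
final class PixaiPreprocess {
    static final int SIZE = 448;
    static final float CHARACTER_THRESHOLD = 0.85f;

    // Maps 0..255 to -1..1 using mean 0.5 and std 0.5.
    static float normalize(int channel) {
        return (channel / 255f - 0.5f) / 0.5f;
    }

    static float[] toTensor(BufferedImage source) {
        BufferedImage scaled = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(source, 0, 0, SIZE, SIZE, null); // assumption: stretch, no padding
        g.dispose();

        int plane = SIZE * SIZE;
        float[] tensor = new float[3 * plane]; // assumption: CHW layout, R then G then B
        for (int y = 0; y < SIZE; y++) {
            for (int x = 0; x < SIZE; x++) {
                int rgb = scaled.getRGB(x, y);
                int i = y * SIZE + x;
                tensor[i] = normalize((rgb >> 16) & 0xFF);
                tensor[plane + i] = normalize((rgb >> 8) & 0xFF);
                tensor[2 * plane + i] = normalize(rgb & 0xFF);
            }
        }
        return tensor;
    }

    // General tags use the user's spinner value; character tags are held to 0.85.
    static boolean keep(boolean isCharacter, float score, float generalThreshold) {
        return score >= (isCharacter ? CHARACTER_THRESHOLD : generalThreshold);
    }
}
```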
----------------------------------------------------------------
MODELS / DATA STORAGE (everything under src/main/resources)
----------------------------------------------------------------
- AI tagger models downloaded via "Get Models" now always save into
src/main/resources/tools/<model>/ (the old code fell back to a
project-root tools/ folder on first download).
- App runtime data (settings, projects, hearts, memory, SFTP host
key, emoji cache, Pexels logs) now lives under src/main/resources
/data and /logs, consistent with config/, logs/ and tools/.
- pom.xml excludes the large downloadable tagger models
(tools/joytag, tools/wd14-*, tools/pixai-*) from the resource copy
so they don't bloat builds (users download them at runtime).
----------------------------------------------------------------
GENERATE vs LOCAL LLM (mutually exclusive)
----------------------------------------------------------------
- Generate (Stable Diffusion) and the Local LLM can't run at once.
Enabling either while the other is on now shows a "turn the other
off first" message and reverts the toggle. The guard applies in BOTH
directions (Generate tab and Local LLM tab).
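A two-way guard like this needs only one shared piece of state. The sketch below is illustrative: EngineGuard and its methods are hypothetical names, and the real toggles presumably also start and stop the underlying processes.

```java
// Hypothetical guard: at most one of the two engines may be active.
final class EngineGuard {
    enum Engine { GENERATE, LOCAL_LLM }

    private Engine active; // null when neither engine is running

    // Returns false when the other engine is on; the caller then shows the
    // "turn the other off first" message and reverts its toggle.
    synchronized boolean tryEnable(Engine engine) {
        if (active != null && active != engine) {
            return false;
        }
        active = engine;
        return true;
    }

    synchronized void disable(Engine engine) {
        if (active == engine) {
            active = null;
        }
    }
}
```

Because both tabs consult the same guard, the check is symmetric by construction.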
----------------------------------------------------------------
CIVITAI INFO
----------------------------------------------------------------
- Hearting an image no longer refreshes the Civitai Info panel
(it was re-hitting the Civitai API on every heart). The favorite
indicator still updates live via HeartManager's change listener.
----------------------------------------------------------------
UI POLISH - SMOOTH ROUNDED CORNERS (no more setShape)
----------------------------------------------------------------
- Replaced hard-clipped setShape() rounding (which looked blurry /
jagged on the right & bottom) with antialiased rounded panels on
transparent windows across the app:
* Tag Manager settings dialog (also fixed a border that looked
"cut off" on the right/bottom, an off-by-one in RoundedBorder).
* Backup dialogs (progress + New Backup).
* Folder picker, Embedding / LoRA selection dialogs, tooltips,
and the LLM emoji / tools popups.
* The folder menu + its Projects submenu are now rounded AND
smooth (custom rounded popup border + transparent popup window).
There is no longer any setShape() rounding anywhere in the app.
- Tag Manager settings: the Tag Separator dropdown and Default
Confidence Threshold now sit on the LEFT next to their labels
(instead of pushed to the far right). Same for the LLM panel's
Threshold control.


