(It would be greatly appreciated if someone could point me to a clean source of Tokyo 7th Sisters assets. I don't really want to scrape Twitter or reverse-engineer the game API.)
Mask, Don't Negative Prompt: Dealing with Undesirable Parts of Training Images
Introduction
Training images aren't always clean. When training for a given target, the model will often also learn unrelated parts of the images, such as text, frames, or watermarks. Several strategies can be applied to this problem, each with shortcomings:
Cropping: Leave out the undesired parts. This changes the source composition and is not applicable in some cases.
Inpainting: Preprocess the data and replace the undesirable parts with generated pixels. This requires a good inpainting prompt and model.
Negative Prompts: Train as is and add negative prompts when generating new images. This requires the model to know how the undesirable parts map to the prompt.
Another simple strategy is effective:
Masking: Multiply the loss with a predefined mask.
This method is not new, but the most popular LoRA training script does not yet have built-in support for it.
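The masking idea can be sketched in a few lines. This is a minimal illustration with NumPy, not the training code used for this model: the function name masked_mse, the normalize option, and the toy arrays are all assumptions made for the example. It multiplies a per-element squared error by a mask whose values are 1 where the model should learn and 0 where it should not.

```python
import numpy as np

def masked_mse(pred, target, mask, normalize=False, eps=1e-8):
    """Squared error multiplied element-wise by a mask in [0, 1].

    pred, target: arrays of shape (C, H, W).
    mask: array of shape (H, W); it is broadcast across the channel axis.
    normalize: hypothetical option (not from the write-up) that divides by
        the mask sum so heavily masked images are not down-weighted.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mask = np.broadcast_to(np.asarray(mask, dtype=np.float64), pred.shape)
    weighted = (pred - target) ** 2 * mask  # masked-out elements contribute 0
    if normalize:
        return float(weighted.sum() / (mask.sum() + eps))
    return float(weighted.mean())

# Toy example: the bottom-right element has a large error but is masked out.
pred = np.zeros((1, 2, 2))
target = np.array([[[1.0, 1.0], [1.0, 5.0]]])
mask = np.array([[1.0, 1.0], [1.0, 0.0]])

plain = masked_mse(pred, target, np.ones((2, 2)))  # 7.0
masked = masked_mse(pred, target, mask)            # 0.75
```

Because the masked element contributes nothing to the loss, it also contributes no gradient, which is why the model would not be pushed to reproduce that region.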
Experiment
60 images of Serizawa Momoka from Tokyo 7th Sisters, with card text and decorations, were used.
A masked LoRA and a plain unmasked LoRA were trained.
For the masked version, a mask was drawn over each source image using image editing software. Note that since the VAE has an 8x scaling factor, what the model sees is the mask pixelated into 8x8 blocks. Tags that describe the masked-out parts were removed.
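To illustrate the 8x scaling point, here is a small sketch of reducing a pixel-space mask to latent resolution. It assumes block averaging and image dimensions divisible by 8; the write-up does not say which reduction was used, and the name mask_to_latent is hypothetical.

```python
import numpy as np

def mask_to_latent(mask, factor=8):
    """Average a (H, W) pixel mask over factor x factor blocks.

    Assumption: H and W are multiples of `factor`. Averaging is one possible
    choice; taking the block minimum or maximum would also be reasonable.
    """
    mask = np.asarray(mask, dtype=np.float64)
    h, w = mask.shape
    if h % factor or w % factor:
        raise ValueError("mask height and width must be multiples of factor")
    # Split each axis into (number of blocks, block size), then average
    # over the two block-size axes.
    blocks = mask.reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# Toy 16x16 mask: one fully masked block and one half-masked block.
pixel_mask = np.ones((16, 16))
pixel_mask[0:8, 0:8] = 0.0    # whole top-left block masked out
pixel_mask[8:16, 8:12] = 0.0  # left half of bottom-right block masked out
latent_mask = mask_to_latent(pixel_mask)
```

A block that is only partly covered ends up with a fractional weight, so fine mask detail smaller than 8x8 pixels is lost either way.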
Results
(see preview images)
Future work
Auto generation of masks with segmentation models
Description
Card art only. Trained with masks.
Comments (12)
I used to collect T7S images (last updated July 2022); never knew other people knew about this game. Hope it can be of use to you. Uploaded to WeTransfer because it's a free service that allows up to 2GB for anyone without signing up. It will be deleted automatically within 7 days. https://we.tl/t-qu4fssW7lZ
https://imgur.com/a/8AotnE2 I also have some of these collections. If you're interested let me know.
Thanks for replying! I had a look at the images, unfortunately they aren't quite what I wanted. I should have clarified that I'm looking for 640x960 textless images like those people are posting on Twitter. The rar seems to use the same source as the game info wiki, which contains blurry/small images. Still thanks again for providing assistance.
@gustproof didnt know you were this specific. Would it help if i scrape those people's twi instead? just point me to them ill grab them for u
@gustproof based on what you've given, it seems that people share clean images of the cards they got in game on Twitter with a certain keyword, and searching for that keyword finds these borderless cards. Am I right? So what I need to do is scrape the list of Twitter URLs that contain the keyword and download the images. There will be tens or hundreds of thousands of pictures and duplicates, which makes it a hassle to sort them all, so I'll do all that for you and make sure the images are as you need.
If you've found yourself a complete source then lemme know, otherwise i'll be grabbing those pictures in the meantime. It'll take quite some time to properly collect all of them and have them sorted out.
hey there, a quick update https://we.tl/t-qEueAzD5Jp not complete yet, but just showing u what i managed to grab. progress is at about 2022 july, and still on going to grab older images. Hope it can be of use to you temporarily
@fizzballs Oh wow, didn't actually expect someone to do the chores for me, and with such efficiency! Really nice, helps a lot. While you're at it, can you also grab the tweet texts so that the images can be easier categorized by character?
@gustproof well i'd say we have the same goal instead of doing chores for you. I like collecting pictures of t7s, i also enjoy using artstyle loras. You seem to want to train t7s lora but dont have the needed pictures, so if i can provide you the pictures you could make the lora happen, then it is a win win situation i guess? so why not? Even if you didnt make the lora, still a win for me cuz now have collected borderless pictures of t7s, which is what i always wanted to do but didnt know how. I just would be sad if you dont happen to share the lora if you make it, but thats just another day of life and nothing is certain so i'm still fine with it.
unfortunately that is not possible, my limited skillset cant do that yet, but i have no problem collecting pictures if u need only this.
But if you have some magical tools that could work it out somehow, you can find the Twitter link for each image from the "url id" in the filename, which is set as "username"-"url id"-"posted date". The "url id" here is the one in twitter.com/i/status/"url id", and that page would contain the information you need.
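For anyone who wants to try this, the filename scheme described above could be turned back into tweet links roughly as follows. This is only a sketch: it assumes the username contains no hyphen, that the "url id" is purely numeric, and that anything after the second hyphen is the posted date (its exact format is unknown). The function name and the sample filename are made up for the example.

```python
from pathlib import Path

def tweet_url_from_filename(filename):
    """Rebuild a tweet URL from a "username"-"url id"-"posted date" filename."""
    stem = Path(filename).stem  # drop the file extension
    # Split on the first two hyphens only, so a date containing hyphens
    # stays in one piece.
    parts = stem.split("-", 2)
    if len(parts) < 3 or not parts[1].isdigit():
        raise ValueError(f"unexpected filename format: {filename}")
    _username, url_id, _posted = parts
    return f"https://twitter.com/i/status/{url_id}"

# Hypothetical filename, not taken from the actual archive.
url = tweet_url_from_filename("exampleuser-1234567890-20220701.jpg")
```

The tweet text itself would still have to be fetched from that URL by some other means.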
@fizzballs
My interest in t7s was not enough to justify the time to collect and curate the images by myself, but now that we have the data, I can probably pump out LoRAs sometime later.
If possible, can you share your data collection code? I'll see if I can add some features to it.
@gustproof idk how to code, but I use tools others made. The particular tool I used is https://github.com/furyutei/twMediaDownloader , a browser add-on that makes it easy to download the media from any Twitter page. Another tool, for duplicate image detection and automatic deletion, is https://github.com/KurtBestor/Hitomi-Downloader . The real hassle is the downloading, scanning, and deleting time, but I can manage.
Hey there, I've grabbed everything back to 2021 August 19th. The pictures before that are all filled with text, so I think I've reached the end. I thought I would continue to scrape, but they are all bordered.
So here https://we.tl/t-cSnrOvi2pE are the sorted images from the Twitter keyword, from recent posts back to 2021 August 20th. For cards that never appeared on Twitter (for example: https://tokyo7th.tumblr.com/post/664104188417015808 ), I can't do anything, since my source is all Twitter.
I've tried my best to make sure all the images are as you need. It should be 99% complete; I can't rule out that 1% mistakenly includes the version with text. I've double-checked and did all I can.
Feel free to ask any follow-up questions; if I'm able to, I'll try to assist further. Otherwise, good luck with these sources!
@fizzballs Cool, didn't know these tools exist. As I'm sort of busy for the moment, the models will not be made until I find the time. Thanks for the help so far!