Example of a prompt for generation:
positive:
In a gentle animation, a serene girl gently embraces a muscular male member with her lips, rhythmically shaking her head. Her hair sways softly, emphasizing each movement. The girl gives a slow blowjob to the man, sucking him and putting his penis completely into her mouth and then pulling it out.
negative:
The video is not of a high quality, it has a low resolution. Watermark present in each frame. Strange motion trajectory. Static image. Static video. Camera movement. Blur. Change of frame.
Comments (12)
Do you think it’s possible to run this on a 3060 Ti with 8GB VRAM and 80GB of system RAM?
It may be possible if you apply some optimizations.
Read up on memory optimization for this process. It can use less than 8GB of VRAM, but generation will take longer (on your video card I think 50 steps will take 25-30 minutes for 49 frames of video).
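For anyone wondering what "optimizing" could look like in practice, here is a minimal sketch. It assumes inference runs through the Hugging Face diffusers `CogVideoXPipeline`, which the thread does not confirm. The base model id `THUDM/CogVideoX-5b` and the function name `load_low_vram_pipeline` are placeholders for illustration, not the author's actual setup.

```python
def load_low_vram_pipeline(model_id="THUDM/CogVideoX-5b", lora_path=None):
    """Build a CogVideoX pipeline tuned for low VRAM (assumed setup, not the author's)."""
    # Imports live inside the function so the sketch can be imported
    # without the heavy dependencies installed.
    import torch
    from diffusers import CogVideoXPipeline

    pipe = CogVideoXPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    if lora_path is not None:
        # Hypothetical local path to the downloaded LoRA file.
        pipe.load_lora_weights(lora_path)

    # Keep weights in system RAM and move submodules to the GPU only while
    # they run. This lowers VRAM use considerably but slows generation.
    pipe.enable_sequential_cpu_offload()
    # Decode the video latents in tiles and slices instead of all at once.
    pipe.vae.enable_tiling()
    pipe.vae.enable_slicing()
    return pipe
```

The trade-off matches the comment above: offloading trades speed for memory, so long generation times on an 8GB card are expected.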
@Cixiao I’m going to try it, and I’ll let you know if I get any results. Thanks so much.
Great work. Did you use validation while training this?
Well, I disabled it completely so as not to waste the rented server's time on it (generating one test video takes around 1~2 minutes on an H100).
Basically, I just put together a decent dataset and hope that everything will be fine.
Oh yeah, for training I use https://github.com/a-r-r-o-w/cogvideox-factory
Can I ask what kind of dataset ended up working for you? I had success with characters, but my NSFW stuff came out kind of flat. I'd love to discuss this with you if possible to see what insight I'm missing! Great work!
I collected about 100 NSFW blowjob videos and split them into 300~400 clips of 6 seconds and 49 frames each (I manually picked the usable clips and then programmatically removed clips that were 95% similar).
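The splitting and de-duplication steps could be sketched roughly like this. It is an illustration under assumptions, not the author's script: the clips are taken as consecutive, non-overlapping windows, and similarity is measured as cosine similarity between per-clip feature vectors, which you would have to compute yourself (for example from downscaled frames). All function names are hypothetical.

```python
import math

def clip_windows(duration_s, clip_s=6.0):
    """Return (start, end) times of consecutive, non-overlapping clips."""
    count = int(duration_s // clip_s)
    return [(i * clip_s, (i + 1) * clip_s) for i in range(count)]

def frame_timestamps(start_s, clip_s=6.0, num_frames=49):
    """Evenly spaced timestamps across one clip, both endpoints included."""
    step = clip_s / (num_frames - 1)  # 6 s / 48 intervals = 0.125 s (8 fps)
    return [start_s + i * step for i in range(num_frames)]

def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def drop_near_duplicates(features, threshold=0.95):
    """Greedy filter: keep a clip only if it is below the similarity
    threshold against every clip kept so far. Returns kept indices."""
    kept = []
    for idx, feat in enumerate(features):
        if all(cosine(feat, features[k]) < threshold for k in kept):
            kept.append(idx)
    return kept
```

For instance, `clip_windows(20)` yields three 6-second windows and drops the 2-second remainder. The manual selection step mentioned above still has to happen by eye.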
Then, using an LLM, joy-caption, and CogVLM, I made prompts for the videos.
Also, 70~80 percent of the training data was 3D animations and vertical videos (I turned the vertical ones into horizontal videos by adding black bars on the sides and writing "Vertical video with black bars on the sides" in the prompt).
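The black-bar trick can be sketched as follows. The 720x480 target size is an assumption (the thread does not state the training resolution), and the helper names are made up for illustration. The output is a filter string in ffmpeg's `scale`/`pad` syntax that you would pass to `-vf`.

```python
def pillarbox(src_w, src_h, dst_w=720, dst_h=480):
    """Fit a source frame inside the target frame, preserving aspect ratio.
    Returns (new_w, new_h, pad_x, pad_y); sizes are rounded down to even."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = int(src_w * scale) // 2 * 2
    new_h = int(src_h * scale) // 2 * 2
    pad_x = (dst_w - new_w) // 2  # width of the left black bar
    pad_y = (dst_h - new_h) // 2  # height of the top bar (0 for vertical input)
    return new_w, new_h, pad_x, pad_y

def ffmpeg_filter(src_w, src_h, dst_w=720, dst_h=480):
    """Build an ffmpeg -vf string that scales, then pads with black."""
    new_w, new_h, pad_x, pad_y = pillarbox(src_w, src_h, dst_w, dst_h)
    return f"scale={new_w}:{new_h},pad={dst_w}:{dst_h}:{pad_x}:{pad_y}:black"

def tag_caption(caption):
    """Append the note the author says they added to such prompts."""
    return caption.rstrip() + " Vertical video with black bars on the sides."
```

For a 1080x1920 source, `ffmpeg_filter(1080, 1920)` gives `scale=270:480,pad=720:480:225:0:black`. Tagging the caption tells the model the bars belong to the format rather than the scene.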
@Cixiao Ingenious. I had collected about 200 NSFW videos but only used the first segment of each and hadn't considered splitting them up into more clips. I also hadn't used any portrait videos, opting for landscape only. This gives me good motivation to try refining the dataset for a better result. Thank you!
Any plans for Xfun LoRA versions? That Cog is 1000 times more versatile: it accepts any size, ratio, and frame duration... and it's also faster.
If you want to chat a little about it, just drop a Discord or something. I'm sure you may be interested in some topics 😉
Can you share the workflow, please?
Hello, one question: what would be the ideal configuration for a 12GB 3060?
If you guys are having trouble with any of this guy's stuff, I suggest reading the 20 absolutely necessary articles that this gatekeeping nobody has written. They are a lot of help.