CivArchive

    🌍Welcome to join the QQ groups "兔狲·AIGC梦工北厂" (group number: 780132897) and "兔狲·AIGC梦工南厂" (group number: 835297318); the entry answer is 兔狲. Telegram group chat "兔狲的SDXL百老汇": https://t.me/+KkflmfLTAdwzMzI1

    🚨Recommended parameters for FilmGirl Ultra:

    Clip skip: 1

    CFG scale: 9

    Direct output image resolution: ~500,000 pixels (640x768)
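    As a rough illustration of the ~500,000-pixel guideline above, the sketch below picks a width and height for a given aspect ratio. The helper name `resolution_for` and the rounding to multiples of 64 are assumptions made for this example; the page itself only gives 640x768 (491,520 pixels) as the reference size.

```python
import math

# Settings quoted from the recommendations above.
RECOMMENDED = {"clip_skip": 1, "cfg_scale": 9, "target_pixels": 500_000}


def resolution_for(aspect_ratio, target_pixels=RECOMMENDED["target_pixels"], multiple=64):
    """Return (width, height) close to target_pixels for a width/height ratio.

    Snapping to a multiple of 64 is an assumption for this sketch,
    not something the model page specifies.
    """
    height = math.sqrt(target_pixels / aspect_ratio)
    width = aspect_ratio * height

    def snap(value):
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    w, h = resolution_for(640 / 768)
    print(w, h, w * h)
```

    For the portrait ratio 640/768 this reproduces the recommended 640x768; other ratios land on nearby sizes with a similar pixel count.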

    2024.2.29 Introducing "FilmGirl Ultra": say goodbye to the AI face of SD1.5

    On February 24th last year, I completed the first version of FilmGirl LoRA. This LoRA was my first model to achieve a high download volume and marks the beginning of my dream in AI. Since the launch of SDXL, I have devoted a great deal of effort to improving the HelloWorld and AIArt XL models. It has also been 8 months since the FilmGirl series was last updated.

    In fact, whether with FilmGirl, the later Polaroid LoRA, or HelloWorld XL, I have always pursued the ultimate in photorealism. A whole year has now passed, and to mark the first anniversary I have decided to release a model that raises the photorealism of SD1.5 to new heights. A LoRA is no longer sufficient for this mission; the new FilmGirl Ultra is an SD1.5 base model.

    To break away completely from the homogenization of SD1.5 photorealistic models and the AI-face problem, FilmGirl Ultra did not use basilmix, chilloutmix, or their descendants as the training base model. Instead, it uses SPIN-Diffusion, newly released by UCLA. SPIN-Diffusion is an SD1.5 base model fine-tuned by self-play on the winner images of the pickapic_v2 dataset. It outperforms both the original SD1.5 base model and the SD1.5 DPO base model, and its prompt alignment is far better than that of heavily fine-tuned and merged base models such as Chilloutmix.

    The training set for FilmGirl Ultra comes from HelloWorld XL. In fact, the first version of HelloWorld XL also used the training set from the last version of FilmGirl LoRA. Throughout this year, I have been meticulously accumulating and selecting this training set, which now totals ~10,000 images. The training process for FilmGirl Ultra utilized multiple labeling methods, including GPT4V natural language captions, GPT4V tag-style captions, and Blip+Clip captions. To ensure that the model is compatible with the commonly used prompts "1girl", "best quality", and "masterpiece", these terms were also appropriately added to some images (but you can still accurately trigger the effect of a little girl with "little girl/child girl"). The reason for using multiple sets of labels is to maximize the likelihood of triggering the desired effect. As part of the FilmGirl tradition, the film style has been given special attention, and you can trigger this style with the prompt "film grain analog photography".

    This model underwent a total of 7 training phases, with different batch sizes, optimizers, learning rates, and training set ratios used in each phase to achieve the current effect. If anyone is interested in fine-tuning SPIN-Diffusion, I recommend that your total training iterations should exceed 50,000 steps; in fact, I trained for about 100,000 steps with batch sizes ranging from 40 to 64.
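    To put those step counts in perspective, the arithmetic can be sketched as follows. This assumes every step draws a full batch from the whole ~10,000-image set, which is a simplification: the training set ratios differed between phases, so treat the result as a rough scale estimate only. The function names are made up for this example.

```python
def samples_seen(steps, batch_size):
    """Total images processed: one batch per training step."""
    return steps * batch_size


def epoch_equivalents(steps, batch_size, dataset_size):
    """How many full passes over the dataset the run corresponds to."""
    return samples_seen(steps, batch_size) / dataset_size


if __name__ == "__main__":
    # Figures from the text: ~100,000 steps, batch size 40 to 64, ~10,000 images.
    for batch_size in (40, 64):
        seen = samples_seen(100_000, batch_size)
        passes = epoch_equivalents(100_000, batch_size, 10_000)
        print(batch_size, seen, passes)
```

    Under these assumptions the run corresponds to between 4 million and 6.4 million image presentations, or roughly 400 to 640 passes over the training set.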

    The photorealistic effect of FilmGirl Ultra exceeded my expectations and is now close to the image quality of SDXL. Below is a comparison of this model with Realistic Vision v6 and epiCPhotoGasm. The former is currently the most downloaded SD1.5 base model on Civitai, and the latter is the one I have long considered the most photorealistic SD1.5 base model. I pay tribute to these two excellent base models and their creators.

    close-up couple's portrait,African young woman and man,clear skin face,looking at camera,fashion photography,simple background
    Negative prompt: watermark,anime,cartoon,open mouth

    close-up couple's portrait,African little girl and boy,clear skin face,looking at camera,fashion photography,simple background,
    Negative prompt: watermark,anime,cartoon,open mouth,

    Thanks to GPT4V captions and the SPIN-Diffusion base model, the model's prompt alignment performance is excellent. Below are some xy plot tests for different concepts.

    Ethnic test

    Body shape test

    Skin color test

    Age test

    Animal test

    However, FilmGirl Ultra doesn't lead in every dimension. It started from a new base and so gave up the continuous optimization and refinement that the community has applied to SD1.5 base models over the past year. Through extensive testing and comparison, I found that this base model has a higher rate of limb errors than the community's mature photorealistic models. Also, because the training set lacks anime content, output quality is poor when your prompts include ACGN-related tags. I recommend avoiding words like "digital art", "anime", and "cartoon". These two issues are currently FilmGirl Ultra's main shortcomings.

    FilmGirl Ultra is a summary of the first year of my AI journey and a gift to the AI enthusiasts who have supported me. The open-source community has brought me many friends, memories, joy, and knowledge, and I hope to give a little back. Everyone is welcome to train on top of FilmGirl Ultra or merge it into their own models. If you find this model helpful in improving your own, please mention it in your model description. I hope that FilmGirl Ultra and SPIN-Diffusion will become more widely known and used.

    FilmGirl Ultra will continue to be updated. Enjoy using it!

    Hope we can continue to progress with AI, and meet here again this time next year!

    I hope the notes above on how the model was made are of some help. Like its training base model SPIN-Diffusion, this model must be used in accordance with the Apache-2.0 license; violations will be pursued.

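    Copyright notice:

    The FilmGirl Ultra series of models (hereinafter "the Model") is an SD1.5 base model developed by me (hereinafter "the Owner") based on SPIN-Diffusion.

    The Owner authorizes individuals and organizations to use images generated by the Model free of charge for non-commercial educational or informational purposes, provided that they:

    - Comply with applicable laws and do not infringe the lawful rights and interests of the Model or any third party.

    - Credit the image source as "Generated by LEOSAM's FilmGirl Ultra model" when using the images.

    For commercial use, a commercial license agreement must first be signed with the Owner. For commercial licensing and model customization, please contact the Owner through the information on the Owner's Civitai profile page.

    The Owner will continue to provide FilmGirl Ultra updates free of charge to individual users, as an expression of support and thanks to the community's open-source contributors. Paid cooperation with commercial users is an important driving force behind the Model's development and continued improvement. Thank you to every user for your understanding and support.

    Please note that any unauthorized use may violate applicable laws and may incur legal liability. The right of final interpretation of this notice belongs to the Owner, subject to applicable laws and regulations.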

    Description

    The VELVIA model is a LoRA model trained using a high-resolution dataset and a new tagging method.

    Updates include:

    • The photo training dataset has been reselected, and regularization images are no longer used.

    • The training dataset's resolution has been increased from 512x704 to 640x960.

    • A new tagging method and pyramid noise have been adopted.
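    For readers unfamiliar with pyramid noise (also called multi-resolution noise), the sketch below shows the general idea in plain Python: Gaussian noise is drawn at several resolutions, the coarser layers are upsampled and added with decreasing weight, and the sum is rescaled to unit standard deviation. This is a simplified illustration and not the author's training code; the function name, the nearest-neighbor upsampling, and the `levels` and `discount` defaults are all assumptions for this example.

```python
import random
import statistics


def pyramid_noise(height, width, levels=4, discount=0.5, seed=None):
    """Return a height x width grid of multi-resolution Gaussian noise."""
    rng = random.Random(seed)
    # Level 0: ordinary per-pixel Gaussian noise.
    total = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(height)]
    for level in range(1, levels):
        scale = 2 ** level
        h, w = max(1, height // scale), max(1, width // scale)
        # Coarser noise grid; each value covers a scale x scale patch.
        coarse = [[rng.gauss(0.0, 1.0) for _ in range(w)] for _ in range(h)]
        weight = discount ** level
        for y in range(height):
            cy = min(y // scale, h - 1)
            for x in range(width):
                cx = min(x // scale, w - 1)
                # Nearest-neighbor upsampling of the coarse layer.
                total[y][x] += weight * coarse[cy][cx]
    # Rescale so the combined noise has unit standard deviation.
    std = statistics.pstdev(v for row in total for v in row)
    return [[v / std for v in row] for row in total]


if __name__ == "__main__":
    noise = pyramid_noise(32, 32, seed=0)
    print(len(noise), len(noise[0]))
```

    Compared with plain per-pixel noise, the added coarse layers introduce low-frequency variation, which is the property such training noise schemes rely on.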

    Comments (25)

    JnyArt · May 27, 2023
    CivitAI

    rip real girls, nice lora btw

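    movefast · May 27, 2023
    CivitAI

    So prolific! I haven't checked in for a few days, and at this update pace even the hardest-working mule would be out of a job. There are just too many versions, though, and I have no idea which one works best. Please write a bit more documentation.

    LEOSAM
    Author
    May 28, 2023

    I'd personally still recommend this latest one most; it's much better balanced across the board.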

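    GNFY · May 28, 2023
    CivitAI

    MewX AI has launched a "film girl" model, and at a glance I'm fairly sure it was copied from yours.

    shanjiaoyu520 · May 29, 2023

    MewX AI just takes models and images that are popular on Douyin and Xiaohongshu and reposts them directly.

    MengX · May 29, 2023

    @shanjiaoyu520 How is that any different from theft? Image thieves are truly disgusting!

    karius828 · May 29, 2023

    Those domestic SD wrapper apps have been stealing models for a long time; the real problem is that there's no way to enforce your rights.

    LEOSAM
    Author
    May 30, 2023

    Thanks for the heads-up. I took a look, and it does appear to be using this LoRA. But enforcing rights in a case like this really is too hard; I can only hope that companies using it commercially without permission go out of business soon.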

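    MengX · May 29, 2023
    CivitAI

    This is arguably my favorite LoRA. I've used it from the earliest version until now, and nearly every version gives good results and can produce lots of good-looking faces. I don't even need to say how awesome it is. The only ones I've used less are the two LyCORIS releases.

    LEOSAM
    Author
    May 30, 2023

    Thanks for the support!! So happy!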

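    Curtis_1995 · May 30, 2023 · 2 reactions
    CivitAI

    Two questions from a beginner: 1. Is it because the LoRA's training material included faces? With the same seed, Western faces change a lot and take on some Asian features when the weight is raised. 2. If the goal is purely a film look, could that be achieved through the VAE alone? Hoping for an answer.

    JoeanAmier · May 31, 2023
    CivitAI

    It feels like images break easily at weights above 0.5; G4 doesn't have this problem.

    LEOSAM
    Author
    Jun 1, 2023

    That's because G4 is a merged LoRA, while this new version was trained directly. A directly trained LoRA gives better facial realism than a merged one, but its rate of correct outputs is slightly lower. In my tests, though, images don't break at 0.5; even at 0.9 there's no severe overfitting.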

    mramer723 · Jun 1, 2023 · 1 reaction
    CivitAI

    Sorry, I am confused, what is the file NegStd3.safetensors for?

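    izitn · Jun 1, 2023 · 5 reactions
    CivitAI

    I really like this series.

    Based on my experience so far, and after running several more XYZ plots today, my overall favorite is FilmG2, i.e. NegStd1. This version has a noticeable effect on tonality while keeping its influence on faces within a reasonable range. NegStd2 and NegStd3 affect faces very little, but their effect on tonality is a bit too weak. The Astia series, which I believe is an early version of FilmG, also renders tonality well, though at high weights it affects the scene more. The Provia version has great tonality but affects faces too much. Velvia seems to be another well-performing version for me, though it has a subtle effect on face shape.

    LEOSAM
    Author
    Jun 1, 2023

    Thank you for the review! Very detailed and thoughtful! I've also been testing different training and captioning methods with each major version, and I hope to gradually work out a mature method for training photographic texture.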

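    mamadaochenggong5801 · Jun 2, 2023 · 1 reaction
    CivitAI

    Boss, a question: what graphics card is the minimum for training models? Please recommend one. And which card do you use?

    fuyunking · Jun 3, 2023

    Bro, ideally start with a 3080. Lower-end cards can work too, but their VRAM is small, and the 3060 12G's performance isn't great.

    karius828 · Jun 3, 2023 · 1 reaction

    You can train on Google Colab for free, but not for too long; beyond about 4 hours, free users get cut off. The hardware is a T4 GPU with 16 GB of VRAM.

    LEOSAM
    Author
    Jun 5, 2023

    The minimum is an NVIDIA card with 6 GB of VRAM, such as a 2060. But that only allows a batch size of 1, which is slow. Since early on you need to test different training methods and run into pitfalls, it's best to choose a card with more than 10 GB and use a batch size of 3, which takes roughly 1/3 the time of batch size 1.

    FangArtwork · Jun 6, 2023

    I use Google Colab too, on the free tier, and I've trained 4 times. There's a tutorial on Civitai, with a script also written by this author.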

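    KyrieKal · Jun 4, 2023 · 2 reactions
    CivitAI

    Have you considered making a LoRA with the difference-training method proposed by 2vXpSwA7? My idea is to first denoise the source images with image-editing software, overfit on the denoised images as the first step of difference training, and then run difference training with the original images. I tried it myself, but the results weren't great, possibly because the material I found wasn't good enough.

    LEOSAM
    Author
    Jun 5, 2023

    I saw 青龙's video too, and it really is an unusual training method. I think this approach can definitely train something very close to adding grain or other filters in image-editing software. A functional LoRA like that is quite interesting, but I don't plan to train LoRAs this way, mainly because exploring new methods is too much of a grind, hahaha.

    izitn · Jun 6, 2023

    @LEOSAM After watching 青龙's video I had an idea: Fujifilm cameras have a film simulation feature, so you could train separately on the cameras' unprocessed originals and on the film-simulated photos, then use difference training to produce a new LoRA, making a LoRA for each film simulation. One open question remains to be verified: since the originals aren't generated by the base model, can the difference be captured precisely?

    LEOSAM
    Author
    Jun 10, 2023

    @izitn I saw that someone on Civitai released a similar LoRA (https://civitai.com/models/87080/film-simulator-lora). I haven't tested it in detail yet, so I don't know how well it works. I've been using difference training recently too, but I'm trying to train a clothing add/remove function...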