Nova Flat XL
Nova Flat XL is an anime SDXL checkpoint that aims for a flat style (not flat coloring).
All example images were created with diffusers, with custom PNG tags.
Rules
You cannot use generated images commercially if they are unedited (or edited only by converting them to black and white)
You can share images without any restriction as long as you don't monetize them
Promoting this model elsewhere is always welcome
Recommended Settings
Sampler: Euler a
Steps: 20~30
Clip Skip: 1-2
Denoising Strength: 0.65 - 0.8
CFG Scale: 4~6
Prompt: masterpiece, best quality, amazing quality, 4k, very aesthetic, high resolution, ultra-detailed, absurdres, newest, scenery, {Prompt}, BREAK, depth of field, volumetric lighting
Negative Prompts: modern, recent, old, oldest, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, disfigured, long body, lowres, bad anatomy, bad hands, missing fingers, extra digits, fewer digits, cropped, very displeasing, (worst quality, bad quality:1.2), sketch, jpeg artifacts, signature, watermark, username, signature, simple background, conjoined, bad ai-generated
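As a rough illustration of how the {Prompt} placeholder in the recommended prompt might be filled in, here is a minimal sketch. The helper name build_prompt and the two constants are hypothetical; only the tag strings come from the settings above.

```python
# Hypothetical helper: build_prompt and the constant names are illustrative.
# Only the tag strings are taken from the recommended settings.
QUALITY_PREFIX = (
    "masterpiece, best quality, amazing quality, 4k, very aesthetic, "
    "high resolution, ultra-detailed, absurdres, newest, scenery"
)
DETAIL_SUFFIX = "depth of field, volumetric lighting"


def build_prompt(subject: str, use_break: bool = True) -> str:
    """Insert the subject tags where {Prompt} sits in the recommended prompt."""
    # Trim stray whitespace and commas so the joined prompt has clean separators.
    parts = [QUALITY_PREFIX, subject.strip().strip(",").strip()]
    if use_break:
        # BREAK is only meaningful to front ends / embedders that support it.
        parts.append("BREAK")
    parts.append(DETAIL_SUFFIX)
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt("1girl, forest, white butterfly"))
```

Passing use_break=False yields the same prompt without the BREAK keyword, which may be useful with tools that do not interpret it.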
Description
Improved pose structure
Noob EPS 1.1 + Illustrious 1.1 DARE applied
Comments (7)
In the very first test, prompt adherence sucks. Is this thing limited to just the stupid booru tags?
Compared with my other models, this one follows prompts differently and probably has weaker adherence
I'll try my best to improve this in a future version
Hello, I use the diffusers library for text-to-image, and when I use exactly the same prompt and parameters as the example image, the result is very different from the example.
By the way, I use the compel library to work around the prompt length limit.
Is there a step I handled incorrectly?
=========The code I use:
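import torch
from compel import Compel, ReturnedEmbeddingsType
from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLPipeline

# Local SDXL config directory and single-file checkpoint
config_path = "anime_illust_diffusion_xl"
model_path = "Nova_Flat_XL_02.safetensors"
pipe = StableDiffusionXLPipeline.from_single_file(
    model_path,
    dtype=torch.bfloat16,
    config=config_path,
    local_files_only=True)
# "Euler a" sampler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
    pipe.scheduler.config
)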
prompt="masterpiece, best quality, amazing quality, very aesthetic, high resolution, ultra-detailed, absurdres, newest, scenery, colorful, rim light, backlit, cosmic sky, aurora, chaos, fashion photography of busty cute girl, (cute:1.2), intense long pink hair, long hair, choppy bangs, nebulae cosmic purple eyes, rimlit eyes, dynamic pose, bokeh, purple serafuku with big red ribbon, red annular solar eclipse halo, perfect night, fantasy background, looking at viewer, light smile, glowing star in hand, (colorful light particles:1.2), (face focus:0.7), from below, dutch angle, upper body, head tilt, BREAK, fingers, detailed background, blurry foreground, depth of field, volumetric lighting"
negative_prompt = "modern, recent, old, oldest, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, disfigured, long body, lowres, bad anatomy, bad hands, missing fingers, extra fingers, extra digits, fewer digits, cropped, very displeasing, (worst quality, bad quality:1.2), sketch, jpeg artifacts, signature, watermark, username, (censored, bar_censor, mosaic_censor:1.2), simple background, conjoined, bad ai-generated"
compel = Compel(tokenizer=[pipe.tokenizer, pipe.tokenizer_2] , text_encoder=[pipe.text_encoder, pipe.text_encoder_2], returned_embeddings_type=ReturnedEmbeddingsType.PENULTIMATE_HIDDEN_STATES_NON_NORMALIZED, requires_pooled=[False, True])
prompt_embeds, pooled_prompt_embeds = compel(prompt)
negative_prompt_embeds, negative_pooled_prompt_embeds = compel(negative_prompt)
generator = torch.Generator("cuda").manual_seed(0)
with torch.no_grad():
    images = pipe(
        prompt_embeds=prompt_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        negative_prompt_embeds=negative_prompt_embeds,
        negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
        height=1216,
        width=832,
        num_inference_steps=20,
        guidance_scale=4.5,
        num_images_per_prompt=2,
        generator=generator
    ).images
Use sd_embed instead
The example notebook can be created via Model Merge Scripter
https://civitai.com/articles/12245/crodys-model-merge-guide-team-c
In short, use sd_embed instead of compel
That way you can use all the prompt features of other systems in diffusers, including BREAK
@Crody I'm surprised at how quickly you replied. I found sd_embed over lunch and was about to revise this comment.
Now I've modified the code as follows, but the style is still quite different from the example image:
ori_image: https://civitai.com/images/69462418
my_image: https://imgur.com/a/FgpzaAv
=========The code I use:
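import torch
from diffusers import EulerAncestralDiscreteScheduler, StableDiffusionXLPipeline
from sd_embed.embedding_funcs import get_weighted_text_embeddings_sdxl

# Local SDXL config directory and single-file checkpoint
config_path = "anime_illust_diffusion_xl"
model_path = "Nova_Flat_XL_02.safetensors"
pipe = StableDiffusionXLPipeline.from_single_file(
    model_path,
    dtype=torch.bfloat16,
    config=config_path,
    local_files_only=True)
# "Euler a" sampler
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
    pipe.scheduler.config
)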
prompt="masterpiece, best quality, amazing quality, very aesthetic, high resolution, ultra-detailed, absurdres, newest, scenery, (dappled sunlight:1.2), rim light, backlit, dramatic shadow, 1girl, long blonde hair, blue eyes, shiny eyes, parted lips, medium breasts, puffy sleeve white dress, forest, flowers, white butterfly, looking at viewer, leaning side against tree, vines, green, arms, upper body, close-up, dutch angle, shiny skin, BREAK, eyes, lips, dramatic shadow, detailed eyes, detailed hair, depth of field, vignetting, volumetric lighting"
negative_prompt = "modern, recent, old, oldest, cartoon, graphic, text, painting, crayon, graphite, abstract, glitch, deformed, mutated, ugly, disfigured, long body, lowres, bad anatomy, bad hands, missing fingers, extra fingers, extra digits, fewer digits, cropped, very displeasing, (worst quality, bad quality:1.2), sketch, jpeg artifacts, signature, watermark, username, (censored, bar_censor, mosaic_censor:1.2), simple background, conjoined, bad ai-generated"
clip_skip = 2
(
    prompt_embeds,
    prompt_neg_embeds,
    pooled_prompt_embeds,
    negative_pooled_prompt_embeds
) = get_weighted_text_embeddings_sdxl(
    pipe,
    prompt=prompt,
    neg_prompt=negative_prompt,
    clip_skip=clip_skip)
generator = torch.Generator("cuda").manual_seed(1909963142)
with torch.no_grad():
    images = pipe(
        prompt_embeds=prompt_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        # sd_embed returns the negative embeddings as prompt_neg_embeds
        negative_prompt_embeds=prompt_neg_embeds,
        negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
        height=1216,
        width=832,
        num_inference_steps=20,
        guidance_scale=4.5,
        num_images_per_prompt=2,
        generator=generator
    ).images
@baijin02 Use the CPU for the generator instead
e.g. generator = torch.Generator("cpu").manual_seed(1909963142)
The pipe itself stays on CUDA; only the generator is on the CPU
Also, I generate at 768x1344 and then run a hires pass at x1.5 with 0.4 denoise and 40 steps
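The two suggestions above (a CPU-seeded generator and a x1.5 hires pass) can be sketched roughly as follows. The helper names are hypothetical, and rounding the target size down to a multiple of 8 is my own assumption, not something stated in the reply. The upscaled image would then go through an img2img pass at about 0.4 strength, which is not shown here.

```python
def hires_size(width: int, height: int, scale: float = 1.5, multiple: int = 8) -> tuple[int, int]:
    """Scale a base resolution and round each side down to a multiple of `multiple`.

    The multiple-of-8 rounding is an assumption made for this sketch.
    """
    def snap(value: float) -> int:
        return int(value // multiple) * multiple

    return snap(width * scale), snap(height * scale)


def make_cpu_generator(seed: int):
    """Create a CPU-side generator so the initial noise does not depend on the GPU."""
    import torch  # imported lazily so hires_size can be used without torch installed

    return torch.Generator("cpu").manual_seed(seed)


if __name__ == "__main__":
    # Base resolution mentioned in the reply, upscaled by x1.5
    print(hires_size(768, 1344))
```

The pipeline itself can stay on CUDA. Only the generator passed to it is created on the CPU.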
I have resolved the issues above; they were caused by how the diffusers library is implemented.
1. The diffusers library's SDXL pipeline already defaults to clip=-2. This means that if you pass clip=-2, the effective value is clip=-4. You should pass clip=0 instead.
2. The diffusers library's infinite-length prompt pipeline handles BREAK incorrectly; you can remove BREAK from the prompt.
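The clip-skip offset described in this comment can be worked through with a small sketch. It mirrors my understanding of how the diffusers SDXL pipeline picks the text-encoder layer from its clip_skip argument (an assumption; sd_embed or other embedders may map the value differently). Both helper names are hypothetical.

```python
from typing import Optional


def sdxl_hidden_state_index(clip_skip: Optional[int]) -> int:
    """Return the text-encoder hidden-state index assumed to be used for a given clip_skip.

    Assumption: with no clip_skip the penultimate layer (-2) is used,
    and any clip_skip value is counted on top of that offset.
    """
    if clip_skip is None:
        return -2
    return -(clip_skip + 2)


def strip_break(prompt: str) -> str:
    """Remove standalone BREAK tags from a comma-separated prompt."""
    tags = [tag.strip() for tag in prompt.split(",")]
    return ", ".join(tag for tag in tags if tag and tag != "BREAK")


if __name__ == "__main__":
    for value in (None, 0, 2):
        print(value, sdxl_hidden_state_index(value))
    print(strip_break("upper body, BREAK, eyes, lips"))
```

Under this assumption, passing 2 lands on layer -4, while passing 0 (or nothing) stays on the penultimate layer, which matches the observation in the comment.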





