Version 4.0 was based on AnimagineXL. Version 4.1 was trained with PonyXL v6 as the base, using a dataset of over 24k images selected to teach concepts that were missing from the model. Version 4.2 is a merge of 4.1 and 4.0, with further training afterwards to stabilize the merge and add styles besides anime. As a result, use of this model must adhere to the licenses of both Pony and Animagine. The main focuses of this model are NSFW, holding weapons, and horror/monster themes (which is where the "unsafe" name comes from), but it is meant to be a generalist model capable of handling images with or without those concepts. There is also a focus on being able to prompt for facial expressions and general art styles.
Because the base is an anime model trained on booru tags, the text encoder was fully trained for this version, so it should have much better prompt adherence with booru tags but worse natural-language performance. Because the model has been trained on a variety of styles, you'll get more consistent results if you explicitly prompt for the style you want, such as "photorealistic", "realistic", "3d", "anime", "cel shading", "painterly", "1980s (style)", "oil painting (medium)".
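As a rough illustration of the prompting advice above, here is a minimal sketch that puts an explicit style tag first and joins booru-style tags with commas. The helper names (`escape_parens`, `build_prompt`) and the example tags are hypothetical, and the parenthesis escaping assumes a front end that treats bare parentheses as emphasis syntax; if yours does not, skip that step.

```python
def escape_parens(tag: str) -> str:
    """Backslash-escape parentheses so they are read literally.

    Assumption: the UI uses (parentheses) for emphasis, so tags such as
    "oil painting (medium)" need escaping to be passed through as-is.
    """
    return tag.replace("(", r"\(").replace(")", r"\)")


def build_prompt(style: str, tags: list[str]) -> str:
    """Put the style tag first, then the content tags, comma-separated."""
    parts = [style, *tags]
    # Drop empty entries and trim stray whitespace before joining.
    return ", ".join(escape_parens(p.strip()) for p in parts if p.strip())


# Hypothetical example tags, chosen only for illustration.
prompt = build_prompt(
    "oil painting (medium)",
    ["1girl", "holding sword", "pauldrons", "horror (theme)"],
)
print(prompt)
```

Swapping the first argument for "photorealistic" or "anime" is the intended way to switch styles while keeping the rest of the prompt unchanged.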
Description
I trained separate models on top of AnimagineXL and VenusXL, merged those together, merged the result with UnsafeXL 1.0, and then trained further on top of that. The dataset of around 3.5k images now contains some photorealistic images, as well as things such as underwater scenes, "horror \(theme\)", and individual pieces of armor (such as pauldrons, rerebraces, vambraces, faulds, etc.). All images are also tagged more consistently with "anime" or "photorealistic", so it should be easier to prompt for those styles. I also tried something new: I trained a model on images that were low quality or had bad anatomy, only for the first 200 noising steps (similar to how the SDXL refiner works), and then merged it at negative weight.
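To make the negative-weight merge more concrete, here is a minimal sketch of one common way such a merge can be expressed: add a weighted difference between a tuned model and the base, so that a negative weight pushes the result away from what the tuned model learned. This is an assumption about the general idea, not the exact procedure used for this model; the function name `merge_with_weight`, the toy weights, and the value -0.5 are all hypothetical.

```python
def merge_with_weight(base: dict, tuned: dict, weight: float) -> dict:
    """Return base + weight * (tuned - base) for every parameter.

    With a negative weight the merged parameters move in the opposite
    direction of the change that the tuned model learned.
    """
    merged = {}
    for name, base_vals in base.items():
        tuned_vals = tuned[name]
        merged[name] = [b + weight * (t - b) for b, t in zip(base_vals, tuned_vals)]
    return merged


# Toy stand-ins for real checkpoints: one "layer" with two weights each.
base_model = {"layer.weight": [1.0, 2.0]}
bad_anatomy_model = {"layer.weight": [1.5, 1.0]}  # hypothetical values

merged = merge_with_weight(base_model, bad_anatomy_model, weight=-0.5)
print(merged)
```

A real checkpoint would hold tensors instead of lists, but the arithmetic per parameter would be the same under this assumption.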