Qwen-Image-Edit Low-Resolution Input Repair LoRA
Model Introduction
Qwen-Image-Edit is a powerful open-source image editing model. However, when the input image's resolution is lower than the target resolution of the generated image, the model preserves image details poorly. To address this, we made the following two modifications:
1. RoPE interpolation: The position encoding of the input image in the Qwen-Image DiT is replaced with an interpolated sampling of the position encoding at the target resolution. This modification takes effect on its own, independently of modification 2.
2. LoRA fine-tuning: A LoRA is trained quickly to help the DiT generalize to this interpolated encoding.
With these two modifications, the model can produce consistent edited images even when given low-resolution input. In addition, inference is significantly faster than with a high-resolution input.
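To illustrate what "interpolated sampling of the position encoding" can mean, here is a minimal sketch in plain Python. It is not the DiffSynth-Studio implementation. The function names, the endpoint-aligned linear spacing, the single-axis rotary formula, and the default `theta=10000.0` are all assumptions made for illustration. The idea shown is that a low-resolution input grid is assigned fractional positions spanning the same coordinate range as the target grid, and that rotary angles can be evaluated at those fractional positions.

```python
import math


def interpolated_positions(input_len, target_len):
    """Spread input_len token positions evenly over the target axis [0, target_len - 1].

    Assumption: endpoint-aligned linear spacing. The real model may align differently.
    """
    if input_len == 1:
        return [(target_len - 1) / 2.0]
    step = (target_len - 1) / (input_len - 1)
    return [i * step for i in range(input_len)]


def rope_angles(position, dim, theta=10000.0):
    """Rotary angles for one (possibly fractional) position along a single axis."""
    return [position * theta ** (-2.0 * k / dim) for k in range(dim // 2)]


def input_image_grid(input_hw, target_hw):
    """(row, col) positions for each input token, expressed in target-grid coordinates."""
    in_h, in_w = input_hw
    tgt_h, tgt_w = target_hw
    rows = interpolated_positions(in_h, tgt_h)
    cols = interpolated_positions(in_w, tgt_w)
    return [[(r, c) for c in cols] for r in rows]


# Hypothetical sizes: a 32x32-token input edited toward a 64x64-token target.
grid = input_image_grid((32, 32), (64, 64))
first_row, first_col = grid[0][0]
last_row, last_col = grid[-1][-1]
angles = rope_angles(grid[1][1][0], dim=8)
```

With this scheme the corner tokens of the input land on the corners of the target grid, so the model sees the input as covering the same spatial extent as the output, only sampled more coarsely.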
Source: https://modelscope.cn/models/DiffSynth-Studio/Qwen-Image-Edit-Lowres-Fix
Comments (8)
I don't know why this isn't way more popular, but it's an amazing, must-have fix for every workflow.
Thank you so much for this!
Glad you like it!
How does it work? What exactly is this LoRA doing? If I upload a small image, will I get the exact facial features in the output?
When the input image resolution is lower than the target resolution, this does an interpolated upscale on the input before processing, giving the model more pixel info and better outputs. It does some other stuff, but I don't understand the whitepaper enough to comment on that.
@Cyph3r I don't see what difference it makes with or without the LoRA.
@soyv4 OK, let me refund all the money you paid.
Will this work with 2509?
"RoPE interpolation: The position encoding of the input image in Qwen-Image DiT is changed to an interpolated sampling of the position encoding at the target resolution. This modification can take effect independently of modification 2." How do I do this in ComfyUI?
