This is the official repository of Diff-Harmonization. We are working 🏃 on further improvements to this method (see Appendix D of the paper) to provide a better user experience, so stay tuned for updates.
If you have any questions about the paper, please feel free to contact us: create an issue or send an email to windvchen@gmail.com. Ideas and discussion are also very welcome.
BTW: While waiting for the final code, you may want to check out our 😊INR-Harmonization work, whose final code we recently released. Based on Implicit Neural Representation, it is the first dense pixel-to-pixel method applicable to high-resolution (~6K) images without any hand-crafted filter design.
[07/18/2023] Repository init.
- Code release
- Gradio demo release
Recent image harmonization methods have demonstrated promising results. However, due to their heavy reliance on large numbers of composite images, these methods are expensive to train and often fail to generalize to unseen images. In this paper, we draw lessons from human behavior and propose a zero-shot image harmonization method. Specifically, when harmonizing an image, a human mainly relies on a long-term prior of what harmonious images look like and edits the composite image toward that prior. To imitate this, we resort to pretrained generative models for a prior on natural images. To guide the harmonization direction, we propose an Attention-Constraint Text that is optimized to well describe the image environment. Further designs are introduced to preserve the foreground content structure. The resulting framework, highly consistent with human behavior, achieves harmonious results without burdensome training. Extensive experiments demonstrate the effectiveness of our approach, and we also explore several interesting applications.
If you find this paper useful in your research, please consider citing:
    @misc{chen2023zeroshot,
          title={Zero-Shot Image Harmonization with Generative Model Prior},
          author={Jianqi Chen and Zhengxia Zou and Yilan Zhang and Keyan Chen and Zhenwei Shi},
          year={2023},
          eprint={2307.08182},
          archivePrefix={arXiv},
          primaryClass={cs.CV}
    }
This project is licensed under the Apache-2.0 license. See LICENSE for details.