DiffArtist: Towards Structure and Appearance Controllable Image Stylization

ACM MM 2025

The Hong Kong Polytechnic University, *Corresponding Author

Gallery

TL;DR: DiffArtist is a zero-shot, text-driven stylization method. It requires no fine-tuning and no additional ControlNets. Unlike existing stylization methods, which entangle structure and appearance, DiffArtist lets the strength of each be controlled independently.

Method Overview

DiffArtist decomposes structure and appearance stylization into two separate denoising processes, allowing fine-grained control over each.

Disentangled Structure and Appearance Control

DiffArtist enables disentangled control over structure and appearance. Compared with previous methods, this dual controllability allows for more customizable stylization.

Additional Control Visualizations

BibTeX

@article{jiang2024diffartist,
      title={DiffArtist: Towards Structure and Appearance Controllable Image Stylization},
      author={Jiang, Ruixiang and Chen, Changwen},
      journal={arXiv preprint arXiv:2407.15842},
      year={2024}
}