Papers
arxiv:2512.13592

Image Diffusion Preview with Consistency Solver

Published on Dec 15 · Submitted by Fu-Yun Wang on Dec 16
Abstract

Diffusion Preview uses ConsistencySolver, a high-order trainable solver, to improve quality and consistency in low-step image generation, enhancing interactive user experiences.

AI-generated summary

The slow inference process of image diffusion models significantly degrades interactive user experiences. To address this, we introduce Diffusion Preview, a novel paradigm employing rapid, low-step sampling to generate preliminary outputs for user evaluation, deferring full-step refinement until the preview is deemed satisfactory. Existing acceleration methods, including training-free solvers and post-training distillation, struggle to deliver high-quality previews or ensure consistency between previews and final outputs. We propose ConsistencySolver, a lightweight, trainable high-order solver derived from general linear multistep methods and optimized via reinforcement learning, which enhances both preview quality and consistency. Experimental results demonstrate that ConsistencySolver significantly improves generation quality and consistency in low-step scenarios, making it ideal for efficient preview-and-refine workflows. Notably, it achieves FID scores on par with Multistep DPM-Solver using 47% fewer steps, while outperforming distillation baselines. Furthermore, user studies indicate our approach reduces overall user interaction time by nearly 50% while maintaining generation quality. Code is available at https://github.com/G-U-N/consolver.

Community

Paper author · Paper submitter

The slow inference process of image diffusion models significantly degrades interactive user experiences. We introduce Diffusion Preview, a novel preview-and-refine paradigm that generates rapid, low-step preliminary outputs for user evaluation, deferring full-step refinement until the preview is deemed satisfactory. This workflow enables users to quickly iterate through different prompts or random seeds with minimal computational cost, only triggering expensive full-step sampling when a preview meets their expectations.
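The preview-and-refine loop described above can be sketched in a few lines. Everything here is an illustrative stand-in, not the paper's API: `sample` is a deterministic stub keyed on (prompt, seed, steps) so the sketch is self-contained and runnable, and the step counts are placeholder values.

```python
import hashlib

# Stand-in for a diffusion sampler: deterministic in (prompt, seed, steps),
# so the same seed reproduces the same "generation" at any step count.
def sample(prompt: str, seed: int, steps: int) -> str:
    key = f"{prompt}|{seed}|{steps}"
    return hashlib.sha256(key.encode()).hexdigest()[:12]

PREVIEW_STEPS = 4   # cheap low-step preview (placeholder value)
REFINE_STEPS = 50   # expensive full-step generation (placeholder value)

def preview_and_refine(prompt: str, seeds, accept) -> str:
    """Iterate cheap previews; run the expensive pass only once accepted."""
    for seed in seeds:
        preview = sample(prompt, seed, PREVIEW_STEPS)
        if accept(preview):
            # Reuse the accepted seed (i.e., the same initial noise) so the
            # full-step result stays consistent with the preview.
            return sample(prompt, seed, REFINE_STEPS)
    raise RuntimeError("no preview accepted")
```

Reusing the accepted seed for the refinement pass is what lets the cheap preview act as a proxy for the final output; the cost of rejected candidates stays at `PREVIEW_STEPS` each.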

[Figure] Diffusion Preview framework: fast preview generation followed by full-step refinement.

To achieve high-quality and consistent previews, we propose ConsistencySolver, a learnable high-order ODE solver derived from Linear Multistep Methods and optimized via Reinforcement Learning. Unlike existing training-free solvers that rely on rigid numerical schemes or distillation methods that sacrifice consistency, ConsistencySolver dynamically adapts its integration strategy to maximize alignment between low-step previews and high-step reference generations, ensuring previews serve as reliable proxies for final outputs.
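For intuition, a linear multistep method advances the state using a fixed linear combination of past derivative evaluations; ConsistencySolver generalizes this by learning the combination. The sketch below is not the paper's solver: it uses the classical 2-step Adams-Bashforth coefficients (1.5, -0.5) as a stand-in for what a trainable scheme might replace, and a toy ODE rather than a diffusion model.

```python
import math

def multistep_solve(f, x0, t0, t1, n_steps, coeffs=(1.5, -0.5)):
    """Explicit linear multistep integration of dx/dt = f(t, x).

    The update x_{n+1} = x_n + h * sum_j b_j * f_{n-j} uses fixed
    coefficients b_j here; a learnable solver would train them instead.
    """
    h = (t1 - t0) / n_steps
    history = [f(t0, x0)]          # past derivative evaluations, oldest first
    x = x0 + h * history[0]        # bootstrap with one Euler step
    for n in range(1, n_steps):
        history.append(f(t0 + n * h, x))
        if len(history) > len(coeffs):
            history.pop(0)         # keep only as many terms as coefficients
        # Most recent derivative pairs with the first coefficient.
        x = x + h * sum(b * fv for b, fv in zip(coeffs, reversed(history)))
    return x
```

With `dx/dt = -x` from `x(0) = 1`, integrating to `t = 1` recovers `exp(-1)` to second-order accuracy; in the paper's setting, the coefficients would instead be tuned (via RL) to make few-step trajectories land close to the many-step reference.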

[Figure] ConsistencySolver training pipeline: overview of the RL framework for optimizing a learnable ODE solver in diffusion sampling.

Empirical validation demonstrates that ConsistencySolver significantly outperforms training-free ODE solvers (e.g., DDIM, UniPC, Multistep DPM-Solver), distillation-based methods (e.g., LCM, PCM, DMD2), and distillation-based solvers (e.g., AMED) on both consistency metrics and FID scores.
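The page mentions "consistency metrics" between previews and full-step references without naming them; a simple illustrative proxy (not necessarily the paper's metric) is the mean squared error between the two outputs, where lower means the preview is a more faithful stand-in for the final generation.

```python
def consistency_mse(preview, reference):
    """Mean squared error between a preview and its full-step reference.

    Illustrative proxy for preview/final consistency; the paper's actual
    metrics are not specified on this page. Inputs are flat value lists.
    """
    assert len(preview) == len(reference), "outputs must be the same size"
    return sum((p - r) ** 2 for p, r in zip(preview, reference)) / len(preview)
```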

[Table] Quantitative results on Stable Diffusion v1-5 for text-to-image generation.

