Classification Diffusion Models:
Revitalizing Density Ratio Estimation

NeurIPS 2024

Technion – Israel Institute of Technology


Abstract

A prominent family of methods for learning data distributions relies on density ratio estimation (DRE), where a model is trained to classify between data samples and samples from some reference distribution. DRE-based models can directly output the likelihood for any given input, a highly desired property that is lacking in most generative techniques. Nevertheless, to date, DRE methods have failed to accurately capture the distributions of complex high-dimensional data, like images, and have thus drawn reduced research attention in recent years. In this work we present classification diffusion models (CDMs), a DRE-based generative method that adopts the formalism of denoising diffusion models (DDMs) while making use of a classifier that predicts the level of noise added to a clean signal. Our method is based on an analytical connection that we derive between the MSE-optimal denoiser for removing white Gaussian noise and the cross-entropy-optimal classifier for predicting the noise level. Our method is the first DRE-based technique that can successfully generate images beyond the MNIST dataset. Furthermore, it can output the likelihood of any input in a single forward pass, achieving state-of-the-art negative log-likelihood (NLL) among methods with this property.


Overview

DDMs are based on minimum-MSE (MMSE) denoising, while DRE methods hinge on optimal classification. In this work, we develop a connection between the optimal classifier for predicting the level of white Gaussian noise added to a data sample and the MMSE denoiser for removing such noise. Specifically, we show that the latter can be obtained from the gradient of the former. Utilizing this connection, we propose the classification diffusion model (CDM), a generative method that adopts the formalism of DDMs but, instead of a denoiser, employs a noise-level classifier. CDM is the first instance of a DRE-based method that can successfully generate images beyond MNIST. In addition, as a DRE method, CDM is inherently capable of outputting the exact log-likelihood in a single network function evaluation (NFE). In fact, it achieves state-of-the-art negative log-likelihood (NLL) results among methods that use a single NFE, and results comparable to those of computationally expensive ODE-based methods.
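The denoiser-from-classifier relation can be sanity-checked in a 1-D Gaussian toy model, where both the optimal classifier's logits and the MMSE denoiser are known in closed form. The sketch below is our own illustration, not the paper's code: `logit_diff` stands in for a trained classifier's logit difference between noise level $t$ and a near-pure-noise reference level $T$, and the MMSE denoiser is recovered from its gradient via the DRE identity and Tweedie's formula.

```python
import math

# Toy 1-D setup: clean data x0 ~ N(0, s2); noisy sample x_t = x0 + N(0, sig_t2).
s2 = 1.0       # data variance
sig_t2 = 0.5   # noise variance at level t
sig_T2 = 4.0   # noise variance at the reference level T (close to pure noise)

def log_p(x, noise_var):
    # Marginal density of the noisy sample: N(0, s2 + noise_var).
    v = s2 + noise_var
    return -0.5 * (math.log(2 * math.pi * v) + x * x / v)

def logit_diff(x):
    # For a cross-entropy-optimal noise-level classifier, the logit difference
    # between levels t and T equals log p_t(x) - log p_T(x) (a density ratio).
    return log_p(x, sig_t2) - log_p(x, sig_T2)

def score_t(x, eps=1e-5):
    # DRE identity: grad log p_t = grad(logit_t - logit_T) + grad log p_T,
    # where p_T is a known (near-)Gaussian. Gradients via central differences.
    g_ratio = (logit_diff(x + eps) - logit_diff(x - eps)) / (2 * eps)
    g_ref = (log_p(x + eps, sig_T2) - log_p(x - eps, sig_T2)) / (2 * eps)
    return g_ratio + g_ref

x = 1.3
denoised = x + sig_t2 * score_t(x)   # Tweedie's formula: E[x0 | x_t]
expected = s2 / (s2 + sig_t2) * x    # closed-form MMSE denoiser
print(denoised, expected)
```

The two printed values agree (up to finite-difference error), illustrating that the MMSE denoiser is indeed recoverable from the gradient of the optimal noise-level classifier; in the actual model the analytic gradient would come from autodiff through the classifier network.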

A diagram of CDM (right) compared with DDM (left)


CDM Samples from CelebA $64\times64$ and CIFAR-10


Numerical Evaluation

NLL (bits/dim) calculated on the CIFAR-10 test set. For each model we list the number of function evaluations (NFEs) required to compute the NLL.

Negative Log Likelihood

Model                NLL ↓   NFE
iResNet              3.45    100
FFJORD               3.40    ~3K
MintNet              3.32    120
FlowMatching         2.99    142
VDM                  2.65    10K
DDPM ($L$)           ≤3.70   1K
DDPM ($L_{simple}$)  ≤3.75   1K
DDPM (SDE)           3.28    ~200
DDPM++ cont.         2.99    ~200

RealNVP              3.49    1
Glow                 3.35    1
Residual Flow        3.28    1
CDM                  3.38    1
CDM (unif.)          2.98    1
CDM (OT)             2.89    1
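As a reminder of the units in the table above, bits/dim divides a per-sample NLL in nats by the data dimensionality times $\ln 2$. A minimal sketch of the conversion (the 7200-nat input is an illustrative value, not taken from the table):

```python
import math

def nll_bits_per_dim(nll_nats, dims):
    """Convert a per-sample NLL in nats to bits per dimension."""
    return nll_nats / (dims * math.log(2.0))

# A 32x32 RGB CIFAR-10 image has 3 * 32 * 32 = 3072 dimensions.
dims = 3 * 32 * 32
bpd = nll_bits_per_dim(7200.0, dims)  # illustrative per-image NLL in nats
print(f"{bpd:.2f} bits/dim")
```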
Image generation quality. We compare the FID (lower is better) achieved by DDM and CDM using several sampling schemes on CelebA $64\times64$, unconditional CIFAR-10, and conditional CIFAR-10.

Fréchet Inception Distance

CelebA $64\times64$
Sampling Method                   DDM    CDM
DDIM Sampler, 50 steps            8.47   4.78
DDPM Sampler, 1000 steps          4.13   2.51
2nd-order DPM-Solver, 25 steps    6.16   4.45

Unconditional CIFAR-10 $32\times32$
Sampling Method                   DDM    CDM
DDIM Sampler, 50 steps            7.19   7.56
DDPM Sampler, 1000 steps          4.77   4.74
2nd-order DPM-Solver, 25 steps    6.91   7.29

Conditional CIFAR-10 $32\times32$
Sampling Method                   DDM    CDM
DDIM Sampler, 50 steps            5.92   5.08
DDPM Sampler, 1000 steps          4.70   3.66
2nd-order DPM-Solver, 25 steps    5.87   4.87

Bibtex

	
@article{yadin2024classification,
  title={Classification Diffusion Models},
  author={Yadin, Shahar and Elata, Noam and Michaeli, Tomer},
  journal={arXiv preprint arXiv:2402.10095},
  year={2024}
}
		


Acknowledgements

This webpage was originally made by Matan Kleiner with the help of Hila Manor for SinDDM and can be used as a template. It is inspired by the template that was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code for the original template can be found here.