image generation

Summary

  • Number of checkpoints: 6

  • Number of configs: 6

  • Number of papers: 1

    • ALGORITHM: 1

Guided Diffusion (NeurIPS’2021)

Task: Image Generation

Abstract

We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For conditional image synthesis, we further improve sample quality with classifier guidance: a simple, compute-efficient method for trading off diversity for fidelity using gradients from a classifier. We achieve an FID of 2.97 on ImageNet 128x128, 4.59 on ImageNet 256x256, and 7.72 on ImageNet 512x512, and we match BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. Finally, we find that classifier guidance combines well with upsampling diffusion models, further improving FID to 3.94 on ImageNet 256x256 and 3.85 on ImageNet 512x512.
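As a rough illustration of the classifier guidance described in the abstract, the sketch below shifts the mean of a one-dimensional Gaussian denoising step by the scaled classifier gradient, following the paper's update of the form mu + s * Sigma * grad log p(y | x_t). The logistic "classifier", its weights `w` and `b`, and the function names are hypothetical stand-ins chosen for this toy example; they are not part of MMagic or of the official implementation.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def grad_log_prob(x_t: float, w: float = 2.0, b: float = 0.0) -> float:
    """Gradient of log p(y=1 | x_t) for a toy logistic classifier.

    d/dx log sigmoid(w*x + b) = w * (1 - sigmoid(w*x + b))
    """
    return w * (1.0 - sigmoid(w * x_t + b))


def guided_mean(x_t: float, mu: float, variance: float, scale: float) -> float:
    """Shift the denoising mean toward the target class.

    `scale` plays the role of the classifier guidance scale (CGS).
    """
    return mu + scale * variance * grad_log_prob(x_t)


# scale = 0 leaves the mean untouched; larger scales push it further
# toward inputs that the toy classifier assigns to the target class.
for scale in (0.0, 1.0, 4.0):
    print(scale, guided_mean(x_t=0.0, mu=0.0, variance=0.5, scale=scale))
```

In a real model the gradient would come from backpropagating through a noise-aware image classifier rather than from a closed-form expression.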

Results and models

Samples of the ImageNet class "hamster", generated with classifier guidance at a classifier guidance scale (CGS) of 1.0.

ImageNet

Model | Dataset | Scheduler | Steps | CGS | Time (A100) | FID-Full-50K | Download
--- | --- | --- | --- | --- | --- | --- | ---
adm_ddim250_8xb32_imagenet-64x64 | ImageNet 64x64 | DDIM | 250 | - | 1h | 3.2284 | ckpt
adm-g_ddim25_8xb32_imagenet-64x64 | ImageNet 64x64 | DDIM | 25 | 1.0 | 2h | 3.7566 | ckpt
adm_ddim250_8xb32_imagenet-256x256 | ImageNet 256x256 | DDIM | 250 | - | - | - | ckpt
adm-g_ddim25_8xb32_imagenet-256x256 | ImageNet 256x256 | DDIM | 25 | 1.0 | - | - | ckpt
adm_ddim250_8xb32_imagenet-512x512 | ImageNet 512x512 | DDIM | 250 | - | - | - | ckpt
adm-g_ddim25_8xb32_imagenet-512x512 | ImageNet 512x512 | DDIM | 25 | 1.0 | - | - | ckpt
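The Steps column gives the number of denoising iterations used at sampling time. A DDIM sampler visits only a subsequence of the timesteps the model was trained on, which is why 25 or 250 steps can suffice. The sketch below shows one common way to pick such a subsequence with a uniform stride. The function name `ddim_timesteps`, the figure of 1000 training timesteps, and the uniform spacing are assumptions made for illustration; the scheduler shipped with MMagic may select timesteps differently.

```python
from typing import List


def ddim_timesteps(num_train_steps: int, num_inference_steps: int) -> List[int]:
    """Return a descending, evenly strided subset of training timesteps."""
    if not 0 < num_inference_steps <= num_train_steps:
        raise ValueError("num_inference_steps must be in (0, num_train_steps]")
    stride = num_train_steps // num_inference_steps
    ascending = [i * stride for i in range(num_inference_steps)]
    # Sampling runs from the noisiest timestep down to timestep 0.
    return ascending[::-1]


# Assumed 1000 training timesteps, matching the two step counts in the table.
few = ddim_timesteps(1000, 25)
many = ddim_timesteps(1000, 250)
print(len(few), few[:3])
print(len(many), many[:3])
```

Fewer steps mean fewer forward passes per sample, which trades some sample quality for speed.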

Quick Start

infer

Infer Instructions

You can run ADM (the paper's ablated diffusion model), with or without classifier guidance, as follows:

from mmengine import Config, MODELS
from mmengine.registry import init_default_scope
from torchvision.utils import save_image

init_default_scope('mmagic')

# sampling with classifier guidance, CGS=1.0
config = 'configs/guided_diffusion/adm-g_ddim25_8xb32_imagenet-64x64.py'
ckpt_path = 'https://download.openmmlab.com/mmediting/guided_diffusion/adm-g_8xb32_imagenet-64x64-2c0fbeda.pth'  # noqa

model_cfg = Config.fromfile(config).model
model_cfg.pretrained_cfgs = dict(unet=dict(ckpt_path=ckpt_path, prefix='unet'),
                                 classifier=dict(ckpt_path=ckpt_path, prefix='classifier'))
model = MODELS.build(model_cfg).cuda().eval()

samples = model.infer(
    init_image=None,
    batch_size=4,
    num_inference_steps=25,
    labels=333,  # ImageNet class index 333 (hamster)
    classifier_scale=1.0,
    show_progress=True)['samples']

# sampling without classifier guidance
config = 'configs/guided_diffusion/adm_ddim250_8xb32_imagenet-64x64.py'
ckpt_path = 'https://download.openmmlab.com/mmediting/guided_diffusion/adm-u-cvt-rgb_8xb32_imagenet-64x64-7ff0080b.pth'  # noqa

model_cfg = Config.fromfile(config).model
model_cfg.pretrained_cfgs = dict(unet=dict(ckpt_path=ckpt_path, prefix='unet'))
model = MODELS.build(model_cfg).cuda().eval()

samples = model.infer(
    init_image=None,
    batch_size=4,
    num_inference_steps=250,
    labels=None,
    classifier_scale=0.0,
    show_progress=True)['samples']

Test

Test Instructions

You can use the following commands to test a model on a CPU, a single GPU, or multiple GPUs.

# CPU test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/guided_diffusion/adm-u_ddim250_8xb32_imagenet-64x64.py https://download.openmmlab.com/mmgen/guided_diffusion/adm-u-cvt-rgb_8xb32_imagenet-64x64-7ff0080b.pth

# single-GPU test
python tools/test.py configs/guided_diffusion/adm-u_ddim250_8xb32_imagenet-64x64.py https://download.openmmlab.com/mmgen/guided_diffusion/adm-u-cvt-rgb_8xb32_imagenet-64x64-7ff0080b.pth

# multi-GPU test (the trailing 8 is the number of GPUs)
./tools/dist_test.sh configs/guided_diffusion/adm-u_ddim250_8xb32_imagenet-64x64.py https://download.openmmlab.com/mmgen/guided_diffusion/adm-u-cvt-rgb_8xb32_imagenet-64x64-7ff0080b.pth 8

For more details, refer to the "Test a pre-trained model" section of train_test.md.

Citation

@article{PrafullaDhariwal2021DiffusionMB,
  title={Diffusion Models Beat GANs on Image Synthesis},
  author={Prafulla Dhariwal and Alex Nichol},
  journal={arXiv: Learning},
  year={2021}
}