
Conditional GANs

Overview

  • Number of pre-trained weights: 7

  • Number of configs: 0

  • Number of papers: 1

    • ALGORITHM: 1

BigGAN (ICLR’2019)

Task: Conditional GANs

Abstract

Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple “truncation trick,” allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator’s input. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128x128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Frechet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.
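
The "truncation trick" mentioned above only changes how the latent vector is sampled at test time: entries are drawn from a normal distribution restricted to a fixed range and then scaled, so smaller truncation values trade sample diversity for fidelity. Below is a minimal sketch of this sampling step, assuming NumPy/SciPy; the function name and the generator that would consume these latents are illustrative, not part of this repo.

```python
import numpy as np
from scipy.stats import truncnorm


def truncated_z(batch_size, dim_z, truncation=0.4, seed=None):
    """Sample latent vectors for the BigGAN-style truncation trick.

    Values are drawn from a standard normal truncated to [-2, 2] and then
    scaled by ``truncation``; smaller values concentrate samples near the
    mode (higher fidelity, lower variety).
    """
    rng = np.random.RandomState(seed)
    values = truncnorm.rvs(-2.0, 2.0, size=(batch_size, dim_z), random_state=rng)
    return (truncation * values).astype(np.float32)


# e.g. 4 latents for a generator with a 120-dimensional latent space
z = truncated_z(batch_size=4, dim_z=120, truncation=0.4, seed=0)
print(z.shape, z.min(), z.max())  # values stay within [-0.8, 0.8]
```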

Introduction

BigGAN/BigGAN-Deep is a conditional generative model that can generate high-resolution, high-quality images by scaling up the batch size and the number of model parameters.

We have completed training BigGAN on CIFAR10 (32x32) and have aligned the training performance on ImageNet1k (128x128). Some sampled results are shown below for your reference.

Results from our BigGAN trained on CIFAR10
Results from our BigGAN trained on ImageNet

Evaluation of our trained BigGAN:

| Model | Dataset | FID (Iter) | IS (Iter) | Download |
| :--- | :--- | :--- | :--- | :--- |
| BigGAN 32x32 | CIFAR10 | 9.78 (390000) | 8.70 (390000) | model \| log |
| BigGAN 128x128 Best FID | ImageNet1k | 8.69 (1232000) | 101.15 (1232000) | model \| log |
| BigGAN 128x128 Best IS | ImageNet1k | 13.51 (1328000) | 129.07 (1328000) | model \| log |
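
For reference, the FID numbers above compare the mean and covariance of Inception features extracted from real and generated images, while IS is computed from the classifier's predicted label distributions. The sketch below only illustrates the final FID formula given pre-computed feature matrices; it is not the evaluation pipeline that produced the numbers in the table.

```python
import numpy as np
from scipy import linalg


def fid_from_features(real_feats, fake_feats):
    """Frechet Inception Distance from two (num_samples, feature_dim) arrays."""
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    sigma_r = np.cov(real_feats, rowvar=False)
    sigma_f = np.cov(fake_feats, rowvar=False)
    covmean, _ = linalg.sqrtm(sigma_r @ sigma_f, disp=False)
    if np.iscomplexobj(covmean):  # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(sigma_r + sigma_f - 2.0 * covmean))
```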

A Note on Reproducibility

The BigGAN 128x128 model was trained with V100 GPUs and CUDA 10.1, and the results are hard to reproduce with A100 GPUs and CUDA 11.3. If you have any idea about the reproducibility, please feel free to contact us.

Converted Weights

Since we have not finished training our models, we provide several evaluated pre-trained weights converted from BigGAN-PyTorch and pytorch-pretrained-BigGAN.

Evaluation results and download links are provided below.

| Model | Dataset | FID | IS | Download | Original Weights Download Link |
| :--- | :--- | :--- | :--- | :--- | :--- |
| BigGAN 128x128 | ImageNet1k | 10.1414 | 96.728 | model | link |
| BigGAN-Deep 128x128 | ImageNet1k | 5.9471 | 107.161 | model | link |
| BigGAN-Deep 256x256 | ImageNet1k | 11.3151 | 135.107 | model | link |
| BigGAN-Deep 512x512 | ImageNet1k | 16.8728 | 124.368 | model | link |
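
If you only need samples from the original (unconverted) weights, the upstream pytorch-pretrained-BigGAN package referenced above can be used directly. Here is a small sketch based on that package's documented interface; the model name, class index, and output handling are illustrative, and the package must be installed separately.

```python
import torch
from pytorch_pretrained_biggan import BigGAN, one_hot_from_int, truncated_noise_sample

# Load the released BigGAN-Deep 256x256 generator published by the upstream package.
model = BigGAN.from_pretrained('biggan-deep-256')
model.eval()

truncation = 0.4
class_vector = one_hot_from_int(207, batch_size=1)  # 207: an ImageNet class index
noise_vector = truncated_noise_sample(truncation=truncation, batch_size=1)

with torch.no_grad():
    images = model(torch.from_numpy(noise_vector),
                   torch.from_numpy(class_vector),
                   truncation)
print(images.shape)  # (1, 3, 256, 256), values roughly in [-1, 1]
```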

The sampled results are shown below.

Results of BigGAN-Deep with pre-trained weights on ImageNet 128x128, with truncation factor 0.4
Results of BigGAN-Deep with pre-trained weights on ImageNet 256x256, with truncation factor 0.4
Results of BigGAN-Deep with pre-trained weights on ImageNet 512x512, with truncation factor 0.4
Sampling with the truncation trick above can be performed with the following command:

```shell
python demo/conditional_demo.py CONFIG_PATH CKPT_PATH --sample-cfg truncation=0.4 # set truncation value as you want
```

For the converted weights, we provide model configs under configs/_base_/models, listed as follows:

```python
# biggan_cvt-BigGAN-PyTorch-rgb_imagenet1k-128x128.py
# biggan-deep_cvt-hugging-face-rgb_imagenet1k-128x128.py
# biggan-deep_cvt-hugging-face_rgb_imagenet1k-256x256.py
# biggan-deep_cvt-hugging-face_rgb_imagenet1k-512x512.py
```
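
To inspect or adjust one of these configs before sampling, it can be loaded with the standard OpenMMLab config loader. This is a minimal sketch, assuming an mmengine-based installation (older releases expose the same `Config.fromfile` interface through mmcv), with the file name taken from the list above.

```python
from mmengine import Config

# Adjust the path to the config matching the checkpoint you downloaded.
cfg = Config.fromfile(
    'configs/_base_/models/biggan-deep_cvt-hugging-face_rgb_imagenet1k-256x256.py')
print(cfg.pretty_text)  # dump the resolved generator/discriminator settings
```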

Interpolation

To perform image interpolation on BigGAN (or other conditional models), run

```shell
python apps/conditional_interpolate.py CONFIG_PATH CKPT_PATH --samples-path SAMPLES_PATH
```
Image interpolation results of our BigGAN-Deep

To perform image interpolation with fixed noise on BigGAN, run

```shell
python apps/conditional_interpolate.py CONFIG_PATH CKPT_PATH --samples-path SAMPLES_PATH --fix-z
```
Image interpolation results of our BigGAN-Deep with fixed noise

To perform image interpolation with a fixed label on BigGAN, run

```shell
python apps/conditional_interpolate.py CONFIG_PATH CKPT_PATH --samples-path SAMPLES_PATH --fix-y
```

Image interpolation results of our BigGAN-Deep with a fixed label
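
Conceptually, the interpolation commands above walk between two latent codes and/or two class labels and decode every intermediate point with the generator: --fix-z keeps the noise constant while the label varies, and --fix-y keeps the label constant while the noise varies. Below is a minimal sketch of the latent part only, assuming PyTorch; the actual script may use a different interpolation scheme, and the generator call is left out.

```python
import torch


def interpolate_latents(z_start, z_end, num_steps=8):
    """Linearly interpolate between two latent codes.

    Returns a (num_steps, dim_z) tensor; feeding each row, together with a
    class label, to a conditional generator yields one interpolation frame.
    """
    alphas = torch.linspace(0.0, 1.0, num_steps).unsqueeze(1)
    return (1.0 - alphas) * z_start.unsqueeze(0) + alphas * z_end.unsqueeze(0)


# e.g. 8 frames between two 120-dimensional latent codes
z_a, z_b = torch.randn(120), torch.randn(120)
frames = interpolate_latents(z_a, z_b, num_steps=8)
print(frames.shape)  # torch.Size([8, 120])
```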

Citation

```bibtex
@inproceedings{
    brock2018large,
    title={Large Scale {GAN} Training for High Fidelity Natural Image Synthesis},
    author={Andrew Brock and Jeff Donahue and Karen Simonyan},
    booktitle={International Conference on Learning Representations},
    year={2019},
    url={https://openreview.net/forum?id=B1xsqj09Fm},
}
```