mmagic.models.editors.pggan
Package Contents¶
Classes¶
ProgressiveGrowingGAN – Progressive Growing Unconditional GAN.
PGGANDiscriminator – Discriminator for PGGAN.
PGGANGenerator – Generator for PGGAN.
EqualizedLR – Equalized Learning Rate.
EqualizedLRConvDownModule – Equalized LR (Conv + Downsample) Module.
EqualizedLRConvModule – Equalized LR ConvModule.
EqualizedLRConvUpModule – Equalized LR (Upsample + Conv) Module.
EqualizedLRLinearModule – Equalized LR LinearModule.
MiniBatchStddevLayer – Minibatch standard deviation.
PGGANNoiseTo2DFeat – Base module for all modules in openmmlab.
PixelNorm – Pixel Normalization.
Functions¶
equalized_lr – Equalized Learning Rate.
- class mmagic.models.editors.pggan.ProgressiveGrowingGAN(generator, discriminator, data_preprocessor, nkimgs_per_scale, noise_size=None, interp_real=None, transition_kimgs: int = 600, prev_stage: int = 0, ema_config: Optional[Dict] = None)[source]¶
Bases:
mmagic.models.base_models.BaseGAN
Progressive Growing Unconditional GAN.
In this GAN model, we implement the progressive growing training schedule proposed in Progressive Growing of GANs for Improved Quality, Stability, and Variation, ICLR 2018.
We highly recommend using GrowScaleImgDataset to save computational load in data pre-processing.
Notes for using PGGAN:
In the official implementation, Tero uses gradient penalty with norm_mode="HWC".
We do not implement minibatch_repeats, which is used in the official TensorFlow implementation.
Notes for resuming progressive growing GANs: users should specify prev_stage in train_cfg. Otherwise, the model may reset the optimizer status, which will bring inferior performance. For example, if your model is resumed from the 256 stage, you should set train_cfg=dict(prev_stage=256); a config sketch follows the parameter list below.
- Parameters
generator (dict) – Config for generator.
discriminator (dict) – Config for discriminator.
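A minimal, illustrative config sketch for the resume scenario noted above. The nkimgs_per_scale schedule and the data_preprocessor type are hypothetical placeholders; the note above sets the previous stage through train_cfg in config-based training, while this sketch passes prev_stage directly as in the constructor signature:

    model = dict(
        type='ProgressiveGrowingGAN',
        generator=dict(type='PGGANGenerator', noise_size=512, out_scale=1024),
        discriminator=dict(type='PGGANDiscriminator', in_scale=1024),
        data_preprocessor=dict(type='DataPreprocessor'),
        # hypothetical schedule: thousands of images to train at each scale
        nkimgs_per_scale={'4': 600, '8': 600, '16': 600},
        # resume from the 256x256 stage so the optimizer status is kept
        prev_stage=256,
    )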
- forward(inputs: mmagic.utils.typing.ForwardInputs, data_samples: Optional[list] = None, mode: Optional[str] = None) mmagic.utils.typing.SampleList [source]¶
Sample images from noises by using the generator.
- Parameters
inputs (ForwardInputs) – Dict containing the necessary information (e.g. noise, num_batches, mode) to generate images.
data_samples (Optional[list]) – Data samples collated by data_preprocessor. Defaults to None.
mode (Optional[str]) – mode is not used in ProgressiveGrowingGAN. Defaults to None.
- Returns
A list of DataSample containing the generated results.
- Return type
SampleList
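A hedged usage sketch of forward, assuming a built model instance and noise_size=512; the inputs dict keys follow the description above:

    import torch

    # let the model sample its own noise for a batch of 4 images
    samples = model(dict(num_batches=4), data_samples=None, mode=None)
    # or pass an explicit noise batch
    noise = torch.randn(4, 512)
    samples = model(dict(noise=noise, num_batches=4))
    # samples is a list of DataSample objects holding the generated images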
- train_discriminator(inputs: torch.Tensor, data_samples: List[mmagic.structures.DataSample], optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor] [source]¶
Train discriminator.
- Parameters
inputs (Tensor) – Inputs from current resolution training.
data_samples (List[DataSample]) – Data samples from the dataloader. Not used in the discriminator's training.
optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update the model parameters.
- Returns
A dict of tensors for logging.
- Return type
Dict[str, Tensor]
- disc_loss(disc_pred_fake: torch.Tensor, disc_pred_real: torch.Tensor, fake_data: torch.Tensor, real_data: torch.Tensor) Tuple[torch.Tensor, dict] [source]¶
Get the discriminator loss. PGGAN uses WGAN-GP's loss and a discriminator shift loss to train the discriminator.
- Parameters
disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.
disc_pred_real (Tensor) – Discriminator’s prediction of the real images.
fake_data (Tensor) – Generated images, used to calculate gradient penalty.
real_data (Tensor) – Real images, used to calculate gradient penalty.
- Returns
Loss value and a dict of log variables.
- Return type
Tuple[Tensor, dict]
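To make the loss description above concrete, here is a simplified, self-contained sketch of a WGAN-GP critic loss plus a shift (drift) term; the weights gp_weight=10 and shift_weight=0.001 and the extra disc argument are assumptions for illustration, not mmagic's exact implementation:

    import torch
    from torch.autograd import grad

    def pggan_disc_loss_sketch(disc, disc_pred_fake, disc_pred_real,
                               fake_data, real_data,
                               gp_weight=10.0, shift_weight=0.001):
        # WGAN critic loss: push real scores up and fake scores down
        loss = disc_pred_fake.mean() - disc_pred_real.mean()

        # gradient penalty on random interpolations between real and fake images
        alpha = torch.rand(real_data.size(0), 1, 1, 1, device=real_data.device)
        interp = (alpha * real_data + (1 - alpha) * fake_data).requires_grad_(True)
        grads = grad(disc(interp).sum(), interp, create_graph=True)[0]
        gp = ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()

        # discriminator shift loss keeps real scores from drifting away from zero
        shift = disc_pred_real.pow(2).mean()

        total = loss + gp_weight * gp + shift_weight * shift
        return total, dict(loss_disc=loss, loss_gp=gp, loss_shift=shift)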
- train_generator(inputs: torch.Tensor, data_samples: List[mmagic.structures.DataSample], optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor] [source]¶
Train generator.
- Parameters
inputs (Tensor) – Inputs from current resolution training.
data_samples (List[DataSample]) – Data samples from the dataloader. Not used in the generator's training.
optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update the model parameters.
- Returns
A dict of tensors for logging.
- Return type
Dict[str, Tensor]
- gen_loss(disc_pred_fake: torch.Tensor) Tuple[torch.Tensor, dict] [source]¶
Generator loss for PGGAN. PGGAN uses WGAN's loss to train the generator.
- Parameters
disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.
- Returns
Loss value and a dict of log variables.
- Return type
Tuple[Tensor, dict]
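For reference, a minimal sketch of the WGAN generator loss described above (not mmagic's exact code):

    def pggan_gen_loss_sketch(disc_pred_fake):
        # WGAN generator loss: maximize the critic's score on fake images,
        # i.e. minimize its negation
        loss = -disc_pred_fake.mean()
        return loss, dict(loss_gen=loss)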
- train_step(data: dict, optim_wrapper: mmengine.optim.OptimWrapperDict)[source]¶
Train step function.
This function implements the standard training iteration for asynchronous adversarial training. Namely, in each iteration, we first update the discriminator and then compute the loss for the generator with the newly updated discriminator.
As for distributed training, we use the reducer from DDP to synchronize the necessary parameters in the current computational graph.
- Parameters
data_batch (dict) – Input data from dataloader.
optimizer (dict) – Dict containing the optimizers for the generator and discriminator.
ddp_reducer (Reducer | None, optional) – Reducer from DDP. It is used to prepare for backward() in DDP. Defaults to None.
running_status (dict | None, optional) – Contains necessary basic information for training, e.g., the iteration number. Defaults to None.
- Returns
Contains ‘log_vars’, ‘num_samples’, and ‘results’.
- Return type
dict
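A conceptual sketch of the update order described above (discriminator first, then generator against the freshly updated discriminator). It assumes optim_wrapper behaves like a mapping with 'discriminator' and 'generator' keys and that the preprocessed data is a dict with 'inputs' and 'data_samples'; both are illustrative assumptions:

    def train_step_sketch(model, data, optim_wrapper):
        data = model.data_preprocessor(data, training=True)
        inputs, data_samples = data['inputs'], data['data_samples']

        # 1) update the discriminator on the current-resolution batch
        log_vars = model.train_discriminator(
            inputs, data_samples, optim_wrapper['discriminator'])

        # 2) update the generator against the newly updated discriminator
        log_vars.update(model.train_generator(
            inputs, data_samples, optim_wrapper['generator']))
        return log_vars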
- class mmagic.models.editors.pggan.PGGANDiscriminator(in_scale, label_size=0, base_channels=8192, max_channels=512, in_channels=3, channel_decay=1.0, mbstd_cfg=dict(group_size=4), fused_convdown=True, conv_module_cfg=None, fused_convdown_cfg=None, fromrgb_layer_cfg=None, downsample_cfg=None)[source]¶
Bases:
mmengine.model.BaseModule
Discriminator for PGGAN.
- Parameters
in_scale (int) – The scale of the input image.
label_size (int, optional) – Size of the label vector. Defaults to 0.
base_channels (int, optional) – The basic channel number of the discriminator. The other layers contain channels derived from this number. Defaults to 8192.
max_channels (int, optional) – Maximum channels for the feature maps in the discriminator block. Defaults to 512.
in_channels (int, optional) – Number of channels in input images. Defaults to 3.
channel_decay (float, optional) – Decay for channels of feature maps. Defaults to 1.0.
mbstd_cfg (dict, optional) – Configs for minibatch-stddev layer. Defaults to dict(group_size=4).
fused_convdown (bool, optional) – Whether to use fused downconv. Defaults to True.
conv_module_cfg (dict, optional) – Config for the convolution module used in this discriminator. Defaults to None.
fused_convdown_cfg (dict, optional) – Config for the fused downconv module used in this discriminator. Defaults to None.
fromrgb_layer_cfg (dict, optional) – Config for the fromrgb layer. Defaults to None.
downsample_cfg (dict, optional) – Config for the downsampling operation. Defaults to None.
- _default_fromrgb_cfg¶
- _default_conv_module_cfg¶
- _default_convdown_cfg¶
- _num_out_channels(log_scale: int) int [source]¶
Calculate the number of output channels of the current network from the logarithm of the current scale.
- Parameters
log_scale (int) – The logarithm of the current scale.
- Returns
The number of output channels.
- Return type
int
- _get_fromrgb_layer(in_channels: int, log2_scale: int) torch.nn.Module [source]¶
Get the ‘fromrgb’ layer from the logarithm of the current scale.
- Parameters
in_channels (int) – The number of input channels.
log2_scale (int) – The logarithm of the current scale.
- Returns
The built from-rgb layer.
- Return type
nn.Module
- _get_convdown_block(in_channels: int, log2_scale: int) torch.nn.Module [source]¶
Get the downsample layer from the logarithm of the current scale.
- Parameters
in_channels (int) – The number of input channels.
log2_scale (int) – The logarithm of the current scale.
- Returns
The built Conv layer.
- Return type
nn.Module
- forward(x, transition_weight=1.0, curr_scale=-1)[source]¶
Forward function.
- Parameters
x (torch.Tensor) – Input image tensor.
transition_weight (float, optional) – The weight used in resolution transition. Defaults to 1.0.
curr_scale (int, optional) – The scale for the current inference or training. Defaults to -1.
- Returns
The predicted score for the input image.
- Return type
Tensor
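An illustrative usage sketch with arbitrarily chosen sizes; per the docstring above, curr_scale=-1 would use the full input scale, while transition_weight blends in the newly grown block during a resolution transition:

    import torch
    from mmagic.models.editors.pggan import PGGANDiscriminator

    disc = PGGANDiscriminator(in_scale=128)   # largest training resolution
    imgs = torch.randn(2, 3, 32, 32)          # batch at an intermediate scale
    score = disc(imgs, transition_weight=0.5, curr_scale=32)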
- class mmagic.models.editors.pggan.PGGANGenerator(noise_size, out_scale, label_size=0, base_channels=8192, channel_decay=1.0, max_channels=512, fused_upconv=True, conv_module_cfg=None, fused_upconv_cfg=None, upsample_cfg=None)[source]¶
Bases:
mmengine.model.BaseModule
Generator for PGGAN.
- Parameters
noise_size (int) – Size of the input noise vector.
out_scale (int) – Output scale for the generated image.
label_size (int, optional) – Size of the label vector. Defaults to 0.
base_channels (int, optional) – The basic channel number of the generator. The other layers contain channels derived from this number. Defaults to 8192.
channel_decay (float, optional) – Decay for channels of feature maps. Defaults to 1.0.
max_channels (int, optional) – Maximum channels for the feature maps in the generator block. Defaults to 512.
fused_upconv (bool, optional) – Whether to use fused upconv. Defaults to True.
conv_module_cfg (dict, optional) – Config for the convolution module used in this generator. Defaults to None.
fused_upconv_cfg (dict, optional) – Config for the fused upconv module used in this generator. Defaults to None.
upsample_cfg (dict, optional) – Config for the upsampling operation. Defaults to None.
- _default_fused_upconv_cfg¶
- _default_conv_module_cfg¶
- _default_upsample_cfg¶
- _get_torgb_layer(in_channels: int)[source]¶
Get the to-rgb layer based on in_channels.
- Parameters
in_channels (int) – Number of input channels.
- Returns
To-rgb layer.
- Return type
nn.Module
- _num_out_channels(log_scale: int)[source]¶
Calculate the number of output channels based on the logarithm of the current scale.
- Parameters
log_scale (int) – The logarithm of the current scale.
- Returns
The current number of output channels.
- Return type
int
- _get_upconv_block(in_channels, log_scale)[source]¶
Get the conv block for upsampling.
- Parameters
in_channels (int) – The number of input channels.
log_scale (int) – The logarithm of the current scale.
- Returns
The conv block for upsampling.
- Return type
nn.Module
- forward(noise, label=None, num_batches=0, return_noise=False, transition_weight=1.0, curr_scale=-1)[source]¶
Forward function.
- Parameters
noise (torch.Tensor | callable | None) – You can directly give a batch of noise through a torch.Tensor or offer a callable function to sample a batch of noise data. Otherwise, None indicates to use the default noise sampler.
label (Tensor, optional) – Label vector with shape [N, C]. Defaults to None.
num_batches (int, optional) – The number of batch size. Defaults to 0.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.
transition_weight (float, optional) – The weight used in resolution transition. Defaults to 1.0.
curr_scale (int, optional) – The scale for the current inference or training. Defaults to -1.
- Returns
If not return_noise, only the output image will be returned. Otherwise, a dict containing fake_img and noise_batch will be returned.
- Return type
torch.Tensor | dict
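An illustrative usage sketch with arbitrary sizes, following the forward arguments documented above:

    import torch
    from mmagic.models.editors.pggan import PGGANGenerator

    gen = PGGANGenerator(noise_size=512, out_scale=128)
    # let the generator sample its own noise for 4 images at the 32x32 stage
    out = gen(None, num_batches=4, return_noise=True,
              transition_weight=0.5, curr_scale=32)
    fake_imgs, noise = out['fake_img'], out['noise_batch']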
- class mmagic.models.editors.pggan.EqualizedLR(name='weight', gain=2 ** 0.5, mode='fan_in', lr_mul=1.0)[source]¶
Equalized Learning Rate.
This trick is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation
The general idea is to dynamically rescale the weights during training instead of at initialization, so that the variance of the responses in each layer is guaranteed to have certain statistical properties.
Note that this function is always combined with a convolution module which is initialized with \(\mathcal{N}(0, 1)\).
- Parameters
name (str, optional) – The name of the weights. Defaults to ‘weight’.
mode (str, optional) – The mode of computing fan, which is the same as kaiming_init in PyTorch. You can choose from [‘fan_in’, ‘fan_out’]. Defaults to ‘fan_in’.
- compute_weight(module)[source]¶
Compute weight with equalized learning rate.
- Parameters
module (nn.Module) – A module that is wrapped with equalized lr.
- Returns
Updated weight.
- Return type
torch.Tensor
- static apply(module, name, gain=2 ** 0.5, mode='fan_in', lr_mul=1.0)[source]¶
Apply function.
This function registers an equalized learning rate hook on an nn.Module.
- Parameters
module (nn.Module) – Module to be wrapped.
name (str, optional) – The name of the weights. Defaults to ‘weight’.
mode (str, optional) – The mode of computing fan, which is the same as kaiming_init in PyTorch. You can choose from [‘fan_in’, ‘fan_out’]. Defaults to ‘fan_in’.
- Returns
Module that is registered with equalized lr hook.
- Return type
nn.Module
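Roughly speaking, the hook keeps the stored weight distributed as N(0, 1) and rescales it on every forward pass by gain / sqrt(fan) * lr_mul. A hedged sketch of that rescaling and of registering the hook (not mmagic's exact code):

    import torch.nn as nn
    from mmagic.models.editors.pggan import EqualizedLR

    def equalized_weight_sketch(weight, gain=2 ** 0.5, lr_mul=1.0):
        # fan_in of a (out, in, *kernel) weight is in_channels * prod(kernel)
        fan_in = weight[0].numel()
        return weight * gain * fan_in ** -0.5 * lr_mul

    conv = nn.Conv2d(3, 16, 3)
    nn.init.normal_(conv.weight, 0.0, 1.0)   # pair with N(0, 1) initialization
    EqualizedLR.apply(conv, 'weight', gain=2 ** 0.5, mode='fan_in', lr_mul=1.0)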
- class mmagic.models.editors.pggan.EqualizedLRConvDownModule(*args, downsample=dict(type='fused_pool'), **kwargs)[source]¶
Bases:
EqualizedLRConvModule
Equalized LR (Conv + Downsample) Module.
In this module, we inherit EqualizedLRConvModule and adopt downsampling after convolution. For downsampling, we provide two modes: "avgpool" and "fused_pool". "avgpool" denotes the commonly used average pooling operation, while "fused_pool" fuses downsampling and convolution. The fusion is modified from the official TensorFlow implementation in: https://github.com/tkarras/progressive_growing_of_gans/blob/master/networks.py#L109
- Parameters
downsample (dict | None, optional) – Config for the downsampling operation. If None, downsampling is ignored. Currently, we support the types ["avgpool", "fused_pool"]. Defaults to dict(type=’fused_pool’).
- class mmagic.models.editors.pggan.EqualizedLRConvModule(*args, equalized_lr_cfg=dict(mode='fan_in'), **kwargs)[source]¶
Bases:
mmcv.cnn.bricks.ConvModule
Equalized LR ConvModule.
In this module, we inherit the default mmcv.cnn.ConvModule and adopt equalized lr in the convolution. The equalized learning rate is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation
Note that the initialization of self.conv will be overwritten as \(\mathcal{N}(0, 1)\).
- Parameters
equalized_lr_cfg (dict | None, optional) – Config for EqualizedLR. If None, equalized learning rate is ignored. Defaults to dict(mode=’fan_in’).
- class mmagic.models.editors.pggan.EqualizedLRConvUpModule(*args, upsample=dict(type='nearest', scale_factor=2), **kwargs)[source]¶
Bases:
EqualizedLRConvModule
Equalized LR (Upsample + Conv) Module.
In this module, we inherit EqualizedLRConvModule and adopt upsampling before convolution. For upsampling, in addition to the sampling layer in MMCV, we also offer the "fused_nn" type, which fuses upsampling and convolution. The fusion is modified from the official TensorFlow implementation in: https://github.com/tkarras/progressive_growing_of_gans/blob/master/networks.py#L86
- Parameters
upsample (dict | None, optional) – Config for the upsampling operation. If None, you should set it as in the official PGGAN TensorFlow implementation. Defaults to dict(type=’nearest’, scale_factor=2).
- class mmagic.models.editors.pggan.EqualizedLRLinearModule(*args, equalized_lr_cfg=dict(mode='fan_in'), **kwargs)[source]¶
Bases:
torch.nn.Linear
Equalized LR LinearModule.
In this module, we adopt equalized lr in nn.Linear. The equalized learning rate is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation
Note that the initialization of self.weight will be overwritten as \(\mathcal{N}(0, 1)\).
- Parameters
equalized_lr_cfg (dict | None, optional) – Config for EqualizedLR. If None, equalized learning rate is ignored. Defaults to dict(mode=’fan_in’).
- class mmagic.models.editors.pggan.MiniBatchStddevLayer(group_size=4, eps=1e-08, gather_all_batch=False)[source]¶
Bases:
mmengine.model.BaseModule
Minibatch standard deviation.
- Parameters
group_size (int, optional) – The size of groups in batch dimension. Defaults to 4.
eps (float, optional) – Epsilon value to avoid computation error. Defaults to 1e-8.
gather_all_batch (bool, optional) – Whether to gather the batch from all GPUs. Defaults to False.
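A simplified, hedged re-derivation of the minibatch-stddev statistic for the single-GPU case (gather_all_batch=False): each group's average standard deviation is appended to the input as one extra feature map:

    import torch

    def minibatch_stddev_sketch(x, group_size=4, eps=1e-08):
        n, c, h, w = x.shape
        group_size = min(group_size, n)
        y = x.reshape(group_size, -1, c, h, w)       # split the batch into groups
        y = y - y.mean(dim=0, keepdim=True)          # deviation within each group
        y = torch.sqrt(y.pow(2).mean(dim=0) + eps)   # per-group stddev map
        y = y.mean(dim=(1, 2, 3), keepdim=True)      # average to one value per group
        y = y.repeat(group_size, 1, h, w)            # broadcast back over the batch
        return torch.cat([x, y], dim=1)              # concatenate as an extra channel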
- class mmagic.models.editors.pggan.PGGANNoiseTo2DFeat(noise_size, out_channels, act_cfg=dict(type='LeakyReLU', negative_slope=0.2), norm_cfg=dict(type='PixelNorm'), normalize_latent=True, order=('linear', 'act', 'norm'))[source]¶
Bases:
mmengine.model.BaseModule
Base module for all modules in openmmlab.
BaseModule is a wrapper of torch.nn.Module with additional functionality for parameter initialization. Compared with torch.nn.Module, BaseModule mainly adds three attributes:
init_cfg: the config to control the initialization.
init_weights: the function for parameter initialization and for recording initialization information.
_params_init_info: used to track the parameter initialization information. This attribute only exists while init_weights is executing.
Note
PretrainedInit has a higher priority than any other initializer. The loaded pretrained weights will overwrite the previously initialized weights.
- Parameters
init_cfg (dict or List[dict], optional) – Initialization config dict.
- class mmagic.models.editors.pggan.PixelNorm(in_channels=None, eps=1e-06)[source]¶
Bases:
mmengine.model.BaseModule
Pixel Normalization.
This module is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation
- Parameters
eps (float, optional) – Epsilon value. Defaults to 1e-6.
- _abbr_ = pn¶
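In effect, PixelNorm divides each pixel's feature vector by its root-mean-square over the channel dimension; a minimal equivalent sketch:

    import torch

    def pixel_norm_sketch(x, eps=1e-06):
        # normalize each spatial location's feature vector by its RMS over channels
        return x / torch.sqrt(x.pow(2).mean(dim=1, keepdim=True) + eps)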
- mmagic.models.editors.pggan.equalized_lr(module, name='weight', gain=2 ** 0.5, mode='fan_in', lr_mul=1.0)[source]¶
Equalized Learning Rate.
This trick is proposed in: Progressive Growing of GANs for Improved Quality, Stability, and Variation
The general idea is to dynamically rescale the weights during training instead of at initialization, so that the variance of the responses in each layer is guaranteed to have certain statistical properties.
Note that this function is always combined with a convolution module which is initialized with \(\mathcal{N}(0, 1)\).
- Parameters
module (nn.Module) – Module to be wrapped.
name (str, optional) – The name of the weights. Defaults to ‘weight’.
mode (str, optional) – The mode of computing fan, which is the same as kaiming_init in PyTorch. You can choose from [‘fan_in’, ‘fan_out’]. Defaults to ‘fan_in’.
- Returns
Module that is registered with equalized lr hook.
- Return type
nn.Module
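A short usage sketch of the functional wrapper; the N(0, 1) initialization matches the note above:

    import torch.nn as nn
    from mmagic.models.editors.pggan import equalized_lr

    linear = nn.Linear(512, 512)
    nn.init.normal_(linear.weight, 0.0, 1.0)
    linear = equalized_lr(linear, name='weight', gain=2 ** 0.5,
                          mode='fan_in', lr_mul=1.0)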