mmagic.models.editors.biggan.biggan_deep_generator
Module Contents
Classes
- class mmagic.models.editors.biggan.biggan_deep_generator.BigGANDeepGenerator(output_scale, noise_size=120, num_classes=0, out_channels=3, base_channels=96, block_depth=2, input_scale=4, with_shared_embedding=True, shared_dim=128, sn_eps=1e-06, sn_style='ajbrock', init_type='ortho', concat_noise=True, act_cfg=dict(type='ReLU', inplace=False), upsample_cfg=dict(type='nearest', scale_factor=2), with_spectral_norm=True, auto_sync_bn=True, blocks_cfg=dict(type='BigGANDeepGenResBlock'), arch_cfg=None, out_norm_cfg=dict(type='BN'), pretrained=None, rgb2bgr=False)
Bases: torch.nn.Module
BigGAN-Deep Generator. The implementation follows https://github.com/ajbrock/BigGAN-PyTorch/blob/master/BigGANdeep.py.
BigGAN uses a SAGAN-based architecture composed of a self-attention block and a number of convolutional residual blocks with spectral normalization. BigGAN-deep follows the same architecture.
The main difference between BigGAN and BigGAN-deep is that BigGAN-deep uses deeper residual blocks to construct the whole model.
More details can be found in: Large Scale GAN Training for High Fidelity Natural Image Synthesis (ICLR2019).
The model structure depends closely on the output resolution. For the original BigGAN-Deep generator, set output_scale as needed and keep the default values of arch_cfg and blocks_cfg. To customize the model, set these arguments as follows:
arch_cfg: Config for the architecture of this generator. Refer to _default_arch_cfgs in the _get_default_arch_cfg function for the format of arch_cfg. Basically, you need to provide information about each block, such as the numbers of input and output channels, whether to perform upsampling, etc.
blocks_cfg: Config for the convolution block. You can adjust block parameters such as channel_ratio here. You can also replace the block type with your own registered custom block. Note, however, that some parameters, such as act_cfg, with_spectral_norm and sn_eps, are shared among these blocks.
- Parameters
output_scale (int) – Output scale for the generated image.
noise_size (int, optional) – Size of the input noise vector. Defaults to 120.
num_classes (int, optional) – The number of conditional classes. If set to 0, this model will be degraded to an unconditional model. Defaults to 0.
out_channels (int, optional) – Number of channels in output images. Defaults to 3.
base_channels (int, optional) – The basic channel number of the generator. The channel numbers of the other layers are derived from this number. Defaults to 96.
block_depth (int, optional) – The number of times the residual block is repeated at each level of the architecture. Defaults to 2.
input_scale (int, optional) – The scale of the input 2D feature map. Defaults to 4.
with_shared_embedding (bool, optional) – Whether to use shared embedding. Defaults to True.
shared_dim (int, optional) – The output channels of shared embedding. Defaults to 128.
sn_eps (float, optional) – Epsilon value for spectral normalization. Defaults to 1e-6.
sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, the implementation by ajbrock (https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) is used. If set to torch, the PyTorch implementation is used. Defaults to ajbrock.
init_type (str, optional) – The name of an initialization method: ortho | N02 | xavier. Defaults to ‘ortho’.
concat_noise (bool, optional) – Whether to concatenate the input noise vector with the class vector. Defaults to True.
act_cfg (dict, optional) – Config for the activation layer. Defaults to dict(type='ReLU', inplace=False).
upsample_cfg (dict, optional) – Config for the upsampling operation. Defaults to dict(type='nearest', scale_factor=2).
with_spectral_norm (bool, optional) – Whether to use spectral normalization. Defaults to True.
auto_sync_bn (bool, optional) – Whether to use synchronized batch normalization. Defaults to True.
blocks_cfg (dict, optional) – Config for the convolution block. Defaults to dict(type='BigGANDeepGenResBlock').
arch_cfg (dict, optional) – Config for the architecture of this generator. Defaults to None.
out_norm_cfg (dict, optional) – Config for the norm of the output layer. Defaults to dict(type='BN').
pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.
rgb2bgr (bool, optional) – Whether to reorder the output channels to BGR. Several pre-trained BigGAN-Deep weights are provided whose output channel order is RGB. Set this argument to True to use those weights. Defaults to False.
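The parameters above can be gathered into a config dict. The following is a minimal sketch, not a shipped config: the `type` value is assumed to be the registry name matching the class name, and `output_scale=128`, `num_classes=1000` and `channel_ratio=4` are hypothetical example values. The remaining values repeat the documented defaults.

```python
# Illustrative generator config; the values flagged below are assumptions.
generator_cfg = dict(
    type='BigGANDeepGenerator',  # assumed registry name, taken from the class name
    output_scale=128,            # hypothetical target resolution
    noise_size=120,
    num_classes=1000,            # hypothetical; 0 gives an unconditional model
    base_channels=96,
    block_depth=2,
    with_shared_embedding=True,
    shared_dim=128,
    sn_style='ajbrock',
    init_type='ortho',
    concat_noise=True,
    # Block-level overrides go through blocks_cfg. channel_ratio is the
    # parameter named in the description; 4 is only an example value.
    blocks_cfg=dict(type='BigGANDeepGenResBlock', channel_ratio=4),
    # None keeps the default architecture selected for this output_scale.
    arch_cfg=None,
)
```

Leaving `arch_cfg` as `None` follows the advice above for the original generator; a custom `arch_cfg` should follow the format found in `_get_default_arch_cfg`.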
- forward(noise, label=None, num_batches=0, return_noise=False, truncation=-1.0, use_outside_embedding=False)
Forward function.
- Parameters
noise (torch.Tensor | callable | None) – A batch of noise given directly as a torch.Tensor, or a callable that samples a batch of noise data. If None, the default noise sampler is used.
label (torch.Tensor | callable | None) – A batch of labels given directly as a torch.Tensor, or a callable that samples a batch of label data. If None, the default label sampler is used. Defaults to None.
num_batches (int, optional) – The batch size. Defaults to 0.
return_noise (bool, optional) – If True, noise_batch and label will be returned in a dict together with fake_img. Defaults to False.
truncation (float, optional) – Truncation factor. If the value is not less than 0, the truncation trick is applied. Otherwise, it is not applied. Defaults to -1.
use_outside_embedding (bool, optional) – Whether to use an outside embedding instead of the shared embedding. Set to True if the embedding has already been performed outside this function. Defaults to False.
- Returns
If return_noise is False, only the output image is returned. Otherwise, a dict containing fake_img, noise_batch and label is returned.
- Return type
torch.Tensor | dict
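The caller's side of forward can be sketched as follows. This is an illustration only, not the library's code: `sample_noise` and `unpack_forward_output` are hypothetical helpers, nested lists stand in for tensors, and clamping out-of-range values is an assumed reading of the truncation trick (the BigGAN paper describes resampling them instead).

```python
import random


def sample_noise(num_batches, noise_size, truncation=-1.0, seed=None):
    """Draw standard-normal noise, truncated when truncation >= 0.

    Nested lists stand in for a (num_batches, noise_size) tensor.
    """
    rng = random.Random(seed)
    batch = []
    for _ in range(num_batches):
        row = [rng.gauss(0.0, 1.0) for _ in range(noise_size)]
        if truncation >= 0.0:
            # Assumed behavior: clamp values into [-truncation, truncation].
            row = [max(-truncation, min(truncation, v)) for v in row]
        batch.append(row)
    return batch


def unpack_forward_output(output):
    """Return (fake_img, noise_batch, label) for either documented return form."""
    if isinstance(output, dict):
        # return_noise=True: a dict with the three documented keys.
        return output['fake_img'], output['noise_batch'], output['label']
    # return_noise=False: only the image is returned.
    return output, None, None
```

A negative `truncation`, such as the default -1.0, leaves the samples untouched, which matches the parameter description above.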
- init_weights(pretrained=None, init_type='ortho')
Initialize the model weights.
- Parameters
pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.
init_type (str, optional) – The name of an initialization method: ortho | N02 | xavier. Defaults to ‘ortho’.
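The dict form of pretrained can be pictured with a short sketch. The keys 'ckpt_path' and 'prefix' come from the description above; the path, the prefix value and the helper `extract_prefixed_state_dict` are hypothetical, and the filtering shown is an assumption about what loading the generator part from a whole state dict amounts to, not the library's implementation.

```python
# Hypothetical values; only the key names are documented.
pretrained = dict(ckpt_path='path/to/checkpoint.pth', prefix='generator')


def extract_prefixed_state_dict(state_dict, prefix):
    """Keep entries stored under `prefix.` and strip that prefix from the keys."""
    if not prefix.endswith('.'):
        prefix += '.'
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# A toy state dict holding both generator and discriminator entries.
whole_state_dict = {
    'generator.conv.weight': 1,
    'generator.conv.bias': 2,
    'discriminator.conv.weight': 3,
}
generator_state = extract_prefixed_state_dict(whole_state_dict, pretrained['prefix'])
```

In this toy example, `generator_state` holds only the two generator entries, keyed as `conv.weight` and `conv.bias`.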