mmagic.models.editors.sagan.sagan_generator
Module Contents
Classes
- class mmagic.models.editors.sagan.sagan_generator.SNGANGenerator(output_scale, num_classes=0, base_channels=64, out_channels=3, input_scale=4, noise_size=128, attention_cfg=dict(type='SelfAttentionBlock'), attention_after_nth_block=0, channels_cfg=None, blocks_cfg=dict(type='SNGANGenResBlock'), act_cfg=dict(type='ReLU'), use_cbn=True, auto_sync_bn=True, with_spectral_norm=False, with_embedding_spectral_norm=None, sn_style='torch', norm_eps=0.0001, sn_eps=1e-12, init_cfg=dict(type='BigGAN'), pretrained=None, rgb_to_bgr=False)
Bases: mmengine.model.BaseModule
Generator for SNGAN / Proj-GAN. The implementation refers to https://github.com/pfnet-research/sngan_projection/tree/master/gen_models
Our implementation has two notable design choices: channels_cfg and blocks_cfg.
channels_cfg: In the default config of SNGAN / Proj-GAN, the number of ResBlocks and the channels of those blocks are determined by the resolution of the output image. Therefore, we allow users to define channels_cfg to try their own models. We also provide a default config so that users can build the model from the output resolution alone.
blocks_cfg: In the reference code, the generator consists of a group of ResBlocks. To make this model more general, we support user-defined blocks_cfg and build the blocks by calling the build_module method.
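As a rough illustration of how channels_cfg can be interpreted, the sketch below resolves a list or dict into per-block (in_channels, out_channels) pairs. The helper name resolve_block_channels, the example multipliers, and the choice to let the last block fall back to a multiplier of 1 are assumptions made for illustration; they are not taken from the mmagic source.

```python
def resolve_block_channels(channels_cfg, output_scale, base_channels=64):
    """Return one (in_channels, out_channels) pair per intermediate block.

    Hypothetical helper: it only mirrors the documented meaning of
    ``channels_cfg`` and is not part of mmagic.
    """
    # A dict maps an output scale to its list of channel multipliers.
    if isinstance(channels_cfg, dict):
        if output_scale not in channels_cfg:
            raise KeyError(
                f'no channels_cfg entry for output_scale={output_scale}')
        factors = list(channels_cfg[output_scale])
    else:
        factors = list(channels_cfg)

    pairs = []
    for i, factor_in in enumerate(factors):
        # Assumption: the last block has no channels_cfg[i + 1], so it
        # falls back to a multiplier of 1.
        factor_out = factors[i + 1] if i + 1 < len(factors) else 1
        pairs.append((base_channels * factor_in, base_channels * factor_out))
    return pairs


# Illustrative multipliers only; not claimed to be the library defaults.
example_cfg = {32: [4, 4, 4], 64: [16, 8, 4, 2]}
pairs_64 = resolve_block_channels(example_cfg, output_scale=64)
print(pairs_64)
```

Passing a plain list skips the lookup by output scale, which is convenient when experimenting with a single resolution.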
- Parameters
output_scale (int) – Output scale for the generated image.
num_classes (int, optional) – The number of classes you would like to generate. This argument influences the structure of the intermediate blocks and the label sampling operation in forward (e.g. if num_classes=0, ConditionalNormalization layers degrade to unconditional ones). This argument is passed to the intermediate blocks by overwriting their config. Defaults to 0.
base_channels (int, optional) – The base channel number of the generator. The channels of the other layers are derived from this number. Defaults to 64.
out_channels (int, optional) – Channels of the output images. Defaults to 3.
input_scale (int, optional) – Input scale for the features. Defaults to 4.
noise_size (int, optional) – Size of the input noise vector. Defaults to 128.
attention_cfg (dict, optional) – Config for the self-attention block. Defaults to dict(type='SelfAttentionBlock').
attention_after_nth_block (int | list[int], optional) – Index of the ConvBlock after which a self-attention block is added. If an int is passed, only one attention block is added. If a list is passed, self-attention blocks are added after multiple ConvBlocks. Note that if an index is smaller than 1, the self-attention block corresponding to that index is ignored. Defaults to 0.
channels_cfg (list | dict[list], optional) – Config for the input channels of the intermediate blocks. If a list is passed, each element is the multiplier applied to base_channels to obtain the input channels of the corresponding block. For block i, the input and output channels should be channels_cfg[i] and channels_cfg[i+1]. If a dict is provided, each key should be an output scale and the corresponding value should be a list defining the channels. Default: please refer to _default_channels_cfg.
blocks_cfg (dict, optional) – Config for the intermediate blocks. Defaults to dict(type='SNGANGenResBlock').
act_cfg (dict, optional) – Activation config for the final output layer. Defaults to dict(type='ReLU').
use_cbn (bool, optional) – Whether to use conditional normalization. This argument is passed to the norm layers. Defaults to True.
auto_sync_bn (bool, optional) – Whether to convert Batch Norm layers to synchronized ones when distributed training is on. Defaults to True.
with_spectral_norm (bool, optional) – Whether to use spectral norm for conv blocks. Defaults to False.
with_embedding_spectral_norm (bool, optional) – Whether to use spectral norm for the embedding layers in normalization blocks. If not specified (set to None), with_embedding_spectral_norm takes the same value as with_spectral_norm. Defaults to None.
sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, the implementation by ajbrock (https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) is adopted. If set to torch, the PyTorch implementation is adopted. Defaults to torch.
norm_eps (float, optional) – eps for normalization layers (both conditional and non-conditional ones). Defaults to 1e-4.
sn_eps (float, optional) – eps for the spectral normalization operation. Defaults to 1e-12.
init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').
pretrained (str | dict, optional) – Path to the pretrained model, or a dict describing the pretrained model whose required key is ‘ckpt_path’. You can also provide ‘prefix’ to load only the generator part from a whole state dict. Defaults to None.
rgb_to_bgr (bool, optional) – Whether to reformat the output channels with order bgr. We provide several pre-trained BigGAN weights whose output channels order is rgb. You can set this argument to True to use the weights. Defaults to False.
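The parameters above could be combined into a config dict roughly as follows. This is a minimal sketch: the values are illustrative, the assumption that the class is registered under the name 'SNGANGenerator' is not confirmed by this page, and normalize_generator_options is a hypothetical helper that only restates the documented rules for attention_after_nth_block and with_embedding_spectral_norm.

```python
# Illustrative config fragment; the values are examples, not recommendations.
generator_cfg = dict(
    type='SNGANGenerator',  # assumed registry name (same as the class name)
    output_scale=64,
    num_classes=10,
    base_channels=64,
    attention_after_nth_block=[0, 2],
    with_spectral_norm=True,
)


def normalize_generator_options(attention_after_nth_block=0,
                                with_spectral_norm=False,
                                with_embedding_spectral_norm=None):
    """Hypothetical helper restating two rules from the parameter docs."""
    # An int means a single attention position; a list means several.
    if isinstance(attention_after_nth_block, int):
        indices = [attention_after_nth_block]
    else:
        indices = list(attention_after_nth_block)
    # Indices smaller than 1 are ignored.
    indices = [i for i in indices if i >= 1]
    # None means "use the same value as with_spectral_norm".
    if with_embedding_spectral_norm is None:
        with_embedding_spectral_norm = with_spectral_norm
    return indices, with_embedding_spectral_norm


indices, emb_sn = normalize_generator_options(
    attention_after_nth_block=generator_cfg['attention_after_nth_block'],
    with_spectral_norm=generator_cfg['with_spectral_norm'],
)
print(indices, emb_sn)
```

With the defaults (attention_after_nth_block=0), the index list is empty, so no self-attention block would be inserted under this reading of the docs.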
- forward(noise, num_batches=0, label=None, return_noise=False)
Forward function.
- Parameters
noise (torch.Tensor | callable | None) – You can pass a batch of noise directly as a torch.Tensor or provide a callable that samples a batch of noise. Otherwise, None indicates that the default noise sampler is used.
num_batches (int, optional) – The batch size. Defaults to 0.
label (torch.Tensor | callable | None) – You can pass a batch of labels directly as a torch.Tensor or provide a callable that samples a batch of labels. Otherwise, None indicates that the default label sampler is used.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.
- Returns
If return_noise is False, only the output image is returned. Otherwise, a dict containing fake_image, noise_batch and label_batch is returned.
- Return type
torch.Tensor | dict
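The three accepted forms of the noise argument can be sketched as a simple dispatch. To stay self-contained, the example uses nested Python lists as a stand-in for torch.Tensor. The function name get_noise_batch, the convention that a callable receives a (num_batches, noise_size) shape tuple, and the Gaussian default sampler are assumptions for illustration rather than confirmed library behavior.

```python
import random


def get_noise_batch(noise, num_batches=0, noise_size=128):
    """Hypothetical stand-in for the noise handling described above.

    Nested lists replace ``torch.Tensor`` so the sketch needs no
    third-party packages.
    """
    if noise is None:
        # Default sampler (assumed here to be standard Gaussian noise).
        if num_batches <= 0:
            raise ValueError('num_batches must be positive when sampling')
        return [[random.gauss(0.0, 1.0) for _ in range(noise_size)]
                for _ in range(num_batches)]
    if callable(noise):
        # User-provided sampler; assumed to take a shape tuple.
        if num_batches <= 0:
            raise ValueError('num_batches must be positive when sampling')
        return noise((num_batches, noise_size))
    # Otherwise the caller already supplied a ready-made batch.
    return noise


def zeros(shape):
    """Example sampler returning an all-zero batch of the given shape."""
    return [[0.0] * shape[1] for _ in range(shape[0])]


sampled = get_noise_batch(None, num_batches=2, noise_size=4)
from_callable = get_noise_batch(zeros, num_batches=3, noise_size=2)
given = [[1.0, 2.0]]
passed_through = get_noise_batch(given)
```

The label argument is documented with the same three forms, so the same pattern would apply to it.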
- init_weights(pretrained=None, strict=True)
Init weights for SNGAN-Proj and SAGAN. If pretrained=None, weight initialization follows the INIT_TYPE in init_cfg=dict(type=INIT_TYPE).
For SNGAN-Proj (INIT_TYPE.upper() in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']), we follow the initialization method in the official Chainer implementation (https://github.com/pfnet-research/sngan_projection).
For SAGAN (INIT_TYPE.upper() == 'SAGAN'), we follow the initialization method in the official TensorFlow implementation (https://github.com/brain-research/self-attention-gan).
Besides reimplementing the official code’s initialization, we provide BigGAN-style and PyTorch-StudioGAN-style initialization (INIT_TYPE.upper() == 'BIGGAN' and INIT_TYPE.upper() == 'STUDIO'). Please refer to https://github.com/ajbrock/BigGAN-PyTorch and https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.
- Parameters
pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.
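The selection of an initialization style from init_cfg can be sketched as a case-insensitive lookup on the type string. The function name select_init_style, the returned labels, and the choice to raise NotImplementedError for unknown types are assumptions for illustration; only the accepted type strings come from the description above.

```python
def select_init_style(init_cfg):
    """Map ``init_cfg['type']`` to a style label (hypothetical helper)."""
    init_type = init_cfg['type'].upper()
    if init_type in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']:
        return 'sngan-proj'  # official Chainer-style initialization
    if init_type == 'SAGAN':
        return 'sagan'  # official TensorFlow-style initialization
    if init_type == 'BIGGAN':
        return 'biggan'  # BigGAN-style initialization
    if init_type == 'STUDIO':
        return 'studio'  # PyTorch-StudioGAN-style initialization
    # Assumption: unknown types are rejected rather than silently ignored.
    raise NotImplementedError(f'unsupported init type: {init_cfg["type"]}')


default_style = select_init_style(dict(type='BigGAN'))
print(default_style)
```

Because the comparison is done on the upper-cased string, 'BigGAN', 'biggan' and 'BIGGAN' all select the same style in this sketch.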