`mmagic.models.editors.sagan.sagan_generator`

Module Contents

Classes

SNGANGenerator

Generator for SNGAN / Proj-GAN.

class mmagic.models.editors.sagan.sagan_generator.SNGANGenerator(output_scale, num_classes=0, base_channels=64, out_channels=3, input_scale=4, noise_size=128, attention_cfg=dict(type='SelfAttentionBlock'), attention_after_nth_block=0, channels_cfg=None, blocks_cfg=dict(type='SNGANGenResBlock'), act_cfg=dict(type='ReLU'), use_cbn=True, auto_sync_bn=True, with_spectral_norm=False, with_embedding_spectral_norm=None, sn_style='torch', norm_eps=0.0001, sn_eps=1e-12, init_cfg=dict(type='BigGAN'), pretrained=None, rgb_to_bgr=False)

Bases: mmengine.model.BaseModule

Generator for SNGAN / Proj-GAN. The implementation refers to https://github.com/pfnet-research/sngan_projection/tree/master/gen_models

Our implementation has two notable design choices: channels_cfg and blocks_cfg.

channels_cfg: In the default config of SNGAN / Proj-GAN, the number of ResBlocks and the channels of those blocks are determined by the resolution of the output image. We therefore allow users to define channels_cfg to try their own models. We also provide a default config so that a model can be built from the output resolution alone.
blocks_cfg: In the reference code, the generator consists of a group of ResBlocks. To make this model more general, we support user-defined blocks_cfg and load the blocks by calling the build_module method.
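As a rough illustration of how a list-style channels_cfg could translate into block widths, the sketch below pairs consecutive multipliers and scales them by base_channels. The helper name `block_channels` and the example list are hypothetical and not part of mmagic, and the actual generator may handle the final block differently.

```python
def block_channels(channels_cfg, base_channels=64):
    """Return (in_channels, out_channels) for each consecutive pair.

    Assumption: every entry of channels_cfg is a multiplier of
    base_channels, and block i maps channels_cfg[i] -> channels_cfg[i + 1].
    """
    return [(base_channels * c_in, base_channels * c_out)
            for c_in, c_out in zip(channels_cfg, channels_cfg[1:])]


# Hypothetical multipliers, chosen only for illustration.
example_cfg = [16, 8, 4, 2]
for i, (c_in, c_out) in enumerate(block_channels(example_cfg)):
    print(f"block {i}: {c_in} -> {c_out}")
```

Expressing widths as multipliers keeps a single list reusable across different base_channels values.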

Parameters

output_scale (int) – Output scale for the generated image.
num_classes (int, optional) – The number of classes you would like to generate. This argument influences the structure of the intermediate blocks and the label sampling operation in forward (e.g. if num_classes=0, ConditionalNormalization layers degrade to unconditional ones). It is passed to the intermediate blocks by overwriting their config. Defaults to 0.
base_channels (int, optional) – The base channel number of the generator. The channel numbers of the other layers are derived from this number. Defaults to 64.
out_channels (int, optional) – Channels of the output images. Defaults to 3.
input_scale (int, optional) – Input scale for the features. Defaults to 4.
noise_size (int, optional) – Size of the input noise vector. Defaults to 128.
attention_cfg (dict, optional) – Config for the self-attention block. Defaults to dict(type='SelfAttentionBlock').
attention_after_nth_block (int | list[int], optional) – The ConvBlock(s) after which a self-attention block is added. If an int is passed, only one attention block is added. If a list is passed, self-attention blocks are added after multiple ConvBlocks. Note that any index smaller than 1 is ignored, so no self-attention block is added for it. Defaults to 0.
channels_cfg (list | dict[list], optional) – Config for the input channels of the intermediate blocks. If a list is passed, each element gives the input channels of the corresponding block as a multiple of base_channels. For block i, the input and output channels should be channels_cfg[i] and channels_cfg[i+1]. If a dict is provided, each key should be an output scale and the corresponding value should be a list defining the channels. Defaults to None; see _default_channels_cfg for the default values.
blocks_cfg (dict, optional) – Config for the intermediate blocks. Defaults to dict(type='SNGANGenResBlock')
act_cfg (dict, optional) – Activation config for the final output layer. Defaults to dict(type='ReLU').
use_cbn (bool, optional) – Whether to use conditional normalization. This argument is passed to the norm layers. Defaults to True.
auto_sync_bn (bool, optional) – Whether to convert Batch Norm layers to synchronized ones when distributed training is on. Defaults to True.
with_spectral_norm (bool, optional) – Whether to use spectral norm for conv blocks. Defaults to False.
with_embedding_spectral_norm (bool, optional) – Whether to use spectral norm for the embedding layers in normalization blocks. If not specified (None), with_embedding_spectral_norm takes the same value as with_spectral_norm. Defaults to None.
sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, the implementation by ajbrock (https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) is adopted. If set to torch, the PyTorch implementation is adopted. Defaults to torch.
norm_eps (float, optional) – eps for normalization layers (both conditional and non-conditional ones). Defaults to 1e-4.
sn_eps (float, optional) – eps for spectral normalization operation. Defaults to 1e-12.
init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').
pretrained (str | dict, optional) – Path to the pretrained model, or a dict describing the pretrained model whose required key is 'ckpt_path'. You can also provide 'prefix' to load the generator part from the whole state dict. Defaults to None.
rgb_to_bgr (bool, optional) – Whether to reorder the output channels to BGR. We provide several pre-trained BigGAN weights whose output channel order is RGB. You can set this argument to True to use these weights. Defaults to False.
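The parameters above can be combined into a config dict in the same dict(type=...) style used by the defaults. The fragment below is only a sketch: it assumes the class is registered under the name 'SNGANGenerator' and that an output scale of 32 is covered by the default channel config. The values are illustrative, not recommended settings.

```python
# Illustrative generator config. Every key is a constructor parameter
# documented above; the values are assumptions chosen for the example.
generator = dict(
    type='SNGANGenerator',  # assumed registry name
    output_scale=32,
    num_classes=10,  # > 0, so conditional normalization uses labels
    base_channels=64,
    noise_size=128,
    attention_after_nth_block=0,  # smaller than 1: no self-attention block
    with_spectral_norm=True,
    with_embedding_spectral_norm=None,  # None: follows with_spectral_norm
    sn_style='torch',
    init_cfg=dict(type='BigGAN'),
)
```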

_default_channels_cfg

forward(noise, num_batches=0, label=None, return_noise=False)

Forward function.

Parameters

noise (torch.Tensor | callable | None) – You can directly pass a batch of noise as a torch.Tensor or provide a callable that samples a batch of noise data. If None, the default noise sampler is used.
num_batches (int, optional) – The batch size. Defaults to 0.
label (torch.Tensor | callable | None) – You can directly pass a batch of labels as a torch.Tensor or provide a callable that samples a batch of label data. If None, the default label sampler is used.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.

Returns

If return_noise is False, only the output image is returned. Otherwise, a dict containing fake_image, noise_batch and label_batch is returned.

Return type

torch.Tensor | dict
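The three accepted forms of noise (tensor, callable, None) can be pictured with the sketch below. `resolve_noise` is a hypothetical helper, not the library's code. It assumes that a callable sampler receives the target shape as a tuple and that the default sampler draws from a standard normal distribution.

```python
import torch


def resolve_noise(noise, num_batches, noise_size=128):
    """Hypothetical helper mirroring the documented cases for `noise`."""
    if isinstance(noise, torch.Tensor):
        # A ready-made batch is used as is.
        return noise
    if callable(noise):
        # Assumption: the sampler is called with the requested shape.
        return noise((num_batches, noise_size))
    # None: fall back to a default sampler (standard normal assumed here).
    if num_batches <= 0:
        raise ValueError('num_batches must be positive when sampling noise')
    return torch.randn(num_batches, noise_size)


# Usage: default sampler, custom sampler, and a pre-built batch.
print(resolve_noise(None, num_batches=2).shape)
print(resolve_noise(torch.rand, num_batches=2).shape)
print(resolve_noise(torch.zeros(4, 128), num_batches=0).shape)
```

The label argument follows the same three-way pattern according to the parameter description above.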

init_weights(pretrained=None, strict=True)

Init weights for SNGAN-Proj and SAGAN. If pretrained=None, weight initialization would follow the INIT_TYPE in init_cfg=dict(type=INIT_TYPE).

For SNGAN-Proj (INIT_TYPE.upper() in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']), we follow the initialization method in the official Chainer implementation (https://github.com/pfnet-research/sngan_projection).

For SAGAN (INIT_TYPE.upper() == 'SAGAN'), we follow the initialization method in the official TensorFlow implementation (https://github.com/brain-research/self-attention-gan).

Besides reimplementing the initialization of the official code, we provide BigGAN-style and PyTorch-StudioGAN-style initialization (INIT_TYPE.upper() == 'BIGGAN' and INIT_TYPE.upper() == 'STUDIO'). Please refer to https://github.com/ajbrock/BigGAN-PyTorch and https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.

Parameters

pretrained (str | dict, optional) – Path to the pretrained model, or a dict describing the pretrained model whose required key is 'ckpt_path'. You can also provide 'prefix' to load the generator part from the whole state dict. Defaults to None.
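The case-insensitive matching on INIT_TYPE described above can be summarized in a small sketch. `init_family` is a hypothetical helper written for illustration. It only classifies the type string and performs no weight initialization, and raising an error for unknown types is an assumption.

```python
def init_family(init_cfg):
    """Map init_cfg['type'] to the initialization style described above."""
    init_type = init_cfg['type'].upper()
    if init_type in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']:
        return 'sngan-proj'  # official Chainer implementation style
    if init_type == 'SAGAN':
        return 'sagan'  # official TensorFlow implementation style
    if init_type == 'BIGGAN':
        return 'biggan'
    if init_type == 'STUDIO':
        return 'studio'
    # Assumption: unknown types are rejected rather than silently ignored.
    raise ValueError(f"Unsupported init type: {init_cfg['type']!r}")


print(init_family(dict(type='BigGAN')))
```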
