mmagic.models.editors.sagan.sagan_discriminator¶
Module Contents¶
Classes¶
ProjDiscriminator: Discriminator for SNGAN / Proj-GAN.
- class mmagic.models.editors.sagan.sagan_discriminator.ProjDiscriminator(input_scale, num_classes=0, base_channels=128, input_channels=3, attention_cfg=dict(type='SelfAttentionBlock'), attention_after_nth_block=-1, channels_cfg=None, downsample_cfg=None, from_rgb_cfg=dict(type='SNGANDiscHeadResBlock'), blocks_cfg=dict(type='SNGANDiscResBlock'), act_cfg=dict(type='ReLU'), with_spectral_norm=True, sn_style='torch', sn_eps=1e-12, init_cfg=dict(type='BigGAN'), pretrained=None)[source]¶
Bases:
mmengine.model.BaseModule
Discriminator for SNGAN / Proj-GAN. The implementation refers to https://github.com/pfnet-research/sngan_projection/tree/master/dis_models
The overall structure of the projection discriminator can be split into a from_rgb layer, a group of ResBlocks, a linear decision layer, and a projection layer. To support defining custom layers, we introduce from_rgb_cfg and blocks_cfg.
The design of the model structure corresponds closely to the output resolution. Therefore, we provide channels_cfg and downsample_cfg to control the input channels and the downsample behavior of the intermediate blocks.
downsample_cfg: In the default config of SNGAN / Proj-GAN, whether to apply downsampling in each intermediate block is quite flexible and corresponds to the resolution of the output image. Therefore, we allow users to define downsample_cfg themselves to control the structure of the discriminator.
channels_cfg: In the default config of SNGAN / Proj-GAN, the number of ResBlocks and the channels of those blocks correspond to the resolution of the output image. Therefore, we allow users to define channels_cfg to try their own models. We also provide a default config so that the model can be built from the output resolution alone.
- Parameters
input_scale (int) – The scale of the input image.
num_classes (int, optional) – The number of classes you would like to generate. If num_classes=0, no label projection is used. Defaults to 0.
base_channels (int, optional) – The basic channel number of the discriminator. The channels of the other layers are based on this number. Defaults to 128.
input_channels (int, optional) – Channels of the input image. Defaults to 3.
attention_cfg (dict, optional) – Config for the self-attention block. Defaults to dict(type='SelfAttentionBlock').
attention_after_nth_block (int | list[int], optional) – After which ConvBlock (including the head block) the self-attention block is added. If an int is passed, only one attention block is added. If a list is passed, self-attention blocks are added after multiple ConvBlocks. Note that if an index is smaller than 1, the self-attention corresponding to that index is ignored. Defaults to 0.
channels_cfg (list | dict[list], optional) – Config for the input channels of the intermediate blocks. If a list is passed, each element means the input channels of the current block as a multiple of base_channels. For block i, the input and output channels are channels_cfg[i] and channels_cfg[i+1]. If a dict is provided, its keys should be output scales and each value should be a list defining the channels. Default: please refer to _defualt_channels_cfg.
downsample_cfg (list[bool] | dict[list], optional) – Config for the downsample behavior of the intermediate layers. If a list is passed, downsample_cfg[idx] == True means downsampling is applied in the idx-th block, and vice versa. If a dict is provided, its keys should be input scales of the image and each value should be a list defining the downsample behavior. Default: please refer to _default_downsample_cfg.
from_rgb_cfg (dict, optional) – Config for the first layer, which converts the RGB image to a feature map. Defaults to dict(type='SNGANDiscHeadResBlock').
blocks_cfg (dict, optional) – Config for the intermediate blocks. Defaults to dict(type='SNGANDiscResBlock').
act_cfg (dict, optional) – Activation config for the final output layer. Defaults to dict(type='ReLU').
with_spectral_norm (bool, optional) – Whether to use spectral norm for all conv blocks. Defaults to True.
sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, the implementation by ajbrock (https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) is adopted. If set to torch, the implementation by PyTorch is adopted. Defaults to torch.
sn_eps (float, optional) – eps for the spectral normalization operation. Defaults to 1e-12.
init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').
pretrained (str | dict, optional) – Path of the pretrained model, or a dict containing information for pretrained models whose necessary key is 'ckpt_path'. Besides, you can also provide 'prefix' to load only the generator part from the whole state dict. Defaults to None.
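A minimal config fragment overriding a few of the defaults above might look like the following. The field names follow the parameter list; the specific values (input scale, class count, attention positions) are hypothetical choices for illustration.

```python
# Hypothetical config fragment for ProjDiscriminator; field names follow the
# parameter list above, values are illustrative only.
disc_cfg = dict(
    type='ProjDiscriminator',
    input_scale=128,                   # 128x128 input images
    num_classes=1000,                  # num_classes > 0 enables label projection
    base_channels=64,
    attention_after_nth_block=[1, 2],  # self-attention after blocks 1 and 2
    with_spectral_norm=True,
    sn_style='torch',
)
```

Such a dict would typically be consumed by the registry-based builder of the surrounding framework rather than passed to the class directly.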
- forward(x, label=None)[source]¶
Forward function. If self.num_classes is larger than 0, label projection is used.
- Parameters
x (torch.Tensor) – Fake or real image tensor.
label (torch.Tensor, optional) – Label corresponding to the input image. Note that if self.num_classes is larger than 0, label must not be None. Defaults to None.
- Returns
Prediction for the reality of the input image.
- Return type
torch.Tensor
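The label-projection scoring idea can be sketched in pure Python (not the actual implementation, which uses learned torch layers): the final score is a linear readout of the pooled feature vector plus, when a label is given, an inner product between that feature vector and a per-class embedding. All parameter values below are made up for illustration.

```python
# Pure-Python sketch of projection-discriminator scoring (illustrative only):
# score = linear(h) + <embed(label), h>, where h is the pooled feature vector.
import random

random.seed(0)
feat_dim, num_classes = 4, 3
w = [0.1, -0.2, 0.3, 0.05]                      # hypothetical linear decision layer
embed = [[random.uniform(-1, 1) for _ in range(feat_dim)]
         for _ in range(num_classes)]            # hypothetical label embedding table

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def proj_score(h, label=None):
    out = dot(w, h)                 # unconditional realness score
    if label is not None:           # projection term, active when num_classes > 0
        out += dot(embed[label], h)
    return out

h = [1.0, 0.5, -0.5, 2.0]
unconditional = proj_score(h)
conditional = proj_score(h, label=1)
```

Passing a label shifts the score by the projection term, which is how the discriminator becomes class-conditional without a separate per-class output head.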
- init_weights(pretrained=None, strict=True)[source]¶
Init weights for SNGAN-Proj and SAGAN. If pretrained=None, weight initialization follows the INIT_TYPE in init_cfg=dict(type=INIT_TYPE).
For SNGAN-Proj (INIT_TYPE.upper() in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']), we follow the initialization method in the official Chainer implementation (https://github.com/pfnet-research/sngan_projection).
For SAGAN (INIT_TYPE.upper() == 'SAGAN'), we follow the initialization method in the official TensorFlow implementation (https://github.com/brain-research/self-attention-gan).
Besides the reimplementation of the official code's initialization, we provide BigGAN-style and PyTorch-StudioGAN-style initialization (INIT_TYPE.upper() == 'BIGGAN' and INIT_TYPE.upper() == 'STUDIO'). Please refer to https://github.com/ajbrock/BigGAN-PyTorch and https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.
- Parameters
pretrained (str | dict, optional) – Path for the pretrained model or dict containing information for pretrained models whose necessary key is ‘ckpt_path’. Besides, you can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.
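The INIT_TYPE dispatch described above can be sketched as follows. The function name and the returned style labels are hypothetical, chosen only to make the branch structure explicit; they are not part of the mmagic API.

```python
# Hypothetical sketch of how init_cfg['type'] selects an initialization family,
# following the branches documented for init_weights. Names are illustrative.
def resolve_init_style(init_cfg):
    t = init_cfg.get('type', '').upper()
    if t in ('SNGAN', 'SNGAN-PROJ', 'GAN-PROJ'):
        return 'chainer'        # official Chainer sngan_projection style
    if t == 'SAGAN':
        return 'tensorflow'     # official self-attention-gan style
    if t in ('BIGGAN', 'STUDIO'):
        return 'biggan-style'   # BigGAN-PyTorch / PyTorch-StudioGAN style
    raise ValueError(f'Unknown initialization type: {t!r}')
```

The comparison on the upper-cased type string matches the case-insensitive checks quoted in the description above.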