Module Contents



class mmagic.models.editors.mspie.pe_singan_generator.SinGANMSGeneratorPE(in_channels, out_channels, num_scales, kernel_size=3, padding=0, num_layers=5, base_channels=32, min_feat_channels=32, out_act_cfg=dict(type='Tanh'), padding_mode='zero', pad_at_head=True, interp_pad=False, noise_with_pad=False, positional_encoding=None, first_stage_in_channels=None, **kwargs)[source]

Bases: mmagic.models.editors.singan.singan_generator.SinGANMultiScaleGenerator

Multi-Scale Generator used in SinGAN with positional encoding.

More details can be found in: Positional Encoding as Spatial Inductive Bias in GANs, CVPR’2021.


Note

  • In this version, we use the interpolation function from the official PyTorch API, which differs from the authors' original implementation. In our experiments, the effect of this difference was negligible.

Parameters

  • in_channels (int) – Input channels.

  • out_channels (int) – Output channels.

  • num_scales (int) – The number of scales/stages in the generator. This number is counted from zero, as in the original paper.

  • kernel_size (int, optional) – Kernel size, same as nn.Conv2d. Defaults to 3.

  • padding (int, optional) – Padding for the convolutional layer, same as nn.Conv2d. Defaults to 0.

  • num_layers (int, optional) – The number of convolutional layers in each generator block. Defaults to 5.

  • base_channels (int, optional) – The basic channels for convolutional layers in the generator block. Defaults to 32.

  • min_feat_channels (int, optional) – Minimum channels for the feature maps in the generator block. Defaults to 32.

  • out_act_cfg (dict | None, optional) – Config for the output activation layer. Defaults to dict(type='Tanh').

  • padding_mode (str, optional) – The mode of convolutional padding, same as nn.Conv2d. Defaults to 'zero'.

  • pad_at_head (bool, optional) – Whether to add padding at head. Defaults to True.

  • interp_pad (bool, optional) – Whether padding is applied by interpolating the feature maps. Defaults to False.

  • noise_with_pad (bool, optional) – Whether the input fixed noises already include explicit padding. Defaults to False.

  • positional_encoding (dict | None, optional) – Config for the positional encoding. Defaults to None.

  • first_stage_in_channels (int | None, optional) – The number of input channels of the first generator block. If None, the first stage uses the same number of input channels as the other stages. Defaults to None.
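The parameters above can be collected into a config dictionary. The following is a minimal sketch, assuming the usual MMagic convention of selecting a registered class through a `type` key; the values for `in_channels`, `out_channels`, and `num_scales` are illustrative assumptions, and the remaining entries repeat the documented defaults.

```python
# Hypothetical config fragment; not taken from an official MMagic config file.
generator_cfg = dict(
    type='SinGANMSGeneratorPE',      # assumed registry name (the class name)
    in_channels=3,                   # assumption: RGB input
    out_channels=3,                  # assumption: RGB output
    num_scales=8,                    # assumption: illustrative, zero-based
    kernel_size=3,                   # documented default
    padding=0,                       # documented default
    num_layers=5,                    # documented default
    base_channels=32,                # documented default
    min_feat_channels=32,            # documented default
    out_act_cfg=dict(type='Tanh'),   # documented default
    padding_mode='zero',             # documented default
    pad_at_head=True,                # documented default
    interp_pad=False,                # documented default
    noise_with_pad=False,            # documented default
    positional_encoding=None,        # documented default: no positional encoding
    first_stage_in_channels=None,    # documented default: reuse in_channels
)

# Assumption: because num_scales is counted from zero, the generator covers
# scales 0..num_scales inclusive, i.e. num_scales + 1 stages.
num_stages = generator_cfg['num_scales'] + 1
print(num_stages)
```

Keeping the defaults explicit in the dictionary makes it easier to see which options an experiment actually changes.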

forward(input_sample, fixed_noises, noise_weights, rand_mode, curr_scale, num_batches=1, get_prev_res=False, return_noise=False)[source]

Forward function.

  • input_sample (Tensor | None) – The input for the generator. The original implementation uses a tensor filled with zeros. If None is given, the input is constructed from the first fixed noise.

  • fixed_noises (list[Tensor]) – List of the fixed noises in SinGAN.

  • noise_weights (list[float]) – List of the weights for random noises.

  • rand_mode (str) – One of 'rand' or 'recon'. In 'rand' mode, the generator samples from random noises. Otherwise, the reconstruction of the single image is returned.

  • curr_scale (int) – The scale for the current inference or training.

  • num_batches (int, optional) – The number of batches. Defaults to 1.

  • get_prev_res (bool, optional) – Whether to return results from previous stages. Defaults to False.

  • return_noise (bool, optional) – Whether to return the noise tensors. Defaults to False.
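The arguments listed above have to agree with one another before forward is called. The helper below is a hypothetical pre-flight check, not part of MMagic; it assumes that every scale from 0 to curr_scale needs one entry in fixed_noises and one in noise_weights, which this page does not state explicitly.

```python
def check_forward_args(fixed_noises, noise_weights, rand_mode, curr_scale):
    """Validate forward() arguments and return them as keyword arguments.

    Hypothetical helper for illustration only.
    """
    if rand_mode not in ('rand', 'recon'):
        raise ValueError(
            f"rand_mode must be 'rand' or 'recon', got {rand_mode!r}")
    if curr_scale < 0:
        raise ValueError('curr_scale must be non-negative')

    # Assumption: scales 0..curr_scale each consume one noise and one weight.
    needed = curr_scale + 1
    if len(fixed_noises) < needed:
        raise ValueError(
            f'need {needed} fixed noises, got {len(fixed_noises)}')
    if len(noise_weights) < needed:
        raise ValueError(
            f'need {needed} noise weights, got {len(noise_weights)}')

    return dict(
        fixed_noises=fixed_noises,
        noise_weights=noise_weights,
        rand_mode=rand_mode,
        curr_scale=curr_scale,
    )
```

The returned dictionary could then be unpacked into a call such as `generator(None, **kwargs)`, where `generator` is an already constructed instance.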


Returns

Generated image tensor, or a dictionary containing additional data.

Return type

Tensor | dict
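Because forward can return either a tensor or a dictionary, calling code usually has to handle both cases. The sketch below shows one way to normalize the result; the helper name is hypothetical, and the key 'fake_img' is an assumption about the dictionary layout that should be checked against the source.

```python
def split_generator_output(output, image_key='fake_img'):
    """Return (image, extras) for an output that is a tensor or a dict.

    Hypothetical helper; image_key='fake_img' is an assumed key name.
    """
    if isinstance(output, dict):
        extras = dict(output)          # shallow copy, leaves the input intact
        image = extras.pop(image_key)  # raises KeyError if the key is absent
        return image, extras
    # A bare tensor carries no extra data.
    return output, {}
```

With get_prev_res=False and return_noise=False the second element would simply be an empty dictionary.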
