mmagic.models.editors.stylegan3

Package Contents

Classes

StyleGAN3

Implementation of Alias-Free Generative Adversarial Networks.

StyleGAN3Generator

StyleGAN3 Generator.

SynthesisInput

Module which generates the input for the synthesis layer.

SynthesisLayer

Layer of Synthesis network for stylegan3.

SynthesisNetwork

Synthesis network for stylegan3.

class mmagic.models.editors.stylegan3.StyleGAN3(generator: ModelType, discriminator: Optional[ModelType] = None, data_preprocessor: Optional[Union[dict, mmengine.Config]] = None, generator_steps: int = 1, discriminator_steps: int = 1, forward_kwargs: Optional[Dict] = None, ema_config: Optional[Dict] = None, loss_config=dict())[source]

Bases: mmagic.models.editors.stylegan2.StyleGAN2

Implementation of Alias-Free Generative Adversarial Networks.

Paper link: https://nvlabs-fi-cdn.nvidia.com/stylegan3/stylegan3-paper.pdf

Detailed architecture can be found in StyleGAN3Generator and StyleGAN2Discriminator.

test_step(data: dict) mmagic.utils.typing.SampleList[source]

Gets the generated image of the given data. Same as val_step().

Parameters

data (dict) – Data sampled from metric specific sampler. More details in Metrics and Evaluator.

Returns

A list of DataSample containing the generated results.

Return type

SampleList

val_step(data: dict) mmagic.utils.typing.SampleList[source]

Gets the generated image of the given data. Same as test_step().

Parameters

data (dict) – Data sampled from metric specific sampler. More details in Metrics and Evaluator.

Returns

A list of DataSample containing the generated results.

Return type

SampleList

train_discriminator(inputs: dict, data_samples: mmagic.structures.DataSample, optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor][source]

Train discriminator.

Parameters
  • inputs (dict) – Inputs from dataloader.

  • data_samples (DataSample) – Data samples from dataloader.

  • optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.

Returns

A dict of tensors for logging.

Return type

Dict[str, Tensor]

train_generator(inputs: dict, data_samples: mmagic.structures.DataSample, optimizer_wrapper: mmengine.optim.OptimWrapper) Dict[str, torch.Tensor][source]

Train generator.

Parameters
  • inputs (dict) – Inputs from dataloader.

  • data_samples (DataSample) – Data samples from dataloader. Not used in the generator’s training.

  • optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.

Returns

A dict of tensors for logging.

Return type

Dict[str, Tensor]

sample_equivarience_pairs(batch_size, sample_mode='ema', eq_cfg=dict(compute_eqt_int=False, compute_eqt_frac=False, compute_eqr=False, translate_max=0.125, rotate_max=1), sample_kwargs=dict())[source]
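
The building blocks documented on this page can also be assembled directly in Python for quick experiments. The sketch below is a minimal, hedged example: it assumes mmagic (and the StyleGAN2/StyleGAN3 custom ops) is installed, that StyleGAN2Discriminator is importable from mmagic.models.editors.stylegan2, and that 'DataPreprocessor' is a valid registry name for the data preprocessor; the resolutions and channel sizes are illustrative, not a shipped config:

    from mmagic.models.editors.stylegan2 import StyleGAN2Discriminator
    from mmagic.models.editors.stylegan3 import StyleGAN3, StyleGAN3Generator

    # Build generator and discriminator as plain modules and hand them to the
    # GAN wrapper (the constructor also accepts config dicts).
    generator = StyleGAN3Generator(out_size=64, style_channels=512, img_channels=3)
    discriminator = StyleGAN2Discriminator(in_size=64)

    gan = StyleGAN3(
        generator=generator,
        discriminator=discriminator,
        # Registry name below is an assumption; verify against the shipped configs.
        data_preprocessor=dict(type='DataPreprocessor'))
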
class mmagic.models.editors.stylegan3.StyleGAN3Generator(out_size, style_channels, img_channels, noise_size=512, rgb2bgr=False, pretrained=None, synthesis_cfg=dict(type='SynthesisNetwork'), mapping_cfg=dict(type='MappingNetwork'))[source]

Bases: mmengine.model.BaseModule

StyleGAN3 Generator.

In StyleGAN3, we make several changes to the StyleGAN2 generator, including transformed Fourier features, filtered nonlinearities, and non-critical sampling. More details can be found in: Alias-Free Generative Adversarial Networks (NeurIPS 2021).

Ref: https://github.com/NVlabs/stylegan3

Parameters
  • out_size (int) – The output size of the StyleGAN3 generator.

  • style_channels (int) – The number of channels for style code.

  • img_channels (int) – The number of output channels.

  • noise_size (int, optional) – Size of the input noise vector. Defaults to 512.

  • rgb2bgr (bool, optional) – Whether to reformat the output channels to BGR order. Several of the provided pre-trained StyleGAN3 weights output channels in RGB order; you can set this argument to True to use those weights. Defaults to False.

  • pretrained (str | dict, optional) – Path to the pretrained model, or a dict containing information about the pretrained model whose required key is ‘ckpt_path’. You can also provide ‘prefix’ to load only the generator part from the whole state dict. Defaults to None.

  • synthesis_cfg (dict, optional) – Config for synthesis network. Defaults to dict(type=’SynthesisNetwork’).

  • mapping_cfg (dict, optional) – Config for mapping network. Defaults to dict(type=’MappingNetwork’).
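
As a quick illustration of the constructor arguments above, the following sketch builds a small generator directly (the values are illustrative and assume mmagic is installed; they are not taken from a shipped config):

    from mmagic.models.editors.stylegan3 import StyleGAN3Generator

    gen = StyleGAN3Generator(
        out_size=64,          # output resolution of generated images
        style_channels=512,   # dimensionality of the style code w
        img_channels=3,       # RGB output
        noise_size=512)       # dimensionality of the input noise z
    gen.eval()

    # Loading pre-trained weights would go through the documented `pretrained`
    # argument, e.g. pretrained=dict(ckpt_path=..., prefix=...); both values
    # are placeholders to be filled in.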

_load_pretrained_model(ckpt_path, prefix='', map_location='cpu', strict=True)[source]
forward(noise, num_batches=0, input_is_latent=False, truncation=1, num_truncation_layer=None, update_emas=False, force_fp32=True, return_noise=False, return_latents=False)[source]

Forward Function for stylegan3.

Parameters
  • noise (torch.Tensor | callable | None) – You can directly provide a batch of noise as a torch.Tensor, or pass a callable that samples a batch of noise. If None, the default noise sampler is used.

  • num_batches (int, optional) – The number of samples to generate (batch size). Defaults to 0.

  • input_is_latent (bool, optional) – If True, the input tensor is the latent tensor. Defaults to False.

  • truncation (float, optional) – Truncation factor. If a value less than 1. is given, the truncation trick is adopted. Defaults to 1.

  • num_truncation_layer (int, optional) – Number of layers that use the truncated latent. Defaults to None.

  • update_emas (bool, optional) – Whether to update the moving average of the mean latent. Defaults to False.

  • force_fp32 (bool, optional) – If True, force fp32 computation regardless of the weights’ dtype. Defaults to True.

  • return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.

  • return_latents (bool, optional) – If True, latent will be returned in a dict with fake_img. Defaults to False.

Returns

Generated image tensor or dictionary containing more data.

Return type

torch.Tensor | dict
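
A minimal sampling sketch, under the assumption that the StyleGAN3 ops can run on the current device (the upstream ops provide reference fallbacks, but that is an assumption here); the shapes in the comments are what the documented arguments suggest, not guaranteed values:

    import torch
    from mmagic.models.editors.stylegan3 import StyleGAN3Generator

    gen = StyleGAN3Generator(out_size=64, style_channels=512, img_channels=3).eval()

    with torch.no_grad():
        # None lets the default noise sampler draw `num_batches` latent codes.
        imgs = gen(None, num_batches=2)  # expected Tensor of shape (2, 3, 64, 64)

        # Requesting intermediate data returns a dict that also carries the
        # noise batch and latents alongside the fake image (see above).
        out = gen(None, num_batches=2, return_noise=True, return_latents=True)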

get_mean_latent(num_samples=4096, **kwargs)[source]

Get mean latent of W space in this generator.

Parameters

num_samples (int, optional) – Number of samples used to estimate the mean latent. Defaults to 4096.

Returns

Mean latent of this generator.

Return type

Tensor
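
A sketch of the truncation trick built on this method (the sample count and truncation value are illustrative):

    import torch
    from mmagic.models.editors.stylegan3 import StyleGAN3Generator

    gen = StyleGAN3Generator(out_size=64, style_channels=512, img_channels=3).eval()

    with torch.no_grad():
        w_mean = gen.get_mean_latent(num_samples=1024)   # mean latent of W space
        # With truncation < 1, the sampled latents are pulled towards the mean.
        imgs = gen(None, num_batches=2, truncation=0.7)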

get_training_kwargs(phase)[source]

Get training kwargs. In StyleGAN3, we enable fp16 and update the magnitude EMA during training of the discriminator. This function is used to pass the related arguments.

Parameters

phase (str) – Current training phase.

Returns

Training kwargs.

Return type

dict

class mmagic.models.editors.stylegan3.SynthesisInput(style_channels, channels, size, sampling_rate, bandwidth)[source]

Bases: mmengine.model.BaseModule

Module which generates the input for the synthesis layer.

Parameters
  • style_channels (int) – The number of channels for style code.

  • channels (int) – The number of output channels.

  • size (int) – The size of sampling grid.

  • sampling_rate (int) – Sampling rate for constructing the sampling grid.

  • bandwidth (float) – Bandwidth of random frequencies.

forward(w)[source]

Forward function.
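
A small usage sketch for this module; the argument values are illustrative and not taken from a shipped config:

    import torch
    from mmagic.models.editors.stylegan3 import SynthesisInput

    inp = SynthesisInput(
        style_channels=512,  # size of the style code w
        channels=128,        # number of output feature channels
        size=36,             # size of the sampling grid
        sampling_rate=16,    # sampling rate used to construct the grid
        bandwidth=2.0)       # bandwidth of the random frequencies

    w = torch.randn(2, 512)  # a batch of style codes
    feat = inp(w)            # spatial Fourier-feature map fed to the first layer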

class mmagic.models.editors.stylegan3.SynthesisLayer(style_channels, is_torgb, is_critically_sampled, use_fp16, in_channels, out_channels, in_size, out_size, in_sampling_rate, out_sampling_rate, in_cutoff, out_cutoff, in_half_width, out_half_width, conv_kernel=3, filter_size=6, lrelu_upsampling=2, use_radial_filters=False, conv_clamp=256, magnitude_ema_beta=0.999)[source]

Bases: mmengine.model.BaseModule

Layer of Synthesis network for stylegan3.

Parameters
  • style_channels (int) – The number of channels for style code.

  • is_torgb (bool) – Whether the output of this layer is transformed to an RGB image.

  • is_critically_sampled (bool) – Whether the filter cutoff is set exactly at the bandlimit.

  • use_fp16 (bool, optional) – Whether to use fp16 training in this module. If this flag is True, the whole module will be wrapped with auto_fp16.

  • in_channels (int) – The channel number of the input feature map.

  • out_channels (int) – The channel number of the output feature map.

  • in_size (int) – The input size of feature map.

  • out_size (int) – The output size of feature map.

  • in_sampling_rate (int) – Sampling rate for upsampling filter.

  • out_sampling_rate (int) – Sampling rate for downsampling filter.

  • in_cutoff (float) – Cutoff frequency for upsampling filter.

  • out_cutoff (float) – Cutoff frequency for downsampling filter.

  • in_half_width (float) – The approximate width of the transition region for upsampling filter.

  • out_half_width (float) – The approximate width of the transition region for downsampling filter.

  • conv_kernel (int, optional) – The kernel of modulated convolution. Defaults to 3.

  • filter_size (int, optional) – Base filter size. Defaults to 6.

  • lrelu_upsampling (int, optional) – Upsampling rate for filtered_lrelu. Defaults to 2.

  • use_radial_filters (bool, optional) – Whether to use a radially symmetric jinc-based filter for the downsampling filter. Defaults to False.

  • conv_clamp (int, optional) – Clamp bound for convolution. Defaults to 256.

  • magnitude_ema_beta (float, optional) – Beta coefficient for calculating input magnitude ema. Defaults to 0.999.

forward(x, w, force_fp32=False, update_emas=False)[source]

Forward function for synthesis layer.

Parameters
  • x (torch.Tensor) – Input feature map tensor.

  • w (torch.Tensor) – Input style tensor.

  • force_fp32 (bool, optional) – If True, force fp32 computation regardless of the weights’ dtype. Defaults to False.

  • update_emas (bool, optional) – Whether to update the moving average of the input magnitude. Defaults to False.

Returns

Output feature map tensor.

Return type

torch.Tensor

static design_lowpass_filter(numtaps, cutoff, width, fs, radial=False)[source]

Design a lowpass filter given the related arguments.

Parameters
  • numtaps (int) – Length of the filter. numtaps must be odd if a passband includes the Nyquist frequency.

  • cutoff (float) – Cutoff frequency of the filter.

  • width (float) – The approximate width of the transition region.

  • fs (float) – The sampling frequency of the signal.

  • radial (bool, optional) – Whether to use a radially symmetric jinc-based filter. Defaults to False.

Returns

Kernel of lowpass filter.

Return type

torch.Tensor
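
For the separable (non-radial) case, such a filter is conventionally a Kaiser-windowed FIR lowpass, which scipy.signal.firwin can produce from the same arguments. The sketch below follows that convention; it is not necessarily the exact implementation used here, and the radial, jinc-based variant is not covered:

    import scipy.signal
    import torch

    def lowpass_kernel_sketch(numtaps, cutoff, width, fs):
        # Kaiser-windowed FIR lowpass: `width` sets the transition band (and
        # thus the Kaiser beta); `fs` is the sampling frequency.
        taps = scipy.signal.firwin(numtaps=numtaps, cutoff=cutoff, width=width, fs=fs)
        return torch.as_tensor(taps, dtype=torch.float32)

    # Example: a 13-tap filter with an 8 Hz cutoff and a 4 Hz transition band
    # at a sampling frequency of 32 Hz (numtaps is odd, as required above).
    kernel = lowpass_kernel_sketch(numtaps=13, cutoff=8.0, width=4.0, fs=32.0)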

class mmagic.models.editors.stylegan3.SynthesisNetwork(style_channels, out_size, img_channels, channel_base=32768, channel_max=512, num_layers=14, num_critical=2, first_cutoff=2, first_stopband=2 ** 2.1, last_stopband_rel=2 ** 0.3, margin_size=10, output_scale=0.25, num_fp16_res=4, **layer_kwargs)[source]

Bases: mmengine.model.BaseModule

Synthesis network for stylegan3.

Parameters
  • style_channels (int) – The number of channels for style code.

  • out_size (int) – The resolution of output image.

  • img_channels (int) – The number of channels for output image.

  • channel_base (int, optional) – Overall multiplier for the number of channels. Defaults to 32768.

  • channel_max (int, optional) – Maximum number of channels in any layer. Defaults to 512.

  • num_layers (int, optional) – Total number of layers, excluding Fourier features and ToRGB. Defaults to 14.

  • num_critical (int, optional) – Number of critically sampled layers at the end. Defaults to 2.

  • first_cutoff (int, optional) – Cutoff frequency of the first layer. Defaults to 2.

  • first_stopband (int, optional) – Minimum stopband of the first layer. Defaults to 2**2.1.

  • last_stopband_rel (float, optional) – Minimum stopband of the last layer, expressed relative to the cutoff. Defaults to 2**0.3.

  • margin_size (int, optional) – Number of additional pixels outside the image. Defaults to 10.

  • output_scale (float, optional) – Scale factor for output value. Defaults to 0.25.

  • num_fp16_res (int, optional) – Number of initial layers that use fp16. Defaults to 4.

forward(ws, **layer_kwargs)[source]

Forward function.
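
A usage sketch for the synthesis network alone. The number of per-layer style vectors is an assumption here (num_layers + 2 in the upstream StyleGAN3 code, i.e. 16 with the defaults above); check the attribute exposed by the built network in your mmagic version:

    import torch
    from mmagic.models.editors.stylegan3 import SynthesisNetwork

    synth = SynthesisNetwork(style_channels=512, out_size=64, img_channels=3).eval()

    with torch.no_grad():
        ws = torch.randn(1, 16, 512)  # (batch, per-layer styles, style_channels)
        img = synth(ws)               # expected image tensor of shape (1, 3, 64, 64)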
