mmagic.models.editors.stylegan1
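Package Contents
Classes
StyleGAN1 | Implementation of A Style-Based Generator Architecture for Generative Adversarial Networks (StyleGANv1).
StyleGAN1Discriminator | StyleGAN1 Discriminator.
StyleGAN1Generator | StyleGAN1 Generator.
Blur | Blur module.
ConstantInput | Constant Input.
EqualLinearActModule | Equalized LR Linear Module with Activation Layer.
NoiseInjection | Noise Injection Module.
Functions
get_mean_latent(generator, num_samples=4096, bs_per_repeat=1024) | Get mean latent of W space in Style-based GANs.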
- class mmagic.models.editors.stylegan1.StyleGAN1(generator: ModelType, discriminator: Optional[ModelType] = None, data_preprocessor: Optional[Union[dict, mmengine.Config]] = None, style_channels: int = 512, nkimgs_per_scale: dict = {}, interp_real: Optional[dict] = None, transition_kimgs: int = 600, prev_stage: int = 0, ema_config: Optional[Dict] = None)
Bases:
mmagic.models.editors.pggan.ProgressiveGrowingGAN
Implementation of A Style-Based Generator Architecture for Generative Adversarial Networks (StyleGANv1): https://openaccess.thecvf.com/content_CVPR_2019/html/Karras_A_Style-Based_Generator_Architecture_for_Generative_Adversarial_Networks_CVPR_2019_paper.html
This class inherits from ProgressiveGrowingGAN to support progressive training. The detailed architecture can be found in StyleGAN1Generator and StyleGAN1Discriminator.
- Parameters
generator (ModelType) – The config or model of the generator.
discriminator (Optional[ModelType]) – The config or model of the discriminator. Defaults to None.
data_preprocessor (Optional[Union[dict, Config]]) – The pre-processing config or DataPreprocessor. Defaults to None.
style_channels (int) – The number of channels for the style code. Defaults to 512.
nkimgs_per_scale (dict) – The number of images needed for training at each resolution. Defaults to {}.
interp_real (dict, optional) – The config of the interpolation method for real images. If not passed, bilinear interpolation with align_corners will be used. Defaults to None.
transition_kimgs (int, optional) – The number of images used to transition from the previous torgb layer to the newer torgb layer. Defaults to 600.
prev_stage (int, optional) – The resolution of the previous stage. Used for resuming training. Defaults to 0.
ema_config (Optional[Dict]) – The config for generator’s exponential moving average setting. Defaults to None.
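The constructor arguments above are usually supplied as a config dict. The following is a minimal sketch only: the `type` strings, the resolution keys of `nkimgs_per_scale`, and all numeric values are assumptions chosen for illustration, not settings taken from this page.

```python
# Hypothetical config sketch; the sizes and schedule values are illustrative.
model = dict(
    type='StyleGAN1',
    generator=dict(
        type='StyleGAN1Generator', out_size=256, style_channels=512),
    discriminator=dict(type='StyleGAN1Discriminator', in_size=256),
    # Should match the generator's style_channels.
    style_channels=512,
    # Assumed format: resolution -> training budget for that resolution.
    nkimgs_per_scale={'8': 1200, '16': 1200, '32': 1200},
    transition_kimgs=600,
    prev_stage=0,
)

# Simple consistency checks one might run before building the model.
assert model['generator']['style_channels'] == model['style_channels']
assert model['generator']['out_size'] == model['discriminator']['in_size']
```

Keeping `out_size` and `in_size` equal means the discriminator sees images at the resolution the generator produces at its final scale.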
- disc_loss(disc_pred_fake: torch.Tensor, disc_pred_real: torch.Tensor, fake_data: torch.Tensor, real_data: torch.Tensor) -> Tuple[torch.Tensor, dict]
Get the discriminator loss. StyleGANv1 uses the non-saturating GAN loss and the R1 gradient penalty loss to train the discriminator.
- Parameters
disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.
disc_pred_real (Tensor) – Discriminator’s prediction of the real images.
fake_data (Tensor) – Generated images, used to calculate gradient penalty.
real_data (Tensor) – Real images, used to calculate gradient penalty.
- Returns
Loss value and a dict of log variables.
- Return type
Tuple[Tensor, dict]
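To make the two loss terms concrete, here is a minimal pure-Python sketch. It assumes the common softplus formulation of the non-saturating loss and, to avoid needing autograd, a toy linear discriminator D(x) = w·x whose input gradient is simply w. The function names and the toy discriminator are hypothetical and are not part of mmagic.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def nonsaturating_disc_loss(pred_fake, pred_real):
    # Penalize high scores on fake images and low scores on real images.
    fake_term = sum(softplus(p) for p in pred_fake) / len(pred_fake)
    real_term = sum(softplus(-p) for p in pred_real) / len(pred_real)
    return fake_term + real_term


def r1_penalty_linear(weights):
    # R1 penalizes the squared gradient norm of D at real images.
    # For the toy discriminator D(x) = w . x the gradient w.r.t. x is w,
    # so the penalty is the squared norm of w for every real sample.
    return sum(w * w for w in weights)


loss = nonsaturating_disc_loss([0.0], [0.0])  # equals 2 * log(2)
penalty = r1_penalty_linear([3.0, 4.0])       # equals 25.0
```

In a real training step the penalty is computed from the gradient of the discriminator output with respect to `real_data` and scaled by a weight before being added to the loss.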
- gen_loss(disc_pred_fake: torch.Tensor) -> Tuple[torch.Tensor, dict]
Generator loss for PGGAN. PGGAN uses WGAN’s loss to train the generator.
- Parameters
disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.
- Returns
Loss value and a dict of log variables.
- Return type
Tuple[Tensor, dict]
- class mmagic.models.editors.stylegan1.StyleGAN1Discriminator(in_size, blur_kernel=[1, 2, 1], mbstd_cfg=dict(group_size=4))
Bases:
mmengine.model.BaseModule
StyleGAN1 Discriminator.
The architecture of this discriminator is proposed in StyleGAN1. More details can be found in: A Style-Based Generator Architecture for Generative Adversarial Networks CVPR2019.
- Parameters
in_size (int) – The input size of images.
blur_kernel (list, optional) – The blur kernel. Defaults to [1, 2, 1].
mbstd_cfg (dict, optional) – Configs for minibatch-stddev layer. Defaults to dict(group_size=4).
- forward(input, transition_weight=1.0, curr_scale=-1)
Forward function.
- Parameters
input (torch.Tensor) – Input image tensor.
transition_weight (float, optional) – The weight used in resolution transition. Defaults to 1.0.
curr_scale (int, optional) – The resolution scale of image tensor. -1 means the max resolution scale of the StyleGAN1. Defaults to -1.
- Returns
Predict score for the input image.
- Return type
torch.Tensor
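The role of `transition_weight` during progressive growing can be sketched as a linear fade between the path of the previous resolution and the newly added path. This is an assumption about how the weight is applied, written with plain lists instead of tensors; `fade_in` is a hypothetical helper, not an mmagic function.

```python
def fade_in(prev_path, new_path, transition_weight=1.0):
    """Linearly blend two equally sized feature lists.

    A weight of 0 uses only the previous-resolution path and a weight
    of 1 uses only the newly added path.
    """
    if not 0.0 <= transition_weight <= 1.0:
        raise ValueError('transition_weight must lie in [0, 1]')
    return [(1.0 - transition_weight) * p + transition_weight * n
            for p, n in zip(prev_path, new_path)]


fully_new = fade_in([0.0, 2.0], [4.0, 6.0], transition_weight=1.0)
partly_new = fade_in([0.0, 2.0], [4.0, 6.0], transition_weight=0.25)
```

With the default weight of 1.0 the previous path contributes nothing, which matches a network that has finished its transition.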
- class mmagic.models.editors.stylegan1.StyleGAN1Generator(out_size, style_channels, num_mlps=8, blur_kernel=[1, 2, 1], lr_mlp=0.01, default_style_mode='mix', eval_style_mode='single', mix_prob=0.9)
Bases:
mmengine.model.BaseModule
StyleGAN1 Generator.
In StyleGAN1, we use a progressive growing architecture composed of a style mapping module and a number of convolutional style blocks. More details can be found in: A Style-Based Generator Architecture for Generative Adversarial Networks CVPR2019.
- Parameters
out_size (int) – The output size of the StyleGAN1 generator.
style_channels (int) – The number of channels for style code.
num_mlps (int, optional) – The number of MLP layers. Defaults to 8.
blur_kernel (list, optional) – The blur kernel. Defaults to [1, 2, 1].
lr_mlp (float, optional) – The learning rate for the style mapping layer. Defaults to 0.01.
default_style_mode (str, optional) – The default mode of style mixing. In training, we adopt the mixing style mode by default. However, in evaluation, we use the ‘single’ style mode. [‘mix’, ‘single’] are currently supported. Defaults to ‘mix’.
eval_style_mode (str, optional) – The evaluation mode of style mixing. Defaults to ‘single’.
mix_prob (float, optional) – Mixing probability. The value should be in range of [0, 1]. Defaults to 0.9.
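How `default_style_mode`, `eval_style_mode`, and `mix_prob` might interact can be sketched as follows. This is a simplified illustration under the assumption that mixing is drawn per forward pass with probability `mix_prob`; `pick_style_mode` and `mix_latents` are hypothetical helpers, not mmagic functions.

```python
import random


def pick_style_mode(training, rng, default_style_mode='mix',
                    eval_style_mode='single', mix_prob=0.9):
    # Use the default mode in training and the evaluation mode otherwise.
    mode = default_style_mode if training else eval_style_mode
    # In 'mix' mode, two style codes are used with probability mix_prob.
    if mode == 'mix' and rng.random() < mix_prob:
        return 'mix'
    return 'single'


def mix_latents(w_a, w_b, num_layers, inject_index):
    # Layers before inject_index use the first code, the rest the second.
    return [w_a if i < inject_index else w_b for i in range(num_layers)]


rng = random.Random(0)
mode_eval = pick_style_mode(training=False, rng=rng)
per_layer = mix_latents('w_a', 'w_b', num_layers=4, inject_index=1)
```

Setting `mix_prob=0.0` disables mixing even in 'mix' mode, and `mix_prob=1.0` always mixes because `rng.random()` returns values in [0, 1).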
- train(mode=True)
Sets the module in training mode.
This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
- Parameters
mode (bool) – Whether to set training mode (True) or evaluation mode (False). Default: True.
- Returns
self
- Return type
Module
- make_injected_noise()
Make noises that will be injected into feature maps.
- Returns
List of layer-wise noise tensors.
- Return type
list[Tensor]
- get_mean_latent(num_samples=4096, **kwargs)
Get mean latent of W space in this generator.
- Parameters
num_samples (int, optional) – Number of sample times. Defaults to 4096.
- Returns
Mean latent of this generator.
- Return type
Tensor
- style_mixing(n_source, n_target, inject_index=1, truncation_latent=None, truncation=0.7, curr_scale=-1, transition_weight=1)
- forward(styles, num_batches=-1, return_noise=False, return_latents=False, inject_index=None, truncation=1, truncation_latent=None, input_is_latent=False, injected_noise=None, randomize_noise=True, transition_weight=1.0, curr_scale=-1)
Forward function.
This function has been integrated with the truncation trick. Please refer to the usage of truncation and truncation_latent.
- Parameters
styles (torch.Tensor | list[torch.Tensor] | callable | None) – In StyleGAN1, you can provide a noise tensor or a latent tensor. Given a list containing more than one noise or latent tensor, the style mixing trick will be used in training. You can also directly give a batch of noise through a torch.Tensor or offer a callable function to sample a batch of noise data. Otherwise, None indicates that the default noise sampler is used.
num_batches (int, optional) – The batch size. Defaults to -1.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.
return_latents (bool, optional) – If True, latent will be returned in a dict with fake_img. Defaults to False.
inject_index (int | None, optional) – The index number for mixing style codes. Defaults to None.
truncation (float, optional) – Truncation factor. Given a value less than 1, the truncation trick will be adopted. Defaults to 1.
truncation_latent (torch.Tensor, optional) – Mean truncation latent. Defaults to None.
input_is_latent (bool, optional) – If True, the input tensor is the latent tensor. Defaults to False.
injected_noise (torch.Tensor | None, optional) – Given a tensor, the random noise will be fixed as this input injected noise. Defaults to None.
randomize_noise (bool, optional) – If False, images are sampled with the buffered noise tensor injected to the style conv block. Defaults to True.
transition_weight (float, optional) – The weight used in resolution transition. Defaults to 1.0.
curr_scale (int, optional) – The resolution scale of generated image tensor. -1 means the max resolution scale of the StyleGAN1. Defaults to -1.
- Returns
Generated image tensor or dictionary containing more data.
- Return type
torch.Tensor | dict
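The truncation trick referred to by `truncation` and `truncation_latent` pulls each latent code toward a mean latent. The sketch below shows the standard interpolation on plain lists; `truncate` is a hypothetical helper, and the real forward pass operates on tensors in W space.

```python
def truncate(latent, mean_latent, truncation=1.0):
    # Move the latent toward the mean: w' = w_mean + t * (w - w_mean).
    # t = 1 leaves the latent unchanged and t = 0 collapses it to the mean.
    return [m + truncation * (w - m) for w, m in zip(latent, mean_latent)]


unchanged = truncate([2.0, -2.0], [0.0, 0.0], truncation=1.0)
halfway = truncate([2.0, -2.0], [0.0, 0.0], truncation=0.5)
```

The mean latent passed as `truncation_latent` can be estimated with `get_mean_latent`.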
- class mmagic.models.editors.stylegan1.Blur(kernel, pad, upsample_factor=1)
Bases:
mmengine.model.BaseModule
Blur module.
This module is applied right after the upsampling operation in StyleGAN2.
- Parameters
kernel (Array) – Blur kernel/filter used in UpFIRDn.
pad (list[int]) – Padding for features.
upsample_factor (int, optional) – Upsampling factor. Defaults to 1.
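A 1D kernel such as [1, 2, 1] is typically expanded into a normalized 2D filter by taking its outer product with itself. The sketch below illustrates that step only and assumes this convention applies here; `make_kernel` is a hypothetical helper, and the actual module may apply further scaling tied to `upsample_factor`.

```python
def make_kernel(k):
    # Outer product of the 1D kernel with itself, normalized to sum to 1.
    total = sum(k) ** 2
    return [[a * b / total for b in k] for a in k]


kernel_2d = make_kernel([1, 2, 1])
# The center tap carries the largest weight (4 / 16).
center = kernel_2d[1][1]
```

Normalizing the kernel keeps the mean intensity of the feature map unchanged after blurring.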
- class mmagic.models.editors.stylegan1.ConstantInput(channel, size=4)
Bases:
mmengine.model.BaseModule
Constant Input.
In StyleGAN2, they substitute the original head noise input with such a constant input module.
- Parameters
channel (int) – Channels for the constant input tensor.
size (int, optional) – Spatial size for the constant input. Defaults to 4.
- class mmagic.models.editors.stylegan1.EqualLinearActModule(*args, equalized_lr_cfg=dict(gain=1.0, lr_mul=1.0), bias=True, bias_init=0.0, act_cfg=None, **kwargs)
Bases:
mmengine.model.BaseModule
Equalized LR Linear Module with Activation Layer.
This module is modified from EqualizedLRLinearModule defined in PGGAN. The major feature added in this module is support for the activation layers used in StyleGAN2.
- Parameters
equalized_lr_cfg (dict | None, optional) – Config for equalized lr. Defaults to dict(gain=1., lr_mul=1.).
bias (bool, optional) – Whether to use bias item. Defaults to True.
bias_init (float, optional) – The value for bias initialization. Defaults to 0.
act_cfg (dict | None, optional) – Config for activation layer. Defaults to None.
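The idea behind equalized learning rate is to rescale the weights at run time by a constant that depends on the fan-in, rather than baking the scale into initialization. The sketch below assumes the scale is `gain * lr_mul / sqrt(fan_in)` and that the bias is multiplied by `lr_mul`; both are assumptions about this module, and the helper names are hypothetical.

```python
import math


def equalized_lr_scale(fan_in, gain=1.0, lr_mul=1.0):
    # Runtime scaling constant applied to the stored weights.
    return gain * lr_mul / math.sqrt(fan_in)


def equal_linear(x, weight, bias, gain=1.0, lr_mul=1.0):
    # Linear layer on plain lists: y = (scale * W) x + lr_mul * b.
    scale = equalized_lr_scale(len(x), gain, lr_mul)
    return [sum(w * scale * xi for w, xi in zip(row, x)) + b * lr_mul
            for row, b in zip(weight, bias)]


scale = equalized_lr_scale(fan_in=4)
y = equal_linear([1.0, 1.0, 1.0, 1.0], [[1.0, 1.0, 1.0, 1.0]], [0.0])
```

A small `lr_mul`, such as the generator's `lr_mlp=0.01`, shrinks the effective step size of the layer without changing the optimizer's learning rate.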
- class mmagic.models.editors.stylegan1.NoiseInjection(noise_weight_init=0.0, fixed_noise=False)
Bases:
mmengine.model.BaseModule
Noise Injection Module.
In StyleGAN2, they adopt this module to inject spatial random noise map in the generators.
- Parameters
noise_weight_init (float, optional) – Initialization weight for noise injection. Defaults to 0.
fixed_noise (bool, optional) – Whether to inject a fixed noise. Defaults to False.
- forward(image, noise=None, return_noise=False)
Forward Function.
- Parameters
image (Tensor) – Spatial features with a shape of (N, C, H, W).
noise (Tensor, optional) – Noises from the outside. Defaults to None.
return_noise (bool, optional) – Whether to return noise tensor. Defaults to False.
- Returns
Output features.
- Return type
Tensor
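Conceptually, noise injection adds a learned multiple of a random noise map to the features. The sketch below flattens the feature map to a plain list and assumes a single scalar weight; in the real module the noise is a tensor that is presumably broadcast over the (N, C, H, W) features. `inject_noise` is a hypothetical helper.

```python
import random


def inject_noise(image, weight, noise=None, rng=None):
    # Sample Gaussian noise when none is supplied from the outside.
    if noise is None:
        rng = rng or random.Random()
        noise = [rng.gauss(0.0, 1.0) for _ in image]
    # Add the weighted noise element-wise and also return the noise used.
    return [x + weight * n for x, n in zip(image, noise)], noise


# With a zero weight (the default initialization) features pass through.
out_init, _ = inject_noise([1.0, 2.0], weight=0.0)
out_fixed, used = inject_noise([1.0, 2.0], weight=0.5, noise=[2.0, -2.0])
```

Passing the same `noise` on every call reproduces the same stochastic detail, which is the purpose of supplying noise from the outside.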
- mmagic.models.editors.stylegan1.get_mean_latent(generator, num_samples=4096, bs_per_repeat=1024)
Get mean latent of W space in Style-based GANs.
- Parameters
generator (BaseModule) – Generator of a Style-based GAN.
num_samples (int, optional) – Number of sample times. Defaults to 4096.
bs_per_repeat (int, optional) – Batch size of noises per sample. Defaults to 1024.
- Returns
Mean latent of this generator.
- Return type
Tensor
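The batched averaging implied by `num_samples` and `bs_per_repeat` can be sketched as follows. This is an illustration on plain lists under the assumption that `num_samples` is split into equal batches; `mean_latent`, `style_mapping`, and `sample_noise` are hypothetical stand-ins for the generator's mapping network and noise sampler.

```python
def mean_latent(style_mapping, sample_noise, num_samples=4096,
                bs_per_repeat=1024):
    # Process the samples in equal batches to bound memory use.
    assert num_samples % bs_per_repeat == 0
    n_repeat = num_samples // bs_per_repeat
    total = None
    for _ in range(n_repeat):
        # Map each noise vector to W space and average over the batch.
        batch = [style_mapping(z) for z in sample_noise(bs_per_repeat)]
        batch_mean = [sum(col) / len(batch) for col in zip(*batch)]
        if total is None:
            total = batch_mean
        else:
            total = [t + b for t, b in zip(total, batch_mean)]
    # Average the per-batch means.
    return [t / n_repeat for t in total]


# Toy mapping and sampler, used only to exercise the helper.
result = mean_latent(lambda z: [2 * v for v in z],
                     lambda n: [[1.0, 3.0]] * n,
                     num_samples=8, bs_per_repeat=4)
```

With the defaults, 4096 samples are processed as four batches of 1024.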