mmagic.models.editors.stylegan2.stylegan2_generator
Module Contents
Classes
StyleGAN2Generator: StyleGAN2 Generator.
- class mmagic.models.editors.stylegan2.stylegan2_generator.StyleGAN2Generator(out_size, style_channels, out_channels=3, noise_size=None, cond_size=None, cond_mapping_channels=None, num_mlps=8, channel_multiplier=2, blur_kernel=[1, 3, 3, 1], lr_mlp=0.01, default_style_mode='mix', eval_style_mode='single', norm_eps=1e-06, mix_prob=0.9, update_mean_latent_with_ema=False, w_avg_beta=0.998, num_fp16_scales=0, fp16_enabled=False, bgr2rgb=False, pretrained=None, fixed_noise=False)
Bases: mmengine.model.BaseModule
StyleGAN2 Generator.
In StyleGAN2, we use a static architecture composed of a style mapping module and a number of convolutional style blocks. More details can be found in: Analyzing and Improving the Image Quality of StyleGAN (CVPR 2020).
You can load a pretrained model by passing information to the pretrained argument. The following official weights are available:
stylegan2-ffhq-config-f: https://download.openmmlab.com/mmediting/stylegan2/official_weights/stylegan2-ffhq-config-f-official_20210327_171224-bce9310c.pth
stylegan2-horse-config-f: https://download.openmmlab.com/mmediting/stylegan2/official_weights/stylegan2-horse-config-f-official_20210327_173203-ef3e69ca.pth
stylegan2-car-config-f: https://download.openmmlab.com/mmediting/stylegan2/official_weights/stylegan2-car-config-f-official_20210327_172340-8cfe053c.pth
stylegan2-cat-config-f: https://download.openmmlab.com/mmediting/stylegan2/official_weights/stylegan2-cat-config-f-official_20210327_172444-15bc485b.pth
stylegan2-church-config-f: https://download.openmmlab.com/mmediting/stylegan2/official_weights/stylegan2-church-config-f-official_20210327_172657-1d42b7d1.pth
To load the EMA model, you can use the following code:
    # ckpt_http is one of the valid HTTP paths listed above
    generator = StyleGAN2Generator(
        1024, 512,
        pretrained=dict(ckpt_path=ckpt_http, prefix='generator_ema'))
You can also download the checkpoint in advance and set ckpt_path to the local path. To load the original generator (not the EMA model), set the prefix to 'generator'.
Note that our implementation can generate BGR images, while the original StyleGAN2 outputs RGB images by default. The bgr2rgb argument is therefore provided to convert the channel order.
- Parameters
out_size (int) – The output size of the StyleGAN2 generator.
style_channels (int) – The number of channels for style code.
out_channels (int) – The number of channels for output. Defaults to 3.
noise_size (int, optional) – The size (number of channels) of the input noise. If not passed, it is set to the same value as style_channels. Defaults to None.
cond_size (int, optional) – The size of the conditional input. If not passed or less than 1, no conditional embedding will be used. Defaults to None.
cond_mapping_channels (int, optional) – The channels of the conditional mapping layers. If not passed, the value of style_channels is used. Defaults to None.
num_mlps (int, optional) – The number of MLP layers. Defaults to 8.
channel_multiplier (int, optional) – The multiplier factor for the channel number. Defaults to 2.
blur_kernel (list, optional) – The blur kernel. Defaults to [1, 3, 3, 1].
lr_mlp (float, optional) – The learning rate for the style mapping layer. Defaults to 0.01.
default_style_mode (str, optional) – The default mode of style mixing. In training, the ‘mix’ style mode is used by default, while in evaluation the ‘single’ style mode is used. [‘mix’, ‘single’] are currently supported. Defaults to ‘mix’.
eval_style_mode (str, optional) – The evaluation mode of style mixing. Defaults to ‘single’.
mix_prob (float, optional) – Mixing probability. The value should be in the range [0, 1]. Defaults to 0.9.
update_mean_latent_with_ema (bool, optional) – Whether to update the mean latent code (w) with EMA. Defaults to False.
w_avg_beta (float, optional) – The EMA coefficient used to update w_avg. Defaults to 0.998.
num_fp16_scales (int, optional) – The number of resolutions that use auto fp16 training. Unlike fp16_enabled, this argument lets users adopt FP16 training in only some of the blocks. This behaviour is closer to the official implementation by Tero. Defaults to 0.
fp16_enabled (bool, optional) – Whether to use fp16 training in this module. If this flag is True, the whole module will be wrapped with auto_fp16. Defaults to False.
pretrained (dict | None, optional) – Information for pretrained models. The necessary key is ‘ckpt_path’. You can also provide ‘prefix’ to load the generator part from the whole state dict. Defaults to None.
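The role of ‘prefix’ can be pictured with a small sketch. It is an illustration only, not the loader used by this class: select_by_prefix is a hypothetical helper, and the toy checkpoint uses placeholder strings instead of weight tensors. It assumes that loading with a prefix keeps the entries under that prefix and strips it from their keys.

```python
def select_by_prefix(state_dict, prefix):
    """Keep entries stored under `prefix` and strip the prefix from their keys.

    Hypothetical helper for illustration; the real loading code may differ.
    """
    marker = prefix + "."
    return {
        key[len(marker):]: value
        for key, value in state_dict.items()
        if key.startswith(marker)
    }


# Toy checkpoint: placeholder strings stand in for weight tensors.
checkpoint = {
    "generator.conv1.weight": "raw-weights",
    "generator_ema.conv1.weight": "ema-weights",
    "discriminator.conv1.weight": "disc-weights",
}

ema_weights = select_by_prefix(checkpoint, "generator_ema")
raw_weights = select_by_prefix(checkpoint, "generator")
```

Matching on the prefix followed by a dot keeps ‘generator’ from also picking up the ‘generator_ema’ entries.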
- train(mode=True)
Sets the module in training mode.
This only affects certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
- Parameters
mode (bool) – Whether to set training mode (True) or evaluation mode (False). Default: True.
- Returns
self
- Return type
Module
- make_injected_noise()
Make the noise tensors that will be injected into feature maps.
- Returns
List of layer-wise noise tensors.
- Return type
list[Tensor]
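As a rough sketch of what "layer-wise" means here, the snippet below lists one plausible set of noise shapes. The layout is an assumption modelled on common PyTorch StyleGAN2 implementations (one 4x4 map, then two maps per resolution up to out_size); it is not confirmed by this page, and injected_noise_shapes is a hypothetical helper.

```python
import math


def injected_noise_shapes(out_size):
    """Return assumed (N, C, H, W) shapes of the per-layer noise maps."""
    log_size = int(math.log2(out_size))
    # Assumption: a single noise map at the 4x4 base resolution ...
    shapes = [(1, 1, 4, 4)]
    # ... followed by two noise maps for each resolution from 8x8 to out_size.
    for i in range(3, log_size + 1):
        shapes.extend([(1, 1, 2 ** i, 2 ** i)] * 2)
    return shapes


shapes = injected_noise_shapes(1024)
```

Under this assumption a 1024-pixel generator has 17 noise maps, one per style convolution.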
- get_mean_latent(num_samples=4096, **kwargs)
Get mean latent of W space in this generator.
- Parameters
num_samples (int, optional) – Number of noise samples to average over. Defaults to 4096.
- Returns
Mean latent of this generator.
- Return type
Tensor
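Conceptually, the mean latent is a Monte Carlo estimate: sample noise, map it to W space, and average. The sketch below shows that idea in plain Python. toy_mapping is a hypothetical stand-in for the style mapping network, and estimate_mean_latent is not the method's actual implementation.

```python
import random


def toy_mapping(z):
    # Hypothetical stand-in for the style mapping network (z -> w).
    return [2.0 * v + 1.0 for v in z]


def estimate_mean_latent(mapping, noise_size, num_samples=4096, seed=0):
    """Average mapping(z) over num_samples standard-normal noise vectors."""
    rng = random.Random(seed)
    total = [0.0] * noise_size
    for _ in range(num_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(noise_size)]
        w = mapping(z)
        total = [t + v for t, v in zip(total, w)]
    return [t / num_samples for t in total]


mean_w = estimate_mean_latent(toy_mapping, noise_size=4)
```

With the toy mapping, every entry of mean_w lands close to 1.0, the mapping's offset. More samples tighten the estimate.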
- forward(styles, label=None, num_batches=-1, return_noise=False, return_latents=False, inject_index=None, truncation=1, truncation_latent=None, input_is_latent=False, injected_noise=None, add_noise=True, randomize_noise=True, update_ws=False, return_features=False, feat_idx=5, return_latent_only=False)
Forward function.
This function has been integrated with the truncation trick. Please refer to the usage of truncation and truncation_latent.
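The truncation trick pulls a latent code toward the mean latent: w' = w_mean + truncation * (w - w_mean). A minimal sketch on plain Python lists, where truncate is a hypothetical helper rather than part of this API:

```python
def truncate(latent, mean_latent, truncation):
    """Interpolate from the mean latent toward `latent` by `truncation`."""
    return [m + truncation * (w - m) for w, m in zip(latent, mean_latent)]


w = [2.0, -1.0, 0.5]
w_mean = [0.0, 1.0, 0.5]

truncated = truncate(w, w_mean, 0.5)   # halfway between w_mean and w
unchanged = truncate(w, w_mean, 1.0)   # truncation=1 leaves w as it is
collapsed = truncate(w, w_mean, 0.0)   # truncation=0 returns w_mean
```

Values below 1 trade sample diversity for fidelity. The mean latent passed as truncation_latent can be obtained from get_mean_latent.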
- Parameters
styles (torch.Tensor | list[torch.Tensor] | callable | None) – In StyleGAN2, you can provide a noise tensor or a latent tensor. Given a list containing more than one noise or latent tensor, the style mixing trick will be used in training. You can also pass a batch of noise directly as a torch.Tensor, or offer a callable function to sample a batch of noise data. Otherwise, None indicates that the default noise sampler is used.
label (torch.Tensor, optional) – Conditional inputs for the generator. Defaults to None.
num_batches (int, optional) – The batch size. Defaults to -1.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.
return_latents (bool, optional) – If True, latent will be returned in a dict with fake_img. Defaults to False.
inject_index (int | None, optional) – The index number for mixing style codes. Defaults to None.
truncation (float, optional) – Truncation factor. If the value is less than 1, the truncation trick will be adopted. Defaults to 1.
truncation_latent (torch.Tensor, optional) – Mean truncation latent. Defaults to None.
input_is_latent (bool, optional) – If True, the input tensor is the latent tensor. Defaults to False.
injected_noise (torch.Tensor | None, optional) – Given a tensor, the random noise will be fixed as this input injected noise. Defaults to None.
add_noise (bool) – Whether to apply noise injection. Defaults to True.
randomize_noise (bool, optional) – If False, images are sampled with the buffered noise tensor injected to the style conv block. Defaults to True.
update_ws (bool) – Whether to update the latent code with EMA. Only works when w_avg is registered. Defaults to False.
- Returns
Generated image tensor or dictionary containing more data.
- Return type
torch.Tensor | dict
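How inject_index splits two style codes across layers can be sketched as follows. This is an assumption based on common StyleGAN2 implementations, not a statement of this method's internals: the generator is taken to use 2 * log2(out_size) - 2 per-layer latents, with the first inject_index layers taking the first code and the rest taking the second. num_latents and mix_styles are hypothetical helpers, and strings stand in for latent tensors.

```python
import math
import random


def num_latents(out_size):
    # Assumption: two style inputs per resolution, minus two at the base.
    return int(math.log2(out_size)) * 2 - 2


def mix_styles(w1, w2, out_size, inject_index=None, rng=random):
    """Return the per-layer style codes used for style mixing."""
    n = num_latents(out_size)
    if inject_index is None:
        # Assumption: a crossover point is sampled when none is given.
        inject_index = rng.randint(1, n - 1)
    return [w1] * inject_index + [w2] * (n - inject_index)


per_layer = mix_styles("A", "B", out_size=1024, inject_index=4)
```

Here the four coarsest layers use code "A" and the remaining layers use code "B".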