mmagic.models.editors.stylegan3
Package Contents
Classes
StyleGAN3 | Implementation of Alias-Free Generative Adversarial Networks.
StyleGAN3Generator | StyleGAN3 Generator.
SynthesisInput | Module which generates input for the synthesis layer.
SynthesisLayer | Layer of Synthesis network for stylegan3.
SynthesisNetwork | Synthesis network for stylegan3.
- class mmagic.models.editors.stylegan3.StyleGAN3(generator: ModelType, discriminator: Optional[ModelType] = None, data_preprocessor: Optional[Union[dict, mmengine.Config]] = None, generator_steps: int = 1, discriminator_steps: int = 1, forward_kwargs: Optional[Dict] = None, ema_config: Optional[Dict] = None, loss_config=dict())
Bases: mmagic.models.editors.stylegan2.StyleGAN2
Implementation of Alias-Free Generative Adversarial Networks.
Paper link: https://nvlabs-fi-cdn.nvidia.com/stylegan3/stylegan3-paper.pdf
Detailed architecture can be found in StyleGAN3Generator and StyleGAN2Discriminator.
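A minimal config sketch for this model, assuming the usual MMEngine dict-config style in which `type` names the registered class. The numeric values are placeholders, and the discriminator's `in_size` argument is an assumption, since that class is not documented on this page.

```python
# Illustrative model config; every numeric value is a placeholder.
model = dict(
    type='StyleGAN3',
    generator=dict(
        type='StyleGAN3Generator',
        out_size=256,        # assumed output resolution
        style_channels=512,  # assumed style code width
        img_channels=3,
        noise_size=512,
    ),
    discriminator=dict(
        type='StyleGAN2Discriminator',
        in_size=256,  # assumption: argument name is not given on this page
    ),
    generator_steps=1,
    discriminator_steps=1,
)
```

The generator's `out_size` and the discriminator's input size should agree, since the discriminator consumes the generator's output.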
- test_step(data: dict) → mmagic.utils.typing.SampleList
Gets the generated images for the given data. Same as val_step().
- Parameters
data (dict) – Data sampled from a metric-specific sampler. More details in Metrics and Evaluator.
- Returns
A list of DataSample containing the generated results.
- Return type
SampleList
- val_step(data: dict) → mmagic.utils.typing.SampleList
Gets the generated images for the given data. Same as test_step().
- Parameters
data (dict) – Data sampled from a metric-specific sampler. More details in Metrics and Evaluator.
- Returns
A list of DataSample containing the generated results.
- Return type
SampleList
- train_discriminator(inputs: dict, data_samples: mmagic.structures.DataSample, optimizer_wrapper: mmengine.optim.OptimWrapper) → Dict[str, torch.Tensor]
Train discriminator.
- Parameters
inputs (dict) – Inputs from dataloader.
data_samples (DataSample) – Data samples from dataloader.
optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.
- Returns
A dict of tensors for logging.
- Return type
Dict[str, Tensor]
- train_generator(inputs: dict, data_samples: mmagic.structures.DataSample, optimizer_wrapper: mmengine.optim.OptimWrapper) → Dict[str, torch.Tensor]
Train generator.
- Parameters
inputs (dict) – Inputs from dataloader.
data_samples (DataSample) – Data samples from dataloader. Not used in the generator's training.
optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.
- Returns
A dict of tensors for logging.
- Return type
Dict[str, Tensor]
- class mmagic.models.editors.stylegan3.StyleGAN3Generator(out_size, style_channels, img_channels, noise_size=512, rgb2bgr=False, pretrained=None, synthesis_cfg=dict(type='SynthesisNetwork'), mapping_cfg=dict(type='MappingNetwork'))
Bases: mmengine.model.BaseModule
StyleGAN3 Generator.
StyleGAN3 makes several changes to the StyleGAN2 generator, including transformed Fourier features, filtered nonlinearities, and non-critical sampling. More details can be found in Alias-Free Generative Adversarial Networks (NeurIPS 2021).
Ref: https://github.com/NVlabs/stylegan3
- Parameters
out_size (int) – The output size of the StyleGAN3 generator.
style_channels (int) – The number of channels for style code.
img_channels (int) – The number of channels of the output image.
noise_size (int, optional) – Size of the input noise vector. Defaults to 512.
rgb2bgr (bool, optional) – Whether to convert the output channel order to BGR. The provided pre-trained StyleGAN3 weights produce output in RGB channel order; set this argument to True when using those weights. Defaults to False.
pretrained (str | dict, optional) – Path to the pretrained model, or a dict describing it whose required key is 'ckpt_path'. The dict may also contain 'prefix' to load only the generator part from the whole state dict. Defaults to None.
synthesis_cfg (dict, optional) – Config for synthesis network. Defaults to dict(type='SynthesisNetwork').
mapping_cfg (dict, optional) – Config for mapping network. Defaults to dict(type='MappingNetwork').
- forward(noise, num_batches=0, input_is_latent=False, truncation=1, num_truncation_layer=None, update_emas=False, force_fp32=True, return_noise=False, return_latents=False)
Forward function for StyleGAN3.
- Parameters
noise (torch.Tensor | callable | None) – A batch of noise given directly as a torch.Tensor, or a callable that samples a batch of noise. If None, the default noise sampler is used.
num_batches (int, optional) – The batch size. Defaults to 0.
input_is_latent (bool, optional) – If True, the input tensor is the latent tensor. Defaults to False.
truncation (float, optional) – Truncation factor. If the value is less than 1, the truncation trick is applied. Defaults to 1.
num_truncation_layer (int, optional) – Number of layers that use the truncated latent. Defaults to None.
update_emas (bool, optional) – Whether to update the moving average of the mean latent. Defaults to False.
force_fp32 (bool, optional) – Whether to force fp32 computation regardless of the weights' precision. Defaults to True.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.
return_latents (bool, optional) – If True, latent will be returned in a dict with fake_img. Defaults to False.
- Returns
Generated image tensor or dictionary containing more data.
- Return type
torch.Tensor | dict
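The truncation trick controlled by `truncation` is commonly implemented as a linear interpolation between the mean latent and the sampled latent. The sketch below illustrates that idea in plain Python; `truncate_latent` and `w_avg` are hypothetical names, and the actual module operates on tensors rather than lists.

```python
def truncate_latent(w, w_avg, truncation):
    """Pull a latent code towards the mean latent.

    truncation=1 leaves w unchanged; truncation=0 collapses it to w_avg.
    """
    return [avg + truncation * (value - avg) for value, avg in zip(w, w_avg)]


w = [2.0, 4.0]      # sampled latent (illustrative values)
w_avg = [0.0, 2.0]  # running mean latent (illustrative values)
halfway = truncate_latent(w, w_avg, 0.5)
```

Smaller truncation values trade sample diversity for fidelity, because every latent moves closer to the average one.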
- class mmagic.models.editors.stylegan3.SynthesisInput(style_channels, channels, size, sampling_rate, bandwidth)
Bases: mmengine.model.BaseModule
Module which generates input for the synthesis layer.
- Parameters
style_channels (int) – The number of channels for style code.
channels (int) – The number of output channels.
size (int) – The size of sampling grid.
sampling_rate (int) – Sampling rate for constructing the sampling grid.
bandwidth (float) – Bandwidth of random frequencies.
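To illustrate what "bandwidth of random frequencies" can mean, the sketch below draws one 2D frequency per channel uniformly from a disc of radius `bandwidth`, together with a random phase. This follows the approach of the reference NVlabs implementation; that this module does exactly the same is an assumption, and `sample_frequencies` is a hypothetical helper written in plain Python.

```python
import math
import random


def sample_frequencies(channels, bandwidth, seed=0):
    """Draw per-channel 2D frequencies inside a disc, plus random phases."""
    rng = random.Random(seed)
    freqs = []
    for _ in range(channels):
        fx, fy = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        radius = math.hypot(fx, fy)
        # Rescale so the new radius is bandwidth * exp(-radius**2 / 4),
        # which is uniform over the disc for Gaussian-distributed inputs.
        scale = bandwidth / (radius * math.exp(radius ** 2 / 4))
        freqs.append((fx * scale, fy * scale))
    phases = [rng.random() - 0.5 for _ in range(channels)]
    return freqs, phases


freqs, phases = sample_frequencies(channels=8, bandwidth=2.0)
```

Every sampled frequency has magnitude at most `bandwidth`, so the resulting Fourier features are band-limited by construction.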
- class mmagic.models.editors.stylegan3.SynthesisLayer(style_channels, is_torgb, is_critically_sampled, use_fp16, in_channels, out_channels, in_size, out_size, in_sampling_rate, out_sampling_rate, in_cutoff, out_cutoff, in_half_width, out_half_width, conv_kernel=3, filter_size=6, lrelu_upsampling=2, use_radial_filters=False, conv_clamp=256, magnitude_ema_beta=0.999)
Bases: mmengine.model.BaseModule
Layer of Synthesis network for stylegan3.
- Parameters
style_channels (int) – The number of channels for style code.
is_torgb (bool) – Whether output of this layer is transformed to rgb image.
is_critically_sampled (bool) – Whether filter cutoff is set exactly at the bandlimit.
use_fp16 (bool) – Whether to use fp16 training in this module. If this flag is True, the whole module will be wrapped with auto_fp16.
in_channels (int) – The channel number of the input feature map.
out_channels (int) – The channel number of the output feature map.
in_size (int) – The input size of feature map.
out_size (int) – The output size of feature map.
in_sampling_rate (int) – Sampling rate for upsampling filter.
out_sampling_rate (int) – Sampling rate for downsampling filter.
in_cutoff (float) – Cutoff frequency for upsampling filter.
out_cutoff (float) – Cutoff frequency for downsampling filter.
in_half_width (float) – The approximate width of the transition region for upsampling filter.
out_half_width (float) – The approximate width of the transition region for downsampling filter.
conv_kernel (int, optional) – The kernel of modulated convolution. Defaults to 3.
filter_size (int, optional) – Base filter size. Defaults to 6.
lrelu_upsampling (int, optional) – Upsampling rate for filtered_lrelu. Defaults to 2.
use_radial_filters (bool, optional) – Whether to use a radially symmetric jinc-based filter as the downsampling filter. Defaults to False.
conv_clamp (int, optional) – Clamp bound for convolution. Defaults to 256.
magnitude_ema_beta (float, optional) – Beta coefficient for calculating input magnitude ema. Defaults to 0.999.
- forward(x, w, force_fp32=False, update_emas=False)
Forward function for synthesis layer.
- Parameters
x (torch.Tensor) – Input feature map tensor.
w (torch.Tensor) – Input style tensor.
force_fp32 (bool, optional) – Whether to force fp32 computation regardless of the weights' precision. Defaults to False.
update_emas (bool, optional) – Whether to update the moving average of the input magnitude. Defaults to False.
- Returns
Output feature map tensor.
- Return type
torch.Tensor
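`magnitude_ema_beta` and `update_emas` point to an exponential moving average of the input's mean squared magnitude, which the reference NVlabs implementation uses to normalize the layer input. A plain-Python sketch under that assumption; both function names are hypothetical.

```python
import math


def update_magnitude_ema(ema, values, beta=0.999):
    """Blend the current mean squared magnitude into the running average."""
    current = sum(v * v for v in values) / len(values)
    return beta * ema + (1.0 - beta) * current


def input_gain(ema):
    """Gain that rescales the input to roughly unit magnitude."""
    return 1.0 / math.sqrt(ema)


# Illustrative values: a large beta would move the average only slightly.
ema = update_magnitude_ema(ema=1.0, values=[2.0, 2.0], beta=0.5)
gain = input_gain(ema)
```

With the default `beta` of 0.999 the average changes slowly, so a single unusual batch has little effect on the gain.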
- static design_lowpass_filter(numtaps, cutoff, width, fs, radial=False)
Design a lowpass filter from the given arguments.
- Parameters
numtaps (int) – Length of the filter. numtaps must be odd if a passband includes the Nyquist frequency.
cutoff (float) – Cutoff frequency of the filter.
width (float) – The approximate width of the transition region.
fs (float) – The sampling frequency of the signal.
radial (bool, optional) – Whether to use a radially symmetric jinc-based filter. Defaults to False.
- Returns
Kernel of lowpass filter.
- Return type
torch.Tensor
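For readers unfamiliar with FIR design, the sketch below builds a one-dimensional windowed-sinc lowpass kernel in plain Python. It is a simplification, not this method's algorithm: it uses a fixed Hamming window, so it has no `width` argument, it does not cover the radial case, and `lowpass_taps` is a hypothetical name.

```python
import math


def lowpass_taps(numtaps, cutoff, fs):
    """Hamming-windowed sinc lowpass taps with unit DC gain (numtaps >= 2)."""
    center = (numtaps - 1) / 2
    fc = cutoff / fs  # normalized cutoff in cycles per sample
    taps = []
    for n in range(numtaps):
        x = 2 * fc * (n - center)
        # Ideal lowpass impulse response: 2 * fc * sinc(2 * fc * t).
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (numtaps - 1))
        taps.append(2 * fc * sinc * window)
    total = sum(taps)
    return [tap / total for tap in taps]  # normalize so the taps sum to 1


taps = lowpass_taps(numtaps=11, cutoff=2.0, fs=16.0)
```

The kernel is symmetric around its center tap, which gives it linear phase; a Kaiser-window design additionally lets the transition width be specified.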
- class mmagic.models.editors.stylegan3.SynthesisNetwork(style_channels, out_size, img_channels, channel_base=32768, channel_max=512, num_layers=14, num_critical=2, first_cutoff=2, first_stopband=2 ** 2.1, last_stopband_rel=2 ** 0.3, margin_size=10, output_scale=0.25, num_fp16_res=4, **layer_kwargs)
Bases: mmengine.model.BaseModule
Synthesis network for stylegan3.
- Parameters
style_channels (int) – The number of channels for style code.
out_size (int) – The resolution of output image.
img_channels (int) – The number of channels for output image.
channel_base (int, optional) – Overall multiplier for the number of channels. Defaults to 32768.
channel_max (int, optional) – Maximum number of channels in any layer. Defaults to 512.
num_layers (int, optional) – Total number of layers, excluding Fourier features and ToRGB. Defaults to 14.
num_critical (int, optional) – Number of critically sampled layers at the end. Defaults to 2.
first_cutoff (int, optional) – Cutoff frequency of the first layer. Defaults to 2.
first_stopband (float, optional) – Minimum stopband of the first layer. Defaults to 2**2.1.
last_stopband_rel (float, optional) – Minimum stopband of the last layer, expressed relative to the cutoff. Defaults to 2**0.3.
margin_size (int, optional) – Number of additional pixels outside the image. Defaults to 10.
output_scale (float, optional) – Scale factor for output value. Defaults to 0.25.
num_fp16_res (int, optional) – Number of first few layers that use fp16. Defaults to 4.
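The cutoff and stopband parameters above determine a per-layer schedule. The sketch below reproduces the geometric progression used by the reference NVlabs implementation in plain Python; that this class computes its schedule the same way is an assumption, and `layer_schedule` is a hypothetical helper.

```python
import math


def layer_schedule(out_size, num_layers=14, num_critical=2, first_cutoff=2,
                   first_stopband=2 ** 2.1, last_stopband_rel=2 ** 0.3,
                   margin_size=10):
    """Per-layer cutoff, stopband, sampling rate, half width and size."""
    last_cutoff = out_size / 2
    last_stopband = last_cutoff * last_stopband_rel
    schedule = []
    for i in range(num_layers + 1):
        # Interpolate geometrically, saturating for the critical layers.
        exponent = min(i / (num_layers - num_critical), 1.0)
        cutoff = first_cutoff * (last_cutoff / first_cutoff) ** exponent
        stopband = first_stopband * (last_stopband / first_stopband) ** exponent
        # Round the sampling rate up to a power of two, capped at out_size.
        rate = 2 ** math.ceil(math.log2(min(stopband * 2, out_size)))
        schedule.append(dict(
            cutoff=cutoff,
            stopband=stopband,
            sampling_rate=rate,
            half_width=max(stopband, rate / 2) - cutoff,
            size=rate + 2 * margin_size,
        ))
    # The last two layers run at the output resolution without margin.
    for layer in schedule[-2:]:
        layer['size'] = out_size
    return schedule


schedule = layer_schedule(out_size=256)
```

With the defaults, the first layer runs on a 36×36 grid (sampling rate 16 plus a 10-pixel margin on each side) and the final layers reach the full output resolution.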