mmagic.models.editors.lsgan
Package Contents
Classes
| LSGAN | Implementation of Least Squares Generative Adversarial Networks. |
| LSGANDiscriminator | Discriminator for LSGAN. |
| LSGANGenerator | Generator for LSGAN. |
- class mmagic.models.editors.lsgan.LSGAN(generator: ModelType, discriminator: Optional[ModelType] = None, data_preprocessor: Optional[Union[dict, mmengine.Config]] = None, generator_steps: int = 1, discriminator_steps: int = 1, noise_size: Optional[int] = None, ema_config: Optional[Dict] = None, loss_config: Optional[Dict] = None)
Bases:
mmagic.models.base_models.BaseGAN
Implementation of Least Squares Generative Adversarial Networks.
Paper link: https://arxiv.org/pdf/1611.04076.pdf
Detailed architecture can be found in LSGANGenerator and LSGANDiscriminator.
- disc_loss(disc_pred_fake: torch.Tensor, disc_pred_real: torch.Tensor) → Tuple
Get the discriminator loss. LSGAN uses the least squares loss to train the discriminator.
\[L_{D}=\left(D\left(X_{\text{data}}\right)-1\right)^{2}+\left(D(G(z))\right)^{2}\]
- Parameters
disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.
disc_pred_real (Tensor) – Discriminator’s prediction of the real images.
- Returns
Loss value and a dict of log variables.
- Return type
tuple[Tensor, dict]
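As a worked illustration of the discriminator objective, the following minimal sketch evaluates it on plain Python lists. The helper name `lsgan_disc_loss` and the averaging over the batch are assumptions made for illustration, not the library's code; the documented method operates on `torch.Tensor` predictions and also returns a dict of log variables.

```python
def lsgan_disc_loss(disc_pred_fake, disc_pred_real):
    """Hypothetical helper: mean of (D(x) - 1)^2 plus mean of D(G(z))^2."""
    # Real samples are pushed toward the target value 1.
    real_term = sum((p - 1.0) ** 2 for p in disc_pred_real) / len(disc_pred_real)
    # Fake samples are pushed toward the target value 0.
    fake_term = sum(p ** 2 for p in disc_pred_fake) / len(disc_pred_fake)
    return real_term + fake_term


# A discriminator that separates the two sets perfectly has zero loss.
perfect = lsgan_disc_loss(disc_pred_fake=[0.0, 0.0], disc_pred_real=[1.0, 1.0])
# A discriminator that outputs 0.5 everywhere pays 0.25 on each term.
undecided = lsgan_disc_loss(disc_pred_fake=[0.5, 0.5], disc_pred_real=[0.5, 0.5])
print(perfect, undecided)
```

Unlike a sigmoid cross-entropy loss, the penalty here grows quadratically with the distance from the target, including for predictions that are already on the correct side of the decision boundary.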
- gen_loss(disc_pred_fake: torch.Tensor) → Tuple
Get the generator loss. LSGAN uses the least squares loss to train the generator.
\[L_{G}=\left(D(G(z))-1\right)^{2}\]
- Parameters
disc_pred_fake (Tensor) – Discriminator’s prediction of the fake images.
- Returns
Loss value and a dict of log variables.
- Return type
tuple[Tensor, dict]
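The generator objective can be checked by hand in the same spirit. This is a minimal sketch on plain Python lists; the helper name `lsgan_gen_loss` and the batch averaging are assumptions for illustration, and the documented method works on `torch.Tensor` inputs instead.

```python
def lsgan_gen_loss(disc_pred_fake):
    """Hypothetical helper: mean of (D(G(z)) - 1)^2 over the batch."""
    return sum((p - 1.0) ** 2 for p in disc_pred_fake) / len(disc_pred_fake)


# The generator is rewarded when the discriminator scores its samples as real (1).
fooled = lsgan_gen_loss([1.0, 1.0])
# It pays the full penalty when the samples are confidently scored as fake (0).
caught = lsgan_gen_loss([0.0, 0.0])
halfway = lsgan_gen_loss([0.5, 0.5])
print(fooled, caught, halfway)
```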
- train_discriminator(inputs: dict, data_samples: mmagic.structures.DataSample, optimizer_wrapper: mmengine.optim.OptimWrapper) → Dict[str, torch.Tensor]
Train discriminator.
- Parameters
inputs (dict) – Inputs from dataloader.
data_samples (DataSample) – Data samples from dataloader.
optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.
- Returns
A dict of tensors for logging.
- Return type
Dict[str, Tensor]
- train_generator(inputs: dict, data_samples: List[mmagic.structures.DataSample], optimizer_wrapper: mmengine.optim.OptimWrapper) → Dict[str, torch.Tensor]
Train generator.
- Parameters
inputs (dict) – Inputs from dataloader.
data_samples (List[DataSample]) – Data samples from dataloader. Not used in the generator's training.
optimizer_wrapper (OptimWrapper) – OptimWrapper instance used to update model parameters.
- Returns
A dict of tensors for logging.
- Return type
Dict[str, Tensor]
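To show how the constructor arguments above fit together, here is a hypothetical config fragment. It assumes each class is registered under its class name and can be referenced with the same `dict(type=...)` convention that this page uses for norm and activation configs; the values simply repeat the documented defaults.

```python
# Hypothetical config fragment; the `type` strings are an assumption that the
# classes are registered under their class names.
NOISE_SIZE = 1024

model = dict(
    type='LSGAN',
    noise_size=NOISE_SIZE,
    generator_steps=1,
    discriminator_steps=1,
    generator=dict(
        type='LSGANGenerator',
        output_scale=128,
        base_channels=256,
        noise_size=NOISE_SIZE),
    discriminator=dict(
        type='LSGANDiscriminator',
        input_scale=128,
        base_channels=64),
)

# The generator's output resolution should match the discriminator's input.
assert model['generator']['output_scale'] == model['discriminator']['input_scale']
```

Keeping the noise size in a single constant avoids a mismatch between the value given to the model and the value given to the generator.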
- class mmagic.models.editors.lsgan.LSGANDiscriminator(input_scale=128, output_scale=8, out_channels=1, in_channels=3, base_channels=64, conv_cfg=dict(type='Conv2d'), default_norm_cfg=dict(type='BN'), default_act_cfg=dict(type='LeakyReLU', negative_slope=0.2), out_act_cfg=None, init_cfg=None)
Bases:
mmengine.model.BaseModule
Discriminator for LSGAN.
Implementation Details for LSGAN architecture:
Adopt convolution in the discriminator;
Use batchnorm in the discriminator except for the input and final output layer;
Use LeakyReLU in the discriminator except for the output layer;
Use a fully connected layer in the output layer;
Use 5x5 conv rather than the 4x4 conv used in DCGAN.
- Parameters
input_scale (int, optional) – The scale of the input image. Defaults to 128.
output_scale (int, optional) – The final scale of the convolutional feature. Defaults to 8.
out_channels (int, optional) – The channel number of the final output layer. Defaults to 1.
in_channels (int, optional) – The channel number of the input image. Defaults to 3.
base_channels (int, optional) – The basic channel number of the discriminator. The other layers contain channels based on this number. Defaults to 64.
conv_cfg (dict, optional) – Config for the convolution module used in this discriminator. Defaults to dict(type='Conv2d').
default_norm_cfg (dict, optional) – Norm config for all layers except the final output layer. Defaults to dict(type='BN').
default_act_cfg (dict, optional) – Activation config for all layers except the final output layer. Defaults to dict(type='LeakyReLU', negative_slope=0.2).
out_act_cfg (dict, optional) – Activation config for the final output layer. Defaults to None.
init_cfg (dict, optional) – Initialization config dict.
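The defaults describe a 128x128 input reduced to an 8x8 convolutional feature map before the fully connected output layer. The sketch below works out how many downsampling stages that implies, assuming each stage halves the spatial size; this page does not state the stride, so treat that as an assumption. The helper `count_halvings` is hypothetical and not part of the library.

```python
def count_halvings(input_scale, output_scale):
    """Hypothetical helper: number of halving stages from input to output scale."""
    ratio = input_scale // output_scale
    # The ratio must be an exact power of two for repeated halving to land on it.
    if input_scale % output_scale != 0 or ratio & (ratio - 1) != 0:
        raise ValueError('input_scale / output_scale must be a power of two')
    return ratio.bit_length() - 1


stages = count_halvings(128, 8)
# Spatial size after each assumed halving stage.
sizes = [128 // 2 ** i for i in range(1, stages + 1)]
print(stages, sizes)
```

Under the same assumption, changing `input_scale` to 64 while keeping `output_scale=8` would leave three stages.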
- class mmagic.models.editors.lsgan.LSGANGenerator(output_scale=128, out_channels=3, base_channels=256, input_scale=8, noise_size=1024, conv_cfg=dict(type='ConvTranspose2d'), default_norm_cfg=dict(type='BN'), default_act_cfg=dict(type='ReLU'), out_act_cfg=dict(type='Tanh'), init_cfg=None)
Bases:
mmengine.model.BaseModule
Generator for LSGAN.
Implementation Details for LSGAN architecture:
Adopt transposed convolution in the generator;
Use batchnorm in the generator except for the final output layer;
Use ReLU in the generator except for the final output layer;
Keep channels of feature maps unchanged in the convolution backbone;
Use one more 3x3 conv at every upsampling step in the convolution backbone.
We follow the implementation details of the original paper: Least Squares Generative Adversarial Networks https://arxiv.org/pdf/1611.04076.pdf
- Parameters
output_scale (int, optional) – Output scale for the generated image. Defaults to 128.
out_channels (int, optional) – The channel number of the output feature. Defaults to 3.
base_channels (int, optional) – The basic channel number of the generator. The other layers contain channels based on this number. Defaults to 256.
input_scale (int, optional) – The scale of the input 2D feature map. Defaults to 8.
noise_size (int, optional) – Size of the input noise vector. Defaults to 1024.
conv_cfg (dict, optional) – Config for the convolution module used in this generator. Defaults to dict(type='ConvTranspose2d').
default_norm_cfg (dict, optional) – Norm config for all layers except the final output layer. Defaults to dict(type='BN').
default_act_cfg (dict, optional) – Activation config for all layers except the final output layer. Defaults to dict(type='ReLU').
out_act_cfg (dict, optional) – Activation config for the final output layer. Defaults to dict(type='Tanh').
init_cfg (dict, optional) – Initialization config dict.
- forward(noise, num_batches=0, return_noise=False)
Forward function.
- Parameters
noise (torch.Tensor | callable | None) – You can directly give a batch of noise through a torch.Tensor or offer a callable function to sample a batch of noise data. Otherwise, None indicates that the default noise sampler is used.
num_batches (int, optional) – The batch size. Defaults to 0.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.
- Returns
If not return_noise, only the output image will be returned. Otherwise, a dict containing fake_img and noise_batch will be returned.
- Return type
torch.Tensor | dict
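The three accepted forms of `noise` can be illustrated with a small dispatch sketch. The helper `resolve_noise` is hypothetical, nested lists stand in for `torch.Tensor` so the example has no dependencies, and both the shape tuple passed to the callable and the standard normal default sampler are assumptions rather than documented behavior.

```python
import random


def resolve_noise(noise, num_batches=0, noise_size=4):
    """Hypothetical helper mirroring the three documented `noise` cases."""
    if callable(noise):
        # Assumed convention: the callable receives the requested batch shape.
        return noise((num_batches, noise_size))
    if noise is None:
        # Stand-in default sampler (assumed): standard normal noise.
        return [[random.gauss(0.0, 1.0) for _ in range(noise_size)]
                for _ in range(num_batches)]
    # Otherwise the caller supplied the batch directly.
    return noise


given = resolve_noise([[0.1, 0.2, 0.3, 0.4]])
sampled = resolve_noise(None, num_batches=2)
zeros = resolve_noise(
    lambda shape: [[0.0] * shape[1] for _ in range(shape[0])],
    num_batches=3)
print(len(given), len(sampled), len(zeros))
```

In this sketch `num_batches` only matters when the noise has to be sampled; a batch passed in directly already fixes the batch size.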