mmagic.models.editors.eg3d.eg3d_generator

Module Contents

Classes

TriplaneGenerator

The generator for EG3D.

class mmagic.models.editors.eg3d.eg3d_generator.TriplaneGenerator(out_size: int, noise_size: int = 512, style_channels: int = 512, cond_size: int = 25, cond_mapping_channels: Optional[int] = None, cond_scale: float = 1, zero_cond_input: bool = False, num_mlps: int = 8, triplane_size: int = 256, triplane_channels: int = 32, sr_in_size: int = 64, sr_in_channels: int = 32, sr_hidden_channels: int = 128, sr_out_channels: int = 64, sr_antialias: bool = True, sr_add_noise: bool = True, neural_rendering_resolution: int = 64, renderer_cfg: dict = dict(), rgb2bgr: bool = False, init_cfg: Optional[dict] = None)

Bases: mmengine.model.BaseModule

The generator for EG3D.

The EG3D generator contains three components:

  • A StyleGAN2-based backbone that generates a triplane feature

  • A neural renderer that samples the generated triplane feature and renders a low-resolution 2D feature map and image from it

  • A super resolution module that upsamples the low-resolution image to a high-resolution one
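The data flow through the three components can be sketched as a shape trace. This is only an illustration under stated assumptions, not the library's implementation: the helper `trace_shapes` is hypothetical, the batch dimension is omitted, and the per-stage layouts (three planes stacked on the first axis, a 3-channel low-resolution image) are assumptions based on the parameter descriptions on this page.

```python
def trace_shapes(out_size, triplane_size=256, triplane_channels=32,
                 neural_rendering_resolution=64, sr_in_size=64,
                 sr_in_channels=32):
    """Return the assumed per-sample tensor shape after each stage.

    Hypothetical helper: the layouts below are assumptions, not values
    read from the mmagic source.
    """
    res = neural_rendering_resolution
    return {
        # Backbone output: three feature planes of equal size.
        'triplane': (3, triplane_channels, triplane_size, triplane_size),
        # Renderer output: a low-resolution feature map and image.
        'rendered_feature': (sr_in_channels, res, res),
        'lr_img': (3, res, res),
        # The feature is resized to sr_in_size before super resolution.
        'sr_input': (sr_in_channels, sr_in_size, sr_in_size),
        # Super resolution output at the requested resolution.
        'fake_img': (3, out_size, out_size),
    }


shapes = trace_shapes(out_size=512)
for stage, shape in shapes.items():
    print(f'{stage}: {shape}')
```

With the default arguments the rendered feature already matches `sr_in_size`, so no resizing would be needed before the super resolution module.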

Parameters
  • out_size (int) – The resolution of the generated 2D image.

  • noise_size (int) – The size of the noise vector of the StyleGAN2 backbone. Defaults to 512.

  • style_channels (int) – The number of channels for style code. Defaults to 512.

  • cond_size (int) – The size of the conditional input. Defaults to 25 (the first 16 elements are the flattened camera-to-world matrix and the last 9 elements are the flattened intrinsic matrix).

  • cond_mapping_channels (Optional[int]) – The number of channels of the conditional mapping layers. If not passed, the value of style_channels is used. Defaults to None.

  • cond_scale (float) – The scale factor by which the conditional input is multiplied. Defaults to 1.

  • zero_cond_input (bool) – Whether to use a zero tensor as the conditional input. Defaults to False.

  • num_mlps (int) – The number of MLP layers (mapping network) used in the backbone. Defaults to 8.

  • triplane_size (int) – The size of generated triplane feature. Defaults to 256.

  • triplane_channels (int) – The number of channels for each plane of the triplane feature. Defaults to 32.

  • sr_in_size (int) – The input resolution of the super resolution module. If the input feature does not match sr_in_size, bilinear interpolation is used to resize it to the target size. Defaults to 64.

  • sr_in_channels (int) – The number of input channels of the super resolution module. Defaults to 32.

  • sr_hidden_channels (int) – The number of hidden channels of the super resolution module. Defaults to 128.

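  • sr_out_channels (int) – The number of output channels of the super resolution module. Defaults to 64.

  • sr_antialias (bool) – Whether to use antialiasing in the super resolution module. Defaults to True.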

  • sr_add_noise (bool) – Whether to apply noise injection in the super resolution module. Defaults to True.

  • neural_rendering_resolution (int) – The resolution of the neural rendering output. Note that this resolution may change during training. Defaults to 64.

  • renderer_cfg (dict) – The config used to build EG3DRenderer. Defaults to an empty dict.

  • rgb2bgr (bool) – Whether to convert the RGB output to BGR. This is useful when the pretrained model was trained on an RGB dataset. Defaults to False.

  • init_cfg (Optional[dict]) – Initialization config. Defaults to None.
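As an illustration of how the parameters above fit together, the following is a minimal config sketch in the dict style used by OpenMMLab projects. The values simply restate the documented defaults plus an example `out_size`; the registry name `'TriplaneGenerator'` and the commented-out build call through `mmagic.registry.MODELS` are assumptions and should be checked against the installed version.

```python
# A minimal config sketch. Only out_size is required; every other key
# restates a default listed in the parameter table above.
generator_cfg = dict(
    type='TriplaneGenerator',   # assumed registry name
    out_size=512,               # example output resolution
    noise_size=512,
    style_channels=512,
    cond_size=25,               # 16 pose values + 9 intrinsic values
    triplane_size=256,
    triplane_channels=32,
    sr_in_size=64,
    neural_rendering_resolution=64,
    rgb2bgr=False,
)

# Assumed usage, not executed here because it needs mmagic installed:
# from mmagic.registry import MODELS
# generator = MODELS.build(generator_cfg)
```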

sample_ray(cond: torch.Tensor) → Tuple[torch.Tensor]

Sample render points corresponding to the given conditional input.

Parameters

cond (torch.Tensor) – Conditional inputs.

Returns

The origins and direction vectors of the sampled rays.

Return type

Tuple[Tensor]
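The conditional input follows the 25-element layout described for cond_size: a flattened camera-to-world matrix followed by a flattened intrinsic matrix. The sketch below shows one way to assemble such a vector in plain Python. The helper `make_cond` is hypothetical, row-major flattening is an assumption, and the matrix values are illustrative only; in practice the result would be converted to a batched torch.Tensor before being passed to the generator.

```python
def make_cond(cam2world, intrinsics):
    """Flatten a 4x4 pose and a 3x3 intrinsic matrix into 25 floats.

    Hypothetical helper; row-major order is an assumption.
    """
    if len(cam2world) != 4 or any(len(row) != 4 for row in cam2world):
        raise ValueError('cam2world must be a 4x4 matrix')
    if len(intrinsics) != 3 or any(len(row) != 3 for row in intrinsics):
        raise ValueError('intrinsics must be a 3x3 matrix')
    pose = [float(v) for row in cam2world for v in row]      # 16 values
    intr = [float(v) for row in intrinsics for v in row]     # 9 values
    return pose + intr


# Illustrative values only: a camera translated along z, and a simple
# intrinsic matrix with a centered principal point.
cam2world = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 2.7],
    [0, 0, 0, 1],
]
intrinsics = [
    [4.0, 0, 0.5],
    [0, 4.0, 0.5],
    [0, 0, 1],
]
cond = make_cond(cam2world, intrinsics)
print(len(cond))
```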

forward(noise: torch.Tensor, label: Optional[torch.Tensor] = None, truncation: Optional[float] = 1, num_truncation_layer: Optional[int] = None, input_is_latent: bool = False, plane: Optional[torch.Tensor] = None, add_noise: bool = True, randomize_noise: bool = True, render_kwargs: Optional[dict] = None) → dict

The forward function for EG3D generator.

Parameters
  • noise (Tensor) – The input noise vector.

  • label (Optional[Tensor]) – The conditional input. Defaults to None.

  • truncation (float, optional) – Truncation factor. If a value less than 1 is given, the truncation trick is applied. Defaults to 1.

  • num_truncation_layer (int, optional) – The number of layers that use the truncated latent. Defaults to None.

  • input_is_latent (bool) – Whether the input is a latent code. Defaults to False.

  • plane (Optional[Tensor]) – The pre-generated triplane feature. If passed, it is used to generate the 2D image. Defaults to None.

  • add_noise (bool) – Whether to apply noise injection to the triplane backbone. Defaults to True.

  • randomize_noise (bool, optional) – If False, images are sampled with the buffered noise tensor injected into the style conv block. Defaults to True.

  • render_kwargs (Optional[dict], optional) – The specific kwargs for rendering. Defaults to None.

Returns

A dict containing ‘fake_img’, ‘lr_img’, ‘depth’, ‘ray_directions’ and ‘ray_origins’.

Return type

dict
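The truncation and num_truncation_layer arguments refer to the truncation trick familiar from StyleGAN-style generators, which pulls latent codes toward an average latent. The sketch below shows the usual formulation in plain Python. Whether this generator computes it in exactly this way is an assumption, and `truncate_latents` and `w_avg` are hypothetical names used only for illustration.

```python
def truncate_latents(ws, w_avg, truncation=1.0, num_truncation_layer=None):
    """Interpolate per-layer latents toward an average latent.

    ws: list of per-layer latent vectors (lists of floats).
    w_avg: the average latent vector (hypothetical name).
    Only the first num_truncation_layer layers are truncated; None
    means all layers. A sketch of the common formulation only.
    """
    n = len(ws) if num_truncation_layer is None else num_truncation_layer
    out = []
    for i, w in enumerate(ws):
        if i < n:
            # w' = w_avg + truncation * (w - w_avg)
            out.append([a + truncation * (x - a) for x, a in zip(w, w_avg)])
        else:
            out.append(list(w))
    return out


w_avg = [0.0, 0.0]
ws = [[2.0, 4.0], [2.0, 4.0], [2.0, 4.0]]
truncated = truncate_latents(ws, w_avg, truncation=0.5, num_truncation_layer=2)
print(truncated)
```

With truncation equal to 1 the latents are returned unchanged, which matches the documented default of leaving the trick disabled.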
