mmagic.models.archs
Package Contents
Classes
AllGatherLayer | All gather layer with backward propagation path.
ASPP | ASPP module from DeepLabV3.
AttentionInjection | Wrapper for stable diffusion unet.
SpatialTemporalEnsemble | Apply spatial and temporal ensemble and compute outputs.
SimpleGatedConvModule | Simple Gated Convolutional Module.
ImgNormalize | Normalize images with the given mean and std value.
LinearModule | A linear block that contains linear/norm/activation layers.
LoRAWrapper | Wrapper for LoRA layer.
MultiLayerDiscriminator | Multilayer Discriminator.
PatchDiscriminator | A PatchGAN discriminator.
ResNet | General ResNet.
DepthwiseSeparableConvModule | Depthwise separable convolution module.
SimpleEncoderDecoder | Simple encoder-decoder model from matting.
SoftMaskPatchDiscriminator | A Soft Mask-Guided PatchGAN discriminator.
ResidualBlockNoBN | Residual block without BN.
TokenizerWrapper | Tokenizer wrapper for CLIPTokenizer. Only supports CLIPTokenizer currently.
PixelShufflePack | Pixel Shuffle upsample layer.
VGG16 | Customized VGG16 Encoder.
Functions
pixel_unshuffle | Down-sample by pixel unshuffle.
set_lora | Set LoRA for module.
set_lora_disable | Disable LoRA modules.
set_lora_enable | Enable LoRA modules.
set_only_lora_trainable | Set only LoRA modules trainable.
- class mmagic.models.archs.AllGatherLayer(*args, **kwargs)
Bases: torch.autograd.Function
All gather layer with backward propagation path.
This module puts dist.all_gather() in the backward graph so that gradients can flow through the gather. This kind of operation is widely used in MoCo and other contrastive learning algorithms.
- class mmagic.models.archs.ASPP(in_channels: int, out_channels: int = 256, mid_channels: int = 256, dilations: Sequence[int] = (12, 24, 36), conv_cfg: Optional[dict] = None, norm_cfg: Optional[dict] = dict(type='BN'), act_cfg: Optional[dict] = dict(type='ReLU'), separable_conv: bool = False)
Bases: torch.nn.Module
ASPP module from DeepLabV3.
The code is adapted from https://github.com/pytorch/vision/blob/master/torchvision/models/segmentation/deeplabv3.py
For more information about the module, see “Rethinking Atrous Convolution for Semantic Image Segmentation”.
- Parameters
in_channels (int) – Input channels of the module.
out_channels (int) – Output channels of the module. Default: 256.
mid_channels (int) – Output channels of the intermediate ASPP conv modules. Default: 256.
dilations (Sequence[int]) – Dilation rates of the three ASPP conv modules. Default: (12, 24, 36).
conv_cfg (dict) – Config dict for convolution layer. If None, nn.Conv2d will be used. Default: None.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type='BN').
act_cfg (dict) – Config dict for activation layer. Default: dict(type='ReLU').
separable_conv (bool) – Whether to replace the normal conv with a depthwise separable conv, which is faster. Default: False.
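To get a feel for the default dilations, the following sketch computes the extent covered by a dilated kernel along one axis, k_eff = k + (k - 1)(d - 1), and the padding that keeps the spatial size unchanged at stride 1. It assumes 3x3 kernels in the dilated branches, as in the DeepLabV3 paper; the helper names are hypothetical and are not part of mmagic.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Extent covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def same_padding(kernel_size: int, dilation: int) -> int:
    """Padding that preserves spatial size at stride 1 (odd kernels only)."""
    return (effective_kernel_size(kernel_size, dilation) - 1) // 2


# Default dilations from the ASPP signature; the 3x3 kernel is an assumption.
extents = {d: effective_kernel_size(3, d) for d in (12, 24, 36)}
paddings = {d: same_padding(3, d) for d in (12, 24, 36)}
print(extents)
print(paddings)
```

For a 3x3 kernel the size-preserving padding equals the dilation itself, which is why dilated branches are usually built with padding=dilation.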
- class mmagic.models.archs.AttentionInjection(module: torch.nn.Module, injection_weight=5)
Bases: torch.nn.Module
Wrapper for stable diffusion unet.
- Parameters
module (nn.Module) – The module to be wrapped.
- mmagic.models.archs.pixel_unshuffle(x: torch.Tensor, scale: int) -> torch.Tensor
Down-sample by pixel unshuffle.
- Parameters
x (Tensor) – Input tensor.
scale (int) – Scale factor.
- Returns
Output tensor.
- Return type
Tensor
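The rearrangement behind pixel unshuffle can be sketched without any tensor library. The version below works on nested Python lists indexed as [channel][row][col], has no batch dimension, and assumes the same channel ordering as torch.nn.functional.pixel_unshuffle; pixel_unshuffle_lists is a hypothetical name used only for illustration.

```python
from typing import List

Image = List[List[List[float]]]  # indexed as [channel][row][col]


def pixel_unshuffle_lists(x: Image, scale: int) -> Image:
    """Rearrange (c, h, w) into (c * scale**2, h // scale, w // scale)."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    if h % scale or w % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    for ch in range(c):
        for i in range(scale):      # row offset inside each scale x scale cell
            for j in range(scale):  # column offset inside each cell
                out.append([
                    [x[ch][r * scale + i][q * scale + j]
                     for q in range(w // scale)]
                    for r in range(h // scale)
                ])
    return out


img = [[[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]]
down = pixel_unshuffle_lists(img, 2)
print(len(down), len(down[0]), len(down[0][0]))
```

Each output channel collects the pixels that share one position inside every scale x scale cell, so no values are lost: the spatial size shrinks by scale while the channel count grows by scale squared.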
- class mmagic.models.archs.SpatialTemporalEnsemble(is_temporal_ensemble: Optional[bool] = False)
Bases: torch.nn.Module
Apply spatial and temporal ensemble and compute outputs.
- Parameters
is_temporal_ensemble (bool, optional) – Whether to apply ensemble temporally. If True, the sequence will also be flipped temporally. If the input is an image, this argument must be set to False. Default: False.
- _transform(imgs: torch.Tensor, mode: str) -> torch.Tensor
Apply spatial transform (flip, rotate) to the images.
- Parameters
imgs (torch.Tensor) – The images to be transformed.
mode (str) – The mode of transform. Supported values are 'vertical', 'horizontal', and 'transpose', corresponding to vertical flip, horizontal flip, and rotation, respectively.
- Returns
The transformed images.
- Return type
torch.Tensor
- spatial_ensemble(imgs: torch.Tensor, model: torch.nn.Module) -> torch.Tensor
Apply spatial ensemble.
- Parameters
imgs (torch.Tensor) – The images to be processed by the model. Its size should be either (n, t, c, h, w) or (n, c, h, w).
model (nn.Module) – The model to process the images.
- Returns
Output of the model with spatial ensemble applied.
- Return type
torch.Tensor
- forward(imgs: torch.Tensor, model: torch.nn.Module) -> torch.Tensor
Apply spatial and temporal ensemble.
- Parameters
imgs (torch.Tensor) – The images to be processed by the model. Its size should be either (n, t, c, h, w) or (n, c, h, w).
model (nn.Module) – The model to process the images.
- Returns
Output of the model with spatial ensemble (and temporal ensemble, if enabled) applied.
- Return type
torch.Tensor
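The idea of a spatial ensemble can be sketched on a single 2D image stored as nested lists. The sketch assumes that the eight variants come from every on/off combination of vertical flip, horizontal flip, and transpose, that each output is mapped back by undoing the transforms in reverse order, and that the results are averaged. The temporal flip is not shown, and the function names below are illustrative rather than the mmagic implementation.

```python
from itertools import product
from typing import Callable, List

Image = List[List[float]]  # a single-channel image indexed as [row][col]


def vflip(img: Image) -> Image:
    return img[::-1]


def hflip(img: Image) -> Image:
    return [row[::-1] for row in img]


def transpose(img: Image) -> Image:
    return [list(col) for col in zip(*img)]


OPS = (vflip, hflip, transpose)  # each op is its own inverse


def spatial_ensemble(img: Image, model: Callable[[Image], Image]) -> Image:
    outputs = []
    for mask in product((False, True), repeat=len(OPS)):  # 2**3 = 8 variants
        ops = [op for op, on in zip(OPS, mask) if on]
        x = img
        for op in ops:
            x = op(x)
        y = model(x)
        for op in reversed(ops):  # undo the transforms in reverse order
            y = op(y)
        outputs.append(y)
    h, w = len(outputs[0]), len(outputs[0][0])
    return [[sum(o[r][c] for o in outputs) / len(outputs) for c in range(w)]
            for r in range(h)]


def double(img: Image) -> Image:
    """A toy pixel-wise 'model' used only to exercise the ensemble."""
    return [[2.0 * v for v in row] for row in img]


img = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
identity_out = spatial_ensemble(img, lambda x: x)
double_out = spatial_ensemble(img, double)
```

For a model that commutes with flips and transposes, such as the pixel-wise toy models here, the ensemble returns exactly the plain model output; the averaging only changes the result for models that are sensitive to orientation.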
- class mmagic.models.archs.SimpleGatedConvModule(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], feat_act_cfg: Optional[dict] = dict(type='ELU'), gate_act_cfg: Optional[dict] = dict(type='Sigmoid'), **kwargs)
Bases: torch.nn.Module
Simple Gated Convolutional Module.
This module is a simple gated convolutional module. The detailed formula is:
\[y = \phi(conv1(x)) * \sigma(conv2(x)),\]
where \(\phi\) is the feature activation function and \(\sigma\) is the gate activation function. By default, the feature activation function is ELU and the gate activation function is sigmoid.
- Parameters
in_channels (int) – Same as nn.Conv2d.
out_channels (int) – The number of channels of the output feature. Note that out_channels in the conv module is doubled since this module contains two convolutions for feature and gate separately.
kernel_size (int or tuple[int]) – Same as nn.Conv2d.
feat_act_cfg (dict) – Config dict for feature activation layer. Default: dict(type='ELU').
gate_act_cfg (dict) – Config dict for gate activation layer. Default: dict(type='Sigmoid').
kwargs (keyword arguments) – Same as ConvModule.
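The gating formula above can be illustrated on plain numbers. The sketch below takes the channel values at one spatial position after the convolution, splits them into a feature half and a gate half (an assumption about how the doubled output channels are divided), and applies the default ELU and sigmoid activations. The function gate is hypothetical and only mirrors the formula.

```python
import math
from typing import List


def elu(v: float, alpha: float = 1.0) -> float:
    return v if v > 0 else alpha * (math.exp(v) - 1.0)


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def gate(conv_out: List[float]) -> List[float]:
    """Split channels into feature/gate halves and combine them as phi(f) * sigma(g)."""
    half = len(conv_out) // 2
    feat, gate_logits = conv_out[:half], conv_out[half:]
    return [elu(f) * sigmoid(g) for f, g in zip(feat, gate_logits)]


# Four conv output channels give two gated output channels.
y = gate([1.0, -1.0, 0.0, 0.0])
print(y)
```

A gate logit of 0 passes half of the activated feature through, while large positive or negative logits open or close the gate almost completely.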
- class mmagic.models.archs.ImgNormalize(pixel_range: float, img_mean: Tuple[float, float, float], img_std: Tuple[float, float, float], sign: int = -1)
Bases: torch.nn.Conv2d
Normalize images with the given mean and std value.
Implemented as a Conv2d layer, so it can run on the GPU.
- Parameters
pixel_range (float) – Pixel range of feature.
img_mean (Tuple[float]) – Image mean of each channel.
img_std (Tuple[float]) – Image std of each channel.
sign (int) – Sign of bias. Default -1.
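As a rough illustration of how a fixed 1x1 convolution can normalize an image, the sketch below applies the per-channel affine map y = (x + sign * pixel_range * mean) / std to one pixel. This follows the EDSR-style mean-shift formulation and is an assumption about what the layer computes, not a copy of the mmagic code; img_normalize is a hypothetical helper.

```python
from typing import Sequence, Tuple


def img_normalize(pixel: Sequence[float],
                  pixel_range: float,
                  img_mean: Sequence[float],
                  img_std: Sequence[float],
                  sign: int = -1) -> Tuple[float, ...]:
    """Per-channel affine map, equivalent to a 1x1 conv with a diagonal weight.

    weight[c] = 1 / std[c], bias[c] = sign * pixel_range * mean[c] / std[c]
    """
    return tuple((x + sign * pixel_range * m) / s
                 for x, m, s in zip(pixel, img_mean, img_std))


mean = (0.5, 0.25, 0.0)
std = (1.0, 2.0, 4.0)
normalized = img_normalize((127.5, 65.75, 8.0), 255.0, mean, std, sign=-1)
shifted = img_normalize((0.0, 0.0, 0.0), 255.0, mean, (1.0, 1.0, 1.0), sign=1)
print(normalized, shifted)
```

With sign=-1 the scaled mean is subtracted before dividing by std; with sign=1 it is added instead, which is how the same layer type can be reused to shift outputs back to the original range.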
- class mmagic.models.archs.LinearModule(in_features: int, out_features: int, bias: bool = True, act_cfg: Optional[dict] = dict(type='ReLU'), inplace: bool = True, with_spectral_norm: bool = False, order: Tuple[str, str] = ('linear', 'act'))
Bases: torch.nn.Module
A linear block that contains linear/norm/activation layers.
For low-level vision, we add spectral norm and padding layer.
- Parameters
in_features (int) – Same as nn.Linear.
out_features (int) – Same as nn.Linear.
bias (bool) – Same as nn.Linear. Default: True.
act_cfg (dict) – Config dict for activation layer. Default: dict(type='ReLU').
inplace (bool) – Whether to use inplace mode for activation. Default: True.
with_spectral_norm (bool) – Whether to use spectral norm in linear module. Default: False.
order (tuple[str]) – The order of linear/activation layers. It is a sequence of “linear”, “norm” and “act”. Examples are (“linear”, “act”) and (“act”, “linear”).
- forward(x: torch.Tensor, activate: Optional[bool] = True) -> torch.Tensor
Forward Function.
- Parameters
x (torch.Tensor) – Input tensor with shape of \((n, *, c)\). Same as torch.nn.Linear.
activate (bool, optional) – Whether to use activation layer. Defaults to True.
- Returns
Same as torch.nn.Linear.
- Return type
torch.Tensor
- class mmagic.models.archs.LoRAWrapper(module: torch.nn.Module, in_feat: int, out_feat: int, rank: int, scale: float = 1, names: Optional[Union[str, List[str]]] = None)
Bases: torch.nn.Module
Wrapper for LoRA layer.
- Parameters
module (nn.Module) – The module to be wrapped.
in_feat (int) – Number of input features.
out_feat (int) – Number of output features.
rank (int) – The rank of LoRA.
scale (float) – The scale of LoRA feature.
names (Union[str, List[str]], optional) – The names of the LoRA layers. If you want to add multiple LoRA mappings to one module, a name must be defined for each LoRA mapping.
- add_lora(name: str, rank: int, scale: float = 1, state_dict: Optional[dict] = None)
Add LoRA mapping.
- Parameters
name (str) – The name of added LoRA.
rank (int) – The rank of added LoRA.
scale (float, optional) – The scale of added LoRA. Defaults to 1.
state_dict (dict, optional) – The state dict of added LoRA. Defaults to None.
- _set_value(attr_name: str, value: Any, name: Optional[str] = None)
Set value of attribute.
- Parameters
attr_name (str) – The name of attribute to be set value.
value (Any) – The value to be set.
name (str, optional) – The name of field in attr_name. If passed, will set value to attr_name[name]. Defaults to None.
- set_scale(scale: float, name: Optional[str] = None)
Set LoRA scale.
- Parameters
scale (float) – The scale to be set.
name (str, optional) – The name of LoRA to be set. Defaults to None.
- set_enable(name: Optional[str] = None)
Enable LoRA for the current layer.
- Parameters
name (str, optional) – The name of LoRA to be set. Defaults to None.
- set_disable(name: Optional[str] = None)
Disable LoRA for the current layer.
- Parameters
name (str, optional) – The name of LoRA to be set. Defaults to None.
- forward_lora_mapping(x: torch.Tensor) -> torch.Tensor
Forward LoRA mapping.
- Parameters
x (Tensor) – The input tensor.
- Returns
The output tensor.
- Return type
Tensor
- forward(x: torch.Tensor, *args, **kwargs) -> torch.Tensor
Forward and add LoRA mapping.
- Parameters
x (Tensor) – The input tensor.
- Returns
The output tensor.
- Return type
Tensor
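The arithmetic behind "forward and add LoRA mapping" can be sketched with plain lists. The example assumes the standard LoRA formulation, in which the wrapped layer's output is summed with a low-rank update scaled by scale, that is y = W x + scale * up(down(x)). The names lora_forward, down and up are illustrative and are not taken from mmagic.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(x: Vector, base_weight: Matrix, down: Matrix, up: Matrix,
                 scale: float = 1.0, enabled: bool = True) -> Vector:
    """Base linear output plus a scaled rank-r update (assumed LoRA form)."""
    base = matvec(base_weight, x)
    if not enabled:
        return base  # a disabled LoRA leaves the wrapped layer untouched
    update = matvec(up, matvec(down, x))  # in_feat -> rank -> out_feat
    return [b + scale * u for b, u in zip(base, update)]


base_weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # out_feat=3, in_feat=2
down = [[1.0, 1.0]]                                 # rank=1, in_feat=2
up = [[1.0], [0.0], [2.0]]                          # out_feat=3, rank=1
x = [2.0, 3.0]

y_on = lora_forward(x, base_weight, down, up, scale=0.5)
y_off = lora_forward(x, base_weight, down, up, enabled=False)
print(y_on, y_off)
```

The update adds only rank * (in_feat + out_feat) parameters on top of the frozen in_feat * out_feat base weight, which is what makes the scheme cheap to train.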
- classmethod wrap_lora(module, rank=4, scale=1, names=None, state_dict=None)
Wrap LoRA.
Use case:
>>> linear = nn.Linear(2, 4)
>>> lora_linear = LoRAWrapper.wrap_lora(linear, 4, 1)
- Parameters
module (nn.Module) – The module to add LoRA.
rank (int) – The rank for LoRA.
scale (float) – The scale of LoRA feature.
- Return type
LoRAWrapper
- mmagic.models.archs.set_lora(module: torch.nn.Module, config: dict, verbose: bool = True) -> torch.nn.Module
Set LoRA for module.
Use case:
>>> # 1. set all lora with same parameters
>>> lora_config = dict(
>>>     rank=4,
>>>     scale=1,
>>>     target_modules=['to_q', 'to_k', 'to_v'])
>>> # 2. set lora with different parameters
>>> lora_config = dict(
>>>     rank=4,
>>>     scale=1,
>>>     target_modules=[
>>>         # set `to_q` the default parameters
>>>         'to_q',
>>>         # set `to_k` the defined parameters
>>>         dict(target_module='to_k', rank=8, scale=1),
>>>         # set `to_v` the defined `rank` and default `scale`
>>>         dict(target_module='to_v', rank=16)
>>>     ])
- Parameters
module (nn.Module) – The module to set LoRA.
config (dict) – The config dict.
verbose (bool) – Whether to print log. Defaults to True.
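One way to read the second use case is that every entry in target_modules inherits the top-level rank and scale unless it overrides them. The sketch below makes that resolution explicit; resolve_lora_targets is a hypothetical helper written for illustration and is not a function provided by mmagic.

```python
from typing import Dict


def resolve_lora_targets(config: dict) -> Dict[str, dict]:
    """Expand a LoRA config into per-module settings (illustrative only)."""
    defaults = {k: v for k, v in config.items() if k != 'target_modules'}
    resolved = {}
    for target in config['target_modules']:
        if isinstance(target, str):
            name, overrides = target, {}  # plain name: use the defaults
        else:
            overrides = dict(target)      # copy so the config is not mutated
            name = overrides.pop('target_module')
        resolved[name] = {**defaults, **overrides}
    return resolved


lora_config = dict(
    rank=4,
    scale=1,
    target_modules=[
        'to_q',
        dict(target_module='to_k', rank=8, scale=1),
        dict(target_module='to_v', rank=16),
    ])
resolved = resolve_lora_targets(lora_config)
print(resolved)
```

Under this reading, to_q uses rank 4, to_k uses rank 8, and to_v uses rank 16 while keeping the default scale of 1.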
- mmagic.models.archs.set_lora_disable(module: torch.nn.Module) -> torch.nn.Module
Disable LoRA modules.
- mmagic.models.archs.set_lora_enable(module: torch.nn.Module) -> torch.nn.Module
Enable LoRA modules.
- mmagic.models.archs.set_only_lora_trainable(module: torch.nn.Module) -> torch.nn.Module
Set only LoRA modules trainable.
- class mmagic.models.archs.MultiLayerDiscriminator(in_channels: int, max_channels: int, num_convs: int = 5, fc_in_channels: Optional[int] = None, fc_out_channels: int = 1024, kernel_size: int = 5, conv_cfg: Optional[dict] = None, norm_cfg: Optional[dict] = None, act_cfg: Optional[dict] = dict(type='ReLU'), out_act_cfg: Optional[dict] = dict(type='ReLU'), with_input_norm: bool = True, with_out_convs: bool = False, with_spectral_norm: bool = False, **kwargs)
Bases: torch.nn.Module
Multilayer Discriminator.
This is a commonly used structure with multiple stacked convolution layers.
- Parameters
in_channels (int) – Input channel of the first input convolution.
max_channels (int) – The maximum channel number in this structure.
num_convs (int) – Number of stacked intermediate convs (including input conv but excluding output conv). Defaults to 5.
fc_in_channels (int | None) – Input dimension of the fully connected layer. If fc_in_channels is None, the fully connected layer will be removed. Defaults to None.
fc_out_channels (int) – Output dimension of the fully connected layer. Defaults to 1024.
kernel_size (int) – Kernel size of the conv modules. Defaults to 5.
conv_cfg (dict) – Config dict to build conv layer.
norm_cfg (dict) – Config dict to build norm layer.
act_cfg (dict) – Config dict for activation layer. Defaults to dict(type='ReLU').
out_act_cfg (dict) – Config dict for output activation. Defaults to dict(type='ReLU').
with_input_norm (bool) – Whether to add normalization after the input conv. Defaults to True.
with_out_convs (bool) – Whether to add output convs to the discriminator. The output convs contain two convs. The first out conv has the same setting as the intermediate convs but a stride of 1 instead of 2. The second out conv is similar to the first out conv but reduces the number of channels to 1 and has no activation layer. Defaults to False.
with_spectral_norm (bool) – Whether to use spectral norm after the conv layers. Defaults to False.
kwargs (keyword arguments) –
- class mmagic.models.archs.PatchDiscriminator(in_channels: int, base_channels: int = 64, num_conv: int = 3, norm_cfg: dict = dict(type='BN'), init_cfg: Optional[dict] = dict(type='normal', gain=0.02))
Bases: mmengine.model.BaseModule
A PatchGAN discriminator.
- Parameters
in_channels (int) – Number of channels in input images.
base_channels (int) – Number of channels at the first conv layer. Default: 64.
num_conv (int) – Number of stacked intermediate convs (excluding input and output conv). Default: 3.
norm_cfg (dict) – Config dict to build norm layer. Default: dict(type='BN').
init_cfg (dict) – Config dict for initialization. type: The name of our initialization method. Default: 'normal'. gain: Scaling factor for normal, xavier and orthogonal. Default: 0.02.
- class mmagic.models.archs.ResNet(depth: int, in_channels: int = 3, stem_channels: int = 64, base_channels: int = 64, num_stages: int = 4, strides: Sequence[int] = (1, 2, 2, 2), dilations: Sequence[int] = (1, 1, 2, 4), deep_stem: bool = False, avg_down: bool = False, frozen_stages: int = -1, act_cfg: dict = dict(type='ReLU'), conv_cfg: Optional[dict] = None, norm_cfg: dict = dict(type='BN'), with_cp: bool = False, multi_grid: Optional[Sequence[int]] = None, contract_dilation: bool = False, zero_init_residual: bool = True)
Bases: torch.nn.Module
General ResNet.
This class is adapted from https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/backbones/resnet.py.
- Parameters
depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.
in_channels (int) – Number of input image channels. Default: 3.
stem_channels (int) – Number of stem channels. Default: 64.
base_channels (int) – Number of base channels of res layer. Default: 64.
num_stages (int) – Resnet stages, normally 4.
strides (Sequence[int]) – Strides of the first block of each stage. Default: (1, 2, 2, 2).
dilations (Sequence[int]) – Dilation of each stage. Default: (1, 1, 2, 4).
deep_stem (bool) – Replace the 7x7 conv in the input stem with three 3x3 convs. Default: False.
avg_down (bool) – Use AvgPool instead of stride conv when downsampling in the bottleneck. Default: False.
frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters. Default: -1.
act_cfg (dict) – Dictionary to construct and config activation layer. Default: dict(type='ReLU').
conv_cfg (dict) – Dictionary to construct and config convolution layer. Default: None.
norm_cfg (dict) – Dictionary to construct and config norm layer. Default: dict(type='BN').
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.
multi_grid (Sequence[int]|None) – Multi grid dilation rates of last stage. Default: None.
contract_dilation (bool) – Whether to contract the first dilation of each layer. Default: False.
zero_init_residual (bool) – Whether to use zero init for last norm layer in resblocks to let them behave as identity. Default: True.
- property norm1: torch.nn.Module
Normalization layer after the second convolution layer.
- Type
nn.Module
- arch_settings
- _make_layer(block: BasicBlock, planes: int, blocks: int, stride: int = 1, dilation: int = 1) -> torch.nn.Module
- class mmagic.models.archs.DepthwiseSeparableConvModule(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int]] = 1, padding: Union[int, Tuple[int, int]] = 0, dilation: Union[int, Tuple[int, int]] = 1, norm_cfg: Optional[dict] = None, act_cfg: Optional[dict] = dict(type='ReLU'), dw_norm_cfg: Union[dict, str] = 'default', dw_act_cfg: Union[dict, str] = 'default', pw_norm_cfg: Union[dict, str] = 'default', pw_act_cfg: Union[dict, str] = 'default', **kwargs)
Bases: torch.nn.Module
Depthwise separable convolution module.
See https://arxiv.org/pdf/1704.04861.pdf for details.
This module can replace a ConvModule, with the conv block replaced by two conv blocks: a depthwise conv block and a pointwise conv block. The depthwise conv block contains depthwise-conv/norm/activation layers. The pointwise conv block contains pointwise-conv/norm/activation layers. Note that there will be norm/activation layers in the depthwise conv block if norm_cfg and act_cfg are specified.
- Parameters
in_channels (int) – Same as nn.Conv2d.
out_channels (int) – Same as nn.Conv2d.
kernel_size (int or tuple[int]) – Same as nn.Conv2d.
stride (int or tuple[int]) – Same as nn.Conv2d. Default: 1.
padding (int or tuple[int]) – Same as nn.Conv2d. Default: 0.
dilation (int or tuple[int]) – Same as nn.Conv2d. Default: 1.
norm_cfg (dict) – Default norm config for both depthwise ConvModule and pointwise ConvModule. Default: None.
act_cfg (dict) – Default activation config for both depthwise ConvModule and pointwise ConvModule. Default: dict(type='ReLU').
dw_norm_cfg (dict | str) – Norm config of depthwise ConvModule. If it is 'default', it will be the same as norm_cfg. Default: 'default'.
dw_act_cfg (dict | str) – Activation config of depthwise ConvModule. If it is 'default', it will be the same as act_cfg. Default: 'default'.
pw_norm_cfg (dict | str) – Norm config of pointwise ConvModule. If it is 'default', it will be the same as norm_cfg. Default: 'default'.
pw_act_cfg (dict | str) – Activation config of pointwise ConvModule. If it is 'default', it will be the same as act_cfg. Default: 'default'.
kwargs (optional) – Other shared arguments for depthwise and pointwise ConvModule. See ConvModule for ref.
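The saving from splitting a convolution into depthwise and pointwise parts can be checked with a short parameter count. The sketch below ignores bias and normalization parameters and assumes square kernels; both helper functions are hypothetical and exist only for this comparison.

```python
def conv_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weights of a standard k x k convolution (no bias)."""
    return kernel_size * kernel_size * in_channels * out_channels


def separable_conv_params(in_channels: int, out_channels: int,
                          kernel_size: int) -> int:
    """Weights of a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    depthwise = kernel_size * kernel_size * in_channels  # one filter per channel
    pointwise = in_channels * out_channels               # 1 x 1 channel mixing
    return depthwise + pointwise


standard = conv_params(64, 128, 3)
separable = separable_conv_params(64, 128, 3)
print(standard, separable, round(separable / standard, 4))
```

The ratio equals 1 / out_channels + 1 / kernel_size**2, so for 3x3 kernels the separable form needs roughly one ninth of the weights once the output has many channels.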
- class mmagic.models.archs.SimpleEncoderDecoder(encoder: dict, decoder: dict, init_cfg: Optional[dict] = None)
Bases: mmengine.model.BaseModule
Simple encoder-decoder model from matting.
- Parameters
encoder (dict) – Config of the encoder.
decoder (dict) – Config of the decoder.
init_cfg (dict, optional) – Initialization config dict.
- class mmagic.models.archs.SoftMaskPatchDiscriminator(in_channels: int, base_channels: Optional[int] = 64, num_conv: Optional[int] = 3, norm_cfg: Optional[dict] = None, init_cfg: Optional[dict] = dict(type='normal', gain=0.02), with_spectral_norm: Optional[bool] = False)
Bases: mmengine.model.BaseModule
A Soft Mask-Guided PatchGAN discriminator.
- Parameters
in_channels (int) – Number of channels in input images.
base_channels (int, optional) – Number of channels at the first conv layer. Default: 64.
num_conv (int, optional) – Number of stacked intermediate convs (excluding input and output conv). Default: 3.
norm_cfg (dict, optional) – Config dict to build norm layer. Default: None.
init_cfg (dict, optional) – Config dict for initialization. type: The name of our initialization method. Default: 'normal'. gain: Scaling factor for normal, xavier and orthogonal. Default: 0.02.
with_spectral_norm (bool, optional) – Whether to use spectral norm after the conv layers. Default: False.
- class mmagic.models.archs.ResidualBlockNoBN(mid_channels: int = 64, res_scale: float = 1.0)
Bases: torch.nn.Module
Residual block without BN.
It has a style of:
---Conv-ReLU-Conv-+-
 |________________|
- Parameters
mid_channels (int) – Channel number of intermediate features. Default: 64.
res_scale (float) – Used to scale the residual before addition. Default: 1.0.
- init_weights() -> None
Initialize weights for ResidualBlockNoBN.
Initialization methods like kaiming_init are for VGG-style modules. For modules with residual paths, using smaller std is better for stability and performance. We empirically use 0.1. See more details in “ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks”.
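The data flow of the block, Conv-ReLU-Conv on the residual branch followed by a scaled addition to the identity path, can be sketched with plain callables standing in for the two convolutions. The toy conv1 and conv2 below are placeholders chosen to keep the arithmetic easy to follow; only the structure out = x + res_scale * conv2(relu(conv1(x))) is the point.

```python
from typing import Callable, List

Vector = List[float]


def relu(v: Vector) -> Vector:
    return [max(0.0, a) for a in v]


def residual_block_no_bn(x: Vector,
                         conv1: Callable[[Vector], Vector],
                         conv2: Callable[[Vector], Vector],
                         res_scale: float = 1.0) -> Vector:
    """Identity path plus a scaled Conv-ReLU-Conv residual branch."""
    residual = conv2(relu(conv1(x)))
    return [xi + res_scale * ri for xi, ri in zip(x, residual)]


# Placeholder "convolutions": element-wise maps that are easy to verify by hand.
def conv1(v: Vector) -> Vector:
    return [a - 1.0 for a in v]


def conv2(v: Vector) -> Vector:
    return [2.0 * a for a in v]


x = [0.0, 3.0]
out_full = residual_block_no_bn(x, conv1, conv2, res_scale=1.0)
out_half = residual_block_no_bn(x, conv1, conv2, res_scale=0.5)
print(out_full, out_half)
```

A smaller res_scale shrinks the contribution of the residual branch, which serves the same stability goal as the small initialization std described above.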
- class mmagic.models.archs.TokenizerWrapper(from_pretrained: Optional[Union[str, os.PathLike]] = None, from_config: Optional[Union[str, os.PathLike]] = None, *args, **kwargs)
Tokenizer wrapper for CLIPTokenizer. Only CLIPTokenizer is supported currently. This wrapper is modified from https://github.com/huggingface/diffusers/blob/e51f19aee82c8dd874b715a09dbc521d88835d68/src/diffusers/loaders.py#L358
- Parameters
from_pretrained (Union[str, os.PathLike], optional) – The model id of a pretrained model or a path to a directory containing model weights and config. Defaults to None.
from_config (Union[str, os.PathLike], optional) – The model id of a pretrained model or a path to a directory containing model weights and config. Defaults to None.
*args – If from_pretrained is passed, *args and **kwargs will be passed to from_pretrained function. Otherwise, *args and **kwargs will be used to initialize the model by self._module_cls(*args, **kwargs).
**kwargs – If from_pretrained is passed, *args and **kwargs will be passed to from_pretrained function. Otherwise, *args and **kwargs will be used to initialize the model by self._module_cls(*args, **kwargs).
- try_adding_tokens(tokens: Union[str, List[str]], *args, **kwargs)
Attempt to add tokens to the tokenizer.
- Parameters
tokens (Union[str, List[str]]) – The tokens to be added.
- get_token_info(token: str) -> dict
Get the information of a token, including its start and end index in the current tokenizer.
- Parameters
token (str) – The token to be queried.
- Returns
The information of the token, including its start and end index in the current tokenizer.
- Return type
dict
- add_placeholder_token(placeholder_token: str, *args, num_vec_per_token: int = 1, **kwargs)
Add placeholder tokens to the tokenizer.
- Parameters
placeholder_token (str) – The placeholder token to be added.
num_vec_per_token (int, optional) – The number of vectors of the added placeholder token.
*args – The arguments for self.wrapped.add_tokens.
**kwargs – The arguments for self.wrapped.add_tokens.
- replace_placeholder_tokens_in_text(text: Union[str, List[str]], vector_shuffle: bool = False, prop_tokens_to_load: float = 1.0) -> Union[str, List[str]]
Replace the keywords in text with placeholder tokens. This function will be called in self.__call__ and self.encode.
- Parameters
text (Union[str, List[str]]) – The text to be processed.
vector_shuffle (bool, optional) – Whether to shuffle the vectors. Defaults to False.
prop_tokens_to_load (float, optional) – The proportion of tokens to be loaded. If 1.0, all tokens will be loaded. Defaults to 1.0.
- Returns
The processed text.
- Return type
Union[str, List[str]]
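To illustrate what replacing a keyword with placeholder tokens can look like, the sketch below expands one placeholder into several per-vector tokens. The "_i" suffix naming, the truncation rule for prop_tokens_to_load, and the seed argument are all assumptions made for this example; the real wrapper may name and select tokens differently.

```python
import random
from typing import List, Optional


def make_vector_tokens(placeholder: str, num_vec_per_token: int) -> List[str]:
    """One token per embedding vector (the suffix scheme is hypothetical)."""
    if num_vec_per_token == 1:
        return [placeholder]
    return [f"{placeholder}_{i}" for i in range(num_vec_per_token)]


def expand_placeholder(text: str, placeholder: str, num_vec_per_token: int = 1,
                       vector_shuffle: bool = False,
                       prop_tokens_to_load: float = 1.0,
                       seed: Optional[int] = None) -> str:
    tokens = make_vector_tokens(placeholder, num_vec_per_token)
    keep = max(1, int(len(tokens) * prop_tokens_to_load))  # assumed rule
    tokens = tokens[:keep]
    if vector_shuffle:
        tokens = tokens[:]  # shuffle a copy, not the stored token list
        random.Random(seed).shuffle(tokens)
    return text.replace(placeholder, " ".join(tokens))


prompt = "a photo of <cat>"
full = expand_placeholder(prompt, "<cat>", num_vec_per_token=3)
partial = expand_placeholder(prompt, "<cat>", num_vec_per_token=4,
                             prop_tokens_to_load=0.5)
shuffled = expand_placeholder(prompt, "<cat>", num_vec_per_token=3,
                              vector_shuffle=True, seed=0)
print(full)
```

Expanding one keyword into several tokens lets a single learned concept occupy more than one embedding vector, while the user-facing prompt still contains only the short placeholder.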
- replace_text_with_placeholder_tokens(text: Union[str, List[str]]) -> Union[str, List[str]]
Replace the placeholder tokens in text with the original keywords. This function will be called in self.decode.
- Parameters
text (Union[str, List[str]]) – The text to be processed.
- Returns
The processed text.
- Return type
Union[str, List[str]]
- __call__(text: Union[str, List[str]], *args, vector_shuffle: bool = False, prop_tokens_to_load: float = 1.0, **kwargs)
The call function of the wrapper.
- Parameters
text (Union[str, List[str]]) – The text to be tokenized.
vector_shuffle (bool, optional) – Whether to shuffle the vectors. Defaults to False.
prop_tokens_to_load (float, optional) – The proportion of tokens to be loaded. If 1.0, all tokens will be loaded. Defaults to 1.0.
*args – The arguments for self.wrapped.__call__.
**kwargs – The arguments for self.wrapped.__call__.
- encode(text: Union[str, List[str]], *args, **kwargs)
Encode the passed text to token index.
- Parameters
text (Union[str, List[str]]) – The text to be encoded.
*args – The arguments for self.wrapped.__call__.
**kwargs – The arguments for self.wrapped.__call__.
- decode(token_ids, return_raw: bool = False, *args, **kwargs) -> Union[str, List[str]]
Decode the token index to text.
- Parameters
token_ids – The token index to be decoded.
return_raw – Whether to keep the placeholder token in the text. Defaults to False.
*args – The arguments for self.wrapped.decode.
**kwargs – The arguments for self.wrapped.decode.
- Returns
The decoded text.
- Return type
Union[str, List[str]]
- class mmagic.models.archs.PixelShufflePack(in_channels: int, out_channels: int, scale_factor: int, upsample_kernel: int)
Bases: torch.nn.Module
Pixel Shuffle upsample layer.
- Parameters
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
scale_factor (int) – Upsample ratio.
upsample_kernel (int) – Kernel size of Conv layer to expand channels.
- Returns
Upsampled feature map.
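A pixel-shuffle upsampler typically works in two steps: a convolution expands the channels to out_channels * scale_factor**2, then pixel shuffle trades those channels for spatial resolution. The sketch below shows only the channel arithmetic and the shuffle itself on nested lists indexed as [channel][row][col]. It assumes the standard sub-pixel convolution layout with the same channel ordering as torch.nn.functional.pixel_shuffle, and both helper names are hypothetical.

```python
from typing import List

Image = List[List[List[float]]]  # indexed as [channel][row][col]


def upsample_conv_channels(out_channels: int, scale_factor: int) -> int:
    """Channels the conv must produce before the shuffle (assumed layout)."""
    return out_channels * scale_factor * scale_factor


def pixel_shuffle_lists(x: Image, scale: int) -> Image:
    """Rearrange (c * scale**2, h, w) into (c, h * scale, w * scale)."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    if c_in % (scale * scale):
        raise ValueError("channel count must be divisible by scale**2")
    c_out = c_in // (scale * scale)
    out = [[[0] * (w * scale) for _ in range(h * scale)] for _ in range(c_out)]
    for ch in range(c_in):
        c, offset = divmod(ch, scale * scale)
        i, j = divmod(offset, scale)  # position inside each scale x scale cell
        for r in range(h):
            for q in range(w):
                out[c][r * scale + i][q * scale + j] = x[ch][r][q]
    return out


feat = [[[0, 2], [8, 10]],
        [[1, 3], [9, 11]],
        [[4, 6], [12, 14]],
        [[5, 7], [13, 15]]]
up = pixel_shuffle_lists(feat, 2)
print(upsample_conv_channels(64, 2), up)
```

Because the shuffle only moves values around, all learnable capacity sits in the preceding convolution, whose kernel size is set by upsample_kernel.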
- class mmagic.models.archs.VGG16(in_channels: int, batch_norm: Optional[bool] = False, aspp: Optional[bool] = False, dilations: Optional[List[int]] = None, init_cfg: Optional[dict] = None)
Bases: mmengine.model.BaseModule
Customized VGG16 Encoder.
A 1x1 conv is added after the original VGG16 conv layers. The indices of max pooling layers are returned for unpooling layers in decoders.
- Parameters
in_channels (int) – Number of input channels.
batch_norm (bool, optional) – Whether to use nn.BatchNorm2d. Default to False.
aspp (bool, optional) – Whether to use ASPP module after the last conv layer. Default to False.
dilations (list[int], optional) – Atrous rates of ASPP module. Default to None.
init_cfg (dict, optional) – Initialization config dict.