mmagic.models.editors.deepfillv1.contextual_attention
Module Contents
Classes
ContextualAttentionModule | Contextual attention module.
- class mmagic.models.editors.deepfillv1.contextual_attention.ContextualAttentionModule(unfold_raw_kernel_size=4, unfold_raw_stride=2, unfold_raw_padding=1, unfold_corr_kernel_size=3, unfold_corr_stride=1, unfold_corr_dilation=1, unfold_corr_padding=1, scale=0.5, fuse_kernel_size=3, softmax_scale=10, return_attention_score=True)
Bases: mmengine.model.BaseModule
Contextual attention module.
The details of this module can be found in: Generative Image Inpainting with Contextual Attention
- Parameters
unfold_raw_kernel_size (int) – Kernel size used in unfolding raw feature. Default: 4.
unfold_raw_stride (int) – Stride used in unfolding raw feature. Default: 2.
unfold_raw_padding (int) – Padding used in unfolding raw feature. Default: 1.
unfold_corr_kernel_size (int) – Kernel size used in unfolding context for computing correlation maps. Default: 3.
unfold_corr_stride (int) – Stride used in unfolding context for computing correlation maps. Default: 1.
unfold_corr_dilation (int) – Dilation used in unfolding context for computing correlation maps. Default: 1.
unfold_corr_padding (int) – Padding used in unfolding context for computing correlation maps. Default: 1.
scale (float) – The rescale factor used to resize the input features. Default: 0.5.
fuse_kernel_size (int) – The kernel size used in fusion module. Default: 3.
softmax_scale (float) – The scale factor for softmax function. Default: 10.
return_attention_score (bool) – If True, the attention score will be returned. Default: True.
- forward(x, context, mask=None)
Forward Function.
- Parameters
x (torch.Tensor) – Tensor with shape (n, c, h, w).
context (torch.Tensor) – Tensor with shape (n, c, h, w).
mask (torch.Tensor) – Tensor with shape (n, 1, h, w). Default: None.
- Returns
Features after contextual attention.
- Return type
tuple(torch.Tensor)
- patch_correlation(x, kernel)
Calculate patch correlation.
- Parameters
x (torch.Tensor) – Input tensor.
kernel (torch.Tensor) – Kernel tensor.
- Returns
Tensor with shape of (n, l, h, w).
- Return type
torch.Tensor
- patch_copy_deconv(attention_score, context_filter)
Copy patches using deconv.
- Parameters
attention_score (torch.Tensor) – Tensor with shape of (n, l, h, w).
context_filter (torch.Tensor) – Filter kernel.
- Returns
Tensor with shape of (n, c, h, w).
- Return type
torch.Tensor
- fuse_correlation_map(correlation_map, h_unfold, w_unfold)
Fuse correlation map.
This operation fuses the correlation map to encourage large, consistent correlation regions.
The mechanism is simple: a standard identity ('eye') matrix is applied as a convolution filter on the correlation map in the horizontal and vertical directions.
The shape of the input correlation map is (n, h_unfold*w_unfold, h, w). For fusing, the convolutional filter is applied to the reshaped feature map with shape (n, 1, h_unfold*w_unfold, h*w).
A simple specification for horizontal direction is shown below:
              (h, 0)  (h, 1)  (h, 2)  (h, 3)  ...
    (h, 0)
    (h, 1)              1
    (h, 2)                      1
    (h, 3)                              1
    ...
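The effect of an identity filter can be illustrated on a single small matrix. The sketch below is a simplified, torch-free illustration and not the module's code: `fuse_with_eye` and the toy `scores` matrix are hypothetical names, and only one 2D matrix is filtered, whereas the method works on reshaped 4D tensors in both directions. It shows that a run of matches along the diagonal reinforces itself, while an isolated match does not.

```python
def fuse_with_eye(matrix, kernel_size=3):
    """Filter a 2D list with a kernel_size x kernel_size identity kernel (zero padded)."""
    rows, cols = len(matrix), len(matrix[0])
    half = kernel_size // 2
    fused = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Only the diagonal taps of an identity kernel are non-zero.
            for d in range(-half, half + 1):
                r, c = i + d, j + d
                if 0 <= r < rows and 0 <= c < cols:
                    fused[i][j] += matrix[r][c]
    return fused


# Toy correlation matrix: a consistent diagonal run plus one isolated match at (0, 3).
scores = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
fused = fuse_with_eye(scores)
print(fused[1][1])  # centre of the diagonal run, supported by both neighbours
print(fused[0][3])  # isolated match, no diagonal neighbours
```

The centre of the diagonal run accumulates its neighbours' scores, which is why fusing favours large, spatially consistent matches over scattered ones.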
- calculate_unfold_hw(input_size, kernel_size=3, stride=1, dilation=1, padding=0)
Calculate (h, w) after unfolding.
The official PyTorch implementation of unfold flattens the spatial dimensions (h, w) into a single dimension L. This function recovers (h, w) according to the equation in: https://pytorch.org/docs/stable/nn.html#torch.nn.Unfold
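The output-size equation from the `torch.nn.Unfold` documentation can be written out directly. The following is a minimal sketch assuming a square kernel and the same stride, dilation, and padding in both directions; `unfold_hw` is a hypothetical standalone helper, not the method itself.

```python
def unfold_hw(input_size, kernel_size=3, stride=1, dilation=1, padding=0):
    """Return (h_unfold, w_unfold) following the torch.nn.Unfold size equation."""
    h_in, w_in = input_size

    def out_len(size):
        # floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
        return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

    return out_len(h_in), out_len(w_in)


# Settings matching the module's default correlation unfolding (3, 1, 1, 1).
print(unfold_hw((32, 32), kernel_size=3, stride=1, dilation=1, padding=1))
# Settings matching the module's default raw unfolding (4, 2, 1).
print(unfold_hw((64, 64), kernel_size=4, stride=2, padding=1))
```

The number of unfolded columns L is then `h_unfold * w_unfold`.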
- calculate_overlap_factor(attention_score)
Calculate the overlap factor after applying deconv.
- Parameters
attention_score (torch.Tensor) – The attention score with shape of (n, c, h, w).
- Returns
The overlap factor will be returned.
- Return type
torch.Tensor
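To see why an overlap factor is needed after a deconvolution, it helps to count how many pasted patches cover each output position. The sketch below is a 1D, torch-free illustration under the assumption that patches are pasted with the module's default raw kernel size, stride, and padding (4, 2, 1); `overlap_counts_1d` is a hypothetical helper and not the method's implementation.

```python
def overlap_counts_1d(num_patches, kernel_size=4, stride=2, padding=1):
    """Count how many patches cover each output position of a 1D transposed conv."""
    # Output length of a transposed convolution without output padding.
    out_len = (num_patches - 1) * stride - 2 * padding + kernel_size
    counts = [0] * out_len
    for p in range(num_patches):
        start = p * stride - padding
        for k in range(kernel_size):
            pos = start + k
            # Positions that fall into the cropped padding are discarded.
            if 0 <= pos < out_len:
                counts[pos] += 1
    return counts


counts = overlap_counts_1d(4)
print(counts)
```

Dividing the pasted result by such counts averages overlapping contributions instead of summing them; in 2D the per-pixel count is the product of the row and column counts.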
- mask_correlation_map(correlation_map, mask)
Add mask weight for correlation map.
Add negative infinity to the masked regions so that the softmax function outputs zero in those regions.
- Parameters
correlation_map (torch.Tensor) – Correlation map with shape of (n, h_unfold*w_unfold, h_map, w_map).
mask (torch.Tensor) – Mask tensor with shape of (n, c, h, w). ‘1’ in the mask indicates masked region while ‘0’ indicates valid region.
- Returns
Updated correlation map with mask.
- Return type
torch.Tensor
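The masking trick can be checked with a few numbers. This is a minimal, torch-free sketch: `softmax`, `correlation`, and `patch_is_masked` are hypothetical names, and the scores are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats (entries may be -inf)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


correlation = [2.0, 1.0, 0.5, 3.0]
patch_is_masked = [0, 1, 0, 1]  # 1 marks a masked patch, 0 a valid one

# Adding -inf to masked entries makes exp() of them exactly zero.
masked = [c + (-math.inf if m else 0.0) for c, m in zip(correlation, patch_is_masked)]
weights = softmax(masked)
print(weights)
```

Masked patches receive zero attention weight and the remaining weights still sum to one. Note that this sketch assumes at least one valid entry; if every entry were masked, the result would be undefined.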
- im2col(img, kernel_size, stride=1, padding=0, dilation=1, normalize=False, return_cols=False)
Reshape image-style feature to columns.
This function unfolds feature maps into columns. The details of this function can be found in: https://pytorch.org/docs/1.1.0/nn.html?highlight=unfold#torch.nn.Unfold
- Parameters
img (torch.Tensor) – Features to be unfolded. The shape of this feature should be (n, c, h, w).
kernel_size (int) – Kernel size used in unfolding. Only square kernels (same height and width) are supported.
stride (int) – Stride number in unfolding. Default: 1.
padding (int) – Padding number in unfolding. Default: 0.
dilation (int) – Dilation number in unfolding. Default: 1.
normalize (bool) – If True, the unfolded feature will be normalized. Default: False.
return_cols (bool) – The official PyTorch implementation of unfolding returns features with shape (n, c*prod(kernel_size), L). If True, the features will be reshaped to (n, L, c, kernel_size, kernel_size). Otherwise, the results keep the shape of the official implementation. Default: False.
- Returns
Unfolded columns. If return_cols is True, the shape of the output tensor is (n, L, c, kernel_size, kernel_size). Otherwise, the shape will be (n, c*prod(kernel_size), L).
- Return type
torch.Tensor
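The unfolding itself can be sketched without PyTorch for a single-channel, single-image case. `im2col_2d` below is a hypothetical helper written for illustration only; it returns L patches of `kernel_size * kernel_size` values each, which loosely corresponds to the `return_cols=True` layout with n = 1 and c = 1 and each patch flattened.

```python
def im2col_2d(image, kernel_size, stride=1, padding=0, dilation=1):
    """Unfold a 2D list into a list of flattened square patches (zero padded)."""
    h, w = len(image), len(image[0])
    span = dilation * (kernel_size - 1) + 1  # extent covered by one kernel
    out_h = (h + 2 * padding - span) // stride + 1
    out_w = (w + 2 * padding - span) // stride + 1

    def pixel(r, c):
        # Anything outside the image is treated as zero padding.
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0

    columns = []
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride - padding, j * stride - padding
            columns.append([
                pixel(top + a * dilation, left + b * dilation)
                for a in range(kernel_size)
                for b in range(kernel_size)
            ])
    return columns


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
cols = im2col_2d(image, kernel_size=2)
print(len(cols), cols[0], cols[3])
```

Here a 3x3 image with a 2x2 kernel yields L = 4 patches, scanned row by row from the top-left window to the bottom-right one.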