`mmagic.models.editors.deepfillv1.contextual_attention`

Module Contents

Classes

ContextualAttentionModule

Contextual attention module.

class mmagic.models.editors.deepfillv1.contextual_attention.ContextualAttentionModule(unfold_raw_kernel_size=4, unfold_raw_stride=2, unfold_raw_padding=1, unfold_corr_kernel_size=3, unfold_corr_stride=1, unfold_corr_dilation=1, unfold_corr_padding=1, scale=0.5, fuse_kernel_size=3, softmax_scale=10, return_attention_score=True)

Bases: mmengine.model.BaseModule

Contextual attention module.

The details of this module can be found in: Generative Image Inpainting with Contextual Attention

Parameters

unfold_raw_kernel_size (int) – Kernel size used in unfolding raw feature. Default: 4.
unfold_raw_stride (int) – Stride used in unfolding raw feature. Default: 2.
unfold_raw_padding (int) – Padding used in unfolding raw feature. Default: 1.
unfold_corr_kernel_size (int) – Kernel size used in unfolding context for computing correlation maps. Default: 3.
unfold_corr_stride (int) – Stride used in unfolding context for computing correlation maps. Default: 1.
unfold_corr_dilation (int) – Dilation used in unfolding context for computing correlation maps. Default: 1.
unfold_corr_padding (int) – Padding used in unfolding context for computing correlation maps. Default: 1.
scale (float) – The rescale factor used to resize the input features. Default: 0.5.
fuse_kernel_size (int) – The kernel size used in fusion module. Default: 3.
softmax_scale (float) – The scale factor for softmax function. Default: 10.
return_attention_score (bool) – If True, the attention score will be returned. Default: True.

forward(x, context, mask=None)

Forward Function.

Parameters

x (torch.Tensor) – Tensor with shape (n, c, h, w).
context (torch.Tensor) – Tensor with shape (n, c, h, w).
mask (torch.Tensor) – Tensor with shape (n, 1, h, w). Default: None.

Returns

Features after contextual attention. If return_attention_score is True, the attention score is returned as well.

Return type

tuple(torch.Tensor)

patch_correlation(x, kernel)

Calculate patch correlation.

Parameters

x (torch.Tensor) – Input tensor.
kernel (torch.Tensor) – Kernel tensor.

Returns

Tensor with shape of (n, l, h, w).

Return type

torch.Tensor

patch_copy_deconv(attention_score, context_filter)

Copy patches using deconv.

Parameters

attention_score (torch.Tensor) – Tensor with shape of (n, l, h, w).
context_filter (torch.Tensor) – Filter kernel.

Returns

Tensor with shape of (n, c, h, w).

Return type

torch.Tensor

fuse_correlation_map(correlation_map, h_unfold, w_unfold)

Fuse correlation map.

This operation fuses the correlation map to enlarge regions of consistent correlation.

An identity (‘eye’) matrix is applied as a convolution filter on the correlation map, in the horizontal and vertical directions.

The input correlation map has shape (n, h_unfold*w_unfold, h, w). For fusion, the convolutional filter is applied to the map reshaped to (n, 1, h_unfold*w_unfold, h*w).

A simple specification for horizontal direction is shown below:

       (h, (h, (h, (h,
        0)  1)  2)  3)  ...
(h, 0)
(h, 1)      1
(h, 2)          1
(h, 3)              1
...
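To make the effect of the identity filter concrete, here is a minimal pure-Python sketch. It is not the module's implementation: `fuse_with_eye` is a hypothetical helper that works on a small 2D list instead of a tensor and assumes zero padding at the borders. With an identity kernel only the diagonal taps are non-zero, so each entry accumulates its diagonal neighbours.

```python
def fuse_with_eye(matrix, kernel_size=3):
    """Convolve a 2D matrix with an identity kernel (hypothetical helper).

    Only the diagonal taps of an identity kernel are non-zero, so each
    output entry is the sum of its neighbours along the main diagonal.
    Entries outside the matrix are treated as zero (assumed zero padding).
    """
    rows, cols = len(matrix), len(matrix[0])
    radius = kernel_size // 2
    fused = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for d in range(-radius, radius + 1):
                r, c = i + d, j + d
                if 0 <= r < rows and 0 <= c < cols:
                    fused[i][j] += matrix[r][c]
    return fused


# Consistent matches: neighbouring patches match neighbouring patches,
# so the strong responses lie on the diagonal.
consistent = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# A single isolated match with no consistent neighbours.
isolated = [[0.0] * 4 for _ in range(4)]
isolated[1][2] = 1.0

print(fuse_with_eye(consistent)[1][1])  # 3.0
print(fuse_with_eye(isolated)[1][2])    # 1.0
```

A match that is supported by its diagonal neighbours is reinforced, while an isolated match keeps its original value, which is how fusion favours large consistent regions.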

calculate_unfold_hw(input_size, kernel_size=3, stride=1, dilation=1, padding=0)

Calculate (h, w) after unfolding.

The official PyTorch implementation of unfold flattens the spatial dimensions (h, w) into a single dimension L. This function recovers (h, w) using the equation given in: https://pytorch.org/docs/stable/nn.html#torch.nn.Unfold
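The equation in the linked `torch.nn.Unfold` documentation gives, per spatial dimension, `floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)` sliding-window positions. The sketch below applies it to both dimensions; `unfold_hw` is a hypothetical standalone function written for illustration, not the method itself.

```python
def unfold_hw(input_size, kernel_size=3, stride=1, dilation=1, padding=0):
    """Return the (h, w) grid of sliding-window positions (hypothetical helper)."""
    h_in, w_in = input_size

    def out_len(size):
        # Number of window positions along one dimension, as in the
        # torch.nn.Unfold documentation.
        return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

    return out_len(h_in), out_len(w_in)


# Defaults for unfolding the raw feature: kernel 4, stride 2, padding 1.
print(unfold_hw((32, 32), kernel_size=4, stride=2, padding=1))  # (16, 16)

# Defaults for unfolding the context: kernel 3, stride 1, padding 1.
print(unfold_hw((16, 16), kernel_size=3, stride=1, padding=1))  # (16, 16)
```

The product of the two returned values equals the L dimension of the unfolded tensor.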

calculate_overlap_factor(attention_score)

Calculate the overlap factor after applying deconv.

Parameters

attention_score (torch.Tensor) – The attention score with shape of (n, c, h, w).

Returns

The overlap factor.

Return type

torch.Tensor
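The description above does not say how the factor is computed. One plausible reading, assumed here, is that it counts how many patches written by the transposed convolution overlap at each output position, so that the summed output can be divided by that count. The 1D sketch below illustrates this under that assumption; `overlap_counts_1d` is a hypothetical helper, and its defaults mirror unfold_raw_kernel_size, unfold_raw_stride and unfold_raw_padding.

```python
def overlap_counts_1d(length, kernel_size=4, stride=2, padding=1):
    """Count overlapping patches per output position (hypothetical helper).

    Mimics a 1D transposed convolution of an all-ones input with an
    all-ones kernel, which is an assumption about the overlap factor.
    """
    full = [0] * ((length - 1) * stride + kernel_size)
    for i in range(length):           # each input position writes one patch
        for t in range(kernel_size):  # the patch covers kernel_size outputs
            full[i * stride + t] += 1
    cropped = full[padding:len(full) - padding] if padding else full
    # Assumed guard: avoid zero counts so a later division is safe.
    return [max(count, 1) for count in cropped]


counts = overlap_counts_1d(4)
print(counts)  # [1, 2, 2, 2, 2, 2, 2, 1]

# For a separable 2D case the factor is the outer product of the 1D counts.
factor_2d = [[r * c for c in counts] for r in counts]
print(factor_2d[0][0], factor_2d[0][1], factor_2d[1][1])  # 1 2 4
```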

mask_correlation_map(correlation_map, mask)

Add mask weight for correlation map.

Adds negative infinity to the masked regions so that the softmax function outputs zero in those regions.

Parameters

correlation_map (torch.Tensor) – Correlation map with shape of (n, h_unfold*w_unfold, h_map, w_map).
mask (torch.Tensor) – Mask tensor with shape of (n, c, h, w). ‘1’ in the mask indicates masked region while ‘0’ indicates valid region.

Returns

Updated correlation map with mask.

Return type

torch.Tensor
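The effect of adding negative infinity before a softmax can be checked with a few lines of plain Python. This is an illustrative sketch rather than the module's code: `masked_softmax` is a hypothetical helper that operates on one list of correlation scores, and the point at which the scale (softmax_scale, default 10) is applied is an assumption.

```python
import math


def masked_softmax(scores, masked, scale=10.0):
    """Softmax over candidate patches with masked ones suppressed (sketch).

    Assumes at least one candidate is unmasked; otherwise every logit is
    -inf and the result is undefined.
    """
    neg_inf = float("-inf")
    # Add -inf to masked candidates, then scale (order is an assumption).
    logits = [(s + (neg_inf if m else 0.0)) * scale for s, m in zip(scores, masked)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


# The first candidate has the highest raw score but lies in the masked region.
weights = masked_softmax([0.9, 0.8, 0.1], masked=[True, False, False])
print(weights[0])  # 0.0
```

The masked candidate receives exactly zero weight, and the remaining weights still sum to one.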

im2col(img, kernel_size, stride=1, padding=0, dilation=1, normalize=False, return_cols=False)

Reshape image-style feature to columns.

This function unfolds feature maps into columns. The details can be found in: https://pytorch.org/docs/1.1.0/nn.html?highlight=unfold#torch.nn.Unfold

Parameters

img (torch.Tensor) – Features to be unfolded. The shape of this feature should be (n, c, h, w).
kernel_size (int) – Kernel size used in unfolding. Only square kernels (same height and width) are supported.
stride (int) – Stride number in unfolding. Default: 1.
padding (int) – Padding number in unfolding. Default: 0.
dilation (int) – Dilation number in unfolding. Default: 1.
normalize (bool) – If True, the unfolded feature will be normalized. Default: False.
return_cols (bool) – The official PyTorch implementation of unfold returns features with shape (n, c*kernel_size*kernel_size, L). If True, the features are reshaped to (n, L, c, kernel_size, kernel_size). Otherwise, the output keeps the shape of the official implementation. Default: False.

Returns

Unfolded columns. If return_cols is True, the shape of the output tensor is (n, L, c, kernel_size, kernel_size). Otherwise, the shape is (n, c*kernel_size*kernel_size, L).

Return type

torch.Tensor
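For readers unfamiliar with unfolding, the following pure-Python sketch extracts sliding patches from a single-channel 2D image. `unfold_2d` is a hypothetical helper, not part of the library: it assumes zero padding and returns one row per patch (L rows of kernel_size*kernel_size values), which is closer to the patch-first layout than to the (n, c*kernel_size*kernel_size, L) layout of `torch.nn.Unfold`.

```python
def unfold_2d(img, kernel_size, stride=1, padding=0, dilation=1):
    """Extract flattened square patches from a 2D list (hypothetical helper)."""
    h, w = len(img), len(img[0])
    span = dilation * (kernel_size - 1) + 1  # extent covered by one kernel
    out_h = (h + 2 * padding - span) // stride + 1
    out_w = (w + 2 * padding - span) // stride + 1

    def pixel(r, c):
        # Positions outside the image read as zero (assumed zero padding).
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0

    patches = []
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride - padding, j * stride - padding
            patches.append([
                pixel(top + di * dilation, left + dj * dilation)
                for di in range(kernel_size)
                for dj in range(kernel_size)
            ])
    return patches


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]

print(unfold_2d(img, kernel_size=2))
# [[1, 2, 4, 5], [2, 3, 5, 6], [4, 5, 7, 8], [5, 6, 8, 9]]
```

Here L is 4 (a 2 x 2 grid of window positions) and each patch holds kernel_size*kernel_size = 4 values.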
