mmagic.models.editors.mspie.positional_encoding¶
Module Contents¶
Classes¶
SinusoidalPositionalEmbedding — Sinusoidal Positional Embedding 1D or 2D (SPE/SPE2d).
CatersianGrid — Cartesian Grid for 2d tensor.
- class mmagic.models.editors.mspie.positional_encoding.SinusoidalPositionalEmbedding(embedding_dim, padding_idx, init_size=1024, div_half_dim=False, center_shift=None)[source]¶
Bases:
mmengine.model.BaseModule
Sinusoidal Positional Embedding 1D or 2D (SPE/SPE2d).
This module is modified from: https://github.com/pytorch/fairseq/blob/master/fairseq/modules/sinusoidal_positional_embedding.py # noqa
Based on the original SPE in single dimension, we implement a 2D sinusoidal positional encoding (SPE2d), as introduced in Positional Encoding as Spatial Inductive Bias in GANs, CVPR’2021.
- Parameters
embedding_dim (int) – The number of dimensions for the positional encoding.
padding_idx (int | list[int]) – The index for the padding contents. The padding positions will obtain an encoding vector filling in zeros.
init_size (int, optional) – The initial size of the positional buffer. Defaults to 1024.
div_half_dim (bool, optional) – If true, the embedding will be divided by \(d/2\). Otherwise, it will be divided by \((d/2 -1)\). Defaults to False.
center_shift (int | None, optional) – Shift the center point to some index. Defaults to None.
- static get_embedding(num_embeddings, embedding_dim, padding_idx=None, div_half_dim=False)[source]¶
Build sinusoidal embeddings.
This matches the implementation in tensor2tensor, but differs slightly from the description in Section 3.5 of “Attention Is All You Need”.
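The construction above can be sketched as follows. This is an illustrative re-implementation, not the library's code: the function name is hypothetical, and it assumes an even `embedding_dim` (the real staticmethod may also handle odd dimensions and other edge cases).

```python
import math

import torch


def sinusoidal_table(num_embeddings, embedding_dim, padding_idx=None,
                     div_half_dim=False):
    """Illustrative sketch of a tensor2tensor-style sinusoidal table.

    Assumes an even ``embedding_dim``; the library's ``get_embedding``
    staticmethod is the authoritative implementation.
    """
    half_dim = embedding_dim // 2
    # div_half_dim controls whether frequencies are spaced by d/2 or d/2 - 1.
    denom = half_dim if div_half_dim else half_dim - 1
    freqs = torch.exp(
        torch.arange(half_dim, dtype=torch.float32)
        * -(math.log(10000.0) / denom))
    positions = torch.arange(num_embeddings, dtype=torch.float32).unsqueeze(1)
    angles = positions * freqs.unsqueeze(0)            # (num_embeddings, d/2)
    # First half of each row is sin, second half is cos.
    table = torch.cat([torch.sin(angles), torch.cos(angles)], dim=1)
    if padding_idx is not None:
        table[padding_idx] = 0.0                       # zero vector for padding
    return table


table = sinusoidal_table(8, 16, padding_idx=0)
```

Row 0 (the padding index) is all zeros, and each other row interleaves sines and cosines at geometrically spaced frequencies.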
- forward(input, **kwargs)[source]¶
Input is expected to be of size [bsz x seq_len].
The returned tensor is of size [bsz x seq_len x embed_dim].
- make_positions(input, padding_idx)[source]¶
Make position tensors.
- Parameters
input (Tensor) – Input tensor.
padding_idx (int | list[int]) – The index for the padding contents. The padding positions will obtain an encoding vector filled with zeros.
- Returns
Position tensors.
- Return type
tensor
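The fairseq-style position numbering this method follows can be sketched as a standalone function (hypothetical re-implementation for illustration; the class method operates on the module's own state):

```python
import torch


def make_positions(input, padding_idx):
    # fairseq convention: non-padding symbols get consecutive positions
    # starting at padding_idx + 1; padding symbols keep padding_idx itself,
    # so they later look up the all-zero row of the embedding table.
    mask = input.ne(padding_idx).long()
    return torch.cumsum(mask, dim=1) * mask + padding_idx


tokens = torch.tensor([[5, 7, 0, 0],
                       [3, 0, 0, 0]])   # 0 is the padding index here
positions = make_positions(tokens, padding_idx=0)
```

For the batch above, positions come out as `[[1, 2, 0, 0], [1, 0, 0, 0]]`: real tokens count up from 1 while padded slots stay at the padding index.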
- make_grid2d(height, width, num_batches=1, center_shift=None)[source]¶
Make 2-d grid mask.
- Parameters
height (int) – Height of the grid.
width (int) – Width of the grid.
num_batches (int, optional) – Batch size of the output grid. Defaults to 1.
center_shift (int | None, optional) – Shift the center point to some index. Defaults to None.
- Returns
2-d Grid mask.
- Return type
Tensor
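The shape of the grid this method produces can be illustrated with a simplified sketch. This hypothetical function only builds the raw (x, y) index grids; the real `make_grid2d` additionally passes them through the 1D sinusoidal table, applies any `center_shift`, and concatenates the two encodings channel-wise.

```python
import torch


def make_index_grid2d(height, width, num_batches=1):
    # Illustrative sketch: build per-pixel column (x) and row (y) indices.
    # Positions start at 1 so that 0 can remain the padding index.
    w_idx = torch.arange(1, width + 1)
    h_idx = torch.arange(1, height + 1)
    x = w_idx.unsqueeze(0).repeat(height, 1)      # (H, W) column indices
    y = h_idx.unsqueeze(1).repeat(1, width)       # (H, W) row indices
    grid = torch.stack([x, y], dim=0)             # (2, H, W)
    return grid.unsqueeze(0).repeat(num_batches, 1, 1, 1)   # (N, 2, H, W)


grid = make_index_grid2d(height=3, width=4, num_batches=2)
```

The result has shape `(N, 2, H, W)`; feeding each channel through the 1D embedding would yield the 2D positional encoding.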
- class mmagic.models.editors.mspie.positional_encoding.CatersianGrid(init_cfg: Union[dict, List[dict], None] = None)[source]¶
Bases:
mmengine.model.BaseModule
Cartesian Grid for 2d tensor.
The Cartesian grid is a commonly used positional encoding in deep learning. This implementation follows the convention of
grid_sample
in PyTorch. In other words,[-1, -1]
denotes the top-left corner while[1, 1]
denotes the bottom-right corner.
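The coordinate convention can be sketched as below. This is an illustrative standalone function (the class itself derives the grid from an input feature map's spatial size), but the normalization matches the `grid_sample` convention described above.

```python
import torch


def cartesian_grid(height, width, num_batches=1):
    # Normalized coordinates in the grid_sample convention:
    # (-1, -1) is the top-left corner, (1, 1) the bottom-right.
    ys = torch.linspace(-1.0, 1.0, height)
    xs = torch.linspace(-1.0, 1.0, width)
    y, x = torch.meshgrid(ys, xs, indexing='ij')   # each (H, W)
    grid = torch.stack([x, y], dim=0)              # (2, H, W): x, then y
    return grid.unsqueeze(0).repeat(num_batches, 1, 1, 1)


grid = cartesian_grid(4, 4)
```

Channel 0 sweeps x from -1 (left) to 1 (right) and channel 1 sweeps y from -1 (top) to 1 (bottom), so the grid can be concatenated onto a feature map as a coordinate encoding.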