mmagic.models.editors.mspie.positional_encoding¶
Module Contents¶
Classes¶
SinusoidalPositionalEmbedding — Sinusoidal Positional Embedding 1D or 2D (SPE/SPE2d).
CatersianGrid — Cartesian Grid for 2d tensor.
- class mmagic.models.editors.mspie.positional_encoding.SinusoidalPositionalEmbedding(embedding_dim, padding_idx, init_size=1024, div_half_dim=False, center_shift=None)[source]¶
Bases:
mmengine.model.BaseModule
Sinusoidal Positional Embedding 1D or 2D (SPE/SPE2d).
This module is modified from: https://github.com/pytorch/fairseq/blob/master/fairseq/modules/sinusoidal_positional_embedding.py # noqa
Based on the original SPE in single dimension, we implement a 2D sinusoidal positional encoding (SPE2d), as introduced in Positional Encoding as Spatial Inductive Bias in GANs, CVPR’2021.
- Parameters
embedding_dim (int) – The number of dimensions for the positional encoding.
padding_idx (int | list[int]) – The index for the padding contents. The padding positions will obtain an encoding vector filling in zeros.
init_size (int, optional) – The initial size of the positional buffer. Defaults to 1024.
div_half_dim (bool, optional) – If true, the embedding will be divided by \(d/2\). Otherwise, it will be divided by \((d/2 -1)\). Defaults to False.
center_shift (int | None, optional) – Shift the center point to some index. Defaults to None.
- static get_embedding(num_embeddings, embedding_dim, padding_idx=None, div_half_dim=False)[source]¶
Build sinusoidal embeddings.
This matches the implementation in tensor2tensor, but differs slightly from the description in Section 3.5 of “Attention Is All You Need”.
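The construction above can be sketched as follows. This is an illustrative re-implementation, not the library's code: the function name is hypothetical, and it assumes an even `embedding_dim` (the real staticmethod may also handle odd dimensions and other edge cases).

```python
import math

import torch


def sinusoidal_table(num_embeddings, embedding_dim, padding_idx=None,
                     div_half_dim=False):
    """Illustrative sketch of a tensor2tensor-style sinusoidal table.

    Assumes an even ``embedding_dim``; the library's ``get_embedding``
    staticmethod is the authoritative implementation.
    """
    half_dim = embedding_dim // 2
    # div_half_dim controls whether frequencies are spaced by d/2 or d/2 - 1.
    denom = half_dim if div_half_dim else half_dim - 1
    freqs = torch.exp(
        torch.arange(half_dim, dtype=torch.float32)
        * -(math.log(10000.0) / denom))
    positions = torch.arange(num_embeddings, dtype=torch.float32).unsqueeze(1)
    angles = positions * freqs.unsqueeze(0)            # (num_embeddings, d/2)
    # First half of each row is sin, second half is cos.
    table = torch.cat([torch.sin(angles), torch.cos(angles)], dim=1)
    if padding_idx is not None:
        table[padding_idx] = 0.0                       # zero vector for padding
    return table


table = sinusoidal_table(8, 16, padding_idx=0)
```

Row 0 (the padding index) is all zeros, and each other row interleaves sines and cosines at geometrically spaced frequencies.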
- forward(input, **kwargs)[source]¶
Input is expected to be of size [bsz x seq_len].
The returned tensor is of size [bsz x seq_len x embed_dim].
- make_positions(input, padding_idx)[source]¶
Make position tensors.
- Parameters
input (Tensor) – Input tensor.
padding_idx (int | list[int]) – The index for the padding contents. The padding positions will obtain an encoding vector filled with zeros.
- Returns
Position tensors.
- Return type
tensor
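The fairseq-style position numbering this method follows can be sketched as a standalone function (hypothetical re-implementation for illustration; the class method operates on the module's own state):

```python
import torch


def make_positions(input, padding_idx):
    # fairseq convention: non-padding symbols get consecutive positions
    # starting at padding_idx + 1; padding symbols keep padding_idx itself,
    # so they later look up the all-zero row of the embedding table.
    mask = input.ne(padding_idx).long()
    return torch.cumsum(mask, dim=1) * mask + padding_idx


tokens = torch.tensor([[5, 7, 0, 0],
                       [3, 0, 0, 0]])   # 0 is the padding index here
positions = make_positions(tokens, padding_idx=0)
```

For the batch above, positions come out as `[[1, 2, 0, 0], [1, 0, 0, 0]]`: real tokens count up from 1 while padded slots stay at the padding index.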
- make_grid2d(height, width, num_batches=1, center_shift=None)[source]¶
Make 2-d grid mask.
- Parameters
height (int) – Height of the grid.
width (int) – Width of the grid.
num_batches (int, optional) – Batch size of the output grid. Defaults to 1.
center_shift (int | None, optional) – Shift the center point to some index. Defaults to None.
- Returns
2-d Grid mask.
- Return type
Tensor
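The shape of the grid this method produces can be illustrated with a simplified sketch. This hypothetical function only builds the raw (x, y) index grids; the real `make_grid2d` additionally passes them through the 1D sinusoidal table, applies any `center_shift`, and concatenates the two encodings channel-wise.

```python
import torch


def make_index_grid2d(height, width, num_batches=1):
    # Illustrative sketch: build per-pixel column (x) and row (y) indices.
    # Positions start at 1 so that 0 can remain the padding index.
    w_idx = torch.arange(1, width + 1)
    h_idx = torch.arange(1, height + 1)
    x = w_idx.unsqueeze(0).repeat(height, 1)      # (H, W) column indices
    y = h_idx.unsqueeze(1).repeat(1, width)       # (H, W) row indices
    grid = torch.stack([x, y], dim=0)             # (2, H, W)
    return grid.unsqueeze(0).repeat(num_batches, 1, 1, 1)   # (N, 2, H, W)


grid = make_index_grid2d(height=3, width=4, num_batches=2)
```

The result has shape `(N, 2, H, W)`; feeding each channel through the 1D embedding would yield the 2D positional encoding.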
- class mmagic.models.editors.mspie.positional_encoding.CatersianGrid(init_cfg: Union[dict, List[dict], None] = None)[source]¶
Bases:
mmengine.model.BaseModule
Cartesian Grid for 2d tensor.
The Cartesian grid is a commonly used positional encoding in deep learning. This implementation follows the convention of
grid_sample
in PyTorch. In other words,[-1, -1]
denotes the top-left corner while[1, 1]
denotes the bottom-right corner.
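The coordinate convention can be sketched as below. This is an illustrative standalone function (the class itself derives the grid from an input feature map's spatial size), but the normalization matches the `grid_sample` convention described above.

```python
import torch


def cartesian_grid(height, width, num_batches=1):
    # Normalized coordinates in the grid_sample convention:
    # (-1, -1) is the top-left corner, (1, 1) the bottom-right.
    ys = torch.linspace(-1.0, 1.0, height)
    xs = torch.linspace(-1.0, 1.0, width)
    y, x = torch.meshgrid(ys, xs, indexing='ij')   # each (H, W)
    grid = torch.stack([x, y], dim=0)              # (2, H, W): x, then y
    return grid.unsqueeze(0).repeat(num_batches, 1, 1, 1)


grid = cartesian_grid(4, 4)
```

Channel 0 sweeps x from -1 (left) to 1 (right) and channel 1 sweeps y from -1 (top) to 1 (bottom), so the grid can be concatenated onto a feature map as a coordinate encoding.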