
mmagic.models.editors.mspie.positional_encoding

Module Contents

Classes

SinusoidalPositionalEmbedding

Sinusoidal Positional Embedding 1D or 2D (SPE/SPE2d).

CatersianGrid

Cartesian grid for 2-d tensors.

class mmagic.models.editors.mspie.positional_encoding.SinusoidalPositionalEmbedding(embedding_dim, padding_idx, init_size=1024, div_half_dim=False, center_shift=None)[source]

Bases: mmengine.model.BaseModule

Sinusoidal Positional Embedding 1D or 2D (SPE/SPE2d).

This module is modified from: https://github.com/pytorch/fairseq/blob/master/fairseq/modules/sinusoidal_positional_embedding.py

Based on the original 1-D SPE, we implement a 2-D sinusoidal positional encoding (SPE2d), as introduced in Positional Encoding as Spatial Inductive Bias in GANs, CVPR'2021.

Parameters
  • embedding_dim (int) – The number of dimensions for the positional encoding.

  • padding_idx (int | list[int]) – The index for the padding contents. The padding positions will obtain an encoding vector filling in zeros.

  • init_size (int, optional) – The initial size of the positional buffer. Defaults to 1024.

  • div_half_dim (bool, optional) – If true, the embedding will be divided by \(d/2\). Otherwise, it will be divided by \((d/2 -1)\). Defaults to False.

  • center_shift (int | None, optional) – Shift the center point to some index. Defaults to None.

static get_embedding(num_embeddings, embedding_dim, padding_idx=None, div_half_dim=False)[source]

Build sinusoidal embeddings.

This matches the implementation in tensor2tensor, but differs slightly from the description in Section 3.5 of “Attention Is All You Need”.
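As a rough illustration of the tensor2tensor-style construction described above, the following numpy sketch builds such a table under the assumptions stated in the class parameters: frequencies are scaled by \(d/2\) when div_half_dim is true and by \(d/2 - 1\) otherwise, and the padding row is zeroed. This is a stand-in, not the module's actual torch code.

```python
import numpy as np

def get_embedding(num_embeddings, embedding_dim, padding_idx=None,
                  div_half_dim=False):
    """Sketch of a tensor2tensor-style sinusoidal table (assumed behavior)."""
    half_dim = embedding_dim // 2
    # Frequency scale: divide by d/2 when div_half_dim, else by d/2 - 1.
    denom = half_dim if div_half_dim else half_dim - 1
    freq = np.exp(np.arange(half_dim) * -(np.log(10000.0) / denom))
    args = np.arange(num_embeddings)[:, None] * freq[None, :]
    # First half of channels: sin, second half: cos.
    emb = np.concatenate([np.sin(args), np.cos(args)], axis=1)
    if padding_idx is not None:
        emb[padding_idx] = 0.0  # padding positions get an all-zero vector
    return emb
```

Position 0 encodes to zeros in the sin half and ones in the cos half, and the padding row is all zeros, matching the parameter description above.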

forward(input, **kwargs)[source]

Input is expected to be of size [bsz x seq_len].

Returned tensor is expected to be of size [bsz x seq_len x emb_dim].
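The shape contract can be sketched with a hypothetical numpy stand-in: forward amounts to gathering rows of a precomputed positional table, one row per token position.

```python
import numpy as np

bsz, seq_len, emb_dim = 2, 5, 8
# Hypothetical precomputed positional table (extra rows for padding offset).
table = np.random.randn(seq_len + 2, emb_dim)
# Positions start after the padding index, one per token.
positions = np.tile(np.arange(1, seq_len + 1), (bsz, 1))  # [bsz x seq_len]
out = table[positions]  # gather -> [bsz x seq_len x emb_dim]
```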

make_positions(input, padding_idx)[source]

Make position tensors.

Parameters
  • input (tensor) – Input tensor.

  • padding_idx (int | list[int]) – The index for the padding contents. The padding positions will obtain an encoding vector filling in zeros.

Returns

Position tensors.

Return type

tensor
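A minimal numpy sketch of the fairseq-style convention this method presumably follows: non-padding tokens receive consecutive positions starting at padding_idx + 1, while padding tokens keep padding_idx (so they map to the zeroed embedding row). This is an assumption, not the module's actual code.

```python
import numpy as np

def make_positions(input_ids, padding_idx):
    """Sketch (assumed fairseq-style behavior): number the non-padding
    tokens consecutively from padding_idx + 1; padding keeps padding_idx."""
    mask = (input_ids != padding_idx).astype(np.int64)
    return np.cumsum(mask, axis=1) * mask + padding_idx
```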

make_grid2d(height, width, num_batches=1, center_shift=None)[source]

Make 2-d grid mask.

Parameters
  • height (int) – Height of the grid.

  • width (int) – Width of the grid.

  • num_batches (int, optional) – The number of batch size. Defaults to 1.

  • center_shift (int | None, optional) – Shift the center point to some index. Defaults to None.

Returns

2-d Grid mask.

Return type

Tensor
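The (b, 2 x emb_dim, h, w) output shape suggests that the 2-D grid is built by encoding row and column indices with a 1-D sinusoidal code and stacking the two axes channel-wise. The following numpy sketch illustrates that idea under those assumptions (center_shift omitted for brevity); it is not the module's actual implementation.

```python
import numpy as np

def make_grid2d(height, width, emb_dim=4, num_batches=1):
    """Sketch: encode each spatial axis with a 1-D sinusoidal code,
    broadcast each over the other axis, and concatenate channel-wise."""
    half = emb_dim // 2
    freq = np.exp(np.arange(half) * -(np.log(10000.0) / max(half - 1, 1)))

    def encode(n):  # (n, emb_dim) 1-D sinusoidal code
        args = np.arange(n)[:, None] * freq[None, :]
        return np.concatenate([np.sin(args), np.cos(args)], axis=1)

    row = encode(height)  # (h, emb_dim)
    col = encode(width)   # (w, emb_dim)
    # Channels-first layout: broadcast each axis code over the other axis.
    grid_h = np.broadcast_to(row.T[:, :, None], (emb_dim, height, width))
    grid_w = np.broadcast_to(col.T[:, None, :], (emb_dim, height, width))
    grid = np.concatenate([grid_h, grid_w], axis=0)     # (2*emb_dim, h, w)
    return np.tile(grid[None], (num_batches, 1, 1, 1))  # (b, 2*emb_dim, h, w)
```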

make_grid2d_like(x, center_shift=None)[source]

Input: tensor with shape of (b, …, h, w). Returns: tensor with shape of (b, 2 x emb_dim, h, w).

Note that the positional embedding highly depends on the function make_positions.

class mmagic.models.editors.mspie.positional_encoding.CatersianGrid(init_cfg: Union[dict, List[dict], None] = None)[source]

Bases: mmengine.model.BaseModule

Cartesian grid for 2-d tensors.

The Cartesian grid is a commonly used positional encoding in deep learning. In this implementation, we follow the convention of grid_sample in PyTorch. In other words, [-1, -1] denotes the top-left corner while [1, 1] denotes the bottom-right corner.
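Following the grid_sample convention described above, the grid can be sketched in numpy as two coordinate channels spanning [-1, 1] along each spatial axis (a stand-in for illustration, not the module's actual code):

```python
import numpy as np

def make_cartesian_grid2d(height, width, num_batches=1):
    """Sketch: 2-channel Cartesian grid in the grid_sample convention,
    where (-1, -1) is the top-left and (1, 1) the bottom-right corner."""
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing='ij')  # each (h, w)
    grid = np.stack([grid_x, grid_y], axis=0)            # (2, h, w)
    return np.tile(grid[None], (num_batches, 1, 1, 1))   # (b, 2, h, w)
```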

forward(x, **kwargs)[source]
make_grid2d(height, width, num_batches=1, requires_grad=False)[source]
make_grid2d_like(x, requires_grad=False)[source]

Input: tensor with shape of (b, …, h, w). Returns: tensor with shape of (b, 2 x emb_dim, h, w).

Note that the positional embedding highly depends on the function make_grid2d.
