mmagic.models.editors.textual_inversion.textual_inversion

Module Contents

Classes

TextualInversion

Implementation of "An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion".

Attributes

logger

ModelType

mmagic.models.editors.textual_inversion.textual_inversion.logger[source]
mmagic.models.editors.textual_inversion.textual_inversion.ModelType[source]
class mmagic.models.editors.textual_inversion.textual_inversion.TextualInversion(placeholder_token: str, vae: ModelType, text_encoder: ModelType, tokenizer: str, unet: ModelType, scheduler: ModelType, test_scheduler: Optional[ModelType] = None, dtype: Optional[str] = None, enable_xformers: bool = True, noise_offset_weight: float = 0, tomesd_cfg: Optional[dict] = None, initialize_token: Optional[str] = None, num_vectors_per_token: int = 1, val_prompts=None, data_preprocessor: Optional[ModelType] = dict(type='DataPreprocessor'), init_cfg: Optional[dict] = None)[source]

Bases: mmagic.models.editors.stable_diffusion.stable_diffusion.StableDiffusion

Implementation of "An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion" (Textual Inversion), https://arxiv.org/abs/2208.01618.

Parameters
  • vae (Union[dict, nn.Module]) – The config or module for VAE model.

  • text_encoder (Union[dict, nn.Module]) – The config or module for text encoder.

  • tokenizer (str) – The name for CLIP tokenizer.

  • unet (Union[dict, nn.Module]) – The config or module for Unet model.

  • scheduler (Union[dict, nn.Module]) – The config or module for the diffusion scheduler.

  • test_scheduler (Union[dict, nn.Module], optional) – The config or module for the diffusion scheduler used in the test stage (self.infer). If not passed, the same scheduler as scheduler is used. Defaults to None.

  • dtype (str, optional) – The dtype for the model. Defaults to None.

  • enable_xformers (bool, optional) – Whether to use xformers. Defaults to True.

  • noise_offset_weight (float, optional) – The weight of the noise offset introduced in https://www.crosslabs.org/blog/diffusion-with-offset-noise. Defaults to 0.

  • tomesd_cfg (dict, optional) – The config for TOMESD. Please refer to https://github.com/dbolya/tomesd and https://github.com/open-mmlab/mmagic/blob/main/mmagic/models/utils/tome_utils.py for details. Defaults to None.

  • initialize_token (str, optional) – The initialization token for textual embedding to train. Defaults to None.

  • num_vectors_per_token (int) – The length of the learnable embedding. Defaults to 1.

  • val_prompts (Union[str, List[str]], optional) – The prompts for validation. Defaults to None.

  • data_preprocessor (dict, optional) – The pre-process config of BaseDataPreprocessor. Defaults to dict(type='DataPreprocessor').

  • init_cfg (dict, optional) – The weight initialization config for BaseModule. Defaults to None.
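A minimal configuration sketch, assuming the dict-style configs that the parameter types above (Union[dict, nn.Module]) suggest. Only the keyword names come from the signature; the path, the placeholder token, the prompt, and the angle-bracket `type` strings are hypothetical stand-ins that must be replaced with real values from your own setup.

```python
# Hypothetical values throughout: only the keyword names are taken from the
# TextualInversion signature documented above.
pretrained_path = 'path/to/stable-diffusion'  # assumption: your model location

model = dict(
    type='TextualInversion',
    placeholder_token='<my-concept>',   # assumption: any unused token string
    initialize_token='cat',             # assumption: a word close to the concept
    num_vectors_per_token=1,
    vae=dict(type='<your VAE type>'),                    # placeholder, not a real type name
    text_encoder=dict(type='<your text encoder type>'),  # placeholder
    tokenizer=pretrained_path,          # documented as a str naming the CLIP tokenizer
    unet=dict(type='<your UNet type>'),                  # placeholder
    scheduler=dict(type='<your scheduler type>'),        # placeholder
    test_scheduler=None,                # falls back to `scheduler`
    noise_offset_weight=0,
    val_prompts=['a photo of <my-concept>'],
    data_preprocessor=dict(type='DataPreprocessor'),
)
```

Leaving test_scheduler as None relies on the documented fallback to scheduler.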

prepare_models()[source]

Disable gradient for untrainable modules to save memory.

val_step(data: dict) → mmagic.utils.typing.SampleList[source]

Gets the generated image for the given data. Calls self.data_preprocessor and self.infer in order, then returns the generated results, which are passed to the evaluator or visualizer.

Parameters

data (dict or tuple or list) – Data sampled from dataset.

Returns

Generated image or image dict.

Return type

SampleList

test_step(data: dict) → mmagic.utils.typing.SampleList[source]

Gets the generated image for the given data. Calls self.data_preprocessor and self.infer in order, then returns the generated results, which are passed to the evaluator or visualizer.

Parameters

data (dict or tuple or list) – Data sampled from dataset.

Returns

Generated image or image dict.

Return type

SampleList
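The two-stage flow described for val_step and test_step (preprocess, then infer) can be sketched with a toy stand-in. This is not the MMagic implementation: the class, the 'prompt' key, and the string outputs are all hypothetical, chosen only to show the order of the calls.

```python
class SketchModel:
    """Hypothetical stand-in that mirrors only the documented call order."""

    def data_preprocessor(self, data):
        # Assumption: the batch is a dict holding a list of prompt strings.
        return {'prompt': [p.strip() for p in data['prompt']]}

    def infer(self, prompt):
        # Assumption: one generated sample per prompt; strings stand in for images.
        return [f'<image for "{p}">' for p in prompt]

    def val_step(self, data):
        # Step 1: preprocess the raw batch. Step 2: run inference on it.
        data = self.data_preprocessor(data)
        return self.infer(data['prompt'])

    # The page documents test_step with the same behaviour as val_step.
    test_step = val_step


samples = SketchModel().val_step({'prompt': ['  a cat  ']})
```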

add_tokens(placeholder_token: str, initialize_token: str = None, num_vectors_per_token: int = 1)[source]

Adds the placeholder token for training.

# TODO: support adding tokens as a dict so that pretrained tokens can be loaded.
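As a rough illustration of what adding a placeholder token with num_vectors_per_token embeddings might involve, here is a toy sketch over a plain vocabulary dict and a list-of-lists embedding table. It is an assumption-laden model, not the MMagic code: the `_1`, `_2` suffix naming, the zero initialization when no initialize_token is given, and the function name `add_tokens_sketch` are all hypothetical.

```python
def add_tokens_sketch(vocab, embeddings, placeholder_token,
                      initialize_token=None, num_vectors_per_token=1):
    """Toy version: vocab maps token -> row index, embeddings is a list of rows."""
    if placeholder_token in vocab:
        raise ValueError(f'{placeholder_token!r} is already in the vocabulary')

    dim = len(embeddings[0])
    if initialize_token is not None:
        # Start every new vector from the initializer token's embedding.
        init_vec = list(embeddings[vocab[initialize_token]])
    else:
        # Assumption: fall back to zeros when no initializer is given.
        init_vec = [0.0] * dim

    new_tokens = []
    for i in range(num_vectors_per_token):
        # Assumption: extra vectors get suffixed names such as "<tok>_1".
        name = placeholder_token if i == 0 else f'{placeholder_token}_{i}'
        vocab[name] = len(embeddings)
        embeddings.append(list(init_vec))  # copy so rows can train independently
        new_tokens.append(name)
    return new_tokens


vocab = {'a': 0, 'cat': 1}
embeddings = [[0.1, 0.2], [0.3, 0.4]]
tokens = add_tokens_sketch(vocab, embeddings, '<c>', 'cat', 2)
```

Only the newly appended rows would be trained in a textual inversion setup; the rest of the table stays frozen.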

train_step(data, optim_wrapper)[source]

Training step.
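The noise_offset_weight parameter refers to the offset-noise idea from the linked blog post: a single random shift per sample and channel is added to the usual per-pixel Gaussian noise. The sketch below shows that idea in plain Python, assuming the offset is applied when training noise is sampled; this page does not show how train_step uses the weight, and the function name and nested-list layout are hypothetical.

```python
import random


def offset_noise(batch, channels, height, width,
                 noise_offset_weight=0.0, seed=None):
    """Sample Gaussian noise of shape (batch, channels, height, width) as
    nested lists, then add one shared shift per (sample, channel)."""
    rng = random.Random(seed)
    noise = [[[[rng.gauss(0.0, 1.0) for _ in range(width)]
               for _ in range(height)]
              for _ in range(channels)]
             for _ in range(batch)]

    if noise_offset_weight:
        for b in range(batch):
            for c in range(channels):
                # One scalar per channel, broadcast over all spatial positions.
                shift = noise_offset_weight * rng.gauss(0.0, 1.0)
                noise[b][c] = [[v + shift for v in row] for row in noise[b][c]]
    return noise


plain = offset_noise(1, 2, 2, 2, noise_offset_weight=0.0, seed=0)
shifted = offset_noise(1, 2, 2, 2, noise_offset_weight=0.1, seed=0)
```

With a weight of 0 (the documented default) the function returns ordinary Gaussian noise, so the offset is effectively disabled.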
