`mmagic.models.editors.textual_inversion`¶

Package Contents¶

Classes¶

TextualInversion

Implementation of `An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion`.

class mmagic.models.editors.textual_inversion.TextualInversion(placeholder_token: str, vae: ModelType, text_encoder: ModelType, tokenizer: str, unet: ModelType, scheduler: ModelType, test_scheduler: Optional[ModelType] = None, dtype: Optional[str] = None, enable_xformers: bool = True, noise_offset_weight: float = 0, tomesd_cfg: Optional[dict] = None, initialize_token: Optional[str] = None, num_vectors_per_token: int = 1, val_prompts=None, data_preprocessor: Optional[ModelType] = dict(type='DataPreprocessor'), init_cfg: Optional[dict] = None)[source]¶

Bases: mmagic.models.editors.stable_diffusion.stable_diffusion.StableDiffusion

Implementation of `An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion <https://arxiv.org/abs/2208.01618>`_ (Textual Inversion).

Parameters

vae (Union[dict, nn.Module]) – The config or module for VAE model.
text_encoder (Union[dict, nn.Module]) – The config or module for text encoder.
tokenizer (str) – The name for CLIP tokenizer.
unet (Union[dict, nn.Module]) – The config or module for Unet model.
scheduler (Union[dict, nn.Module]) – The config or module for diffusion scheduler.
test_scheduler (Union[dict, nn.Module], optional) – The config or module for diffusion scheduler in test stage (self.infer). If not passed, the same scheduler as scheduler is used. Defaults to None.
dtype (str, optional) – The dtype for the model, e.g. 'fp16'. Defaults to None.
enable_xformers (bool, optional) – Whether to use xformers. Defaults to True.
noise_offset_weight (float, optional) – The weight of noise offset introduced in https://www.crosslabs.org/blog/diffusion-with-offset-noise. Defaults to 0.
tomesd_cfg (dict, optional) – The config for TOMESD. Please refer to https://github.com/dbolya/tomesd and https://github.com/open-mmlab/mmagic/blob/main/mmagic/models/utils/tome_utils.py for details. Defaults to None.
initialize_token (str, optional) – The initialization token for textual embedding to train. Defaults to None.
num_vectors_per_token (int) – The length of the learnable embedding. Defaults to 1.
val_prompts (Union[str, List[str]], optional) – The prompts for validation. Defaults to None.
data_preprocessor (dict, optional) – The pre-process config of BaseDataPreprocessor. Defaults to dict(type='DataPreprocessor').
init_cfg (dict, optional) – The weight initialized config for BaseModule. Defaults to None.
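
As a rough illustration of how the parameters above fit together, the following is a minimal config sketch. The top-level keys mirror the constructor signature; everything else is an assumption for illustration only: the checkpoint identifier, the placeholder and initialization tokens, and the inner `type` names and keys (`AutoencoderKL`, `UNet2DConditionModel`, `ClipWrapper`, `DDPMScheduler`, `DDIMScheduler`, `from_pretrained`, `subfolder`). Check them against the configs shipped with your mmagic version before use.

```python
# Hypothetical checkpoint identifier and tokens, chosen for illustration only.
pretrained = 'runwayml/stable-diffusion-v1-5'
placeholder_token = '<sks>'

model = dict(
    type='TextualInversion',
    # New token whose embedding is learned during training.
    placeholder_token=placeholder_token,
    # Existing token used to initialize the new embedding (optional).
    initialize_token='dog',
    # Number of embedding vectors learned for the placeholder.
    num_vectors_per_token=1,
    # The inner dicts below are assumptions about registered module names.
    vae=dict(type='AutoencoderKL', from_pretrained=pretrained,
             subfolder='vae'),
    unet=dict(type='UNet2DConditionModel', from_pretrained=pretrained,
              subfolder='unet'),
    text_encoder=dict(type='ClipWrapper', clip_type='huggingface',
                      pretrained_model_name_or_path=pretrained,
                      subfolder='text_encoder'),
    # The tokenizer is passed by name (a string), not as a config dict.
    tokenizer=pretrained,
    scheduler=dict(type='DDPMScheduler', from_pretrained=pretrained,
                   subfolder='scheduler'),
    # Optional separate scheduler for the test stage (self.infer).
    test_scheduler=dict(type='DDIMScheduler', from_pretrained=pretrained,
                        subfolder='scheduler'),
    val_prompts=[f'a photo of a {placeholder_token}'],
    data_preprocessor=dict(type='DataPreprocessor'),
)
```

Leaving `test_scheduler` out falls back to the training scheduler, as described in the parameter list.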

prepare_models()[source]¶: Disable gradient for untrainable modules to save memory.

val_step(data: dict) → mmagic.utils.typing.SampleList[source]¶

Gets the generated image for the given data. Calls self.data_preprocessor and self.infer in order, then returns the generated results, which are passed to the evaluator or visualizer.

Parameters: data (dict or tuple or list) – Data sampled from dataset.
Returns: Generated image or image dict.
Return type: SampleList

test_step(data: dict) → mmagic.utils.typing.SampleList[source]¶

Gets the generated image for the given data. Calls self.data_preprocessor and self.infer in order, then returns the generated results, which are passed to the evaluator or visualizer.

Parameters: data (dict or tuple or list) – Data sampled from dataset.
Returns: Generated image or image dict.
Return type: SampleList

add_tokens(placeholder_token: str, initialize_token: str = None, num_vectors_per_token: int = 1)[source]¶

Adds the placeholder token(s) to be trained.

TODO: support adding tokens as a dict, so that pretrained tokens can be loaded.
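
To make the roles of `placeholder_token`, `initialize_token` and `num_vectors_per_token` concrete, here is a toy sketch of the general textual-inversion idea: new rows are appended to an embedding table, optionally copied from an existing token. It is not mmagic's implementation. The helper `add_placeholder_tokens`, the `_0`, `_1` naming scheme for multi-vector tokens, and the random initialization scale are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional


def add_placeholder_tokens(
    vocab: Dict[str, int],
    embeddings: List[List[float]],
    placeholder_token: str,
    initialize_token: Optional[str] = None,
    num_vectors_per_token: int = 1,
    seed: int = 0,
) -> List[str]:
    """Append embedding rows for a placeholder token (hypothetical helper)."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    new_tokens = []
    for i in range(num_vectors_per_token):
        # Assumed naming scheme: one name per learnable vector.
        name = (placeholder_token if num_vectors_per_token == 1
                else f'{placeholder_token}_{i}')
        if name in vocab:
            raise ValueError(f'Token {name!r} already exists.')
        if initialize_token is not None:
            # Start from a copy of an existing token's embedding.
            row = list(embeddings[vocab[initialize_token]])
        else:
            # Otherwise start from small random values (assumed scale).
            row = [rng.gauss(0.0, 0.02) for _ in range(dim)]
        vocab[name] = len(embeddings)
        embeddings.append(row)
        new_tokens.append(name)
    return new_tokens


vocab = {'a': 0, 'photo': 1, 'dog': 2}
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
new_tokens = add_placeholder_tokens(
    vocab, embeddings, '<sks>',
    initialize_token='dog', num_vectors_per_token=2)
print(new_tokens)  # ['<sks>_0', '<sks>_1']
```

In an actual training setup only these new rows would receive gradient updates, while the rest of the model stays frozen.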

train_step(data, optim_wrapper)[source]¶: Training step.
