mmagic.models.editors.controlnet

Package Contents

Classes

ControlStableDiffusion

Implementation of ControlNet with Stable Diffusion.

Functions

change_base_model(→ torch.nn.Module)

This function is used to change the base model of ControlNet.

class mmagic.models.editors.controlnet.ControlStableDiffusion(vae: ModelType, text_encoder: ModelType, tokenizer: str, unet: ModelType, controlnet: ModelType, scheduler: ModelType, test_scheduler: Optional[ModelType] = None, dtype: str = 'fp32', enable_xformers: bool = True, noise_offset_weight: float = 0, tomesd_cfg: Optional[dict] = None, data_preprocessor=dict(type='DataPreprocessor'), init_cfg: Optional[dict] = None, attention_injection=False)[source]

Bases: mmagic.models.editors.stable_diffusion.StableDiffusion

Implementation of ControlNet with Stable Diffusion (https://arxiv.org/abs/2302.05543).

Parameters
  • vae (Union[dict, nn.Module]) – The config or module for VAE model.

  • text_encoder (Union[dict, nn.Module]) – The config or module for text encoder.

  • tokenizer (str) – The name for CLIP tokenizer.

  • unet (Union[dict, nn.Module]) – The config or module for Unet model.

  • controlnet (Union[dict, nn.Module]) – The config or module for ControlNet.

  • scheduler (Union[dict, nn.Module]) – The config or module for diffusion scheduler.

  • test_scheduler (Union[dict, nn.Module], optional) – The config or module for diffusion scheduler in test stage (self.infer). If not passed, will use the same scheduler as scheduler. Defaults to None.

  • dtype (str, optional) – The dtype for the model. Defaults to ‘fp32’.

  • enable_xformers (bool, optional) – Whether to use xformers. Defaults to True.

  • noise_offset_weight (float, optional) – The weight of the noise offset introduced in https://www.crosslabs.org/blog/diffusion-with-offset-noise. Defaults to 0.

  • tomesd_cfg (dict, optional) – The config to apply tomesd (token merging) speed-up to the model. Defaults to None.

  • data_preprocessor (dict, optional) – The pre-process config of BaseDataPreprocessor. Defaults to dict(type=’DataPreprocessor’).

  • init_cfg (dict, optional) – The weight initialized config for BaseModule. Defaults to None.
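
Example of building the model from a config (a minimal sketch; the repo ids and sub-module configs are illustrative assumptions drawn from typical MMagic ControlNet configs, not fixed defaults):

>>> # Illustrative repo ids; any compatible SD base + ControlNet works.
>>> from mmagic.registry import MODELS
>>> cfg = dict(
>>>     type='ControlStableDiffusion',
>>>     vae=dict(type='AutoencoderKL',
>>>              from_pretrained='runwayml/stable-diffusion-v1-5',
>>>              subfolder='vae'),
>>>     text_encoder=dict(type='ClipWrapper', clip_type='huggingface',
>>>                       pretrained_model_name_or_path='runwayml/stable-diffusion-v1-5',
>>>                       subfolder='text_encoder'),
>>>     tokenizer='runwayml/stable-diffusion-v1-5',
>>>     unet=dict(type='UNet2DConditionModel',
>>>               from_pretrained='runwayml/stable-diffusion-v1-5',
>>>               subfolder='unet'),
>>>     controlnet=dict(type='ControlNetModel',
>>>                     from_pretrained='lllyasviel/sd-controlnet-canny'),
>>>     scheduler=dict(type='DDPMScheduler',
>>>                    from_pretrained='runwayml/stable-diffusion-v1-5',
>>>                    subfolder='scheduler'),
>>>     init_cfg=dict(type='init_from_unet'))
>>> model = MODELS.build(cfg)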

init_weights()[source]

Initialize the weights. Note that this function will only be called during training. If you want to run inference with a different unet model, you can call this function manually or use mmagic.models.editors.controlnet.controlnet_utils.change_base_model to convert the weights of ControlNet manually.

Example:

>>> # 1. init controlnet from unet
>>> init_cfg = dict(type='init_from_unet')

>>> # 2. switch controlnet weight from unet
>>> # base model is not defined, use `runwayml/stable-diffusion-v1-5`
>>> # as default
>>> init_cfg = dict(type='convert_from_unet')
>>> # base model is defined
>>> init_cfg = dict(
>>>     type='convert_from_unet',
>>>     base_model=dict(
>>>         type='UNet2DConditionModel',
>>>         from_pretrained='REPO_ID',
>>>         subfolder='unet'))
train_step(data: dict, optim_wrapper: mmengine.optim.OptimWrapperDict) Dict[str, torch.Tensor][source]

Train step for ControlNet model.

Parameters
  • data (dict) – Data sampled from dataloader.

  • optim_wrapper (OptimWrapperDict) – OptimWrapperDict instance that contains the OptimWrapper of the generator and discriminator.

Returns

A dict of tensor for logging.

Return type

Dict[str, torch.Tensor]
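
A hedged usage sketch: in practice the MMEngine runner builds the OptimWrapperDict from the config and drives this call; here `model` is a built ControlStableDiffusion, `train_dataloader` is assumed to exist, and the optimizer key name 'controlnet' is an assumption.

>>> import torch
>>> from mmengine.optim import OptimWrapper, OptimWrapperDict
>>> # Only the ControlNet branch is optimized in typical ControlNet training.
>>> optim = torch.optim.AdamW(model.controlnet.parameters(), lr=1e-5)
>>> optim_wrapper = OptimWrapperDict(controlnet=OptimWrapper(optim))
>>> data = next(iter(train_dataloader))  # one raw batch; preprocessed inside
>>> log_vars = model.train_step(data, optim_wrapper)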

val_step(data: dict) mmagic.utils.typing.SampleList[source]

Gets the generated image of the given data. Calls self.data_preprocessor and self.infer in order, then returns the generated results, which will be passed to the evaluator or visualizer.

Parameters

data (dict or tuple or list) – Data sampled from dataset.

Returns

Generated image or image dict.

Return type

SampleList

test_step(data: dict) mmagic.utils.typing.SampleList[source]

Gets the generated image of the given data. Calls self.data_preprocessor and self.infer in order, then returns the generated results, which will be passed to the evaluator or visualizer.

Parameters

data (dict or tuple or list) – Data sampled from dataset.

Returns

Generated image or image dict.

Return type

SampleList
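
A hedged sketch of calling test_step directly; the runner normally drives this call, and `model` and `test_dataloader` are assumed to be built already.

>>> # `data` is one raw batch from the test dataloader; the model calls
>>> # self.data_preprocessor and self.infer internally.
>>> data = next(iter(test_dataloader))
>>> samples = model.test_step(data)  # a SampleList (list of DataSample)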

static prepare_control(image: Tuple[PIL.Image.Image, List[PIL.Image.Image], torch.Tensor, List[torch.Tensor]], width: int, height: int, batch_size: int, num_images_per_prompt: int, device: str, dtype: str) torch.Tensor[source]

A helper function to prepare a single control image for the pipeline.

Parameters
  • image (Tuple[Image.Image, List[Image.Image], Tensor, List[Tensor]]) – The input image for control.

  • width (int) – The target width of the control image.

  • height (int) – The target height of the control image.

  • batch_size (int) – The number of prompts. The control will be repeated batch_size times.

  • num_images_per_prompt (int) – The number of images to generate per prompt.

  • device (str) – The device of the control.

  • dtype (str) – The dtype of the control.

Returns

The control image as a torch.Tensor.

Return type

Tensor
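
A hedged sketch of preparing a control image by hand. The file path and sizes are illustrative, and despite the str annotation a torch dtype is assumed here, since the value ends up applied to the tensor.

>>> import torch
>>> from PIL import Image
>>> from mmagic.models.editors.controlnet import ControlStableDiffusion
>>> cond = Image.open('canny_edges.png')  # illustrative control image
>>> control = ControlStableDiffusion.prepare_control(
>>>     cond, width=512, height=512, batch_size=1,
>>>     num_images_per_prompt=1, device='cpu', dtype=torch.float32)
>>> control.shape  # expected (1, 3, 512, 512) under these assumptions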

train(mode: bool = True)[source]

Set train/eval mode.

Parameters

mode (bool, optional) – Whether to set training mode. Defaults to True.

infer(prompt: Union[str, List[str]], height: Optional[int] = None, width: Optional[int] = None, control: Optional[Union[str, numpy.ndarray, torch.Tensor]] = None, controlnet_conditioning_scale: float = 1.0, num_inference_steps: int = 20, guidance_scale: float = 7.5, negative_prompt: Optional[Union[str, List[str]]] = None, num_images_per_prompt: Optional[int] = 1, eta: float = 0.0, generator: Optional[torch.Generator] = None, latents: Optional[torch.FloatTensor] = None, return_type='image', show_progress=True)[source]

Function invoked when calling the pipeline for generation.

Parameters
  • prompt (str or List[str]) – The prompt or prompts to guide the image generation.

  • height (int, Optional) – The height in pixels of the generated image. If not passed, the height will be self.unet_sample_size * self.vae_scale_factor. Defaults to None.

  • width (int, Optional) – The width in pixels of the generated image. If not passed, the width will be self.unet_sample_size * self.vae_scale_factor. Defaults to None.

  • control (str or np.ndarray or torch.Tensor, optional) – The control image to condition the generation on. Defaults to None.

  • controlnet_conditioning_scale (float) – The scale applied to the outputs of the ControlNet before they are added to the residual of the original unet. Defaults to 1.0.

  • num_inference_steps (int) – The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference. Defaults to 20.

  • guidance_scale (float) – Guidance scale as defined in Classifier-Free Diffusion Guidance (https://arxiv.org/abs/2207.12598). Defaults to 7.5.

  • negative_prompt (str or List[str], optional) – The prompt or prompts not to guide the image generation. Ignored when not using guidance (i.e., ignored if guidance_scale is less than 1). Defaults to None.

  • num_images_per_prompt (int) – The number of images to generate per prompt. Defaults to 1.

  • eta (float) – Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to DDIMScheduler, will be ignored for others. Defaults to 0.0.

  • generator (torch.Generator, optional) – A torch generator to make generation deterministic. Defaults to None.

  • latents (torch.FloatTensor, optional) – Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image generation. Can be used to tweak the same generation with different prompts. If not provided, a latents tensor will be generated by sampling using the supplied random generator. Defaults to None.

  • return_type (str) – The return type of the inference results. Supported types are ‘image’, ‘numpy’, ‘tensor’. If ‘image’ is passed, a list of PIL images will be returned. If ‘numpy’ is passed, a numpy array with shape [N, C, H, W] will be returned, and the value range will be same as decoder’s output range. If ‘tensor’ is passed, the decoder’s output will be returned. Defaults to ‘image’.

Returns

A dict containing the generated images and the control image.

Return type

dict
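
A hedged usage sketch. The prompt and the control path are illustrative, and the 'samples' key of the returned dict is an assumption.

>>> result = model.infer(
>>>     prompt='a red brick house, best quality',
>>>     control='canny_edges.png',  # illustrative edge-map path
>>>     num_inference_steps=20,
>>>     guidance_scale=7.5,
>>>     return_type='image')
>>> result['samples'][0].save('output.png')  # key name is an assumption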

abstract forward(*args, **kwargs)[source]

forward is not implemented yet.

mmagic.models.editors.controlnet.change_base_model(controlnet: torch.nn.Module, curr_model: torch.nn.Module, base_model: torch.nn.Module, save_path: Optional[str] = None, *args, **kwargs) torch.nn.Module[source]

This function is used to change the base model of ControlNet. It follows https://github.com/lllyasviel/ControlNet/blob/main/tool_transfer_control.py.

Parameters
  • controlnet (nn.Module) – The model for ControlNet to convert.

  • curr_model (nn.Module) – The model of current Stable Diffusion’s Unet.

  • base_model (nn.Module) – The model of Stable Diffusion’s Unet with which the ControlNet was initialized.

  • save_path (str, optional) – The path to save the converted model. Defaults to None.

  • *args – Arguments for save_checkpoint.

  • **kwargs – Arguments for save_checkpoint.
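
A hedged usage sketch of transferring a ControlNet to a new base model. The repo ids are illustrative, and both UNets must share the same architecture for the weight transfer to be meaningful.

>>> from diffusers import UNet2DConditionModel
>>> from mmagic.models.editors.controlnet import change_base_model
>>> # `controlnet` is an existing ControlNet module, e.g. model.controlnet
>>> base_unet = UNet2DConditionModel.from_pretrained(
>>>     'runwayml/stable-diffusion-v1-5', subfolder='unet')
>>> curr_unet = UNet2DConditionModel.from_pretrained(
>>>     'REPO_ID', subfolder='unet')  # the new base model
>>> controlnet = change_base_model(
>>>     controlnet, curr_model=curr_unet, base_model=base_unet,
>>>     save_path='converted_controlnet.pth')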
