`mmagic.models.editors.stable_diffusion.stable_diffusion_inpaint`

Module Contents

Classes

StableDiffusionInpaint

Class for Stable Diffusion. Refers to https://github.com/Stability-AI/stablediffusion.

Functions

prepare_mask_and_masked_image(image, mask[, height, ...])

Prepare the binary mask and the masked image for inpainting.

Attributes

`logger`
`ModelType`

mmagic.models.editors.stable_diffusion.stable_diffusion_inpaint.logger

mmagic.models.editors.stable_diffusion.stable_diffusion_inpaint.ModelType

class mmagic.models.editors.stable_diffusion.stable_diffusion_inpaint.StableDiffusionInpaint(*args, **kwargs)

Bases: mmagic.models.editors.stable_diffusion.stable_diffusion.StableDiffusion

Class for Stable Diffusion. Refers to https://github.com/Stability-AI/stablediffusion and https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py.

Parameters

unet (Union[dict, nn.Module]) – The config or module for Unet model.
text_encoder (Union[dict, nn.Module]) – The config or module for text encoder.
vae (Union[dict, nn.Module]) – The config or module for VAE model.
tokenizer (str) – The name for CLIP tokenizer.
schedule (Union[dict, nn.Module]) – The config or module for diffusion scheduler.
test_scheduler (Union[dict, nn.Module], optional) – The config or module for diffusion scheduler in test stage (self.infer). If not passed, will use the same scheduler as schedule. Defaults to None.
dtype (str, optional) – The dtype for the model. This argument has no effect when dtype is defined for submodels. Defaults to None.
enable_xformers (bool, optional) – Whether to use xformers. Defaults to True.
noise_offset_weight (float, optional) – The weight of the noise offset introduced in https://www.crosslabs.org/blog/diffusion-with-offset-noise. Defaults to 0.
data_preprocessor (dict, optional) – The pre-process config of BaseDataPreprocessor.
init_cfg (dict, optional) – The weight initialized config for BaseModule.

infer(prompt: Union[str, List[str]], image: Union[torch.FloatTensor, PIL.Image.Image] = None, mask_image: Union[torch.FloatTensor, PIL.Image.Image] = None, height: Optional[int] = None, width: Optional[int] = None, num_inference_steps: int = 50, guidance_scale: float = 7.5, negative_prompt: Optional[Union[str, List[str]]] = None, num_images_per_prompt: Optional[int] = 1, eta: float = 0.0, generator: Optional[torch.Generator] = None, latents: Optional[torch.FloatTensor] = None, show_progress=True, seed=1, return_type='image')

Function invoked when calling the pipeline for generation.

Parameters

prompt (str or List[str]) – The prompt or prompts to guide the image generation.
image (Union[torch.FloatTensor, Image.Image]) – The image to inpaint.
mask_image (Union[torch.FloatTensor, Image.Image]) – The mask to apply to the image, i.e. regions to inpaint.
height (int, optional, defaults to self.unet_sample_size * self.vae_scale_factor) – The height in pixels of the generated image.
width (int, optional, defaults to self.unet_sample_size * self.vae_scale_factor) – The width in pixels of the generated image.
num_inference_steps (int, optional, defaults to 50) – The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
guidance_scale (float, optional, defaults to 7.5) – Guidance scale as defined in Classifier-Free Diffusion Guidance (https://arxiv.org/abs/2207.12598).
negative_prompt (str or List[str], optional) – The prompt or prompts not to guide the image generation. Ignored when not using guidance (i.e., ignored if guidance_scale is less than 1).
num_images_per_prompt (int, optional, defaults to 1) – The number of images to generate per prompt.
eta (float, optional, defaults to 0.0) – Corresponds to parameter eta (η) in the DDIM paper: https://arxiv.org/abs/2010.02502. Only applies to schedulers.DDIMScheduler and is ignored for other schedulers.
generator (torch.Generator, optional) – A torch generator to make generation deterministic.
latents (torch.FloatTensor, optional) – Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image generation. Can be used to tweak the same generation with different prompts. If not provided, a latents tensor will be generated by sampling using the supplied random generator.
return_type (str) – The return type of the inference results. Supported types are ‘image’, ‘numpy’, ‘tensor’. If ‘image’ is passed, a list of PIL images will be returned. If ‘numpy’ is passed, a numpy array with shape [N, C, H, W] will be returned, and the value range will be same as decoder’s output range. If ‘tensor’ is passed, the decoder’s output will be returned. Defaults to ‘image’.
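
As an illustration of what guidance_scale controls, the following is a minimal sketch of the classifier-free guidance combination described in the paper linked above. It is not taken from the MMagic source: the function name apply_guidance and the use of plain Python lists in place of tensors are assumptions made to keep the example self-contained.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale=7.5):
    """Combine unconditional and text-conditioned noise predictions.

    Hypothetical helper for illustration only. The inputs stand in for the
    two noise predictions produced from the negative (or empty) prompt and
    from the prompt, flattened to lists of floats.
    """
    return [
        u + guidance_scale * (t - u)
        for u, t in zip(noise_uncond, noise_text)
    ]


# With a scale of 1.0 the result equals the text-conditioned prediction,
# so the unconditional (negative prompt) branch has no effect.
print(apply_guidance([0.0, 1.0], [1.0, 1.0], guidance_scale=1.0))
# Larger scales push the result further away from the unconditional branch.
print(apply_guidance([0.0, 1.0], [1.0, 1.0], guidance_scale=2.0))
```

This is consistent with the note on negative_prompt: when the scale does not exceed 1, guidance contributes nothing beyond the prompt-conditioned prediction.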

Returns

A dict containing the generated images.

Return type

dict
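
To make the eta argument of infer more concrete, here is a small sketch of the per-step noise scale from the DDIM paper (its Eq. 16), in which eta multiplies the standard deviation of the noise added at each step. The function name ddim_sigma and the example values are assumptions for illustration; the scheduler used by MMagic may organise this computation differently.

```python
import math


def ddim_sigma(alpha_bar_t, alpha_bar_prev, eta=0.0):
    """Noise standard deviation for one DDIM step.

    alpha_bar_t and alpha_bar_prev are the cumulative noise-schedule
    products at the current and the previous (less noisy) timestep, with
    0 < alpha_bar_t < alpha_bar_prev <= 1. Hypothetical helper, not part
    of MMagic.
    """
    return (
        eta
        * math.sqrt((1 - alpha_bar_prev) / (1 - alpha_bar_t))
        * math.sqrt(1 - alpha_bar_t / alpha_bar_prev)
    )


# eta = 0.0 removes the added noise, so sampling is deterministic.
print(ddim_sigma(0.5, 0.8, eta=0.0))
# eta = 1.0 adds the full amount of noise at this step.
print(ddim_sigma(0.5, 0.8, eta=1.0))
```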

prepare_mask_latents(mask, masked_image, batch_size, num_channels_latents, height, width, dtype, device, generator, do_classifier_free_guidance)

Prepare latents for diffusion to run in latent space.

Parameters

mask (torch.Tensor) – The mask to apply to the image, i.e. regions to inpaint.
masked_image (torch.Tensor) – The image with the mask applied.
batch_size (int) – Batch size.
num_channels_latents (int) – Number of latent channels.
height (int) – Image height.
width (int) – Image width.
dtype (torch.dtype) – Floating-point type of the tensors.
device (torch.device) – Torch device.
generator (torch.Generator) – Generator for random functions. Defaults to None.
latents (torch.Tensor) – Pre-generated noisy latents, defaults to None.
do_classifier_free_guidance (bool) – Whether to apply classifier-free guidance.

Returns

Prepared latents.

Return type

torch.Tensor
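
The documentation above does not state the tensor shapes involved. The sketch below shows the shapes one would expect if this method follows the usual latent-space inpainting convention: the mask is resized to latent resolution and the batch is doubled when classifier-free guidance is on. The function name mask_latent_shapes, the default vae_scale_factor of 8, and the single-channel mask are all assumptions, not facts taken from the MMagic source.

```python
def mask_latent_shapes(batch_size, num_channels_latents, height, width,
                       do_classifier_free_guidance, vae_scale_factor=8):
    """Return the assumed (mask, masked-image-latents) shapes.

    Hypothetical helper for illustration only; it performs shape arithmetic
    and does not touch any tensors.
    """
    # Pixel dimensions shrink by the VAE downsampling factor.
    latent_h = height // vae_scale_factor
    latent_w = width // vae_scale_factor
    # Guidance runs an unconditional and a conditional pass in one batch.
    n = batch_size * 2 if do_classifier_free_guidance else batch_size
    mask_shape = (n, 1, latent_h, latent_w)
    masked_image_latents_shape = (n, num_channels_latents, latent_h, latent_w)
    return mask_shape, masked_image_latents_shape


print(mask_latent_shapes(1, 4, 512, 512, do_classifier_free_guidance=True))
```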

abstract val_step(data: dict) → mmagic.utils.typing.SampleList

Performs a validation step on the provided data.

This method is decorated with torch.no_grad(), so no gradients are computed during its operations. This keeps memory usage low during validation.

Parameters: data (dict) – Dictionary containing input data for validation.
Returns: List of samples processed during the validation step.
Return type: SampleList
Raises: NotImplementedError – This method has not been implemented.

abstract test_step(data: dict) → mmagic.utils.typing.SampleList

Performs a testing step on the provided data.

This method is decorated with torch.no_grad(), so no gradients are computed during its operations. This keeps memory usage low during testing.

Parameters: data (dict) – Dictionary containing input data for testing.
Returns: List of samples processed during the testing step.
Return type: SampleList
Raises: NotImplementedError – This method has not been implemented.

abstract train_step(data, optim_wrapper_dict)

Performs a training step on the provided data.

Parameters

data – Input data for training.
optim_wrapper_dict – Dictionary containing optimizer wrappers which may contain optimizers, schedulers, etc. required for the training step.

Raises

NotImplementedError – This method has not been implemented.

mmagic.models.editors.stable_diffusion.stable_diffusion_inpaint.prepare_mask_and_masked_image(image: torch.Tensor, mask: torch.Tensor, height: int = 512, width: int = 512, return_image: bool = False)

Prepare the binary mask and the masked image for inpainting.

Parameters

image (torch.Tensor) – The image to be masked.
mask (torch.Tensor) – The mask to apply to the image, i.e. regions to inpaint.
height (int) – Image height.
width (int) – Image width.
return_image (bool) – Whether to return the original image. Defaults to False.

Returns

mask (torch.Tensor) – A binary mask image.
masked_image (torch.Tensor) – The image with the mask applied.

Return type

tuple of torch.Tensor
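
The two return values can be pictured with a minimal sketch on nested Python lists instead of tensors. The threshold of 0.5 and the convention that a mask value of 1 marks a pixel to inpaint (and therefore to blank out) are assumptions borrowed from common inpainting pipelines; the function name binarize_and_apply is hypothetical.

```python
def binarize_and_apply(image, mask, threshold=0.5):
    """Binarize a soft mask and blank out the masked pixels of an image.

    Hypothetical helper for illustration only. image and mask are 2-D
    lists of floats with the same shape.
    """
    # Values at or above the threshold become 1.0 (inpaint), others 0.0.
    binary_mask = [
        [1.0 if m >= threshold else 0.0 for m in row] for row in mask
    ]
    # Keep pixels where the mask is 0 and zero out pixels where it is 1.
    masked_image = [
        [px * (1.0 - b) for px, b in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, binary_mask)
    ]
    return binary_mask, masked_image


image = [[0.2, 0.4], [0.9, 1.0]]
mask = [[0.1, 0.7], [0.5, 0.0]]
binary_mask, masked_image = binarize_and_apply(image, mask)
print(binary_mask)
print(masked_image)
```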
