`mmagic.models.editors.fastcomposer.fastcomposer_util`

Module Contents

Classes

`FastComposerModel`	FastComposerModel is based on the Stable Diffusion model and the CLIP model.
`FastComposerTextEncoder`	TextEncoder for FastComposerModel.
`FastComposerCLIPImageEncoder`	CLIPImageEncoder for FastComposerModel.
`FastComposerPostfuseModule`	Postfuse Module for FastComposerModel.
`BalancedL1Loss`	BalancedL1Loss for object localization.
`RandomZoomIn`	RandomZoomIn for object transform.
`PadToSquare`	If the height of the image is greater than the width, padding is added on both sides of the image to make it a square.
`CropTopSquare`	If the height of the image is greater than the width, the image is cropped into a square starting from the top of the image.
`MLP`	Multilayer Perceptron.

Functions

`get_object_transforms`(cfg)	Get object transforms.
`unet_store_cross_attention_scores`(unet, attention_scores[, layers])	Store cross-attention scores from the UNet.
`get_object_localization_loss`(cross_attention_scores, ...)	Average the object localization loss over all layers.
`get_object_localization_loss_for_one_layer`(...)	Get the object localization loss for one layer.
`fuse_object_embeddings`(inputs_embeds, ...[, fuse_fn])	Fuse object embeddings.
`build_causal_attention_mask`(bsz, seq_len, dtype[, device])	Build a causal attention mask. This function originally belonged to CLIPTextTransformer but was removed from transformers in versions after 4.25.1.

Attributes

`_expand_mask`

mmagic.models.editors.fastcomposer.fastcomposer_util._expand_mask


class mmagic.models.editors.fastcomposer.fastcomposer_util.FastComposerModel(text_encoder, image_encoder, vae, unet, cfg)

Bases: torch.nn.Module

FastComposerModel is based on the Stable Diffusion model and the CLIP model.

_clear_cross_attention_scores(): Delete the stored cross-attention scores.

static from_pretrained(cfg, vae, unet): Initialize the FastComposerTextEncoder and FastComposerCLIPImageEncoder.

forward(batch, noise_scheduler)

Forward function.

Parameters

batch (torch.Tensor) – You can directly input a torch.Tensor.
noise_scheduler (torch.Tensor) – You can directly input a torch.Tensor.

Returns

Dict

class mmagic.models.editors.fastcomposer.fastcomposer_util.FastComposerTextEncoder(text_model)

Bases: transformers.CLIPPreTrainedModel

TextEncoder for FastComposerModel.

static from_pretrained(model_name_or_path, **kwargs): Initialize the text encoder from a Stable Diffusion model name or path.

forward(input_ids, image_token_mask=None, object_embeds=None, num_objects=None, attention_mask: Optional[torch.Tensor] = None, output_attentions: Optional[bool] = None, output_hidden_states: Optional[bool] = None, return_dict: Optional[bool] = None) → Union[Tuple, transformers.modeling_outputs.BaseModelOutputWithPooling]

Forward function.

Parameters

input_ids (torch.Tensor) – You can directly input a torch.Tensor.
image_token_mask (torch.Tensor) – You can directly input a torch.Tensor.
object_embeds (torch.Tensor) – You can directly input a torch.Tensor.
num_objects (torch.Tensor) – You can directly input a torch.Tensor.
attention_mask (torch.Tensor) – You can directly input a torch.Tensor.
output_attentions (bool) – Defaults to None.
output_hidden_states (bool) – Defaults to None.
return_dict (bool) – Defaults to None.

Returns

Union[Tuple, BaseModelOutputWithPooling]

class mmagic.models.editors.fastcomposer.fastcomposer_util.FastComposerCLIPImageEncoder(vision_model, visual_projection, vision_processor)

Bases: transformers.CLIPPreTrainedModel

CLIPImageEncoder for FastComposerModel.

static from_pretrained(global_model_name_or_path): Initialize the CLIPModel from a CLIP model name or path.

forward(object_pixel_values)

Forward function.

Parameters: object_pixel_values (torch.Tensor) – You can directly input a torch.Tensor.
Returns: A torch.Tensor is returned.
Return type: torch.Tensor

mmagic.models.editors.fastcomposer.fastcomposer_util.get_object_transforms(cfg): Get object transforms.

class mmagic.models.editors.fastcomposer.fastcomposer_util.FastComposerPostfuseModule(embed_dim)

Bases: torch.nn.Module

Postfuse Module for FastComposerModel.

fuse_fn(text_embeds, object_embeds)

Fuse function.

Parameters

text_embeds (torch.Tensor) – You can directly input a torch.Tensor.
object_embeds (torch.Tensor) – You can directly input a torch.Tensor.

Returns

A torch.Tensor is returned.

Return type

torch.Tensor

forward(text_embeds, object_embeds, image_token_mask, num_objects) → torch.Tensor

Forward function.

Parameters

text_embeds (torch.Tensor) – You can directly input a torch.Tensor.
object_embeds (torch.Tensor) – You can directly input a torch.Tensor.
image_token_mask (torch.Tensor) – You can directly input a torch.Tensor.
num_objects (torch.Tensor) – You can directly input a torch.Tensor.

Returns

A torch.Tensor is returned.

Return type

torch.Tensor

mmagic.models.editors.fastcomposer.fastcomposer_util.unet_store_cross_attention_scores(unet, attention_scores, layers=5): Store cross-attention scores from the UNet.

class mmagic.models.editors.fastcomposer.fastcomposer_util.BalancedL1Loss(threshold=1.0, normalize=False)

Bases: torch.nn.Module

BalancedL1Loss for object localization.

forward(object_token_attn_prob, object_segmaps)

Forward function.

Parameters

object_token_attn_prob (torch.Tensor) – You can directly input a torch.Tensor.
object_segmaps (torch.Tensor) – You can directly input a torch.Tensor.

Returns

A float is returned.

Return type

float
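
The page does not spell out the formula behind this loss. As a rough illustration only, the sketch below assumes a "balanced" form that compares the mean attention falling on the background with the mean attention falling on the object mask, so the value decreases as attention concentrates on the object. The function name `balanced_l1_sketch`, the `eps` term, and the flat-list inputs are all hypothetical; the real module operates on torch tensors and may differ.

```python
def balanced_l1_sketch(attn_prob, segmap, eps=1e-5):
    """Mean attention on the background minus mean attention on the object.

    attn_prob: attention probabilities for one object token, one per pixel.
    segmap:    binary object mask (1.0 inside the object, 0.0 outside).
    This formulation is an assumption made for illustration.
    """
    background = [1.0 - m for m in segmap]
    # Average attention inside the object region (eps avoids division by zero).
    object_mean = sum(a * m for a, m in zip(attn_prob, segmap)) / (sum(segmap) + eps)
    # Average attention outside the object region.
    background_mean = sum(a * b for a, b in zip(attn_prob, background)) / (sum(background) + eps)
    return background_mean - object_mean


# Attention mostly on the object gives a negative (good) value.
focused = balanced_l1_sketch([0.9, 0.8, 0.1, 0.0], [1.0, 1.0, 0.0, 0.0])
# Attention mostly on the background gives a positive (bad) value.
leaked = balanced_l1_sketch([0.1, 0.0, 0.9, 0.8], [1.0, 1.0, 0.0, 0.0])
```

Under this assumed form, minimizing the value pushes the token's attention toward its segmentation mask.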

mmagic.models.editors.fastcomposer.fastcomposer_util.get_object_localization_loss(cross_attention_scores, object_segmaps, image_token_idx, image_token_idx_mask, loss_fn): Average the object localization loss over all layers.

mmagic.models.editors.fastcomposer.fastcomposer_util.get_object_localization_loss_for_one_layer(cross_attention_scores, object_segmaps, object_token_idx, object_token_idx_mask, loss_fn): Get the object localization loss for one layer.
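
The relationship between these two functions (a per-layer loss, then an average across layers) can be sketched in plain Python. The helper `average_layer_losses`, the layer names, and the stand-in per-layer loss are hypothetical and only illustrate the averaging step; they are not the module's actual signatures.

```python
def average_layer_losses(scores_by_layer, layer_loss_fn):
    """Apply a per-layer loss to every layer and return the mean."""
    if not scores_by_layer:
        raise ValueError("at least one layer is required")
    total = 0.0
    for layer_scores in scores_by_layer.values():
        total += layer_loss_fn(layer_scores)
    return total / len(scores_by_layer)


# Stand-in per-layer loss: the mean of the scores in that layer.
def mean_loss(scores):
    return sum(scores) / len(scores)


example_scores = {"down": [0.2, 0.4], "mid": [0.6], "up": [1.0, 1.0]}
average = average_layer_losses(example_scores, mean_loss)
```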

class mmagic.models.editors.fastcomposer.fastcomposer_util.RandomZoomIn(min_zoom=1.0, max_zoom=1.5)

Bases: torch.nn.Module

RandomZoomIn for object transform.

forward(image: torch.Tensor)

Forward function.

Parameters: image (torch.Tensor) – You can directly input a torch.Tensor.
Returns: A torch.Tensor is returned.
Return type: torch.Tensor

class mmagic.models.editors.fastcomposer.fastcomposer_util.PadToSquare(fill=0, padding_mode='constant')

Bases: torch.nn.Module

If the height of the image is greater than the width, padding is added on both sides of the image to make it a square.

forward(image: torch.Tensor)

Forward function.

Parameters: image (torch.Tensor) – You can directly input a torch.Tensor.
Returns: A torch.Tensor is returned.
Return type: torch.Tensor
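
The padding geometry described above can be sketched without torch, using a nested list as a single-channel image. The names `square_padding` and `pad_to_square` are hypothetical. Two details are assumptions not stated on this page: images wider than tall are padded on the top and bottom, and an odd size difference puts the extra pixel on the second side so the result is exactly square.

```python
def square_padding(height, width):
    """Return (left, right, top, bottom) padding that makes the image square."""
    diff = abs(height - width)
    before, after = diff // 2, diff - diff // 2
    if height > width:
        return (before, after, 0, 0)   # taller than wide: pad left and right
    return (0, 0, before, after)       # assumption: wider than tall pads top and bottom


def pad_to_square(rows, fill=0):
    """Pad a 2-D nested list with a constant fill value."""
    height, width = len(rows), len(rows[0])
    left, right, top, bottom = square_padding(height, width)
    new_width = left + width + right
    padded = [[fill] * left + list(row) + [fill] * right for row in rows]
    top_rows = [[fill] * new_width for _ in range(top)]
    bottom_rows = [[fill] * new_width for _ in range(bottom)]
    return top_rows + padded + bottom_rows


tall = [[1], [2], [3]]            # 3 rows, 1 column
squared = pad_to_square(tall)     # one fill column on each side
```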

class mmagic.models.editors.fastcomposer.fastcomposer_util.CropTopSquare

Bases: torch.nn.Module

If the height of the image is greater than the width, the image is cropped into a square starting from the top of the image.

forward(image: torch.Tensor)

Forward function.

Parameters: image (torch.Tensor) – You can directly input a torch.Tensor.
Returns: A torch.Tensor is returned.
Return type: torch.Tensor
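
A minimal sketch of the cropping rule above, again on a nested list rather than a tensor. The function name `crop_top_square` is hypothetical, and returning the image unchanged when it is not taller than wide is an assumption, since the description only covers the taller case.

```python
def crop_top_square(rows):
    """Keep the top `width` rows when the image is taller than it is wide."""
    height, width = len(rows), len(rows[0])
    if height <= width:
        # Assumption: images that are square or wider are left as they are.
        return [list(row) for row in rows]
    return [list(row) for row in rows[:width]]


portrait = [[1, 2], [3, 4], [5, 6]]    # 3 rows, 2 columns
cropped = crop_top_square(portrait)    # keeps the first 2 rows
```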

class mmagic.models.editors.fastcomposer.fastcomposer_util.MLP(in_dim, out_dim, hidden_dim, use_residual=True)

Bases: torch.nn.Module

Multilayer Perceptron.

forward(x)

Forward function.

Parameters: x (torch.Tensor) – You can directly input a torch.Tensor.
Returns: A torch.Tensor is returned.
Return type: torch.Tensor

mmagic.models.editors.fastcomposer.fastcomposer_util.fuse_object_embeddings(inputs_embeds, image_token_mask, object_embeds, num_objects, fuse_fn=torch.add): Fuse object embeddings.
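
As a rough picture of what fusing at image-token positions could look like, the sketch below walks one token sequence and combines each masked position with the next object embedding, defaulting to element-wise addition to mirror the `fuse_fn=torch.add` default in the signature. The name `fuse_sequence` and the single-sequence, nested-list layout are hypothetical; how the real function handles batching and `num_objects` is not described on this page.

```python
def fuse_sequence(token_embeds, image_token_mask, object_embeds, fuse_fn=None):
    """Fuse object embeddings into the positions flagged by image_token_mask."""
    if fuse_fn is None:
        # Element-wise addition, standing in for torch.add.
        fuse_fn = lambda text, obj: [t + o for t, o in zip(text, obj)]
    if sum(bool(flag) for flag in image_token_mask) != len(object_embeds):
        raise ValueError("need exactly one object embedding per masked position")
    objects = iter(object_embeds)
    fused = []
    for embed, is_image_token in zip(token_embeds, image_token_mask):
        if is_image_token:
            fused.append(fuse_fn(embed, next(objects)))
        else:
            fused.append(list(embed))  # ordinary text tokens pass through
    return fused


tokens = [[1, 1], [2, 2], [3, 3]]
mask = [False, True, False]
objects = [[10, 20]]
result = fuse_sequence(tokens, mask, objects)
```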

mmagic.models.editors.fastcomposer.fastcomposer_util.build_causal_attention_mask(bsz, seq_len, dtype, device=None): Build a causal attention mask. This function originally belonged to CLIPTextTransformer but was removed from transformers in versions after 4.25.1.
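
For readers unfamiliar with causal masks, here is a minimal plain-Python sketch of the idea: each query position may attend to itself and earlier positions, and later positions receive a large negative additive value. The name `causal_mask` and the use of `float("-inf")` are illustrative assumptions; the function documented above also takes `bsz`, `dtype`, and `device`, and its exact fill value and output shape are not stated here.

```python
def causal_mask(seq_len, masked_value=float("-inf")):
    """Additive mask: 0.0 where attention is allowed, masked_value elsewhere."""
    return [
        [0.0 if key <= query else masked_value for key in range(seq_len)]
        for query in range(seq_len)
    ]


mask3 = causal_mask(3)
# Row i is the mask for query position i; entries right of the diagonal are masked.
```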
