`mmagic.models.editors.disco_diffusion.guider`¶

Module Contents¶

Classes¶

`MakeCutouts`	Each iteration, the AI cuts the image into smaller pieces known as cuts.
`MakeCutoutsDango`	Dango233 (https://github.com/Dango233)'s version of MakeCutouts.
`ImageTextGuider`	Disco-Diffusion uses text and images to guide image generation.

Functions¶

`sinc`(x)	Sinc function.
`lanczos`(x, a)	Lanczos filter's reconstruction kernel L(x).
`ramp`(ratio, width)	_summary_
`resample`(input, size[, align_corners])	Lanczos resampling image.
`range_loss`(input)	Range loss.
`spherical_dist_loss`(x, y)	Spherical distance loss.
`parse_prompt`(prompt)	Parse prompt, return text and text weight.
`split_prompts`(prompts[, max_frames])	Split prompts to a list of prompts.

Attributes¶

`clip`
`normalize`

mmagic.models.editors.disco_diffusion.guider.clip[source]¶

mmagic.models.editors.disco_diffusion.guider.normalize[source]¶

mmagic.models.editors.disco_diffusion.guider.sinc(x)[source]¶

Sinc function. If x equals 0, sinc(x) = 1; otherwise sinc(x) = sin(x) / x.

Parameters: x (torch.Tensor) – Input tensor.
Returns: Function output.
Return type: torch.Tensor

mmagic.models.editors.disco_diffusion.guider.lanczos(x, a)[source]¶: Lanczos filter’s reconstruction kernel L(x).
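
As a minimal sketch of the kernel, assuming the textbook definition L(x) = sinc(x) · sinc(x / a) for |x| < a and 0 elsewhere, with the normalized sinc sin(πx) / (πx). The π factor and the scalar, pure-Python form are assumptions made for illustration; the module's own functions operate on `torch.Tensor` inputs and may differ in detail.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc (assumed form): 1 at x == 0, else sin(pi*x) / (pi*x)."""
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def lanczos(x: float, a: int) -> float:
    """Textbook Lanczos kernel with support (-a, a)."""
    if abs(x) >= a:
        return 0.0
    return sinc(x) * sinc(x / a)


# Kernel weights sampled at integer and half-integer offsets for a = 2.
weights = [lanczos(k / 2, 2) for k in range(-4, 5)]
```

The kernel peaks at 1 for x = 0 and vanishes at every other integer offset, which is why it interpolates the original samples exactly.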

mmagic.models.editors.disco_diffusion.guider.ramp(ratio, width)[source]¶

_summary_

Parameters

ratio (_type_) – _description_
width (_type_) – _description_

Returns

_description_

Return type

_type_

mmagic.models.editors.disco_diffusion.guider.resample(input, size, align_corners=True)[source]¶

Resample an image with a Lanczos filter.

Parameters

input (torch.Tensor) – Input image tensor.
size (Tuple[int, int]) – Output image size.
align_corners (bool) – align_corners argument of F.interpolate. Defaults to True.

Returns

Resampling results.

Return type

torch.Tensor

mmagic.models.editors.disco_diffusion.guider.range_loss(input)[source]¶: Range loss.

mmagic.models.editors.disco_diffusion.guider.spherical_dist_loss(x, y)[source]¶: Spherical distance loss.
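
The two losses above are documented only by name. The sketch below shows the formulas commonly used for them in CLIP-guided diffusion code: a range loss that penalizes values outside [-1, 1], and a spherical distance of 2 · arcsin(‖x̂ − ŷ‖ / 2)² between unit-normalized vectors. Both formulas are assumptions here, not confirmed by this page, and the pure-Python list inputs stand in for tensors.

```python
import math
from typing import List, Sequence


def range_loss(pixels: Sequence[float]) -> float:
    """Mean squared distance of each value from the interval [-1, 1]."""
    excess = [p - max(-1.0, min(1.0, p)) for p in pixels]
    return sum(e * e for e in excess) / len(pixels)


def _unit(v: Sequence[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def spherical_dist_loss(x: Sequence[float], y: Sequence[float]) -> float:
    """2 * arcsin(chord / 2) ** 2 for the unit-normalized inputs."""
    xh, yh = _unit(x), _unit(y)
    chord = math.sqrt(sum((a - b) ** 2 for a, b in zip(xh, yh)))
    # Guard against rounding pushing the arcsin argument just above 1.
    half_chord = min(1.0, chord / 2)
    return 2 * math.asin(half_chord) ** 2
```

Under these assumptions, identical directions give a distance of 0 and opposite directions give the maximum, π² / 2.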

class mmagic.models.editors.disco_diffusion.guider.MakeCutouts(cut_size, cutn)[source]¶

Bases: torch.nn.Module

Each iteration, the AI cuts the image into smaller pieces known as cuts, and compares each cut to the prompt to decide how to guide the next diffusion step. This class randomly cuts patches and applies image augmentation to them.

Parameters

cut_size (int) – Size of the patches.
cutn (int) – Number of patches to cut.

forward(input, skip_augs=False)[source]¶
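
To illustrate the random cutting step, here is a simplified sketch that only picks the crop boxes. It assumes square crops whose side is drawn uniformly between `cut_size` and the shorter image side; the sampling rule, the helper name `random_cut_boxes`, and the box layout are hypothetical, and the real class also augments each patch and resizes it to `cut_size`.

```python
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def random_cut_boxes(height: int, width: int, cut_size: int, cutn: int,
                     rng: random.Random) -> List[Box]:
    """Pick `cutn` random square regions that fit inside the image."""
    max_side = min(height, width)
    min_side = min(cut_size, max_side)
    boxes = []
    for _ in range(cutn):
        side = rng.randint(min_side, max_side)
        left = rng.randint(0, width - side)
        top = rng.randint(0, height - side)
        boxes.append((left, top, left + side, top + side))
    return boxes


boxes = random_cut_boxes(height=256, width=320, cut_size=64, cutn=8,
                         rng=random.Random(0))
```

Passing an explicit `random.Random` keeps the sketch reproducible; each box would then be cropped from the image and resized before being compared to the prompt.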

class mmagic.models.editors.disco_diffusion.guider.MakeCutoutsDango(cut_size, Overview=4, InnerCrop=0, IC_Size_Pow=0.5, IC_Grey_P=0.2)[source]¶

Bases: torch.nn.Module

Dango233 (https://github.com/Dango233)’s version of MakeCutouts.

The improvement over MakeCutouts is that it uses partial greyscale augmentation to capture structure, and partial rotation augmentation to capture whole frames.

Parameters

cut_size (int) – Size of the patches.
Overview (int) – The total number of overview cuts. In detail: Overview=1, add the whole frame; Overview=2, add a grayscale frame; Overview=3, add a horizontally flipped frame; Overview=4, add a grayscale horizontally flipped frame; Overview>4, add the frame repeatedly, Overview times. Defaults to 4.
InnerCrop (int) – The total number of inner cuts. Defaults to 0.
IC_Size_Pow (float) – Sets the size of the border used for inner cuts. Higher values give larger borders, so the cuts themselves are smaller and provide finer details. Defaults to 0.5.
IC_Grey_P (float) – The portion of the inner cuts that is set to grayscale instead of color. This may help with improved definition of shapes and edges, especially in the early diffusion steps where the image structure is being defined. Defaults to 0.2.

forward(input, skip_augs=False)[source]¶: Forward function.
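
The `Overview` rules in the parameter list can be sketched as a small lookup. This reads the description cumulatively (Overview=3 yields the first three variants), which matches "total number of overview cuts" but is still an interpretation; the function name `overview_cut_plan` and the string labels are hypothetical.

```python
from typing import List


def overview_cut_plan(overview: int) -> List[str]:
    """Describe the overview cuts produced for a given `Overview` value."""
    variants = [
        "whole frame",
        "grayscale frame",
        "horizontally flipped frame",
        "grayscale horizontally flipped frame",
    ]
    if overview <= 4:
        # Values 1..4 add the variants cumulatively, in the documented order.
        return variants[:max(overview, 0)]
    # Above 4, the plain frame is repeated `overview` times.
    return ["whole frame"] * overview


default_plan = overview_cut_plan(4)
```

In either branch the plan contains exactly `Overview` entries, so the value also serves as the count of overview cuts.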

mmagic.models.editors.disco_diffusion.guider.parse_prompt(prompt)[source]¶: Parse prompt, return text and text weight.
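
A minimal sketch of this kind of parsing, assuming prompts are written as `text:weight` with the weight optional and defaulting to 1.0. The syntax, the default, and the helper name `parse_weighted_prompt` are assumptions for illustration and may not match the library function exactly.

```python
from typing import Tuple


def parse_weighted_prompt(prompt: str) -> Tuple[str, float]:
    """Split 'text:weight' into (text, weight); weight defaults to 1.0."""
    text, sep, tail = prompt.rpartition(":")
    if sep:
        try:
            return text, float(tail)
        except ValueError:
            pass  # The part after the last colon is not a number.
    return prompt, 1.0


text, weight = parse_weighted_prompt("a lighthouse at dusk:2")
```

Splitting on the last colon only keeps colons inside the text intact, for example in `ratio 16:9 poster:0.5`.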

mmagic.models.editors.disco_diffusion.guider.split_prompts(prompts, max_frames=1)[source]¶: Split prompts to a list of prompts.
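
One plausible reading of this function, sketched below, is that prompts are given per keyframe and expanded to one entry per frame by carrying the most recent keyframe forward. The dictionary input keyed by frame index, the fill rule, and the name `fill_frame_prompts` are all assumptions; the page itself does not specify the input format.

```python
from typing import Dict, List


def fill_frame_prompts(keyframes: Dict[int, List[str]],
                       max_frames: int = 1) -> List[List[str]]:
    """Expand keyframe prompts to one prompt list per frame."""
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    frames = []
    # Frames before the first keyframe borrow the first keyframe's prompts.
    current = keyframes[min(keyframes)]
    for index in range(max_frames):
        current = keyframes.get(index, current)
        frames.append(current)
    return frames


per_frame = fill_frame_prompts({0: ["a forest"], 2: ["a city"]}, max_frames=4)
```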

class mmagic.models.editors.disco_diffusion.guider.ImageTextGuider(clip_models)[source]¶

Bases: torch.nn.Module

Disco-Diffusion uses text and images to guide image generation. The CLIP models extract text and image features from the prompts. During each iteration, the features of the image patches are computed, along with the similarity loss between the prompt features and the generated features. Additional losses include the RGB range loss and the total variation loss. Together, these losses guide the image generation towards the desired target.

Parameters: clip_models (List[Dict]) – List of clip model settings.
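
How the individual losses combine can be sketched as a weighted sum. The default weights below are taken from the `cond_fn` signature on this page (`clip_guidance_scale=5000`, `tv_scale=0.0`, `range_scale=150`, `sat_scale=0.0`); the weighted-sum form itself and the function name `guidance_loss` are assumptions, since the page does not spell out how the terms are merged.

```python
def guidance_loss(similarity_loss: float, tv_loss: float, range_loss: float,
                  sat_loss: float = 0.0, clip_guidance_scale: float = 5000,
                  tv_scale: float = 0.0, range_scale: float = 150,
                  sat_scale: float = 0.0) -> float:
    """Weighted sum of the guidance terms (assumed combination rule)."""
    return (clip_guidance_scale * similarity_loss
            + tv_scale * tv_loss
            + range_scale * range_loss
            + sat_scale * sat_loss)


total = guidance_loss(similarity_loss=0.001, tv_loss=0.5, range_loss=0.01)
```

With the default weights, the total variation and saturation terms contribute nothing, so guidance is driven by the CLIP similarity and range terms.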

property device[source]¶

Get current device of the model.

Returns: The current device of the model.
Return type: torch.device

frame_prompt_from_text(text_prompts, frame_num=0)[source]¶: Get current frame prompt.

compute_prompt_stats(text_prompts=[], image_prompt=None, fuzzy_prompt=False, rand_mag=0.05)[source]¶

Compute prompt statistics.

Parameters

text_prompts (list) – Text prompts. Defaults to [].
image_prompt (list) – Image prompts. Defaults to None.
fuzzy_prompt (bool, optional) – Controls whether to add multiple noisy prompts to the prompt losses. If True, this can increase the variability of the image output. Defaults to False.
rand_mag (float, optional) – Controls the magnitude of the random noise added by fuzzy_prompt. Defaults to 0.05.
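
The effect of `fuzzy_prompt` and `rand_mag` can be pictured with a small sketch: noisy copies of a prompt embedding are added next to the original target. Gaussian noise with standard deviation `rand_mag`, the number of copies, and the helper name `fuzzy_copies` are assumptions for illustration; plain lists stand in for embedding tensors.

```python
import random
from typing import List


def fuzzy_copies(embedding: List[float], copies: int, rand_mag: float,
                 rng: random.Random) -> List[List[float]]:
    """Return noisy variants of `embedding` with noise std `rand_mag`."""
    return [[value + rng.gauss(0.0, rand_mag) for value in embedding]
            for _ in range(copies)]


rng = random.Random(0)
target = [0.2, -0.1, 0.7]
# The original target is kept and the noisy variants are appended to it.
targets = [target] + fuzzy_copies(target, copies=3, rand_mag=0.05, rng=rng)
```

A larger `rand_mag` spreads the extra targets further from the original embedding, which is consistent with the documented increase in output variability.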

cond_fn(model, diffusion_scheduler, x, t, beta_prod_t, model_stats, secondary_model=None, init_image=None, clamp_grad=True, clamp_max=0.05, clip_guidance_scale=5000, init_scale=1000, tv_scale=0.0, sat_scale=0.0, range_scale=150, cut_overview=[12] * 400 + [4] * 600, cut_innercut=[4] * 400 + [12] * 600, cut_ic_pow=[1] * 1000, cut_icgray_p=[0.2] * 400 + [0] * 600, cutn_batches=4)[source]¶

Clip guidance function.

Parameters

model (nn.Module) – _description_
diffusion_scheduler (object) – _description_
x (torch.Tensor) – _description_
t (int) – _description_
beta_prod_t (torch.Tensor) – _description_
model_stats (List[torch.Tensor]) – _description_
secondary_model (nn.Module) – A smaller secondary diffusion model trained by Katherine Crowson to remove noise from intermediate timesteps to prepare them for CLIP (ref: https://twitter.com/rivershavewings/status/1462859669454536711). Defaults to None.
init_image (torch.Tensor) – Initial image for denoising. Defaults to None.
clamp_grad (bool, optional) – Whether to clamp the gradient. Defaults to True.
clamp_max (float, optional) – Maximum value used when clamping. Defaults to 0.05.
clip_guidance_scale (int, optional) – The scale of influence of clip guidance on image generation. Defaults to 5000.
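
The default cut schedules in the signature are plain Python lists with 1000 entries each. The sketch below rebuilds them and reads off the values at both ends. Treating the list index as a position in the diffusion run is an assumption, as is the helper name `cuts_at`; how `cond_fn` maps `t` to an index is not documented here.

```python
# Defaults copied from the cond_fn signature.
cut_overview = [12] * 400 + [4] * 600
cut_innercut = [4] * 400 + [12] * 600
cut_icgray_p = [0.2] * 400 + [0] * 600


def cuts_at(step: int):
    """Return (overview, inner, gray_fraction) for a schedule index."""
    return cut_overview[step], cut_innercut[step], cut_icgray_p[step]


early, late = cuts_at(0), cuts_at(999)
total_cuts = {o + i for o, i in zip(cut_overview, cut_innercut)}
```

The first 400 entries favor overview cuts with some grayscale inner cuts, the remaining 600 favor inner cuts in full color, and the overview and inner counts always add up to 16.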

abstract forward(x)[source]¶: Forward function.
