
controlnet_animation

Summary

  • Number of checkpoints: 1

  • Number of configs: 1

  • Number of papers: 1

    • ALGORITHM: 1

Controlnet Animation (2023)

Controlnet Application

Task: controlnet_animation

Abstract

It is difficult to maintain consistency and avoid frame flickering when using Stable Diffusion to generate a video frame by frame. Here we reproduce two methods that effectively avoid video flickering:

ControlNet with multi-frame rendering. ControlNet is a neural network structure that controls diffusion models by adding extra conditions. Multi-frame rendering is a community method to reduce flickering. We use ControlNet with the HED condition and Stable Diffusion img2img for multi-frame rendering.

ControlNet with attention injection. Attention injection is widely used to generate the current frame conditioned on a reference image. There is an implementation in sd-webui-controlnet, and we reuse some of its code to create the animation in this repo.

You may need about 40 GB of GPU memory to run ControlNet with multi-frame rendering and about 10 GB for ControlNet with attention injection. If the config file is left unchanged, attention injection is used by default.
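
To make the idea above concrete, here is a minimal, simplified sketch of frame-by-frame ControlNet img2img with an HED condition, where each generated frame is reused as the initialization for the next one so that consecutive frames stay consistent. It is written against the diffusers and controlnet_aux libraries purely for illustration; the model ids, the input path, and the loop itself are assumptions and do not reflect this repo's actual implementation.

# Illustrative sketch only: frame-by-frame ControlNet (HED) img2img, reusing the
# previous output as the next frame's init image to reduce flicker.
# The model ids and the input path are assumptions, not taken from this repo.
import imageio.v3 as iio
import torch
from PIL import Image
from controlnet_aux import HEDdetector
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

hed = HEDdetector.from_pretrained('lllyasviel/Annotators')
controlnet = ControlNetModel.from_pretrained(
    'lllyasviel/sd-controlnet-hed', torch_dtype=torch.float16)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    'runwayml/stable-diffusion-v1-5', controlnet=controlnet,
    torch_dtype=torch.float16).to('cuda')

prompt = 'a handsome man, silver hair, smiling, play basketball'
frames = [Image.fromarray(f) for f in iio.imiter('/path/to/your/input/video.mp4')]

results, prev = [], None
for frame in frames:
    control = hed(frame)                    # HED edge map used as the ControlNet condition
    init = frame if prev is None else prev  # reuse the last result to keep frames consistent
    out = pipe(prompt, image=init, control_image=control, strength=0.75).images[0]
    results.append(out)
    prev = out

The actual multi-frame rendering pipeline in this repo is more involved than this; the sketch only illustrates why conditioning each frame on previously generated content reduces flicker.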

Demos

Demo videos are generated with the following prompt keywords; changing the prompt gives different results:

  • a handsome man, silver hair, smiling, play basketball

  • a handsome man

  • a girl, black hair, white pants, smiling, play basketball

Pretrained models

We use pretrained models from Hugging Face.

Model               Dataset   Download
anythingv3 config   -         stable diffusion model

Quick Start

There are two ways to try controlnet animation.

1. Use the MMagic inference API.

Run the following code to get a generated animation video.

from mmagic.apis import MMagicInferencer

## Create an MMagicInferencer instance and infer
editor = MMagicInferencer(model_name='controlnet_animation')

prompt = 'a girl, black hair, T-shirt, smoking, best quality, extremely detailed'
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, ' + \
                  'extra digit, fewer digits, cropped, worst quality, low quality'

## you can download the example video with this link
## https://user-images.githubusercontent.com/12782558/227418400-80ad9123-7f8e-4c1a-8e19-0892ebad2a4f.mp4
video = '/path/to/your/input/video.mp4'
save_path = '/path/to/your/output/video.mp4'

## Do the inference to get the result
editor.infer(video=video, prompt=prompt, negative_prompt=negative_prompt, save_path=save_path)

2. Use the controlnet animation Gradio demo.

python demo/gradio_controlnet_animation.py

3. Change the config to use multi-frame rendering or attention injection.

Change 'inference_method' in the anythingv3 config.

To use multi-frame rendering:

inference_method = 'multi-frame rendering'

To use attention injection:

inference_method = 'attention_injection'
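
If you prefer to switch the method from a script instead of editing the file by hand, below is a minimal sketch using mmengine's Config; the config path is an assumption, so point it at the anythingv3 config file in your MMagic checkout.

# A minimal sketch, assuming the anythingv3 config lives at the (hypothetical)
# path below; adjust it to your MMagic checkout.
from mmengine import Config

cfg_path = 'configs/controlnet_animation/anythingv3_config.py'  # hypothetical path
cfg = Config.fromfile(cfg_path)
cfg.inference_method = 'multi-frame rendering'  # or 'attention_injection'
cfg.dump(cfg_path)  # write the change back to the config file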

Play animation with SAM

We also provide a demo to play controlnet animation with SAM; for details, please see OpenMMLab PlayGround.

Citation

@misc{zhang2023adding,
  title={Adding Conditional Control to Text-to-Image Diffusion Models},
  author={Lvmin Zhang and Maneesh Agrawala},
  year={2023},
  eprint={2302.05543},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}