Video Interpolation

Summary

  • Number of checkpoints: 7

  • Number of configs: 7

  • Number of papers: 3

    • ALGORITHM: 3

CAIN (AAAI’2020)

Task: Video Interpolation

Abstract

Prevailing video frame interpolation techniques rely heavily on optical flow estimation and require additional model complexity and computational cost; it is also susceptible to error propagation in challenging scenarios with large motion and heavy occlusion. To alleviate the limitation, we propose a simple but effective deep neural network for video frame interpolation, which is end-to-end trainable and is free from a motion estimation network component. Our algorithm employs a special feature reshaping operation, referred to as PixelShuffle, with a channel attention, which replaces the optical flow computation module. The main idea behind the design is to distribute the information in a feature map into multiple channels and extract motion information by attending the channels for pixel-level frame synthesis. The model given by this principle turns out to be effective in the presence of challenging motion and occlusion. We construct a comprehensive evaluation benchmark and demonstrate that the proposed approach achieves outstanding performance compared to the existing models with a component for optical flow computation.
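The idea of distributing the information in a feature map into multiple channels can be illustrated with a space-to-depth rearrangement. The sketch below is a minimal pure-Python illustration, not the CAIN implementation: the function name `space_to_depth` and the nested-list layout `[channel][row][col]` are assumptions made for this example, and the channel ordering follows the convention of PyTorch's `PixelUnshuffle`.

```python
def space_to_depth(feature, r):
    """Rearrange a [C][H*r][W*r] nested list into [C*r*r][H][W].

    Each r x r spatial block is spread over r*r channels, so no values
    are discarded; only their layout changes.
    """
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    if height % r or width % r:
        raise ValueError("height and width must be divisible by r")
    out_h, out_w = height // r, width // r
    out = []
    for c in range(channels):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                out.append([
                    [feature[c][h * r + i][w * r + j] for w in range(out_w)]
                    for h in range(out_h)
                ])
    return out


# A single-channel 4x4 map becomes four 2x2 channels.
feature = [[[0, 1, 2, 3],
            [4, 5, 6, 7],
            [8, 9, 10, 11],
            [12, 13, 14, 15]]]
shuffled = space_to_depth(feature, 2)
print(len(shuffled), len(shuffled[0]), len(shuffled[0][0]))
```

After the rearrangement, every output channel holds one fixed position of each 2x2 block, which is the kind of layout a channel attention module can then weight.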

Results and models

Evaluated on RGB channels. The metrics are PSNR / SSIM. The learning rate is adjusted with a Step LR scheduler with min_lr clipping.

| Model | Dataset | PSNR | SSIM | Training Resources | Download |
| --- | --- | --- | --- | --- | --- |
| cain_b5_g1b32_vimeo90k_triplet | vimeo90k-T | 34.6010 | 0.9578 | 1 (Tesla V100-SXM2-32GB) | model / log |
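A step schedule with `min_lr` clipping, as mentioned for this model, can be sketched as follows. This is a minimal illustration only: the function name `step_lr` and the values used for `base_lr`, `step`, `gamma`, and `min_lr` are hypothetical and are not taken from the CAIN config.

```python
def step_lr(base_lr, epoch, step, gamma, min_lr):
    """Decay base_lr by gamma every `step` epochs, never going below min_lr."""
    decayed = base_lr * gamma ** (epoch // step)
    return max(decayed, min_lr)


# Hypothetical settings, chosen only to show the decay and the clipping.
for epoch in (0, 50, 100, 1000):
    print(epoch, step_lr(base_lr=1e-3, epoch=epoch, step=50, gamma=0.5, min_lr=1e-6))
```

Once the decayed value falls below `min_lr`, the schedule stays flat at `min_lr` instead of shrinking further.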

Quick Start

Train

Train Instructions

Use the following commands to train a model on CPU, a single GPU, or multiple GPUs.

## cpu train
CUDA_VISIBLE_DEVICES=-1 python tools/train.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py

## single-gpu train
python tools/train.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py

## multi-gpu train
./tools/dist_train.sh configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py 8

For more details, refer to the "Train a model" section of train_test.md.
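Config names such as `cain_g1b32_1xb5_vimeo90k-triplet` contain a token of the form `{N}xb{M}`. The sketch below reads that token as N GPUs with M samples per GPU. This reading is an assumption based on common OpenMMLab config naming, and the helper name `batch_layout` is hypothetical.

```python
import re


def batch_layout(config_name):
    """Return (gpus, samples_per_gpu, total_batch) parsed from an NxbM token.

    Assumes the token is delimited by underscores, as in the config names
    used in the commands above.
    """
    match = re.search(r"(?:^|_)(\d+)xb(\d+)(?:_|$)", config_name)
    if match is None:
        raise ValueError(f"no NxbM token found in {config_name!r}")
    gpus, per_gpu = int(match.group(1)), int(match.group(2))
    return gpus, per_gpu, gpus * per_gpu


print(batch_layout("cain_g1b32_1xb5_vimeo90k-triplet"))
print(batch_layout("flavr_in4out1_8xb4_vimeo90k-septuplet"))
```

Under this reading, launching a config with a different number of GPUs than its name indicates changes the effective total batch size, so the learning rate may need to be reconsidered.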

Test

Test Instructions

Use the following commands to test a model on CPU, a single GPU, or multiple GPUs.

## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth

## single-gpu test
python tools/test.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth

## multi-gpu test
./tools/dist_test.sh configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth 8

For more details, refer to the "Test a pre-trained model" section of train_test.md.

Citation

@inproceedings{choi2020channel,
  title={Channel attention is all you need for video frame interpolation},
  author={Choi, Myungsub and Kim, Heewon and Han, Bohyung and Xu, Ning and Lee, Kyoung Mu},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={07},
  pages={10663--10671},
  year={2020}
}

FLAVR (arXiv’2020)

Task: Video Interpolation

Abstract

Most modern frame interpolation approaches rely on explicit bidirectional optical flows between adjacent frames, thus are sensitive to the accuracy of underlying flow estimation in handling occlusions while additionally introducing computational bottlenecks unsuitable for efficient deployment. In this work, we propose a flow-free approach that is completely end-to-end trainable for multi-frame video interpolation. Our method, FLAVR, is designed to reason about non-linear motion trajectories and complex occlusions implicitly from unlabeled videos and greatly simplifies the process of training, testing and deploying frame interpolation models. Furthermore, FLAVR delivers up to 6× speed up compared to the current state-of-the-art methods for multi-frame interpolation while consistently demonstrating superior qualitative and quantitative results compared with prior methods on popular benchmarks including Vimeo-90K, Adobe-240FPS, and GoPro. Finally, we show that frame interpolation is a competitive self-supervised pre-training task for videos via demonstrating various novel applications of FLAVR including action recognition, optical flow estimation, motion magnification, and video object tracking. Code and trained models are provided in the supplementary material.

Results and models

Evaluated on RGB channels. The metrics are PSNR / SSIM.

| Model | Dataset | Scale | PSNR | SSIM | Training Resources | Download |
| --- | --- | --- | --- | --- | --- | --- |
| flavr_in4out1_g8b4_vimeo90k_septuplet | vimeo90k-T | x2 | 36.3340 | 0.96015 | 8 (Tesla PG503-216) | model / log |

Note: FLAVR for the x8 VFI task will be supported in the future.
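The `in4out1` part of the model name indicates four input frames and one interpolated output frame. The sketch below shows one way to split a seven-frame clip for x2 interpolation. It assumes the four odd-numbered frames are the inputs and the middle frame is the target; this index choice, the function name `split_septuplet`, and the `im1.png` to `im7.png` file names are illustrative assumptions, not a description of the toolbox's data pipeline.

```python
def split_septuplet(frames):
    """Split a 7-frame clip into 4 input frames and 1 target frame.

    Assumed layout: frames 1, 3, 5, 7 are inputs and frame 4 (the temporal
    midpoint between frames 3 and 5) is the frame to be interpolated.
    """
    if len(frames) != 7:
        raise ValueError("expected exactly 7 frames")
    inputs = [frames[i] for i in (0, 2, 4, 6)]
    target = frames[3]
    return inputs, target


names = [f"im{i}.png" for i in range(1, 8)]
inputs, target = split_septuplet(names)
print(inputs, target)
```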

Quick Start

Train

Train Instructions

Use the following commands to train a model on CPU, a single GPU, or multiple GPUs.

## cpu train
CUDA_VISIBLE_DEVICES=-1 python tools/train.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py

## single-gpu train
python tools/train.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py

## multi-gpu train
./tools/dist_train.sh configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py 8

For more details, refer to the "Train a model" section of train_test.md.

Test

Test Instructions

Use the following commands to test a model on CPU, a single GPU, or multiple GPUs.

## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet_20220509-c2468995.pth

## single-gpu test
python tools/test.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet_20220509-c2468995.pth

## multi-gpu test
./tools/dist_test.sh configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet_20220509-c2468995.pth 8

For more details, refer to the "Test a pre-trained model" section of train_test.md.

Citation

@article{kalluri2020flavr,
  title={Flavr: Flow-agnostic video representations for fast frame interpolation},
  author={Kalluri, Tarun and Pathak, Deepak and Chandraker, Manmohan and Tran, Du},
  journal={arXiv preprint arXiv:2012.08512},
  year={2020}
}

TOFlow (IJCV’2019)

Task: Video Interpolation, Video Super-Resolution

Abstract

Many video enhancement algorithms rely on optical flow to register frames in a video sequence. Precise flow estimation is however intractable; and optical flow itself is often a sub-optimal representation for particular video processing tasks. In this paper, we propose task-oriented flow (TOFlow), a motion representation learned in a self-supervised, task-specific manner. We design a neural network with a trainable motion estimation component and a video processing component, and train them jointly to learn the task-oriented flow. For evaluation, we build Vimeo-90K, a large-scale, high-quality video dataset for low-level video processing. TOFlow outperforms traditional optical flow on standard benchmarks as well as our Vimeo-90K dataset in three video processing tasks: frame interpolation, video denoising/deblocking, and video super-resolution.
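Registering frames with optical flow usually means warping one frame toward another using a per-pixel displacement field. The sketch below is a minimal pure-Python backward warp with bilinear sampling for a single-channel image. It only illustrates the registration step for a given flow field; it does not learn the flow as TOFlow does. The function name `backward_warp`, the `(dx, dy)` flow layout, and the border clamping are assumptions chosen for simplicity.

```python
import math


def backward_warp(image, flow):
    """Warp a [H][W] image with a [H][W] field of (dx, dy) displacements.

    Each output pixel (x, y) is sampled from the source location
    (x + dx, y + dy) with bilinear interpolation; locations outside the
    image are clamped to the border.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            sx = min(max(x + dx, 0.0), width - 1.0)
            sy = min(max(y + dy, 0.0), height - 1.0)
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
            wx, wy = sx - x0, sy - y0
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            out[y][x] = top * (1 - wy) + bottom * wy
    return out


image = [[0, 1, 2],
         [3, 4, 5]]
shift_right = [[(1.0, 0.0)] * 3 for _ in range(2)]  # sample one pixel to the right
print(backward_warp(image, shift_right))
```

In a task-oriented setup, the flow field fed to such a warp would come from a trainable motion estimation network, and the warped frames would be passed on to the video processing network.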

Results and models

Evaluated on Vimeo90k-triplet (RGB channels). The metrics are PSNR / SSIM.

| Model | Dataset | Task | Pretrained SPyNet | PSNR | Training Resources | Download |
| --- | --- | --- | --- | --- | --- | --- |
| tof_vfi_spynet_chair_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3294 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_kitti_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3339 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_clean_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3170 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_final_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3237 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_pytoflow_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3426 | 1 (Tesla PG503-216) | model / log |

| Model | Dataset | Task | Pretrained SPyNet | SSIM | Training Resources | Download |
| --- | --- | --- | --- | --- | --- | --- |
| tof_vfi_spynet_chair_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9465 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_kitti_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9466 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_clean_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9464 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_sintel_final_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9465 | 1 (Tesla PG503-216) | model / log |
| tof_vfi_spynet_pytoflow_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9467 | 1 (Tesla PG503-216) | model / log |

Note: These pretrained SPyNets do not contain BN layers because batch_size=1, which is consistent with https://github.com/Coldog2333/pytoflow.

Evaluated on RGB channels. The metrics are PSNR / SSIM.

| Model | Dataset | Task | Vid4 (PSNR / SSIM) | Training Resources | Download |
| --- | --- | --- | --- | --- | --- |
| tof_x4_vimeo90k_official | vimeo90k | Video Super-Resolution | 24.4377 / 0.7433 | - | model |
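For reference, PSNR can be computed from the mean squared error between an output frame and its ground truth. The sketch below assumes 8-bit pixel values passed as flat lists of equal length. The function name `psnr` is hypothetical, and the toolbox's evaluation code may differ in details such as border cropping or color conversion.

```python
import math


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by exactly 1, so the MSE is 1.
print(round(psnr([10, 20, 30], [11, 21, 31]), 4))
```

Higher values indicate a closer match to the ground truth, so a model with a higher PSNR in the tables reconstructs frames with lower average squared error.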

Quick Start

Train

Train Instructions

Use the following commands to train a model on CPU, a single GPU, or multiple GPUs.

TOF currently supports training only for the video interpolation task.

## cpu train
CUDA_VISIBLE_DEVICES=-1 python tools/train.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py

## single-gpu train
python tools/train.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py

## multi-gpu train
./tools/dist_train.sh configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py 8

For more details, refer to the "Train a model" section of train_test.md.

Test

Test Instructions

Use the following commands to test a model on CPU, a single GPU, or multiple GPUs.

TOF supports two tasks for testing.

Task 1: Video Interpolation

## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth

## single-gpu test
python tools/test.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth

## multi-gpu test
./tools/dist_test.sh configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth 8

Task 2: Video Super-Resolution

## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth

## single-gpu test
python tools/test.py configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth

## multi-gpu test
./tools/dist_test.sh configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth 8

For more details, refer to the "Test a pre-trained model" section of train_test.md.

Citation

@article{xue2019video,
  title={Video enhancement with task-oriented flow},
  author={Xue, Tianfan and Chen, Baian and Wu, Jiajun and Wei, Donglai and Freeman, William T},
  journal={International Journal of Computer Vision},
  volume={127},
  number={8},
  pages={1106--1125},
  year={2019},
  publisher={Springer}
}