Video Interpolation¶
Summary¶
Number of checkpoints: 7
Number of configs: 7
Number of papers: 3
ALGORITHM: 3
FLAVR (arXiv’2020)¶
Task: Video Interpolation
Abstract¶
Most modern frame interpolation approaches rely on explicit bidirectional optical flows between adjacent frames, thus are sensitive to the accuracy of underlying flow estimation in handling occlusions while additionally introducing computational bottlenecks unsuitable for efficient deployment. In this work, we propose a flow-free approach that is completely end-to-end trainable for multi-frame video interpolation. Our method, FLAVR, is designed to reason about non-linear motion trajectories and complex occlusions implicitly from unlabeled videos and greatly simplifies the process of training, testing and deploying frame interpolation models. Furthermore, FLAVR delivers up to 6× speed up compared to the current state-of-the-art methods for multi-frame interpolation while consistently demonstrating superior qualitative and quantitative results compared with prior methods on popular benchmarks including Vimeo-90K, Adobe-240FPS, and GoPro. Finally, we show that frame interpolation is a competitive self-supervised pre-training task for videos via demonstrating various novel applications of FLAVR including action recognition, optical flow estimation, motion magnification, and video object tracking. Code and trained models are provided in the supplementary material.

Results and models¶
Evaluated on RGB channels.
The metrics are PSNR / SSIM.
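For readers unfamiliar with the first metric, the following is a minimal sketch of PSNR computed over all RGB channels. It assumes 8-bit images (peak value 255) stored as NumPy arrays; the function name `psnr` and the `max_value` parameter are illustrative, and the toolbox's own metric may differ in details such as border cropping or data range.

```python
import numpy as np


def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB, averaged over every pixel and channel.

    Assumption: both images share the same shape and value range [0, max_value].
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        # Identical images: the ratio is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values mean the interpolated frame is closer to the ground-truth frame; SSIM complements this by comparing local structure rather than per-pixel error.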
Model | Dataset | scale | PSNR | SSIM | Training Resources | Download |
---|---|---|---|---|---|---|
flavr_in4out1_g8b4_vimeo90k_septuplet | vimeo90k-T | x2 | 36.3340 | 0.96015 | 8 (Tesla PG503-216) | model / log |
Note: FLAVR for the x8 VFI task will be supported in the future.
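To make the scale column concrete, here is a small sketch that assumes "x2" and "x8" denote the frame-rate multiplication factor, so that an xN task synthesizes N-1 frames between two neighboring input frames. The helper name `intermediate_times` is hypothetical and is not part of the toolbox.

```python
from fractions import Fraction


def intermediate_times(factor):
    """Timestamps in (0, 1) of the frames synthesized between inputs at t=0 and t=1.

    Assumption: `factor` is the frame-rate multiplier (2 for x2, 8 for x8).
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    return [Fraction(k, factor) for k in range(1, factor)]
```

Under this reading, x2 produces a single middle frame at t = 1/2, while x8 produces seven frames at t = 1/8 through 7/8.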
Quick Start¶
Train
Train Instructions
You can use the following commands to train a model on CPU or on single/multiple GPUs.
## cpu train
CUDA_VISIBLE_DEVICES=-1 python tools/train.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py
## single-gpu train
python tools/train.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py
## multi-gpu train
./tools/dist_train.sh configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py 8
For more details, refer to the "Train a model" section of train_test.md.
Test
Test Instructions
You can use the following commands to test a model on CPU or on single/multiple GPUs.
## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet_20220509-c2468995.pth
## single-gpu test
python tools/test.py configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet_20220509-c2468995.pth
## multi-gpu test
./tools/dist_test.sh configs/flavr/flavr_in4out1_8xb4_vimeo90k-septuplet.py https://download.openmmlab.com/mmediting/video_interpolators/flavr/flavr_in4out1_g8b4_vimeo90k_septuplet_20220509-c2468995.pth 8
For more details, refer to the "Test a pre-trained model" section of train_test.md.
Citation¶
@article{kalluri2020flavr,
title={Flavr: Flow-agnostic video representations for fast frame interpolation},
author={Kalluri, Tarun and Pathak, Deepak and Chandraker, Manmohan and Tran, Du},
journal={arXiv preprint arXiv:2012.08512},
year={2020}
}
CAIN (AAAI’2020)¶
Task: Video Interpolation
Abstract¶
Prevailing video frame interpolation techniques rely heavily on optical flow estimation and require additional model complexity and computational cost; it is also susceptible to error propagation in challenging scenarios with large motion and heavy occlusion. To alleviate the limitation, we propose a simple but effective deep neural network for video frame interpolation, which is end-to-end trainable and is free from a motion estimation network component. Our algorithm employs a special feature reshaping operation, referred to as PixelShuffle, with a channel attention, which replaces the optical flow computation module. The main idea behind the design is to distribute the information in a feature map into multiple channels and extract motion information by attending the channels for pixel-level frame synthesis. The model given by this principle turns out to be effective in the presence of challenging motion and occlusion. We construct a comprehensive evaluation benchmark and demonstrate that the proposed approach achieves outstanding performance compared to the existing models with a component for optical flow computation.
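The two ingredients named in the abstract can be illustrated with a short NumPy sketch. It is not the CAIN implementation: the space-to-depth rearrangement is the downscaling form of PixelShuffle, the gate follows the common squeeze-and-excitation pattern, and the function names and weight matrices `w_down` and `w_up` are assumptions made for illustration.

```python
import numpy as np


def space_to_depth(x, r):
    """Rearrange a (C, H, W) map into (C*r*r, H/r, W/r) without discarding values."""
    c, h, w = x.shape
    if h % r or w % r:
        raise ValueError("H and W must be divisible by r")
    x = x.reshape(c, h // r, r, w // r, r)
    x = x.transpose(0, 2, 4, 1, 3)  # -> (C, r, r, H/r, W/r)
    return x.reshape(c * r * r, h // r, w // r)


def channel_attention(x, w_down, w_up):
    """Scale each channel by a gate in (0, 1) computed from its global average."""
    pooled = x.mean(axis=(1, 2))                     # (C,)
    hidden = np.maximum(w_down @ pooled, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w_up @ hidden)))    # sigmoid, (C,)
    return x * gate[:, None, None]
```

After `space_to_depth`, neighboring pixels sit in different channels at the same spatial location, so a per-channel gate can emphasize or suppress spatial offsets, which is the sense in which attention over channels can stand in for an explicit flow module.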

Results and models¶
Evaluated on RGB channels.
The metrics are PSNR / SSIM.
The learning rate adjustment strategy is a Step LR scheduler with min_lr clipping.
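A step schedule with a lower bound can be sketched as follows. The function name `step_lr` and every default value (`gamma`, `min_lr`) are hypothetical and chosen only for illustration; the actual values live in the CAIN config file.

```python
def step_lr(base_lr, epoch, step_size, gamma=0.5, min_lr=1e-6):
    """Learning rate at `epoch` for a step schedule clipped from below.

    The rate is multiplied by `gamma` once every `step_size` epochs and is
    never allowed to fall under `min_lr`.
    """
    decayed = base_lr * gamma ** (epoch // step_size)
    return max(decayed, min_lr)
```

Without the clipping, the rate would keep shrinking geometrically and late epochs would make almost no progress; the floor keeps updates at a usable size.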
Model | Dataset | PSNR | SSIM | Training Resources | Download |
---|---|---|---|---|---|
cain_b5_g1b32_vimeo90k_triplet | vimeo90k-T | 34.6010 | 0.9578 | 1 (Tesla V100-SXM2-32GB) | model/log |
Quick Start¶
Train
Train Instructions
You can use the following commands to train a model on CPU or on single/multiple GPUs.
## cpu train
CUDA_VISIBLE_DEVICES=-1 python tools/train.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py
## single-gpu train
python tools/train.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py
## multi-gpu train
./tools/dist_train.sh configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py 8
For more details, refer to the "Train a model" section of train_test.md.
Test
Test Instructions
You can use the following commands to test a model on CPU or on single/multiple GPUs.
## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth
## single-gpu test
python tools/test.py configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth
## multi-gpu test
./tools/dist_test.sh configs/cain/cain_g1b32_1xb5_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/cain/cain_b5_g1b32_vimeo90k_triplet_20220530-3520b00c.pth 8
For more details, refer to the "Test a pre-trained model" section of train_test.md.
Citation¶
@inproceedings{choi2020channel,
title={Channel attention is all you need for video frame interpolation},
author={Choi, Myungsub and Kim, Heewon and Han, Bohyung and Xu, Ning and Lee, Kyoung Mu},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={34},
number={07},
pages={10663--10671},
year={2020}
}
TOFlow (IJCV’2019)¶
Task: Video Interpolation, Video Super-Resolution
Abstract¶
Many video enhancement algorithms rely on optical flow to register frames in a video sequence. Precise flow estimation is however intractable; and optical flow itself is often a sub-optimal representation for particular video processing tasks. In this paper, we propose task-oriented flow (TOFlow), a motion representation learned in a self-supervised, task-specific manner. We design a neural network with a trainable motion estimation component and a video processing component, and train them jointly to learn the task-oriented flow. For evaluation, we build Vimeo-90K, a large-scale, high-quality video dataset for low-level video processing. TOFlow outperforms traditional optical flow on standard benchmarks as well as our Vimeo-90K dataset in three video processing tasks: frame interpolation, video denoising/deblocking, and video super-resolution.

Results and models¶
Evaluated on Vimeo90k-triplet (RGB channels).
The metrics are PSNR / SSIM.
Model | Dataset | Task | Pretrained SPyNet | PSNR | Training Resources | Download |
---|---|---|---|---|---|---|
tof_vfi_spynet_chair_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3294 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_kitti_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3339 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_sintel_clean_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3170 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_sintel_final_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3237 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_pytoflow_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Interpolation | spynet_chairs_final | 33.3426 | 1 (Tesla PG503-216) | model / log |
Model | Dataset | Task | Pretrained SPyNet | SSIM | Training Resources | Download |
---|---|---|---|---|---|---|
tof_vfi_spynet_chair_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9465 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_kitti_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9466 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_sintel_clean_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9464 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_sintel_final_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9465 | 1 (Tesla PG503-216) | model / log |
tof_vfi_spynet_pytoflow_nobn_1xb1_vimeo90k | Vimeo90k-triplet | Video Super-Resolution | spynet_chairs_final | 0.9467 | 1 (Tesla PG503-216) | model / log |
Note: These pretrained SPyNets do not contain BN layers since batch_size=1, which is consistent with https://github.com/Coldog2333/pytoflow.
Evaluated on RGB channels.
The metrics are PSNR / SSIM.
Model | Dataset | Task | Vid4 | Training Resources | Download |
---|---|---|---|---|---|
tof_x4_vimeo90k_official | vimeo90k | Video Super-Resolution | 24.4377 / 0.7433 | - | model |
Quick Start¶
Train
Train Instructions
You can use the following commands to train a model on CPU or on single/multiple GPUs.
TOF currently supports training only for the video interpolation task.
## cpu train
CUDA_VISIBLE_DEVICES=-1 python tools/train.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py
## single-gpu train
python tools/train.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py
## multi-gpu train
./tools/dist_train.sh configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py 8
For more details, refer to the "Train a model" section of train_test.md.
Test
Test Instructions
You can use the following commands to test a model on CPU or on single/multiple GPUs.
TOF supports two tasks for testing.
Task 1: Video Interpolation
## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth
## single-gpu test
python tools/test.py configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth
## multi-gpu test
./tools/dist_test.sh configs/tof/tof_spynet-chair-wobn_1xb1_vimeo90k-triplet.py https://download.openmmlab.com/mmediting/video_interpolators/toflow/pretrained_spynet_chair_20220321-4d82e91b.pth 8
Task 2: Video Super-Resolution
## cpu test
CUDA_VISIBLE_DEVICES=-1 python tools/test.py configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth
## single-gpu test
python tools/test.py configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth
## multi-gpu test
./tools/dist_test.sh configs/tof/tof_x4_official_vimeo90k.py https://download.openmmlab.com/mmediting/restorers/tof/tof_x4_vimeo90k_official-a569ff50.pth 8
For more details, refer to the "Test a pre-trained model" section of train_test.md.
Citation¶
@article{xue2019video,
title={Video enhancement with task-oriented flow},
author={Xue, Tianfan and Chen, Baian and Wu, Jiajun and Wei, Donglai and Freeman, William T},
journal={International Journal of Computer Vision},
volume={127},
number={8},
pages={1106--1125},
year={2019},
publisher={Springer}
}