
Video Frame Interpolation Transformer

2021-11-27 · CVPR 2022 · Code Available

Zhihao Shi, Xiangyu Xu, Xiaohong Liu, Jun Chen, Ming-Hsuan Yang




Abstract

Existing methods for video interpolation rely heavily on deep convolutional neural networks and thus suffer from their intrinsic limitations, such as content-agnostic kernel weights and a restricted receptive field. To address these issues, we propose a Transformer-based video interpolation framework that allows content-aware aggregation weights and captures long-range dependencies with self-attention operations. To avoid the high computational cost of global self-attention, we introduce the concept of local attention into video interpolation and extend it to the spatio-temporal domain. Furthermore, we propose a space-time separation strategy that saves memory and also improves performance. In addition, we develop a multi-scale frame synthesis scheme to fully realize the potential of Transformers. Extensive experiments demonstrate that the proposed model performs favorably against state-of-the-art methods, both quantitatively and qualitatively, on a variety of benchmark datasets.
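
To make the space-time separation idea concrete, below is a minimal PyTorch sketch of local attention factorized into a spatial pass (tokens attend within small windows of each frame) followed by a temporal pass (each pixel attends across frames). The module name, shapes, and hyperparameters are illustrative assumptions, not the authors' implementation; the paper's official code should be consulted for the actual architecture.

```python
import torch
import torch.nn as nn


class SeparatedLocalAttention(nn.Module):
    """Sketch of space-time separated local attention (assumed design,
    not the authors' API): spatial window attention, then temporal
    attention across frames at each pixel location."""

    def __init__(self, dim, window_size=4, heads=4):
        super().__init__()
        self.window = window_size
        self.spatial_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.temporal_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        # x: (B, T, H, W, C) feature volume; H and W are assumed
        # divisible by the window size for simplicity.
        B, T, H, W, C = x.shape
        w = self.window

        # Spatial local attention: tokens attend within w x w windows.
        s = x.reshape(B * T, H // w, w, W // w, w, C)
        s = s.permute(0, 1, 3, 2, 4, 5).reshape(-1, w * w, C)
        s, _ = self.spatial_attn(s, s, s)
        s = s.reshape(B * T, H // w, W // w, w, w, C)
        s = s.permute(0, 1, 3, 2, 4, 5).reshape(B, T, H, W, C)

        # Temporal attention: each pixel attends across the T frames.
        t = s.permute(0, 2, 3, 1, 4).reshape(B * H * W, T, C)
        t, _ = self.temporal_attn(t, t, t)
        return t.reshape(B, H, W, T, C).permute(0, 3, 1, 2, 4)


if __name__ == "__main__":
    # Two input frames (T=2) with 64x64 feature maps and 32 channels.
    attn = SeparatedLocalAttention(dim=32, window_size=4, heads=4)
    x = torch.randn(1, 2, 64, 64, 32)
    print(attn(x).shape)  # torch.Size([1, 2, 64, 64, 32])
```

Factorizing attention this way replaces one joint pass over T·w·w tokens with two cheaper passes over w·w and T tokens, which is the memory saving the abstract refers to.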
