SOTAVerified

PredCNN: Predictive Learning with Cascade Convolutions

2018-07-01 · Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18)

Ziru Xu, Yunbo Wang, Mingsheng Long, Jian-Min Wang

Code Available


Abstract

Predicting future frames in videos remains a challenging, unsolved problem. Mainstream recurrent models suffer from high memory usage and computation cost, while convolutional models are unable to effectively capture the temporal dependencies between consecutive video frames. To tackle this problem, we introduce an entirely CNN-based architecture, PredCNN, that models the dependencies between the next frame and the sequential video inputs. Inspired by the core idea of recurrent models that previous states undergo more transition operations than future states, we design a cascade multiplicative unit (CMU) that applies relatively more operations to earlier video frames. This newly proposed unit enables PredCNN to predict future spatiotemporal data without any recurrent chain structures, which eases gradient propagation and enables fully parallelized optimization. We show that PredCNN outperforms state-of-the-art recurrent models for video prediction on the standard Moving MNIST dataset and two challenging crowd flow prediction datasets, while achieving a faster training speed and a lower memory footprint.
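The idea described in the abstract — giving earlier frames more transition operations than later ones, without any recurrent chain — can be sketched as follows. This is a minimal, hedged NumPy illustration, not the paper's implementation: it replaces the convolutions with 1x1 channel-mixing matrices for brevity, and the exact gating and weight-sharing layout of the real CMU may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mu(h, W):
    # Multiplicative unit: three sigmoid gates and one tanh update.
    # W is a stack of four (C, C) matrices standing in for 1x1 convolutions
    # (a simplification; the paper uses spatial convolutions).
    g1 = sigmoid(h @ W[0])
    g2 = sigmoid(h @ W[1])
    g3 = sigmoid(h @ W[2])
    u = np.tanh(h @ W[3])
    return g1 * np.tanh(g2 * h + g3 * u)

def cmu(f_prev, f_next, Wa, Wb, Wc, Wo):
    # Cascade multiplicative unit (sketch): the earlier feature map passes
    # through two stacked MUs, the later one through a single MU, so the
    # previous state receives more transition operations than the next one.
    h1 = mu(mu(f_prev, Wa), Wb)
    h2 = mu(f_next, Wc)
    h = h1 + h2
    o = sigmoid(h @ Wo)  # output gate on the merged features
    return o * np.tanh(h)

C = 8                                      # feature channels (illustrative)
make_w = lambda: rng.standard_normal((4, C, C)) * 0.1
f1 = rng.standard_normal((16, 16, C))      # features of the earlier frame
f2 = rng.standard_normal((16, 16, C))      # features of the later frame
out = cmu(f1, f2, make_w(), make_w(), make_w(),
          rng.standard_normal((C, C)) * 0.1)
print(out.shape)  # same spatial and channel shape as the inputs
```

Because every CMU depends only on its two inputs, a stack of such units over a frame sequence has no recurrent chain: all units at one level can be computed in parallel, which is the property the abstract credits for easier gradient propagation and faster training.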
