Learning Pose Grammar for Monocular 3D Pose Estimation

2019-06-01 · IEEE Transactions on Pattern Analysis and Machine Intelligence 2019

Yuanlu Xu, Wenguan Wang, Xiaobai Liu, Jianwen Xie, Song-Chun Zhu

Abstract

In this paper, we propose a pose grammar to tackle the problem of 3D human pose estimation from a monocular RGB image. Our model takes an estimated 2D pose as input and learns a generalized 2D-3D mapping function to lift it to a 3D pose. The proposed model consists of a base network, which efficiently captures pose-aligned features, and a hierarchy of Bi-directional RNNs (BRNNs) on top, which explicitly incorporates knowledge of human body configuration (i.e., kinematics, symmetry, motor coordination). The model thus enforces high-level constraints over human poses. In learning, we develop a data augmentation algorithm to further improve the model's robustness against appearance variations and its cross-view generalization ability. We validate our method on public 3D human pose benchmarks and propose a new evaluation protocol for the cross-view setting to verify the generalization capability of different methods. We empirically observe that most state-of-the-art methods encounter difficulty under this setting, while our method handles these challenges well.
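
As a rough illustration of the kind of body-configuration knowledge the abstract mentions, the sketch below measures how far a 3D pose departs from left/right symmetry in bone length. It is not the paper's BRNN formulation, which encodes such constraints inside the network. The joint names, the `BONES` table, and the `symmetry_error` function are all hypothetical and chosen only for this example.

```python
import math

# Hypothetical skeleton fragment: each bone is a (parent joint, child joint) pair.
# The joint set used in the paper may differ.
BONES = {
    "left_upper_arm": ("left_shoulder", "left_elbow"),
    "right_upper_arm": ("right_shoulder", "right_elbow"),
    "left_forearm": ("left_elbow", "left_wrist"),
    "right_forearm": ("right_elbow", "right_wrist"),
}

# Bones expected to have equal length under a symmetry assumption.
SYMMETRIC_PAIRS = [
    ("left_upper_arm", "right_upper_arm"),
    ("left_forearm", "right_forearm"),
]


def bone_length(pose, bone):
    """Euclidean length of one bone in a pose given as {joint: (x, y, z)}."""
    parent, child = BONES[bone]
    return math.dist(pose[parent], pose[child])


def symmetry_error(pose):
    """Mean absolute difference between left and right bone lengths."""
    diffs = [
        abs(bone_length(pose, left) - bone_length(pose, right))
        for left, right in SYMMETRIC_PAIRS
    ]
    return sum(diffs) / len(diffs)


# Toy pose in metres: the right forearm is 0.1 m shorter than the left one.
pose = {
    "left_shoulder": (0.2, 1.5, 0.0),
    "left_elbow": (0.2, 1.2, 0.0),
    "left_wrist": (0.2, 0.9, 0.0),
    "right_shoulder": (-0.2, 1.5, 0.0),
    "right_elbow": (-0.2, 1.2, 0.0),
    "right_wrist": (-0.2, 1.0, 0.0),
}

print(symmetry_error(pose))
```

A score near zero means the predicted limbs are consistent across the two sides of the body. A check like this could serve as a sanity test on estimated poses, independent of how the estimator itself is built.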

Tasks

3D Human Pose Estimation, 3D Pose Estimation, Data Augmentation, Pose Estimation
