Predicting Camera Viewpoint Improves Cross-dataset Generalization for 3D Human Pose Estimation

2020-04-07

Zhe Wang, Daeyun Shin, Charless C. Fowlkes


Abstract

Monocular estimation of 3D human pose has attracted increased attention with the availability of large ground-truth motion capture datasets. However, the diversity of available training data is limited, and it is not clear to what extent methods generalize outside the specific datasets they are trained on. In this work, we carry out a systematic study of the diversity and biases present in specific datasets and their effect on cross-dataset generalization across a compendium of 5 pose datasets. We specifically focus on systematic differences in the distribution of camera viewpoints relative to a body-centered coordinate frame. Based on this observation, we propose an auxiliary task of predicting the camera viewpoint in addition to pose. We find that models trained to jointly predict viewpoint and pose systematically show significantly improved cross-dataset generalization.

Tasks

3D Human Pose Estimation, Diversity, Monocular 3D Human Pose Estimation, Pose Estimation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Geometric Pose Affordance	Cross Dataset Generalization	MPJPE	53.3	—	Unverified
Surreal	Cross Dataset Generalization	MPJPE	37.1	—	Unverified
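
The MPJPE column above refers to mean per-joint position error. As a rough illustration of how such a value is commonly computed, here is a minimal Python sketch. It assumes the usual definition (the mean Euclidean distance between predicted and ground-truth joints); the listing does not state the units, joint set, or any alignment step, so none is applied here. The function name `mpjpe` and the sample coordinates are hypothetical.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error between two poses.

    pred, target: sequences of (x, y, z) joint coordinates, given in the
    same joint order and the same units. Returns the mean Euclidean
    distance over joints. No root or Procrustes alignment is applied
    (an assumption; the listing does not specify a protocol).
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("poses must be non-empty and have the same number of joints")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


# Hypothetical two-joint pose with made-up coordinates, for illustration only.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
target = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error = mpjpe(pred, target)  # per-joint distances are 0 and 5, so the mean is 2.5
```

In a cross-dataset setting, the same function would be applied to predictions from a model trained on one dataset and evaluated on another, then averaged over all test poses.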
