DeepFuse: An IMU-Aware Network for Real-Time 3D Human Pose Estimation from Multi-View Image

2019-12-09

Fuyang Huang, Ailing Zeng, Minhao Liu, Qiuxia Lai, Qiang Xu


Abstract

In this paper, we propose DeepFuse, a two-stage, fully 3D network that estimates human pose in 3D space by deeply fusing body-worn Inertial Measurement Unit (IMU) data with multi-view images. The first stage performs vision-only estimation. To keep the multi-view inputs close to their raw form, this stage uses a multi-channel volume as its data representation and 3D soft-argmax as its activation layer. The second stage is IMU refinement, which introduces an IMU-bone layer to fuse IMU and vision data early, at the data level. Without requiring a skeleton model a priori, we achieve a mean joint error of 28.9 mm on the TotalCapture dataset and 13.4 mm on the Human3.6M dataset under Protocol 1, improving on the state-of-the-art results by a large margin. Finally, we experimentally examine the effectiveness of a fully 3D network for 3D pose estimation, which may benefit future research.
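
The abstract names 3D soft-argmax as the activation layer of the vision stage but does not spell it out. The following is a minimal NumPy sketch of the general technique, not the authors' implementation: it applies a softmax over all voxels of each joint's score volume and returns the expected voxel coordinate. The function name `soft_argmax_3d`, the `(J, D, H, W)` layout, and the `beta` sharpness factor are assumptions made for illustration.

```python
import numpy as np


def soft_argmax_3d(volume, beta=1.0):
    """Expected (x, y, z) voxel coordinate per joint.

    volume: array of shape (J, D, H, W) with unnormalized scores
            (layout is an assumption, not taken from the paper).
    beta:   hypothetical sharpness factor; larger values approach hard argmax.
    """
    j, d, h, w = volume.shape
    flat = volume.reshape(j, -1) * beta
    # Subtract the per-joint maximum for numerical stability.
    flat = flat - flat.max(axis=1, keepdims=True)
    prob = np.exp(flat)
    prob /= prob.sum(axis=1, keepdims=True)
    prob = prob.reshape(j, d, h, w)

    # Marginalize over the other two axes, then take the expectation.
    z = (prob.sum(axis=(2, 3)) * np.arange(d)).sum(axis=1)
    y = (prob.sum(axis=(1, 3)) * np.arange(h)).sum(axis=1)
    x = (prob.sum(axis=(1, 2)) * np.arange(w)).sum(axis=1)
    return np.stack([x, y, z], axis=1)  # shape (J, 3)


# One joint, a 4x4x4 volume with a single strong peak at z=1, y=2, x=3.
scores = np.zeros((1, 4, 4, 4))
scores[0, 1, 2, 3] = 50.0
peak_xyz = soft_argmax_3d(scores)

# A flat volume has no preferred voxel, so the result is the volume center.
center_xyz = soft_argmax_3d(np.zeros((1, 4, 4, 4)))
```

Unlike a hard argmax, this operation is differentiable and yields sub-voxel coordinates, which is the usual reason it is used as an output layer in volumetric pose networks.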

Tasks

3D Human Pose Estimation, 3D Pose Estimation, Pose Estimation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Total Capture	DeepFuse-IMU	Average MPJPE (mm)	28.9	—	Unverified
Total Capture	DeepFuse-Vision Only	Average MPJPE (mm)	32.7	—	Unverified
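
The metric in the table, MPJPE (mean per-joint position error), is commonly computed as the average Euclidean distance between predicted and ground-truth joint positions. The sketch below shows that basic definition only; any alignment or joint-selection steps used in the paper's evaluation protocol are not covered, and the function name `mpjpe` and the `(frames, joints, 3)` layout are assumptions.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean Euclidean distance between matching joints.

    pred, gt: arrays of shape (frames, joints, 3) in the same unit
              (e.g. millimetres); the result is in that unit.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError("pred and gt must have the same shape")
    # Per-joint distance, then the mean over all frames and joints.
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


# Every joint is offset by (3, 4, 0), so each per-joint error is 5.
gt_demo = np.zeros((2, 3, 3))
pred_demo = gt_demo + np.array([3.0, 4.0, 0.0])
error_demo = mpjpe(pred_demo, gt_demo)
```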
