6D-VNet: End-to-end 6DoF Vehicle Pose Estimation from Monocular RGB Images

2019-06-15 · The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2019 · Code Available

Di Wu, Zhaoyong Zhuang, Canqun Xiang, Wenbin Zou and Xia Li

Abstract

We present a conceptually simple framework for 6DoF object pose estimation, especially for autonomous driving scenarios. Our approach efficiently detects traffic participants in a monocular RGB image while simultaneously regressing their 3D translation and rotation vectors. The method, called 6D-VNet, extends Mask R-CNN by adding customised heads for predicting the vehicle's finer class, rotation and translation. Unlike previous methods, the proposed 6D-VNet is trained end-to-end. Furthermore, we show that the inclusion of translational regression in the joint losses is crucial for the 6DoF pose estimation task, where object translation distance along the longitudinal axis varies significantly, e.g., in autonomous driving scenarios. Additionally, we incorporate the mutual information between traffic participants via a modified non-local block. As opposed to the original non-local block implementation, the proposed weighting modification takes the spatial neighbouring information into consideration whilst counteracting the effect of extreme gradient values. Our 6D-VNet reaches 1st place in the ApolloScape challenge 3D Car Instance task [21]. Code has been made available at: https://github.com/stevenwudi/6DVNET.
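
The role of translational regression in a joint loss can be illustrated with a small sketch. This is not the authors' implementation: the quaternion rotation representation, the Huber penalty on translation, the weights w_rot and w_trans, and all function names below are assumptions chosen for illustration only. The linked repository is the reference for the actual loss.

```python
import math


def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    norm = math.sqrt(sum(c * c for c in q))
    if norm == 0.0:
        raise ValueError("zero-norm quaternion")
    return [c / norm for c in q]


def rotation_loss(q_pred, q_true):
    """Assumed rotation term: 1 - |<q_pred, q_true>| on unit quaternions.

    The absolute value makes q and -q (the same rotation) score zero.
    """
    qp, qt = normalize(q_pred), normalize(q_true)
    dot = sum(a * b for a, b in zip(qp, qt))
    return 1.0 - abs(dot)


def huber(residual, delta=1.0):
    """Huber penalty: quadratic near zero, linear for large residuals."""
    a = abs(residual)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def translation_loss(t_pred, t_true, delta=1.0):
    """Assumed translation term: mean Huber penalty over (x, y, z)."""
    return sum(huber(p - t, delta) for p, t in zip(t_pred, t_true)) / len(t_true)


def joint_pose_loss(q_pred, q_true, t_pred, t_true, w_rot=1.0, w_trans=1.0):
    """Weighted sum of the rotation and translation terms (weights are hypothetical)."""
    return (w_rot * rotation_loss(q_pred, q_true)
            + w_trans * translation_loss(t_pred, t_true))


if __name__ == "__main__":
    identity = [1.0, 0.0, 0.0, 0.0]
    # Rotation is predicted perfectly, but depth (z) is off by 2 units.
    print(joint_pose_loss(identity, identity, [0.0, 0.0, 12.0], [0.0, 0.0, 10.0]))
```

In this toy case a rotation-only loss would be zero even though the predicted depth is wrong; the translation term is what exposes the error, which is the situation the abstract describes for objects whose longitudinal distance varies widely.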
