Lightweight LiDAR-Camera 3D Dynamic Object Detection and Multi-Class Trajectory Prediction

2025-04-18

Yushen He, Lei Zhao, Tianchen Deng, Zipeng Fang, Weidong Chen

Code

github.com/tosshero/3d_perception
Official, in paper, PyTorch, ★ 11
github.com/tosshero/ros_packages
Official, in paper, PyTorch, ★ 6

Abstract

Service mobile robots often must avoid dynamic objects while performing their tasks, yet they usually have limited computational resources. We therefore present a lightweight multi-modal framework for 3D object detection and trajectory prediction. The system fuses LiDAR and camera inputs to perceive pedestrians, vehicles, and riders in 3D space in real time. The framework introduces two novel modules: 1) a Cross-Modal Deformable Transformer (CMDT) for object detection with high accuracy at an acceptable computational cost, and 2) a Reference Trajectory-based Multi-Class Transformer (RTMCT) for efficient and diverse trajectory prediction of multi-class objects with flexible trajectory lengths. Evaluations on the CODa benchmark show improvements over existing methods in both detection (+2.03% in mAP) and trajectory prediction (-0.408 m in minADE5 for pedestrians). The system is also readily deployable: on a wheelchair robot with an entry-level NVIDIA 3060 GPU, it achieves real-time inference at 13.2 fps. To facilitate reproducibility and practical deployment, we release the code of the method at https://github.com/TossherO/3D_Perception and its ROS inference version at https://github.com/TossherO/ros_packages.
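
For readers unfamiliar with the minADE5 metric quoted above, the following is a minimal sketch of how it is commonly defined: the average displacement error (ADE) is the mean Euclidean distance between predicted and ground-truth waypoints, and minADE5 takes the smallest ADE among five candidate trajectories. The function names `ade` and `min_ade` and the toy coordinates are illustrative assumptions, not taken from the released code, and the paper's exact evaluation protocol may differ.

```python
import math


def ade(pred, gt):
    """Mean Euclidean distance between matching waypoints of two trajectories."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("trajectories must be non-empty and equal in length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)


def min_ade(candidates, gt, k=5):
    """Smallest ADE among the first k candidate trajectories (minADE_k)."""
    if not candidates:
        raise ValueError("at least one candidate trajectory is required")
    return min(ade(c, gt) for c in candidates[:k])


# Toy example (hypothetical data): 2D waypoints in metres.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
candidates = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # constant 1 m lateral offset
    [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)],  # drifts away over time
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.3)],  # closest to the ground truth
]

best = min_ade(candidates, gt, k=5)
print(f"minADE5 = {best:.3f} m")
```

Under this definition a lower value is better, which is why the reported change for pedestrians is negative.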

Tasks

3D Object Detection, GPU, object-detection, Object Detection, Prediction, Trajectory Prediction
