Real-time object detection and tracking using flash LiDAR imagery
Daniel Carvalho, Art Lompado, Riccardo Consolo, Abhijit Bhattacharjee, Jarrod Brown
Abstract
In this study, we present a real-time vehicle detection program that combines the You-Only-Look-Once-X (YOLOX) object detection algorithm with a multi-object Kalman filter tracker, designed specifically for analyzing 3D light detection and ranging (LiDAR) data. An active imager such as LiDAR offers significant advantages over conventional passive 2D imagery. By providing its own illumination source, LiDAR eliminates color fluctuations caused by shadowing or diurnal cycling, which improves the precision and accuracy of object detection and classification. Our approach involves capturing videos of 8 vehicles with an Advanced Scientific Concepts TigerCub 3D Flash LiDAR camera, which provides intensity and range data sequences. These sequences are converted into representative color images, which are used to train the YOLOX object detector neural network. To further improve detection accuracy for obscured vehicles and minimize the rate of incorrect detections, we integrate Kalman filter trackers into the detection algorithm. These trackers identify the vehicles and predict their future locations, reducing both false positive and false negative detections. The resulting algorithm is lightweight and produces highly accurate inference results in near real time on a live stream of LiDAR data. To demonstrate the applicability of our approach to small unmanned vehicles and drones, we deploy the application on NVIDIA's Jetson Orin Nano embedded AI processor. By optimizing the code with TensorRT for real-time performance, we achieve object detection and classification of flash LiDAR data at an average precision exceeding 95% and a rate of 60 frames per second. MATLAB plays a crucial role in enabling rapid prototyping and algorithm testing, and it facilitates the transfer and deployment of the complex deep learning logic to an edge device without compromising performance or accuracy.
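The role of the Kalman filter tracker described in the abstract, predicting where a vehicle will be so that a briefly missed detection does not end its track, can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation. It assumes a constant-velocity motion model applied independently to the x and y coordinates of a bounding-box center, and the names `AxisKalman` and `Track`, the noise values `q` and `r`, and the `max_missed` threshold are all hypothetical choices made for illustration. Data association between multiple detections and multiple tracks is omitted.

```python
from dataclasses import dataclass


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis.

    State is (position, velocity); only position is measured.
    All numeric defaults are illustrative assumptions.
    """
    pos: float
    vel: float = 0.0
    # Symmetric covariance P = [[p00, p01], [p01, p11]]
    p00: float = 10.0
    p01: float = 0.0
    p11: float = 100.0
    q: float = 1.0  # process noise added to the velocity variance (assumed)
    r: float = 4.0  # measurement noise variance in pixels^2 (assumed)

    def predict(self, dt: float = 1.0) -> float:
        # State transition x = F x with F = [[1, dt], [0, 1]]
        self.pos += dt * self.vel
        # Covariance transition P = F P F^T + Q, with Q = [[0, 0], [0, q]]
        p00 = self.p00 + dt * (2.0 * self.p01 + dt * self.p11)
        p01 = self.p01 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p11 = p00, p01, p11
        return self.pos

    def update(self, z: float) -> None:
        # Measurement model H = [1, 0]
        s = self.p00 + self.r              # innovation variance
        k0, k1 = self.p00 / s, self.p01 / s  # Kalman gain
        y = z - self.pos                   # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        # Covariance update P = (I - K H) P
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11


class Track:
    """One tracked vehicle: a filter per axis plus a missed-frame counter."""

    def __init__(self, cx: float, cy: float, max_missed: int = 5):
        self.x = AxisKalman(cx)
        self.y = AxisKalman(cy)
        self.missed = 0
        self.max_missed = max_missed  # hypothetical deletion threshold

    def step(self, detection=None):
        """Advance one frame. `detection` is (cx, cy), or None on a miss."""
        self.x.predict()
        self.y.predict()
        if detection is None:
            # No detection this frame: keep the predicted position (coasting).
            self.missed += 1
        else:
            self.x.update(detection[0])
            self.y.update(detection[1])
            self.missed = 0
        return (self.x.pos, self.y.pos)

    @property
    def alive(self) -> bool:
        return self.missed <= self.max_missed


def demo():
    """Track a target moving 5 px/frame in x, then drop two detections."""
    track = Track(0.0, 50.0)
    for t in range(1, 11):
        track.step((5.0 * t, 50.0))
    # Two frames with no detection, e.g. the vehicle is briefly occluded.
    track.step(None)
    return track.step(None)
```

Calling `demo()` returns the coasted position after two frames without a detection. Because the filter has learned the target's velocity, the estimate continues along the same path, which is how a tracker of this kind can suppress false negatives for briefly obscured vehicles. A track whose `alive` property becomes false would be dropped.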