Sparse PointPillars: Maintaining and Exploiting Input Sparsity to Improve Runtime on Embedded Systems
Kyle Vedder, Eric Eaton
Code Available
- github.com/kylevedder/SparsePointPillars (official, in paper, PyTorch, ★ 21)
- github.com/Mind23-2/MindCode-5/tree/main/pointpillars (MindSpore, ★ 0)
- github.com/code-implementation1/Code6/tree/main/pointpillars (MindSpore, ★ 0)
Abstract
Bird's Eye View (BEV) is a popular representation for processing 3D point clouds, and by its nature is fundamentally sparse. Motivated by the computational limitations of mobile robot platforms, we create a fast, high-performance BEV 3D object detector that maintains and exploits this input sparsity to decrease runtimes over non-sparse baselines and avoids the tradeoff between pseudoimage area and runtime. We present results on KITTI, a canonical 3D detection dataset, and Matterport-Chair, a novel Matterport3D-derived chair detection dataset from scenes in real furnished homes. We evaluate runtime characteristics using a desktop GPU, an embedded ML accelerator, and a robot CPU, demonstrating that our method results in significant detection speedups (2X or more) for embedded systems with only a modest decrease in detection quality. Our work represents a new approach for practitioners to optimize models for embedded systems by maintaining and exploiting input sparsity throughout their entire pipeline to reduce runtime and resource usage while preserving detection performance.
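To make the sparsity claim concrete, the following is a minimal sketch of how a point cloud might be binned into a BEV grid while recording only the occupied cells. It is not the authors' implementation: the function names (`occupied_pillars`, `occupancy_fraction`), the grid extents, and the pillar size are all hypothetical choices made for illustration.

```python
def grid_shape(x_range, y_range, pillar_size):
    """Number of (rows, cols) in a BEV grid covering the given extent."""
    cols = int(round((x_range[1] - x_range[0]) / pillar_size))
    rows = int(round((y_range[1] - y_range[0]) / pillar_size))
    return rows, cols


def occupied_pillars(points, x_range, y_range, pillar_size):
    """Return the set of (row, col) BEV cells containing at least one point.

    `points` is an iterable of (x, y, z) tuples. Only non-empty cells are
    stored, so memory scales with occupied cells rather than grid area.
    """
    rows, cols = grid_shape(x_range, y_range, pillar_size)
    x_min, x_max = x_range
    y_min, y_max = y_range
    occupied = set()
    for x, y, _z in points:
        # Skip points that fall outside the BEV extent.
        if not (x_min <= x < x_max and y_min <= y < y_max):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(int((x - x_min) / pillar_size), cols - 1)
        row = min(int((y - y_min) / pillar_size), rows - 1)
        occupied.add((row, col))
    return occupied


def occupancy_fraction(points, x_range, y_range, pillar_size):
    """Fraction of BEV cells that are non-empty (hypothetical helper)."""
    rows, cols = grid_shape(x_range, y_range, pillar_size)
    cells = occupied_pillars(points, x_range, y_range, pillar_size)
    return len(cells) / (rows * cols)


# Illustrative toy input (assumed values, not from the paper):
# an 8 m x 8 m extent with 0.5 m pillars gives a 16 x 16 grid.
toy_points = [(0.1, 0.1, 0.0), (0.2, 0.3, 1.0), (7.9, 7.9, 0.0), (9.0, 9.0, 0.0)]
toy_cells = occupied_pillars(toy_points, (0.0, 8.0), (0.0, 8.0), 0.5)
toy_fraction = occupancy_fraction(toy_points, (0.0, 8.0), (0.0, 8.0), 0.5)
```

In this toy input, two points share a cell and one lies outside the extent, so only two of the 256 cells are occupied. A dense pseudoimage would still allocate and process all 256 cells, whereas a sparse representation needs to touch only the occupied ones, which is the property the abstract describes exploiting.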