SOTAVerified

3D Object Detection

3D Object Detection is a computer vision task in which the goal is to identify objects in a 3D environment and estimate their location, shape, and orientation. It involves detecting the presence of objects and determining their position in 3D space, often in real time. The task is crucial for applications such as autonomous vehicles, robotics, and augmented reality.

(Image credit: AVOD)
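To make the description above concrete, here is a minimal sketch of one common way to represent a detected object: a box given by its center, its size, and a heading angle (yaw) about the vertical axis. The function name `box_corners`, the (length, width, height) ordering, and the z-up convention are assumptions chosen for illustration. Datasets and detectors differ in their conventions, so this is not the format of any particular method listed on this page.

```python
import math


def box_corners(center, size, yaw):
    """Return the 8 corners of a 3D box as (x, y, z) tuples.

    center: (x, y, z) of the box center.
    size:   (length, width, height); length lies along the box's local x axis.
    yaw:    rotation about the z (up) axis, in radians.
    These conventions are assumptions made for this illustration.
    """
    cx, cy, cz = center
    length, width, height = size
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)

    corners = []
    for sz in (-0.5, 0.5):  # bottom face first, then top face
        for sx, sy in ((0.5, 0.5), (0.5, -0.5), (-0.5, -0.5), (-0.5, 0.5)):
            # Corner offset in the box's local frame.
            lx, ly, lz = sx * length, sy * width, sz * height
            # Rotate about z by yaw, then translate to the box center.
            x = cx + cos_y * lx - sin_y * ly
            y = cy + sin_y * lx + cos_y * ly
            corners.append((x, y, cz + lz))
    return corners


# Hypothetical car-sized box, 10 m ahead, rotated 90 degrees.
example = box_corners(center=(10.0, 0.0, 0.75), size=(4.0, 2.0, 1.5), yaw=math.pi / 2)
```

With a yaw of 90 degrees the box's length ends up along the y axis, which is a quick sanity check on the rotation.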

Papers

Showing 1-25 of 1576 papers

| Title | Status | Hype |
| --- | --- | --- |
| ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection | Code | 4 |
| BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers | Code | 4 |
| BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision | Code | 4 |
| BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation | Code | 4 |
| TUMTraf V2X Cooperative Perception Dataset | Code | 4 |
| UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | Code | 4 |
| FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Code | 4 |
| PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Code | 3 |
| LION: Linear Group RNN for 3D Object Detection in Point Clouds | Code | 3 |
| PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images | Code | 3 |
| Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection | Code | 3 |
| BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects | Code | 3 |
| MagicDrive: Street View Generation with Diverse 3D Geometry Control | Code | 3 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Code | 3 |
| EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation | Code | 3 |
| Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection | Code | 3 |
| EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation | Code | 3 |
| Cross Modal Transformer: Towards Fast and Robust 3D Object Detection | Code | 3 |
| Detect Anything 3D in the Wild | Code | 3 |
| BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection | Code | 3 |
| Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Code | 3 |
| Cubify Anything: Scaling Indoor 3D Object Detection | Code | 3 |
| Geometric-aware Pretraining for Vision-centric 3D Object Detection | Code | 3 |
| IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Code | 3 |
| Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Code | 3 |

Benchmark Results

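| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | EA-LSS | NDS | 0.78 | | Unverified |
| 2 | MegFusion | NDS | 0.77 | | Unverified |
| 3 | MMFusion-e | NDS | 0.77 | | Unverified |
| 4 | BEVFusion-e | NDS | 0.76 | | Unverified |
| 5 | RacoonPower | NDS | 0.76 | | Unverified |
| 6 | DeepInteraction-large | NDS | 0.76 | | Unverified |
| 7 | DeepInteraction-e | NDS | 0.76 | | Unverified |
| 8 | FusionVPE | NDS | 0.75 | | Unverified |
| 9 | FocalFormer3D-F | NDS | 0.75 | | Unverified |
| 10 | CenterPoint-Fusion | NDS | 0.75 | | Unverified |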
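
The page does not define the NDS metric used in the table. Assuming it refers to the nuScenes Detection Score, that score combines mean average precision (mAP) with five mean true-positive error terms (translation, scale, orientation, velocity, attribute), each clipped to 1 and converted to a score. A minimal sketch of that combination follows. The function name `nds` and the input values are hypothetical and are not taken from any model listed above.

```python
def nds(mean_ap, tp_errors):
    """Combine mAP and true-positive errors into a single score in [0, 1].

    mean_ap:   mean average precision, in [0, 1].
    tp_errors: the five mean TP errors (translation, scale, orientation,
               velocity, attribute); each is clipped to at most 1.
    Assumes the nuScenes Detection Score definition:
        NDS = (5 * mAP + sum(1 - min(1, err))) / 10
    """
    tp_errors = list(tp_errors)
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five TP error terms")
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Hypothetical inputs, for illustration only.
score = nds(mean_ap=0.70, tp_errors=[0.30, 0.25, 0.30, 0.25, 0.15])
```

Under this definition, mAP carries half of the weight and the five error terms share the other half, so two models with the same mAP can still differ in NDS.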