SOTAVerified

Object Detection

Papers

Showing 101–150 of 10957 papers

Title | Status | Hype
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions | Code | 3
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks | Code | 3
Theoretically Achieving Continuous Representation of Oriented Bounding Boxes | Code | 3
State Space Models for Event Cameras | Code | 3
Towards Automatic Power Battery Detection: New Challenge Benchmark Dataset and Baseline | Code | 3
General Object Foundation Model for Images and Videos at Scale | Code | 3
AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One | Code | 3
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | Code | 3
Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection | Code | 3
Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection | Code | 3
MagicDrive: Street View Generation with Diverse 3D Geometry Control | Code | 3
How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection | Code | 3
SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More | Code | 3
Geometric-aware Pretraining for Vision-centric 3D Object Detection | Code | 3
EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation | Code | 3
Cross-Modal Causal Intervention for Medical Report Generation | Code | 3
SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving | Code | 3
Universal Instance Perception as Object Discovery and Retrieval | Code | 3
Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Code | 3
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Code | 3
Cross Modal Transformer: Towards Fast and Robust 3D Object Detection | Code | 3
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Code | 3
DETRs with Collaborative Hybrid Assignments Training | Code | 3
Vision-Language Pre-training: Basics, Recent Advances, and Future Trends | Code | 3
Revisiting Image Pyramid Structure for High Resolution Salient Object Detection | Code | 3
OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network | Code | 3
Vision Transformers: From Semantic Segmentation to Dense Prediction | Code | 3
Separable Self-attention for Mobile Vision Transformers | Code | 3
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images | Code | 3
TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving | Code | 3
Vision Transformer Adapter for Dense Predictions | Code | 3
MaxViT: Multi-Axis Vision Transformer | Code | 3
BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection | Code | 3
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Code | 3
PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Code | 3
Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge | Code | 3
XCiT: Cross-Covariance Image Transformers | Code | 3
Robust and Accurate Object Detection via Adversarial Learning | Code | 3
A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit | Code | 3
Deformable DETR: Deformable Transformers for End-to-End Object Detection | Code | 3
A Survey on Performance Metrics for Object-Detection Algorithms | Code | 3
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection | Code | 3
U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection | Code | 3
YOLOv4: Optimal Speed and Accuracy of Object Detection | Code | 3
ResNeSt: Split-Attention Networks | Code | 3
Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection | Code | 3
EfficientDet: Scalable and Efficient Object Detection | Code | 3
Bag of Freebies for Training Object Detection Neural Networks | Code | 3
MMLSpark: Unifying Machine Learning Ecosystems at Massive Scales | Code | 3
Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields | Code | 3
Page 3 of 220

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Co-DETR | box mAP | 66 |  | Unverified
2 | InternImage-H (M3I Pre-training) | box mAP | 65.5 |  | Unverified
3 | M3I Pre-training (InternImage-H) | box mAP | 65.4 |  | Unverified
4 | MoCaE | box mAP | 65.1 |  | Unverified
5 | Co-DETR (Swin-L) | box mAP | 64.8 |  | Unverified
6 | Focal-Stable-DINO (Focal-Huge, no TTA) | box mAP | 64.8 |  | Unverified
7 | EVA | box mAP | 64.7 |  | Unverified
8 | Group DETR v2 | box mAP | 64.5 |  | Unverified
9 | FocalNet-H (DINO) | box mAP | 64.4 |  | Unverified
10 | InternImage-XL | box mAP | 64.3 |  | Unverified