SOTAVerified

Object Detection

Papers

Showing 51–100 of 10957 papers

Title | Status | Hype
GLIPv2: Unifying Localization and Vision-Language Understanding | Code | 4
Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | Code | 4
Vision GNN: An Image is Worth Graph of Nodes | Code | 4
GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector | Code | 4
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | Code | 4
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | Code | 4
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation | Code | 4
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models | Code | 4
PP-YOLOE: An evolved version of YOLO | Code | 4
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection | Code | 4
DN-DETR: Accelerate DETR Training by Introducing Query DeNoising | Code | 4
Visual Attention Network | Code | 4
Detectron2 Object Detection & Manipulating Images using Cartoonization | Code | 4
Deep Residual Learning for Image Recognition | Code | 4
Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Code | 3
OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics | Code | 3
Detect Anything 3D in the Wild | Code | 3
Playing Non-Embedded Card-Based Games with Reinforcement Learning | Code | 3
Frequency Dynamic Convolution for Dense Image Prediction | Code | 3
Falcon: A Remote Sensing Vision-Language Foundation Model | Code | 3
Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Code | 3
SM3Det: A Unified Model for Multi-Modal Remote Sensing Object Detection | Code | 3
Cubify Anything: Scaling Indoor 3D Object Detection | Code | 3
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension | Code | 3
Data Generation for Hardware-Friendly Post-Training Quantization | Code | 3
Rethinking the Evaluation of Visible and Infrared Image Fusion | Code | 3
RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision | Code | 3
A Survey of Camouflaged Object Detection and Beyond | Code | 3
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Code | 3
5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Code | 3
Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Code | 3
DeepInteraction++: Multi-Modality Interaction for Autonomous Driving | Code | 3
Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Code | 3
Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection | Code | 3
Practical Video Object Detection via Feature Selection and Aggregation | Code | 3
LION: Linear Group RNN for 3D Object Detection in Point Clouds | Code | 3
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | Code | 3
TCFormer: Visual Recognition via Token Clustering Transformer | Code | 3
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Code | 3
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Code | 3
Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines | Code | 3
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models | Code | 3
Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Code | 3
Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Code | 3
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Code | 3
RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection | Code | 3
Multiple Object Tracking as ID Prediction | Code | 3
Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | Code | 3
IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Code | 3
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Code | 3
Page 2 of 220

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Co-DETR | box mAP | 66 | | Unverified
2 | InternImage-H (M3I Pre-training) | box mAP | 65.5 | | Unverified
3 | M3I Pre-training (InternImage-H) | box mAP | 65.4 | | Unverified
4 | MoCaE | box mAP | 65.1 | | Unverified
5 | Co-DETR (Swin-L) | box mAP | 64.8 | | Unverified
6 | Focal-Stable-DINO (Focal-Huge, no TTA) | box mAP | 64.8 | | Unverified
7 | EVA | box mAP | 64.7 | | Unverified
8 | Group DETR v2 | box mAP | 64.5 | | Unverified
9 | FocalNet-H (DINO) | box mAP | 64.4 | | Unverified
10 | InternImage-XL | box mAP | 64.3 | | Unverified