SOTAVerified

Object Detection

Papers

Showing 226-250 of 10,957 papers

Title | Status | Hype
Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images | Code | 2
MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection | Code | 2
COALA: A Practical and Vision-Centric Federated Learning Platform | Code | 2
PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects | Code | 2
ESOD: Efficient Small Object Detection on High-Resolution Images | Code | 2
GroupMamba: Efficient Group-Based Visual State Space Model | Code | 2
GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Code | 2
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Code | 2
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Code | 2
OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection | Code | 2
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset | Code | 2
Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation | Code | 2
SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention | Code | 2
Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection | Code | 2
SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry | Code | 2
SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding | Code | 2
SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection | Code | 2
The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Code | 2
LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection | Code | 2
Scaling Efficient Masked Image Modeling on Large Remote Sensing Dataset | Code | 2
Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection | Code | 2
EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models | Code | 2
BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection | Code | 2
STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery | Code | 2
EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Co-DETR | box mAP | 66 | | Unverified
2 | InternImage-H (M3I Pre-training) | box mAP | 65.5 | | Unverified
3 | M3I Pre-training (InternImage-H) | box mAP | 65.4 | | Unverified
4 | MoCaE | box mAP | 65.1 | | Unverified
5 | Co-DETR (Swin-L) | box mAP | 64.8 | | Unverified
6 | Focal-Stable-DINO (Focal-Huge, no TTA) | box mAP | 64.8 | | Unverified
7 | EVA | box mAP | 64.7 | | Unverified
8 | Group DETR v2 | box mAP | 64.5 | | Unverified
9 | FocalNet-H (DINO) | box mAP | 64.4 | | Unverified
10 | InternImage-XL | box mAP | 64.3 | | Unverified