SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, the state-of-the-art tables are recorded separately for each of the two component tasks.

(Image credit: TensorFlow Object Detection API)

Papers

Showing 101–150 of 2042 papers

Title | Status | Hype
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation | Code | 1
Decoding Natural Images from EEG for Object Recognition | Code | 1
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Code | 1
Deep Gaze I: Boosting Saliency Prediction with Feature Maps Trained on ImageNet | Code | 1
Matching the Neuronal Representations of V1 is Necessary to Improve Robustness in CNNs with V1-like Front-ends | Code | 1
Efficient Attention: Attention with Linear Complexities | Code | 1
Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency | Code | 1
Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning | Code | 1
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks | Code | 1
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1
From Chaos Comes Order: Ordering Event Representations for Object Recognition and Detection | Code | 1
Learning what and where to attend | Code | 1
ImageNet Large Scale Visual Recognition Challenge | Code | 1
Deep Subdomain Adaptation Network for Image Classification | Code | 1
Densely Connected Convolutional Networks | Code | 1
ObjectNet Dataset: Reanalysis and Correction | Code | 1
DesCo: Learning Object Recognition with Rich Language Descriptions | Code | 1
Describing Textures in the Wild | Code | 1
When and how CNNs generalize to out-of-distribution category-viewpoint combinations | Code | 1
DetMatch: Two Teachers are Better Than One for Joint 2D and 3D Semi-Supervised Object Detection | Code | 1
Adaptive Subspaces for Few-Shot Learning | Code | 1
On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey | Code | 1
Adaptive Threshold for Online Object Recognition and Re-identification Tasks | Code | 1
Discover and Cure: Concept-aware Mitigation of Spurious Correlation | Code | 1
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks | Code | 1
Expanding Event Modality Applications through a Robust CLIP-Based Encoder | Code | 1
Evolving Deep Neural Networks | Code | 1
Explainability-Aware One Point Attack for Point Cloud Neural Networks | Code | 1
EventCLIP: Adapting CLIP for Event-based Object Recognition | Code | 1
BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video | Code | 1
EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation | Code | 1
EventRPG: Event Data Augmentation with Relevance Propagation Guidance | Code | 1
Explainable GeoAI: Can saliency maps help interpret artificial intelligence's learning process? An empirical study on natural feature detection | Code | 1
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models | Code | 1
Dynamic Few-Shot Visual Learning without Forgetting | Code | 1
Equalization Loss for Long-Tailed Object Recognition | Code | 1
Attribution in Scale and Space | Code | 1
Event-based Asynchronous Sparse Convolutional Networks | Code | 1
Billion-scale semi-supervised learning for image classification | Code | 1
Bilateral Event Mining and Complementary for Event Stream Super-Resolution | Code | 1
Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition | Code | 1
Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like? | Code | 1
Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification | Code | 1
Exploring the Transferability of Visual Prompting for Multimodal Large Language Models | Code | 1
FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery | Code | 1
Rehearsal-Free Continual Learning over Small Non-I.I.D. Batches | Code | 1
Causal Transportability for Visual Recognition | Code | 1
FSD: Fast Self-Supervised Single RGB-D to Categorical 3D Objects | Code | 1
Dual-Hybrid Attention Network for Specular Highlight Removal | Code | 1
E2PNet: Event to Point Cloud Registration with Spatio-Temporal Representation Learning | Code | 1
Page 3 of 41

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 | - | Unverified
2 | Stable Diffusion | shape bias | 92.7 | - | Unverified
3 | Parti | shape bias | 91.7 | - | Unverified
4 | ViT-22B-384 | shape bias | 86.4 | - | Unverified
5 | ViT-22B-560 | shape bias | 83.8 | - | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 | - | Unverified
7 | ViT-22B-224 | shape bias | 78 | - | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | - | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | - | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 | - | Unverified
2 | SSNN | Accuracy (%) | 78.57 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 | - | Unverified
2 | SSNN | Accuracy (%) | 79.25 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | - | Unverified
2 | yun | Top 5 Accuracy | 14.75 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | - | Unverified
2 | DY | Top 5 Accuracy | 0.08 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | - | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 | - | Unverified