SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection and image classification, the state-of-the-art tables are recorded separately under each component task.

(Image credit: TensorFlow Object Detection API)

Papers

Showing 51-100 of 2042 papers

Title | Status | Hype
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency | Code | 1
Generalizable Data-free Objective for Crafting Universal Adversarial Perturbations | Code | 1
Large-scale Remote Sensing Image Target Recognition and Automatic Annotation | Code | 1
Going Deeper with Convolutions | Code | 1
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Code | 1
ImageNet Large Scale Visual Recognition Challenge | Code | 1
Implicit Feature Refinement for Instance Segmentation | Code | 1
Comparison of semi-supervised deep learning algorithms for audio classification | Code | 1
Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition | Code | 1
EventCLIP: Adapting CLIP for Event-based Object Recognition | Code | 1
Expanding Event Modality Applications through a Robust CLIP-Based Encoder | Code | 1
Intriguing properties of generative classifiers | Code | 1
Enriching ImageNet with Human Similarity Judgments and Psychological Embeddings | Code | 1
E2PNet: Event to Point Cloud Registration with Spatio-Temporal Representation Learning | Code | 1
Equalization Loss for Long-Tailed Object Recognition | Code | 1
Explainability-Aware One Point Attack for Point Cloud Neural Networks | Code | 1
DOCTOR: A Simple Method for Detecting Misclassification Errors | Code | 1
Distributed Deep Neural Networks over the Cloud, the Edge and End Devices | Code | 1
Domain Generalization for Object Recognition with Multi-task Autoencoders | Code | 1
Describing Textures in the Wild | Code | 1
Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning | Code | 1
Deep Subdomain Adaptation Network for Image Classification | Code | 1
Doubly Right Object Recognition: A Why Prompt for Visual Rationales | Code | 1
Explainable GeoAI: Can saliency maps help interpret artificial intelligence's learning process? An empirical study on natural feature detection | Code | 1
DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation | Code | 1
Debiased Self-Training for Semi-Supervised Learning | Code | 1
CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment | Code | 1
CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics | Code | 1
Decoding Natural Images from EEG for Object Recognition | Code | 1
DeepScores -- A Dataset for Segmentation, Detection and Classification of Tiny Objects | Code | 1
Densely Connected Convolutional Networks | Code | 1
DesCo: Learning Object Recognition with Rich Language Descriptions | Code | 1
DetMatch: Two Teachers are Better Than One for Joint 2D and 3D Semi-Supervised Object Detection | Code | 1
Discover and Cure: Concept-aware Mitigation of Spurious Correlation | Code | 1
Divergences in Color Perception between Deep Neural Networks and Humans | Code | 1
Do Adversarially Robust ImageNet Models Transfer Better? | Code | 1
Dual-Hybrid Attention Network for Specular Highlight Removal | Code | 1
Dynamic Few-Shot Visual Learning without Forgetting | Code | 1
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models | Code | 1
Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern Object Detectors | Code | 1
EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation | Code | 1
Event-based Asynchronous Sparse Convolutional Networks | Code | 1
EventRPG: Event Data Augmentation with Relevance Propagation Guidance | Code | 1
Evolving Deep Neural Networks | Code | 1
Contributions of Shape, Texture, and Color in Visual Recognition | Code | 1
Contemplating real-world object classification | Code | 1
Rehearsal-Free Continual Learning over Small Non-I.I.D. Batches | Code | 1
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation | Code | 1
3D ShapeNets: A Deep Representation for Volumetric Shapes | Code | 1
Convolutional Neural Networks with Gated Recurrent Connections | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 | | Unverified
2 | Stable Diffusion | shape bias | 92.7 | | Unverified
3 | Parti | shape bias | 91.7 | | Unverified
4 | ViT-22B-384 | shape bias | 86.4 | | Unverified
5 | ViT-22B-560 | shape bias | 83.8 | | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 | | Unverified
7 | ViT-22B-224 | shape bias | 78 | | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 | | Unverified
2 | SSNN | Accuracy (%) | 78.57 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 | | Unverified
2 | SSNN | Accuracy (%) | 79.25 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | | Unverified
2 | yun | Top 5 Accuracy | 14.75 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified
2 | DY | Top 5 Accuracy | 0.08 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 | | Unverified