SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection and image classification, the state-of-the-art tables are recorded separately for each component task.

(Image credit: TensorFlow Object Detection API)
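As a rough illustration of the "detection plus classification" framing above, the sketch below chains a detector and a classifier. Everything in it is hypothetical: `recognize`, `toy_detect`, `toy_classify`, and the `(x_min, y_min, x_max, y_max)` box convention are stand-ins chosen for this example, not the API of any listed paper or library.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed conventions for this sketch only.
Box = Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max), max exclusive
Image = Sequence[Sequence[int]]      # toy grayscale image: rows of pixel values


@dataclass
class Recognition:
    box: Box
    label: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[int]]], str],
) -> List[Recognition]:
    """Detect candidate boxes, then classify the pixels inside each box."""
    return [Recognition(box, classify(crop(image, box))) for box in detect(image)]


def toy_detect(image: Image) -> List[Box]:
    """Hypothetical detector: one box around all nonzero pixels, if any."""
    coords = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v > 0]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_classify(patch: List[List[int]]) -> str:
    """Hypothetical classifier: label a patch by its mean intensity."""
    total = sum(sum(row) for row in patch)
    area = sum(len(row) for row in patch)
    return "bright" if total / area >= 128 else "dark"


image = [
    [0, 0, 0, 0],
    [0, 200, 220, 0],
    [0, 210, 230, 0],
    [0, 0, 0, 0],
]
results = recognize(image, toy_detect, toy_classify)
```

In a real system the two callables would be trained models; keeping them as separate arguments mirrors why the two component tasks can be benchmarked independently.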

Papers

Showing 151–200 of 2042 papers

Title | Status | Hype
Full-Glow: Fully conditional Glow for more realistic image generation | Code | 1
The Lottery Ticket Hypothesis for Object Recognition | Code | 1
Simulating a Primary Visual Cortex at the Front of CNNs Improves Robustness to Image Perturbations | Code | 1
Enriching ImageNet with Human Similarity Judgments and Psychological Embeddings | Code | 1
RAMP-CNN: A Novel Neural Network for Enhanced Automotive Radar Object Recognition | Code | 1
Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions | Code | 1
LCD -- Line Clustering and Description for Place Recognition | Code | 1
Natural Language Rationales with Full-Stack Visual Reasoning: From Pixels to Semantic Frames to Commonsense Graphs | Code | 1
The MECCANO Dataset: Understanding Human-Object Interactions from Egocentric Videos in an Industrial-like Domain | Code | 1
Robust and Efficient Post-Processing for Video Object Detection (REPP) | Code | 1
Robust and efficient post-processing for video object detection | Code | 1
Offline Meta-Reinforcement Learning with Advantage Weighting | Code | 1
Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation | Code | 1
TactileSGNet: A Spiking Graph Neural Network for Event-based Tactile Object Recognition | Code | 1
Do Adversarially Robust ImageNet Models Transfer Better? | Code | 1
When and how CNNs generalize to out-of-distribution category-viewpoint combinations | Code | 1
Single Shot MC Dropout Approximation | Code | 1
Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax | Code | 1
Noise or Signal: The Role of Image Backgrounds in Object Recognition | Code | 1
v2e: From Video Frames to Realistic DVS Events | Code | 1
Adaptive Subspaces for Few-Shot Learning | Code | 1
Computing the Testing Error Without a Testing Set | Code | 1
Traditional Method Inspired Deep Neural Network for Edge Detection | Code | 1
Computing the Testing Error without a Testing Set | Code | 1
When CNNs Meet Random RNNs: Towards Multi-Level Analysis for RGB-D Object and Scene Recognition | Code | 1
SL-DML: Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition | Code | 1
Understanding Integrated Gradients with SmoothTaylor for Deep Neural Network Attribution | Code | 1
VOWEL: A Local Online Learning Rule for Recurrent Networks of Probabilistic Spiking Winner-Take-All Circuits | Code | 1
TOG: Targeted Adversarial Objectness Gradient Attacks on Real-time Object Detection Systems | Code | 1
Empirical Upper Bound, Error Diagnosis and Invariance Analysis of Modern Object Detectors | Code | 1
ObjectNet Dataset: Reanalysis and Correction | Code | 1
Attribution in Scale and Space | Code | 1
Look-into-Object: Self-supervised Structure Modeling for Object Recognition | Code | 1
Exploit Clues from Views: Self-Supervised and Regularized Learning for Multiview Object Recognition | Code | 1
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models | Code | 1
Event-based Asynchronous Sparse Convolutional Networks | Code | 1
Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization | Code | 1
Pose Augmentation: Class-agnostic Object Pose Transformation for Object Recognition | Code | 1
Equalization Loss for Long-Tailed Object Recognition | Code | 1
Self-Supervised Linear Motion Deblurring | Code | 1
OPFython: A Python-Inspired Optimum-Path Forest Classifier | Code | 1
Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like? | Code | 1
Rehearsal-Free Continual Learning over Small Non-I.I.D. Batches | Code | 1
Billion-scale semi-supervised learning for image classification | Code | 1
Efficient Attention: Attention with Linear Complexities | Code | 1
ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness | Code | 1
Compact Generalized Non-local Network | Code | 1
PCL: Proposal Cluster Learning for Weakly Supervised Object Detection | Code | 1
Task-Driven Convolutional Recurrent Models of the Visual System | Code | 1
Why do deep convolutional networks generalize so poorly to small image transformations? | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 | - | Unverified
2 | Stable Diffusion | shape bias | 92.7 | - | Unverified
3 | Parti | shape bias | 91.7 | - | Unverified
4 | ViT-22B-384 | shape bias | 86.4 | - | Unverified
5 | ViT-22B-560 | shape bias | 83.8 | - | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 | - | Unverified
7 | ViT-22B-224 | shape bias | 78 | - | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | - | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | - | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 | - | Unverified
2 | SSNN | Accuracy (%) | 78.57 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 | - | Unverified
2 | SSNN | Accuracy (%) | 79.25 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | - | Unverified
2 | yun | Top 5 Accuracy | 14.75 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | - | Unverified
2 | DY | Top 5 Accuracy | 0.08 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | - | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 | - | Unverified
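
For readers unfamiliar with the Top 5 Accuracy metric used in some of the tables above, here is a minimal sketch of the standard top-k definition: a prediction counts as correct when the true class is among the k highest-scoring classes. The function name `top_k_accuracy`, the input layout, and the choice to return a percentage are assumptions made for this example; the benchmarks' own evaluation scripts may differ in details such as tie handling.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Percentage of samples whose true label is among the k top-scoring classes.

    scores[i][c] is the model's score for class c on sample i (assumed layout).
    labels[i] is the true class index of sample i.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy example with three classes, so k is kept below 3 to be informative.
toy_scores = [
    [0.1, 0.5, 0.4],
    [0.7, 0.2, 0.1],
    [0.2, 0.3, 0.5],
]
toy_labels = [2, 1, 0]
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

In the toy data the true class is within the top two for the first two samples but not the third, so two of three samples count as hits.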