SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, state-of-the-art tables are recorded separately for each component task.

(Image credit: TensorFlow Object Detection API)
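The description of object recognition as detection followed by classification can be sketched as a small two-stage pipeline. This is only an illustrative sketch: the names `Recognition`, `recognize`, `toy_detect`, and `toy_classify`, the box convention, and the toy image are all assumptions made for this example and do not come from any model or library listed on this page. A real system would replace the two toy functions with trained models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical conventions for this sketch only.
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), max exclusive
Image = List[List[int]]           # toy grayscale image: rows of pixel values


@dataclass
class Recognition:
    box: Box
    label: str


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Recognition]:
    """Stage 1 proposes boxes; stage 2 labels the crop inside each box."""
    return [Recognition(box, classify(crop(image, box))) for box in detect(image)]


def toy_detect(image: Image) -> List[Box]:
    """Stand-in detector: one box around all non-zero pixels, if any."""
    coords = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v > 0]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_classify(patch: Image) -> str:
    """Stand-in classifier: label a patch by its mean brightness."""
    pixels = [v for row in patch for v in row]
    return "bright" if sum(pixels) / len(pixels) > 127 else "dark"


image = [
    [0, 0, 0, 0],
    [0, 200, 220, 0],
    [0, 210, 230, 0],
    [0, 0, 0, 0],
]
results = recognize(image, toy_detect, toy_classify)
print(results)
```

Keeping the detector and classifier as separate callables mirrors the way the component tasks are benchmarked separately, so either stage can be swapped without touching the other.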

Papers

Showing 151-200 of 2042 papers

Title | Status | Hype
Limited but consistent gains in adversarial robustness by co-training object recognition models with human EEG | | 0
UAV (Unmanned Aerial Vehicles): Diverse Applications of UAV Datasets in Segmentation, Classification, Detection, and Tracking | | 0
Low-Resolution Object Recognition with Cross-Resolution Relational Contrastive Distillation | | 0
Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS | Code | 0
OpenNav: Efficient Open Vocabulary 3D Object Detection for Smart Wheelchair Navigation | Code | 0
Optimizing Spatio-Temporal Information Processing in Spiking Neural Networks via Unconstrained Leaky Integrate-and-Fire Neurons and Hybrid Coding | Code | 0
Finding Closure: A Closer Look at the Gestalt Law of Closure in Convolutional Neural Networks | | 0
How Small is Big Enough? Open Labeled Datasets and the Development of Deep Learning | | 0
Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification | Code | 1
Robust Domain Generalization for Multi-modal Object Recognition | | 0
On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey | Code | 1
UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling | Code | 3
ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling | Code | 0
Understanding How Blind Users Handle Object Recognition Errors: Strategies and Challenges | | 0
Source-Free Domain-Invariant Performance Prediction | | 0
A General Ambiguity Model for Binary Edge Images with Edge Tracing and its Implementation | | 0
THOR2: Topological Analysis for 3D Shape and Color-Based Human-Inspired Object Recognition in Unseen Environments | Code | 0
EZSR: Event-based Zero-Shot Recognition | | 0
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection | Code | 1
Combined CNN and ViT features off-the-shelf: Another astounding baseline for recognition | | 0
AI-based Density Recognition | | 0
A Dataset for Crucial Object Recognition in Blind and Low-Vision Individuals' Navigation | Code | 0
Affordance Labeling and Exploration: A Manifold-Based Approach | | 0
EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition | | 0
OCTrack: Benchmarking the Open-Corpus Multi-Object Tracking | | 0
SUSTechGAN: Image Generation for Object Detection in Adverse Conditions of Autonomous Driving | Code | 0
Real-Time 3D Occupancy Prediction via Geometric-Semantic Disentanglement | | 0
Data-driven Verification of DNNs for Object Recognition | | 0
Dual-Hybrid Attention Network for Specular Highlight Removal | Code | 1
PartImageNet++ Dataset: Scaling up Part-based Models for Robust Recognition | Code | 1
Teaching CORnet Human fMRI Representations for Enhanced Model-Brain Alignment | | 0
Introducing VaDA: Novel Image Segmentation Model for Maritime Object Segmentation Using New Dataset | | 0
CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding | | 0
Enhanced Model Robustness to Input Corruptions by Per-corruption Adaptation of Normalization Statistics | | 0
The Cooperative Network Architecture: Learning Structured Networks as Representation of Sensory Patterns | Code | 0
Interpreting the Residual Stream of ResNet18 | | 0
CBM: Curriculum by Masking | Code | 0
Object recognition in primates: What can early visual areas contribute? | | 0
Beyond Viewpoint: Robust 3D Object Recognition under Arbitrary Views through Joint Multi-Part Representation | | 0
Comics Datasets Framework: Mix of Comics datasets for detection benchmarking | Code | 1
EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More | | 0
Efficient Event Stream Super-Resolution with Recursive Multi-Branch Fusion | Code | 0
MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning | Code | 2
Putting GPT-4o to the Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency | | 0
3D Instance Segmentation Using Deep Learning on RGB-D Indoor Data | | 0
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images | Code | 2
Benchmarking Out-of-Distribution Generalization Capabilities of DNN-based Encoding Models for the Ventral Visual Cortex | | 0
The BabyView dataset: High-resolution egocentric videos of infants' and young children's everyday experiences | | 0
I-MPN: Inductive Message Passing Network for Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data | | 0
A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 | | Unverified
2 | Stable Diffusion | shape bias | 92.7 | | Unverified
3 | Parti | shape bias | 91.7 | | Unverified
4 | ViT-22B-384 | shape bias | 86.4 | | Unverified
5 | ViT-22B-560 | shape bias | 83.8 | | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 | | Unverified
7 | ViT-22B-224 | shape bias | 78 | | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 | | Unverified
2 | SSNN | Accuracy (%) | 78.57 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 | | Unverified
2 | SSNN | Accuracy (%) | 79.25 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | | Unverified
2 | yun | Top 5 Accuracy | 14.75 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified
2 | DY | Top 5 Accuracy | 0.08 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 | | Unverified