SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, the state-of-the-art tables are recorded separately for each component task.

(Image credit: TensorFlow Object Detection API)
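
The "detection plus classification" composition described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `recognize` helper, the `(top, left, bottom, right)` box convention, and the toy `toy_detect` / `toy_classify` stand-ins are hypothetical and do not correspond to any model or API listed on this page. A real system would plug in a trained detector and classifier in their place.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical conventions for this sketch only.
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), end-exclusive
Image = Sequence[Sequence[int]]   # toy grayscale image: rows of pixel values


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by `box` out of the image."""
    top, left, bottom, right = box
    return [list(row[left:right]) for row in image[top:bottom]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[int]]], str],
) -> List[Tuple[Box, str]]:
    """Detect candidate regions, then classify each cropped region."""
    return [(box, classify(crop(image, box))) for box in detect(image)]


def toy_detect(image: Image) -> List[Box]:
    """Stand-in detector: always proposes two fixed 2x2 regions."""
    return [(0, 0, 2, 2), (2, 2, 4, 4)]


def toy_classify(patch: List[List[int]]) -> str:
    """Stand-in classifier: labels a patch by its mean brightness."""
    total = sum(sum(row) for row in patch)
    count = sum(len(row) for row in patch)
    return "bright" if total / count > 127 else "dark"


image = [
    [255, 255, 0, 0],
    [255, 255, 0, 0],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
results = recognize(image, toy_detect, toy_classify)
for box, label in results:
    print(box, label)
```

Keeping the detector and classifier as separate callables mirrors the way the two component tasks are benchmarked independently.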

Papers

Showing 101–150 of 2042 papers

| Title | Status | Hype |
|---|---|---|
| Comparing Photorealism in Game Engines for Synthetic Maritime Computer Vision Datasets | | 0 |
| LRSAA: Large-scale Remote Sensing Image Target Recognition and Automatic Annotation | Code | 1 |
| Fine-Grained Open-Vocabulary Object Recognition via User-Guided Segmentation | | 0 |
| ViSTa Dataset: Do vision-language models understand sequential tasks? | Code | 0 |
| Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline | Code | 3 |
| Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning | Code | 1 |
| LightFFDNets: Lightweight Convolutional Neural Networks for Rapid Facial Forgery Detection | | 0 |
| Value-Spectrum: Quantifying Preferences of Vision-Language Models via Value Decomposition in Social Media Contexts | Code | 0 |
| Multiscale Dubuc: A New Similarity Measure for Time Series | Code | 0 |
| Long-Tailed Object Detection Pre-training: Dynamic Rebalancing Contrastive Learning with Dual Reconstruction | | 0 |
| DipMe: Haptic Recognition of Granular Media for Tangible Interactive Applications | | 0 |
| Large-scale Remote Sensing Image Target Recognition and Automatic Annotation | Code | 1 |
| Hidden in Plain Sight: Evaluating Abstract Shape Recognition in Vision-Language Models | Code | 0 |
| Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream | Code | 0 |
| Object Recognition in Human Computer Interaction:- A Comparative Analysis | | 0 |
| Lost in Context: The Influence of Context on Feature Attribution Methods for Object Recognition | Code | 0 |
| Learning Where to Edit Vision Transformers | Code | 0 |
| Active Gaze Behavior Boosts Self-Supervised Object Learning | | 0 |
| Investigating the Gestalt Principle of Closure in Deep Convolutional Neural Networks | Code | 0 |
| Unsupervised Object Discovery: A Comprehensive Survey and Unified Taxonomy | | 0 |
| Training the Untrainable: Introducing Inductive Bias via Representational Alignment | | 0 |
| Few-shot target-driven instance detection based on open-vocabulary object detection models | | 0 |
| MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1 |
| Development of Image Collection Method Using YOLO and Siamese Network | | 0 |
| big.LITTLE Vision Transformer for Efficient Visual Recognition | | 0 |
| ChartKG: A Knowledge-Graph-Based Representation for Chart Images | | 0 |
| Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts | | 0 |
| DAAL: Density-Aware Adaptive Line Margin Loss for Multi-Modal Deep Metric Learning | Code | 0 |
| MVP-Bench: Can Large Vision-Language Models Conduct Multi-level Visual Perception Like Humans? | Code | 0 |
| Fast Object Detection with a Machine Learning Edge Device | | 0 |
| DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation | Code | 1 |
| CSIM: A Copula-based similarity index sensitive to local changes for Image quality assessment | Code | 1 |
| Perceptual Piercing: Human Visual Cue-based Object Detection in Low Visibility Conditions | Code | 0 |
| Can We Remove the Ground? Obstacle-aware Point Cloud Compression for Remote Object Detection | | 0 |
| Semantic Segmentation of Unmanned Aerial Vehicle Remote Sensing Images using SegFormer | | 0 |
| You Only Speak Once to See | | 0 |
| Enhancing Crime Scene Investigations through Virtual Reality and Deep Learning Techniques | | 0 |
| AI-Powered Augmented Reality for Satellite Assembly, Integration and Test | | 0 |
| SeqNet: Sequential Networks for One-Shot Traffic Sign Recognition With Transfer Learning | Code | 0 |
| Formula-Supervised Visual-Geometric Pre-training | | 0 |
| EventDance++: Language-guided Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition | | 0 |
| A dynamic vision sensor object recognition model based on trainable event-driven convolution and spiking attention mechanism | | 0 |
| Benchmarking VLMs' Reasoning About Persuasive Atypical Images | | 0 |
| Do Pre-trained Vision-Language Models Encode Object States? | Code | 0 |
| Can Large Language Models Grasp Event Signals? Exploring Pure Zero-Shot Event-based Recognition | Code | 0 |
| Label Convergence: Defining an Upper Performance Bound in Object Recognition through Contradictory Annotations | Code | 0 |
| Generalization Boosted Adapter for Open-Vocabulary Segmentation | | 0 |
| Performance Assessment of Feature Detection Methods for 2-D FS Sonar Imagery | | 0 |
| A Bayesian Framework for Active Tactile Object Recognition, Pose Estimation and Shape Transfer Learning | | 0 |
| Fast Deep Predictive Coding Networks for Videos Feature Extraction without Labels | | 0 |
Page 3 of 41

Benchmark Results

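| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Imagen | shape bias | 98.7 | | Unverified |
| 2 | Stable Diffusion | shape bias | 92.7 | | Unverified |
| 3 | Parti | shape bias | 91.7 | | Unverified |
| 4 | ViT-22B-384 | shape bias | 86.4 | | Unverified |
| 5 | ViT-22B-560 | shape bias | 83.8 | | Unverified |
| 6 | CLIP (ViT-B) | shape bias | 79.9 | | Unverified |
| 7 | ViT-22B-224 | shape bias | 78 | | Unverified |
| 8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | | Unverified |
| 9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | | Unverified |
| 10 | SWSL (ResNeXt-101) | shape bias | 49.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 85.55 | | Unverified |
| 2 | SSNN | Accuracy (%) | 78.57 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 85.62 | | Unverified |
| 2 | SSNN | Accuracy (%) | 79.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | | Unverified |
| 2 | yun | Top 5 Accuracy | 14.75 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | DY | Top 5 Accuracy | 0.08 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | AJ2021 | Top 5 Accuracy | 27.68 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SSNN | Accuracy (%) | 94.91 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Faster-RCNN | mAP | 30.39 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 96 | | Unverified |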