SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, the state-of-the-art tables are recorded separately for each component task.
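The composition described above can be sketched in a few lines of Python. This is only an illustrative outline, not the method of any paper or library listed on this page: `recognize`, `toy_detect`, and `toy_classify` are hypothetical names, and the toy detector and classifier stand in for real trained models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), end-exclusive
Image = Sequence[Sequence[int]]   # toy grayscale image: rows of pixel values


@dataclass
class Recognition:
    box: Box
    label: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region covered by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[int]]], str],
) -> List[Recognition]:
    """Detection proposes boxes; classification labels each cropped box."""
    return [Recognition(box, classify(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for real models (assumptions for illustration only).
def toy_detect(image: Image) -> List[Box]:
    """Return one bounding box around all non-zero pixels, if there are any."""
    coords = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    if not coords:
        return []
    xs, ys = zip(*coords)
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_classify(patch: List[List[int]]) -> str:
    """Label a patch by its mean intensity."""
    mean = sum(map(sum, patch)) / (len(patch) * len(patch[0]))
    return "bright" if mean > 127 else "dark"


image = [
    [0, 0, 0, 0],
    [0, 200, 220, 0],
    [0, 210, 0, 0],
]
results = recognize(image, toy_detect, toy_classify)
print(results)
```

Keeping the detector and classifier as separate callables mirrors how the two component tasks are benchmarked independently: either stage can be swapped without touching the other.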

(Image credit: TensorFlow Object Detection API)

Papers

Showing 1051-1100 of 2042 papers

| Title | Status | Hype |
|---|---|---|
| 'Part'ly first among equals: Semantic part-based benchmarking for state-of-the-art object recognition systems | | 0 |
| PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database | | 0 |
| PathTrack: Fast Trajectory Annotation with Path Supervision | | 0 |
| PennSyn2Real: Training Object Recognition Models without Human Labeling | | 0 |
| People infer recursive visual concepts from just a few examples | | 0 |
| Perceptual Inductive Bias Is What You Need Before Contrastive Learning | | 0 |
| Perceptual Organization and Recognition of Indoor Scenes from RGB-D Images | | 0 |
| Psychophysical-Score: A Behavioral Measure for Assessing the Biological Plausibility of Visual Recognition Models | | 0 |
| PERCH: Perception via Search for Multi-Object Recognition and Localization | | 0 |
| Performance Assessment of Feature Detection Methods for 2-D FS Sonar Imagery | | 0 |
| Performance comparison of 3D correspondence grouping algorithm for 3D plant point clouds | | 0 |
| Performance Evaluation of 3D Correspondence Grouping Algorithms | | 0 |
| Performance Evaluation of Learned 3D Features | | 0 |
| Performance Evaluation of Raster Based Shape Vectors in Object Recognition | | 0 |
| Performance of object recognition in wearable videos | | 0 |
| Performance-optimized deep neural networks are evolving into worse models of inferotemporal visual cortex | | 0 |
| Periocular Recognition Using CNN Features Off-the-Shelf | | 0 |
| Persistence-based Structural Recognition | | 0 |
| Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning | | 0 |
| Physics-based Scene-level Reasoning for Object Pose Estimation in Clutter | | 0 |
| PiCoDes: Learning a Compact Code for Novel-Category Recognition | | 0 |
| PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Edge-Preserving Coherence | | 0 |
| PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Spatial Priors | | 0 |
| Pixel personality for dense object tracking in a 2D honeybee hive | | 0 |
| Pixels to Voxels: Modeling Visual Representation in the Human Brain | | 0 |
| Pixel-wise Segmentation of Street with Neural Networks | | 0 |
| Place recognition survey: An update on deep learning approaches | | 0 |
| PLAICraft: Large-Scale Time-Aligned Vision-Speech-Action Dataset for Embodied AI | | 0 |
| Point Cloud Sampling via Graph Balancing and Gershgorin Disc Alignment | | 0 |
| Pointing Novel Objects in Image Captioning | | 0 |
| Polyhedral Object Recognition by Indexing | | 0 |
| Poseidon: A System Architecture for Efficient GPU-based Deep Learning on Multiple Machines | | 0 |
| Pose-Invariant Object Recognition for Event-Based Vision with Slow-ELM | | 0 |
| Positive-Unlabeled Domain Adaptation | | 0 |
| Pragmatic descriptions of perceptual stimuli | | 0 |
| Predicting beauty, liking, and aesthetic quality: A comparative analysis of image databases for visual aesthetics research | | 0 |
| Predicting the Road Ahead: A Knowledge Graph based Foundation Model for Scene Understanding in Autonomous Driving | | 0 |
| Predicting When Saliency Maps Are Accurate and Eye Fixations Consistent | | 0 |
| Pre-Trained Convolutional Neural Network Features for Facial Expression Recognition | | 0 |
| Proactive Adversarial Defense: Harnessing Prompt Tuning in Vision-Language Models to Detect Unseen Backdoored Images | | 0 |
| Importance Filtered Cross-Domain Adaptation | | 0 |
| Procedural Text Generation from an Execution Video | | 0 |
| Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks | | 0 |
| Projection: A Mechanism for Human-like Reasoning in Artificial Intelligence | | 0 |
| PROTOTYPE-ASSISTED ADVERSARIAL LEARNING FOR UNSUPERVISED DOMAIN ADAPTATION | | 0 |
| Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks | | 0 |
| PseudoTouch: Efficiently Imaging the Surface Feel of Objects for Robotic Manipulation | | 0 |
| Putting GPT-4o to the Sword: A Comprehensive Evaluation of Language, Vision, Speech, and Multimodal Proficiency | | 0 |
| Quantifying Adversarial Sensitivity of a Model as a Function of the Image Distribution | | 0 |
| Quantifying Translation-Invariance in Convolutional Neural Networks | | 0 |
Page 22 of 41

Benchmark Results

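| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Imagen | shape bias | 98.7 | | Unverified |
| 2 | Stable Diffusion | shape bias | 92.7 | | Unverified |
| 3 | Parti | shape bias | 91.7 | | Unverified |
| 4 | ViT-22B-384 | shape bias | 86.4 | | Unverified |
| 5 | ViT-22B-560 | shape bias | 83.8 | | Unverified |
| 6 | CLIP (ViT-B) | shape bias | 79.9 | | Unverified |
| 7 | ViT-22B-224 | shape bias | 78 | | Unverified |
| 8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | | Unverified |
| 9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | | Unverified |
| 10 | SWSL (ResNeXt-101) | shape bias | 49.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 85.55 | | Unverified |
| 2 | SSNN | Accuracy (%) | 78.57 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 85.62 | | Unverified |
| 2 | SSNN | Accuracy (%) | 79.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | | Unverified |
| 2 | yun | Top 5 Accuracy | 14.75 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | DY | Top 5 Accuracy | 0.08 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | AJ2021 | Top 5 Accuracy | 27.68 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SSNN | Accuracy (%) | 94.91 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Faster-RCNN | mAP | 30.39 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 96 | | Unverified |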