SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, the state-of-the-art tables are recorded separately under each of those component tasks.

(Image credit: TensorFlow Object Detection API)
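
As a rough illustration of the "detection plus classification" framing above, the sketch below chains a detector and a classifier: the detector proposes boxes, and each cropped region is then labeled. Everything here is hypothetical and for illustration only: the names `recognize`, `toy_detect`, and `toy_classify`, the `(x_min, y_min, x_max, y_max)` box convention, and the toy stand-ins, which are not real models or any library's API.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Hypothetical conventions, chosen only for this sketch.
Box = Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max), max is exclusive
Image = Sequence[Sequence[int]]      # 2D grid of grayscale pixel intensities


@dataclass
class Recognition:
    box: Box
    label: str


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Recognition]:
    """Run detection first, then classify each detected region."""
    return [Recognition(box, classify(crop(image, box))) for box in detect(image)]


def toy_detect(image: Image) -> List[Box]:
    """Toy stand-in for a detector: one box around all non-zero pixels."""
    coords = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_classify(patch: Image) -> str:
    """Toy stand-in for a classifier: label a patch by its mean intensity."""
    pixels = [v for row in patch for v in row]
    return "bright" if sum(pixels) / len(pixels) > 127 else "dark"


if __name__ == "__main__":
    image = [
        [0, 0, 0, 0],
        [0, 200, 220, 0],
        [0, 210, 230, 0],
        [0, 0, 0, 0],
    ]
    print(recognize(image, toy_detect, toy_classify))
```

In practice both callables would be replaced by trained models; the point of the sketch is only that the two component tasks compose, which is why their benchmarks can be tracked separately.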

Papers

Showing 1701-1750 of 2042 papers

Title | Status | Hype
Does resistance to style-transfer equal Global Shape Bias? Measuring network sensitivity to global shape configuration | Code | 0
MARVIS: Motion & Geometry Aware Real and Virtual Image Segmentation | Code | 0
Genetic CNN | Code | 0
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images | Code | 0
MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects | Code | 0
Geometric and Textural Augmentation for Domain Gap Reduction | Code | 0
MASSeg : 2nd Technical Report for 4th PVUW MOSE Track | Code | 0
Geometry-Based Region Proposals for Real-Time Robot Detection of Tabletop Objects | Code | 0
CNN Fixations: An unraveling approach to visualize the discriminative image regions | Code | 0
Mixed Evidence for Gestalt Grouping in Deep Neural Networks | Code | 0
Generalisation in humans and deep neural networks | Code | 0
Deep Cross Residual Learning for Multitask Visual Recognition | Code | 0
Global Second-order Pooling Convolutional Networks | Code | 0
A Domain Guided CNN Architecture for Predicting Age from Structural Brain Images | Code | 0
GAANet: Ghost Auto Anchor Network for Detecting Varying Size Drones in Dark | Code | 0
DeepCorrect: Correcting DNN models against Image Distortions | Code | 0
Deep Co-Occurrence Feature Learning for Visual Object Recognition | Code | 0
Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data | Code | 0
Beyond accuracy: quantifying trial-by-trial behaviour of CNNs and humans by measuring error consistency | Code | 0
The Freiburg Groceries Dataset | Code | 0
Deep Competitive Pathway Networks | Code | 0
Decision-making and control with diffractive optical networks | Code | 0
Memory Aware Synapses: Learning what (not) to forget | Code | 0
DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition | Code | 0
Grasp Pre-shape Selection by Synthetic Training: Eye-in-hand Shared Control on the Hannes Prosthesis | Code | 0
DAAL: Density-Aware Adaptive Line Margin Loss for Multi-Modal Deep Metric Learning | Code | 0
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents | Code | 0
Grid Cell Path Integration For Movement-Based Visual Object Recognition | Code | 0
Grounded Human-Object Interaction Hotspots from Video | Code | 0
Adapting Deep Network Features to Capture Psychological Representations | Code | 0
Verbalized Representation Learning for Interpretable Few-Shot Generalization | Code | 0
Cutting the Error by Half: Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification | Code | 0
PCANet: A Simple Deep Learning Baseline for Image Classification? | Code | 0
FPNN: Field Probing Neural Networks for 3D Data | Code | 0
Do Deep Neural Networks Suffer from Crowding? | Code | 0
Do deep nets really need weight decay and dropout? | Code | 0
Diverse, Difficult, and Odd Instances (D2O): A New Test Set for Object Classification | Code | 0
CURE-OR: Challenging Unreal and Real Environments for Object Recognition | Code | 0
Handwritten Bangla Character Recognition Using The State-of-Art Deep Convolutional Neural Networks | Code | 0
MindSet: Vision. A toolbox for testing DNNs on key psychological experiments | Code | 0
Task-generalizable Adversarial Attack based on Perceptual Metric | Code | 0
Adding Knowledge to Unsupervised Algorithms for the Recognition of Intent | Code | 0
Video to Events: Recycling Video Datasets for Event Cameras | Code | 0
Foveation in the Era of Deep Learning | Code | 0
HD-CNN: Hierarchical Deep Convolutional Neural Network for Large Scale Visual Recognition | Code | 0
What does LIME really see in images? | Code | 0
Sample Correlation for Fingerprinting Deep Face Recognition | Code | 0
MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence | Code | 0
Perceptual Piercing: Human Visual Cue-based Object Detection in Low Visibility Conditions | Code | 0
Hidden in Plain Sight: Evaluating Abstract Shape Recognition in Vision-Language Models | Code | 0
Page 35 of 41

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 |  | Unverified
2 | Stable Diffusion | shape bias | 92.7 |  | Unverified
3 | Parti | shape bias | 91.7 |  | Unverified
4 | ViT-22B-384 | shape bias | 86.4 |  | Unverified
5 | ViT-22B-560 | shape bias | 83.8 |  | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 |  | Unverified
7 | ViT-22B-224 | shape bias | 78 |  | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 |  | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 |  | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 |  | Unverified
2 | SSNN | Accuracy (%) | 78.57 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 |  | Unverified
2 | SSNN | Accuracy (%) | 79.25 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 |  | Unverified
2 | yun | Top 5 Accuracy | 14.75 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 |  | Unverified
2 | DY | Top 5 Accuracy | 0.08 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 |  | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 |  | Unverified