SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, the state-of-the-art tables are recorded separately under each component task.
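As a rough illustration of the "detection plus classification" framing, the sketch below chains a detector and a classifier over a tiny binary image. Both `toy_detector` and `toy_classifier` are hypothetical stand-ins invented for this example (a flood-fill blob finder and an aspect-ratio rule), not any model from the tables; a real system would swap in trained networks behind the same two-step interface.

```python
from typing import Callable, List, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]
Image = List[List[int]]


def toy_detector(image: Image) -> List[Box]:
    """Hypothetical detector: one box per 4-connected blob of nonzero pixels."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes: List[Box] = []
    for r in range(h):
        for c in range(w):
            if image[r][c] == 0 or seen[r][c]:
                continue
            # Flood-fill the blob and track its extent.
            stack = [(r, c)]
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while stack:
                y, x = stack.pop()
                top, left = min(top, y), min(left, x)
                bottom, right = max(bottom, y), max(right, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < h and 0 <= nx < w and image[ny][nx] != 0 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((top, left, bottom + 1, right + 1))
    return boxes


def toy_classifier(crop: Image) -> str:
    """Hypothetical classifier: labels a crop by its aspect ratio only."""
    h, w = len(crop), len(crop[0])
    if w > h:
        return "wide"
    if h > w:
        return "tall"
    return "square"


def recognize(
    image: Image,
    detector: Callable[[Image], List[Box]],
    classifier: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Step 1: localize objects. Step 2: classify each cropped region."""
    results = []
    for top, left, bottom, right in detector(image):
        crop = [row[left:right] for row in image[top:bottom]]
        results.append(((top, left, bottom, right), classifier(crop)))
    return results


image = [
    [1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 0],
]
results = recognize(image, toy_detector, toy_classifier)
for box, label in results:
    print(box, label)
```

Keeping the detector and classifier as separate callables mirrors the way the page splits the task into two component leaderboards: either stage can be replaced or evaluated on its own.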

(Image credit: TensorFlow Object Detection API)

Papers

Showing 101-150 of 2042 papers

Title | Status | Hype
Learning Counterfactually Invariant Predictors | Code | 1
Contributions of Shape, Texture, and Color in Visual Recognition | Code | 1
Learning Iterative Reasoning through Energy Minimization | Code | 1
Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency | Code | 1
Sparse Mixture-of-Experts are Domain Generalizable Learners | Code | 1
ProxyMix: Proxy-based Mixup Training with Label Refinery for Source-Free Domain Adaptation | Code | 1
Causal Transportability for Visual Recognition | Code | 1
Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification | Code | 1
Ev-TTA: Test-Time Adaptation for Event-Based Object Recognition | Code | 1
DetMatch: Two Teachers are Better Than One for Joint 2D and 3D Semi-Supervised Object Detection | Code | 1
Debiased Self-Training for Semi-Supervised Learning | Code | 1
SafePicking: Learning Safe Object Extraction via Object-Level Mapping | Code | 1
Rethinking the Two-Stage Framework for Grounded Situation Recognition | Code | 1
Implicit Feature Refinement for Instance Segmentation | Code | 1
PartImageNet: A Large, High-Quality Dataset of Parts | Code | 1
N-ImageNet: Towards Robust, Fine-Grained Object Recognition with Event Cameras | Code | 1
Neural Regression, Representational Similarity, Model Zoology & Neural Taskonomy at Scale in Rodent Visual Cortex | Code | 1
The Norm Must Go On: Dynamic Unsupervised Domain Adaptation by Normalization | Code | 1
TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs | Code | 1
EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation | Code | 1
IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning | Code | 1
Explainability-Aware One Point Attack for Point Cloud Neural Networks | Code | 1
Voxel Transformer for 3D Object Detection | Code | 1
Patchwork: Concentric Zone-based Region-wise Ground Segmentation with Ground Likelihood Estimation Using a 3D LiDAR Sensor | Code | 1
On the Challenges of Open World Recognition under Shifting Visual Domains | Code | 1
Mind Your Outliers! Investigating the Negative Impact of Outliers on Active Learning for Visual Question Answering | Code | 1
Hebbian learning with gradients: Hebbian convolutional neural networks with modern deep learning frameworks | Code | 1
Wasserstein Barycenter for Multi-Source Domain Adaptation | Code | 1
Deep Subdomain Adaptation Network for Image Classification | Code | 1
Self-Supervised Learning with Kernel Dependence Maximization | Code | 1
Partial success in closing the gap between human and machine vision | Code | 1
Person Re-Identification with a Locally Aware Transformer | Code | 1
Convolutional Neural Networks with Gated Recurrent Connections | Code | 1
DOCTOR: A Simple Method for Detecting Misclassification Errors | Code | 1
Superpixel-based Knowledge Infusion in Deep Neural Networks for Image Classification | Code | 1
Are Convolutional Neural Networks or Transformers more like human vision? | Code | 1
This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks | Code | 1
RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition | Code | 1
ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition | Code | 1
F-SIOL-310: A Robotic Dataset and Benchmark for Few-Shot Incremental Object Learning | Code | 1
A Study of Face Obfuscation in ImageNet | Code | 1
PatchNet -- Short-range Template Matching for Efficient Video Processing | Code | 1
FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery | Code | 1
Contemplating real-world object classification | Code | 1
Comparison of semi-supervised deep learning algorithms for audio classification | Code | 1
Sill-Net: Feature Augmentation with Separated Illumination Representation | Code | 1
Using Shape to Categorize: Low-Shot Learning with an Explicit Shape Bias | Code | 1
Self-Supervised Pretraining of 3D Features on any Point-Cloud | Code | 1
Adaptive Threshold for Online Object Recognition and Re-identification Tasks | Code | 1
Source Data-absent Unsupervised Domain Adaptation through Hypothesis Transfer and Labeling Transfer | Code | 1
Page 3 of 41

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 | | Unverified
2 | Stable Diffusion | shape bias | 92.7 | | Unverified
3 | Parti | shape bias | 91.7 | | Unverified
4 | ViT-22B-384 | shape bias | 86.4 | | Unverified
5 | ViT-22B-560 | shape bias | 83.8 | | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 | | Unverified
7 | ViT-22B-224 | shape bias | 78 | | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 | | Unverified
2 | SSNN | Accuracy (%) | 78.57 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 | | Unverified
2 | SSNN | Accuracy (%) | 79.25 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | | Unverified
2 | yun | Top 5 Accuracy | 14.75 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified
2 | DY | Top 5 Accuracy | 0.08 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 | | Unverified
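
Several of the tables above report Top 5 Accuracy. For readers unfamiliar with the metric, the sketch below shows one common way to compute it: a prediction counts as correct when the true label is among the k highest-scoring classes. The function name `top_k_accuracy`, the example scores, and the choice to report a percentage are assumptions made for illustration; the page does not state how each submission computed its numbers.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Percentage of samples whose true label is among the k top-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Made-up scores for 4 samples over 6 classes (illustrative only).
scores = [
    [0.05, 0.10, 0.50, 0.20, 0.10, 0.05],
    [0.30, 0.25, 0.20, 0.15, 0.08, 0.02],
    [0.10, 0.40, 0.03, 0.15, 0.20, 0.12],
    [0.02, 0.08, 0.15, 0.20, 0.25, 0.30],
]
labels = [2, 5, 0, 5]

top5 = top_k_accuracy(scores, labels, k=5)
top1 = top_k_accuracy(scores, labels, k=1)
print(f"top-5: {top5:.2f}%  top-1: {top1:.2f}%")
```

Top-5 is always at least as high as top-1 on the same predictions, which is worth keeping in mind when comparing rows that use different accuracy metrics.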