SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, state-of-the-art results are tracked separately for each component task.
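The two-stage view described above (detect regions, then classify each region) can be sketched in a few lines of Python. This is only an illustrative sketch: `recognize`, `toy_detect`, and `toy_classify` are hypothetical names that do not come from any library, and the toy detector and classifier stand in for real models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]        # (x_min, y_min, x_max, y_max), max exclusive
Image = Sequence[Sequence[float]]      # toy grayscale image: rows of pixel values


@dataclass
class Recognition:
    box: Box
    label: str


def crop(image: Image, box: Box) -> List[List[float]]:
    """Cut the region described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [list(row[x0:x1]) for row in image[y0:y1]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[float]]], str],
) -> List[Recognition]:
    """Detection proposes boxes; classification labels each cropped box."""
    return [Recognition(box, classify(crop(image, box))) for box in detect(image)]


def toy_detect(image: Image) -> List[Box]:
    """Hypothetical detector: one box around all non-zero pixels, if any."""
    coords = [(x, y) for y, row in enumerate(image) for x, v in enumerate(row) if v > 0]
    if not coords:
        return []
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return [(min(xs), min(ys), max(xs) + 1, max(ys) + 1)]


def toy_classify(patch: List[List[float]]) -> str:
    """Hypothetical classifier: label a patch by its mean brightness."""
    total = sum(sum(row) for row in patch)
    count = sum(len(row) for row in patch)
    return "bright" if total / count > 0.5 else "dark"


image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
results = recognize(image, toy_detect, toy_classify)
print(results)
```

In a real system the two callables would be a trained detector and a trained classifier; keeping them as separate arguments mirrors the way the component tasks are benchmarked separately.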

(Image credit: TensorFlow Object Detection API)

Papers

Showing 201-250 of 2042 papers

| Title | Status | Hype |
|---|---|---|
| Person Re-Identification with a Locally Aware Transformer | Code | 1 |
| CLIP-guided Federated Learning on Heterogeneous and Long-Tailed Data | Code | 1 |
| AdaNorm: Adaptive Gradient Norm Correction based Optimizer for CNNs | Code | 1 |
| Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models | Code | 1 |
| Recognize Any Regions | Code | 1 |
| Relation Networks for Object Detection | Code | 1 |
| Causal Transportability for Visual Recognition | Code | 1 |
| CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models | Code | 1 |
| SafePicking: Learning Safe Object Extraction via Object-Level Mapping | Code | 1 |
| BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video | Code | 1 |
| Brain-Score: Which Artificial Neural Network for Object Recognition is most Brain-Like? | Code | 1 |
| Category-Prompt Refined Feature Learning for Long-Tailed Multi-Label Image Classification | Code | 1 |
| Self-Supervised Learning with Kernel Dependence Maximization | Code | 1 |
| Self-Supervised Linear Motion Deblurring | Code | 1 |
| Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency | Code | 1 |
| Sill-Net: Feature Augmentation with Separated Illumination Representation | Code | 1 |
| Single Shot MC Dropout Approximation | Code | 1 |
| Comics Datasets Framework: Mix of Comics datasets for detection benchmarking | Code | 1 |
| COTR: Compact Occupancy TRansformer for Vision-based 3D Occupancy Prediction | Code | 1 |
| FSD: Fast Self-Supervised Single RGB-D to Categorical 3D Objects | Code | 1 |
| On the Element-Wise Representation and Reasoning in Zero-Shot Image Recognition: A Systematic Survey | Code | 1 |
| An Adaptive Sampling Scheme to Efficiently Train Fully Convolutional Networks for Semantic Segmentation | | 0 |
| An Adaptive Descriptor Design for Object Recognition in the Wild | | 0 |
| A biologically plausible network for the computation of orientation dominance | | 0 |
| PCA-RECT: An Energy-efficient Object Detection Approach for Event Cameras | | 0 |
| A Variational Feature Encoding Method of 3D Object for Probabilistic Semantic SLAM | | 0 |
| Boosting Object Recognition in Point Clouds by Saliency Detection | | 0 |
| Boosting with Maximum Adaptive Sampling | | 0 |
| A Multi-purpose Realistic Haze Benchmark with Quantifiable Haze Levels and Ground Truth | | 0 |
| Automatic Dataset Augmentation | | 0 |
| A Multiclass Boosting Framework for Achieving Fast and Provable Adversarial Robustness | | 0 |
| Amplitude-Based Approach to Evidence Accumulation | | 0 |
| Amodal Completion and Size Constancy in Natural Scenes | | 0 |
| A Benchmark Grocery Dataset of Realworld Point Clouds From Single View | | 0 |
| A Light and Smart Wearable Platform with Multimodal Foundation Model for Enhanced Spatial Reasoning in People with Blindness and Low Vision | | 0 |
| BORDER: An Oriented Rectangles Approach to Texture-Less Object Recognition | | 0 |
| Auditing ImageNet: Towards a Model-driven Framework for Annotating Demographic Attributes of Large-Scale Image Datasets | | 0 |
| Audiovisual Highlight Detection in Videos | | 0 |
| Augmenting Image Annotation: A Human-LMM Collaborative Framework for Efficient Object Selection and Label Generation | | 0 |
| Augmenting Strong Supervision Using Web Data for Fine-Grained Categorization | | 0 |
| A Unifying Framework in Vector-valued Reproducing Kernel Hilbert Spaces for Manifold Regularization and Co-Regularized Multi-view Learning | | 0 |
| Automatically Discovering Local Visual Material Attributes | | 0 |
| Adaptive Object Detection with Dual Multi-Label Prediction | | 0 |
| Automatic Graphic Logo Detection via Fast Region-based Convolutional Networks | | 0 |
| Automatic Ultrasound Image Segmentation of Supraclavicular Nerve Using Dilated U-Net Deep Learning Architecture | | 0 |
| Autonomous Manipulation Learning for Similar Deformable Objects via Only One Demonstration | | 0 |
| AU Dataset for Visuo-Haptic Object Recognition for Robots | | 0 |
| Background Invariance Testing According to Semantic Proximity | | 0 |
| A Multisensory Learning Architecture for Rotation-invariant Object Recognition | | 0 |
| ATZSL: Defensive Zero-Shot Recognition in the Presence of Adversaries | | 0 |

Benchmark Results

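| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Imagen | shape bias | 98.7 | | Unverified |
| 2 | Stable Diffusion | shape bias | 92.7 | | Unverified |
| 3 | Parti | shape bias | 91.7 | | Unverified |
| 4 | ViT-22B-384 | shape bias | 86.4 | | Unverified |
| 5 | ViT-22B-560 | shape bias | 83.8 | | Unverified |
| 6 | CLIP (ViT-B) | shape bias | 79.9 | | Unverified |
| 7 | ViT-22B-224 | shape bias | 78 | | Unverified |
| 8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | | Unverified |
| 9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | | Unverified |
| 10 | SWSL (ResNeXt-101) | shape bias | 49.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 85.55 | | Unverified |
| 2 | SSNN | Accuracy (%) | 78.57 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 85.62 | | Unverified |
| 2 | SSNN | Accuracy (%) | 79.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | | Unverified |
| 2 | yun | Top 5 Accuracy | 14.75 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | DY | Top 5 Accuracy | 0.08 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | AJ2021 | Top 5 Accuracy | 27.68 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SSNN | Accuracy (%) | 94.91 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Faster-RCNN | mAP | 30.39 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Spike-VGG11 | Accuracy (%) | 96 | | Unverified |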
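
Several of the tables above report Top 5 Accuracy. As a rough illustration of how such a number is typically computed, the sketch below counts a prediction as correct when the true class is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are assumptions made for this example, and the result is returned as a percentage as an illustrative choice; the evaluation code used by the listed submissions may differ.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Percentage of samples whose true label is among the k top-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in ranked
    return 100.0 * hits / len(labels)


# Toy example: 4 samples, 6 classes (made-up scores).
scores = [
    [0.10, 0.50, 0.20, 0.05, 0.10, 0.05],
    [0.30, 0.10, 0.25, 0.20, 0.10, 0.05],
    [0.05, 0.10, 0.15, 0.20, 0.20, 0.30],
    [0.40, 0.20, 0.15, 0.10, 0.10, 0.05],
]
labels = [1, 5, 2, 0]

print(top_k_accuracy(scores, labels, k=5))  # 75.0
print(top_k_accuracy(scores, labels, k=1))  # 50.0
```

With k=1 the same function gives ordinary (top-1) accuracy, which is why top-5 figures are always at least as high as top-1 figures on the same predictions.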