SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because the task combines object detection with image classification, the state-of-the-art tables are recorded separately for each component task.

(Image credit: TensorFlow Object Detection API)

Papers

Showing 1401-1450 of 2042 papers

Title | Status | Hype
Using Spatial Pooler of Hierarchical Temporal Memory to classify noisy videos with predefined complexity |  | 0
Using Web Co-occurrence Statistics for Improving Image Categorization |  | 0
Utility-Oriented Underwater Image Quality Assessment Based on Transfer Learning |  | 0
UWN: A Large Multilingual Lexical Knowledge Base |  | 0
UW-NET: AN INCEPTION-ATTENTION NETWORK FOR UNDERWATER IMAGE CLASSIFICATION |  | 0
V1Net: A computational model of cortical horizontal connections |  | 0
V^2R-Bench: Holistically Evaluating LVLM Robustness to Fundamental Visual Variations |  | 0
V2X Cooperative Perception for Autonomous Driving: Recent Advances and Challenges |  | 0
Variable-Viewpoint Representations for 3D Object Recognition |  | 0
Variant Parallelism: Lightweight Deep Convolutional Models for Distributed Inference on IoT Devices |  | 0
Variation of Gender Biases in Visual Recognition Models Before and After Finetuning |  | 0
VDM-DA: Virtual Domain Modeling for Source Data-free Domain Adaptation |  | 0
Ventral-Dorsal Neural Networks: Object Detection via Selective Attention |  | 0
VERAM: View-Enhanced Recurrent Attention Model for 3D Shape Classification |  | 0
ViEEG: Hierarchical Neural Coding with Cross-Modal Progressive Enhancement for EEG-Based Visual Decoding |  | 0
View-Invariant Template Matching Using Homography Constraints |  | 0
View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation |  | 0
VisBuddy -- A Smart Wearable Assistant for the Visually Challenged |  | 0
Vision at A Glance: Interplay between Fine and Coarse Information Processing Pathways |  | 0
Vision-Based Layout Detection from Scientific Literature using Recurrent Convolutional Neural Networks |  | 0
Vision Transformer with Convolutions Architecture Search |  | 0
Visual7W: Grounded Question Answering in Images |  | 0
Visual Classifier Prediction by Distributional Semantic Embedding of Text Descriptions |  | 0
Visual Features for Linguists: Basic image analysis techniques for multimodally-curious NLPers |  | 0
Visual Ground Truth Construction as Faceted Classification |  | 0
Visual Language Models show widespread visual deficits on neuropsychological tests |  | 0
Visual recognition in the wild by sampling deep similarity functions |  | 0
Visual Relationship Detection using Scene Graphs: A Survey |  | 0
Visual Sentiment Prediction with Deep Convolutional Neural Networks |  | 0
Visuo-Haptic Object Perception for Robots: An Overview |  | 0
ViTCN: Vision Transformer Contrastive Network For Reasoning |  | 0
Volumetric Convolution: Automatic Representation Learning in Unit Ball |  | 0
VSEM: An open library for visual semantics representation |  | 0
Warping of Radar Data into Camera Image for Cross-Modal Supervision in Automotive Applications |  | 0
Wasserstein Dependency Measure for Representation Learning |  | 0
WaveTransform: Crafting Adversarial Examples via Input Decomposition |  | 0
Weakly Supervised Image Classification Through Noise Regularization |  | 0
Weakly Supervised Instance Attention for Multisource Fine-Grained Object Recognition with an Application to Tree Species Classification |  | 0
Weakly Supervised Learning of Mid-Level Features with Beta-Bernoulli Process Restricted Boltzmann Machines |  | 0
Weakly Supervised Localization using Deep Feature Maps |  | 0
Webly Supervised Semantic Embeddings for Large Scale Zero-Shot Learning |  | 0
Weighted Sigmoid Gate Unit for an Activation Function of Deep Neural Network |  | 0
What a difference a pixel makes: An empirical examination of features used by CNNs for categorisation |  | 0
What are the visual features underlying human versus machine vision? |  | 0
What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots |  | 0
What can we learn about CNNs from a large scale controlled object dataset? |  | 0
What deep learning can tell us about higher cognitive functions like mindreading? |  | 0
What do We Learn by Semantic Scene Understanding for Remote Sensing imagery in CNN framework? |  | 0
What is the Best Feature Learning Procedure in Hierarchical Recognition Architectures? |  | 0
What's in a Name? Beyond Class Indices for Image Recognition |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 |  | Unverified
2 | Stable Diffusion | shape bias | 92.7 |  | Unverified
3 | Parti | shape bias | 91.7 |  | Unverified
4 | ViT-22B-384 | shape bias | 86.4 |  | Unverified
5 | ViT-22B-560 | shape bias | 83.8 |  | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 |  | Unverified
7 | ViT-22B-224 | shape bias | 78 |  | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 |  | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 |  | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 |  | Unverified
2 | SSNN | Accuracy (%) | 78.57 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 |  | Unverified
2 | SSNN | Accuracy (%) | 79.25 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 |  | Unverified
2 | yun | Top 5 Accuracy | 14.75 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 |  | Unverified
2 | DY | Top 5 Accuracy | 0.08 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 |  | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 |  | Unverified