SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, state-of-the-art results are recorded separately for each of those component tasks.
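As a rough illustration of how the two component tasks compose, the sketch below chains a detector (which proposes boxes) with a classifier (which labels each cropped region). Everything in it is hypothetical: `recognize`, `toy_detect`, and `toy_classify` are stand-ins invented for this example and do not come from any library or from the papers listed on this page. A real system would plug trained models into the same two slots.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[float]]   # toy grayscale image: rows of pixel values
Box = Tuple[int, int, int, int]     # (top, left, bottom, right), end-exclusive


@dataclass
class Recognition:
    box: Box
    label: str


def crop(image: Image, box: Box) -> List[List[float]]:
    """Cut the region described by `box` out of the image."""
    top, left, bottom, right = box
    return [list(row[left:right]) for row in image[top:bottom]]


def recognize(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[List[List[float]]], str],
) -> List[Recognition]:
    """Detection proposes boxes; classification labels each cropped box."""
    return [Recognition(box, classify(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for trained models (assumptions, not real APIs).
def toy_detect(image: Image) -> List[Box]:
    """Return one box around all non-zero pixels, or no box if the image is empty."""
    coords = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v > 0]
    if not coords:
        return []
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return [(min(rows), min(cols), max(rows) + 1, max(cols) + 1)]


def toy_classify(patch: List[List[float]]) -> str:
    """Label a patch by its mean intensity."""
    mean = sum(map(sum, patch)) / (len(patch) * len(patch[0]))
    return "bright" if mean > 0.5 else "dim"


toy_image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0],
    [0.0, 0.7, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
results = recognize(toy_image, toy_detect, toy_classify)
```

Keeping the detector and classifier as separate callables mirrors the way results are tracked per component task: either stage can be swapped or evaluated on its own.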

(Image credit: TensorFlow Object Detection API)

Papers

Showing 1951–2000 of 2042 papers

| Title | Status | Hype |
| --- | --- | --- |
| EdgeOL: Efficient in-situ Online Learning on Edge Devices | | 0 |
| Effective Baselines for Multiple Object Rearrangement Planning in Partially Observable Mapped Environments | | 0 |
| Effects of Real-Life Traffic Sign Alteration on YOLOv7- an Object Recognition Model | | 0 |
| Efficient 2D-to-3D Correspondence Filtering for Scalable 3D Object Recognition | | 0 |
| Efficient Anomaly Detection Using Self-Supervised Multi-Cue Tasks | | 0 |
| Efficient Codebook and Factorization for Second Order Representation Learning | | 0 |
| Efficient Estimation of Regularized Tyler's M-Estimator Using Approximate LOOCV | | 0 |
| Efficient Gesture Recognition for the Assistance of Visually Impaired People using Multi-Head Neural Networks | | 0 |
| Efficient Global Point Cloud Alignment using Bayesian Nonparametric Mixtures | | 0 |
| Efficient Image Categorization with Sparse Fisher Vector | | 0 |
| Efficient Multi-Band Temporal Video Filter for Reducing Human-Robot Interaction | | 0 |
| Efficient multi-scale representation of visual objects using a biologically plausible spike-latency code and winner-take-all inhibition | | 0 |
| Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images | | 0 |
| Efficient Point-to-Subspace Query in $\ell^1$ with Application to Robust Object Instance Recognition | | 0 |
| Efficient visual object representation using a biologically plausible spike-latency code and winner-take-all inhibition | | 0 |
| Egocentric Audio-Visual Noise Suppression | | 0 |
| Egocentric Height Estimation | | 0 |
| Egocentric Hierarchical Visual Semantics | | 0 |
| EGO-CH: Dataset and Fundamental Tasks for Visitors Behavioral Understanding using Egocentric Vision | | 0 |
| Eigen-Distortions of Hierarchical Representations | | 0 |
| EIT-1M: One Million EEG-Image-Text Pairs for Human Visual-textual Recognition and More | | 0 |
| Embedding Visual Hierarchy with Deep Networks for Large-Scale Visual Recognition | | 0 |
| EmBench: Quantifying Performance Variations of Deep Neural Networks across Modern Commodity Devices | | 0 |
| Embodied vision for learning object representations | | 0 |
| Emergent communication for AR | | 0 |
| EmoCAM: Toward Understanding What Drives CNN-based Emotion Recognition | | 0 |
| EMPIRICAL UPPER BOUND IN OBJECT DETECTION | | 0 |
| Empowering Knowledge Distillation via Open Set Recognition for Robust 3D Point Cloud Classification | | 0 |
| Empowering Local Communities Using Artificial Intelligence | | 0 |
| Enabling Pedestrian Safety using Computer Vision Techniques: A Case Study of the 2018 Uber Inc. Self-driving Car Crash | | 0 |
| Encoder-Decoder based CNN and Fully Connected CRFs for Remote Sensed Image Segmentation | | 0 |
| Encoding High Dimensional Local Features by Sparse Coding Based Fisher Vectors | | 0 |
| End-to-End Auditory Object Recognition via Inception Nucleus | | 0 |
| End-to-end Binary Representation Learning via Direct Binary Embedding | | 0 |
| End-to-End Race Driving with Deep Reinforcement Learning | | 0 |
| End-to-end topographic networks as models of cortical map formation and human visual behaviour: moving beyond convolutions | | 0 |
| Energy-based Dropout in Restricted Boltzmann Machines: Why not go random | | 0 |
| Energy-based Tuning of Convolutional Neural Networks on Multi-GPUs | | 0 |
| Energy Efficient Hadamard Neural Networks | | 0 |
| Enhanced Model Robustness to Input Corruptions by Per-corruption Adaptation of Normalization Statistics | | 0 |
| Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering | | 0 |
| Enhancing 3D Object Detection in Autonomous Vehicles Based on Synthetic Virtual Environment Analysis | | 0 |
| Enhancing Crime Scene Investigations through Virtual Reality and Deep Learning Techniques | | 0 |
| Enhancing efficiency of object recognition in different categorization levels by reinforcement learning in modular spiking neural networks | | 0 |
| Enhancing Object Detection in Adverse Conditions using Thermal Imaging | | 0 |
| Investigating the Role of Attribute Context in Vision-Language Models for Object Recognition and Detection | | 0 |
| Enhancing Visual Representations for Efficient Object Recognition during Online Distillation | | 0 |
| Enlightening Deep Neural Networks with Knowledge of Confounding Factors | | 0 |
| Enriched Deep Recurrent Visual Attention Model for Multiple Object Recognition | | 0 |
| Estimating Bicycle Route Attractivity from Image Data | | 0 |

Benchmark Results

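| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Imagen | shape bias | 98.7 | | Unverified |
| 2 | Stable Diffusion | shape bias | 92.7 | | Unverified |
| 3 | Parti | shape bias | 91.7 | | Unverified |
| 4 | ViT-22B-384 | shape bias | 86.4 | | Unverified |
| 5 | ViT-22B-560 | shape bias | 83.8 | | Unverified |
| 6 | CLIP (ViT-B) | shape bias | 79.9 | | Unverified |
| 7 | ViT-22B-224 | shape bias | 78 | | Unverified |
| 8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | | Unverified |
| 9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | | Unverified |
| 10 | SWSL (ResNeXt-101) | shape bias | 49.8 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Spike-VGG11 | Accuracy (%) | 85.55 | | Unverified |
| 2 | SSNN | Accuracy (%) | 78.57 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Spike-VGG11 | Accuracy (%) | 85.62 | | Unverified |
| 2 | SSNN | Accuracy (%) | 79.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | | Unverified |
| 2 | yun | Top 5 Accuracy | 14.75 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | DY | Top 5 Accuracy | 0.08 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | | Unverified |
| 2 | AJ2021 | Top 5 Accuracy | 27.68 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SSNN | Accuracy (%) | 94.91 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Faster-RCNN | mAP | 30.39 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Spike-VGG11 | Accuracy (%) | 96 | | Unverified |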
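
Two of the metrics named in the benchmark tables, Top 5 Accuracy and shape bias, can be sketched in a few lines. This page does not define either metric, so the sketch relies on assumptions: top-k accuracy is taken as the percentage of samples whose true label appears among the k highest-ranked predictions, and shape bias follows the common cue-conflict convention, where it is the share of shape decisions among all predictions that match either the shape label or the texture label. The function names and toy data are hypothetical, and the individual benchmarks may use different evaluation protocols.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    targets: Sequence[str],
    k: int = 5,
) -> float:
    """Percentage of samples whose true label is among the first k ranked predictions."""
    if len(ranked_predictions) != len(targets) or not targets:
        raise ValueError("need equally sized, non-empty inputs")
    hits = sum(target in list(ranked[:k]) for ranked, target in zip(ranked_predictions, targets))
    return 100.0 * hits / len(targets)


def shape_bias(
    predictions: Sequence[str],
    shape_labels: Sequence[str],
    texture_labels: Sequence[str],
) -> float:
    """Assumed definition: shape decisions as a percentage of shape-or-texture decisions."""
    shape_hits = texture_hits = 0
    for pred, shape, texture in zip(predictions, shape_labels, texture_labels):
        if pred == shape:
            shape_hits += 1
        elif pred == texture:
            texture_hits += 1
        # Predictions matching neither cue are ignored under this definition.
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no prediction matched a shape or texture label")
    return 100.0 * shape_hits / decided


# Toy data, invented for illustration only.
ranked = [["cat", "dog", "car"], ["dog", "cat", "car"], ["car", "dog", "cat"], ["cat", "car", "dog"]]
truth = ["cat", "car", "cat", "dog"]
top2 = top_k_accuracy(ranked, truth, k=2)

bias = shape_bias(
    predictions=["cat", "elephant", "dog", "car"],
    shape_labels=["cat", "cat", "dog", "bird"],
    texture_labels=["elephant", "elephant", "cat", "clock"],
)
```

Both functions return percentages to match the scale of the claimed values in the tables.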