SOTAVerified

Object Recognition

Object recognition is a computer vision technique for detecting and classifying objects in images or videos. Because it combines object detection with image classification, state-of-the-art results are recorded separately for each component task.

(Image credit: TensorFlow Object Detection API)

Papers

Showing 76-100 of 2042 papers

Title | Status | Hype
Hierarchical Superpixel Segmentation via Structural Information Theory | Code | 0
Perceptual Inductive Bias Is What You Need Before Contrastive Learning | - | 0
Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models | - | 0
Enhanced Multimodal RAG-LLM for Accurate Visual Question Answering | - | 0
Sample Correlation for Fingerprinting Deep Face Recognition | Code | 0
AI-based Wearable Vision Assistance System for the Visually Impaired: Integrating Real-Time Object Recognition and Contextual Understanding Using Large Vision-Language Models | - | 0
The same but different: impact of animal facility sanitary status on a transgenic mouse model of Alzheimer's disease | - | 0
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Code | 1
SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization | Code | 0
Real Classification by Description: Extending CLIP's Limits of Part Attributes Recognition | Code | 0
Targeted View-Invariant Adversarial Perturbations for 3D Object Recognition | Code | 0
Efficient Oriented Object Detection with Enhanced Small Object Recognition in Aerial Images | - | 0
CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics | Code | 1
WiseAD: Knowledge Augmented End-to-End Autonomous Driving with Vision-Language Model | Code | 1
CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs | - | 0
Proactive Adversarial Defense: Harnessing Prompt Tuning in Vision-Language Models to Detect Unseen Backdoored Images | - | 0
Enhancing 3D Object Detection in Autonomous Vehicles Based on Synthetic Virtual Environment Analysis | - | 0
Can foundation models actively gather information in interactive environments to test hypotheses? | - | 0
Expanding Event Modality Applications through a Robust CLIP-Based Encoder | Code | 1
Optimized CNNs for Rapid 3D Point Cloud Object Recognition | - | 0
LVLM-COUNT: Enhancing the Counting Ability of Large Vision-Language Models | Code | 0
Textured As-Is BIM via GIS-informed Point Cloud Segmentation | - | 0
Verbalized Representation Learning for Interpretable Few-Shot Generalization | Code | 0
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents | Code | 0
NEMO: Can Multimodal LLMs Identify Attribute-Modified Objects? | - | 0
Page 4 of 82

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Imagen | shape bias | 98.7 | - | Unverified
2 | Stable Diffusion | shape bias | 92.7 | - | Unverified
3 | Parti | shape bias | 91.7 | - | Unverified
4 | ViT-22B-384 | shape bias | 86.4 | - | Unverified
5 | ViT-22B-560 | shape bias | 83.8 | - | Unverified
6 | CLIP (ViT-B) | shape bias | 79.9 | - | Unverified
7 | ViT-22B-224 | shape bias | 78 | - | Unverified
8 | ResNet-50 (L2 eps 5.0 adv trained) | shape bias | 69.5 | - | Unverified
9 | ResNet-50 (with strong augmentations) | shape bias | 62.2 | - | Unverified
10 | SWSL (ResNeXt-101) | shape bias | 49.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.55 | - | Unverified
2 | SSNN | Accuracy (%) | 78.57 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 85.62 | - | Unverified
2 | SSNN | Accuracy (%) | 79.25 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 18.75 | - | Unverified
2 | yun | Top 5 Accuracy | 14.75 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | - | Unverified
2 | DY | Top 5 Accuracy | 0.08 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ObjectNet-Baseline | Top 5 Accuracy | 52.24 | - | Unverified
2 | AJ2021 | Top 5 Accuracy | 27.68 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SSNN | Accuracy (%) | 94.91 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Faster-RCNN | mAP | 30.39 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Spike-VGG11 | Accuracy (%) | 96 | - | Unverified
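
Several of the tables above report Top 5 Accuracy without defining it. As a rough illustration only, the sketch below computes top-k accuracy in the usual sense: a prediction counts as correct when the true label is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are assumptions made for this example; they are not taken from any of the listed benchmarks or their evaluation code.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int = 5,
) -> float:
    """Percentage of samples whose true label is among the k top-scoring classes.

    Hypothetical helper for illustration; real benchmarks may differ in
    details such as tie handling or class mapping.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score, truncated to k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy data (assumed): 4 samples, 4 classes, no tied scores within a row.
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.2, 0.1, 0.3, 0.4],
    [0.6, 0.1, 0.2, 0.1],
]
toy_labels = [1, 3, 2, 2]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
```

With k=1 this reduces to ordinary accuracy, so a Top 5 figure is always at least as high as the plain accuracy for the same model and dataset.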