SOTAVerified

Object Counting

The goal of the Object Counting task is to count the number of object instances in a single image or video sequence. It has many real-world applications, such as traffic flow monitoring, crowdedness estimation, and product counting.

Source: Learning to Count Objects with Few Exemplar Annotations

Papers

Showing 150 of 158 papers

Title | Status | Hype
Car Object Counting and Position Estimation via Extension of the CLIP-EBC Framework | Code | 0
OmniSpatial: Towards Comprehensive Spatial Reasoning Benchmark for Vision Language Models | - | 0
Improving Contrastive Learning for Referring Expression Counting | Code | 0
InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition | Code | 2
Expanding Zero-Shot Object Counting with Rich Prompts | - | 0
Are Multimodal Large Language Models Ready for Omnidirectional Spatial Reasoning? | - | 0
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | Code | 4
Learning What NOT to Count | - | 0
Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment | - | 0
MATHGLANCE: Multimodal Large Language Models Do Not Know Where to Look in Mathematical Diagrams | - | 0
A Causal Lens for Evaluating Faithfulness Metrics | - | 0
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding | - | 0
FocalCount: Towards Class-Count Imbalance in Class-Agnostic Counting | - | 0
SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting | Code | 1
AquaticCLIP: A Vision-Language Foundation Model for Underwater Scene Analysis | - | 0
A Survey on Class-Agnostic Counting: Advancements from Reference-Based to Open-World Text-Guided Approaches | - | 0
Mamba-MOC: A Multicategory Remote Object Counting via State Space Model | - | 0
Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension | - | 0
T2ICount: Enhancing Cross-modal Understanding for Zero-Shot Counting | Code | 1
Vision Transformers for Weakly-Supervised Microorganism Enumeration | Code | 0
GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks | Code | 2
Counting Stacked Objects from Multi-View Images | - | 0
Efficient Masked AutoEncoder for Video Object Counting and A Large-Scale Benchmark | - | 0
Boundary Attention Constrained Zero-Shot Layout-To-Image Generation | - | 0
A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation | Code | 2
Mind the Prompt: A Novel Benchmark for Prompt-based Class-Agnostic Counting | Code | 1
GCA-SUNet: A Gated Context-Aware Swin-UNet for Exemplar-Free Counting | Code | 0
Dense Center-Direction Regression for Object Counting and Localization with Point Supervision | Code | 0
Detection-Driven Object Count Optimization for Text-to-Image Diffusion Models | - | 0
Mutually-Aware Feature Learning for Few-Shot Object Counting | - | 0
Zero-shot Object Counting with Good Exemplars | Code | 1
CountGD: Multi-Modal Open-World Counting | Code | 3
RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent | Code | 2
Learning Spatial Similarity Distribution for Few-shot Object Counting | Code | 0
Overconfidence is Key: Verbalized Uncertainty Evaluation in Large Language and Vision-Language Models | - | 0
DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting | Code | 2
ChatGPT and general-purpose AI count fruits in pictures surprisingly well | - | 0
Counting Objects in a Robotic Hand | - | 0
Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis | Code | 2
Few-shot Object Localization | Code | 1
Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring | Code | 0
TFCounter:Polishing Gems for Training-Free Object Counting | - | 0
OmniCount: Multi-label Object Counting with Semantic-Geometric Priors | - | 0
AFreeCA: Annotation-Free Counting for All | Code | 0
Effectiveness Assessment of Recent Large Vision-Language Models | - | 0
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video | - | 0
Enhancing Zero-shot Counting via Language-guided Exemplar Learning | - | 0
Do Object Detection Localization Errors Affect Human Performance and Trust? | - | 0
Diffusion-based Data Augmentation for Object Counting Problems | - | 0
NWPU-MOC: A Benchmark for Fine-grained Multi-category Object Counting in Aerial Images | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | FamNet | MAE(test) | 22.08 | - | Unverified
2 | Omnicount (Open vocabulary, multi-label, without training) | MAE(test) | 18.63 | - | Unverified
3 | RCC | MAE(test) | 17.12 | - | Unverified
4 | Counting-DETR | MAE(test) | 16.79 | - | Unverified
5 | CounTX (uses text descriptions instead of visual exemplars) | MAE(test) | 15.88 | - | Unverified
6 | LaoNet | MAE(test) | 15.78 | - | Unverified
7 | BMNet+ | MAE(test) | 14.62 | - | Unverified
8 | SAFECount | MAE(test) | 14.32 | - | Unverified
9 | GCA-SUN | MAE(test) | 14 | - | Unverified
10 | SPDCN | MAE(test) | 13.51 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | YOLO (2016) | MAE | 156 | - | Unverified
2 | YOLO9000opt (2017) | MAE | 130.4 | - | Unverified
3 | Faster R-CNN (2015) | MAE | 39.88 | - | Unverified
4 | RetinaNet (2018) | MAE | 24.58 | - | Unverified
5 | LPN Counting (2017) | MAE | 22.76 | - | Unverified
6 | One-Look Regression (2016) | MAE | 21.88 | - | Unverified
7 | RetinaNet (2018) | MAE | 16.62 | - | Unverified
8 | CounTX (uses arbitrary text input to specify object to count, used "the cars" for CARPK) | MAE | 8.13 | - | Unverified
9 | Soft-IoU + EM-Merger unit | MAE | 6.77 | - | Unverified
10 | VLCounter | MAE | 6.46 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fast-RCNN | m-reIRMSE-nz | 0.85 | - | Unverified
2 | glance-noft-2L | m-reIRMSE-nz | 0.73 | - | Unverified
3 | LC-PSPNet | m-reIRMSE-nz | 0.7 | - | Unverified
4 | Seq-sub-ft-3x3 | m-reIRMSE-nz | 0.68 | - | Unverified
5 | ens | m-reIRMSE-nz | 0.65 | - | Unverified
6 | LC-ResFCN | m-reIRMSE-nz | 0.61 | - | Unverified
7 | Supervised Density Map | m-reIRMSE-nz | 0.61 | - | Unverified
8 | Omnicount | mRMSE | 0 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Aso-sub-ft-3x3 | m-reIRMSE | 0.24 | - | Unverified
2 | glance-ft-2L | m-reIRMSE | 0.23 | - | Unverified
3 | Fast-RCNN | m-reIRMSE | 0.2 | - | Unverified
4 | LC-ResFCN | m-reIRMSE | 0.19 | - | Unverified
5 | Supervised Density Map | m-reIRMSE | 0.18 | - | Unverified
6 | ens | m-reIRMSE | 0.18 | - | Unverified
7 | Seq-sub-ft-3x3 | m-reIRMSE | 0.18 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SMoLA-PaLI-X Specialist | Accuracy | 77.1 | - | Unverified
2 | PaLI-X-VPD | Accuracy | 76.6 | - | Unverified
3 | SMoLA-PaLI-X Generalist (0 shot) | Accuracy | 70.7 | - | Unverified
4 | MoVie-ResNeXt | Accuracy | 56.8 | - | Unverified
5 | RCN | Accuracy | 56.2 | - | Unverified
6 | MoVie | Accuracy | 54.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SMoLA-PaLI-X Specialist | Accuracy | 86.3 | - | Unverified
2 | PaLI-X-VPD | Accuracy | 86.2 | - | Unverified
3 | SMoLA-PaLI-X Generalist (0 shot) | Accuracy | 83.3 | - | Unverified
4 | MoVie-ResNeXt | Accuracy | 74.9 | - | Unverified
5 | RCN | Accuracy | 71.8 | - | Unverified
6 | MoVie | Accuracy | 70.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MoVie-ResNeXt | Accuracy | 64 | - | Unverified
2 | MoVie | Accuracy | 61.2 | - | Unverified
3 | RCN | Accuracy | 60.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CEOES | mRMSE | 0.42 | - | Unverified
2 | ILC | mRMSE | 0.29 | - | Unverified
3 | TFOC | mRMSE | 0.01 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Omnicount | mRMSE | 0 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GauNet (ResNet-50) | MAE | 2.1 | - | Unverified