SOTAVerified

Object Localization

Object Localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box, and an object proposal is said to be a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, the “Object Localization” task is to locate one instance of an object category, whereas “object detection” focuses on locating all instances of a category in a given image.

Source: Fast On-Line Kernel Density Estimation for Active Object Localization
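The overlap criterion in the definition above is commonly measured with intersection-over-union (IoU). The following is a minimal sketch of that check, not the method of any paper listed here. The `(x_min, y_min, x_max, y_max)` box format, the function names `iou` and `is_correct_localization`, and the 0.5 threshold are all illustrative assumptions. The source only says the overlap must be "sufficient".

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes have no meaningful overlap.
    return inter / union if union > 0 else 0.0


def is_correct_localization(proposal: Box, ground_truth: Box,
                            threshold: float = 0.5) -> bool:
    """True if the proposal overlaps the ground truth by at least `threshold`.

    The 0.5 default is an assumed, commonly used value, not one from the source.
    """
    return iou(proposal, ground_truth) >= threshold
```

For example, a proposal `(0, 0, 10, 10)` against a ground-truth box `(0, 0, 10, 8)` has an IoU of 0.8 and would count as a correct localization under the assumed threshold.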

Papers

Showing 201-250 of 617 papers

| Title | Status | Hype |
| --- | --- | --- |
| SilVar: Speech Driven Multimodal Model for Reasoning Visual Question Answering and Object Localization | Code | 0 |
| Demystifying the Potential of ChatGPT-4 Vision for Construction Progress Monitoring | | 0 |
| SuperGSeg: Open-Vocabulary 3D Segmentation with Structured Super-Gaussians | | 0 |
| 3D Spatial Understanding in MLLMs: Disambiguation and Evaluation | | 0 |
| SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding | | 0 |
| GraPix: Exploring Graph Modularity Optimization for Unsupervised Pixel Clustering | Code | 0 |
| RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations | | 0 |
| SpaRC: Sparse Radar-Camera Fusion for 3D Object Detection | Code | 0 |
| ObjectRelator: Enabling Cross-View Object Relation Understanding in Ego-Centric and Exo-Centric Videos | | 0 |
| GloFinder: AI-empowered QuPath Plugin for WSI-level Glomerular Detection, Visualization, and Curation | | 0 |
| Probing the Mid-level Vision Capabilities of Self-Supervised Learning | | 0 |
| Time is on my sight: scene graph filtering for dynamic environment perception in an LLM-driven robot | | 0 |
| FAST-Splat: Fast, Ambiguity-Free Semantics Transfer in Gaussian Splatting | | 0 |
| YCB-LUMA: YCB Object Dataset with Luminance Keying for Object Localization | Code | 0 |
| Text-guided Zero-Shot Object Localization | | 0 |
| Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | | 0 |
| LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes | | 0 |
| Co-Segmentation without any Pixel-level Supervision with Application to Large-Scale Sketch Classification | Code | 0 |
| Optimizing Multi-Task Learning for Accurate Spacecraft Pose Estimation | | 0 |
| Training-Free Open-Ended Object Detection and Segmentation via Attention as Prompts | | 0 |
| DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation | | 0 |
| QUB-PHEO: A Visual-Based Dyadic Multi-View Dataset for Intention Inference in Collaborative Assembly | Code | 0 |
| PMR-Net: Parallel Multi-Resolution Encoder-Decoder Network Framework for Medical Image Segmentation | | 0 |
| Do Pre-trained Vision-Language Models Encode Object States? | Code | 0 |
| Top-GAP: Integrating Size Priors in CNNs for more Interpretability, Robustness, and Bias Mitigation | | 0 |
| Prediction Accuracy & Reliability: Classification and Object Localization under Distribution Shift | | 0 |
| Evaluation and Comparison of Visual Language Models for Transportation Engineering Problems | Code | 0 |
| Multi-scale Multi-instance Visual Sound Localization and Segmentation | | 0 |
| Language-guided Scale-aware MedSegmentor for Lesion Segmentation in Medical Imaging | | 0 |
| Optimal Weight Scheme for Fusion-Assisted Cooperative Multi-Monostatic Object Localization in 6G Networks | | 0 |
| Multi-Beam Object-Localization for Millimeter-Wave ISAC-Aided Connected Autonomous Vehicles | | 0 |
| Stimulating Imagination: Towards General-purpose Object Rearrangement | | 0 |
| Categorical Knowledge Fused Recognition: Fusing Hierarchical Knowledge with Image Classification through Aligning and Deep Metric Learning | | 0 |
| A Model Generalization Study in Localizing Indoor Cows with COw LOcalization (COLO) dataset | | 0 |
| BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments | | 0 |
| PEEKABOO: Hiding parts of an image for unsupervised object localization | Code | 0 |
| DenseTrack: Drone-based Crowd Tracking via Density-aware Motion-appearance Synergy | Code | 0 |
| Evaluating and Enhancing Trustworthiness of LLMs in Perception Tasks | | 0 |
| Leveraging Transformers for Weakly Supervised Object Localization in Unconstrained Videos | Code | 0 |
| ALINA: Advanced Line Identification and Notation Algorithm | Code | 0 |
| FlexLoc: Conditional Neural Networks for Zero-Shot Sensor Perspective Invariance in Object Localization with Distributed Multimodal Sensors | Code | 0 |
| Leveraging Activations for Superpixel Explanations | | 0 |
| Explaining Multi-modal Large Language Models by Analyzing their Vision Perception | Code | 0 |
| Concept Visualization: Explaining the CLIP Multi-modal Embedding Using WordNet | Code | 0 |
| Masked Multi-Query Slot Attention for Unsupervised Object Discovery | Code | 0 |
| Source-Free Domain Adaptation of Weakly-Supervised Object Localization Models for Histology | Code | 0 |
| Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection | | 0 |
| A Realistic Protocol for Evaluation of Weakly Supervised Object Localization | Code | 0 |
| Real-world Instance-specific Image Goal Navigation: Bridging Domain Gaps via Contrastive Learning | | 0 |
| Improving Weakly-Supervised Object Localization Using Adversarial Erasing and Pseudo Label | | 0 |

Benchmark Results

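| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | OSMaN | RGSPL | 32.99 | | Unverified |
| 2 | SUSA | RGSPL | 27.31 | | Unverified |
| 3 | Shanks | RGSPL | 22.85 | | Unverified |
| 4 | CVPR22 | RGSPL | 22.06 | | Unverified |
| 5 | damm1 | RGSPL | 15.96 | | Unverified |
| 6 | 1637 | RGSPL | 14.03 | | Unverified |
| 7 | init. PREVALENT | RGSPL | 13.51 | | Unverified |
| 8 | Airbert | RGSPL | 13.28 | | Unverified |
| 9 | init. OSCAR | RGSPL | 10 | | Unverified |
| 10 | SIA | RGSPL | 9.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VoxelNet | AP | 89.35 | | Unverified |
| 2 | VoxelNet | AP | 89.35 | | Unverified |
| 3 | Frustum PointNets | AP | 88.7 | | Unverified |
| 4 | Frustum PointNets | AP | 81.2 | | Unverified |
| 5 | VoxelNet | AP | 77.47 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum-PointPillars | AP | 48.3 | | Unverified |
| 2 | Frustum PointNets | AP | 47.2 | | Unverified |
| 3 | Frustum PointNets | AP | 40.23 | | Unverified |
| 4 | VoxelNet | AP | 38.11 | | Unverified |
| 5 | VoxelNet | AP | 31.51 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum-PointPillars | AP | 52.23 | | Unverified |
| 2 | Frustum PointNets | AP | 50.22 | | Unverified |
| 3 | Frustum PointNets | AP | 42.15 | | Unverified |
| 4 | VoxelNet | AP | 40.74 | | Unverified |
| 5 | VoxelNet | AP | 33.69 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | VoxelNet | AP | 77.39 | | Unverified |
| 2 | Frustum PointNets | AP | 75.33 | | Unverified |
| 3 | Frustum PointNets | AP | 62.19 | | Unverified |
| 4 | VoxelNet | AP | 57.73 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum PointNets | AP | 75.38 | | Unverified |
| 2 | Frustum PointNets | AP | 71.96 | | Unverified |
| 3 | VoxelNet | AP | 66.7 | | Unverified |
| 4 | VoxelNet | AP | 61.22 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum PointNets | AP | 61.96 | | Unverified |
| 2 | Frustum PointNets | AP | 56.77 | | Unverified |
| 3 | VoxelNet | AP | 54.76 | | Unverified |
| 4 | VoxelNet | AP | 48.36 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum PointNets | AP | 58.09 | | Unverified |
| 2 | Frustum PointNets | AP | 51.21 | | Unverified |
| 3 | VoxelNet | AP | 46.13 | | Unverified |
| 4 | VoxelNet | AP | 39.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Unified-IOXL | Localization (ablation) | 67 | | Unverified |
| 2 | GPV-2 | Localization (ablation) | 53.6 | | Unverified |
| 3 | Mask R-CNN | Localization (ablation) | 44.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum PointNets | AP | 54.68 | | Unverified |
| 2 | VoxelNet | AP | 50.55 | | Unverified |
| 3 | Frustum PointNets | AP | 50.39 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT4-Vision 4-shot+CoT | Accuracy | 49.7 | | Unverified |
| 2 | Gemini-Pro 4-shot+CoT | Accuracy | 33.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum PointNets | AP | 84 | | Unverified |
| 2 | VoxelNet | AP | 79.26 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Frustum-PointPillars | AP | 60.98 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Hausdorff Loss | Precision | 88.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ours | CorLoc | 41.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ours | CorLoc | 47.45 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Hausdorff Loss | F-Score | 88.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Hausdorff Loss | Recall | 89.2 | | Unverified |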