SOTAVerified

Object Localization

Object localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box and counts as a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, “object localization” means locating one instance of an object category, whereas “object detection” means locating all instances of a category in a given image.

Source: Fast On-Line Kernel Density Estimation for Active Object Localization
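
The overlap criterion in the description above can be sketched with intersection-over-union (IoU). This is only an illustration under stated assumptions: the text says "sufficiently overlaps" without naming a measure or a cutoff, so the use of IoU, the 0.5 default threshold, the `(x_min, y_min, x_max, y_max)` box layout, and the function names `iou` and `is_correct_localization` are all choices made for this sketch rather than anything taken from the source.

```python
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp at zero so disjoint boxes give zero intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_correct_localization(
    proposal: Box, ground_truth: Box, threshold: float = 0.5
) -> bool:
    """True if the proposal overlaps the ground truth by at least `threshold`.

    The 0.5 default is an assumption of this sketch, not a value from the text.
    """
    return iou(proposal, ground_truth) >= threshold
```

For example, a proposal `(2, 0, 12, 10)` against a ground-truth box `(0, 0, 10, 10)` has an intersection of 80 and a union of 120, so it passes the assumed 0.5 threshold, while `(5, 0, 15, 10)` has an IoU of one third and does not.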

Papers

Showing 101–150 of 617 papers

| Title | Status | Hype |
|---|---|---|
| Improving Weakly-Supervised Object Localization Using Adversarial Erasing and Pseudo Label | | 0 |
| IDD-X: A Multi-View Dataset for Ego-relative Important Object Localization and Explanation in Dense and Unstructured Traffic | Code | 0 |
| O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation | | 0 |
| MOSE: Boosting Vision-based Roadside 3D Object Detection with Scene Cues | | 0 |
| FlightScope: An Experimental Comparative Review of Aircraft Detection Algorithms in Satellite Imagery | Code | 1 |
| Towards Two-Stream Foveation-based Active Vision Learning | | 0 |
| IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models | Code | 1 |
| Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking | | 0 |
| Semantic Gaussians: Open-Vocabulary Scene Understanding with 3D Gaussian Splatting | | 0 |
| Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection | | 0 |
| EcoSense: Energy-Efficient Intelligent Sensing for In-Shore Ship Detection through Edge-Cloud Collaboration | | 0 |
| Few-shot Object Localization | Code | 1 |
| Could We Generate Cytology Images from Histopathology Images? An Empirical Study | | 0 |
| CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective | Code | 1 |
| The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Code | 4 |
| Weakly Supervised Monocular 3D Detection with a Single-View Image | | 0 |
| Foveated Retinotopy Improves Classification and Localization in CNNs | | 0 |
| Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Code | 4 |
| Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration | | 0 |
| Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Code | 0 |
| CPR++: Object Localization via Single Coarse Point Supervision | Code | 0 |
| MsSVT++: Mixed-scale Sparse Voxel Transformer with Center Voting for 3D Object Detection | | 0 |
| Spatial Structure Constraints for Weakly Supervised Semantic Segmentation | Code | 1 |
| Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection | Code | 2 |
| Domain Adaptation for Large-Vocabulary Object Detectors | | 0 |
| Bilateral Reference for High-Resolution Dichotomous Image Segmentation | Code | 7 |
| GTA: Guided Transfer of Spatial Attention from Object-Centric Representations | | 0 |
| Point Segment and Count: A Generalized Framework for Object Counting | Code | 2 |
| Cyclic Learning for Binaural Audio Generation and Localization | | 0 |
| LangSplat: 3D Language Gaussian Splatting | Code | 3 |
| FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection | | 0 |
| Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation | Code | 1 |
| Object-Aware Domain Generalization for Object Detection | Code | 1 |
| Weakly Supervised Open-Vocabulary Object Detection | | 0 |
| Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance | Code | 1 |
| Exploring Foveation and Saccade for Improved Weakly-Supervised Localization | Code | 1 |
| Multiscale Vision Transformer With Deep Clustering-Guided Refinement for Weakly Supervised Object Localization | | 0 |
| Mono3DVG: 3D Visual Grounding in Monocular Images | Code | 1 |
| Boosting Segment Anything Model Towards Open-Vocabulary Learning | Code | 1 |
| ZeroReg: Zero-Shot Point Cloud Registration with Foundation Models | | 0 |
| BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection | Code | 1 |
| SANeRF-HQ: Segment Anything for NeRF in High Quality | | 0 |
| Grounding Everything: Emerging Localization Properties in Vision-Language Transformers | Code | 1 |
| Language Embedded 3D Gaussians for Open-Vocabulary Scene Understanding | Code | 1 |
| Union-over-Intersections: Object Detection beyond Winner-Takes-All | Code | 0 |
| Seeing Beyond Cancer: Multi-Institutional Validation of Object Localization and 3D Semantic Segmentation using Deep Learning for Breast MRI | | 0 |
| Cooperative Multi-Monostatic Sensing for Object Localization in 6G Networks | | 0 |
| Point, Segment and Count: A Generalized Framework for Object Counting | Code | 1 |
| DSD-DA: Distillation-based Source Debiasing for Domain Adaptive Object Detection | | 0 |
| Towards Learning Monocular 3D Object Localization From 2D Labels using the Physical Laws of Motion | Code | 0 |

Benchmark Results

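| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OSMaN | RGSPL | 32.99 | | Unverified |
| 2 | SUSA | RGSPL | 27.31 | | Unverified |
| 3 | Shanks | RGSPL | 22.85 | | Unverified |
| 4 | CVPR22 | RGSPL | 22.06 | | Unverified |
| 5 | damm1 | RGSPL | 15.96 | | Unverified |
| 6 | 1637 | RGSPL | 14.03 | | Unverified |
| 7 | init. PREVALENT | RGSPL | 13.51 | | Unverified |
| 8 | Airbert | RGSPL | 13.28 | | Unverified |
| 9 | init. OSCAR | RGSPL | 10 | | Unverified |
| 10 | SIA | RGSPL | 9.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VoxelNet | AP | 89.35 | | Unverified |
| 2 | VoxelNet | AP | 89.35 | | Unverified |
| 3 | Frustum PointNets | AP | 88.7 | | Unverified |
| 4 | Frustum PointNets | AP | 81.2 | | Unverified |
| 5 | VoxelNet | AP | 77.47 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum-PointPillars | AP | 48.3 | | Unverified |
| 2 | Frustum PointNets | AP | 47.2 | | Unverified |
| 3 | Frustum PointNets | AP | 40.23 | | Unverified |
| 4 | VoxelNet | AP | 38.11 | | Unverified |
| 5 | VoxelNet | AP | 31.51 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum-PointPillars | AP | 52.23 | | Unverified |
| 2 | Frustum PointNets | AP | 50.22 | | Unverified |
| 3 | Frustum PointNets | AP | 42.15 | | Unverified |
| 4 | VoxelNet | AP | 40.74 | | Unverified |
| 5 | VoxelNet | AP | 33.69 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VoxelNet | AP | 77.39 | | Unverified |
| 2 | Frustum PointNets | AP | 75.33 | | Unverified |
| 3 | Frustum PointNets | AP | 62.19 | | Unverified |
| 4 | VoxelNet | AP | 57.73 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum PointNets | AP | 75.38 | | Unverified |
| 2 | Frustum PointNets | AP | 71.96 | | Unverified |
| 3 | VoxelNet | AP | 66.7 | | Unverified |
| 4 | VoxelNet | AP | 61.22 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum PointNets | AP | 61.96 | | Unverified |
| 2 | Frustum PointNets | AP | 56.77 | | Unverified |
| 3 | VoxelNet | AP | 54.76 | | Unverified |
| 4 | VoxelNet | AP | 48.36 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum PointNets | AP | 58.09 | | Unverified |
| 2 | Frustum PointNets | AP | 51.21 | | Unverified |
| 3 | VoxelNet | AP | 46.13 | | Unverified |
| 4 | VoxelNet | AP | 39.48 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Unified-IO XL | Localization (ablation) | 67 | | Unverified |
| 2 | GPV-2 | Localization (ablation) | 53.6 | | Unverified |
| 3 | Mask R-CNN | Localization (ablation) | 44.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum PointNets | AP | 54.68 | | Unverified |
| 2 | VoxelNet | AP | 50.55 | | Unverified |
| 3 | Frustum PointNets | AP | 50.39 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT4-Vision 4-shot+CoT | Accuracy | 49.7 | | Unverified |
| 2 | Gemini-Pro 4-shot+CoT | Accuracy | 33.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum PointNets | AP | 84 | | Unverified |
| 2 | VoxelNet | AP | 79.26 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Frustum-PointPillars | AP | 60.98 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Hausdorff Loss | Precision | 88.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ours | CorLoc | 41.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ours | CorLoc | 47.45 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Hausdorff Loss | F-Score | 88.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Hausdorff Loss | Recall | 89.2 | | Unverified |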