Qwen2.5-VL Technical Report | Feb 19, 2025 | document understanding | Code available | 115
Bilateral Reference for High-Resolution Dichotomous Image Segmentation | Jan 7, 2024 | Camouflaged Object Segmentation Dichotomous Image Segmentation | Code available | 75
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks | Jun 12, 2024 | Image Generation Language Modeling | Code available | 55
Mamba-FETrack: Frame-Event Tracking via State Space Model | Apr 28, 2024 | GPU Mamba | Code available | 45
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Feb 29, 2024 | All Hallucination | Code available | 45
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Feb 12, 2024 | Hallucination Object Localization | Code available | 45
LangSplat: 3D Language Gaussian Splatting | Dec 26, 2023 | NeRF Object Localization | Code available | 35
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation | Nov 7, 2024 | Object Localization | Code available | 35
CrossOver: 3D Scene Cross-Modal Alignment | Feb 20, 2025 | cross-modal alignment Object | Code available | 35
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D | Apr 19, 2025 | Decoder Object Localization | Code available | 35
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection | Oct 4, 2023 | 3D Object Detection cross-modal alignment | Code available | 25
Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation | Mar 25, 2022 | Contrastive Learning image-classification | Code available | 25
Omnidirectional Multi-Object Tracking | Mar 6, 2025 | Multi-Object Tracking Object | Code available | 25
Crafting Better Contrastive Views for Siamese Representation Learning | Feb 7, 2022 | Contrastive Learning Object Localization | Code available | 25
BOP Challenge 2020 on 6D Object Localization | Sep 15, 2020 | 6D Pose Estimation 6D Pose Estimation using RGB | Code available | 25
C2AM: Contrastive Learning of Class-Agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation | Jan 1, 2022 | Contrastive Learning image-classification | Code available | 25
Point Segment and Count: A Generalized Framework for Object Counting | Jan 1, 2024 | Few-shot Object Counting and Detection Knowledge Distillation | Code available | 25
Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection | Jan 19, 2024 | Multispectral Object Detection Object | Code available | 25
A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation | Sep 27, 2024 | Exemplar-Free Counting Few-shot Object Counting and Detection | Code available | 25
Deep Snake for Real-Time Instance Segmentation | Jan 6, 2020 | GPU Instance Segmentation | Code available | 25
Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs | Jan 18, 2021 | 3D Reconstruction Object Localization | Code available | 25
Many-Shot In-Context Learning in Multimodal Foundation Models | May 16, 2024 | image-classification Image Classification | Code available | 25
Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark | Nov 24, 2022 | 2D Object Detection Image Retrieval | Code available | 25
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs | Sep 27, 2023 | Form Navigate | Code available | 15
Context-Aware 3D Object Localization from Single Calibrated Images: A Study of Basketballs | Sep 7, 2023 | Autonomous Driving Camera Calibration | Code available | 15
Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation | Dec 21, 2023 | Edge Detection Feature Engineering | Code available | 15
Dual Progressive Transformations for Weakly Supervised Semantic Segmentation | Sep 30, 2022 | Inductive Bias Object | Code available | 15
Dual-attention Guided Dropblock Module for Weakly Supervised Object Localization | Mar 9, 2020 | Object Localization Weakly-Supervised Object Localization | Code available | 15
An Attention-guided Multistream Feature Fusion Network for Localization of Risky Objects in Driving Videos | Sep 16, 2022 | Anomaly Detection Object | Code available | 15
Anchor-free Small-scale Multispectral Pedestrian Detection | Aug 19, 2020 | Autonomous Driving Data Augmentation | Code available | 15
DETReg: Unsupervised Pretraining with Region Priors for Object Detection | Jun 8, 2021 | Few-Shot Learning Few-Shot Object Detection | Code available | 15
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching | Oct 12, 2020 | Object Object Localization | Code available | 15
CLIP the Gap: A Single Domain Generalization Approach for Object Detection | Jan 13, 2023 | Domain Generalization image-classification | Code available | 15
Background Activation Suppression for Weakly Supervised Object Localization | Dec 1, 2021 | Object Object Localization | Code available | 15
A Low-Shot Object Counting Network With Iterative Prototype Adaptation | Nov 15, 2022 | Exemplar-Free Counting Object | Code available | 15
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation | Mar 20, 2022 | image-classification Image Classification | Code available | 15
Distilling Knowledge from Refinement in Multiple Instance Detection Networks | Apr 23, 2020 | Knowledge Distillation Multiple Instance Learning | Code available | 15
DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection | Sep 13, 2021 | object-detection Object Detection | Code available | 15
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features | May 13, 2019 | Domain Generalization Image Captioning | Code available | 15
DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion | Nov 29, 2021 | Multi-Object Tracking Object | Code available | 15
Audio-Visual Grouping Network for Sound Localization from Mixtures | Mar 29, 2023 | Object Localization Sound Source Localization | Code available | 15
Class-aware Sounding Objects Localization via Audiovisual Correspondence | Dec 22, 2021 | Object object-detection | Code available | 15
Cross-Modal Weighting Network for RGB-D Salient Object Detection | Jul 9, 2020 | object-detection Object Detection | Code available | 15
CIA-SSD: Confident IoU-Aware Single-Stage Object Detector From Point Cloud | Dec 5, 2020 | 3D Object Detection Birds Eye View Object Detection | Code available | 15
Cascade-DETR: Delving into High-Quality Universal Object Detection | Jul 20, 2023 | Decoder Object | Code available | 15
CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective | Mar 11, 2024 | Data Augmentation Object Localization | Code available | 15
Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation | Dec 9, 2024 | Object Localization Vision and Language Navigation | Code available | 15
Building Calibrated Deep Models via Uncertainty Matching with Auxiliary Interval Predictors | Sep 9, 2019 | Object Localization Prediction | Code available | 15
Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation | Sep 22, 2023 | Object Object Localization | Code available | 15
DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering | Dec 12, 2022 | Clustering Graph Neural Network | Code available | 15