Qwen2.5-VL Technical Report (Feb 19, 2025). Tags: document understanding. Code available: 11
Bilateral Reference for High-Resolution Dichotomous Image Segmentation (Jan 7, 2024). Tags: Camouflaged Object Segmentation, Dichotomous Image Segmentation. Code available: 7
VisionLLM v2: An End-to-End Generalist Multimodal Large Language Model for Hundreds of Vision-Language Tasks (Jun 12, 2024). Tags: Image Generation, Language Modeling. Code available: 5
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models (Feb 12, 2024). Tags: Hallucination, Object Localization. Code available: 4
The All-Seeing Project V2: Towards General Relation Comprehension of the Open World (Feb 29, 2024). Tags: All, Hallucination. Code available: 4
Mamba-FETrack: Frame-Event Tracking via State Space Model (Apr 28, 2024). Tags: GPU, Mamba. Code available: 4
LangSplat: 3D Language Gaussian Splatting (Dec 26, 2023). Tags: NeRF, Object Localization. Code available: 3
CrossOver: 3D Scene Cross-Modal Alignment (Feb 20, 2025). Tags: cross-modal alignment, Object. Code available: 3
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D (Apr 19, 2025). Tags: Decoder, Object Localization. Code available: 3
DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation (Nov 7, 2024). Tags: Object Localization. Code available: 3
Deep Snake for Real-Time Instance Segmentation (Jan 6, 2020). Tags: GPU, Instance Segmentation. Code available: 2
Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark (Nov 24, 2022). Tags: 2D Object Detection, Image Retrieval. Code available: 2
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection (Oct 4, 2023). Tags: 3D Object Detection, cross-modal alignment. Code available: 2
Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (Mar 25, 2022). Tags: Contrastive Learning, image-classification. Code available: 2
C2AM: Contrastive Learning of Class-Agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (Jan 1, 2022). Tags: Contrastive Learning, image-classification. Code available: 2
Many-Shot In-Context Learning in Multimodal Foundation Models (May 16, 2024). Tags: image-classification, Image Classification. Code available: 2
A Novel Unified Architecture for Low-Shot Counting by Detection and Segmentation (Sep 27, 2024). Tags: Exemplar-Free Counting, Few-shot Object Counting and Detection. Code available: 2
Omnidirectional Multi-Object Tracking (Mar 6, 2025). Tags: Multi-Object Tracking, Object. Code available: 2
BOP Challenge 2020 on 6D Object Localization (Sep 15, 2020). Tags: 6D Pose Estimation, 6D Pose Estimation using RGB. Code available: 2
Crafting Better Contrastive Views for Siamese Representation Learning (Feb 7, 2022). Tags: Contrastive Learning, Object Localization. Code available: 2
Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs (Jan 18, 2021). Tags: 3D Reconstruction, Object Localization. Code available: 2
Removal then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection (Jan 19, 2024). Tags: Multispectral Object Detection, Object. Code available: 2
Point Segment and Count: A Generalized Framework for Object Counting (Jan 1, 2024). Tags: Few-shot Object Counting and Detection, Knowledge Distillation. Code available: 2
Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation (Dec 21, 2023). Tags: Edge Detection, Feature Engineering. Code available: 1
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching (Oct 12, 2020). Tags: Object, Object Localization. Code available: 1
Dual-attention Guided Dropblock Module for Weakly Supervised Object Localization (Mar 9, 2020). Tags: Object Localization, Weakly-Supervised Object Localization. Code available: 1
Dual Progressive Transformations for Weakly Supervised Semantic Segmentation (Sep 30, 2022). Tags: Inductive Bias, Object. Code available: 1
Distilling Knowledge from Refinement in Multiple Instance Detection Networks (Apr 23, 2020). Tags: Knowledge Distillation, Multiple Instance Learning. Code available: 1
An Attention-guided Multistream Feature Fusion Network for Localization of Risky Objects in Driving Videos (Sep 16, 2022). Tags: Anomaly Detection, Object. Code available: 1
Anchor-free Small-scale Multispectral Pedestrian Detection (Aug 19, 2020). Tags: Autonomous Driving, Data Augmentation. Code available: 1
Deep Learning Innovations for Underwater Waste Detection: An In-Depth Analysis (May 28, 2024). Tags: Object Localization. Code available: 1
DeepCut: Unsupervised Segmentation using Graph Neural Networks Clustering (Dec 12, 2022). Tags: Clustering, Graph Neural Network. Code available: 1
DAFNe: A One-Stage Anchor-Free Approach for Oriented Object Detection (Sep 13, 2021). Tags: object-detection, Object Detection. Code available: 1
A Low-Shot Object Counting Network With Iterative Prototype Adaptation (Nov 15, 2022). Tags: Exemplar-Free Counting, Object. Code available: 1
CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features (May 13, 2019). Tags: Domain Generalization, Image Captioning. Code available: 1
DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion (Nov 29, 2021). Tags: Multi-Object Tracking, Object. Code available: 1
DETReg: Unsupervised Pretraining with Region Priors for Object Detection (Jun 8, 2021). Tags: Few-Shot Learning, Few-Shot Object Detection. Code available: 1
CREAM: Weakly Supervised Object Localization via Class RE-Activation Mapping (May 27, 2022). Tags: Clustering, Object. Code available: 1
Context-Aware 3D Object Localization from Single Calibrated Images: A Study of Basketballs (Sep 7, 2023). Tags: Autonomous Driving, Camera Calibration. Code available: 1
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling (Nov 23, 2021). Tags: Image Captioning, Image Description. Code available: 1
Audio-Visual Grouping Network for Sound Localization from Mixtures (Mar 29, 2023). Tags: Object Localization, Sound Source Localization. Code available: 1
CoWs on Pasture: Baselines and Benchmarks for Language-Driven Zero-Shot Object Navigation (Mar 20, 2022). Tags: image-classification, Image Classification. Code available: 1
CLIP the Gap: A Single Domain Generalization Approach for Object Detection (Jan 13, 2023). Tags: Domain Generalization, image-classification. Code available: 1
Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs (Sep 27, 2023). Tags: Form, Navigate. Code available: 1
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching (Mar 23, 2023). Tags: Described Object Detection, object-detection. Code available: 1
Boosting Segment Anything Model Towards Open-Vocabulary Learning (Dec 6, 2023). Tags: model, Object. Code available: 1
Agent Journey Beyond RGB: Unveiling Hybrid Semantic-Spatial Environmental Representations for Vision-and-Language Navigation (Dec 9, 2024). Tags: Object Localization, Vision and Language Navigation. Code available: 1
Background Activation Suppression for Weakly Supervised Object Localization (Dec 1, 2021). Tags: Object, Object Localization. Code available: 1
Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation (Sep 22, 2023). Tags: Object, Object Localization. Code available: 1
CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free (Sep 25, 2023). Tags: Image Segmentation, Object Localization. Code available: 1