| Semi-supervised Open-World Object Detection | Feb 25, 2024 | Incremental Learning, Object | Code Available | 1 |
| Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding | Feb 23, 2024 | Hallucination, Object | Code Available | 1 |
| TransGOP: Transformer-Based Gaze Object Prediction | Feb 21, 2024 | Gaze Estimation, Object | Code Available | 1 |
| MuLan: Multimodal-LLM Agent for Progressive and Interactive Multi-Object Diffusion | Feb 20, 2024 | Attribute, Language Modeling | Code Available | 1 |
| Object-level Geometric Structure Preserving for Natural Image Stitching | Feb 20, 2024 | Image Stitching, Object | Code Available | 1 |
| UncertaintyTrack: Exploiting Detection and Localization Uncertainty in Multi-Object Tracking | Feb 19, 2024 | Autonomous Driving, Multi-Object Tracking | Code Available | 1 |
| Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models | Feb 18, 2024 | Hallucination, Object | Code Available | 1 |
| Lester: rotoscope animation through video object segmentation and tracking | Feb 15, 2024 | 3D Human Pose Estimation, Object | Code Available | 1 |
| Exploring Perceptual Limitation of Multimodal Large Language Models | Feb 12, 2024 | Object, Question Answering | Code Available | 1 |
| GBOT: Graph-Based 3D Object Tracking for Augmented Reality-Assisted Assembly Guidance | Feb 12, 2024 | 3D Object Tracking, 6D Pose Estimation | Code Available | 1 |
| Extreme Two-View Geometry From Object Poses with Diffusion Models | Feb 5, 2024 | Camera Pose Estimation, Object | Code Available | 1 |
| NOAH: Learning Pairwise Object Category Attentions for Image Classification | Feb 4, 2024 | Classification, image-classification | Code Available | 1 |
| Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation | Jan 22, 2024 | Object, Prompt Learning | Code Available | 1 |
| Spatial Structure Constraints for Weakly Supervised Semantic Segmentation | Jan 20, 2024 | Object, Object Localization | Code Available | 1 |
| Focaler-IoU: More Focused Intersection over Union Loss | Jan 19, 2024 | Object, object-detection | Code Available | 1 |
| NWPU-MOC: A Benchmark for Fine-grained Multi-category Object Counting in Aerial Images | Jan 19, 2024 | Object, Object Counting | Code Available | 1 |
| BlenDA: Domain Adaptive Object Detection through diffusion-based blending | Jan 18, 2024 | Domain Adaptation, Image-to-Image Translation | Code Available | 1 |
| Learning Implicit Representation for Reconstructing Articulated Objects | Jan 16, 2024 | 3D Reconstruction, Object | Code Available | 1 |
| RSUD20K: A Dataset for Road Scene Understanding In Autonomous Driving | Jan 14, 2024 | Autonomous Driving, Benchmarking | Code Available | 1 |
| DCDet: Dynamic Cross-based 3D Object Detector | Jan 14, 2024 | 3D Object Detection, Object | Code Available | 1 |
| CLIP-Guided Source-Free Object Detection in Aerial Images | Jan 10, 2024 | Domain Adaptation, Object | Code Available | 1 |
| Structure from Duplicates: Neural Inverse Graphics from a Pile of Objects | Jan 10, 2024 | Image Reconstruction, Object | Code Available | 1 |
| A Flying Bird Object Detection Method for Surveillance Video | Jan 8, 2024 | Object, object-detection | Code Available | 1 |
| Explicit Visual Prompts for Visual Object Tracking | Jan 6, 2024 | Object, Object Tracking | Code Available | 1 |
| PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation | Jan 4, 2024 | Dataset Generation, Object | Code Available | 1 |
| 1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation | Jan 1, 2024 | Object, Referring Video Object Segmentation | Code Available | 1 |
| ShapeMatcher: Self-Supervised Joint Shape Canonicalization Segmentation Retrieval and Deformation | Jan 1, 2024 | Object, Retrieval | Code Available | 1 |
| Tune-An-Ellipse: CLIP Has Potential to Find What You Want | Jan 1, 2024 | Object, Referring Expression | Code Available | 1 |
| DIOD: Self-Distillation Meets Object Discovery | Jan 1, 2024 | Instance Segmentation, Knowledge Distillation | Code Available | 1 |
| CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images | Jan 1, 2024 | 3D Object Detection, 3D Reconstruction | Code Available | 1 |
| PairDETR: Joint Detection and Association of Human Bodies and Faces | Jan 1, 2024 | Object, object-detection | Code Available | 1 |
| Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing | Jan 1, 2024 | Human-Object Interaction Detection, Object | Code Available | 1 |
| LASO: Language-guided Affordance Segmentation on 3D Object | Jan 1, 2024 | Object, Segmentation | Code Available | 1 |
| Tracking with Human-Intent Reasoning | Dec 29, 2023 | Language Modelling, Object | Code Available | 1 |
| iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views | Dec 28, 2023 | 3D Object Reconstruction, Camera Pose Estimation | Code Available | 1 |
| DECO: Query-Based End-to-End Object Detection with ConvNets | Dec 21, 2023 | Decoder, Object | Code Available | 1 |
| Universal Noise Annotation: Unveiling the Impact of Noisy annotation on Object Detection | Dec 21, 2023 | image-classification, Image Classification | Code Available | 1 |
| EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering | Dec 19, 2023 | Object, Object Counting | Code Available | 1 |
| TAO-Amodal: A Benchmark for Tracking Any Object Amodally | Dec 19, 2023 | Amodal Tracking, Autonomous Driving | Code Available | 1 |
| Object-Aware Domain Generalization for Object Detection | Dec 19, 2023 | Autonomous Driving, Contrastive Learning | Code Available | 1 |
| CLIM: Contrastive Language-Image Mosaic for Region Representation | Dec 18, 2023 | Object, object-detection | Code Available | 1 |
| Simple Image-level Classification Improves Open-vocabulary Object Detection | Dec 16, 2023 | Knowledge Distillation, Object | Code Available | 1 |
| PETDet: Proposal Enhancement for Two-Stage Fine-Grained Object Detection | Dec 16, 2023 | Multi-Task Learning, Object | Code Available | 1 |
| Painterly Image Harmonization by Learning from Painterly Objects | Dec 15, 2023 | Image Harmonization, Object | Code Available | 1 |
| Ins-HOI: Instance Aware Human-Object Interactions Recovery | Dec 15, 2023 | Descriptive, Disentanglement | Code Available | 1 |
| SKDF: A Simple Knowledge Distillation Framework for Distilling Open-Vocabulary Knowledge to Open-world Object Detector | Dec 14, 2023 | Knowledge Distillation, Object | Code Available | 1 |
| Mono3DVG: 3D Visual Grounding in Monocular Images | Dec 13, 2023 | 3D Object Detection, 3D visual grounding | Code Available | 1 |
| DualTeacher: Bridging Coexistence of Unlabelled Classes for Semi-supervised Incremental Object Detection | Dec 13, 2023 | Object, object-detection | Code Available | 1 |
| Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | Dec 13, 2023 | Descriptive, Object | Code Available | 1 |
| Efficient Object Detection in Autonomous Driving using Spiking Neural Networks: Performance, Energy Consumption Analysis, and Insights into Open-set Object Discovery | Dec 12, 2023 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 |