| DroneMOT: Drone-based Multi-Object Tracking Considering Detection Difficulties and Simultaneous Moving of Drones and Objects | Jul 12, 2024 | Multi-Object Tracking, Object | Code Available | 1 |
| Textual Query-Driven Mask Transformer for Domain Generalized Segmentation | Jul 12, 2024 | Domain Generalization, Object | Code Available | 1 |
| Visual Multi-Object Tracking with Re-Identification and Occlusion Handling using Labeled Random Finite Sets | Jul 11, 2024 | Multi-Object Tracking, Object | Code Available | 1 |
| SRPose: Two-view Relative Pose Estimation with Sparse Keypoints | Jul 11, 2024 | Object, Pose Estimation | Code Available | 1 |
| ActionVOS: Actions as Prompts for Video Object Segmentation | Jul 10, 2024 | Object, Referring Video Object Segmentation | Code Available | 1 |
| Cue Point Estimation using Object Detection | Jul 9, 2024 | Object, object-detection | Code Available | 1 |
| CaRe-Ego: Contact-aware Relationship Modeling for Egocentric Interactive Hand-object Segmentation | Jul 8, 2024 | Decoder, Object | Code Available | 1 |
| Zero-shot Object Counting with Good Exemplars | Jul 6, 2024 | Contrastive Learning, Object | Code Available | 1 |
| StreamLTS: Query-based Temporal-Spatial LiDAR Fusion for Cooperative Object Detection | Jul 4, 2024 | Autonomous Driving, Object | Code Available | 1 |
| Comics Datasets Framework: Mix of Comics datasets for detection benchmarking | Jul 3, 2024 | Benchmarking, Object | Code Available | 1 |
| Knowledge Transfer with Simulated Inter-Image Erasing for Weakly Supervised Semantic Segmentation | Jul 3, 2024 | Object, Object Discovery | Code Available | 1 |
| Similarity Distance-Based Label Assignment for Tiny Object Detection | Jul 2, 2024 | Object, object-detection | Code Available | 1 |
| Learning Granularity-Aware Affordances from Human-Object Interaction for Tool-Based Functional Grasping in Dexterous Robotics | Jun 30, 2024 | Human-Object Interaction Detection, Object | Code Available | 1 |
| BiTrack: Bidirectional Offline 3D Multi-Object Tracking Using Camera-LiDAR Data | Jun 26, 2024 | 3D Multi-Object Tracking, Multi-Object Tracking | Code Available | 1 |
| Uncertainty for SVBRDF Acquisition using Frequency Analysis | Jun 25, 2024 | Inverse Rendering, Object | Code Available | 1 |
| Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models | Jun 24, 2024 | Common Sense Reasoning, Hallucination | Code Available | 1 |
| MVOC: a training-free multiple video object composition method with diffusion models | Jun 22, 2024 | Image to Video Generation, Object | Code Available | 1 |
| DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection | Jun 21, 2024 | Class-agnostic Object Detection, Multi-object discovery | Code Available | 1 |
| African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification | Jun 20, 2024 | Benchmarking, Classification | Code Available | 1 |
| Composing Object Relations and Attributes for Image-Text Matching | Jun 17, 2024 | Attribute, Graph Attention | Code Available | 1 |
| CustAny: Customizing Anything from A Single Example | Jun 17, 2024 | Object, Virtual Try-on | Code Available | 1 |
| MMRel: A Relation Understanding Benchmark in the MLLM Era | Jun 13, 2024 | Diversity, Hallucination | Code Available | 1 |
| ImageNet3D: Towards General-Purpose Object-Level 3D Understanding | Jun 13, 2024 | Image Captioning, Linear Probing Object-Level 3D Awareness | Code Available | 1 |
| 3D-AVS: LiDAR-based 3D Auto-Vocabulary Segmentation | Jun 13, 2024 | Autonomous Driving, Object | Code Available | 1 |
| LaMOT: Language-Guided Multi-Object Tracking | Jun 12, 2024 | Descriptive, Multi-Object Tracking | Code Available | 1 |
| Dataset Enhancement with Instance-Level Augmentations | Jun 12, 2024 | Data Augmentation, Object | Code Available | 1 |
| OpenObj: Open-Vocabulary Object-Level Neural Radiance Fields with Fine-Grained Understanding | Jun 12, 2024 | 3D Scene Reconstruction, NeRF | Code Available | 1 |
| UEMM-Air: A Synthetic Multi-modal Dataset for Unmanned Aerial Vehicle Object Detection | Jun 10, 2024 | Object, object-detection | Code Available | 1 |
| Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion | Jun 9, 2024 | Autonomous Driving, Object | Code Available | 1 |
| Multi-Granularity Language-Guided Multi-Object Tracking | Jun 7, 2024 | Multi-Object Tracking, Object | Code Available | 1 |
| Bootstrapping Referring Multi-Object Tracking | Jun 7, 2024 | Diversity, Multi-Object Tracking | Code Available | 1 |
| Towards Generalizable Multi-Object Tracking | Jun 1, 2024 | Domain Generalization, Multi-Object Tracking | Code Available | 1 |
| CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation | Jun 1, 2024 | 2D Pose Estimation, Animal Pose Estimation | Code Available | 1 |
| RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection | May 30, 2024 | Image Captioning, Image Inpainting | Code Available | 1 |
| FocSAM: Delving Deeply into Focused Objects in Segmenting Anything | May 29, 2024 | Decoder, Interactive Segmentation | Code Available | 1 |
| Track Initialization and Re-Identification for 3D Multi-View Multi-Object Tracking | May 28, 2024 | 3D Multi-Object Tracking, Multi-Object Tracking | Code Available | 1 |
| DiffuBox: Refining 3D Object Detection with Point Diffusion | May 25, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 1 |
| CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation | May 24, 2024 | Generalized Referring Expression Segmentation, Object | Code Available | 1 |
| PuTR: A Pure Transformer for Decoupled and Online Multi-Object Tracking | May 23, 2024 | Multi-Object Tracking, Object | Code Available | 1 |
| Improving Single Domain-Generalized Object Detection: A Focus on Diversification and Alignment | May 23, 2024 | Decision Making, Domain Generalization | Code Available | 1 |
| MOD-UV: Learning Mobile Object Detectors from Unlabeled Videos | May 23, 2024 | Motion Segmentation, Object | Code Available | 1 |
| FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention | May 19, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 1 |
| Visible and Clear: Finding Tiny Objects in Difference Map | May 18, 2024 | Object, object-detection | Code Available | 1 |
| Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection | May 16, 2024 | Object, object-detection | Code Available | 1 |
| RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images | May 14, 2024 | 6D Pose Estimation, Object | Code Available | 1 |
| Zero Shot Context-Based Object Segmentation using SLIP (SAM+CLIP) | May 12, 2024 | Object, Segmentation | Code Available | 1 |
| Multi-Object Tracking in the Dark | May 10, 2024 | Autonomous Driving, Multi-Object Tracking | Code Available | 1 |
| Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos | May 7, 2024 | Denoising, Object | Code Available | 1 |
| Hand-Object Interaction Controller (HOIC): Deep Reinforcement Learning for Reconstructing Interactions with Physics | May 4, 2024 | Deep Reinforcement Learning, Object | Code Available | 1 |
| Towards Consistent Object Detection via LiDAR-Camera Synergy | May 2, 2024 | Object, object-detection | Code Available | 1 |