| DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM | Mar 19, 2024 | Object, object-detection | Code Available | 1 |
| Boosting Zero-Shot Human-Object Interaction Detection with Vision-Language Transfer | Mar 18, 2024 | Human-Object Interaction Detection, Language Modeling | Code Available | 0 |
| R3DS: Reality-linked 3D Scenes for Panoramic Scene Understanding | Mar 18, 2024 | Object, Relation Prediction | Unverified | 0 |
| Prototipo de un Contador Bidireccional Automático de Personas basado en sensores de visión 3D | Mar 18, 2024 | Object, object-detection | Unverified | 0 |
| FlexCap: Describe Anything in Images in Controllable Detail | Mar 18, 2024 | Attribute, Dense Captioning | Unverified | 0 |
| HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data | Mar 18, 2024 | 6D Pose Estimation using RGB, Image Generation | Unverified | 0 |
| Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model | Mar 18, 2024 | Object, Visual Tracking | Unverified | 0 |
| Prioritized Semantic Learning for Zero-shot Instance Navigation | Mar 18, 2024 | Language Modelling, Object | Code Available | 1 |
| Video Object Segmentation with Dynamic Query Modulation | Mar 18, 2024 | Object, Segmentation | Code Available | 1 |
| Circle Representation for Medical Instance Object Segmentation | Mar 18, 2024 | Instance Segmentation, Object | Code Available | 0 |
| Object Segmentation-Assisted Inter Prediction for Versatile Video Coding | Mar 18, 2024 | Motion Compensation, Motion Estimation | Unverified | 0 |
| GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects | Mar 18, 2024 | 6D Pose Estimation using RGB, Object | Unverified | 0 |
| THOR: Text to Human-Object Interaction Diffusion via Relation Intervention | Mar 17, 2024 | Diversity, Human-Object Interaction Detection | Unverified | 0 |
| NetTrack: Tracking Highly Dynamic Objects with a Net | Mar 17, 2024 | Multi-Object Tracking, Object | Code Available | 2 |
| CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | Mar 17, 2024 | Object, object-detection | Code Available | 2 |
| FORCE: Physics-aware Human-object Interaction | Mar 17, 2024 | Diversity, Friction | Unverified | 0 |
| Creating Seamless 3D Maps Using Radiance Fields | Mar 17, 2024 | NeRF, Object | Unverified | 0 |
| GRA: Detecting Oriented Objects through Group-wise Rotating and Attention | Mar 17, 2024 | Object, object-detection | Unverified | 0 |
| Unsupervised Collaborative Metric Learning with Mixed-Scale Groups for General Object Retrieval | Mar 16, 2024 | Metric Learning, Object | Code Available | 1 |
| View-Centric Multi-Object Tracking with Homographic Matching in Moving UAV | Mar 16, 2024 | Homography Estimation, Multi-Object Tracking | Unverified | 0 |
| Segment Any Object Model (SAOM): Real-to-Simulation Fine-Tuning Strategy for Multi-Class Multi-Instance Segmentation | Mar 16, 2024 | Instance Segmentation, Object | Unverified | 0 |
| IMPRINT: Generative Object Compositing by Learning Identity-Preserving Representation | Mar 15, 2024 | Object | Unverified | 0 |
| GS-Pose: Generalizable Segmentation-based 6D Object Pose Estimation with 3D Gaussian Splatting | Mar 15, 2024 | 6D Pose Estimation using RGB, Object | Unverified | 0 |
| Latent Object Characteristics Recognition with Visual to Haptic-Audio Cross-modal Transfer Learning | Mar 15, 2024 | Object, Object Recognition | Unverified | 0 |
| Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification | Mar 15, 2024 | Object | Code Available | 2 |
| Generative Region-Language Pretraining for Open-Ended Object Detection | Mar 15, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Learning Physical Dynamics for Object-centric Visual Prediction | Mar 15, 2024 | Object, Prediction | Unverified | 0 |
| Grasp Anything: Combining Teacher-Augmented Policy Gradient Learning with Instance Segmentation to Grasp Arbitrary Objects | Mar 15, 2024 | Instance Segmentation, Object | Unverified | 0 |
| Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors | Mar 14, 2024 | Benchmarking, Domain Adaptation | Code Available | 0 |
| Right Place, Right Time! Dynamizing Topological Graphs for Embodied Navigation | Mar 14, 2024 | Decision Making, Language Modeling | Unverified | 0 |
| SHAN: Object-Level Privacy Detection via Inference on Scene Heterogeneous Graph | Mar 14, 2024 | Graph Attention, Object | Unverified | 0 |
| Explorations in Texture Learning | Mar 14, 2024 | Object | Code Available | 0 |
| E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection | Mar 14, 2024 | Autonomous Driving, Object | Code Available | 2 |
| Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring | Mar 14, 2024 | Object, Object Counting | Unverified | 0 |
| Reconstruction and Simulation of Elastic Objects with Spring-Mass 3D Gaussians | Mar 14, 2024 | Future prediction, Object | Unverified | 0 |
| Rethinking Referring Object Removal | Mar 14, 2024 | Object | Unverified | 0 |
| Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection | Mar 14, 2024 | Knowledge Distillation, Novel Object Detection | Code Available | 2 |
| OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning | Mar 14, 2024 | Object, Object Tracking | Unverified | 0 |
| Improving Distant 3D Object Detection Using 2D Box Supervision | Mar 14, 2024 | 3D Object Detection, Depth Estimation | Unverified | 0 |
| PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest | Mar 14, 2024 | 3D Object Detection, Object | Unverified | 0 |
| FogGuard: guarding YOLO against fog using perceptual loss | Mar 13, 2024 | Autonomous Driving, Domain Adaptation | Code Available | 0 |
| TFCounter: Polishing Gems for Training-Free Object Counting | Mar 12, 2024 | Management, Object | Unverified | 0 |
| TaskCLIP: Extend Large Vision-Language Model for Task Oriented Object Detection | Mar 12, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| DragAnything: Motion Control for Anything using Entity Representation | Mar 12, 2024 | Object, Video Generation | Code Available | 7 |
| Entropy is not Enough for Test-Time Adaptation: From the Perspective of Disentangled Factors | Mar 12, 2024 | Object, Pseudo Label | Code Available | 1 |
| JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection | Mar 12, 2024 | Motion Compensation, Moving Object Detection | Unverified | 0 |
| FSC: Few-point Shape Completion | Mar 12, 2024 | Decoder, Object | Code Available | 1 |
| Learn and Search: An Elegant Technique for Object Lookup using Contrastive Learning | Mar 12, 2024 | Contrastive Learning, Object | Unverified | 0 |
| Adaptive Bounding Box Uncertainties via Two-Step Conformal Prediction | Mar 12, 2024 | Autonomous Driving, Conformal Prediction | Code Available | 1 |
| Category-Agnostic Pose Estimation for Point Clouds | Mar 12, 2024 | Category-Agnostic Pose Estimation, Object | Unverified | 0 |