| SBDet: A Symmetry-Breaking Object Detector via Relaxed Rotation-Equivariance | Aug 21, 2024 | 2D Object Detection, image-classification | Unverified | 0 |
| Domain-invariant Progressive Knowledge Distillation for UAV-based Object Detection | Aug 21, 2024 | Knowledge Distillation, Object | Unverified | 0 |
| Detection-Driven Object Count Optimization for Text-to-Image Diffusion Models | Aug 21, 2024 | Denoising, Image Generation | Unverified | 0 |
| Target-Oriented Object Grasping via Multimodal Human Guidance | Aug 20, 2024 | Motion Planning, Object | Unverified | 0 |
| A Review of Human-Object Interaction Detection | Aug 20, 2024 | Human-Object Interaction Detection, Object | Unverified | 0 |
| Just a Hint: Point-Supervised Camouflaged Object Detection | Aug 20, 2024 | Contrastive Learning, Object | Unverified | 0 |
| LSVOS Challenge 3rd Place Report: SAM2 and Cutie based VOS | Aug 20, 2024 | Instance Segmentation, Object | Unverified | 0 |
| Aligning Object Detector Bounding Boxes with Human Preference | Aug 20, 2024 | Object | Code Available | 0 |
| On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes | Aug 20, 2024 | Object, object-detection | Unverified | 0 |
| Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track | Aug 19, 2024 | Object, Segmentation | Unverified | 0 |
| Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering | Aug 19, 2024 | Inverse Rendering, Object | Unverified | 0 |
| 3D-Aware Instance Segmentation and Tracking in Egocentric Videos | Aug 19, 2024 | 3D Object Reconstruction, Instance Segmentation | Unverified | 0 |
| Enforcing View-Consistency in Class-Agnostic 3D Segmentation Fields | Aug 19, 2024 | Contrastive Learning, Object | Unverified | 0 |
| Physics-Aware Combinatorial Assembly Sequence Planning using Data-free Action Masking | Aug 19, 2024 | Deep Reinforcement Learning, Object | Code Available | 0 |
| Retina-Inspired Object Motion Segmentation for Event-Cameras | Aug 18, 2024 | Decision Making, Motion Compensation | Unverified | 0 |
| GSLAMOT: A Tracklet and Query Graph-based Simultaneous Locating, Mapping, and Multiple Object Tracking System | Aug 17, 2024 | Multiple Object Tracking, Object | Unverified | 0 |
| Zero-Shot Object-Centric Representation Learning | Aug 17, 2024 | Object, Object Discovery | Unverified | 0 |
| MaskBEV: Towards A Unified Framework for BEV Detection and Map Segmentation | Aug 17, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| Depth-guided Texture Diffusion for Image Semantic Segmentation | Aug 17, 2024 | Object, object-detection | Unverified | 0 |
| TEXTOC: Text-driven Object-Centric Style Transfer | Aug 16, 2024 | Object, Style Transfer | Unverified | 0 |
| Enhancing Object Detection with Hybrid dataset in Manufacturing Environments: Comparing Federated Learning to Conventional Techniques | Aug 16, 2024 | Federated Learning, Object | Unverified | 0 |
| Multimodal Relational Triple Extraction with Query-based Entity Object Transformer | Aug 16, 2024 | Knowledge Graphs, Object | Unverified | 0 |
| FunEditor: Achieving Complex Image Edits via Function Aggregation with Diffusion Models | Aug 16, 2024 | Image Quality Assessment, Object | Unverified | 0 |
| Infra-YOLO: Efficient Neural Network Structure with Model Compression for Real-Time Infrared Small Object Detection | Aug 14, 2024 | Efficient Neural Network, Model Compression | Unverified | 0 |
| See It All: Contextualized Late Aggregation for 3D Dense Captioning | Aug 14, 2024 | 3D dense captioning, All | Unverified | 0 |
| SceneGPT: A Language Model for 3D Scene Understanding | Aug 13, 2024 | In-Context Learning, Language Modeling | Unverified | 0 |
| SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields | Aug 13, 2024 | Novel View Synthesis, Object | Unverified | 0 |
| Exploring Domain Shift on Radar-Based 3D Object Detection Amidst Diverse Environmental Conditions | Aug 13, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| Divide and Conquer: Improving Multi-Camera 3D Perception with 2D Semantic-Depth Priors and Input-Dependent Queries | Aug 13, 2024 | 3D Object Detection, BEV Segmentation | Unverified | 0 |
| Bi-directional Contextual Attention for 3D Dense Captioning | Aug 13, 2024 | 3D dense captioning, Attribute | Unverified | 0 |
| MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection | Aug 12, 2024 | 3D Object Detection, Autonomous Vehicles | Unverified | 0 |
| DPDETR: Decoupled Position Detection Transformer for Infrared-Visible Object Detection | Aug 12, 2024 | Decoder, Object | Code Available | 0 |
| U-DECN: End-to-End Underwater Object Detection ConvNet with Improved DeNoising Training | Aug 11, 2024 | Denoising, Object | Code Available | 0 |
| Robust Domain Generalization for Multi-modal Object Recognition | Aug 11, 2024 | Domain Generalization, Multi-Label Classification | Unverified | 0 |
| MacFormer: Semantic Segmentation with Fine Object Boundaries | Aug 11, 2024 | Decoder, Object | Unverified | 0 |
| SABER-6D: Shape Representation Based Implicit Object Pose Estimation | Aug 11, 2024 | Decoder, Object | Unverified | 0 |
| Embodied Uncertainty-Aware Object Segmentation | Aug 8, 2024 | Instance Segmentation, Interactive Segmentation | Unverified | 0 |
| Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models | Aug 8, 2024 | Contrastive Learning, Fine-Grained Image Recognition | Unverified | 0 |
| ArtVLM: Attribute Recognition Through Vision-Based Prefix Language Modeling | Aug 7, 2024 | Attribute, Language Modeling | Unverified | 0 |
| An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion | Aug 6, 2024 | 3D Shape Generation, Image Generation | Unverified | 0 |
| Understanding How Blind Users Handle Object Recognition Errors: Strategies and Challenges | Aug 6, 2024 | Object, Object Recognition | Unverified | 0 |
| Mitigating Hallucinations in Large Vision-Language Models (LVLMs) via Language-Contrastive Decoding (LCD) | Aug 6, 2024 | Object | Unverified | 0 |
| LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion | Aug 6, 2024 | Object, Robotic Grasping | Unverified | 0 |
| HQOD: Harmonious Quantization for Object Detection | Aug 5, 2024 | Object, object-detection | Code Available | 0 |
| View-consistent Object Removal in Radiance Fields | Aug 4, 2024 | Image Inpainting, Object | Unverified | 0 |
| KAN-RCBEVDepth: A multi-modal fusion algorithm in object detection for autonomous driving | Aug 4, 2024 | 3D Object Detection, Attribute | Code Available | 0 |
| Pixel-Level Domain Adaptation: A New Perspective for Enhancing Weakly Supervised Semantic Segmentation | Aug 4, 2024 | Domain Adaptation, Object | Code Available | 0 |
| A Survey and Evaluation of Adversarial Attacks for Object Detection | Aug 4, 2024 | Adversarial Robustness, Autonomous Vehicles | Unverified | 0 |
| Supervised Image Translation from Visible to Infrared Domain for Object Detection | Aug 3, 2024 | Generative Adversarial Network, Object | Unverified | 0 |
| Domain penalisation for improved Out-of-Distribution Generalisation | Aug 3, 2024 | Object, object-detection | Unverified | 0 |