| Orientation Matters: Making 3D Generative Models Orientation-Aligned | Jun 10, 2025 | Object | —Unverified | 0 |
| Hierarchical Neural Collapse Detection Transformer for Class Incremental Object Detection | Jun 10, 2025 | Class-Incremental Object Detection, Object | —Unverified | 0 |
| UA-Pose: Uncertainty-Aware 6D Object Pose Estimation and Online Object Completion with Partial References | Jun 9, 2025 | 6D Pose Estimation using RGB, Image to 3D | —Unverified | 0 |
| MapBERT: Bitwise Masked Modeling for Real-Time Semantic Mapping Generation | Jun 9, 2025 | Computational Efficiency, Object | —Unverified | 0 |
| SAM2Auto: Auto Annotation Using FLASH | Jun 9, 2025 | Instance Segmentation, Object | —Unverified | 0 |
| Domain Randomization for Object Detection in Manufacturing Applications using Synthetic Data: A Comprehensive Study | Jun 9, 2025 | Object, object-detection | Code Available | 0 |
| R3D2: Realistic 3D Asset Insertion via Diffusion for Autonomous Driving Simulation | Jun 9, 2025 | 3DGS, Autonomous Driving | —Unverified | 0 |
| Multiple Object Stitching for Unsupervised Representation Learning | Jun 9, 2025 | Contrastive Learning, Object | Code Available | 1 |
| HOI-PAGE: Zero-Shot Human-Object Interaction Generation with Part Affordance Guidance | Jun 8, 2025 | Human-Object Interaction Detection, Human-Object Interaction Generation | —Unverified | 0 |
| Object Navigation with Structure-Semantic Reasoning-Based Multi-level Map and Multimodal Decision-Making LLM | Jun 6, 2025 | Decision Making, Object | —Unverified | 0 |
| Edge-Enabled Collaborative Object Detection for Real-Time Multi-Vehicle Perception | Jun 6, 2025 | Autonomous Driving, Autonomous Vehicles | Code Available | 0 |
| EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? | Jun 5, 2025 | Object | —Unverified | 0 |
| Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning | Jun 5, 2025 | In-Context Learning, Indoor Scene Synthesis | —Unverified | 0 |
| CIVET: Systematic Evaluation of Understanding in VLMs | Jun 5, 2025 | Object | —Unverified | 0 |
| RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion | Jun 5, 2025 | Novel View Synthesis, Object | —Unverified | 0 |
| Gen-n-Val: Agentic Image Data Generation and Validation | Jun 5, 2025 | Image Harmonization, Instance Segmentation | —Unverified | 0 |
| Light and 3D: a methodological exploration of digitisation techniques adapted to a selection of objects from the Musée d'Archéologie Nationale | Jun 5, 2025 | Diversity, Object | —Unverified | 0 |
| Feature-Based Lie Group Transformer for Real-World Applications | Jun 5, 2025 | Object, Object Recognition | —Unverified | 0 |
| From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes | Jun 5, 2025 | 3D visual grounding, Object | —Unverified | 0 |
| Object-X: Learning to Reconstruct Multi-Modal 3D Object Representations | Jun 5, 2025 | 3D Object Reconstruction, Novel View Synthesis | —Unverified | 0 |
| Rex-Thinker: Grounded Object Referring via Chain-of-Thought Reasoning | Jun 4, 2025 | Object, Referring Expression | —Unverified | 0 |
| MambaNeXt-YOLO: A Hybrid State Space Model for Real-time Object Detection | Jun 4, 2025 | Mamba, Novel Object Detection | —Unverified | 0 |
| Sounding that Object: Interactive Object-Aware Image to Audio Generation | Jun 4, 2025 | Audio Generation, Image Segmentation | —Unverified | 0 |
| SemNav: A Model-Based Planner for Zero-Shot Object Goal Navigation Using Vision-Foundation Models | Jun 4, 2025 | Object | —Unverified | 0 |
| ReSpace: Text-Driven 3D Scene Synthesis and Editing with Preference Alignment | Jun 3, 2025 | Indoor Scene Synthesis, Object | —Unverified | 0 |
| Tru-POMDP: Task Planning Under Uncertainty via Tree of Hypotheses and Open-Ended POMDPs | Jun 3, 2025 | Object, Object Rearrangement | —Unverified | 0 |
| InterRVOS: Interaction-aware Referring Video Object Segmentation | Jun 3, 2025 | 8k, Object | —Unverified | 0 |
| unMORE: Unsupervised Multi-Object Segmentation via Center-Boundary Reasoning | Jun 2, 2025 | Image Reconstruction, Object | Code Available | 0 |
| WoMAP: World Models For Embodied Open-Vocabulary Object Localization | Jun 2, 2025 | Active Object Localization, Efficient Exploration | —Unverified | 0 |
| ComposeAnything: Composite Object Priors for Text-to-Image Generation | May 30, 2025 | Denoising, Image Generation | —Unverified | 0 |
| SORCE: Small Object Retrieval in Complex Environments | May 30, 2025 | Benchmarking, Image Retrieval | Code Available | 0 |
| InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing | May 30, 2025 | Human-Object Interaction Detection, Object | —Unverified | 0 |
| Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames | May 30, 2025 | Object, Spatial Reasoning | —Unverified | 0 |
| Object Centric Concept Bottlenecks | May 30, 2025 | Decision Making, Object | —Unverified | 0 |
| DexMachina: Functional Retargeting for Bimanual Dexterous Manipulation | May 30, 2025 | Object | —Unverified | 0 |
| Conformal Object Detection by Sequential Risk Control | May 29, 2025 | Conformal Prediction, Object | —Unverified | 0 |
| FMG-Det: Foundation Model Guided Robust Object Detection | May 29, 2025 | Multiple Instance Learning, Object | —Unverified | 0 |
| Rooms from Motion: Un-posed Indoor 3D Object Detection as Localization and Mapping | May 29, 2025 | 3D Object Detection, Object | —Unverified | 0 |
| MOVi: Training-free Text-conditioned Multi-Object Video Generation | May 29, 2025 | Object, Video Generation | —Unverified | 0 |
| Language-guided Learning for Object Detection Tackling Multiple Variations in Aerial Images | May 29, 2025 | Novel Object Detection, Object | —Unverified | 0 |
| Disrupting Vision-Language Model-Driven Navigation Services via Adversarial Object Fusion | May 29, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| The Meeseeks Mesh: Spatially Consistent 3D Adversarial Objects for BEV Detector | May 28, 2025 | 3D Object Detection, Autonomous Driving | —Unverified | 0 |
| LPOI: Listwise Preference Optimization for Vision Language Models | May 27, 2025 | Object | Code Available | 1 |
| Right Side Up? Disentangling Orientation Understanding in MLLMs with Fine-grained Multi-axis Perception Tasks | May 27, 2025 | 3D Scene Reconstruction, Diagnostic | —Unverified | 0 |
| PartInstruct: Part-level Instruction Following for Fine-grained Robot Manipulation | May 27, 2025 | Instruction Following, Object | —Unverified | 0 |
| CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects | May 27, 2025 | Object | —Unverified | 0 |
| ReaMOT: A Benchmark and Framework for Reasoning-based Multi-Object Tracking | May 26, 2025 | Multi-Object Tracking, Object | Code Available | 1 |
| Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models | May 26, 2025 | Disentanglement, Hallucination | Code Available | 0 |
| Progressive Scaling Visual Object Tracking | May 26, 2025 | Object, Object Tracking | —Unverified | 0 |
| Category-Agnostic Neural Object Rigging | May 26, 2025 | Object | —Unverified | 0 |