| Objects matter: object-centric world models improve reinforcement learning in visually complex environments | Jan 27, 2025 | Atari Games, Deep Reinforcement Learning | Unverified | 0 |
| 3D Reconstruction of non-visible surfaces of objects from a Single Depth View -- Comparative Study | Jan 27, 2025 | 3D Reconstruction, Object | Unverified | 0 |
| Domain Adaptation from Generated Multi-Weather Images for Unsupervised Maritime Object Classification | Jan 26, 2025 | Domain Adaptation, Object | Code Available | 0 |
| Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models | Jan 25, 2025 | Attribute, Contrastive Learning | Code Available | 2 |
| Evaluating Hallucination in Large Vision-Language Models based on Context-Aware Object Similarities | Jan 25, 2025 | Hallucination, Object | Unverified | 0 |
| Estimation-theoretic analysis of lensless imaging | Jan 24, 2025 | Object | Unverified | 0 |
| ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations | Jan 24, 2025 | Decoder, Object | Unverified | 0 |
| CSAOT: Cooperative Multi-Agent System for Active Object Tracking | Jan 23, 2025 | Autonomous Navigation, Deep Reinforcement Learning | Unverified | 0 |
| CuriousBot: Interactive Mobile Exploration via Actionable 3D Relational Object Graph | Jan 23, 2025 | Object | Unverified | 0 |
| MONA: Moving Object Detection from Videos Shot by Dynamic Camera | Jan 22, 2025 | Moving Object Detection, Object | Unverified | 0 |
| Slot-BERT: Self-supervised Object Discovery in Surgical Video | Jan 21, 2025 | Disentanglement, Domain Adaptation | Unverified | 0 |
| TOFFE -- Temporally-binned Object Flow from Events for High-speed and Energy-Efficient Object Detection and Tracking | Jan 21, 2025 | Autonomous Navigation, GPU | Unverified | 0 |
| SMamba: Sparse Mamba for Event-based Object Detection | Jan 21, 2025 | Mamba, Object | Code Available | 1 |
| Green Video Camouflaged Object Detection | Jan 19, 2025 | Object, object-detection | Unverified | 0 |
| Surface-SOS: Self-Supervised Object Segmentation via Neural Surface Representation | Jan 17, 2025 | NeRF, Object | Code Available | 0 |
| FLORA: Formal Language Model Enables Robust Training-free Zero-shot Object Referring Analysis | Jan 17, 2025 | Bayesian Inference, Language Modeling | Unverified | 0 |
| RE-POSE: Synergizing Reinforcement Learning-Based Partitioning and Offloading for Edge Object Detection | Jan 16, 2025 | Autonomous Driving, Object | Unverified | 0 |
| MonoSOWA: Scalable monocular 3D Object detector Without human Annotations | Jan 16, 2025 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| Detecting Contextual Anomalies by Discovering Consistent Spatial Regions | Jan 14, 2025 | Anomaly Detection, Clustering | Unverified | 0 |
| Predicting Performance of Object Detection Models in Electron Microscopy Using Random Forests | Jan 14, 2025 | Defect Detection, Object | Code Available | 0 |
| Bootstrapping Corner Cases: High-Resolution Inpainting for Safety Critical Detect and Avoid for Automated Flying | Jan 14, 2025 | Object, object-detection | Unverified | 0 |
| Learning Motion and Temporal Cues for Unsupervised Video Object Segmentation | Jan 14, 2025 | Object, object-detection | Code Available | 1 |
| Everybody Likes to Sleep: A Computer-Assisted Comparison of Object Naming Data from 30 Languages | Jan 14, 2025 | Object | Code Available | 0 |
| SmartEraser: Remove Anything from Images using Masked-Region Guidance | Jan 14, 2025 | Instance Segmentation, Object | Unverified | 0 |
| DAViD: Modeling Dynamic Affordance of 3D Objects using Pre-trained Video Diffusion Models | Jan 14, 2025 | Human-Object Interaction Detection, Object | Unverified | 0 |
| Object-Centric 2D Gaussian Splatting: Background Removal and Occlusion-Aware Pruning for Compact Object Models | Jan 14, 2025 | Object | Unverified | 0 |
| BlobGEN-Vid: Compositional Text-to-Video Generation with Blob Video Representations | Jan 13, 2025 | Object, Text-to-Video Generation | Unverified | 0 |
| Collaborative Learning for 3D Hand-Object Reconstruction and Compositional Action Recognition from Egocentric RGB Videos Using Superquadrics | Jan 13, 2025 | Action Recognition, hand-object pose | Unverified | 0 |
| SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing | Jan 13, 2025 | Object, object-detection | Code Available | 0 |
| VDOR: A Video-based Dataset for Object Removal via Sequence Consistency | Jan 13, 2025 | Image Inpainting, Object | Unverified | 0 |
| VAGeo: View-specific Attention for Cross-View Object Geo-Localization | Jan 13, 2025 | geo-localization, Object | Unverified | 0 |
| Toward Realistic Camouflaged Object Detection: Benchmarks and Method | Jan 13, 2025 | Instance Segmentation, Object | Code Available | 1 |
| UnCommon Objects in 3D | Jan 13, 2025 | Object | Code Available | 5 |
| Guided SAM: Label-Efficient Part Segmentation | Jan 13, 2025 | Object, Object Recognition | Unverified | 0 |
| 3DCoMPaT200: Language-Grounded Compositional Understanding of Parts and Materials of 3D Shapes | Jan 12, 2025 | Navigate, Object | Code Available | 1 |
| Mamba-MOC: A Multicategory Remote Object Counting via State Space Model | Jan 12, 2025 | Mamba, Object | Unverified | 0 |
| UniQ: Unified Decoder with Task-specific Queries for Efficient Scene Graph Generation | Jan 10, 2025 | Decoder, Graph Generation | Unverified | 0 |
| From Simple to Complex Skills: The Case of In-Hand Object Reorientation | Jan 9, 2025 | Object | Unverified | 0 |
| Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation | Jan 9, 2025 | Image Animation, Object | Unverified | 0 |
| Improving Skeleton-based Action Recognition with Interactive Object Information | Jan 9, 2025 | Action Recognition, Data Augmentation | Code Available | 0 |
| UPAQ: A Framework for Real-Time and Energy-Efficient 3D Object Detection in Autonomous Vehicles | Jan 8, 2025 | 3D Object Detection, Autonomous Vehicles | Unverified | 0 |
| TexHOI: Reconstructing Textures of 3D Unknown Objects in Monocular Hand-Object Interaction Scenes | Jan 7, 2025 | Object | Code Available | 0 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2k, Language Modeling | Code Available | 5 |
| Learning to Transfer Human Hand Skills for Robot Manipulations | Jan 7, 2025 | Object, Robot Manipulation | Unverified | 0 |
| AuxDepthNet: Real-Time Monocular 3D Object Detection with Depth-Sensitive Features | Jan 7, 2025 | 3D Object Detection, Computational Efficiency | Unverified | 0 |
| Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Jan 7, 2025 | Object, object-detection | Code Available | 4 |
| Universal Fine-grained Visual Categorization by Concept Guided Learning | Jan 6, 2025 | Fine-Grained Image Classification, Fine-Grained Visual Categorization | Code Available | 0 |
| HOGSA: Bimanual Hand-Object Interaction Understanding with 3D Gaussian Splatting Based Data Augmentation | Jan 6, 2025 | 3DGS, Data Augmentation | Unverified | 0 |
| Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation | Jan 6, 2025 | Image to Video Generation, Object | Unverified | 0 |
| Human Gaze Boosts Object-Centered Representation Learning | Jan 6, 2025 | Gaze Prediction, Object | Unverified | 0 |