| SOS: Segment Object System for Open-World Instance Segmentation With Object Priors | Sep 22, 2024 | Instance Segmentation, Object | Unverified | 0 |
| A Bottom-Up Approach to Class-Agnostic Image Segmentation | Sep 20, 2024 | Image Segmentation, Metric Learning | Unverified | 0 |
| Formula-Supervised Visual-Geometric Pre-training | Sep 20, 2024 | 3D Object Classification, 3D Object Recognition | Unverified | 0 |
| Learning to Play Video Games with Intuitive Physics Priors | Sep 20, 2024 | Decision Making, Object | Unverified | 0 |
| Frequency-Guided Spatial Adaptation for Camouflaged Object Detection | Sep 19, 2024 | Object, object-detection | Unverified | 0 |
| Interpretable Action Recognition on Hard to Classify Actions | Sep 19, 2024 | Action Recognition, Depth Estimation | Unverified | 0 |
| End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting | Sep 19, 2024 | Decoder, Object | Unverified | 0 |
| PoTATO: A Dataset for Analyzing Polarimetric Traces of Afloat Trash Objects | Sep 19, 2024 | Object, object-detection | Code Available | 0 |
| SIM-OFE: Structure Information Mining and Object-aware Feature Enhancement for Fine-Grained Visual Categorization | Sep 18, 2024 | Fine-Grained Image Classification, Fine-Grained Visual Categorization | Unverified | 0 |
| FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation | Sep 18, 2024 | 6D Pose Estimation using RGB, Object | Unverified | 0 |
| Representing Positional Information in Generative World Models for Object Manipulation | Sep 18, 2024 | Object | Unverified | 0 |
| End-to-End Probabilistic Geometry-Guided Regression for 6DoF Object Pose Estimation | Sep 18, 2024 | 6D Pose Estimation using RGB, Object | Unverified | 0 |
| Towards Global Localization using Multi-Modal Object-Instance Re-Identification | Sep 18, 2024 | Camera Localization, Object | Code Available | 0 |
| One Map to Find Them All: Real-time Open-Vocabulary Mapping for Zero-shot Multi-Object Navigation | Sep 18, 2024 | All, Object | Unverified | 0 |
| DETECLAP: Enhancing Audio-Visual Representation Learning with Object Information | Sep 18, 2024 | Object, Representation Learning | Unverified | 0 |
| Open-Set Semantic Uncertainty Aware Metric-Semantic Graph Matching | Sep 17, 2024 | Graph Matching, Loop Closure Detection | Unverified | 0 |
| VALO: A Versatile Anytime Framework for LiDAR-based Object Detection Deep Neural Networks | Sep 17, 2024 | Object, object-detection | Code Available | 0 |
| TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection | Sep 17, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| RealDiff: Real-world 3D Shape Completion using Self-Supervised Diffusion Models | Sep 16, 2024 | Object, Point Cloud Completion | Unverified | 0 |
| HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models | Sep 16, 2024 | Attribute, Decoder | Code Available | 0 |
| LithoHoD: A Litho Simulator-Powered Framework for IC Layout Hotspot Detection | Sep 16, 2024 | Object, object-detection | Unverified | 0 |
| Do Pre-trained Vision-Language Models Encode Object States? | Sep 16, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Point2Graph: An End-to-end Point Cloud-based 3D Open-Vocabulary Scene Graph for Robot Navigation | Sep 16, 2024 | 3D Open-Vocabulary Object Detection, Graph Generation | Unverified | 0 |
| Uncertainty-Guided Appearance-Motion Association Network for Out-of-Distribution Action Detection | Sep 16, 2024 | Action Detection, Object | Code Available | 0 |
| NARF24: Estimating Articulated Object Structure for Implicit Rendering | Sep 15, 2024 | Image Segmentation, NeRF | Unverified | 0 |
| Enhancing Weakly-Supervised Object Detection on Static Images through (Hallucinated) Motion | Sep 15, 2024 | Object, object-detection | Unverified | 0 |
| NBBOX: Noisy Bounding Box Improves Remote Sensing Object Detection | Sep 14, 2024 | Data Augmentation, Object | Code Available | 0 |
| Evaluating authenticity and quality of image captions via sentiment and semantic analyses | Sep 14, 2024 | Image Captioning, Image to text | Unverified | 0 |
| ChildPlay-Hand: A Dataset of Hand Manipulations in the Wild | Sep 14, 2024 | Action Recognition, Hand Detection | Unverified | 0 |
| Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing | Sep 13, 2024 | Earth Observation, Object | Unverified | 0 |
| SURGIVID: Annotation-Efficient Surgical Video Object Discovery | Sep 12, 2024 | Object, Object Discovery | Unverified | 0 |
| Multi-object event graph representation learning for Video Question Answering | Sep 12, 2024 | Contrastive Learning, Graph Representation Learning | Unverified | 0 |
| Hand-Object Interaction Pretraining from Videos | Sep 12, 2024 | Object, Reinforcement Learning (RL) | Unverified | 0 |
| Single-View 3D Reconstruction via SO(2)-Equivariant Gaussian Sculpting Networks | Sep 11, 2024 | 3D Object Reconstruction, 3D Reconstruction | Unverified | 0 |
| A Bayesian Framework for Active Tactile Object Recognition, Pose Estimation and Shape Transfer Learning | Sep 10, 2024 | Active Learning, Object | Unverified | 0 |
| Object Modeling from Underwater Forward-Scan Sonar Imagery with Sea-Surface Multipath | Sep 10, 2024 | Object | Unverified | 0 |
| LEIA: Latent View-invariant Embeddings for Implicit 3D Articulation | Sep 10, 2024 | Motion Estimation, NeRF | Unverified | 0 |
| An Attribute-Enriched Dataset and Auto-Annotated Pipeline for Open Detection | Sep 10, 2024 | Attribute, Object | Unverified | 0 |
| LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation | Sep 9, 2024 | Object, Referring Video Object Segmentation | Unverified | 0 |
| Replay Consolidation with Label Propagation for Continual Object Detection | Sep 9, 2024 | Autonomous Driving, Continual Learning | Unverified | 0 |
| From Words to Poses: Enhancing Novel Object Pose Estimation with Vision Language Models | Sep 9, 2024 | 6D Pose Estimation using RGB, NeRF | Unverified | 0 |
| Leveraging Object Priors for Point Tracking | Sep 9, 2024 | Object, Point Tracking | Code Available | 0 |
| RCBEVDet++: Toward High-accuracy Radar-Camera Fusion 3D Perception Network | Sep 8, 2024 | 3D Multi-Object Tracking, 3D Object Detection | Unverified | 0 |
| A Low-Computational Video Synopsis Framework with a Standard Dataset | Sep 8, 2024 | Object, object-detection | Code Available | 0 |
| Thinking Outside the BBox: Unconstrained Generative Object Compositing | Sep 6, 2024 | Object | Unverified | 0 |
| Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences | Sep 6, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| Dense Hand-Object(HO) GraspNet with Full Grasping Taxonomy and Dynamics | Sep 6, 2024 | 3D Hand Pose Estimation, Hand Pose Estimation | Unverified | 0 |
| Multi-Modal Diffusion for Hand-Object Grasp Generation | Sep 6, 2024 | Diversity, Grasp Generation | Code Available | 0 |
| Organized Grouped Discrete Representation for Object-Centric Learning | Sep 5, 2024 | Object, Representation Learning | Unverified | 0 |
| Pluralistic Salient Object Detection | Sep 4, 2024 | Mixture-of-Experts, Object | Unverified | 0 |