| What Do Language Models Hear? Probing for Auditory Representations in Language Models | Feb 26, 2024 | Object | —Unverified | 0 |
| Real-Time Vehicle Detection and Urban Traffic Behavior Analysis Based on UAV Traffic Videos on Mobile Devices | Feb 26, 2024 | Object, object-detection | —Unverified | 0 |
| Outline-Guided Object Inpainting with Diffusion Models | Feb 26, 2024 | Image Augmentation, Instance Segmentation | —Unverified | 0 |
| SaRPFF: A Self-Attention with Register-based Pyramid Feature Fusion module for enhanced RLD detection | Feb 26, 2024 | Object, object-detection | —Unverified | 0 |
| Parallelized Spatiotemporal Binding | Feb 26, 2024 | Decoder, Object | —Unverified | 0 |
| PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models | Feb 26, 2024 | Object, Physical Commonsense Reasoning | —Unverified | 0 |
| Exploring Failure Cases in Multimodal Reasoning About Physical Dynamics | Feb 24, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge | Feb 24, 2024 | Contrastive Learning, Language Modelling | —Unverified | 0 |
| Detection Is Tracking: Point Cloud Multi-Sweep Deep Learning Models Revisited | Feb 24, 2024 | Autonomous Driving, Object | —Unverified | 0 |
| Multi-Object Tracking by Hierarchical Visual Representations | Feb 24, 2024 | Multi-Object Tracking, Object | —Unverified | 0 |
| Background Denoising for Ptychography via Wigner Distribution Deconvolution | Feb 23, 2024 | Denoising, Object | —Unverified | 0 |
| Place Anything into Any Video | Feb 22, 2024 | 3D Generation, Object | —Unverified | 0 |
| YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5 | Feb 22, 2024 | Object, object-detection | —Unverified | 0 |
| Path Planning based on 2D Object Bounding-box | Feb 22, 2024 | Autonomous Driving, Graph Neural Network | —Unverified | 0 |
| Object permanence in newborn chicks is robust against opposing evidence | Feb 22, 2024 | Object | —Unverified | 0 |
| Learning Dual-arm Object Rearrangement for Cartesian Robots | Feb 21, 2024 | Computational Efficiency, Object | —Unverified | 0 |
| Unsupervised learning based object detection using Contrastive Learning | Feb 21, 2024 | Contrastive Learning, Object | —Unverified | 0 |
| Weakly supervised localisation of prostate cancer using reinforcement learning for bi-parametric MR images | Feb 21, 2024 | Multiple Instance Learning, Object | —Unverified | 0 |
| CST: Calibration Side-Tuning for Parameter and Memory Efficient Transfer Learning | Feb 20, 2024 | GPU, Object | —Unverified | 0 |
| TEXT2AFFORD: Probing Object Affordance Prediction abilities of Language Models solely from Text | Feb 20, 2024 | Object | Code Available | 0 |
| GOOD: Towards Domain Generalized Orientated Object Detection | Feb 20, 2024 | Hallucination, Object | —Unverified | 0 |
| Visual Reasoning in Object-Centric Deep Neural Networks: A Comparative Cognition Approach | Feb 20, 2024 | Object, Relational Reasoning | Code Available | 0 |
| OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog | Feb 20, 2024 | Object, Object Tracking | —Unverified | 0 |
| Slot-VLM: SlowFast Slots for Video-Language Modeling | Feb 20, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Efficient Parameter Mining and Freezing for Continual Object Detection | Feb 20, 2024 | Continual Learning, Incremental Learning | —Unverified | 0 |
| DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models | Feb 20, 2024 | Imitation Learning, Object | —Unverified | 0 |
| Few-Shot Object Detection with Sparse Context Transformers | Feb 14, 2024 | Few-Shot Object Detection, Object | —Unverified | 0 |
| Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation | Feb 14, 2024 | Decoder, Object | —Unverified | 0 |
| Detecting Anomalous Events in Object-centric Business Processes via Graph Neural Networks | Feb 14, 2024 | Anomaly Detection, Object | Code Available | 0 |
| Leveraging Self-Supervised Instance Contrastive Learning for Radar Object Detection | Feb 13, 2024 | Contrastive Learning, Object | —Unverified | 0 |
| H2O-SDF: Two-phase Learning for 3D Indoor Reconstruction using Object Surface Fields | Feb 13, 2024 | Indoor Scene Reconstruction, NeRF | Code Available | 0 |
| Unsupervised Discovery of Object-Centric Neural Fields | Feb 12, 2024 | Object, Object Discovery | —Unverified | 0 |
| Semantic Object-level Modeling for Robust Visual Camera Relocalization | Feb 10, 2024 | Camera Relocalization, Object | —Unverified | 0 |
| Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation | Feb 9, 2024 | 6D Pose Estimation using RGB, Benchmarking | —Unverified | 0 |
| Event-to-Video Conversion for Overhead Object Detection | Feb 9, 2024 | Object, object-detection | —Unverified | 0 |
| Transfer learning with generative models for object detection on limited datasets | Feb 9, 2024 | Geophysics, Object | —Unverified | 0 |
| InstaGen: Enhancing Object Detection by Training on Synthetic Dataset | Feb 8, 2024 | Object, object-detection | —Unverified | 0 |
| FuncGrasp: Learning Object-Centric Neural Grasp Functions from Single Annotated Example Object | Feb 8, 2024 | Object | —Unverified | 0 |
| Extending 6D Object Pose Estimators for Stereo Vision | Feb 8, 2024 | 6D Pose Estimation, 6D Pose Estimation using RGB | —Unverified | 0 |
| NCRF: Neural Contact Radiance Fields for Free-Viewpoint Rendering of Hand-Object Interaction | Feb 8, 2024 | hand-object pose, Novel View Synthesis | —Unverified | 0 |
| CLIP-Loc: Multi-modal Landmark Association for Global Localization in Object-based Maps | Feb 8, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Point-VOS: Pointing Up Video Object Segmentation | Feb 8, 2024 | Object, Semantic Segmentation | —Unverified | 0 |
| Binding Dynamics in Rotating Features | Feb 8, 2024 | Object | —Unverified | 0 |
| Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration | Feb 7, 2024 | 3D Object Detection, Denoising | —Unverified | 0 |
| Shape-biased Texture Agnostic Representations for Improved Textureless and Metallic Object Detection and 6D Pose Estimation | Feb 7, 2024 | 6D Pose Estimation, Object | Code Available | 0 |
| Tactile-based Object Retrieval From Granular Media | Feb 7, 2024 | Object, Retrieval | —Unverified | 0 |
| Color Recognition in Challenging Lighting Environments: CNN Approach | Feb 7, 2024 | Edge Detection, Image Segmentation | —Unverified | 0 |
| Text2Street: Controllable Text-to-image Generation for Street Views | Feb 7, 2024 | Image Generation, Layout Generation | —Unverified | 0 |
| Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion | Feb 5, 2024 | Object, Video Generation | —Unverified | 0 |
| DexDiffuser: Generating Dexterous Grasps with Diffusion Models | Feb 5, 2024 | Denoising, Grasp Generation | —Unverified | 0 |