| 1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation | Jan 1, 2024 | Object, Referring Video Object Segmentation | Code Available | 1 |
| SSL-OTA: Unveiling Backdoor Threats in Self-Supervised Learning for Object Detection | Dec 30, 2023 | Autonomous Driving, Backdoor Attack | Unverified | 0 |
| Generating Enhanced Negatives for Training Language-Based Object Detectors | Dec 29, 2023 | Object, object-detection | Code Available | 0 |
| Tracking with Human-Intent Reasoning | Dec 29, 2023 | Language Modelling, Object | Code Available | 1 |
| 6D-Diff: A Keypoint Diffusion Framework for 6D Object Pose Estimation | Dec 29, 2023 | 6D Pose Estimation using RGB, Denoising | Unverified | 0 |
| HEAP: Unsupervised Object Discovery and Localization with Contrastive Grouping | Dec 29, 2023 | Object, Object Discovery | Unverified | 0 |
| MVPatch: More Vivid Patch for Adversarial Camouflaged Attacks on Object Detectors in the Physical World | Dec 29, 2023 | Object, object-detection | Unverified | 0 |
| Motion State: A New Benchmark Multiple Object Tracking | Dec 29, 2023 | Multi-Object Tracking, Multiple Object Tracking | Unverified | 0 |
| Generalization properties of contrastive world models | Dec 29, 2023 | Object | Unverified | 0 |
| Fast Quantum Convolutional Neural Networks for Low-Complexity Object Detection in Autonomous Driving Applications | Dec 28, 2023 | Autonomous Driving, Object | Unverified | 0 |
| iFusion: Inverting Diffusion for Pose-Free Reconstruction from Sparse Views | Dec 28, 2023 | 3D Object Reconstruction, Camera Pose Estimation | Code Available | 1 |
| ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe | Dec 28, 2023 | Object, Object Tracking | Code Available | 2 |
| DOEPatch: Dynamically Optimized Ensemble Model for Adversarial Patches Generation | Dec 28, 2023 | Autonomous Driving, Object | Unverified | 0 |
| DeLR: Active Learning for Detection with Decoupled Localization and Recognition Query | Dec 28, 2023 | Active Learning, Object | Unverified | 0 |
| X Modality Assisting RGBT Object Tracking | Dec 27, 2023 | Knowledge Distillation, Object | Unverified | 0 |
| ConstScene: Dataset and Model for Advancing Robust Semantic Segmentation in Construction Environments | Dec 27, 2023 | Object, object-detection | Code Available | 0 |
| In-Hand 3D Object Reconstruction from a Monocular RGB Video | Dec 27, 2023 | 3D Object Reconstruction, 3D Reconstruction | Unverified | 0 |
| A Comprehensive Study of Object Tracking in Low-Light Environments | Dec 25, 2023 | Denoising, Object | Unverified | 0 |
| UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces | Dec 25, 2023 | Image Segmentation, Object | Code Available | 2 |
| Get a Grip: Reconstructing Hand-Object Stable Grasps in Egocentric Videos | Dec 25, 2023 | Object, Object Reconstruction | Unverified | 0 |
| Prompt-Propose-Verify: A Reliable Hand-Object-Interaction Data Generation Framework using Foundational Models | Dec 23, 2023 | Image Generation, Object | Unverified | 0 |
| Scale Optimization Using Evolutionary Reinforcement Learning for Object Detection on Drone Imagery | Dec 23, 2023 | Object, object-detection | Unverified | 0 |
| FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection | Dec 22, 2023 | Data Augmentation, Object | Unverified | 0 |
| TimePillars: Temporally-Recurrent 3D LiDAR Object Detection | Dec 22, 2023 | Autonomous Driving, Diversity | Unverified | 0 |
| Transformer-Based Multi-Object Smoothing with Decoupled Data Association and Smoothing | Dec 22, 2023 | Multi-Object Tracking, Object | Unverified | 0 |
| Context Enhanced Transformer for Single Image Object Detection | Dec 22, 2023 | Object, object-detection | Unverified | 0 |
| Prototype-based Cross-Modal Object Tracking | Dec 22, 2023 | Object, Object Tracking | Code Available | 2 |
| MACS: Mass Conditioned 3D Hand and Object Motion Synthesis | Dec 22, 2023 | Motion Synthesis, Object | Unverified | 0 |
| Lift-Attend-Splat: Bird's-eye-view camera-lidar fusion using transformers | Dec 22, 2023 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| MEAOD: Model Extraction Attack against Object Detectors | Dec 22, 2023 | Active Learning, model | Unverified | 0 |
| Revisiting Few-Shot Object Detection with Vision-Language Models | Dec 22, 2023 | Autonomous Vehicles, Few-Shot Object Detection | Code Available | 0 |
| VCoder: Versatile Vision Encoders for Multimodal Large Language Models | Dec 21, 2023 | Image Captioning, Image Generation | Code Available | 2 |
| Modular Neural Network Policies for Learning In-Flight Object Catching with a Robot Hand-Arm System | Dec 21, 2023 | Deep Reinforcement Learning, Object | Unverified | 0 |
| Universal Noise Annotation: Unveiling the Impact of Noisy annotation on Object Detection | Dec 21, 2023 | image-classification, Image Classification | Code Available | 1 |
| DECO: Query-Based End-to-End Object Detection with ConvNets | Dec 21, 2023 | Decoder, Object | Code Available | 1 |
| Object Attribute Matters in Visual Question Answering | Dec 20, 2023 | Attribute, Graph Neural Network | Code Available | 0 |
| Deep Learning on Object-centric 3D Neural Fields | Dec 20, 2023 | Deep Learning, Object | Unverified | 0 |
| Object-aware Adaptive-Positivity Learning for Audio-Visual Question Answering | Dec 20, 2023 | Audio-visual Question Answering, Audio-Visual Question Answering (AVQA) | Code Available | 0 |
| OCTOPUS: Open-vocabulary Content Tracking and Object Placement Using Semantic Understanding in Mixed Reality | Dec 20, 2023 | Mixed Reality, Object | Unverified | 0 |
| TAO-Amodal: A Benchmark for Tracking Any Object Amodally | Dec 19, 2023 | Amodal Tracking, Autonomous Driving | Code Available | 1 |
| On the Effectiveness of Retrieval, Alignment, and Replay in Manipulation | Dec 19, 2023 | Behavioural cloning, Imitation Learning | Unverified | 0 |
| First qualitative observations on deep learning vision model YOLO and DETR for automated driving in Austria | Dec 19, 2023 | 2D Object Detection, Autonomous Driving | Unverified | 0 |
| Object-Aware Domain Generalization for Object Detection | Dec 19, 2023 | Autonomous Driving, Contrastive Learning | Code Available | 1 |
| LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset | Dec 19, 2023 | 3D Object Detection, Object | Unverified | 0 |
| ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room | Dec 19, 2023 | Action Classification, Activity Recognition | Unverified | 0 |
| Weakly Supervised Open-Vocabulary Object Detection | Dec 19, 2023 | Attribute, Novel Concepts | Unverified | 0 |
| Scene-Conditional 3D Object Stylization and Composition | Dec 19, 2023 | Object | Unverified | 0 |
| EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering | Dec 19, 2023 | Object, Object Counting | Code Available | 1 |
| Transformer Network for Multi-Person Tracking and Re-Identification in Unconstrained Environment | Dec 19, 2023 | Decoder, Multi-Object Tracking | Unverified | 0 |
| CLIM: Contrastive Language-Image Mosaic for Region Representation | Dec 18, 2023 | Object, object-detection | Code Available | 1 |