| Object-Centric Instruction Augmentation for Robotic Manipulation | Jan 5, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| VoxelNextFusion: A Simple, Unified and Effective Voxel Fusion Framework for Multi-Modal 3D Object Detection | Jan 5, 2024 | 3D Object Detection, Feature Importance | Unverified | 0 |
| Object-oriented backdoor attack against image captioning | Jan 5, 2024 | Backdoor Attack, Image Captioning | Unverified | 0 |
| Une ontologie pour les systèmes multi-agents ambiants dans les villes intelligentes | Jan 5, 2024 | Object | Unverified | 0 |
| PAHD: Perception-Action based Human Decision Making using Explainable Graph Neural Networks on SAR Images | Jan 5, 2024 | Decision Making, Object | Unverified | 0 |
| Fit-NGP: Fitting Object Models to Neural Graphics Primitives | Jan 4, 2024 | Object, Pose Estimation | Unverified | 0 |
| Towards Efficient Object Re-Identification with A Novel Cloud-Edge Collaborative Framework | Jan 4, 2024 | Collaborative Inference, Object | Unverified | 0 |
| ShapeAug: Occlusion Augmentation for Event Camera Data | Jan 4, 2024 | Data Augmentation, Object | Unverified | 0 |
| PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation | Jan 4, 2024 | Dataset Generation, Object | Code Available | 1 |
| Slot-guided Volumetric Object Radiance Fields | Jan 4, 2024 | Object, Representation Learning | Unverified | 0 |
| Unsupervised Object-Centric Learning from Multiple Unspecified Viewpoints | Jan 3, 2024 | Object | Unverified | 0 |
| Context-Guided Spatio-Temporal Video Grounding | Jan 3, 2024 | Object, Spatio-Temporal Video Grounding | Code Available | 2 |
| Incorporating Geo-Diverse Knowledge into Prompting for Increased Geographical Robustness in Object Recognition | Jan 3, 2024 | Descriptive, Language Modeling | Unverified | 0 |
| Image Sculpting: Precise Object Editing with 3D Geometry Control | Jan 2, 2024 | 3D geometry, Object | Unverified | 0 |
| Hybrid Pooling and Convolutional Network for Improving Accuracy and Training Convergence Speed in Object Detection | Jan 2, 2024 | Object, object-detection | Unverified | 0 |
| Depth-discriminative Metric Learning for Monocular 3D Object Detection | Jan 2, 2024 | 3D Object Detection, Depth Estimation | Unverified | 0 |
| Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting | Jan 2, 2024 | Autonomous Driving, NeRF | Code Available | 5 |
| Diffusion Handles Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D | Jan 1, 2024 | 3D Object Retrieval, Depth Estimation | Unverified | 0 |
| Unsupervised 3D Structure Inference from Category-Specific Image Collections | Jan 1, 2024 | Graph Matching, Object | Unverified | 0 |
| Learning to Segment Referred Objects from Narrated Egocentric Videos | Jan 1, 2024 | Object, Segmentation | Unverified | 0 |
| CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoor Object Detection from Multi-view Images | Jan 1, 2024 | 3D Object Detection, 3D Reconstruction | Code Available | 1 |
| Few-Shot Object Detection with Foundation Models | Jan 1, 2024 | Few-Shot Learning, Few-Shot Object Detection | Unverified | 0 |
| Exploring Orthogonality in Open World Object Detection | Jan 1, 2024 | Incremental Learning, Object | Code Available | 2 |
| PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor | Jan 1, 2024 | Object | Code Available | 4 |
| EASE-DETR: Easing the Competition among Object Queries | Jan 1, 2024 | Decoder, Object | Unverified | 0 |
| DIOD: Self-Distillation Meets Object Discovery | Jan 1, 2024 | Instance Segmentation, Knowledge Distillation | Code Available | 1 |
| Exploring Pose-Aware Human-Object Interaction via Hybrid Learning | Jan 1, 2024 | Human-Object Interaction Detection, Object | Unverified | 0 |
| On Scaling Up a Multilingual Vision and Language Model | Jan 1, 2024 | document understanding, In-Context Learning | Unverified | 0 |
| Bi-Causal: Group Activity Recognition via Bidirectional Causality | Jan 1, 2024 | Activity Recognition, Group Activity Recognition | Unverified | 0 |
| Point Segment and Count: A Generalized Framework for Object Counting | Jan 1, 2024 | Few-shot Object Counting and Detection, Knowledge Distillation | Code Available | 2 |
| LASO: Language-guided Affordance Segmentation on 3D Object | Jan 1, 2024 | Object, Segmentation | Code Available | 1 |
| Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | Jan 1, 2024 | Descriptive, Object | Code Available | 2 |
| PairDETR : Joint Detection and Association of Human Bodies and Faces | Jan 1, 2024 | Object, object-detection | Code Available | 1 |
| DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking | Jan 1, 2024 | Multi-Object Tracking, Object | Unverified | 0 |
| Cyclic Learning for Binaural Audio Generation and Localization | Jan 1, 2024 | Audio Generation, Object | Unverified | 0 |
| CORE-MPI: Consistency Object Removal with Embedding MultiPlane Image | Jan 1, 2024 | Novel View Synthesis, Object | Unverified | 0 |
| Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection | Jan 1, 2024 | feature selection, Model Compression | Unverified | 0 |
| Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector | Jan 1, 2024 | Object, object-detection | Unverified | 0 |
| Tune-An-Ellipse: CLIP Has Potential to Find What You Want | Jan 1, 2024 | Object, Referring Expression | Code Available | 1 |
| ShapeMatcher: Self-Supervised Joint Shape Canonicalization Segmentation Retrieval and Deformation | Jan 1, 2024 | Object, Retrieval | Code Available | 1 |
| SNIDA: Unlocking Few-Shot Object Detection with Non-linear Semantic Decoupling Augmentation | Jan 1, 2024 | Data Augmentation, Diversity | Unverified | 0 |
| Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection | Jan 1, 2024 | 2D Object Detection, Object | Unverified | 0 |
| Brush2Prompt: Contextual Prompt Generator for Object Inpainting | Jan 1, 2024 | Diversity, Object | Unverified | 0 |
| Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing | Jan 1, 2024 | Human-Object Interaction Detection, Object | Code Available | 1 |
| Projecting Trackable Thermal Patterns for Dynamic Computer Vision | Jan 1, 2024 | Object, Object Tracking | Unverified | 0 |
| Relational Matching for Weakly Semi-Supervised Oriented Object Detection | Jan 1, 2024 | Graph Matching, Object | Unverified | 0 |
| DAVE - A Detect-and-Verify Paradigm for Low-Shot Counting | Jan 1, 2024 | Object | Unverified | 0 |
| HOI-M^3: Capture Multiple Humans and Objects Interaction within Contextual Environment | Jan 1, 2024 | Human-Object Interaction Detection, Object | Unverified | 0 |
| Contextual Associated Triplet Queries for Panoptic Scene Graph Generation | Jan 1, 2024 | Graph Generation, Object | Unverified | 0 |
| DiffAugment: Diffusion based Long-Tailed Visual Relationship Recognition | Jan 1, 2024 | Object, Relation | Unverified | 0 |