| Efficient Object Detection in Autonomous Driving using Spiking Neural Networks: Performance, Energy Consumption Analysis, and Insights into Open-set Object Discovery | Dec 12, 2023 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 |
| MedYOLO: A Medical Image Object Detection Framework | Dec 12, 2023 | Computed Tomography (CT), Object | Code Available | 1 |
| GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos | Dec 12, 2023 | Object | Code Available | 1 |
| InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models | Dec 10, 2023 | Human-Object Interaction Generation, Object | Code Available | 1 |
| Correcting Diffusion Generation through Resampling | Dec 10, 2023 | Image Generation, Object | Code Available | 1 |
| Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency | Dec 8, 2023 | Decoder, Hallucination | Code Available | 1 |
| 3D Copy-Paste: Physically Plausible Object Insertion for Monocular 3D Detection | Dec 8, 2023 | 3D Object Detection, Data Augmentation | Code Available | 1 |
| Benchmarking and Analysis of Unsupervised Object Segmentation from Real-world Single Images | Dec 8, 2023 | Benchmarking, Object | Code Available | 1 |
| Mitigating Open-Vocabulary Caption Hallucinations | Dec 6, 2023 | Diversity, Hallucination | Code Available | 1 |
| High Pileup Particle Tracking with Object Condensation | Dec 6, 2023 | Edge Classification, Object | Code Available | 1 |
| TokenCompose: Text-to-Image Diffusion with Token-level Supervision | Dec 6, 2023 | Denoising, Image Generation | Code Available | 1 |
| Boosting Segment Anything Model Towards Open-Vocabulary Learning | Dec 6, 2023 | model, Object | Code Available | 1 |
| DreamComposer: Controllable 3D Object Generation via Multi-View Conditions | Dec 6, 2023 | 3D Object Reconstruction, Novel View Synthesis | Code Available | 1 |
| Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection | Dec 5, 2023 | 3D Object Detection, Denoising | Code Available | 1 |
| Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites | Dec 4, 2023 | Hallucination, Hallucination Evaluation | Code Available | 1 |
| BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection | Dec 4, 2023 | 3D Object Detection, Decoder | Code Available | 1 |
| Object Recognition as Next Token Prediction | Dec 4, 2023 | Decoder, Language Modeling | Code Available | 1 |
| Toward Improving Robustness of Object Detectors Against Domain Shift | Dec 2, 2023 | Data Augmentation, Diversity | Code Available | 1 |
| SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers | Dec 1, 2023 | Decoder, Object | Code Available | 1 |
| Is Underwater Image Enhancement All Object Detectors Need? | Nov 30, 2023 | All, Image Enhancement | Code Available | 1 |
| A Simple Video Segmenter by Tracking Objects Along Axial Trajectories | Nov 30, 2023 | GPU, Object | Code Available | 1 |
| Lasagna: Layered Score Distillation for Disentangled Object Relighting | Nov 30, 2023 | Colorization, Object | Code Available | 1 |
| The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding | Nov 29, 2023 | Object, object-detection | Code Available | 1 |
| RQFormer: Rotated Query Transformer for End-to-End Oriented Object Detection | Nov 29, 2023 | Decoder, Object | Code Available | 1 |
| Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation | Nov 29, 2023 | Clustering, Object | Code Available | 1 |
| UGG: Unified Generative Grasping | Nov 28, 2023 | Grasp Generation, Object | Code Available | 1 |
| Segment Every Out-of-Distribution Object | Nov 27, 2023 | Object, Segmentation | Code Available | 1 |
| Single-Model and Any-Modality for Video Object Tracking | Nov 27, 2023 | Object, Object Tracking | Code Available | 1 |
| Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding | Nov 26, 2023 | 3D visual grounding, Object | Code Available | 1 |
| PointOBB: Learning Oriented Object Detection via Single Point Supervision | Nov 23, 2023 | Object, object-detection | Code Available | 1 |
| Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision | Nov 23, 2023 | Object, object-detection | Code Available | 1 |
| Physical Reasoning and Object Planning for Household Embodied Agents | Nov 22, 2023 | 2k, Decision Making | Code Available | 1 |
| Point, Segment and Count: A Generalized Framework for Object Counting | Nov 21, 2023 | Knowledge Distillation, Object | Code Available | 1 |
| Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning | Nov 20, 2023 | Object, object-detection | Code Available | 1 |
| Enhancing Novel Object Detection via Cooperative Foundational Models | Nov 19, 2023 | Novel Class Discovery, Novel Object Detection | Code Available | 1 |
| Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention | Nov 18, 2023 | Concept Alignment, Graph Generation | Code Available | 1 |
| SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation | Nov 18, 2023 | Object, Pose Estimation | Code Available | 1 |
| ShapeMatcher: Self-Supervised Joint Shape Canonicalization, Segmentation, Retrieval and Deformation | Nov 18, 2023 | Object, Retrieval | Code Available | 1 |
| Closely-Spaced Object Classification Using MuyGPyS | Nov 17, 2023 | Classification, Object | Code Available | 1 |
| Neural-Logic Human-Object Interaction Detection | Nov 16, 2023 | Decoder, Human-Object Interaction Detection | Code Available | 1 |
| Identifying Linear Relational Concepts in Large Language Models | Nov 15, 2023 | Object | Code Available | 1 |
| AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation | Nov 13, 2023 | Attribute, Hallucination | Code Available | 1 |
| Which One? Leveraging Context Between Objects and Multiple Views for Language Grounding | Nov 12, 2023 | Object, Position | Code Available | 1 |
| Language-guided Robot Grasping: CLIP-based Referring Grasp Synthesis in Clutter | Nov 9, 2023 | Object, Visual Grounding | Code Available | 1 |
| Kinematic-aware Prompting for Generalizable Articulated Object Manipulation with LLMs | Nov 6, 2023 | Imitation Learning, In-Context Learning | Code Available | 1 |
| Rotation Invariant Transformer for Recognizing Object in UAVs | Nov 5, 2023 | Object, Person Re-Identification | Code Available | 1 |
| Proposal-Level Unsupervised Domain Adaptation for Open World Unbiased Detector | Nov 4, 2023 | Domain Adaptation, Incremental Learning | Code Available | 1 |
| VQPy: An Object-Oriented Approach to Modern Video Analytics | Nov 3, 2023 | Object | Code Available | 1 |
| Patch-based Selection and Refinement for Early Object Detection | Nov 3, 2023 | Object, object-detection | Code Available | 1 |
| Re-Scoring Using Image-Language Similarity for Few-Shot Object Detection | Nov 1, 2023 | Classification, Few-Shot Object Detection | Code Available | 1 |