| RevColV2: Exploring Disentangled Representations in Masked Image Modeling | Sep 2, 2023 | Decoder, image-classification | Code Available | 2 |
| OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Sep 1, 2023 | 3D Open-Vocabulary Instance Segmentation, 3D Open-Vocabulary Object Detection | Code Available | 2 |
| Turning a CLIP Model into a Scene Text Spotter | Aug 21, 2023 | object-detection, Object Detection | Code Available | 2 |
| Dataset Quantization | Aug 21, 2023 | Dataset Distillation, object-detection | Code Available | 2 |
| DiffusionTrack: Diffusion Model For Multi-Object Tracking | Aug 19, 2023 | Denoising, model | Code Available | 2 |
| SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos | Aug 18, 2023 | 3D Object Detection, Object | Code Available | 2 |
| ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection | Aug 15, 2023 | Multispectral Object Detection, object-detection | Code Available | 2 |
| UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation | Aug 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection | Aug 10, 2023 | Object, object-detection | Code Available | 2 |
| FocalFormer3D : Focusing on Hard Instance for 3D Object Detection | Aug 8, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection | Jul 27, 2023 | 3D geometry, 3D Object Detection | Code Available | 2 |
| COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts | Jul 24, 2023 | Autonomous Driving, Object | Code Available | 2 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Jul 18, 2023 | Knowledge Distillation, object-detection | Code Available | 2 |
| Scale-Aware Modulation Meet Transformer | Jul 17, 2023 | object-detection, Object Detection | Code Available | 2 |
| Hierarchical Open-vocabulary Universal Image Segmentation | Jul 3, 2023 | Image Comprehension, Image Segmentation | Code Available | 2 |
| LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching | Jun 20, 2023 | Brain Tumor Classification, Contrastive Learning | Code Available | 2 |
| Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Jun 10, 2023 | 3D Object Detection, Benchmarking | Code Available | 2 |
| FasterViT: Fast Vision Transformers with Hierarchical Attention | Jun 9, 2023 | Image Classification, object-detection | Code Available | 2 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model | Jun 4, 2023 | 3D Object Detection, Image Segmentation | Code Available | 2 |
| Multi-modal Queried Object Detection in the Wild | May 30, 2023 | Few-Shot Object Detection, Object | Code Available | 2 |
| UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving | May 30, 2023 | 3D Object Detection, 3D Scene Reconstruction | Code Available | 2 |
| Contextual Object Detection with Multimodal Large Language Models | May 29, 2023 | Cloze Test, Decoder | Code Available | 2 |
| Efficient Multi-Scale Attention Module with Cross-Spatial Learning | May 23, 2023 | Dimensionality Reduction, image-classification | Code Available | 2 |
| DetGPT: Detect What You Need via Reasoning | May 23, 2023 | Autonomous Driving, Object | Code Available | 2 |