| RevColV2: Exploring Disentangled Representations in Masked Image Modeling | Sep 2, 2023 | Decoder, image-classification | Code Available | 2 |
| OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Sep 1, 2023 | 3D Open-Vocabulary Instance Segmentation, 3D Open-Vocabulary Object Detection | Code Available | 2 |
| Dataset Quantization | Aug 21, 2023 | Dataset Distillation, object-detection | Code Available | 2 |
| Turning a CLIP Model into a Scene Text Spotter | Aug 21, 2023 | object-detection, Object Detection | Code Available | 2 |
| DiffusionTrack: Diffusion Model For Multi-Object Tracking | Aug 19, 2023 | Denoising, model | Code Available | 2 |
| SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos | Aug 18, 2023 | 3D Object Detection, Object | Code Available | 2 |
| ICAFusion: Iterative Cross-Attention Guided Feature Fusion for Multispectral Object Detection | Aug 15, 2023 | Multispectral Object Detection, object-detection | Code Available | 2 |
| UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation | Aug 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection | Aug 10, 2023 | Object, object-detection | Code Available | 2 |
| FocalFormer3D : Focusing on Hard Instance for 3D Object Detection | Aug 8, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection | Jul 27, 2023 | 3D geometry, 3D Object Detection | Code Available | 2 |
| COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts | Jul 24, 2023 | Autonomous Driving, Object | Code Available | 2 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Jul 18, 2023 | Knowledge Distillation, object-detection | Code Available | 2 |
| Scale-Aware Modulation Meet Transformer | Jul 17, 2023 | object-detection, Object Detection | Code Available | 2 |
| Hierarchical Open-vocabulary Universal Image Segmentation | Jul 3, 2023 | Image Comprehension, Image Segmentation | Code Available | 2 |
| LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching | Jun 20, 2023 | Brain Tumor Classification, Contrastive Learning | Code Available | 2 |
| Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Jun 10, 2023 | 3D Object Detection, Benchmarking | Code Available | 2 |
| FasterViT: Fast Vision Transformers with Hierarchical Attention | Jun 9, 2023 | Image Classification, object-detection | Code Available | 2 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model | Jun 4, 2023 | 3D Object Detection, Image Segmentation | Code Available | 2 |
| Multi-modal Queried Object Detection in the Wild | May 30, 2023 | Few-Shot Object Detection, Object | Code Available | 2 |
| UniScene: Multi-Camera Unified Pre-training via 3D Scene Reconstruction for Autonomous Driving | May 30, 2023 | 3D Object Detection, 3D Scene Reconstruction | Code Available | 2 |
| Contextual Object Detection with Multimodal Large Language Models | May 29, 2023 | Cloze Test, Decoder | Code Available | 2 |
| Efficient Multi-Scale Attention Module with Cross-Spatial Learning | May 23, 2023 | Dimensionality Reduction, image-classification | Code Available | 2 |
| DetGPT: Detect What You Need via Reasoning | May 23, 2023 | Autonomous Driving, Object | Code Available | 2 |
| Going Denser with Open-Vocabulary Part Segmentation | May 18, 2023 | Object, object-detection | Code Available | 2 |
| PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds | May 8, 2023 | 2D Object Detection, 3D Object Detection | Code Available | 2 |
| OctFormer: Octree-based Transformers for 3D Point Clouds | May 4, 2023 | 3D Object Detection, 3D Semantic Segmentation | Code Available | 2 |
| SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model | May 3, 2023 | Instance Segmentation, Object | Code Available | 2 |
| SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection | Apr 27, 2023 | 3D Object Detection, object-detection | Code Available | 2 |
| A Strong and Reproducible Object Detector with Only Public Datasets | Apr 25, 2023 | object-detection, Object Detection | Code Available | 2 |
| Radar-Camera Fusion for Object Detection and Semantic Segmentation in Autonomous Driving: A Comprehensive Review | Apr 20, 2023 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 |
| Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition | Apr 10, 2023 | image-classification, Image Classification | Code Available | 2 |
| EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection | Mar 31, 2023 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| Vision Transformer with Quadrangle Attention | Mar 27, 2023 | object-detection, Object Detection | Code Available | 2 |
| Spherical Transformer for LiDAR-based 3D Recognition | Mar 22, 2023 | 3D Object Detection, 3D Semantic Segmentation | Code Available | 2 |
| Dense Distinct Query for End-to-End Object Detection | Mar 22, 2023 | Object, object-detection | Code Available | 2 |
| Detecting Everything in the Open World: Towards Universal Object Detection | Mar 21, 2023 | object-detection, Object Detection | Code Available | 2 |
| Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection | Mar 21, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking | Mar 20, 2023 | 3D Object Detection, Object | Code Available | 2 |
| A Simple Framework for 3D Occupancy Estimation in Autonomous Driving | Mar 17, 2023 | 3D Object Detection, 3D Reconstruction | Code Available | 2 |
| Large Selective Kernel Network for Remote Sensing Object Detection | Mar 16, 2023 | Object, object-detection | Code Available | 2 |
| BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| BiFormer: Vision Transformer with Bi-Level Routing Attention | Mar 15, 2023 | Computational Efficiency, GPU | Code Available | 2 |
| DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception | Mar 14, 2023 | 3D Object Detection, 3D Object Tracking | Code Available | 2 |
| Lite DETR : An Interleaved Multi-Scale Encoder for Efficient DETR | Mar 13, 2023 | object-detection, Object Detection | Code Available | 2 |
| CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention | Mar 13, 2023 | image-classification, Image Classification | Code Available | 2 |
| Virtual Sparse Convolution for Multimodal 3D Object Detection | Mar 4, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| Pillar R-CNN for Point Cloud 3D Object Detection | Feb 26, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |