| GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving | Nov 19, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | Jun 3, 2024 | 3D Object DetectionImage-to-Image Translation | CodeCode Available | 2 |
| ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer | Mar 8, 2022 | Image Classificationobject-detection | CodeCode Available | 2 |
| DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets | Jan 15, 2023 | 3D Object Detectionobject-detection | CodeCode Available | 2 |
| Generative Region-Language Pretraining for Open-Ended Object Detection | Mar 15, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond | May 23, 2024 | 3D Object Detectionobject-detection | CodeCode Available | 2 |
| Dynamic Graph Induced Contour-aware Heat Conduction Network for Event-based Object Detection | May 19, 2025 | Event-based visionObject | CodeCode Available | 2 |
| A Simple Framework for 3D Occupancy Estimation in Autonomous Driving | Mar 17, 2023 | 3D Object Detection3D Reconstruction | CodeCode Available | 2 |
| Global Context Networks | Dec 24, 2020 | Instance SegmentationObject Detection | CodeCode Available | 2 |
| GOReloc: Graph-based Object-Level Relocalization for Visual SLAM | Aug 15, 2024 | Objectobject-detection | CodeCode Available | 2 |
| GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs | May 10, 2024 | graph constructionimage-classification | CodeCode Available | 2 |
| A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Jan 18, 2024 | Instance SegmentationInteractive Segmentation | CodeCode Available | 2 |
| GrootVL: Tree Topology is All You Need in State Space Model | Jun 4, 2024 | Allimage-classification | CodeCode Available | 2 |
| EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications | Jun 21, 2022 | Image ClassificationObject Detection | CodeCode Available | 2 |
| Evaluating Large-Vocabulary Object Detectors: The Devil is in the Details | Feb 1, 2021 | Benchmarkingobject-detection | CodeCode Available | 2 |
| GroupViT: Semantic Segmentation Emerges from Text Supervision | Feb 22, 2022 | Object DetectionScene Understanding | CodeCode Available | 2 |
| HASSOD: Hierarchical Adaptive Self-Supervised Object Detection | Feb 5, 2024 | Objectobject-detection | CodeCode Available | 2 |
| A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Jul 18, 2023 | Knowledge Distillationobject-detection | CodeCode Available | 2 |
| HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras | Apr 3, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| Hierarchical Open-vocabulary Universal Image Segmentation | Jul 3, 2023 | Image ComprehensionImage Segmentation | CodeCode Available | 2 |
| Fully Sparse 3D Object Detection | Jul 20, 2022 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| Hulk: A Universal Knowledge Translator for Human-Centric Tasks | Dec 4, 2023 | 3D Human Pose EstimationAction Recognition | CodeCode Available | 2 |
| Improving CLIP Fine-tuning Performance | Jan 1, 2023 | Diagnosticobject-detection | CodeCode Available | 2 |
| Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression | Nov 19, 2019 | object-detectionObject Detection | CodeCode Available | 2 |
| Dilated Neighborhood Attention Transformer | Sep 29, 2022 | Image ClassificationInstance Segmentation | CodeCode Available | 2 |