| Cross-Layer Feature Pyramid Transformer for Small Object Detection in Aerial Images | Jul 29, 2024 | Object Detection | Code Available | 2 |
| MonoWAD: Weather-Adaptive Diffusion Model for Robust Monocular 3D Object Detection | Jul 23, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| COALA: A Practical and Vision-Centric Federated Learning Platform | Jul 23, 2024 | Benchmarking, Continual Learning | Code Available | 2 |
| PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects | Jul 23, 2024 | Instance Segmentation, Object | Code Available | 2 |
| ESOD: Efficient Small Object Detection on High-Resolution Images | Jul 23, 2024 | GPU, Object | Code Available | 2 |
| GroupMamba: Efficient Group-Based Visual State Space Model | Jul 18, 2024 | Image Classification | Code Available | 2 |
| GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval | Jul 17, 2024 | Decoder, Image Enhancement | Code Available | 2 |
| Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Jul 16, 2024 | Human Instance Segmentation, Instance Segmentation | Code Available | 2 |
| LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Jul 16, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection | Jul 15, 2024 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset | Jul 14, 2024 | 3D Object Detection, Multispectral Object Detection | Code Available | 2 |
| Projecting Points to Axes: Oriented Object Detection via Point-Axis Representation | Jul 11, 2024 | Object Detection | Code Available | 2 |
| SCSA: Exploring the Synergistic Effects Between Spatial and Channel Attention | Jul 6, 2024 | Classification, Object Detection | Code Available | 2 |
| Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection | Jul 5, 2024 | Novel Object Detection, Object Detection | Code Available | 2 |
| SH17: A Dataset for Human Safety and Personal Protective Equipment Detection in Manufacturing Industry | Jul 5, 2024 | Benchmarking, Object Detection | Code Available | 2 |
| SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding | Jul 3, 2024 | Object Detection | Code Available | 2 |
| SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection | Jul 1, 2024 | Object, Object Detection | Code Available | 2 |
| The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Jun 26, 2024 | Action Localization, Moment Retrieval | Code Available | 2 |
| LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection | Jun 20, 2024 | Computational Efficiency, Object | Code Available | 2 |
| Scaling Efficient Masked Image Modeling on Large Remote Sensing Dataset | Jun 17, 2024 | Aerial Scene Classification, Diversity | Code Available | 2 |
| Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection | Jun 15, 2024 | 3D Object Detection, Computational Efficiency | Code Available | 2 |
| EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models | Jun 14, 2024 | 3D Object Detection, 3D Reconstruction | Code Available | 2 |
| BEVSpread: Spread Voxel Pooling for Bird's-Eye-View Representation in Vision-based Roadside 3D Object Detection | Jun 13, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| STAR: A First-Ever Dataset and A Large-Scale Benchmark for Scene Graph Generation in Large-Size Satellite Imagery | Jun 13, 2024 | Graph Generation, Object | Code Available | 2 |
| EFFOcc: A Minimal Baseline for EFficient Fusion-based 3D Occupancy Network | Jun 11, 2024 | 3D Object Detection, Active Learning | Code Available | 2 |