| A DeNoising FPN With Transformer R-CNN for Tiny Object Detection | Jun 9, 2024 | Contrastive LearningDenoising | CodeCode Available | 2 |
| Parameter-Inverted Image Pyramid Networks | Jun 6, 2024 | Computational Efficiencyimage-classification | CodeCode Available | 2 |
| FedPylot: Navigating Federated Learning for Real-Time Object Detection in Internet of Vehicles | Jun 5, 2024 | Autonomous DrivingAutonomous Vehicles | CodeCode Available | 2 |
| GrootVL: Tree Topology is All You Need in State Space Model | Jun 4, 2024 | Allimage-classification | CodeCode Available | 2 |
| GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | Jun 3, 2024 | 3D Object DetectionImage-to-Image Translation | CodeCode Available | 2 |
| Fully Test-Time Adaptation for Monocular 3D Object Detection | May 30, 2024 | 3D Object DetectionMonocular 3D Object Detection | CodeCode Available | 2 |
| REACT: Real-time Efficiency and Accuracy Compromise for Tradeoffs in Scene Graph Generation | May 25, 2024 | Graph GenerationObject | CodeCode Available | 2 |
| Drones Help Drones: A Collaborative Framework for Multi-Drone Object Trajectory Prediction and Beyond | May 23, 2024 | 3D Object Detectionobject-detection | CodeCode Available | 2 |
| DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment | May 20, 2024 | Contrastive LearningDomain Adaptation | CodeCode Available | 2 |
| SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization | May 19, 2024 | image-classificationImage Classification | CodeCode Available | 2 |
| SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection | May 16, 2024 | object-detectionObject Detection | CodeCode Available | 2 |
| SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network | May 16, 2024 | Binary ClassificationDecoder | CodeCode Available | 2 |
| DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data | May 16, 2024 | Data AugmentationDiversity | CodeCode Available | 2 |
| Grounded 3D-LLM with Referent Tokens | May 16, 2024 | Dense CaptioningDiversity | CodeCode Available | 2 |
| GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs | May 10, 2024 | graph constructionimage-classification | CodeCode Available | 2 |
| ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers | May 7, 2024 | 3D Object Detectionobject-detection | CodeCode Available | 2 |
| PTQ4SAM: Post-Training Quantization for Segment Anything | May 6, 2024 | Instance Segmentationobject-detection | CodeCode Available | 2 |
| Commonsense Prototype for Outdoor Unsupervised 3D Object Detection | Apr 25, 2024 | 3D Object DetectionObject | CodeCode Available | 2 |
| CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions | Apr 25, 2024 | MambaMultispectral Object Detection | CodeCode Available | 2 |
| ShadowRefiner: Towards Mask-free Shadow Removal via Fast Fourier Transformer | Apr 18, 2024 | Image Shadow Removalobject-detection | CodeCode Available | 2 |
| MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion | Apr 12, 2024 | Image ReconstructionMamba | CodeCode Available | 2 |
| SFSORT: Scene Features-based Simple Online Real-Time Tracker | Apr 11, 2024 | CPUMulti-Object Tracking | CodeCode Available | 2 |
| Scaling Multi-Camera 3D Object Detection through Weak-to-Strong Eliciting | Apr 10, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping | Apr 9, 2024 | Image RetrievalObject | CodeCode Available | 2 |
| YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images | Apr 9, 2024 | Objectobject-detection | CodeCode Available | 2 |
| MonoCD: Monocular 3D Object Detection with Complementary Depths | Apr 4, 2024 | 3D Object DetectionDepth Estimation | CodeCode Available | 2 |
| Is CLIP the main roadblock for fine-grained open-world perception? | Apr 4, 2024 | Autonomous DrivingNovel Concepts | CodeCode Available | 2 |
| DQ-DETR: DETR with Dynamic Query for Tiny Object Detection | Apr 4, 2024 | Objectobject-detection | CodeCode Available | 2 |
| HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras | Apr 3, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| DPFT: Dual Perspective Fusion Transformer for Camera-Radar-based Object Detection | Apr 3, 2024 | Autonomous Vehiclesobject-detection | CodeCode Available | 2 |
| Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss | Apr 2, 2024 | image-classificationImage Classification | CodeCode Available | 2 |
| Scene Adaptive Sparse Transformer for Event-based Object Detection | Apr 2, 2024 | Objectobject-detection | CodeCode Available | 2 |
| EGTR: Extracting Graph from Transformer for Scene Graph Generation | Apr 2, 2024 | Graph GenerationMulti-Task Learning | CodeCode Available | 2 |
| NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields | Apr 1, 2024 | 3D Object DetectionNeRF | CodeCode Available | 2 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Mar 28, 2024 | Fine-Grained Image ClassificationImage Classification | CodeCode Available | 2 |
| OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation | Mar 28, 2024 | 3D Object DetectionNovel Class Discovery | CodeCode Available | 2 |
| Is Your LiDAR Placement Optimized for 3D Scene Understanding? | Mar 25, 2024 | 3D Object DetectionLIDAR Semantic Segmentation | CodeCode Available | 2 |
| RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Mar 20, 2024 | Contrastive LearningFine-Grained Visual Recognition | CodeCode Available | 2 |
| Continual Forgetting for Pre-trained Vision Models | Mar 18, 2024 | Continual ForgettingFace Recognition | CodeCode Available | 2 |
| CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | Mar 17, 2024 | Objectobject-detection | CodeCode Available | 2 |
| HCF-Net: Hierarchical Context Fusion Network for Infrared Small Object Detection | Mar 16, 2024 | channel selectionobject-detection | CodeCode Available | 2 |
| Generative Region-Language Pretraining for Open-Ended Object Detection | Mar 15, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection | Mar 14, 2024 | Knowledge DistillationNovel Object Detection | CodeCode Available | 2 |
| E2E-MFD: Towards End-to-End Synchronous Multimodal Fusion Detection | Mar 14, 2024 | Autonomous DrivingObject | CodeCode Available | 2 |
| MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | Mar 13, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| LISO: Lidar-only Self-Supervised 3D Object Detection | Mar 11, 2024 | 3D Object DetectionObject | CodeCode Available | 2 |
| V_kD: Improving Knowledge Distillation using Orthogonal Projections | Mar 10, 2024 | Image GenerationKnowledge Distillation | CodeCode Available | 2 |
| Poly Kernel Inception Network for Remote Sensing Detection | Mar 10, 2024 | Objectobject-detection | CodeCode Available | 2 |
| SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection | Mar 9, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 2 |
| Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Mar 8, 2024 | object-detectionObject Detection | CodeCode Available | 2 |