Vision Transformers: From Semantic Segmentation to Dense Prediction | Jul 19, 2022 | Image Classification
XCiT: Cross-Covariance Image Transformers | Jun 17, 2021 | Image Classification
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Jan 9, 2023 | 2D Object Detection; Contrastive Learning
Detecting Twenty-thousand Classes using Image-level Supervision | Jan 7, 2022 | Cross-Domain Few-Shot Object Detection; Image Classification
Demystify Mamba in Vision: A Linear Attention Perspective | May 26, 2024 | Image Classification
VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks | Mar 1, 2024 | Image Classification; Image Generation
Vision-Language Pre-training: Basics, Recent Advances, and Future Trends | Oct 17, 2022 | Few-Shot Learning; Image Captioning
xLSTM-UNet can be an Effective 2D & 3D Medical Image Segmentation Backbone with Vision-LSTM (ViL) better than its Mamba Counterpart | Jul 1, 2024 | 3D Medical Imaging Segmentation; Image Classification
Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket | Jan 4, 2024 | Image Classification
EfficientNetV2: Smaller Models and Faster Training | Apr 1, 2021 | AutoML; Classification
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery | Dec 15, 2023 | Contrastive Learning; Earth Observation
TCFormer: Visual Recognition via Token Clustering Transformer | Jul 16, 2024 | Clustering; Image Classification
ResNeSt: Split-Attention Networks | Apr 19, 2020 | Image Classification
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Mar 26, 2024 | Image Classification; Instance Segmentation
Datasets: A Community Library for Natural Language Processing | Sep 7, 2021 | Image Classification; Object Recognition
Transformers in Medical Imaging: A Survey | Jan 24, 2022 | Image Classification; Image Segmentation
ADOPT: Modified Adam Can Converge with Any β_2 with the Optimal Rate | Nov 5, 2024 | Deep Reinforcement Learning; Image Classification
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities | May 18, 2023 | 1 Image, 2*2 Stitchi; Action Classification
Patches Are All You Need? | Jan 24, 2022 | All; Image Classification
Cascade Prompt Learning for Vision-Language Model Adaptation | Sep 26, 2024 | General Knowledge; Image Classification
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Dec 28, 2023 | AutoML; CPU
Momentum Contrast for Unsupervised Visual Representation Learning | Nov 13, 2019 | Contrastive Learning; Image Classification
MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene Classification; Building change detection for remote sensing images
U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection | May 18, 2020 | Dichotomous Image Segmentation; GPU
MetaFormer Baselines for Vision | Oct 24, 2022 | Domain Generalization; Image Classification
MaxViT: Multi-Axis Vision Transformer | Apr 4, 2022 | Image Classification
MiniViT: Compressing Vision Transformers with Weight Multiplexing | Apr 14, 2022 | Diversity; Image Classification
Ludwig: a type-based declarative deep learning toolbox | Sep 17, 2019 | Decoder; Deep Learning
Bag of Freebies for Training Object Detection Neural Networks | Feb 11, 2019 | General Classification; Image Classification
MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs | Nov 22, 2024 | Image Classification
5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Aug 15, 2024 | Image Classification
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey | Feb 8, 2024 | Articles; Entity Alignment
FusionBench: A Comprehensive Benchmark of Deep Model Fusion | Jun 5, 2024 | Image Classification
QOC: Quantum On-Chip Training with Parameter Shift and Gradient Pruning | Feb 26, 2022 | Image Classification
Falcon: A Remote Sensing Vision-Language Foundation Model | Mar 14, 2025 | Image Captioning; Image Classification
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization | Mar 24, 2023 | 3D Hand Pose Estimation; GPU
AutoAugment: Learning Augmentation Policies from Data | May 24, 2018 | Data Augmentation; Domain Generalization
MobileNetV4 -- Universal Models for the Mobile Ecosystem | Apr 16, 2024 | Image Classification; Neural Architecture Search
UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition | Nov 27, 2023 | Image Classification; Object Detection
RSMamba: Remote Sensing Image Classification with State Space Model | Mar 28, 2024 | Classification; Image Classification
Separable Self-attention for Mobile Vision Transformers | Jun 6, 2022 | Image Classification; Object Detection
EfficientViM: Efficient Vision Mamba with Hidden State Mixer based State Space Duality | Nov 22, 2024 | Efficient Neural Network; Image Classification
UNetFormer: A UNet-like Transformer for Efficient Semantic Segmentation of Remote Sensing Urban Scene Imagery | Sep 18, 2021 | Change Detection; Decoder
MogaNet: Multi-order Gated Aggregation Network | Nov 7, 2022 | 3D Human Pose Estimation; Image Classification
Effective Data Augmentation With Diffusion Models | Feb 7, 2023 | Data Augmentation; Diversity
Agent Attention: On the Integration of Softmax and Linear Attention | Dec 14, 2023 | Computational Efficiency; Image Classification
Efficient Multi-Scale Attention Module with Cross-Spatial Learning | May 23, 2023 | Dimensionality Reduction; Image Classification
EMR-Merging: Tuning-Free High-Performance Model Merging | May 23, 2024 | Image Classification; Image Retrieval
Dilated Neighborhood Attention Transformer | Sep 29, 2022 | Image Classification; Instance Segmentation
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks | Oct 8, 2019 | Dimensionality Reduction; Image Classification