| Norface: Improving Facial Expression Analysis by Identity Normalization | Jul 22, 2024 | Classification, Emotion Recognition | Code Available | 1 |
| Swin SMT: Global Sequential Modeling in 3D Medical Image Segmentation | Jul 10, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 1 |
| Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Jul 1, 2024 | GPU, Mixture-of-Experts | Code Available | 1 |
| Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model | Jun 28, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Jun 19, 2024 | ARC, Mixture-of-Experts | Code Available | 1 |
| Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts | Jun 17, 2024 | Mixture-of-Experts | Code Available | 1 |
| MoE-RBench: Towards Building Reliable Language Models with Sparse Mixture-of-Experts | Jun 17, 2024 | Hallucination, Mixture-of-Experts | Code Available | 1 |
| Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion | Jun 14, 2024 | Mixture-of-Experts, Multi-Task Learning | Code Available | 1 |
| DeepUnifiedMom: Unified Time-series Momentum Portfolio Construction via Multi-Task Learning with Multi-Gate Mixture of Experts | Jun 13, 2024 | Management, Mixture-of-Experts | Code Available | 1 |
| Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Jun 12, 2024 | Benchmarking, Mixture-of-Experts | Code Available | 1 |
| MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Jun 7, 2024 | CPU, GPU | Code Available | 1 |
| Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | May 27, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 |
| Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | May 23, 2024 | Computational Efficiency, GSM8K | Code Available | 1 |
| Graph Sparsification via Mixture of Graphs | May 23, 2024 | Graph Learning, Mixture-of-Experts | Code Available | 1 |
| Mixture of Experts Meets Prompt-Based Continual Learning | May 23, 2024 | Continual Learning, Mixture-of-Experts | Code Available | 1 |
| DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | May 22, 2024 | Diversity, Mixture-of-Experts | Code Available | 1 |
| MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | May 19, 2024 | Mixture-of-Experts, parameter-efficient fine-tuning | Code Available | 1 |
| M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | May 15, 2024 | Image Segmentation, Mixture-of-Experts | Code Available | 1 |
| EWMoE: An effective model for global weather forecasting with mixture-of-experts | May 9, 2024 | Mixture-of-Experts, Weather Forecasting | Code Available | 1 |
| Revisiting RGBT Tracking Benchmarks from the Perspective of Modality Validity: A New Benchmark, Problem, and Method | Apr 30, 2024 | Mixture-of-Experts, Rgb-T Tracking | Code Available | 1 |
| M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework | Apr 29, 2024 | AutoML, Mixture-of-Experts | Code Available | 1 |
| Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing | Apr 29, 2024 | Image Super-Resolution, Mixture-of-Experts | Code Available | 1 |
| Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Apr 27, 2024 | Image Quality Assessment, Mixture-of-Experts | Code Available | 1 |
| XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | Apr 23, 2024 | HumanEval, mbpp | Code Available | 1 |
| Multi-Head Mixture-of-Experts | Apr 23, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Prompt-prompted Adaptive Structured Pruning for Efficient LLM Generation | Apr 1, 2024 | Mixture-of-Experts | Code Available | 1 |
| LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Apr 1, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| Unleashing the Power of Meta-tuning for Few-shot Generalization Through Sparse Interpolated Experts | Mar 13, 2024 | Domain Generalization, Few-Shot Image Classification | Code Available | 1 |
| Unity by Diversity: Improved Representation Learning in Multimodal VAEs | Mar 8, 2024 | Decoder, Diversity | Code Available | 1 |
| DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Mar 2, 2024 | Language Modelling, Large Language Model | Code Available | 1 |
| Sequence-level Semantic Representation Fusion for Recommender Systems | Feb 28, 2024 | Mixture-of-Experts, Recommendation Systems | Code Available | 1 |
| XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Feb 27, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| LLMBind: A Unified Modality-Task Integration Framework | Feb 22, 2024 | AI Agent, Audio Generation | Code Available | 1 |
| HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts | Feb 20, 2024 | Mixture-of-Experts, Multi-Task Learning | Code Available | 1 |
| Scaling physics-informed hard constraints with mixture-of-experts | Feb 20, 2024 | Inductive Bias, Mixture-of-Experts | Code Available | 1 |
| BiMediX: Bilingual Medical Mixture of Experts LLM | Feb 20, 2024 | Mixture-of-Experts, Multiple-choice | Code Available | 1 |
| Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization | Feb 19, 2024 | Attribute, counterfactual | Code Available | 1 |
| Multimodal Clinical Trial Outcome Prediction with Large Language Models | Feb 9, 2024 | Mixture-of-Experts, Prediction | Code Available | 1 |
| Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Feb 1, 2024 | Mixture-of-Experts, Task Arithmetic | Code Available | 1 |
| Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Feb 1, 2024 | Mixture-of-Experts, parameter-efficient fine-tuning | Code Available | 1 |
| Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings | Jan 28, 2024 | Contrastive Learning, Descriptive | Code Available | 1 |
| Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Jan 16, 2024 | GPU, Mixture-of-Experts | Code Available | 1 |
| Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Jan 4, 2024 | Mixture-of-Experts | Code Available | 1 |
| FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Dec 22, 2023 | Mixture-of-Experts, Motion Generation | Code Available | 1 |
| When Parameter-efficient Tuning Meets General-purpose Vision-language Models | Dec 16, 2023 | Mixture-of-Experts | Code Available | 1 |
| SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention | Dec 13, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Parameter Efficient Adaptation for Image Restoration with Heterogeneous Mixture-of-Experts | Dec 12, 2023 | Denoising, Diversity | Code Available | 1 |
| HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts | Dec 12, 2023 | Mixture-of-Experts | Code Available | 1 |
| Mixture-of-Linear-Experts for Long-term Time Series Forecasting | Dec 11, 2023 | Mixture-of-Experts, Time Series | Code Available | 1 |
| GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Dec 7, 2023 | Diversity, Graph Neural Network | Code Available | 1 |