| Sparse Mixture-of-Experts are Domain Generalizable Learners | Jun 8, 2022 | Domain GeneralizationMixture-of-Experts | CodeCode Available | 1 |
| Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers | Mar 2, 2023 | Mixture-of-Experts | CodeCode Available | 1 |
| RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling | May 14, 2021 | Dialogue GenerationLanguage Modeling | CodeCode Available | 1 |
| Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Apr 27, 2024 | Image Quality AssessmentMixture-of-Experts | CodeCode Available | 1 |
| Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE | Feb 10, 2025 | DiversityLanguage Modeling | CodeCode Available | 1 |
| Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Oct 18, 2023 | Blind Super-ResolutionDecoder | CodeCode Available | 1 |
| Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Sep 9, 2021 | Mixture-of-ExpertsRetrieval | CodeCode Available | 1 |
| JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | May 22, 2025 | GPULong-range modeling | CodeCode Available | 1 |
| Layerwise Recurrent Router for Mixture-of-Experts | Aug 13, 2024 | AttributeMixture-of-Experts | CodeCode Available | 1 |
| HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Oct 8, 2021 | Abstractive Text SummarizationDecoder | CodeCode Available | 1 |
| HyperFormer: Enhancing Entity and Relation Interaction for Hyper-Relational Knowledge Graph Completion | Aug 12, 2023 | AttributeKnowledge Graph Completion | CodeCode Available | 1 |
| Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Jan 24, 2025 | Contrastive LearningMixture-of-Experts | CodeCode Available | 1 |
| HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts | Feb 20, 2024 | Mixture-of-ExpertsMulti-Task Learning | CodeCode Available | 1 |
| Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Feb 12, 2025 | Image Super-ResolutionMixture-of-Experts | CodeCode Available | 1 |
| Heterogeneous Multi-task Learning with Expert Diversity | Jun 20, 2021 | DiversityMixture-of-Experts | CodeCode Available | 1 |
| GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Dec 7, 2023 | DiversityGraph Neural Network | CodeCode Available | 1 |
| DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Mar 2, 2024 | Language ModellingLarge Language Model | CodeCode Available | 1 |
| Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Apr 21, 2025 | Exemplar-FreeKnowledge Distillation | CodeCode Available | 1 |
| Graph Sparsification via Mixture of Graphs | May 23, 2024 | Graph LearningMixture-of-Experts | CodeCode Available | 1 |
| HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts | Dec 12, 2023 | Mixture-of-Experts | CodeCode Available | 1 |
| FreqMoE: Enhancing Time Series Forecasting through Frequency Decomposition Mixture of Experts | Jan 25, 2025 | Mixture-of-ExpertsPrediction | CodeCode Available | 1 |
| Distilling the Knowledge in a Neural Network | Mar 9, 2015 | Knowledge DistillationMixture-of-Experts | CodeCode Available | 1 |
| Go Wider Instead of Deeper | Jul 25, 2021 | Image ClassificationMixture-of-Experts | CodeCode Available | 1 |
| Gradient-free variational learning with conditional mixture networks | Aug 29, 2024 | Computational EfficiencyMixture-of-Experts | CodeCode Available | 1 |
| Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Jan 4, 2024 | Mixture-of-Experts | CodeCode Available | 1 |