| Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model | Jun 2, 2024 | Denoising, Mixture-of-Experts | Code Available | 3 | 5 |
| Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields | May 4, 2025 | Mixture-of-Experts, NeRF | Code Available | 3 | 5 |
| Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Feb 10, 2024 | CPU, GPU | Code Available | 3 | 5 |
| Generalizing Motion Planners with Mixture of Experts for Autonomous Driving | Oct 21, 2024 | Autonomous Driving, Data Augmentation | Code Available | 3 | 5 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 | 5 |
| A Survey on Inference Optimization Techniques for Mixture of Experts Models | Dec 18, 2024 | Computational Efficiency, Distributed Computing | Code Available | 3 | 5 |
| MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts | Jan 8, 2024 | Mamba, Mixture-of-Experts | Code Available | 3 | 5 |
| A Survey on Mixture of Experts | Jun 26, 2024 | In-Context Learning, Mixture-of-Experts | Code Available | 3 | 5 |
| AnyGraph: Graph Foundation Model in the Wild | Aug 20, 2024 | Graph Learning, Mixture-of-Experts | Code Available | 3 | 5 |
| ST-MoE: Designing Stable and Transferable Sparse Expert Models | Feb 17, 2022 | ARC, Common Sense Reasoning | Code Available | 3 | 5 |
| SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling | Dec 23, 2023 | Instruction Following, Language Modeling | Code Available | 3 | 5 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning, Instruction Following | Code Available | 2 | 5 |
| ModuleFormer: Modularity Emerges from Mixture-of-Experts | Jun 7, 2023 | Language Modelling, Lightweight Deployment | Code Available | 2 | 5 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modelling, Large Language Model | Code Available | 2 | 5 |
| Mixture of Lookup Experts | Mar 20, 2025 | Mixture-of-Experts | Code Available | 2 | 5 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous Driving, Feature Engineering | Code Available | 2 | 5 |
| Mixture of A Million Experts | Jul 4, 2024 | Computational Efficiency, Language Modeling | Code Available | 2 | 5 |
| MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Apr 12, 2024 | Mixture-of-Experts | Code Available | 2 | 5 |
| LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration | Oct 20, 2024 | All, Computational Efficiency | Code Available | 2 | 5 |
| Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Feb 24, 2025 | image-classification, Image Classification | Code Available | 2 | 5 |
| LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training | Nov 24, 2024 | Math, Mixture-of-Experts | Code Available | 2 | 5 |
| MC-MoE: Mixture Compressor for Mixture-of-Experts LLMs Gains More | Oct 8, 2024 | Mixture-of-Experts, Quantization | Code Available | 2 | 5 |
| Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Mar 18, 2024 | Mixture-of-Experts, parameter-efficient fine-tuning | Code Available | 2 | 5 |
| CNMBERT: A Model for Converting Hanyu Pinyin Abbreviations to Chinese Characters | Nov 18, 2024 | fill-mask, Fill Mask | Code Available | 2 | 5 |
| LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes | Jan 7, 2025 | Mixture-of-Experts, Representation Learning | Code Available | 2 | 5 |