| Title | Date | Tags | Code |
| --- | --- | --- | --- |
| Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | May 30, 2025 | Mixture-of-Experts | Code Available |
| MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Jun 7, 2024 | CPU, GPU | Code Available |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available |
| EWMoE: An effective model for global weather forecasting with mixture-of-experts | May 9, 2024 | Mixture-of-Experts, Weather Forecasting | Code Available |
| MedCoT: Medical Chain of Thought via Hierarchical Expert | Dec 18, 2024 | Diagnostic, Medical Visual Question Answering | Code Available |
| FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Dec 22, 2023 | Mixture-of-Experts, Motion Generation | Code Available |
| CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference | Feb 6, 2025 | Mixture-of-Experts | Code Available |
| Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Oct 15, 2023 | Computational Efficiency, Mixture-of-Experts | Code Available |
| MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Apr 3, 2025 | Mixture-of-Experts, Quantization | Code Available |
| Emergent Modularity in Pre-trained Transformers | May 28, 2023 | Mixture-of-Experts | Code Available |