| Title | Date | Topics | Code | # |
| --- | --- | --- | --- | --- |
| Integrating Multi-view Analysis: Multi-view Mixture-of-Expert for Textual Personality Detection | Aug 16, 2024 | Mixture-of-Experts | Code Available | 0 |
| FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models | Aug 15, 2024 | Mixture-of-Experts | Code Available | 0 |
| BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts | Aug 15, 2024 | Mixture-of-Experts | Unverified | 0 |
| A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning | Aug 13, 2024 | Mixture-of-Experts, Survey | Unverified | 0 |
| AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Aug 13, 2024 | Language Modelling, Mixture-of-Experts | Code Available | 1 |
| Layerwise Recurrent Router for Mixture-of-Experts | Aug 13, 2024 | Attribute, Mixture-of-Experts | Code Available | 1 |
| HoME: Hierarchy of Multi-Gate Experts for Multi-Task Learning at Kuaishou | Aug 10, 2024 | Mixture-of-Experts, Multi-Task Learning | Unverified | 0 |
| Understanding the Performance and Estimating the Cost of LLM Fine-Tuning | Aug 8, 2024 | GPU, Mixture-of-Experts | Code Available | 0 |
| LaDiMo: Layer-wise Distillation Inspired MoEfier | Aug 8, 2024 | Knowledge Distillation, Mixture-of-Experts | Unverified | 0 |
| MoC-System: Efficient Fault Tolerance for Sparse Mixture-of-Experts Model Training | Aug 8, 2024 | Mixture-of-Experts | Unverified | 0 |