| Title | Date | Tags | Code | Stars |
| --- | --- | --- | --- | --- |
| SkillNet-NLU: A Sparsely Activated Model for General-Purpose Natural Language Understanding | Mar 7, 2022 | Language Modelling, Masked Language Modeling | Unverified | 0 |
| Functional mixture-of-experts for classification | Feb 28, 2022 | Classification, Mixture-of-Experts | Unverified | 0 |
| Mixture-of-Experts with Expert Choice Routing | Feb 18, 2022 | Mixture-of-Experts | Unverified | 0 |
| A Survey on Dynamic Neural Networks for Natural Language Processing | Feb 15, 2022 | Dynamic Neural Networks, Mixture-of-Experts | Unverified | 0 |
| Physics-Guided Problem Decomposition for Scaling Deep Learning of High-dimensional Eigen-Solvers: The Case of Schrödinger's Equation | Feb 12, 2022 | Mixture-of-Experts, Problem Decomposition | Unverified | 0 |
| One Student Knows All Experts Know: From Sparse to Dense | Jan 26, 2022 | Knowledge Distillation | Unverified | 0 |
| MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Jan 16, 2022 | Knowledge Distillation, Mixture-of-Experts | Unverified | 0 |
| Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners | Jan 16, 2022 | Mixture-of-Experts, Multi-Task Learning | Unverified | 0 |
| DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale | Jan 14, 2022 | Decoder, Mixture-of-Experts | Code Available | 0 |
| Towards Lightweight Neural Animation: Exploration of Neural Network Pruning in Mixture of Experts-based Animation Models | Jan 11, 2022 | Mixture-of-Experts, Network Pruning | Unverified | 0 |