| Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model | Dec 19, 2022 | GPU, Machine Translation | Unverified | 0 |
| Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation | Dec 15, 2022 | Machine Translation, Mixture-of-Experts | Unverified | 0 |
| Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners | Dec 15, 2022 | Mixture-of-Experts, Multi-Task Learning | Unverified | 0 |
| SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing | Dec 10, 2022 | Mixture-of-Experts | Unverified | 0 |
| Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | Dec 9, 2022 | Mixture-of-Experts | Code Available | 2 |
| Incorporating Polar Field Data for Improved Solar Flare Prediction | Dec 4, 2022 | Mixture-of-Experts, Prediction | Unverified | 0 |
| Named Entity and Relation Extraction with Multi-Modal Retrieval | Dec 3, 2022 | Mixture-of-Experts, Multi-modal Named Entity Recognition | Unverified | 0 |
| MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Nov 29, 2022 | GPU, Mixture-of-Experts | Code Available | 3 |
| Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling | Nov 28, 2022 | Mixture-of-Experts | Unverified | 0 |
| Mixture of Decision Trees for Interpretable Machine Learning | Nov 26, 2022 | Interpretable Machine Learning, Mixture-of-Experts | Code Available | 1 |