| StableMoE: Stable Routing Strategy for Mixture of Experts | Apr 18, 2022 | Language Modeling | Code Available | 1 |
| MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Apr 15, 2022 | Knowledge Distillation, Mixture-of-Experts | Code Available | 1 |
| 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition | Apr 7, 2022 | Mixture-of-Experts, Speech Recognition | Code Available | 1 |
| Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution | Mar 27, 2022 | Image Super-Resolution, Mixture-of-Experts | Code Available | 1 |
| SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization | Mar 13, 2022 | Abstractive Text Summarization, Document Summarization | Code Available | 1 |
| Parameter-Efficient Mixture-of-Experts Architecture for Pre-trained Language Models | Mar 2, 2022 | Language Modeling | Code Available | 1 |
| EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Dec 29, 2021 | Language Modeling | Code Available | 1 |
| Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | Dec 16, 2021 | Generalizable Person Re-identification, Mixture-of-Experts | Code Available | 1 |
| Unsupervised Foreground Extraction via Deep Region Competition | Oct 29, 2021 | Image Segmentation, Inductive Bias | Code Available | 1 |
| HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Oct 8, 2021 | Abstractive Text Summarization, Decoder | Code Available | 1 |