| Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset | Mar 22, 2023 | Mixture-of-Expertstext-classification | —Unverified | 0 |
| HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals | Mar 17, 2023 | Computational EfficiencyMixture-of-Experts | —Unverified | 0 |
| MCR-DL: Mix-and-Match Communication Runtime for Deep Learning | Mar 15, 2023 | Deep LearningGPU | —Unverified | 0 |
| Scaling Vision-Language Models with Sparse Mixture of Experts | Mar 13, 2023 | Mixture-of-Experts | —Unverified | 0 |
| A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training | Mar 11, 2023 | Mixture-of-Experts | —Unverified | 0 |
| Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference | Mar 10, 2023 | CPUDecoder | —Unverified | 0 |
| Improving Expert Specialization in Mixture of Experts | Feb 28, 2023 | Continual LearningMixture-of-Experts | —Unverified | 0 |
| Improved Training of Mixture-of-Experts Language GANs | Feb 23, 2023 | Adversarial TextImage Generation | —Unverified | 0 |
| TMoE-P: Towards the Pareto Optimum for Multivariate Soft Sensors | Feb 21, 2023 | Mixture-of-Experts | —Unverified | 0 |
| Massively Multilingual Shallow Fusion with Large Language Models | Feb 17, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | —Unverified | 0 |