| Utility-Driven Speculative Decoding for Mixture-of-Experts | Jun 17, 2025 | GPU, Large Language Model | —Unverified | 0 | 0 |
| Vanilla Transformers are Transfer Capability Teachers | Mar 4, 2024 | Computational Efficiency, Mixture-of-Experts | —Unverified | 0 | 0 |
| Variational Distillation of Diffusion Policies into Mixture of Experts | Jun 18, 2024 | Denoising, Mixture-of-Experts | —Unverified | 0 | 0 |
| Variational Mixture of Gaussian Process Experts | Dec 1, 2008 | Gaussian Processes, Mixture-of-Experts | —Unverified | 0 | 0 |
| ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts | Oct 21, 2024 | image-classification, Image Classification | —Unverified | 0 | 0 |
| Visual Saliency Prediction Using a Mixture of Deep Neural Networks | Feb 1, 2017 | Mixture-of-Experts, Saliency Prediction | —Unverified | 0 | 0 |
| WDMoE: Wireless Distributed Large Language Models with Mixture of Experts | May 6, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| WDMoE: Wireless Distributed Mixture of Experts for Large Language Models | Nov 11, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| WeNet: Weighted Networks for Recurrent Network Architecture Search | Apr 8, 2019 | General Classification, image-classification | —Unverified | 0 | 0 |
| Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production | Nov 18, 2022 | Machine Translation, Mixture-of-Experts | —Unverified | 0 | 0 |