| SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget | Aug 29, 2023 | Mixture-of-Experts, object-detection |
| Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts | Apr 7, 2024 | Mixture-of-Experts |
| Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts | May 22, 2024 | Mixture-of-Experts |
| Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective | Feb 1, 2025 | Mixture-of-Experts |
| Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model | Oct 13, 2021 | Mixture-of-Experts, Question Generation |
| SimSMoE: Solving Representational Collapse via Similarity Measure | Jun 22, 2024 | Mixture-of-Experts |
| Simultaneous Feature and Expert Selection within Mixture of Experts | May 29, 2014 | feature selection, Mixture-of-Experts |
| Single-Example Learning in a Mixture of GPDMs with Latent Geometries | Jun 17, 2025 | Mixture-of-Experts |
| SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills | Jun 28, 2023 | Mixture-of-Experts, Natural Language Understanding |
| SMAR: Soft Modality-Aware Routing Strategy for MoE-based Multimodal Large Language Models Preserving Language Capabilities | Jun 6, 2025 | Mixture-of-Experts |