| Title | Date | Tags |
| --- | --- | --- |
| Self-tuned Visual Subclass Learning with Shared Samples: An Incremental Approach | May 22, 2014 | Clustering, General Classification |
| Semantic-Aware Dynamic Parameter for Video Inpainting Transformer | Jan 1, 2023 | Mixture-of-Experts, Video Inpainting |
| Probing Semantic Routing in Large Mixture-of-Expert Models | Feb 15, 2025 | Mixture-of-Experts, Sentence |
| SemEval-2025 Task 1: AdMIRe -- Advancing Multimodal Idiomaticity Representation | Mar 19, 2025 | Mixture-of-Experts |
| MoESys: A Distributed and Efficient Mixture-of-Experts Training and Inference System for Internet Services | May 20, 2022 | CPU, Distributed Computing |
| Serving Large Language Models on Huawei CloudMatrix384 | Jun 15, 2025 | Mixture-of-Experts, Quantization |
| SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget | Aug 29, 2023 | Mixture-of-Experts, Object Detection |
| Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts | Apr 7, 2024 | Mixture-of-Experts |
| Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts | May 22, 2024 | Mixture-of-Experts |
| Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective | Feb 1, 2025 | Mixture-of-Experts |