| Scaling Laws for Native Multimodal Models | Apr 10, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Scaling Vision-Language Models with Sparse Mixture of Experts | Mar 13, 2023 | Mixture-of-Experts | —Unverified | 0 | 0 |
| SCFCRC: Simultaneously Counteract Feature Camouflage and Relation Camouflage for Fraud Detection | Jan 21, 2025 | Contrastive Learning, Fraud Detection | —Unverified | 0 | 0 |
| SciDFM: A Large Language Model with Mixture-of-Experts for Science | Sep 27, 2024 | Language Modeling | —Unverified | 0 | 0 |
| SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | Jun 26, 2024 | Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Security Assessment of DeepSeek and GPT Series Models against Jailbreak Attacks | Jun 23, 2025 | Mixture-of-Experts, Safety Alignment | —Unverified | 0 | 0 |
| Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning | Apr 10, 2025 | Mixture-of-Experts, Reinforcement Learning | —Unverified | 0 | 0 |
| Seed1.5-VL Technical Report | May 11, 2025 | Mixture-of-Experts, Multimodal Reasoning | —Unverified | 0 | 0 |
| Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models | May 19, 2025 | Fairness, Mixture-of-Experts | —Unverified | 0 | 0 |
| SEER-MoE: Sparse Expert Efficiency through Regularization for Mixture-of-Experts | Apr 7, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Self-tuned Visual Subclass Learning with Shared Samples: An Incremental Approach | May 22, 2014 | Clustering, General Classification | —Unverified | 0 | 0 |
| Semantic-Aware Dynamic Parameter for Video Inpainting Transformer | Jan 1, 2023 | Mixture-of-Experts, Video Inpainting | —Unverified | 0 | 0 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | Feb 15, 2025 | Mixture-of-Experts, Sentence | —Unverified | 0 | 0 |
| SemEval-2025 Task 1: AdMIRe -- Advancing Multimodal Idiomaticity Representation | Mar 19, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| MoESys: A Distributed and Efficient Mixture-of-Experts Training and Inference System for Internet Services | May 20, 2022 | CPU, Distributed Computing | —Unverified | 0 | 0 |
| Serving Large Language Models on Huawei CloudMatrix384 | Jun 15, 2025 | Mixture-of-Experts, Quantization | —Unverified | 0 | 0 |
| SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget | Aug 29, 2023 | Mixture-of-Experts, Object Detection | —Unverified | 0 | 0 |
| Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts | Apr 7, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts | May 22, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective | Feb 1, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model | Oct 13, 2021 | Mixture-of-Experts, Question Generation | —Unverified | 0 | 0 |
| SimSMoE: Solving Representational Collapse via Similarity Measure | Jun 22, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Simultaneous Feature and Expert Selection within Mixture of Experts | May 29, 2014 | Feature Selection, Mixture-of-Experts | —Unverified | 0 | 0 |
| Single-Example Learning in a Mixture of GPDMs with Latent Geometries | Jun 17, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills | Jun 28, 2023 | Mixture-of-Experts, Natural Language Understanding | —Unverified | 0 | 0 |