| Scaling Laws for Native Multimodal Models | Apr 10, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Scaling Vision-Language Models with Sparse Mixture of Experts | Mar 13, 2023 | Mixture-of-Experts | —Unverified | 0 | 0 |
| SCFCRC: Simultaneously Counteract Feature Camouflage and Relation Camouflage for Fraud Detection | Jan 21, 2025 | Contrastive Learning, Fraud Detection | —Unverified | 0 | 0 |
| SciDFM: A Large Language Model with Mixture-of-Experts for Science | Sep 27, 2024 | Language Modeling | —Unverified | 0 | 0 |
| SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | Jun 26, 2024 | Automatic Speech Recognition (ASR) | —Unverified | 0 | 0 |
| Security Assessment of DeepSeek and GPT Series Models against Jailbreak Attacks | Jun 23, 2025 | Mixture-of-Experts, Safety Alignment | —Unverified | 0 | 0 |
| Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning | Apr 10, 2025 | Mixture-of-Experts, Reinforcement Learning | —Unverified | 0 | 0 |
| Seed1.5-VL Technical Report | May 11, 2025 | Mixture-of-Experts, Multimodal Reasoning | —Unverified | 0 | 0 |
| Seeing the Unseen: How EMoE Unveils Bias in Text-to-Image Diffusion Models | May 19, 2025 | Fairness, Mixture-of-Experts | —Unverified | 0 | 0 |
| SEER-MoE: Sparse Expert Efficiency through Regularization for Mixture-of-Experts | Apr 7, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Self-tuned Visual Subclass Learning with Shared Samples: An Incremental Approach | May 22, 2014 | Clustering, General Classification | —Unverified | 0 | 0 |
| Semantic-Aware Dynamic Parameter for Video Inpainting Transformer | Jan 1, 2023 | Mixture-of-Experts, Video Inpainting | —Unverified | 0 | 0 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | Feb 15, 2025 | Mixture-of-Experts, Sentence | —Unverified | 0 | 0 |
| SemEval-2025 Task 1: AdMIRe -- Advancing Multimodal Idiomaticity Representation | Mar 19, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| MoESys: A Distributed and Efficient Mixture-of-Experts Training and Inference System for Internet Services | May 20, 2022 | CPU, Distributed Computing | —Unverified | 0 | 0 |
| Serving Large Language Models on Huawei CloudMatrix384 | Jun 15, 2025 | Mixture-of-Experts, Quantization | —Unverified | 0 | 0 |
| SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget | Aug 29, 2023 | Mixture-of-Experts, Object Detection | —Unverified | 0 | 0 |
| Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts | Apr 7, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts | May 22, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Sigmoid Self-Attention has Lower Sample Complexity than Softmax Self-Attention: A Mixture-of-Experts Perspective | Feb 1, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model | Oct 13, 2021 | Mixture-of-Experts, Question Generation | —Unverified | 0 | 0 |
| SimSMoE: Solving Representational Collapse via Similarity Measure | Jun 22, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Simultaneous Feature and Expert Selection within Mixture of Experts | May 29, 2014 | Feature Selection, Mixture-of-Experts | —Unverified | 0 | 0 |
| Single-Example Learning in a Mixture of GPDMs with Latent Geometries | Jun 17, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills | Jun 28, 2023 | Mixture-of-Experts, Natural Language Understanding | —Unverified | 0 | 0 |
| SMAR: Soft Modality-Aware Routing Strategy for MoE-based Multimodal Large Language Models Preserving Language Capabilities | Jun 6, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing | Dec 10, 2022 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning | Jul 1, 2024 | Continual Learning, Mixture-of-Experts | —Unverified | 0 | 0 |
| Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners | Apr 16, 2022 | Mixture-of-Experts, Multi-Task Learning | —Unverified | 0 | 0 |
| Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT | May 24, 2022 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Sparse Mixture of Experts as Unified Competitive Learning | Mar 29, 2025 | Language Modeling | —Unverified | 0 | 0 |
| Sparse Mixture-of-Experts for Non-Uniform Noise Reduction in MRI Images | Jan 24, 2025 | Denoising, Diagnostic | —Unverified | 0 | 0 |
| Cross-token Modeling with Conditional Computation | Sep 5, 2021 | Computational Efficiency, Image Classification | —Unverified | 0 | 0 |
| Sparse Upcycling: Inference Inefficient Finetuning | Nov 13, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Sparse Video Representation Using Steered Mixture-of-Experts With Global Motion Compensation | Sep 13, 2022 | Mixture-of-Experts, Motion Compensation | —Unverified | 0 | 0 |
| Sparsity-Constrained Optimal Transport | Sep 30, 2022 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Speculative MoE: Communication Efficient Parallel MoE Inference with Speculative Token and Expert Pre-scheduling | Mar 6, 2025 | Mixture-of-Experts, Scheduling | —Unverified | 0 | 0 |
| SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations | Nov 8, 2022 | Mixture-of-Experts, Speech-to-Speech Translation | —Unverified | 0 | 0 |
| SpeechMoE2: Mixture-of-Experts Model with Improved Routing | Nov 23, 2021 | Computational Efficiency, Mixture-of-Experts | —Unverified | 0 | 0 |
| Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis | Jul 8, 2025 | Data Augmentation, Mixture-of-Experts | —Unverified | 0 | 0 |
| SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts | Aug 17, 2021 | Diversity, Mixture-of-Experts | —Unverified | 0 | 0 |
| SQL-GEN: Bridging the Dialect Gap for Text-to-SQL Via Synthetic Data And Model Merging | Aug 22, 2024 | Diversity, Mixture-of-Experts | —Unverified | 0 | 0 |
| StableMoE: Stable Routing Strategy for Mixture of Experts | Nov 16, 2021 | Language Modeling | —Unverified | 0 | 0 |
| STAR-Rec: Making Peace with Length Variance and Pattern Diversity in Sequential Recommendation | May 6, 2025 | Diversity, Mixture-of-Experts | —Unverified | 0 | 0 |
| Static Batching of Irregular Workloads on GPUs: Framework and Application to Efficient MoE Model Inference | Jan 27, 2025 | GPU, Mixture-of-Experts | —Unverified | 0 | 0 |
| Statistical Advantages of Perturbing Cosine Router in Mixture of Experts | May 23, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts | Sep 25, 2023 | Density Estimation, Mixture-of-Experts | —Unverified | 0 | 0 |
| Stealing User Prompts from Mixture of Experts | Oct 30, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Steered Mixture-of-Experts Autoencoder Design for Real-Time Image Modelling and Denoising | May 5, 2023 | Decoder, Denoising | —Unverified | 0 | 0 |