
Mixture-of-Experts

Papers

Showing 151–175 of 1312 papers

Title | Status | Hype
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Code | 1
Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1
3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition | Code | 1
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Code | 1
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Code | 1
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Code | 1
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1
Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | Code | 1
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries | Code | 1
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Code | 1
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Code | 1
MedCoT: Medical Chain of Thought via Hierarchical Expert | Code | 1
Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | Code | 1
Mixture of Experts Meets Prompt-Based Continual Learning | Code | 1
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
Page 7 of 53
