SOTAVerified

Mixture-of-Experts

Papers

Showing 301–325 of 1312 papers

| Title | Status | Hype |
|---|---|---|
| DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1 |
| Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1 |
| Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Code | 1 |
| Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | Code | 1 |
| Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1 |
| FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | Code | 1 |
| AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1 |
| FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1 |
| Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Code | 1 |
| Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Code | 1 |
| MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Code | 1 |
| Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1 |
| Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs | Code | 1 |
| Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Code | 1 |
| Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1 |
| Specialized federated learning using a mixture of experts | Code | 1 |
| EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Code | 1 |
| Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1 |
| Emergent Modularity in Pre-trained Transformers | Code | 1 |
| Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1 |
| FreqMoE: Enhancing Time Series Forecasting through Frequency Decomposition Mixture of Experts | Code | 1 |
| Heterogeneous Multi-task Learning with Expert Diversity | Code | 1 |
| Learning to Skip the Middle Layers of Transformers | Code | 1 |
| Modality Interactive Mixture-of-Experts for Fake News Detection | Code | 1 |
| Mixture of Experts Meets Prompt-Based Continual Learning | Code | 1 |
Page 13 of 53

No leaderboard results yet.