SOTAVerified

Mixture-of-Experts

Papers

Showing 101–125 of 1312 papers

Title | Status | Hype
Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Code | 2
MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Code | 2
Multi-Task Dense Prediction via Mixture of Low-Rank Experts | Code | 2
Task-Customized Mixture of Adapters for General Image Fusion | Code | 2
Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation | Code | 2
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts | Code | 2
Scattered Mixture-of-Experts Implementation | Code | 2
Harder Tasks Need More Experts: Dynamic Routing in MoE Models | Code | 2
TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts | Code | 2
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Code | 2
Higher Layers Need More LoRA Experts | Code | 2
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2
Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Code | 2
LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Code | 2
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Code | 2
Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Code | 2
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning | Code | 2
Fast Feedforward Networks | Code | 2
Motion In-Betweening with Phase Manifolds | Code | 2
TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | Code | 2
ModuleFormer: Modularity Emerges from Mixture-of-Experts | Code | 2
Learning A Sparse Transformer Network for Effective Image Deraining | Code | 2
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | Code | 2
No Language Left Behind: Scaling Human-Centered Machine Translation | Code | 2
Towards Universal Sequence Representation Learning for Recommender Systems | Code | 2

No leaderboard results yet.