Mixture-of-Experts

Papers

Showing 111-120 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Higher Layers Need More LoRA Experts | Code | 2 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2 |
| Aurora: Activating Chinese chat capability for Mixtral-8x7B sparse Mixture-of-Experts through Instruction-Tuning | Code | 2 |
| LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin | Code | 2 |
| QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Code | 2 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Code | 2 |
| Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning | Code | 2 |
| Fast Feedforward Networks | Code | 2 |
| Motion In-Betweening with Phase Manifolds | Code | 2 |
| TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | Code | 2 |