SOTAVerified

Mixture-of-Experts

Papers

Showing 21–30 of 1312 papers

Title | Status | Hype
Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget | Code | 5
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training | Code | 5
Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts | Code | 5
Parrot: Multilingual Visual Instruction Tuning | Code | 5
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts | Code | 5
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral | Code | 5
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models | Code | 5
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Code | 5
Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model | Code | 5
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free | Code | 4
Page 3 of 132

No leaderboard results yet.