SOTAVerified

Mixture-of-Experts

Papers

Showing 26–50 of 1312 papers

Title | Status | Hype
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts | Code | 5
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral | Code | 5
Aria: An Open Multimodal Native Mixture-of-Experts Model | Code | 5
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Code | 5
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Code | 4
JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Code | 4
OLMoE: Open Mixture-of-Experts Language Models | Code | 4
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts | Code | 4
Mixtral of Experts | Code | 4
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free | Code | 4
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts | Code | 4
Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Code | 4
Training Sparse Mixture Of Experts Text Embedding Models | Code | 4
Fast Inference of Mixture-of-Experts Language Models with Offloading | Code | 4
MoH: Multi-Head Attention as Mixture-of-Head Attention | Code | 4
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Code | 4
Learning Heterogeneous Mixture of Scene Experts for Large-scale Neural Radiance Fields | Code | 3
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | Code | 3
Generalizing Motion Planners with Mixture of Experts for Autonomous Driving | Code | 3
FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Code | 3
MoAI: Mixture of All Intelligence for Large Language and Vision Models | Code | 3
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts | Code | 3
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Code | 3
Page 2 of 53
