
Mixture-of-Experts

Papers

Showing 701–725 of 1312 papers

| Title | Status | Hype |
|---|---|---|
| Mixture of Experts Meets Prompt-Based Continual Learning | Code | 1 |
| Graph Sparsification via Mixture of Graphs | Code | 1 |
| Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models | Code | 2 |
| Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | Code | 1 |
| Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts | — | 0 |
| xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token | Code | 2 |
| DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1 |
| Ensemble and Mixture-of-Experts DeepONets For Operator Learning | Code | 0 |
| MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Code | 1 |
| Learning More Generalized Experts by Merging Experts in Mixture-of-Experts | — | 0 |
| Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts | Code | 5 |
| Many Hands Make Light Work: Task-Oriented Dialogue System with Module-Based Mixture-of-Experts | — | 0 |
| M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | Code | 1 |
| A Mixture of Experts Approach to 3D Human Motion Prediction | Code | 0 |
| A Mixture-of-Experts Approach to Few-Shot Task Transfer in Open-Ended Text Worlds | — | 0 |
| EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | Code | 2 |
| SUTRA: Scalable Multilingual Language Model Architecture | — | 0 |
| DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | Code | 9 |
| MEET: Mixture of Experts Extra Tree-Based sEMG Hand Gesture Identification | — | 0 |
| WDMoE: Wireless Distributed Large Language Models with Mixture of Experts | — | 0 |
| Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training | — | 0 |
| Mixture of partially linear experts | — | 0 |
| MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | Code | 3 |
| Hierarchical mixture of discriminative Generalized Dirichlet classifiers | — | 0 |
Page 29 of 53
