SOTAVerified

Mixture-of-Experts

Papers

Showing 1026-1050 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | | 0 |
| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | | 0 |
| Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs | | 0 |
| Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM | | 0 |
| EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media | | 0 |
| EVLM: An Efficient Vision-Language Model for Visual Understanding | | 0 |
| EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models | | 0 |
| Expert Aggregation for Financial Forecasting | | 0 |
| ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | | 0 |
| Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts | | 0 |
| ExpertRAG: Efficient RAG with Mixture of Experts -- Optimizing Context Retrieval for Adaptive LLM Responses | | 0 |
| ExpertRank: A Multi-level Coarse-grained Expert-based Listwise Ranking Loss | | 0 |
| Experts Weights Averaging: A New General Training Scheme for Vision Transformers | | 0 |
| Expert-Token Resonance: Redefining MoE Routing through Affinity-Driven Active Selection | | 0 |
| Explainable Classifier for Malignant Lymphoma Subtyping via Cell Graph and Image Fusion | | 0 |
| Explainable data-driven modeling via mixture of experts: towards effective blending of grey and black-box models | | 0 |
| Exploiting Mixture-of-Experts Redundancy Unlocks Multimodal Generative Abilities | | 0 |
| Exploring Domain Robust Lightweight Reward Models based on Router Mechanism | | 0 |
| Exploring Routing Strategies for Multilingual Mixture-of-Experts Models | | 0 |
| M6-T: Exploring Sparse Expert Models and Beyond | | 0 |
| Exploring Speaker Diarization with Mixture of Experts | | 0 |
| Facet-Aware Multi-Head Mixture-of-Experts Model for Sequential Recommendation | | 0 |
| Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective | | 0 |
| Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition | | 0 |
| Faster MoE LLM Inference for Extremely Large Models | | 0 |
Page 42 of 53

No leaderboard results yet.