SOTAVerified

Mixture-of-Experts

Papers

Showing 526–550 of 1312 papers

Title | Status | Hype
Experts Weights Averaging: A New General Training Scheme for Vision Transformers | | 0
CLER: Cross-task Learning with Expert Representation to Generalize Reading and Understanding | | 0
A Novel Cluster Classify Regress Model Predictive Controller Formulation; CCR-MPC | | 0
Advancing Expert Specialization for Better MoE | | 0
ExpertRank: A Multi-level Coarse-grained Expert-based Listwise Ranking Loss | | 0
ExpertRAG: Efficient RAG with Mixture of Experts -- Optimizing Context Retrieval for Adaptive LLM Responses | | 0
CICADA: Cross-Domain Interpretable Coding for Anomaly Detection and Adaptation in Multivariate Time Series | | 0
Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts | | 0
ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | | 0
A Novel A.I Enhanced Reservoir Characterization with a Combined Mixture of Experts -- NVIDIA Modulus based Physics Informed Neural Operator Forward Model | | 0
Expert Aggregation for Financial Forecasting | | 0
Wonderful Matrices: More Efficient and Effective Architecture for Language Modeling Tasks | | 0
EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models | | 0
Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings | | 0
AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction | | 0
EVLM: An Efficient Vision-Language Model for Visual Understanding | | 0
EvidenceMoE: A Physics-Guided Mixture-of-Experts with Evidential Critics for Advancing Fluorescence Light Detection and Ranging in Scattering Media | | 0
Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM | | 0
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs | | 0
Non-asymptotic model selection in block-diagonal mixture of polynomial experts models | | 0
3D Gaussian Splatting Data Compression with Mixture of Priors | | 0
Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | | 0
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE | | 0
Channel Gain Cartography via Mixture of Experts | | 0
EVA: Mixture-of-Experts Semantic Variant Alignment for Compositional Zero-Shot Learning | | 0
Page 22 of 53

No leaderboard results yet.