SOTAVerified

Mixture-of-Experts

Papers

Showing 276-300 of 1312 papers

Title | Status | Hype
Multimodal Clinical Trial Outcome Prediction with Large Language Models | Code | 1
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
Multi-Head Mixture-of-Experts | Code | 1
Multimodal Variational Autoencoders for Semi-Supervised Learning: In Defense of Product-of-Experts | Code | 1
EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1
DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unsupervised Dimensionality Reduction | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | Code | 1
MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1
Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Code | 1
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Code | 1
Distilling the Knowledge in a Neural Network | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks | Code | 1
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts | Code | 1
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Code | 1
MoËT: Mixture of Expert Trees and its Application to Verifiable Reinforcement Learning | Code | 1
DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
MoExtend: Tuning New Experts for Modality and Task Extension | Code | 1
Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1
Page 12 of 53

No leaderboard results yet.