SOTAVerified

Mixture-of-Experts

Papers

Showing 151-175 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1 |
| Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1 |
| M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Code | 1 |
| M3oE: Multi-Domain Multi-Task Mixture-of Experts Recommendation Framework | Code | 1 |
| 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition | Code | 1 |
| M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Code | 1 |
| M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | Code | 1 |
| LOLA -- An Open-Source Massively Multilingual Large Language Model | Code | 1 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1 |
| Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Code | 1 |
| LLMBind: A Unified Modality-Task Integration Framework | Code | 1 |
| AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference | Code | 1 |
| LLMCarbon: Modeling the end-to-end Carbon Footprint of Large Language Models | Code | 1 |
| LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Code | 1 |
| Lifting the Curse of Capacity Gap in Distilling Language Models | Code | 1 |
| AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Code | 1 |
| Learning to Skip the Middle Layers of Transformers | Code | 1 |
| AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Code | 1 |
| Learning Soccer Juggling Skills with Layer-wise Mixture-of-Experts | Code | 1 |
| LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Code | 1 |
| Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Code | 1 |
| Mixture of Experts Meets Prompt-Based Continual Learning | Code | 1 |
| Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Code | 1 |
| Layerwise Recurrent Router for Mixture-of-Experts | Code | 1 |
| RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling | Code | 1 |
Page 7 of 53

No leaderboard results yet.