SOTAVerified

Mixture-of-Experts

Papers

Showing 176–200 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unsupervised Dimensionality Reduction | Code | 1 |
| Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design | Code | 1 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1 |
| ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction | Code | 1 |
| MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1 |
| GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1 |
| AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Code | 1 |
| Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1 |
| Retraining-Free Merging of Sparse MoE via Hierarchical Clustering | Code | 1 |
| Efficient Dictionary Learning with Switch Sparse Autoencoders | Code | 1 |
| Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild | Code | 1 |
| Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices | Code | 1 |
| A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Code | 1 |
| Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Code | 1 |
| LOLA -- An Open-Source Massively Multilingual Large Language Model | Code | 1 |
| M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Code | 1 |
| Gradient-free variational learning with conditional mixture networks | Code | 1 |
| Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting | Code | 1 |
| AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference | Code | 1 |
| Customizing Language Models with Instance-wise LoRA for Sequential Recommendation | Code | 1 |
| Layerwise Recurrent Router for Mixture-of-Experts | Code | 1 |
| AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Code | 1 |
| MoExtend: Tuning New Experts for Modality and Task Extension | Code | 1 |
| Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Code | 1 |
| M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Code | 1 |
Page 8 of 53