SOTAVerified

Mixture-of-Experts

Papers

Showing 226–250 of 1312 papers

Title | Status | Hype
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design | Code | 1
Image Super-resolution Via Latent Diffusion: A Sampling-space Mixture Of Experts And Frequency-augmented Decoder Approach | Code | 1
Customizing Language Models with Instance-wise LoRA for Sequential Recommendation | Code | 1
Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Code | 1
Distilling the Knowledge in a Neural Network | Code | 1
DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | Code | 1
A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Code | 1
HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts | Code | 1
Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Code | 1
Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1
Hierarchical Time-Aware Mixture of Experts for Multi-Modal Sequential Recommendation | Code | 1
BrainMAP: Learning Multiple Activation Pathways in Brain Networks | Code | 1
Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks | Code | 1
Heterogeneous Multi-task Learning with Expert Diversity | Code | 1
Graph Sparsification via Mixture of Graphs | Code | 1
Gradient-free variational learning with conditional mixture networks | Code | 1
GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Code | 1
Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Code | 1
HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Code | 1
Gated Multimodal Units for Information Fusion | Code | 1
GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1
EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate | Code | 1
BiMediX: Bilingual Medical Mixture of Experts LLM | Code | 1
FreqMoE: Enhancing Time Series Forecasting through Frequency Decomposition Mixture of Experts | Code | 1
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1
Page 10 of 53

No leaderboard results yet.