SOTAVerified

Mixture-of-Experts

Papers

Showing 151-200 of 1312 papers

Title | Status | Hype
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Code | 1
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild | Code | 1
Modality Interactive Mixture-of-Experts for Fake News Detection | Code | 1
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference | Code | 1
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1
MoEDiff-SR: Mixture of Experts-Guided Diffusion Model for Region-Adaptive MRI Super-Resolution | Code | 1
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing | Code | 1
GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Code | 1
Graph Sparsification via Mixture of Graphs | Code | 1
Mixture-of-Linear-Experts for Long-term Time Series Forecasting | Code | 1
Mixture of Decision Trees for Interpretable Machine Learning | Code | 1
Gradient-free variational learning with conditional mixture networks | Code | 1
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Code | 1
RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling | Code | 1
GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1
Go Wider Instead of Deeper | Code | 1
Mixture of Experts Meets Prompt-Based Continual Learning | Code | 1
FreqMoE: Enhancing Time Series Forecasting through Frequency Decomposition Mixture of Experts | Code | 1
CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference | Code | 1
Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Code | 1
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1
Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1
MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Code | 1
Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
PAD-Net: An Efficient Framework for Dynamic Networks | Code | 1
Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Code | 1
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Code | 1
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Code | 1
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Code | 1
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering | Code | 1
Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | Code | 1
EWMoE: An effective model for global weather forecasting with mixture-of-experts | Code | 1
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | Code | 1
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries | Code | 1
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
Addressing Confounding Feature Issue for Causal Recommendation | Code | 1
COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search | Code | 1
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Code | 1
Mixture of Attention Heads: Selecting Attention Heads Per Token | Code | 1
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
Page 4 of 27

No leaderboard results yet.