SOTAVerified

Mixture-of-Experts

Papers

Showing 151-200 of 1312 papers

Title | Status | Hype
GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Code | 1
Graph Sparsification via Mixture of Graphs | Code | 1
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference | Code | 1
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | Code | 1
Mixture of Experts Meets Prompt-Based Continual Learning | Code | 1
Mixture-of-Linear-Experts for Long-term Time Series Forecasting | Code | 1
Mixture of Decision Trees for Interpretable Machine Learning | Code | 1
Go Wider Instead of Deeper | Code | 1
Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Code | 1
COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search | Code | 1
Gated Multimodal Units for Information Fusion | Code | 1
AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models | Code | 1
Gradient-free variational learning with conditional mixture networks | Code | 1
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering | Code | 1
Mixture of Attention Heads: Selecting Attention Heads Per Token | Code | 1
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing | Code | 1
MoEDiff-SR: Mixture of Experts-Guided Diffusion Model for Region-Adaptive MRI Super-Resolution | Code | 1
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | Code | 1
CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference | Code | 1
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering | Code | 1
Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | Code | 1
Specialized federated learning using a mixture of experts | Code | 1
MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | Code | 1
MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Code | 1
Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis | Code | 1
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy | Code | 1
PAD-Net: An Efficient Framework for Dynamic Networks | Code | 1
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter | Code | 1
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts | Code | 1
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Code | 1
Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark | Code | 1
MedCoT: Medical Chain of Thought via Hierarchical Expert | Code | 1
Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Code | 1
Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing | Code | 1
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts | Code | 1
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images | Code | 1
M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | Code | 1
Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction | Code | 1
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1
M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Code | 1
Addressing Confounding Feature Issue for Causal Recommendation | Code | 1

No leaderboard results yet.