SOTAVerified

Mixture-of-Experts

Papers

Showing 451–500 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| MoE-I^2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition | Code | 0 |
| MoNTA: Accelerating Mixture-of-Experts Training with Network-Traffic-Aware Parallel Optimization | Code | 0 |
| LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models | Code | 1 |
| Stereo-Talker: Audio-driven 3D Human Synthesis with Prior-Guided Mixture-of-Experts | | 0 |
| Efficient and Interpretable Grammatical Error Correction with Mixture of Experts | Code | 0 |
| MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning | | 0 |
| Stealing User Prompts from Mixture of Experts | | 0 |
| Neural Experts: Mixture of Experts for Implicit Neural Representations | | 0 |
| ProMoE: Fast MoE-based LLM Serving using Proactive Caching | | 0 |
| Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging | | 0 |
| FinTeamExperts: Role Specialized MOEs For Financial Analysis | | 0 |
| Efficient Mixture-of-Expert for Video-based Driver State and Physiological Multi-task Estimation in Conditional Autonomous Driving | | 0 |
| Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis | Code | 0 |
| DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unsupervised Dimensionality Reduction | Code | 1 |
| Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design | Code | 1 |
| Mixture of Parrots: Experts improve memorization more than reasoning | | 0 |
| MoMQ: Mixture-of-Experts Enhances Multi-Dialect Query Generation across Relational and Non-Relational Databases | | 0 |
| Robust and Explainable Depression Identification from Speech Using Vowel-Based Ensemble Learning Approaches | | 0 |
| MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning | | 0 |
| Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition | | 0 |
| ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | | 0 |
| Optimizing Mixture-of-Experts Inference Time Combining Model Deployment and Communication Scheduling | | 0 |
| Generalizing Motion Planners with Mixture of Experts for Autonomous Driving | Code | 3 |
| CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts | Code | 0 |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Code | 1 |
| ViMoE: An Empirical Study of Designing Vision Mixture-of-Experts | | 0 |
| LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration | Code | 2 |
| MENTOR: Mixture-of-Experts Network with Task-Oriented Perturbation for Visual Reinforcement Learning | | 0 |
| MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Code | 1 |
| ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction | Code | 1 |
| Enhancing Generalization in Sparse Mixture of Experts Models: The Case for Increased Expert Activation in Compositional Tasks | | 0 |
| On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs | | 0 |
| EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference | | 0 |
| Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts | | 0 |
| Transformer Layer Injection: A Novel Approach for Efficient Upscaling of Large Language Models | | 0 |
| MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router | | 0 |
| MoH: Multi-Head Attention as Mixture-of-Head Attention | Code | 4 |
| Quadratic Gating Functions in Mixture of Experts: A Statistical Insight | | 0 |
| GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Code | 1 |
| AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Code | 1 |
| Ada-K Routing: Boosting the Efficiency of MoE-based LLMs | | 0 |
| Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts | Code | 2 |
| Learning to Ground VLMs without Forgetting | | 0 |
| Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free | Code | 2 |
| Scalable Multi-Domain Adaptation of Language Models using Modular Experts | | 0 |
| Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Code | 1 |
| Moirai-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts | Code | 5 |
| ContextWIN: Whittle Index Based Mixture-of-Experts Neural Model For Restless Bandits Via Deep RL | | 0 |
| MoIN: Mixture of Introvert Experts to Upcycle an LLM | | 0 |
| AT-MoE: Adaptive Task-planning Mixture of Experts via LoRA Approach | | 0 |
Page 10 of 27