Mixture-of-Experts

Papers

Showing 801–850 of 1312 papers

Title | Status | Hype
Scaling physics-informed hard constraints with mixture-of-experts | Code | 1
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts | Code | 1
BiMediX: Bilingual Medical Mixture of Experts LLM | Code | 1
Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference | - | 0
MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models | - | 0
Towards an empirical understanding of MoE design choices | - | 0
Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization | Code | 1
Turn Waste into Worth: Rectifying Top-k Router of MoE | - | 0
MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning | - | 0
AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction | - | 0
Higher Layers Need More LoRA Experts | Code | 2
P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation | - | 0
Mixture of Link Predictors on Graphs | Code | 0
Scaling Laws for Fine-Grained Mixture of Experts | Code | 3
Differentially Private Training of Mixture of Experts Models | - | 0
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3
Multimodal Clinical Trial Outcome Prediction with Large Language Models | Code | 1
Buffer Overflow in Mixture of Experts | - | 0
Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts | - | 0
On Parameter Estimation in Deviated Gaussian Mixture of Experts | - | 0
Approximation Rates and VC-Dimension Bounds for (P)ReLU MLP Mixture of Experts | - | 0
On Least Square Estimation in Softmax Gating Mixture of Experts | - | 0
Intrinsic User-Centric Interpretability through Global Mixture of Experts | Code | 0
FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion | - | 0
CompeteSMoE - Effective Training of Sparse Mixture of Experts via Competition | Code | 0
pFedMoE: Data-Level Personalization with Mixture of Experts for Model-Heterogeneous Personalized Federated Learning | Code | 0
BlackMamba: Mixture of Experts for State-Space Models | Code | 3
Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Code | 1
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Code | 1
MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts | - | 0
Explainable data-driven modeling via mixture of experts: towards effective blending of grey and black-box models | - | 0
Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to Improve in Chess | Code | 0
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models | Code | 5
Routers in Vision Mixture of Experts: An Empirical Study | - | 0
LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs | - | 0
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models | Code | 7
Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings | Code | 1
Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts? | - | 0
M^3TN: Multi-gate Mixture-of-Experts based Multi-valued Treatment Network for Uplift Modeling | - | 0
Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Code | 1
Towards A Better Metric for Text-to-Video Generation | - | 0
Prompt-based mental health screening from social media text | - | 0
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Code | 5
Robust Calibration For Improved Weather Prediction Under Distributional Shift | - | 0
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts | Code | 3
Mixtral of Experts | Code | 4
Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models | - | 0
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2
Subjective and Objective Analysis of Indian Social Media Video Quality | Code | 0
Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Code | 1
Page 17 of 27