SOTAVerified

Mixture-of-Experts

Papers

Showing 376–400 of 1312 papers

Title | Status | Hype
CoLA: Collaborative Low-Rank Adaptation | Code | 0
Fast filtering of non-Gaussian models using Amortized Optimal Transport Maps | Code | 0
MoNTA: Accelerating Mixture-of-Experts Training with Network-Traffic-Aware Parallel Optimization | Code | 0
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations | Code | 0
Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds using Convolutional Neural Networks | Code | 0
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models | Code | 0
Extreme Classification in Log Memory using Count-Min Sketch: A Case Study of Amazon Search with 50M Products | Code | 0
MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization | Code | 0
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing | Code | 0
Cluster-Driven Expert Pruning for Mixture-of-Experts Large Language Models | Code | 0
MoE-I^2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition | Code | 0
Exploring Model Consensus to Generate Translation Paraphrases | Code | 0
Exploiting Activation Sparsity with Dense to Dynamic-k Mixture-of-Experts Conversion | Code | 0
MoRE-Brain: Routed Mixture of Experts for Interpretable and Generalizable Cross-Subject fMRI Visual Decoding | Code | 0
More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing | Code | 0
Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts | Code | 0
MLP-KAN: Unifying Deep Representation and Function Learning | Code | 0
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts | Code | 0
Expert Sample Consensus Applied to Camera Re-Localization | Code | 0
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models | Code | 0
CompeteSMoE - Effective Training of Sparse Mixture of Experts via Competition | Code | 0
Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Code | 0
Mixture of Link Predictors on Graphs | Code | 0
Modality-Independent Brain Lesion Segmentation with Privacy-aware Continual Learning | Code | 0
A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts models | Code | 0
Page 16 of 53

No leaderboard results yet.