SOTAVerified

Mixture-of-Experts

Papers

Showing 976-1000 of 1312 papers

Title | Status | Hype
MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering | Code | 1
Improving Expert Specialization in Mixture of Experts | - | 0
Improved Training of Mixture-of-Experts Language GANs | - | 0
TMoE-P: Towards the Pareto Optimum for Multivariate Soft Sensors | - | 0
Massively Multilingual Shallow Fusion with Large Language Models | - | 0
Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective | - | 0
Alternating Updates for Efficient Transformers | - | 0
PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets | - | 0
AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction | - | 0
Covariate-guided Bayesian mixture model for multivariate time series | Code | 0
Semantic-Aware Dynamic Parameter for Video Inpainting Transformer | - | 0
AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts | - | 0
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners | - | 0
MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code Completion | - | 0
Generalizing Multimodal Variational Methods to Sets | - | 0
Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model | - | 0
Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation | - | 0
Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners | - | 0
SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing | - | 0
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | Code | 2
Incorporating Polar Field Data for Improved Solar Flare Prediction | - | 0
Named Entity and Relation Extraction with Multi-Modal Retrieval | - | 0
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Code | 3
Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling | - | 0
Mixture of Decision Trees for Interpretable Machine Learning | Code | 1
Page 40 of 53

No leaderboard results yet.