SOTAVerified

Mixture-of-Experts

Papers

Showing 1101-1150 of 1312 papers

Title | Status | Hype
Sparse Mixers: Combining MoE and Mixing to build a more efficient BERT | - | 0
MoESys: A Distributed and Efficient Mixture-of-Experts Training and Inference System for Internet Services | - | 0
Pluralistic Image Completion with Probabilistic Mixture-of-Experts | - | 0
Unified Modeling of Multi-Domain Multi-Device ASR Systems | - | 0
ST-ExpertNet: A Deep Expert Framework for Traffic Prediction | - | 0
Optimizing Mixture of Experts using Dynamic Recompilations | - | 0
How Can Cross-lingual Knowledge Contribute Better to Fine-Grained Entity Typing? | - | 0
On the Representation Collapse of Sparse Mixture of Experts | - | 0
Residual Mixture of Experts | - | 0
Towards Efficient Single Image Dehazing and Desnowing | - | 0
Table-based Fact Verification with Self-adaptive Mixture of Experts | Code | 0
Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners | - | 0
Mixture of Experts for Biomedical Question Answering | - | 0
Mixture-of-experts VAEs can disregard variation in surjective multimodal data | - | 0
Learning to Adapt Clinical Sequences with Residual Mixture of Experts | Code | 0
Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation | - | 0
On the Adaptation to Concept Drift for CTR Prediction | - | 0
Efficient Reflectance Capture with a Deep Gated Mixture-of-Experts | - | 0
Build a Robust QA System with Transformer-based Mixture of Experts | Code | 0
Efficient Language Modeling with Sparse all-MLP | - | 0
SkillNet-NLU: A Sparsely Activated Model for General-Purpose Natural Language Understanding | - | 0
Functional mixture-of-experts for classification | - | 0
Mixture-of-Experts with Expert Choice Routing | - | 0
A Survey on Dynamic Neural Networks for Natural Language Processing | - | 0
Physics-Guided Problem Decomposition for Scaling Deep Learning of High-dimensional Eigen-Solvers: The Case of Schrödinger's Equation | - | 0
One Student Knows All Experts Know: From Sparse to Dense | - | 0
MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | - | 0
Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners | - | 0
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale | Code | 0
Towards Lightweight Neural Animation : Exploration of Neural Network Pruning in Mixture of Experts-based Animation Models | - | 0
Combinations of Adaptive Filters | - | 0
Efficient Large Scale Language Modeling with Mixtures of Experts | - | 0
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | - | 0
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition | - | 0
Specializing Versatile Skill Libraries using Local Mixture of Experts | Code | 0
Anchoring to Exemplars for Training Mixture-of-Expert Cell Embeddings | - | 0
A Mixture of Expert Based Deep Neural Network for Improved ASR | - | 0
TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification | - | 0
Expert Aggregation for Financial Forecasting | - | 0
SpeechMoE2: Mixture-of-Experts Model with Improved Routing | - | 0
Table-based Fact Verification with Self-adaptive Mixture of Experts | - | 0
MoEfication: Conditional Computation of Transformer Models for Efficient Inference | - | 0
StableMoE: Stable Routing Strategy for Mixture of Experts | - | 0
M6-T: Exploring Sparse Expert Models and Beyond | - | 0
SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization | - | 0
Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation | Code | 0
RTM Super Learner Results at Quality Estimation Task | - | 0
Polynomial-Spline Neural Networks with Exact Integrals | - | 0
P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts | - | 0
Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model | - | 0
Page 23 of 27

No leaderboard results yet.