SOTAVerified

Mixture-of-Experts

Papers

Showing 1101–1150 of 1312 papers

| Title | Status | Hype |
| --- | --- | --- |
| Combinations of Adaptive Filters | | 0 |
| Efficient Large Scale Language Modeling with Mixtures of Experts | | 0 |
| Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | Code | 1 |
| GLaM: Efficient Scaling of Language Models with Mixture-of-Experts | | 0 |
| Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition | | 0 |
| Specializing Versatile Skill Libraries using Local Mixture of Experts | Code | 0 |
| Anchoring to Exemplars for Training Mixture-of-Expert Cell Embeddings | | 0 |
| A Mixture of Expert Based Deep Neural Network for Improved ASR | | 0 |
| TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification | | 0 |
| Expert Aggregation for Financial Forecasting | | 0 |
| SpeechMoE2: Mixture-of-Experts Model with Improved Routing | | 0 |
| M6-T: Exploring Sparse Expert Models and Beyond | | 0 |
| StableMoE: Stable Routing Strategy for Mixture of Experts | | 0 |
| Table-based Fact Verification with Self-adaptive Mixture of Experts | | 0 |
| SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization | | 0 |
| MoEfication: Conditional Computation of Transformer Models for Efficient Inference | | 0 |
| Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation | Code | 0 |
| RTM Super Learner Results at Quality Estimation Task | | 0 |
| Unsupervised Foreground Extraction via Deep Region Competition | Code | 1 |
| Polynomial-Spline Neural Networks with Exact Integrals | | 0 |
| P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts | | 0 |
| Simple or Complex? Complexity-Controllable Question Generation with Soft Templates and Deep Mixture of Experts Model | | 0 |
| HydraSum: Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Code | 1 |
| Taming Sparsely Activated Transformer with Stochastic Experts | Code | 1 |
| Sparse MoEs meet Efficient Ensembles | Code | 1 |
| Continual Learning Using Task Conditional Neural Networks | | 0 |
| Full-Precision Free Binary Graph Neural Networks | | 0 |
| MECATS: Mixture-of-Experts for Probabilistic Forecasts of Aggregated Time Series | | 0 |
| HydraSum - Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | | 0 |
| Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference | | 0 |
| Unbiased Gradient Estimation with Balanced Assignments for Mixtures of Experts | | 0 |
| Scalable and Efficient MoE Training for Multitask Multilingual Models | | 0 |
| Universal Simultaneous Machine Translation with Mixture-of-Experts Wait-k Policy | Code | 0 |
| Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss | Code | 1 |
| Cross-token Modeling with Conditional Computation | | 0 |
| Personalised Federated Learning: A Combinational Approach | | 0 |
| SPMoE: Generate Multiple Pattern-Aware Outputs with Sparse Pattern Mixture of Experts | | 0 |
| AIREX: Neural Network-based Approach for Air Quality Inference in Unmonitored Cities | | 0 |
| Strength in Numbers: Averaging and Clustering Effects in Mixture of Experts for Graph-Based Dependency Parsing | | 0 |
| A Mixture-of-Experts Model for Antonym-Synonym Discrimination | Code | 0 |
| ExpertRank: A Multi-level Coarse-grained Expert-based Listwise Ranking Loss | | 0 |
| Few-Shot and Continual Learning with Attentive Independent Mechanisms | Code | 1 |
| Go Wider Instead of Deeper | Code | 1 |
| Federated Mixture of Experts | | 0 |
| Lifelong Mixture of Variational Autoencoders | Code | 0 |
| AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style | | 0 |
| Adaptive 3D descattering with a dynamic synthesis network | Code | 0 |
| On component interactions in two-stage recommender systems | | 0 |
| Mixtures of Deep Neural Experts for Automated Speech Scoring | | 0 |
| Heterogeneous Multi-task Learning with Expert Diversity | Code | 1 |

No leaderboard results yet.