SOTAVerified

Mixture-of-Experts

Papers

Showing 1001–1050 of 1312 papers

Title | Status | Hype
Double Deep Q-Learning in Opponent Modeling | - | 0
Spatial Mixture-of-Experts | Code | 1
Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production | - | 0
A Bird's-eye View of Reranking: from List Level to Page Level | Code | 0
HMOE: Hypernetwork-based Mixture of Experts for Domain Generalization | - | 0
Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts | - | 0
PAD-Net: An Efficient Framework for Dynamic Networks | Code | 1
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations | - | 0
Using Deep Mixture-of-Experts to Detect Word Meaning Shift for TempoWiC | - | 0
Safe Real-World Autonomous Driving by Learning to Predict and Plan with a Mixture of Experts | - | 0
Contextual Mixture of Experts: Integrating Knowledge into Predictive Modeling | - | 0
Prediction Sets for High-Dimensional Mixture of Experts Models | - | 0
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models | - | 0
Coordination with Humans via Strategy Matching | - | 0
M^3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Code | 1
On the Adversarial Robustness of Mixture of Experts | - | 0
Tiny-Attention Adapter: Contexts Are More Important Than the Number of Parameters | - | 0
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation | Code | 1
Mixture of Attention Heads: Selecting Attention Heads Per Token | Code | 1
FEAMOE: Fair, Explainable and Adaptive Mixture of Experts | - | 0
Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts | Code | 1
Deep Learning Mixture-of-Experts Approach for Cytotoxic Edema Assessment in Infants and Children | - | 0
Probabilistic partition of unity networks for high-dimensional regression problems | - | 0
Table-based Fact Verification with Self-labeled Keypoint Alignment | - | 0
Parameter-varying neural ordinary differential equations with partition-of-unity networks | - | 0
Sparsity-Constrained Optimal Transport | - | 0
Mixture of experts models for multilevel data: modelling framework and approximation theory | - | 0
Tuning of Mixture-of-Experts Mixed-Precision Neural Networks | - | 0
Diversified Dynamic Routing for Vision Tasks | - | 0
Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition | - | 0
Sparse Video Representation Using Steered Mixture-of-Experts With Global Motion Compensation | - | 0
A Review of Sparse Expert Models in Deep Learning | - | 0
ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels | - | 0
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries | Code | 1
Context-aware Mixture-of-Experts for Unbiased Scene Graph Generation | - | 0
A Theoretical View on Sparsely Activated Networks | - | 0
Towards Understanding Mixture of Experts in Deep Learning | Code | 1
Edge-Aware Autoencoder Design for Real-Time Mixture-of-Experts Image Compression | - | 0
Learning Soccer Juggling Skills with Layer-wise Mixture-of-Experts | Code | 1
Adaptive Mixture of Experts Learning for Generalizable Face Anti-Spoofing | - | 0
MoEC: Mixture of Expert Clusters | - | 0
Learning Large-scale Universal User Representation with Sparse Mixture of Experts | - | 0
No Language Left Behind: Scaling Human-Centered Machine Translation | Code | 2
DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Code | 4
RoME: Role-aware Mixture-of-Expert Transformer for Text-to-Video Retrieval | Code | 0
Scalable Neural Data Server: A Data Recommender for Transfer Learning | - | 0
Adaptive Expert Models for Personalization in Federated Learning | Code | 0
Towards Universal Sequence Representation Learning for Recommender Systems | Code | 2
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Code | 2
Sparse Mixture-of-Experts are Domain Generalizable Learners | Code | 1
Page 21 of 27

No leaderboard results yet.