
Mixture-of-Experts

Papers

Showing 121–130 of 1,312 papers

Title | Status | Hype
ModuleFormer: Modularity Emerges from Mixture-of-Experts | Code | 2
Learning A Sparse Transformer Network for Effective Image Deraining | Code | 2
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | Code | 2
No Language Left Behind: Scaling Human-Centered Machine Translation | Code | 2
Towards Universal Sequence Representation Learning for Recommender Systems | Code | 2
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Code | 2
Tutel: Adaptive Mixture-of-Experts at Scale | Code | 2
Text2Human: Text-Driven Controllable Human Image Generation | Code | 2
MDFEND: Multi-domain Fake News Detection | Code | 2
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | Code | 2
Page 13 of 132

No leaderboard results yet.