
Mixture-of-Experts Papers

Showing 191–200 of 1312 papers

Title | Status | Hype
Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries | Code | 1
XMoE: Sparse Models with Fine-grained and Adaptive Expert Selection | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | Code | 1
Addressing Confounding Feature Issue for Causal Recommendation | Code | 1
COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search | Code | 1
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Code | 1
Mixture of Attention Heads: Selecting Attention Heads Per Token | Code | 1
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Code | 1
Emergent Modularity in Pre-trained Transformers | Code | 1
Page 20 of 132
