
Mixture-of-Experts

Papers

Showing 126–150 of 1,312 papers

Title | Status | Hype
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Code | 2
Tutel: Adaptive Mixture-of-Experts at Scale | Code | 2
Text2Human: Text-Driven Controllable Human Image Generation | Code | 2
MDFEND: Multi-domain Fake News Detection | Code | 2
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | Code | 2
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | Code | 2
Learning to Skip the Middle Layers of Transformers | Code | 1
Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution | Code | 1
SPACE: Your Genomic Profile Predictor is a Powerful DNA Foundation Model | Code | 1
Mastering Massive Multi-Task Reinforcement Learning via Mixture-of-Expert Decision Transformer | Code | 1
FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models | Code | 1
ThanoRA: Task Heterogeneity-Aware Multi-Task Low-Rank Adaptation | Code | 1
JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | Code | 1
U-SAM: An audio language Model for Unified Speech, Audio, and Music Understanding | Code | 1
Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training and Inference | Code | 1
Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding | Code | 1
MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design | Code | 1
Mixture of Sparse Attention: Content-Based Learnable Sparse Attention via Expert-Choice Routing | Code | 1
Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Code | 1
Manifold Induced Biases for Zero-shot and Few-shot Detection of Generated Images | Code | 1
Dense Backpropagation Improves Training for Sparse Mixture-of-Experts | Code | 1
C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing | Code | 1
MoEDiff-SR: Mixture of Experts-Guided Diffusion Model for Region-Adaptive MRI Super-Resolution | Code | 1
MiLo: Efficient Quantized MoE Inference with Mixture of Low-Rank Compensators | Code | 1
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking | Code | 1
Page 6 of 53

No leaderboard results yet.