
Mixture-of-Experts

Papers

Showing 901–925 of 1312 papers

BadMoE: Backdooring Mixture-of-Experts LLMs via Optimizing Routing Triggers and Infecting Dormant Experts
Balanced and Elastic End-to-end Training of Dynamic LLMs
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
Bayesian Hierarchical Mixtures of Experts
Bayesian shrinkage in mixture of experts models: Identifying robust determinants of class membership
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts
Beyond Standard MoE: Mixture of Latent Experts for Resource-Efficient Language Models
Biased Mixtures Of Experts: Enabling Computer Vision Inference Under Data Transfer Limitations
BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference
BiPrompt-SAM: Enhancing Image Segmentation via Explicit Selection between Point and Text Prompts
BLR-MoE: Boosted Language-Routing Mixture of Experts for Domain-Robust Multilingual E2E ASR
Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM
Boost Your NeRF: A Model-Agnostic Mixture of Experts Framework for High Quality and Efficient Rendering
Brain-Like Processing Pathways Form in Models With Heterogeneous Experts
BrainNet-MoE: Brain-Inspired Mixture-of-Experts Learning for Neurological Disease Identification
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
Breaking Data Silos: Towards Open and Scalable Mobility Foundation Models via Generative Continual Learning
Approximation Rates and VC-Dimension Bounds for (P)ReLU MLP Mixture of Experts
Breaking the gridlock in Mixture-of-Experts: Consistent and Efficient Algorithms
Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts
Brief analysis of DeepSeek R1 and its implications for Generative AI
Buffer Overflow in Mixture of Experts
Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition
CAME: Competitively Learning a Mixture-of-Experts Model for First-stage Retrieval
Page 37 of 53
