SOTAVerified

Mixture-of-Experts

Papers

Showing 426–450 of 1312 papers

Title | Status | Hype
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Code | 0
Build a Robust QA System with Transformer-based Mixture of Experts | Code | 0
Embarrassingly Parallel Inference for Gaussian Processes | Code | 0
Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation | Code | 0
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts | Code | 0
Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Code | 0
Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Code | 0
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts | Code | 0
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts | Code | 0
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | Code | 0
Robust Federated Learning by Mixture of Experts | Code | 0
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers | Code | 0
RouterKT: Mixture-of-Experts for Knowledge Tracing | Code | 0
Efficient and Interpretable Grammatical Error Correction with Mixture of Experts | Code | 0
Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures | Code | 0
Lifelong Mixture of Variational Autoencoders | Code | 0
Learning Mixture-of-Experts for General-Purpose Black-Box Discrete Optimization | Code | 0
Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives | Code | 0
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization | Code | 0
Countering Mainstream Bias via End-to-End Adaptive Local Learning | Code | 0
SEKE: Specialised Experts for Keyword Extraction | Code | 0
A multi-scale lithium-ion battery capacity prediction using mixture of experts and patch-based MLP | Code | 0
DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Code | 0
Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Code | 0
A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis | Code | 0
Page 18 of 53

No leaderboard results yet.