
Mixture-of-Experts

Papers

Showing 401-450 of 1312 papers

Title | Status | Hype
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing | Code | 0
MoE-MLoRA for Multi-Domain CTR Prediction: Efficient Adaptation with Expert Specialization | Code | 0
Expert Sample Consensus Applied to Camera Re-Localization | Code | 0
MoE-I^2: Compressing Mixture of Experts Models through Inter-Expert Pruning and Intra-Expert Low-Rank Decomposition | Code | 0
A non-asymptotic approach for model selection via penalization in high-dimensional mixture of experts models | Code | 0
Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to Improve in Chess | Code | 0
Modality-Independent Brain Lesion Segmentation with Privacy-aware Continual Learning | Code | 0
MLP-KAN: Unifying Deep Representation and Function Learning | Code | 0
Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts | Code | 0
Condensing Multilingual Knowledge with Lightweight Language-Specific Modules | Code | 0
Multimodal Fusion Strategies for Mapping Biophysical Landscape Features | Code | 0
Mixture of Link Predictors on Graphs | Code | 0
Anomaly Detection by Recombining Gated Unsupervised Experts | Code | 0
Mixture-of-Experts Variational Autoencoder for Clustering and Generating from Similarity-Based Representations on Single Cell Data | Code | 0
Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models | Code | 0
Equipping Computational Pathology Systems with Artifact Processing Pipelines: A Showcase for Computation and Performance Trade-offs | Code | 0
Catching Attention with Automatic Pull Quote Selection | Code | 0
CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts | Code | 0
Ensemble and Mixture-of-Experts DeepONets For Operator Learning | Code | 0
Mixture-of-Experts Graph Transformers for Interpretable Particle Collision Detection | Code | 0
Mixture of Experts Meets Decoupled Message Passing: Towards General and Adaptive Node Classification | Code | 0
Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Code | 0
Mixture Content Selection for Diverse Sequence Generation | Code | 0
Adversarial Mixture Of Experts with Category Hierarchy Soft Constraint | Code | 0
An Empirical Study on Model-agnostic Debiasing Strategies for Robust Natural Language Inference | Code | 0
MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Code | 0
Build a Robust QA System with Transformer-based Mixture of Experts | Code | 0
Embarrassingly Parallel Inference for Gaussian Processes | Code | 0
Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation | Code | 0
Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts | Code | 0
Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Code | 0
Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Code | 0
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts | Code | 0
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts | Code | 0
LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | Code | 0
Robust Federated Learning by Mixture of Experts | Code | 0
m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers | Code | 0
RouterKT: Mixture-of-Experts for Knowledge Tracing | Code | 0
Efficient and Interpretable Grammatical Error Correction with Mixture of Experts | Code | 0
Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures | Code | 0
Lifelong Mixture of Variational Autoencoders | Code | 0
Learning Mixture-of-Experts for General-Purpose Black-Box Discrete Optimization | Code | 0
Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives | Code | 0
EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization | Code | 0
Countering Mainstream Bias via End-to-End Adaptive Local Learning | Code | 0
SEKE: Specialised Experts for Keyword Extraction | Code | 0
A multi-scale lithium-ion battery capacity prediction using mixture of experts and patch-based MLP | Code | 0
DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Code | 0
Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Code | 0
A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis | Code | 0
Page 9 of 27

No leaderboard results yet.