| Granger-causal Attentive Mixtures of Experts: Learning Important Features with Neural Networks | Feb 6, 2018 | Feature Importance, Mixture-of-Experts | Code Available | 0 | 5 |
| Graph Knowledge Distillation to Mixture of Experts | Jun 17, 2024 | Knowledge Distillation, Mixture-of-Experts | Code Available | 0 | 5 |
| Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation | Nov 2, 2021 | Mixture-of-Experts | Code Available | 0 | 5 |
| Mixture of Link Predictors on Graphs | Feb 13, 2024 | Link Prediction, Mixture-of-Experts | Code Available | 0 | 5 |
| Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts | Jun 8, 2023 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Eliciting and Understanding Cross-Task Skills with Task-Level Mixture-of-Experts | May 25, 2022 | Mixture-of-Experts, Multi-Task Learning | Code Available | 0 | 5 |
| Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Feb 13, 2025 | Mixture-of-Experts | Code Available | 0 | 5 |
| Mixture-of-Experts Graph Transformers for Interpretable Particle Collision Detection | Jan 6, 2025 | Decision Making, Mixture-of-Experts | Code Available | 0 | 5 |
| Adversarial Mixture Of Experts with Category Hierarchy Soft Constraint | Jul 24, 2020 | Clustering, Feature Importance | Code Available | 0 | 5 |
| Guiding the Experts: Semantic Priors for Efficient and Focused MoE Routing | May 24, 2025 | Mixture-of-Experts | Code Available | 0 | 5 |
| Mixture Content Selection for Diverse Sequence Generation | Sep 4, 2019 | Abstractive Text Summarization, Decoder | Code Available | 0 | 5 |
| GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory | Jun 18, 2024 | Code Generation, Mathematical Problem-Solving | Code Available | 0 | 5 |
| Mixture of Experts Meets Decoupled Message Passing: Towards General and Adaptive Node Classification | Dec 11, 2024 | Computational Efficiency | Code Available | 0 | 5 |
| Online Action Recognition for Human Risk Prediction with Anticipated Haptic Alert via Wearables | Dec 14, 2023 | Action Recognition, Mixture-of-Experts | Code Available | 0 | 5 |
| MicarVLMoE: A Modern Gated Cross-Aligned Vision-Language Mixture of Experts Model for Medical Image Captioning and Report Generation | Apr 29, 2025 | cross-modal alignment, Decoder | Code Available | 0 | 5 |
| MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts | Jul 13, 2024 | Diversity, Mixture-of-Experts | Code Available | 0 | 5 |
| Efficient and Interpretable Grammatical Error Correction with Mixture of Experts | Oct 30, 2024 | Grammatical Error Correction, Mixture-of-Experts | Code Available | 0 | 5 |
| Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures | Jul 8, 2017 | Mixture-of-Experts, Question Answering | Code Available | 0 | 5 |
| Manifold-Preserving Transformers are Effective for Short-Long Range Encoding | Oct 22, 2023 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers | Feb 26, 2024 | Knowledge Distillation, Mixture-of-Experts | Code Available | 0 | 5 |
| EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization | Jun 16, 2025 | Mixture-of-Experts, Model Compression | Code Available | 0 | 5 |
| A multi-scale lithium-ion battery capacity prediction using mixture of experts and patch-based MLP | Mar 26, 2025 | Mixture-of-Experts | Code Available | 0 | 5 |
| DynMoLE: Boosting Mixture of LoRA Experts Fine-Tuning with a Hybrid Routing Mechanism | Apr 1, 2025 | Common Sense Reasoning, Computational Efficiency | Code Available | 0 | 5 |
| Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Feb 21, 2025 | Mixture-of-Experts | Code Available | 0 | 5 |
| A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis | Jan 13, 2025 | Deep Learning, Mixture-of-Experts | Code Available | 0 | 5 |
| DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation | Aug 23, 2024 | Deep Reinforcement Learning, Mixture-of-Experts | Code Available | 0 | 5 |
| BIG-MoE: Bypass Isolated Gating MoE for Generalized Multimodal Face Anti-Spoofing | Dec 24, 2024 | Decision Making, Face Anti-Spoofing | Code Available | 0 | 5 |
| Hierarchical Mixtures of Generators for Adversarial Learning | Nov 5, 2019 | Clustering, Mixture-of-Experts | Code Available | 0 | 5 |
| LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | May 7, 2025 | Large Language Model, Mixture-of-Experts | Code Available | 0 | 5 |
| Bidirectional Attention as a Mixture of Continuous Word Experts | Jul 8, 2023 | Language Modelling, Mixture-of-Experts | Code Available | 0 | 5 |
| DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning | Jun 7, 2021 | Mixture-of-Experts, Multi-Task Learning | Code Available | 0 | 5 |
| Lifelong Mixture of Variational Autoencoders | Jul 9, 2021 | Lifelong learning, Mixture-of-Experts | Code Available | 0 | 5 |
| Learning to Adapt Clinical Sequences with Residual Mixture of Experts | Apr 6, 2022 | Mixture-of-Experts | Code Available | 0 | 5 |
| Beyond Sharing: Conflict-Aware Multivariate Time Series Anomaly Detection | Aug 17, 2023 | Anomaly Detection, Mixture-of-Experts | Code Available | 0 | 5 |
| Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform | Jul 11, 2023 | Continual Learning, Mixture-of-Experts | Code Available | 0 | 5 |
| Learning Mixture-of-Experts for General-Purpose Black-Box Discrete Optimization | May 29, 2024 | Mixture-of-Experts | Code Available | 0 | 5 |
| Learning multi-modal generative models with permutation-invariant encoders and tighter variational objectives | Sep 1, 2023 | Mixture-of-Experts | Code Available | 0 | 5 |
| Learning Deep Mixtures of Gaussian Process Experts Using Sum-Product Networks | Sep 12, 2018 | Gaussian Processes, Mixture-of-Experts | Code Available | 0 | 5 |
| Learning CHARME models with neural networks | Feb 8, 2020 | Learning Theory, Mixture-of-Experts | Code Available | 0 | 5 |
| Learning Gating ConvNet for Two-Stream based Methods in Action Recognition | Sep 12, 2017 | Action Classification, Action Recognition | Code Available | 0 | 5 |
| Learning a Mixture of Granularity-Specific Experts for Fine-Grained Categorization | Oct 1, 2019 | Diversity, Fine-Grained Image Classification | Code Available | 0 | 5 |
| Fate: Fast Edge Inference of Mixture-of-Experts Models via Cross-Layer Gate | Feb 17, 2025 | GPU, Mixture-of-Experts | Code Available | 0 | 5 |
| k-Winners-Take-All Ensemble Neural Network | Jan 4, 2024 | All, Mixture-of-Experts | Code Available | 0 | 5 |
| Latent Prototype Routing: Achieving Near-Perfect Load Balancing in Mixture-of-Experts | Jun 26, 2025 | Mixture-of-Experts | Code Available | 0 | 5 |
| A Mixture-of-Experts Model for Learning Multi-Facet Entity Embeddings | Dec 1, 2020 | Entity Embeddings, Mixture-of-Experts | Code Available | 0 | 5 |
| Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline | Feb 9, 2025 | CPU, GPU | Code Available | 0 | 5 |
| Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective | Feb 2, 2025 | Fairness, Image Segmentation | Code Available | 0 | 5 |
| A Mixture-of-Experts Model for Antonym-Synonym Discrimination | Aug 1, 2021 | Mixture-of-Experts | Code Available | 0 | 5 |
| Discontinuity-Sensitive Optimal Control Learning by Mixture of Experts | Mar 7, 2018 | Mixture-of-Experts, Model Predictive Control | Code Available | 0 | 5 |
| Intrinsic User-Centric Interpretability through Global Mixture of Experts | Feb 5, 2024 | Mixture-of-Experts, News Classification | Code Available | 0 | 5 |