| Learning CHARME models with neural networks | Feb 8, 2020 | Learning Theory, Mixture-of-Experts | Code Available | 0 |
| A Multi-Modal Deep Learning Framework for Pan-Cancer Prognosis | Jan 13, 2025 | Deep Learning, Mixture-of-Experts | Code Available | 0 |
| RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths | May 29, 2023 | Image Generation, Mixture-of-Experts | Code Available | 0 |
| Embarrassingly Parallel Inference for Gaussian Processes | Feb 27, 2017 | Gaussian Processes, Mixture-of-Experts | Code Available | 0 |
| Learning a Mixture of Granularity-Specific Experts for Fine-Grained Categorization | Oct 1, 2019 | Diversity, Fine-Grained Image Classification | Code Available | 0 |
| Towards Adversarial Robustness of Model-Level Mixture-of-Experts Architectures for Semantic Segmentation | Dec 16, 2024 | Adversarial Robustness, Mixture-of-Experts | Code Available | 0 |
| Latent Prototype Routing: Achieving Near-Perfect Load Balancing in Mixture-of-Experts | Jun 26, 2025 | Mixture-of-Experts | Code Available | 0 |
| Elucidating Robust Learning with Uncertainty-Aware Corruption Pattern Estimation | Nov 2, 2021 | Mixture-of-Experts | Code Available | 0 |
| STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation | Jun 9, 2025 | Graph Attention, Imputation | Code Available | 0 |
| CompeteSMoE - Effective Training of Sparse Mixture of Experts via Competition | Feb 4, 2024 | Mixture-of-Experts | Code Available | 0 |
| CoLA: Collaborative Low-Rank Adaptation | May 21, 2025 | CoLA, Mixture-of-Experts | Code Available | 0 |
| What You Have is What You Track: Adaptive and Robust Multimodal Tracking | Jul 8, 2025 | Mixture-of-Experts, Visual Tracking | Code Available | 0 |
| Beyond Sharing: Conflict-Aware Multivariate Time Series Anomaly Detection | Aug 17, 2023 | Anomaly Detection, Mixture-of-Experts | Code Available | 0 |
| k-Winners-Take-All Ensemble Neural Network | Jan 4, 2024 | All, Mixture-of-Experts | Code Available | 0 |
| Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity | May 3, 2023 | Machine Translation, Mixture-of-Experts | Code Available | 0 |
| Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline | Feb 9, 2025 | CPU, GPU | Code Available | 0 |
| Jamba: A Hybrid Transformer-Mamba Language Model | Mar 28, 2024 | GPU, Language Modeling | Code Available | 0 |
| A Mixture of Experts Approach to 3D Human Motion Prediction | May 9, 2024 | Human motion prediction, Mixture-of-Experts | Code Available | 0 |
| Understanding the Performance and Estimating the Cost of LLM Fine-Tuning | Aug 8, 2024 | GPU, Mixture-of-Experts | Code Available | 0 |
| ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration | Mar 10, 2025 | Mixture-of-Experts | Code Available | 0 |
| Restoring Spatially-Heterogeneous Distortions using Mixture of Experts Network | Sep 30, 2020 | Mixture-of-Experts, Multi-Task Learning | Code Available | 0 |
| Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate | May 26, 2025 | Imputation, Mixture-of-Experts | Code Available | 0 |
| Intrinsic User-Centric Interpretability through Global Mixture of Experts | Feb 5, 2024 | Mixture-of-Experts, News Classification | Code Available | 0 |
| Integrating Multi-view Analysis: Multi-view Mixture-of-Expert for Textual Personality Detection | Aug 16, 2024 | Mixture-of-Experts | Code Available | 0 |
| Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment | Jun 1, 2023 | Benchmarking, Hate Speech Detection | Code Available | 0 |