| Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models | May 21, 2025 | All, CPU | Code Available | 0 |
| Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language Understanding | Jun 17, 2024 | Mixture-of-Experts, Natural Language Understanding | Code Available | 0 |
| MLP-KAN: Unifying Deep Representation and Function Learning | Oct 3, 2024 | Kolmogorov-Arnold Networks, Mixture-of-Experts | Code Available | 0 |
| Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts | Jun 8, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Jul 29, 2024 | Mixture-of-Experts | Code Available | 0 |
| Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE | Nov 5, 2023 | Decoder, Mixture-of-Experts | Code Available | 0 |
| Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective | Feb 2, 2025 | Fairness, Image Segmentation | Code Available | 0 |
| Mixture of Modular Experts: Distilling Knowledge from a Multilingual Teacher into Specialized Modular Language Models | Jul 28, 2024 | Knowledge Distillation, Mixture-of-Experts | Code Available | 0 |
| Discontinuity-Sensitive Optimal Control Learning by Mixture of Experts | Mar 7, 2018 | Mixture-of-Experts, Model Predictive Control | Code Available | 0 |
| H^3Fusion: Helpful, Harmless, Honest Fusion of Aligned LLMs | Nov 26, 2024 | Mixture-of-Experts | Code Available | 0 |