| Title | Date | Tags | Code | Status | Stars |
| --- | --- | --- | --- | --- | --- |
| Interpretable mixture of experts for time series prediction under recurrent and non-recurrent conditions | Sep 5, 2024 | Mixture-of-Experts, Time Series | — | Unverified | 0 |
| Pluralistic Salient Object Detection | Sep 4, 2024 | Mixture-of-Experts, Object | — | Unverified | 0 |
| Configurable Foundation Models: Building LLMs from a Modular Perspective | Sep 4, 2024 | Computational Efficiency, Mixture-of-Experts | — | Unverified | 0 |
| Enhancing Code-Switching Speech Recognition with LID-Based Collaborative Mixture of Experts Model | Sep 3, 2024 | Language Identification, Mixture-of-Experts | — | Unverified | 0 |
| OLMoE: Open Mixture-of-Experts Language Models | Sep 3, 2024 | Language Modeling, Language Modelling | Code | Code Available | 4 |
| Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching | Sep 2, 2024 | Mixture-of-Experts | — | Unverified | 0 |
| Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts | Sep 2, 2024 | Mixture-of-Experts | — | Unverified | 0 |
| Gradient-free variational learning with conditional mixture networks | Aug 29, 2024 | Computational Efficiency, Mixture-of-Experts | Code | Code Available | 1 |
| Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts | Aug 28, 2024 | Mixture-of-Experts | — | Unverified | 0 |
| LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Aug 28, 2024 | Computational Efficiency, Hallucination | Code | Code Available | 3 |