| MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks | Jun 7, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 2 |
| Style Mixture of Experts for Expressive Text-To-Speech Synthesis | Jun 5, 2024 | Mixture-of-Experts, Speech Synthesis | Unverified | 0 |
| Continual Traffic Forecasting via Mixture of Experts | Jun 5, 2024 | Continual Learning, Mixture-of-Experts | Unverified | 0 |
| Node-wise Filtering in Graph Neural Networks: A Mixture of Experts Approach | Jun 5, 2024 | Mixture-of-Experts, Node Classification | Unverified | 0 |
| Filtered not Mixed: Stochastic Filtering-Based Online Gating for Mixture of Large Language Models | Jun 5, 2024 | Mixture-of-Experts, Time Series | Unverified | 0 |
| Parrot: Multilingual Visual Instruction Tuning | Jun 4, 2024 | Mixture-of-Experts | Code Available | 5 |
| Demystifying the Compression of Mixture-of-Experts Through a Unified Framework | Jun 4, 2024 | Mixture-of-Experts | Code Available | 2 |
| Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Jun 3, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Reservoir History Matching of the Norne field with generative exotic priors and a coupled Mixture of Experts -- Physics Informed Neural Operator Forward Model | Jun 2, 2024 | Denoising, Mixture-of-Experts | Code Available | 3 |
| Optimizing 6G Integrated Sensing and Communications (ISAC) via Expert Networks | Jun 1, 2024 | ISAC, Mixture-of-Experts | Unverified | 0 |
| A Gaussian Process-based Streaming Algorithm for Prediction of Time Series With Regimes and Outliers | Jun 1, 2024 | Gaussian Processes, Mixture-of-Experts | Code Available | 0 |
| Training-efficient density quantum machine learning | May 30, 2024 | LEMMA, Mixture-of-Experts | Unverified | 0 |
| MEMoE: Enhancing Model Editing with Mixture of Experts Adaptors | May 29, 2024 | Mixture-of-Experts, Model Editing | Unverified | 0 |
| Learning Mixture-of-Experts for General-Purpose Black-Box Discrete Optimization | May 29, 2024 | Mixture-of-Experts | Code Available | 0 |
| MoNDE: Mixture of Near-Data Experts for Large-Scale Sparse Models | May 29, 2024 | Decoder, GPU | Unverified | 0 |
| LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design | May 28, 2024 | Mixture-of-Experts | Unverified | 0 |
| XTrack: Multimodal Training Boosts RGB-X Video Object Trackers | May 28, 2024 | Inductive Bias, Mixture-of-Experts | Code Available | 2 |
| Yuan 2.0-M32: Mixture of Experts with Attention Router | May 28, 2024 | ARC, Math | Code Available | 2 |
| Enhancing Fast Feed Forward Networks with Load Balancing and a Master Leaf Node | May 27, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available | 1 |
| A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts | May 26, 2024 | Binary Classification, Mixture-of-Experts | Unverified | 0 |
| Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation | May 26, 2024 | feature selection, Mixture-of-Experts | Code Available | 2 |
| MoEUT: Mixture-of-Experts Universal Transformers | May 25, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Expert-Token Resonance: Redefining MoE Routing through Affinity-Driven Active Selection | May 24, 2024 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training | May 23, 2024 | GSM8K, Mixture-of-Experts | Code Available | 7 |
| Statistical Advantages of Perturbing Cosine Router in Mixture of Experts | May 23, 2024 | Mixture-of-Experts | Unverified | 0 |