| Mixture of Nested Experts: Adaptive Processing of Visual Tokens | Jul 29, 2024 | Mixture-of-Experts | Code Available | 0 | 5 |
| Multimodal Cultural Safety: Evaluation Frameworks and Alignment Strategies | May 20, 2025 | Mixture-of-Experts | Code Available | 0 | 5 |
| LLM-e Guess: Can LLMs Capabilities Advance Without Hardware Progress? | May 7, 2025 | Large Language Model, Mixture-of-Experts | Code Available | 0 | 5 |
| Lifelong Mixture of Variational Autoencoders | Jul 9, 2021 | Lifelong learning, Mixture-of-Experts | Code Available | 0 | 5 |
| Efficient and Interpretable Grammatical Error Correction with Mixture of Experts | Oct 30, 2024 | Grammatical Error Correction, Mixture-of-Experts | Code Available | 0 | 5 |
| Effective Approaches to Batch Parallelization for Dynamic Neural Network Architectures | Jul 8, 2017 | Mixture-of-Experts, Question Answering | Code Available | 0 | 5 |
| Robust Federated Learning by Mixture of Experts | Apr 23, 2021 | Federated Learning, Mixture-of-Experts | Code Available | 0 | 5 |
| Robust Traffic Forecasting against Spatial Shift over Years | Oct 1, 2024 | Attribute, Mixture-of-Experts | Code Available | 0 | 5 |
| Learning to Adapt Clinical Sequences with Residual Mixture of Experts | Apr 6, 2022 | Mixture-of-Experts | Code Available | 0 | 5 |
| m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers | Feb 26, 2024 | Knowledge Distillation, Mixture-of-Experts | Code Available | 0 | 5 |