| Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective | Feb 2, 2023 | GPU, Mixture-of-Experts | —Unverified | 0 |
| Alternating Updates for Efficient Transformers | Jan 30, 2023 | Mixture-of-Experts | —Unverified | 0 |
| PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets | Jan 14, 2023 | Management, Mixture-of-Experts | —Unverified | 0 |
| AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction | Jan 6, 2023 | Click-Through Rate Prediction, Mixture-of-Experts | —Unverified | 0 |
| Covariate-guided Bayesian mixture model for multivariate time series | Jan 3, 2023 | Mixture-of-Experts, Time Series | Code Available | 0 |
| Semantic-Aware Dynamic Parameter for Video Inpainting Transformer | Jan 1, 2023 | Mixture-of-Experts, Video Inpainting | —Unverified | 0 |
| Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners | Jan 1, 2023 | Mixture-of-Experts, Multi-Task Learning | —Unverified | 0 |
| AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts | Jan 1, 2023 | Instance Segmentation, Mixture-of-Experts | —Unverified | 0 |
| Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model | Dec 19, 2022 | GPU, Machine Translation | —Unverified | 0 |
| MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code Completion | Dec 19, 2022 | Code Completion, Mixture-of-Experts | —Unverified | 0 |