| Title | Date | Tags | Code | Count |
| --- | --- | --- | --- | --- |
| Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts | May 12, 2023 | Ensemble Learning, Mixture-of-Experts | Unverified | 0 |
| Locking and Quacking: Stacking Bayesian model predictions by log-pooling and superposition | May 12, 2023 | Bayesian Inference, Mixture-of-Experts | Unverified | 0 |
| Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | May 10, 2023 | Classification, image-classification | Unverified | 0 |
| Steered Mixture-of-Experts Autoencoder Design for Real-Time Image Modelling and Denoising | May 5, 2023 | Decoder, Denoising | Unverified | 0 |
| Demystifying Softmax Gating Function in Gaussian Mixture of Experts | May 5, 2023 | Mixture-of-Experts, parameter estimation | Unverified | 0 |
| Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity | May 3, 2023 | Machine Translation, Mixture-of-Experts | Code Available | 0 |
| Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | May 1, 2023 | Data Integration, Entity Resolution | Code Available | 1 |
| Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism | Apr 22, 2023 | All, Mixture-of-Experts | Unverified | 0 |
| Revisiting Single-gated Mixtures of Experts | Apr 11, 2023 | Mixture-of-Experts | Unverified | 0 |
| FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | Apr 8, 2023 | Mixture-of-Experts, Scheduling | Unverified | 0 |