| Title | Date | Tags | Code | Count |
| --- | --- | --- | --- | --- |
| Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts | May 12, 2023 | Ensemble Learning, Mixture-of-Experts | Unverified | 0 |
| Locking and Quacking: Stacking Bayesian model predictions by log-pooling and superposition | May 12, 2023 | Bayesian Inference, Mixture-of-Experts | Unverified | 0 |
| Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | May 10, 2023 | Classification, image-classification | Unverified | 0 |
| Steered Mixture-of-Experts Autoencoder Design for Real-Time Image Modelling and Denoising | May 5, 2023 | Decoder, Denoising | Unverified | 0 |
| Demystifying Softmax Gating Function in Gaussian Mixture of Experts | May 5, 2023 | Mixture-of-Experts, parameter estimation | Unverified | 0 |
| Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity | May 3, 2023 | Machine Translation, Mixture-of-Experts | Code Available | 0 |
| Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | May 1, 2023 | Data Integration, Entity Resolution | Code Available | 1 |
| Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism | Apr 22, 2023 | Mixture-of-Experts | Unverified | 0 |
| Revisiting Single-gated Mixtures of Experts | Apr 11, 2023 | Mixture-of-Experts | Unverified | 0 |
| FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | Apr 8, 2023 | Mixture-of-Experts, Scheduling | Unverified | 0 |
| Mixed Regression via Approximate Message Passing | Apr 5, 2023 | Denoising, Mixture-of-Experts | Unverified | 0 |
| Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Apr 3, 2023 | Mixture-of-Experts, Transfer Learning | Code Available | 1 |
| Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild | Apr 2, 2023 | Image Quality Assessment, Mixture-of-Experts | Code Available | 1 |
| Steered Mixture of Experts Regression for Image Denoising with Multi-Model-Inference | Mar 30, 2023 | Denoising, Image Denoising | Unverified | 0 |
| Information Maximizing Curriculum: A Curriculum-Based Approach for Imitating Diverse Skills | Mar 27, 2023 | Imitation Learning, Mixture-of-Experts | Code Available | 0 |
| WM-MoE: Weather-aware Multi-scale Mixture-of-Experts for Blind Adverse Weather Removal | Mar 24, 2023 | Autonomous Driving, Contrastive Learning | Unverified | 0 |
| Disguise without Disruption: Utility-Preserving Face De-Identification | Mar 23, 2023 | De-identification, Ensemble Learning | Unverified | 0 |
| Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset | Mar 22, 2023 | Mixture-of-Experts, text-classification | Unverified | 0 |
| Learning A Sparse Transformer Network for Effective Image Deraining | Mar 21, 2023 | Image Reconstruction, Image Restoration | Code Available | 2 |
| HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals | Mar 17, 2023 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| MCR-DL: Mix-and-Match Communication Runtime for Deep Learning | Mar 15, 2023 | Deep Learning, GPU | Unverified | 0 |
| Scaling Vision-Language Models with Sparse Mixture of Experts | Mar 13, 2023 | Mixture-of-Experts | Unverified | 0 |
| A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training | Mar 11, 2023 | Mixture-of-Experts | Unverified | 0 |
| Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference | Mar 10, 2023 | CPU, Decoder | Unverified | 0 |
| Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers | Mar 2, 2023 | Mixture-of-Experts | Code Available | 1 |