| Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts | May 12, 2023 | Ensemble LearningMixture-of-Experts | —Unverified | 0 |
| Locking and Quacking: Stacking Bayesian model predictions by log-pooling and superposition | May 12, 2023 | Bayesian InferenceMixture-of-Experts | —Unverified | 0 |
| Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception | May 10, 2023 | Classificationimage-classification | —Unverified | 0 |
| Steered Mixture-of-Experts Autoencoder Design for Real-Time Image Modelling and Denoising | May 5, 2023 | DecoderDenoising | —Unverified | 0 |
| Demystifying Softmax Gating Function in Gaussian Mixture of Experts | May 5, 2023 | Mixture-of-Expertsparameter estimation | —Unverified | 0 |
| Towards Being Parameter-Efficient: A Stratified Sparsely Activated Transformer with Dynamic Capacity | May 3, 2023 | Machine TranslationMixture-of-Experts | CodeCode Available | 0 |
| Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | May 1, 2023 | Data IntegrationEntity Resolution | CodeCode Available | 1 |
| Pipeline MoE: A Flexible MoE Implementation with Pipeline Parallelism | Apr 22, 2023 | AllMixture-of-Experts | —Unverified | 0 |
| Revisiting Single-gated Mixtures of Experts | Apr 11, 2023 | Mixture-of-Experts | —Unverified | 0 |
| FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement | Apr 8, 2023 | Mixture-of-ExpertsScheduling | —Unverified | 0 |
| Mixed Regression via Approximate Message Passing | Apr 5, 2023 | DenoisingMixture-of-Experts | —Unverified | 0 |
| Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Apr 3, 2023 | Mixture-of-ExpertsTransfer Learning | CodeCode Available | 1 |
| Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild | Apr 2, 2023 | Image Quality AssessmentMixture-of-Experts | CodeCode Available | 1 |
| Steered Mixture of Experts Regression for Image Denoising with Multi-Model-Inference | Mar 30, 2023 | DenoisingImage Denoising | —Unverified | 0 |
| Information Maximizing Curriculum: A Curriculum-Based Approach for Imitating Diverse Skills | Mar 27, 2023 | Imitation LearningMixture-of-Experts | CodeCode Available | 0 |
| WM-MoE: Weather-aware Multi-scale Mixture-of-Experts for Blind Adverse Weather Removal | Mar 24, 2023 | Autonomous DrivingContrastive Learning | —Unverified | 0 |
| Disguise without Disruption: Utility-Preserving Face De-Identification | Mar 23, 2023 | De-identificationEnsemble Learning | —Unverified | 0 |
| Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset | Mar 22, 2023 | Mixture-of-Expertstext-classification | —Unverified | 0 |
| Learning A Sparse Transformer Network for Effective Image Deraining | Mar 21, 2023 | Image ReconstructionImage Restoration | CodeCode Available | 2 |
| HDformer: A Higher Dimensional Transformer for Diabetes Detection Utilizing Long Range Vascular Signals | Mar 17, 2023 | Computational EfficiencyMixture-of-Experts | —Unverified | 0 |
| MCR-DL: Mix-and-Match Communication Runtime for Deep Learning | Mar 15, 2023 | Deep LearningGPU | —Unverified | 0 |
| Scaling Vision-Language Models with Sparse Mixture of Experts | Mar 13, 2023 | Mixture-of-Experts | —Unverified | 0 |
| A Hybrid Tensor-Expert-Data Parallelism Approach to Optimize Mixture-of-Experts Training | Mar 11, 2023 | Mixture-of-Experts | —Unverified | 0 |
| Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference | Mar 10, 2023 | CPUDecoder | —Unverified | 0 |
| Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers | Mar 2, 2023 | Mixture-of-Experts | CodeCode Available | 1 |
| MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering | Mar 2, 2023 | Mixture-of-ExpertsQuestion Answering | CodeCode Available | 1 |
| Improving Expert Specialization in Mixture of Experts | Feb 28, 2023 | Continual LearningMixture-of-Experts | —Unverified | 0 |
| Improved Training of Mixture-of-Experts Language GANs | Feb 23, 2023 | Adversarial TextImage Generation | —Unverified | 0 |
| TMoE-P: Towards the Pareto Optimum for Multivariate Soft Sensors | Feb 21, 2023 | Mixture-of-Experts | —Unverified | 0 |
| Massively Multilingual Shallow Fusion with Large Language Models | Feb 17, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | —Unverified | 0 |
| Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective | Feb 2, 2023 | GPUMixture-of-Experts | —Unverified | 0 |
| Alternating Updates for Efficient Transformers | Jan 30, 2023 | Mixture-of-Experts | —Unverified | 0 |
| PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets | Jan 14, 2023 | ManagementMixture-of-Experts | —Unverified | 0 |
| AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction | Jan 6, 2023 | Click-Through Rate PredictionMixture-of-Experts | —Unverified | 0 |
| Covariate-guided Bayesian mixture model for multivariate time series | Jan 3, 2023 | Mixture-of-ExpertsTime Series | CodeCode Available | 0 |
| Semantic-Aware Dynamic Parameter for Video Inpainting Transformer | Jan 1, 2023 | Mixture-of-ExpertsVideo Inpainting | —Unverified | 0 |
| AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts | Jan 1, 2023 | Instance SegmentationMixture-of-Experts | —Unverified | 0 |
| Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners | Jan 1, 2023 | Mixture-of-ExpertsMulti-Task Learning | —Unverified | 0 |
| MultiCoder: Multi-Programming-Lingual Pre-Training for Low-Resource Code Completion | Dec 19, 2022 | Code CompletionMixture-of-Experts | —Unverified | 0 |
| Generalizing Multimodal Variational Methods to Sets | Dec 19, 2022 | Mixture-of-Experts | —Unverified | 0 |
| Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model | Dec 19, 2022 | GPUMachine Translation | —Unverified | 0 |
| Fixing MoE Over-Fitting on Low-Resource Languages in Multilingual Machine Translation | Dec 15, 2022 | Machine TranslationMixture-of-Experts | —Unverified | 0 |
| Mod-Squad: Designing Mixture of Experts As Modular Multi-Task Learners | Dec 15, 2022 | Mixture-of-ExpertsMulti-Task Learning | —Unverified | 0 |
| SMILE: Scaling Mixture-of-Experts with Efficient Bi-level Routing | Dec 10, 2022 | Mixture-of-Experts | —Unverified | 0 |
| Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints | Dec 9, 2022 | Mixture-of-Experts | CodeCode Available | 2 |
| Incorporating Polar Field Data for Improved Solar Flare Prediction | Dec 4, 2022 | Mixture-of-ExpertsPrediction | —Unverified | 0 |
| Named Entity and Relation Extraction with Multi-Modal Retrieval | Dec 3, 2022 | Mixture-of-ExpertsMulti-modal Named Entity Recognition | —Unverified | 0 |
| MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Nov 29, 2022 | GPUMixture-of-Experts | CodeCode Available | 3 |
| Automatically Extracting Information in Medical Dialogue: Expert System And Attention for Labelling | Nov 28, 2022 | Mixture-of-Experts | —Unverified | 0 |
| Mixture of Decision Trees for Interpretable Machine Learning | Nov 26, 2022 | Interpretable Machine LearningMixture-of-Experts | CodeCode Available | 1 |