| Scaling physics-informed hard constraints with mixture-of-experts | Feb 20, 2024 | Inductive Bias, Mixture-of-Experts | Code Available | 1 |
| HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts | Feb 20, 2024 | Mixture-of-Experts, Multi-Task Learning | Code Available | 1 |
| BiMediX: Bilingual Medical Mixture of Experts LLM | Feb 20, 2024 | Mixture-of-Experts, Multiple-choice | Code Available | 1 |
| Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference | Feb 20, 2024 | Denoising, Diagnostic | Unverified | 0 |
| MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models | Feb 20, 2024 | Common Sense Reasoning, Contrastive Learning | Unverified | 0 |
| Towards an empirical understanding of MoE design choices | Feb 20, 2024 | Mixture-of-Experts | Unverified | 0 |
| Multilinear Mixture of Experts: Scalable Expert Specialization through Factorization | Feb 19, 2024 | Attribute, counterfactual | Code Available | 1 |
| Turn Waste into Worth: Rectifying Top-k Router of MoE | Feb 17, 2024 | Computational Efficiency, GPU | Unverified | 0 |
| MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning | Feb 17, 2024 | Lifelong learning, Mixture-of-Experts | Unverified | 0 |
| AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction | Feb 13, 2024 | Contrastive Learning, Mixture-of-Experts | Unverified | 0 |
| Higher Layers Need More LoRA Experts | Feb 13, 2024 | Mixture-of-Experts | Code Available | 2 |
| P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation | Feb 13, 2024 | Mamba, Mixture-of-Experts | Unverified | 0 |
| Mixture of Link Predictors on Graphs | Feb 13, 2024 | Link Prediction, Mixture-of-Experts | Code Available | 0 |
| Scaling Laws for Fine-Grained Mixture of Experts | Feb 12, 2024 | Mixture-of-Experts | Code Available | 3 |
| Differentially Private Training of Mixture of Experts Models | Feb 11, 2024 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Feb 10, 2024 | CPU, GPU | Code Available | 3 |
| Multimodal Clinical Trial Outcome Prediction with Large Language Models | Feb 9, 2024 | Mixture-of-Experts, Prediction | Code Available | 1 |
| Buffer Overflow in Mixture of Experts | Feb 8, 2024 | Mixture-of-Experts | Unverified | 0 |
| Task-customized Masked AutoEncoder via Mixture of Cluster-conditional Experts | Feb 8, 2024 | Mixture-of-Experts, Self-Supervised Learning | Unverified | 0 |
| On Parameter Estimation in Deviated Gaussian Mixture of Experts | Feb 7, 2024 | Mixture-of-Experts, parameter estimation | Unverified | 0 |
| Approximation Rates and VC-Dimension Bounds for (P)ReLU MLP Mixture of Experts | Feb 5, 2024 | GPU, Mixture-of-Experts | Unverified | 0 |
| On Least Square Estimation in Softmax Gating Mixture of Experts | Feb 5, 2024 | Mixture-of-Experts | Unverified | 0 |
| Intrinsic User-Centric Interpretability through Global Mixture of Experts | Feb 5, 2024 | Mixture-of-Experts, News Classification | Code Available | 0 |
| FuseMoE: Mixture-of-Experts Transformers for Fleximodal Fusion | Feb 5, 2024 | Missing Elements, Mixture-of-Experts | Unverified | 0 |
| CompeteSMoE - Effective Training of Sparse Mixture of Experts via Competition | Feb 4, 2024 | Mixture-of-Experts | Code Available | 0 |
| pFedMoE: Data-Level Personalization with Mixture of Experts for Model-Heterogeneous Personalized Federated Learning | Feb 2, 2024 | Federated Learning, Mixture-of-Experts | Code Available | 0 |
| BlackMamba: Mixture of Experts for State-Space Models | Feb 1, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters | Feb 1, 2024 | Mixture-of-Experts, parameter-efficient fine-tuning | Code Available | 1 |
| Merging Multi-Task Models via Weight-Ensembling Mixture of Experts | Feb 1, 2024 | Mixture-of-Experts, Task Arithmetic | Code Available | 1 |
| MoDE: A Mixture-of-Experts Model with Mutual Distillation among the Experts | Jan 31, 2024 | Mixture-of-Experts | Unverified | 0 |
| Explainable data-driven modeling via mixture of experts: towards effective blending of grey and black-box models | Jan 30, 2024 | Mixture-of-Experts | Unverified | 0 |
| Checkmating One, by Using Many: Combining Mixture of Experts with MCTS to Improve in Chess | Jan 30, 2024 | Mixture-of-Experts | Code Available | 0 |
| OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models | Jan 29, 2024 | Decoder, Mixture-of-Experts | Code Available | 5 |
| Routers in Vision Mixture of Experts: An Empirical Study | Jan 29, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs | Jan 29, 2024 | Language Modelling, Large Language Model | Unverified | 0 |
| MoE-LLaVA: Mixture of Experts for Large Vision-Language Models | Jan 29, 2024 | Hallucination, Mixture-of-Experts | Code Available | 7 |
| Contrastive Learning and Mixture of Experts Enables Precise Vector Embeddings | Jan 28, 2024 | Contrastive Learning, Descriptive | Code Available | 1 |
| Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts? | Jan 25, 2024 | Mixture-of-Experts, parameter estimation | Unverified | 0 |
| M^3TN: Multi-gate Mixture-of-Experts based Multi-valued Treatment Network for Uplift Modeling | Jan 24, 2024 | Mixture-of-Experts | Unverified | 0 |
| Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Jan 16, 2024 | GPU, Mixture-of-Experts | Code Available | 1 |
| Towards A Better Metric for Text-to-Video Generation | Jan 15, 2024 | Mixture-of-Experts, Text-to-Video Generation | Unverified | 0 |
| Prompt-based mental health screening from social media text | Jan 11, 2024 | Mixture-of-Experts | Unverified | 0 |
| DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Jan 11, 2024 | Language Modelling, Large Language Model | Code Available | 5 |
| Robust Calibration For Improved Weather Prediction Under Distributional Shift | Jan 8, 2024 | Data Augmentation, Mixture-of-Experts | Unverified | 0 |
| MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts | Jan 8, 2024 | Mamba, Mixture-of-Experts | Code Available | 3 |
| Mixtral of Experts | Jan 8, 2024 | Code Generation, Common Sense Reasoning | Code Available | 4 |
| Incorporating Visual Experts to Resolve the Information Loss in Multimodal Large Language Models | Jan 6, 2024 | Instruction Following, Mixture-of-Experts | Unverified | 0 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Jan 5, 2024 | Arithmetic Reasoning, Code Generation | Code Available | 2 |
| Subjective and Objective Analysis of Indian Social Media Video Quality | Jan 5, 2024 | Mixture-of-Experts, Visual Question Answering (VQA) | Code Available | 0 |
| Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Jan 4, 2024 | Mixture-of-Experts | Code Available | 1 |