| Title | Date | Topics | Code | ★ |
| --- | --- | --- | --- | --- |
| Mixture of Experts Meets Prompt-Based Continual Learning | May 23, 2024 | Continual Learning, Mixture-of-Experts | Code Available | 1 |
| Graph Sparsification via Mixture of Graphs | May 23, 2024 | Graph Learning, Mixture-of-Experts | Code Available | 1 |
| Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models | May 23, 2024 | Mixture-of-Experts, Visual Question Answering | Code Available | 2 |
| Unchosen Experts Can Contribute Too: Unleashing MoE Models' Power by Self-Contrast | May 23, 2024 | Computational Efficiency, GSM8K | Code Available | 1 |
| Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts | May 22, 2024 | Mixture-of-Experts | Unverified | 0 |
| xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token | May 22, 2024 | Language Modeling | Code Available | 2 |
| DirectMultiStep: Direct Route Generation for Multi-Step Retrosynthesis | May 22, 2024 | Diversity, Mixture-of-Experts | Code Available | 1 |
| Ensemble and Mixture-of-Experts DeepONets For Operator Learning | May 20, 2024 | Mixture-of-Experts, Operator Learning | Code Available | 0 |
| MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models | May 19, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Code Available | 1 |
| Learning More Generalized Experts by Merging Experts in Mixture-of-Experts | May 19, 2024 | Incremental Learning, Mixture-of-Experts | Unverified | 0 |
| Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts | May 18, 2024 | Mixture-of-Experts, Visual Question Answering | Code Available | 5 |
| Many Hands Make Light Work: Task-Oriented Dialogue System with Module-Based Mixture-of-Experts | May 16, 2024 | Dialogue State Tracking, Mixture-of-Experts | Unverified | 0 |
| M^4oE: A Foundation Model for Medical Multimodal Image Segmentation with Mixture of Experts | May 15, 2024 | Image Segmentation, Mixture-of-Experts | Code Available | 1 |
| A Mixture of Experts Approach to 3D Human Motion Prediction | May 9, 2024 | Human Motion Prediction, Mixture-of-Experts | Code Available | 0 |
| A Mixture-of-Experts Approach to Few-Shot Task Transfer in Open-Ended Text Worlds | May 9, 2024 | Few-Shot Learning, Mixture-of-Experts | Unverified | 0 |
| EWMoE: An effective model for global weather forecasting with mixture-of-experts | May 9, 2024 | Mixture-of-Experts, Weather Forecasting | Code Available | 1 |
| CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | May 9, 2024 | Image Captioning, Instruction Following | Code Available | 2 |
| SUTRA: Scalable Multilingual Language Model Architecture | May 7, 2024 | Computational Efficiency, Hallucination | Unverified | 0 |
| DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | May 7, 2024 | Language Modeling | Code Available | 9 |
| MEET: Mixture of Experts Extra Tree-Based sEMG Hand Gesture Identification | May 6, 2024 | Electromyography (EMG), Gesture Recognition | Unverified | 0 |
| WDMoE: Wireless Distributed Large Language Models with Mixture of Experts | May 6, 2024 | Mixture-of-Experts | Unverified | 0 |
| Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training | May 6, 2024 | Language Modeling | Unverified | 0 |
| Mixture of partially linear experts | May 5, 2024 | Mixture-of-Experts | Unverified | 0 |
| MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | May 2, 2024 | Combinatorial Optimization, Mixture-of-Experts | Code Available | 3 |
| Hierarchical mixture of discriminative Generalized Dirichlet classifiers | May 2, 2024 | Mixture-of-Experts, Spam Detection | Unverified | 0 |
| Powering In-Database Dynamic Model Slicing for Structured Data Analytics | May 1, 2024 | Mixture-of-Experts | Unverified | 0 |
| Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment | May 1, 2024 | Mixture-of-Experts | Unverified | 0 |
| MoPEFT: A Mixture-of-PEFTs for the Segment Anything Model | May 1, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Unverified | 0 |
| Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapping | Apr 30, 2024 | Mixture-of-Experts | Unverified | 0 |
| Revisiting RGBT Tracking Benchmarks from the Perspective of Modality Validity: A New Benchmark, Problem, and Method | Apr 30, 2024 | Mixture-of-Experts, RGB-T Tracking | Code Available | 1 |
| Mix of Experts Language Model for Named Entity Recognition | Apr 30, 2024 | Language Modeling | Unverified | 0 |
| M3oE: Multi-Domain Multi-Task Mixture-of-Experts Recommendation Framework | Apr 29, 2024 | AutoML, Mixture-of-Experts | Code Available | 1 |
| Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing | Apr 29, 2024 | Image Super-Resolution, Mixture-of-Experts | Code Available | 1 |
| Towards Incremental Learning in Large Language Models: A Critical Review | Apr 28, 2024 | Continual Learning, Incremental Learning | Unverified | 0 |
| Large Multi-modality Model Assisted AI-Generated Image Quality Assessment | Apr 27, 2024 | Image Quality Assessment, Mixture-of-Experts | Code Available | 1 |
| Integration of Mixture of Experts and Multimodal Generative AI in Internet of Vehicles: A Survey | Apr 25, 2024 | Autonomous Driving, Decision Making | Unverified | 0 |
| U2++ MoE: Scaling 4.7x parameters with minimal impact on RTF | Apr 25, 2024 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Multi-Head Mixture-of-Experts | Apr 23, 2024 | Language Modeling | Code Available | 1 |
| XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | Apr 23, 2024 | HumanEval, MBPP | Code Available | 1 |
| MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Apr 22, 2024 | Common Sense Reasoning, GPU | Code Available | 3 |
| A Novel A.I Enhanced Reservoir Characterization with a Combined Mixture of Experts -- NVIDIA Modulus based Physics Informed Neural Operator Forward Model | Apr 20, 2024 | Mixture-of-Experts, Uncertainty Quantification | Unverified | 0 |
| A Large-scale Medical Visual Task Adaptation Benchmark | Apr 19, 2024 | Mixture-of-Experts | Unverified | 0 |
| MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | Apr 17, 2024 | Disentanglement, Image Generation | Unverified | 0 |
| Med-MoE: Mixture of Domain-Specific Experts for Lightweight Medical Vision-Language Models | Apr 16, 2024 | Image Classification | Code Available | 2 |
| Generative AI Agents with Large Language Model for Satellite Networks via a Mixture of Experts Transmission | Apr 14, 2024 | Language Modeling | Unverified | 0 |
| Intuition-aware Mixture-of-Rank-1-Experts for Parameter Efficient Finetuning | Apr 13, 2024 | Diversity, Mixture-of-Experts | Unverified | 0 |
| Countering Mainstream Bias via End-to-End Adaptive Local Learning | Apr 13, 2024 | Collaborative Filtering, Mixture-of-Experts | Code Available | 0 |
| Mixture of Experts Soften the Curse of Dimensionality in Operator Learning | Apr 13, 2024 | Mixture-of-Experts, Operator Learning | Unverified | 0 |
| MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection | Apr 12, 2024 | Mixture-of-Experts | Code Available | 2 |
| JetMoE: Reaching Llama2 Performance with 0.1M Dollars | Apr 11, 2024 | GPU, Mixture-of-Experts | Code Available | 4 |