| Title | Date | Tasks | Code |
| --- | --- | --- | --- |
| DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unsupervised Dimensionality Reduction | Oct 25, 2024 | Dimensionality Reduction, Mixture-of-Experts | Code Available |
| Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design | Oct 24, 2024 | Mixture-of-Experts, MMLU | Code Available |
| LMHaze: Intensity-aware Image Dehazing with a Large-scale Multi-intensity Real Haze Dataset | Oct 21, 2024 | Image Dehazing, Mamba | Code Available |
| ST-MoE-BERT: A Spatial-Temporal Mixture-of-Experts Framework for Long-Term Cross-City Mobility Prediction | Oct 18, 2024 | Classification, Human Dynamics | Code Available |
| MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts | Oct 18, 2024 | Language Modeling | Code Available |
| GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Oct 15, 2024 | Explainable Recommendation, Language Modeling | Code Available |
| AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Oct 14, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Code Available |
| Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models | Oct 14, 2024 | Federated Learning, Mixture-of-Experts | Code Available |
| Retraining-Free Merging of Sparse MoE via Hierarchical Clustering | Oct 11, 2024 | Clustering, Language Modeling | Code Available |
| Efficient Dictionary Learning with Switch Sparse Autoencoders | Oct 10, 2024 | Dictionary Learning, Mixture-of-Experts | Code Available |
| Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild | Oct 7, 2024 | Benchmarking, Mixture-of-Experts | Code Available |
| Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices | Oct 3, 2024 | Mixture-of-Experts | Code Available |
| A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction | Sep 26, 2024 | Mixture-of-Experts, Prediction | Code Available |
| Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE | Sep 26, 2024 | Image Classification | Code Available |
| LOLA -- An Open-Source Massively Multilingual Large Language Model | Sep 17, 2024 | Diversity, Language Modeling | Code Available |
| M3-Jepa: Multimodal Alignment via Multi-directional MoE based on the JEPA framework | Sep 9, 2024 | Computational Efficiency, Cross-Modal Retrieval | Code Available |
| Gradient-free variational learning with conditional mixture networks | Aug 29, 2024 | Computational Efficiency, Mixture-of-Experts | Code Available |
| Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting | Aug 20, 2024 | Attribute, Mixture-of-Experts | Code Available |
| AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference | Aug 19, 2024 | Management, Mixture-of-Experts | Code Available |
| Customizing Language Models with Instance-wise LoRA for Sequential Recommendation | Aug 19, 2024 | Mixture-of-Experts, Multi-Task Learning | Code Available |
| Layerwise Recurrent Router for Mixture-of-Experts | Aug 13, 2024 | Attribute, Mixture-of-Experts | Code Available |
| AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies | Aug 13, 2024 | Language Modeling, Mixture-of-Experts | Code Available |
| MoExtend: Tuning New Experts for Modality and Task Extension | Aug 7, 2024 | Mixture-of-Experts | Code Available |
| Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing | Jul 26, 2024 | Attribute, Language Modeling | Code Available |
| M4: Multi-Proxy Multi-Gate Mixture of Experts Network for Multiple Instance Learning in Histopathology Image Analysis | Jul 24, 2024 | Mixture-of-Experts, Multiple Instance Learning | Code Available |