| OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment | Feb 26, 2025 | Mixture-of-ExpertsRecommendation Systems | —Unverified | 0 |
| Delta Decompression for MoE-based LLMs Compression | Feb 24, 2025 | DiversityMixture-of-Experts | CodeCode Available | 2 |
| The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE | Feb 24, 2025 | Linear Mode ConnectivityMixture-of-Experts | —Unverified | 0 |
| ENACT-Heart -- ENsemble-based Assessment Using CNN and Transformer on Heart Sounds | Feb 24, 2025 | DiagnosticMixture-of-Experts | —Unverified | 0 |
| BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference | Feb 24, 2025 | Mixture-of-Experts | —Unverified | 0 |
| Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks | Feb 24, 2025 | Mixture-of-ExpertsMMLU | —Unverified | 0 |
| Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment | Feb 24, 2025 | image-classificationImage Classification | CodeCode Available | 2 |
| An Autonomous Network Orchestration Framework Integrating Large Language Models with Continual Reinforcement Learning | Feb 22, 2025 | ARCContinual Learning | —Unverified | 0 |
| Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Feb 21, 2025 | Mixture-of-Experts | CodeCode Available | 0 |
| Tight Clusters Make Specialized Experts | Feb 21, 2025 | ClusteringLanguage Modeling | CodeCode Available | 0 |
| Ray-Tracing for Conditionally Activated Neural Networks | Feb 20, 2025 | Mixture-of-Experts | —Unverified | 0 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-ExpertsQuestion Answering | CodeCode Available | 1 |
| Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | Feb 19, 2025 | Dictionary LearningMixture-of-Experts | —Unverified | 0 |
| DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs | Feb 18, 2025 | Computational EfficiencyLanguage Modeling | —Unverified | 0 |
| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | Feb 18, 2025 | Knowledge DistillationMixture-of-Experts | —Unverified | 0 |
| MoBA: Mixture of Block Attention for Long-Context LLMs | Feb 18, 2025 | Mixture-of-Experts | CodeCode Available | 7 |
| Fate: Fast Edge Inference of Mixture-of-Experts Models via Cross-Layer Gate | Feb 17, 2025 | GPUMixture-of-Experts | CodeCode Available | 0 |
| Connector-S: A Survey of Connectors in Multi-modal Large Language Models | Feb 17, 2025 | Mixture-of-ExpertsSurvey | —Unverified | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | Feb 17, 2025 | Mixture-of-Experts | —Unverified | 0 |
| ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models | Feb 16, 2025 | energy managementMixture-of-Experts | —Unverified | 0 |
| Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time | Feb 16, 2025 | Mixture-of-Experts | —Unverified | 0 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | Feb 15, 2025 | Mixture-of-ExpertsSentence | —Unverified | 0 |
| Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Feb 13, 2025 | Mixture-of-Experts | CodeCode Available | 0 |
| Heterogeneous Mixture of Experts for Remote Sensing Image Super-Resolution | Feb 12, 2025 | Image Super-ResolutionMixture-of-Experts | CodeCode Available | 1 |
| Mixture of Decoupled Message Passing Experts with Entropy Constraint for General Node Classification | Feb 12, 2025 | Mixture-of-ExpertsNode Classification | —Unverified | 0 |