| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | Feb 18, 2025 | Knowledge Distillation, Mixture-of-Experts | Unverified | 0 |
| DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs | Feb 18, 2025 | Computational Efficiency, Language Modeling | Unverified | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | Feb 17, 2025 | Mixture-of-Experts | Unverified | 0 |
| Connector-S: A Survey of Connectors in Multi-modal Large Language Models | Feb 17, 2025 | Mixture-of-Experts, Survey | Unverified | 0 |
| Fate: Fast Edge Inference of Mixture-of-Experts Models via Cross-Layer Gate | Feb 17, 2025 | GPU, Mixture-of-Experts | Code Available | 0 |
| Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time | Feb 16, 2025 | Mixture-of-Experts | Unverified | 0 |
| ClimateLLM: Efficient Weather Forecasting via Frequency-Aware Large Language Models | Feb 16, 2025 | energy management, Mixture-of-Experts | Unverified | 0 |
| Probing Semantic Routing in Large Mixture-of-Expert Models | Feb 15, 2025 | Mixture-of-Experts, Sentence | Unverified | 0 |
| Eidetic Learning: an Efficient and Provable Solution to Catastrophic Forgetting | Feb 13, 2025 | Mixture-of-Experts | Code Available | 0 |
| Mixture of Decoupled Message Passing Experts with Entropy Constraint for General Node Classification | Feb 12, 2025 | Mixture-of-Experts, Node Classification | Unverified | 0 |