| BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference | Feb 24, 2025 | Mixture-of-Experts | Unverified | 0 |
| Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks | Feb 24, 2025 | Mixture-of-Experts, MMLU | Unverified | 0 |
| An Autonomous Network Orchestration Framework Integrating Large Language Models with Continual Reinforcement Learning | Feb 22, 2025 | ARC, Continual Learning | Unverified | 0 |
| Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Feb 21, 2025 | Mixture-of-Experts | Code Available | 0 |
| Tight Clusters Make Specialized Experts | Feb 21, 2025 | Clustering, Language Modeling | Code Available | 0 |
| Ray-Tracing for Conditionally Activated Neural Networks | Feb 20, 2025 | Mixture-of-Experts | Unverified | 0 |
| ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model | Feb 20, 2025 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | Feb 19, 2025 | Dictionary Learning, Mixture-of-Experts | Unverified | 0 |
| DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs | Feb 18, 2025 | Computational Efficiency, Language Modeling | Unverified | 0 |
| Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models | Feb 18, 2025 | Knowledge Distillation, Mixture-of-Experts | Unverified | 0 |