| Title | Date | Topics | Code Status |
| --- | --- | --- | --- |
| Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization | Feb 26, 2025 | Mixture-of-Experts | Unverified |
| The Empirical Impact of Reducing Symmetries on the Performance of Deep Ensembles and MoE | Feb 24, 2025 | Linear Mode Connectivity, Mixture-of-Experts | Unverified |
| Evaluating Expert Contributions in a MoE LLM for Quiz-Based Tasks | Feb 24, 2025 | Mixture-of-Experts, MMLU | Unverified |
| ENACT-Heart -- ENsemble-based Assessment Using CNN and Transformer on Heart Sounds | Feb 24, 2025 | Diagnostic, Mixture-of-Experts | Unverified |
| BigMac: A Communication-Efficient Mixture-of-Experts Model Structure for Fast Training and Inference | Feb 24, 2025 | Mixture-of-Experts | Unverified |
| An Autonomous Network Orchestration Framework Integrating Large Language Models with Continual Reinforcement Learning | Feb 22, 2025 | ARC, Continual Learning | Unverified |
| Tight Clusters Make Specialized Experts | Feb 21, 2025 | Clustering, Language Modeling | Code Available |
| Binary-Integer-Programming Based Algorithm for Expert Load Balancing in Mixture-of-Experts Models | Feb 21, 2025 | Mixture-of-Experts | Code Available |
| Ray-Tracing for Conditionally Activated Neural Networks | Feb 20, 2025 | Mixture-of-Experts | Unverified |
| Unraveling the Localized Latents: Learning Stratified Manifold Structures in LLM Embedding Space with Sparse Mixture-of-Experts | Feb 19, 2025 | Dictionary Learning, Mixture-of-Experts | Unverified |