| Title | Date | Topics | Code Status | Code Links |
| --- | --- | --- | --- | --- |
| Towards Foundational Models for Dynamical System Reconstruction: Hierarchical Meta-Learning via Mixture of Experts | Feb 7, 2025 | Meta-Learning, Mixture-of-Experts | Unverified | 0 |
| fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving | Feb 7, 2025 | CPU, GPU | Unverified | 0 |
| Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient | Feb 7, 2025 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| Mixture of neural operator experts for learning boundary conditions and model selection | Feb 6, 2025 | Mixture-of-Experts, Model Selection | Unverified | 0 |
| CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference | Feb 6, 2025 | Mixture-of-Experts | Code Available | 1 |
| Optimizing Robustness and Accuracy in Mixture of Experts: A Dual-Model Approach | Feb 5, 2025 | Adversarial Robustness, Mixture-of-Experts | Unverified | 0 |
| ReGNet: Reciprocal Space-Aware Long-Range Modeling for Crystalline Property Prediction | Feb 4, 2025 | Computational Efficiency, Long-Range Modeling | Unverified | 0 |
| Brief analysis of DeepSeek R1 and its implications for Generative AI | Feb 4, 2025 | GPU, Mixture-of-Experts | Unverified | 0 |
| M2R2: Mixture of Multi-Rate Residuals for Efficient Transformer Inference | Feb 4, 2025 | Mixture-of-Experts | Unverified | 0 |
| CLIP-UP: A Simple and Efficient Mixture-of-Experts CLIP Training Recipe with Sparse Upcycling | Feb 3, 2025 | Mixture-of-Experts | Unverified | 0 |