| Samoyeds: Accelerating MoE Models with Structured Sparsity Leveraging Sparse Tensor Cores | Mar 13, 2025 | Mixture-of-Experts | CodeCode Available | 1 |
| Gradient-free variational learning with conditional mixture networks | Aug 29, 2024 | Computational EfficiencyMixture-of-Experts | CodeCode Available | 1 |
| Go Wider Instead of Deeper | Jul 25, 2021 | Image ClassificationMixture-of-Experts | CodeCode Available | 1 |
| GraphMETRO: Mitigating Complex Graph Distribution Shifts via Mixture of Aligned Experts | Dec 7, 2023 | DiversityGraph Neural Network | CodeCode Available | 1 |
| Distilling the Knowledge in a Neural Network | Mar 9, 2015 | Knowledge DistillationMixture-of-Experts | CodeCode Available | 1 |
| Distribution-aware Forgetting Compensation for Exemplar-Free Lifelong Person Re-identification | Apr 21, 2025 | Exemplar-FreeKnowledge Distillation | CodeCode Available | 1 |
| Graph Sparsification via Mixture of Graphs | May 23, 2024 | Graph LearningMixture-of-Experts | CodeCode Available | 1 |
| JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model | May 22, 2025 | GPULong-range modeling | CodeCode Available | 1 |
| Frequency-Adaptive Pan-Sharpening with Mixture of Experts | Jan 4, 2024 | Mixture-of-Experts | CodeCode Available | 1 |
| FreqMoE: Enhancing Time Series Forecasting through Frequency Decomposition Mixture of Experts | Jan 25, 2025 | Mixture-of-ExpertsPrediction | CodeCode Available | 1 |