| Title | Date | Tasks | Code | # |
| --- | --- | --- | --- | --- |
| Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts | Feb 10, 2020 | Language Modelling, Mixture-of-Experts | Code Available | 1 |
| Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models | Nov 8, 2019 | Mixture-of-Experts | Code Available | 1 |
| MoËT: Mixture of Expert Trees and its Application to Verifiable Reinforcement Learning | Jun 16, 2019 | Game of Go, Imitation Learning | Code Available | 1 |
| Gated Multimodal Units for Information Fusion | Feb 7, 2017 | General Classification, Genre Classification | Code Available | 1 |
| Distilling the Knowledge in a Neural Network | Mar 9, 2015 | Knowledge Distillation, Mixture-of-Experts | Code Available | 1 |
| GEMINUS: Dual-aware Global and Scene-Adaptive Mixture-of-Experts for End-to-End Autonomous Driving | Jul 19, 2025 | Autonomous Driving, Bench2Drive | Code Available | 0 |
| R^2MoE: Redundancy-Removal Mixture of Experts for Lifelong Concept Learning | Jul 17, 2025 | Mixture-of-Experts | Code Available | 0 |
| Mixture of Experts in Large Language Models | Jul 15, 2025 | Diversity, Language Modeling | Unverified | 0 |
| Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive Segmentation | Jul 13, 2025 | CPU, Interactive Segmentation | Unverified | 0 |
| KAT-V1: Kwai-AutoThink Technical Report | Jul 11, 2025 | Knowledge Distillation, Large Language Model | Unverified | 0 |