| Wolf: Captioning Everything with a World Summarization Framework | Jul 26, 2024 | Autonomous DrivingMixture-of-Experts | —Unverified | 0 | 0 |
| Yi-Lightning Technical Report | Dec 2, 2024 | ChatbotLarge Language Model | —Unverified | 0 | 0 |
| PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning | Jul 31, 2024 | Continual LearningGeneral Knowledge | —Unverified | 0 | 0 |
| Zero-Resource Multilingual Model Transfer: Learning What to Share | Sep 27, 2018 | Cross-Lingual TransferMixture-of-Experts | —Unverified | 0 | 0 |
| Multimodal Fusion and Coherence Modeling for Video Topic Segmentation | Aug 1, 2024 | Contrastive LearningMixture-of-Experts | —Unverified | 0 | 0 |
| HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction | Aug 2, 2024 | Click-Through Rate PredictionMixture-of-Experts | —Unverified | 0 | 0 |
| Mixture-of-Noises Enhanced Forgery-Aware Predictor for Multi-Face Manipulation Detection and Localization | Aug 5, 2024 | Face DetectionMixture-of-Experts | —Unverified | 0 | 0 |
| Routing in Sparsely-gated Language Models responds to Context | Sep 21, 2024 | DecoderMixture-of-Experts | —Unverified | 0 | 0 |
| On DeepSeekMoE: Statistical Benefits of Shared Experts and Normalized Sigmoid Gating | May 16, 2025 | Language ModelingLanguage Modelling | —Unverified | 0 | 0 |
| A Fast Kernel-based Conditional Independence test with Application to Causal Discovery | May 16, 2025 | Causal DiscoveryCausal Inference | —Unverified | 0 | 0 |
| MoE-CAP: Benchmarking Cost, Accuracy and Performance of Sparse Mixture-of-Experts Systems | May 16, 2025 | BenchmarkingMixture-of-Experts | —Unverified | 0 | 0 |
| MegaScale-MoE: Large-Scale Communication-Efficient Training of Mixture-of-Experts Models in Production | May 16, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| A Survey of Generative Categories and Techniques in Multimodal Large Language Models | May 29, 2025 | Mixture-of-ExpertsSelf-Supervised Learning | —Unverified | 0 | 0 |
| 3D Gaussian Splatting Data Compression with Mixture of Priors | May 6, 2025 | 3DGSData Compression | —Unverified | 0 | 0 |
| 3D-MoE: A Mixture-of-Experts Multi-modal LLM for 3D Vision and Pose Diffusion via Rectified Flow | Jan 28, 2025 | Instruction FollowingMixture-of-Experts | —Unverified | 0 | 0 |
| Accelerating Mixture-of-Experts Training with Adaptive Expert Replication | Apr 28, 2025 | GPUMixture-of-Experts | —Unverified | 0 | 0 |
| Accelerating MoE Model Inference with Expert Sharding | Mar 11, 2025 | DecoderGPU | —Unverified | 0 | 0 |
| Acquiring Diverse Skills using Curriculum Reinforcement Learning with Mixture of Experts | Mar 11, 2024 | Mixture-of-ExpertsReinforcement Learning (RL) | —Unverified | 0 | 0 |
| Modular Action Concept Grounding in Semantic Video Prediction | Nov 23, 2020 | Action RecognitionMixture-of-Experts | —Unverified | 0 | 0 |
| AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction | Jan 6, 2023 | Click-Through Rate PredictionMixture-of-Experts | —Unverified | 0 | 0 |
| Ada-K Routing: Boosting the Efficiency of MoE-based LLMs | Oct 14, 2024 | Computational EfficiencyMixture-of-Experts | —Unverified | 0 | 0 |
| AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts | Jan 1, 2023 | Instance SegmentationMixture-of-Experts | —Unverified | 0 | 0 |
| Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection | Sep 9, 2024 | Anomaly DetectionMixture-of-Experts | —Unverified | 0 | 0 |
| Adaptive Conditional Expert Selection Network for Multi-domain Recommendation | Nov 11, 2024 | Computational EfficiencyMixture-of-Experts | —Unverified | 0 | 0 |
| Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | Apr 10, 2025 | Mixture-of-Expertsobject-detection | —Unverified | 0 | 0 |