| A Mixture of Expert Approach for Low-Cost Customization of Deep Neural Networks | Oct 31, 2018 | Mixture-of-Experts | —Unverified | 0 | 0 |
| Hypertext Entity Extraction in Webpage | Mar 4, 2024 | Mixture-of-Experts | —Unverified | 0 | 0 |
| HydraSum - Disentangling Stylistic Features in Text Summarization using Multi-Decoder Models | Sep 29, 2021 | Abstractive Text Summarization, Decoder | —Unverified | 0 | 0 |
| Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought | May 21, 2025 | Chatbot, Instruction Following | —Unverified | 0 | 0 |
| A Universal Approximation Theorem for Mixture of Experts Models | Feb 11, 2016 | General ClassificationMixture-of-Experts | —Unverified | 0 | 0 |
| AMEND: A Mixture of Experts Framework for Long-tailed Trajectory Prediction | Feb 13, 2024 | Contrastive Learning, Mixture-of-Experts | —Unverified | 0 | 0 |
| Adaptive Detection of Fast Moving Celestial Objects Using a Mixture of Experts and Physical-Inspired Neural Network | Apr 10, 2025 | Mixture-of-Experts, object-detection | —Unverified | 0 | 0 |
| How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines | Feb 17, 2025 | Mixture-of-Experts | —Unverified | 0 | 0 |
| How Lightweight Can A Vision Transformer Be? | Jul 25, 2024 | Mixture-of-Experts, Transfer Learning | —Unverified | 0 | 0 |
| How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | Mar 4, 2024 | Few-Shot Learning, Language Modeling | —Unverified | 0 | 0 |