| Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline | Feb 9, 2025 | CPU, GPU | Code Available | 0 |
| Jamba: A Hybrid Transformer-Mamba Language Model | Mar 28, 2024 | GPU, Language Modeling | Code Available | 0 |
| A Mixture of Experts Approach to 3D Human Motion Prediction | May 9, 2024 | Human motion prediction, Mixture-of-Experts | Code Available | 0 |
| Understanding the Performance and Estimating the Cost of LLM Fine-Tuning | Aug 8, 2024 | GPU, Mixture-of-Experts | Code Available | 0 |
| ResMoE: Space-efficient Compression of Mixture of Experts LLMs via Residual Restoration | Mar 10, 2025 | Mixture-of-Experts | Code Available | 0 |
| Restoring Spatially-Heterogeneous Distortions using Mixture of Experts Network | Sep 30, 2020 | Mixture-of-Experts, Multi-Task Learning | Code Available | 0 |
| Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate | May 26, 2025 | Imputation, Mixture-of-Experts | Code Available | 0 |
| Intrinsic User-Centric Interpretability through Global Mixture of Experts | Feb 5, 2024 | Mixture-of-Experts, News Classification | Code Available | 0 |
| Integrating Multi-view Analysis: Multi-view Mixture-of-Expert for Textual Personality Detection | Aug 16, 2024 | Mixture-of-Experts | Code Available | 0 |
| Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment | Jun 1, 2023 | Benchmarking, Hate Speech Detection | Code Available | 0 |