| Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts | Oct 16, 2024 | Mixture-of-Experts, Parameter Estimation | Unverified | 0 |
| On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs | Oct 16, 2024 | Mixture-of-Experts, Text Detection | Unverified | 0 |
| EPS-MoE: Expert Pipeline Scheduler for Cost-Efficient MoE Inference | Oct 16, 2024 | Computational Efficiency, Large Language Model | Unverified | 0 |
| MoE-Pruner: Pruning Mixture-of-Experts Large Language Model using the Hints from Its Router | Oct 15, 2024 | Knowledge Distillation, Language Modeling | Unverified | 0 |
| Transformer Layer Injection: A Novel Approach for Efficient Upscaling of Large Language Models | Oct 15, 2024 | Mixture-of-Experts | Unverified | 0 |
| Quadratic Gating Functions in Mixture of Experts: A Statistical Insight | Oct 15, 2024 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| Scalable Multi-Domain Adaptation of Language Models using Modular Experts | Oct 14, 2024 | Domain Adaptation, General Knowledge | Unverified | 0 |
| Learning to Ground VLMs without Forgetting | Oct 14, 2024 | Decoder, Language Modelling | Unverified | 0 |
| Ada-K Routing: Boosting the Efficiency of MoE-based LLMs | Oct 14, 2024 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| ContextWIN: Whittle Index Based Mixture-of-Experts Neural Model For Restless Bandits Via Deep RL | Oct 13, 2024 | Decision Making, Mixture-of-Experts | Unverified | 0 |