| Title | Date | Tasks | Code | Stars |
| --- | --- | --- | --- | --- |
| MMoE: Robust Spoiler Detection with Multi-modal Information and Domain-aware Mixture-of-Experts | Mar 8, 2024 | Domain Generalization, Mixture-of-Experts | Unverified | 0 |
| ConstitutionalExperts: Training a Mixture of Principle-based Prompts | Mar 7, 2024 | Mixture-of-Experts | Unverified | 0 |
| Video Relationship Detection Using Mixture of Experts | Mar 6, 2024 | Action Recognition, Mixture-of-Experts | Code Available | 0 |
| Mixture-of-LoRAs: An Efficient Multitask Tuning for Large Language Models | Mar 6, 2024 | Mixture-of-Experts, Multi-Task Learning | Unverified | 0 |
| TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts | Mar 5, 2024 | Graph Attention, Graph Embedding | Code Available | 2 |
| Vanilla Transformers are Transfer Capability Teachers | Mar 4, 2024 | Computational Efficiency, Mixture-of-Experts | Unverified | 0 |
| How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | Mar 4, 2024 | Few-Shot Learning, Language Modeling | Unverified | 0 |
| Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral | Mar 4, 2024 | Language Modeling | Code Available | 5 |
| Hypertext Entity Extraction in Webpage | Mar 4, 2024 | Mixture-of-Experts | Unverified | 0 |
| DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Mar 2, 2024 | Language Modelling, Large Language Model | Code Available | 1 |