| Title | Date | Tasks | Code | Stars |
| --- | --- | --- | --- | --- |
| How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers | Mar 4, 2024 | Few-Shot Learning, Language Modeling | Unverified | 0 |
| Enhancing the "Immunity" of Mixture-of-Experts Networks for Adversarial Defense | Feb 29, 2024 | Adversarial Defense, Adversarial Robustness | Unverified | 0 |
| An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement | Feb 27, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| m2mKD: Module-to-Module Knowledge Distillation for Modular Transformers | Feb 26, 2024 | Knowledge Distillation, Mixture-of-Experts | Code Available | 0 |
| ASEM: Enhancing Empathy in Chatbot through Attention-based Sentiment and Emotion Modeling | Feb 25, 2024 | Chatbot, Diversity | Code Available | 0 |
| Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts | Feb 23, 2024 | Mixture-of-Experts | Code Available | 0 |
| PEMT: Multi-Task Correlation Guided Mixture-of-Experts Enables Parameter-Efficient Transfer Learning | Feb 23, 2024 | Mixture-of-Experts, Parameter-Efficient Fine-Tuning | Unverified | 0 |
| MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models | Feb 20, 2024 | Common Sense Reasoning, Contrastive Learning | Unverified | 0 |
| Denoising OCT Images Using Steered Mixture of Experts with Multi-Model Inference | Feb 20, 2024 | Denoising, Diagnostic | Unverified | 0 |
| Towards an empirical understanding of MoE design choices | Feb 20, 2024 | Mixture-of-Experts | Unverified | 0 |