| Emergent Modularity in Pre-trained Transformers | May 28, 2023 | Mixture-of-Experts | Code Available | 1 |
| Lifting the Curse of Capacity Gap in Distilling Language Models | May 20, 2023 | Knowledge Distillation, Mixture-of-Experts | Code Available | 1 |
| Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration | May 1, 2023 | Data Integration, Entity Resolution | Code Available | 1 |
| Long-Tailed Visual Recognition via Self-Heterogeneous Integration with Knowledge Excavation | Apr 3, 2023 | Mixture-of-Experts, Transfer Learning | Code Available | 1 |
| Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild | Apr 2, 2023 | Image Quality Assessment, Mixture-of-Experts | Code Available | 1 |
| MixPHM: Redundancy-Aware Parameter-Efficient Tuning for Low-Resource Visual Question Answering | Mar 2, 2023 | Mixture-of-Experts, Question Answering | Code Available | 1 |
| Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers | Mar 2, 2023 | Mixture-of-Experts | Code Available | 1 |
| Mixture of Decision Trees for Interpretable Machine Learning | Nov 26, 2022 | Interpretable Machine Learning, Mixture-of-Experts | Code Available | 1 |
| Spatial Mixture-of-Experts | Nov 24, 2022 | Mixture-of-Experts | Code Available | 1 |
| PAD-Net: An Efficient Framework for Dynamic Networks | Nov 10, 2022 | Image Classification | Code Available | 1 |