| A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications | Mar 10, 2025 | Continual Learning, Meta-Learning | Code Available | 9 | 5 |
| Arcee's MergeKit: A Toolkit for Merging Large Language Models | Mar 20, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| VITA: Towards Open-Source Interactive Omni Multimodal LLM | Aug 9, 2024 | Language Modeling, Language Modelling | Code Available | 7 | 5 |
| Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond | Jan 19, 2025 | Deep Learning, Multi-Task Learning | Code Available | 7 | 5 |
| MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning | Oct 14, 2023 | Image Classification, Image Description | Code Available | 7 | 5 |
| StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning | Jun 5, 2024 | Automatic Speech Recognition (ASR), de-en | Code Available | 5 | 5 |
| MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts | Apr 13, 2024 | Diversity, Language Modeling | Code Available | 5 | 5 |
| YOLOR-Based Multi-Task Learning | Sep 29, 2023 | Image Captioning, Instance Segmentation | Code Available | 5 | 5 |
| Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks | Nov 17, 2022 | Decoder, Language Modelling | Code Available | 4 | 5 |
| CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models | Oct 9, 2024 | Multi-Task Learning | Code Available | 4 | 5 |
| Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities | Aug 14, 2024 | Continual Learning, Few-Shot Learning | Code Available | 4 | 5 |
| InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning | Feb 9, 2024 | Data Augmentation, GSM8K | Code Available | 4 | 5 |
| DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks | May 7, 2024 | Binarization, Deblurring | Code Available | 4 | 5 |
| Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective | Feb 2, 2025 | Multi-Task Learning | Code Available | 3 | 5 |
| Scientific Machine Learning through Physics-Informed Neural Networks: Where we are and What's next | Jan 14, 2022 | Multi-Task Learning | Code Available | 3 | 5 |
| UCF: Uncovering Common Features for Generalizable Deepfake Detection | Apr 27, 2023 | Binary Classification, Decoder | Code Available | 3 | 5 |
| MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | May 2, 2024 | Combinatorial Optimization, Mixture-of-Experts | Code Available | 3 | 5 |
| DARWIN 1.5: Large Language Models as Materials Science Adapted Learners | Dec 16, 2024 | Large Language Model, Multi-Task Learning | Code Available | 3 | 5 |
| ERNIE 2.0: A Continual Pre-training Framework for Language Understanding | Jul 29, 2019 | Chinese Named Entity Recognition, Chinese Reading Comprehension | Code Available | 3 | 5 |
| PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images | Jun 2, 2022 | 3D Lane Detection, 3D Object Detection | Code Available | 3 | 5 |
| Ludwig: a type-based declarative deep learning toolbox | Sep 17, 2019 | Decoder, Deep Learning | Code Available | 3 | 5 |
| Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles | Code Available | 3 | 5 |
| MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Apr 22, 2024 | Common Sense Reasoning, GPU | Code Available | 3 | 5 |
| Relational Multi-Task Learning: Modeling Relations between Data and Tasks | Mar 14, 2023 | Multi-Task Learning, Transfer Learning | Code Available | 3 | 5 |
| YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation | Jul 5, 2024 | Drum Transcription, Drum Transcription in Music (DTM) | Code Available | 3 | 5 |