Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text
Junjie Xing, Kenny Zhu, Shaodian Zhang
Code: github.com/adapt-sjtu/AMTTL (official, linked in paper, TensorFlow)
Abstract
Chinese word segmentation (CWS) models trained on open-source corpora suffer a dramatic performance drop on domain-specific text, especially in domains with many specialized terms and diverse writing styles, such as the biomedical domain. However, building a domain-specific CWS model requires extremely high annotation cost. In this paper, we propose an approach that transfers domain-invariant knowledge from high-resource domains to low-resource domains. Extensive experiments show that our model achieves consistently higher accuracy than single-task CWS and other transfer learning baselines, especially when there is a large disparity between the source and target domains.