ECSpell^UD: Zero-shot Domain Adaptive Chinese Spelling Check with User Dictionary
Anonymous
Abstract
In real life, spellers often work within a particular domain. Because domain text contains many uncommon terms, experiments on a domain-specific dataset that we built show that general models perform poorly. Inspired by the common practice of input methods, we propose adding an alterable user dictionary to handle the zero-shot domain adaptation problem. Specifically, we attach a User Dictionary guided inference module (UD) to a general token-classification-based speller. Without extra fine-tuning, UD improves the performance of all the tested spellers, especially the strong baselines. To make the domain-adaptive speller more practical, we therefore develop a competitive general speller, ECSpell, which adopts an Error Consistent masking strategy to create data for pretraining. Domain experiments demonstrate that ECSpell^UD, namely ECSpell combined with UD, surpasses all other baselines by a large margin, and even approaches the performance achieved on the general benchmark.
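To make the idea of dictionary-guided inference concrete, here is a minimal sketch of one way a user dictionary could steer the output of a token-classification speller at inference time, with no fine-tuning. It is an illustration under stated assumptions, not the UD module described in the paper: the function name `decode_with_user_dict`, the `bonus` parameter, the per-position candidate format, and the example log-probabilities are all hypothetical.

```python
from typing import Dict, List

# Hypothetical input format: for each position, a mapping from candidate
# character to its log-probability under a token-classification speller.
Candidates = List[Dict[str, float]]


def decode_with_user_dict(
    cands: Candidates, user_dict: List[str], bonus: float = 2.0
) -> str:
    """Dynamic-programming decode that rewards spans matching a dictionary term.

    `bonus` (an assumed hyperparameter) is added per character of a matched
    term, so a domain term can outrank a locally more probable character.
    """
    n = len(cands)
    neg_inf = float("-inf")
    # best[i] holds (score, text) for the best decoding of the first i positions.
    best = [(0.0, "")] + [(neg_inf, "")] * n
    for i in range(n):
        score, text = best[i]
        if score == neg_inf:
            continue
        # Option 1: emit the single most probable character at position i.
        ch, lp = max(cands[i].items(), key=lambda kv: kv[1])
        if score + lp > best[i + 1][0]:
            best[i + 1] = (score + lp, text + ch)
        # Option 2: emit a whole dictionary term if every one of its
        # characters is among the model's candidates at the aligned positions.
        for term in user_dict:
            j = i + len(term)
            if not term or j > n:
                continue
            if all(c in cands[i + k] for k, c in enumerate(term)):
                term_lp = sum(cands[i + k][c] for k, c in enumerate(term))
                s = score + term_lp + bonus * len(term)
                if s > best[j][0]:
                    best[j] = (s, text + term)
    return best[n][1]


# Made-up scores for the misspelled input "心机梗死"; the intended medical
# term is "心肌梗死" (myocardial infarction).
example = [
    {"心": -0.1},
    {"机": -0.3, "肌": -1.5},
    {"梗": -0.1},
    {"死": -0.1},
]
general_output = decode_with_user_dict(example, [])
domain_output = decode_with_user_dict(example, ["心肌梗死"])
```

In this toy example, decoding with an empty dictionary keeps the model's locally preferred character, while supplying the domain term lets the lower-ranked candidate win. Because the dictionary is consulted only at decoding time, swapping it changes the behaviour without touching the model weights, which is the property the abstract refers to as zero-shot domain adaptation.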