GPT Understands, Too
Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang
Code
- github.com/THUDM/P-tuning (official, in paper; PyTorch, ★ 938)
- github.com/THUDM/GLM (PyTorch, ★ 3,463)
- github.com/liucongg/chatglm-finetuning (PyTorch, ★ 2,781)
- github.com/thudm/swissarmytransformer (PyTorch, ★ 1,116)
- github.com/jellyfish042/rwkv-statetuning (PyTorch, ★ 35)
- github.com/BBuf/GLM (PyTorch, ★ 3)
- github.com/alibaba/EasyNLP/tree/master/examples/fewshot_learning (JAX, ★ 0)
- github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/few_shot/p-tuning (Paddle, ★ 0)
- github.com/2024-MindSpore-1/Code2/tree/main/model-1/gptj (MindSpore, ★ 0)
- github.com/MindSpore-scientific-2/code-14/tree/main/gptj (MindSpore, ★ 0)
Abstract
Prompting a pretrained language model with natural-language patterns has proved effective for natural language understanding (NLU). However, our preliminary study reveals that manual discrete prompts often lead to unstable performance: changing a single word in the prompt can cause a substantial performance drop. We propose P-Tuning, a novel method that employs trainable continuous prompt embeddings concatenated with discrete prompts. Empirically, P-Tuning not only stabilizes training by narrowing the gap between different discrete prompts, but also improves performance by a sizeable margin on a wide range of NLU tasks, including LAMA and SuperGLUE. P-Tuning is generally effective for both frozen and tuned language models, under both fully supervised and few-shot settings.
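The core input construction the abstract describes — trainable continuous prompt embeddings concatenated with the embeddings of a discrete prompt — can be sketched as follows. This is a minimal illustration, not the paper's implementation: all dimensions, the random initialization, and the use of NumPy in place of a deep-learning framework are assumptions for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): vocabulary, embedding width,
# and number of trainable continuous prompt vectors.
vocab_size, d_model, n_prompt = 100, 16, 4

# Stand-in for the pretrained model's (frozen) token embedding table.
token_embedding = rng.normal(size=(vocab_size, d_model))

# Continuous prompt embeddings. In P-Tuning these are the trainable
# parameters, optimized by gradient descent while the LM may stay frozen.
prompt_embeddings = rng.normal(size=(n_prompt, d_model))

def build_input(discrete_token_ids):
    """Concatenate the continuous prompt embeddings with the embeddings
    of a discrete (natural-language) prompt before feeding the LM."""
    discrete = token_embedding[discrete_token_ids]            # (L, d_model)
    return np.concatenate([prompt_embeddings, discrete], 0)   # (n_prompt + L, d_model)

# A discrete prompt of 3 tokens yields a 4 + 3 = 7 vector input sequence.
x = build_input(np.array([5, 17, 42]))
print(x.shape)  # (7, 16)
```

In an actual training loop, gradients would flow only into `prompt_embeddings` (and, in the tuned-LM setting, the model weights as well), which is what makes the prompt "continuous" rather than a fixed string of vocabulary tokens.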