GPT Understands, Too
Xiao Liu, Yanan Zheng, Zhengxiao Du, Ming Ding, Yujie Qian, Zhilin Yang, Jie Tang
Code
- github.com/THUDM/P-tuning (official, in paper; PyTorch, ★ 938)
- github.com/THUDM/GLM (PyTorch, ★ 3,463)
- github.com/liucongg/chatglm-finetuning (PyTorch, ★ 2,781)
- github.com/thudm/swissarmytransformer (PyTorch, ★ 1,116)
- github.com/jellyfish042/rwkv-statetuning (PyTorch, ★ 35)
- github.com/BBuf/GLM (PyTorch, ★ 3)
- github.com/alibaba/EasyNLP/tree/master/examples/fewshot_learning (JAX, ★ 0)
- github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/few_shot/p-tuning (Paddle, ★ 0)
- github.com/2024-MindSpore-1/Code2/tree/main/model-1/gptj (MindSpore, ★ 0)
- github.com/MindSpore-scientific-2/code-14/tree/main/gptj (MindSpore, ★ 0)
Abstract
Prompting a pretrained language model with natural-language patterns has proved effective for natural language understanding (NLU). However, our preliminary study reveals that manual discrete prompts often lead to unstable performance: changing a single word in the prompt can cause a substantial performance drop. We propose P-Tuning, a novel method that employs trainable continuous prompt embeddings concatenated with discrete prompts. Empirically, P-Tuning not only stabilizes training by narrowing the gap between different discrete prompts, but also improves performance by a sizeable margin on a wide range of NLU tasks, including LAMA and SuperGLUE. P-Tuning is generally effective for both frozen and tuned language models, under both fully supervised and few-shot settings.
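The core input construction the abstract describes — trainable continuous prompt embeddings concatenated with the embeddings of a discrete prompt — can be sketched as follows. This is a minimal illustration, not the paper's implementation: all dimensions, the random initialization, and the use of NumPy in place of a deep-learning framework are assumptions for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): vocabulary, embedding width,
# and number of trainable continuous prompt vectors.
vocab_size, d_model, n_prompt = 100, 16, 4

# Stand-in for the pretrained model's (frozen) token embedding table.
token_embedding = rng.normal(size=(vocab_size, d_model))

# Continuous prompt embeddings. In P-Tuning these are the trainable
# parameters, optimized by gradient descent while the LM may stay frozen.
prompt_embeddings = rng.normal(size=(n_prompt, d_model))

def build_input(discrete_token_ids):
    """Concatenate the continuous prompt embeddings with the embeddings
    of a discrete (natural-language) prompt before feeding the LM."""
    discrete = token_embedding[discrete_token_ids]            # (L, d_model)
    return np.concatenate([prompt_embeddings, discrete], 0)   # (n_prompt + L, d_model)

# A discrete prompt of 3 tokens yields a 4 + 3 = 7 vector input sequence.
x = build_input(np.array([5, 17, 42]))
print(x.shape)  # (7, 16)
```

In an actual training loop, gradients would flow only into `prompt_embeddings` (and, in the tuned-LM setting, the model weights as well), which is what makes the prompt "continuous" rather than a fixed string of vocabulary tokens.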