TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents
Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
Code
- github.com/thu-coai/CDial-GPT (PyTorch, ★ 1,938)
- github.com/ErikEkstedt/TurnGPT (PyTorch, ★ 65)
- github.com/noriyukipy/gptchat (PyTorch, ★ 0)
- github.com/the-pythoncoder/counsel-chat (PyTorch, ★ 0)
- github.com/KhueNguyen312/Persona-Chatbot (PyTorch, ★ 0)
- github.com/pranavgollamudi/Chatbot (PyTorch, ★ 0)
- github.com/dladustn95/enLanguageModel (PyTorch, ★ 0)
- github.com/BSlience/end2end-conversational-ai (PyTorch, ★ 0)
- github.com/samsonleegh/convai_smile (PyTorch, ★ 0)
- github.com/huggingface/transfer-learning-conv-ai (PyTorch, ★ 0)
Abstract
We introduce TransferTransfo, a new approach to generative data-driven dialogue systems (e.g. chatbots) that combines a transfer-learning-based training scheme with a high-capacity Transformer model. Fine-tuning is performed with a multi-task objective that combines several unsupervised prediction tasks. The resulting fine-tuned model shows strong improvements over current state-of-the-art end-to-end conversational models such as memory-augmented seq2seq and information-retrieval models. On the privately held PERSONA-CHAT dataset of the Conversational Intelligence Challenge 2, this approach obtains a new state of the art, with perplexity, Hits@1 and F1 metrics of 16.28 (45% absolute improvement), 80.7 (46% absolute improvement) and 19.5 (20% absolute improvement), respectively.
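The multi-task objective mentioned in the abstract can be sketched as a weighted sum of a language-modeling loss over the reply tokens and a next-utterance classification loss (picking the gold reply among distractors). The helper below is an illustrative, dependency-free sketch, not the authors' implementation; the loss coefficients `lm_coef` and `mc_coef` are assumed values for demonstration.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of logits
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def cross_entropy(logits, target):
    # negative log-likelihood of the target index
    return -math.log(softmax(logits)[target])

def multitask_loss(lm_logits, lm_targets, mc_logits, mc_target,
                   lm_coef=2.0, mc_coef=1.0):
    """Combined TransferTransfo-style fine-tuning loss (illustrative).

    lm_logits:  per-position next-token logits over the vocabulary
    lm_targets: gold next-token ids
    mc_logits:  one logit per candidate reply (gold + distractors)
    mc_target:  index of the gold reply
    """
    # language-modeling loss: mean next-token cross-entropy
    lm_loss = sum(cross_entropy(l, t)
                  for l, t in zip(lm_logits, lm_targets)) / len(lm_targets)
    # next-utterance classification loss
    mc_loss = cross_entropy(mc_logits, mc_target)
    return lm_coef * lm_loss + mc_coef * mc_loss
```

In practice both losses are computed from two heads on top of the same Transformer and backpropagated jointly; the coefficients balance fluency against the model's ability to rank the correct reply.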
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Persona-Chat | TransferTransfo | Avg F1 | 19.09 | — | Unverified |