Language Models as Recommender Systems: Evaluations and Limitations

2021-09-22 · NeurIPS Workshop ICBINB 2021

Yuhui Zhang, Hao Ding, Zeren Shui, Yifei Ma, James Zou, Anoop Deoras, Hao Wang

Abstract

Pre-trained language models (PLMs) such as BERT and GPT learn general text representations and encode extensive world knowledge; as a result, they can be adapted efficiently and accurately to various downstream tasks. In this work, we propose to leverage these powerful PLMs as recommender systems, using prompts to reformulate the session-based recommendation task as a multi-token cloze task. We evaluate the proposed method on a movie recommendation dataset in zero-shot and fine-tuned settings, where no or limited training data are available. In the zero-shot setting, we find that PLMs outperform the random recommendation baseline by a large margin; at the same time, we observe strong linguistic bias when using PLMs as recommenders. In the fine-tuned setting, this bias is reduced as training data become available; however, PLMs tend to underperform traditional recommender-system baselines such as GRU4Rec. Our observations demonstrate the current challenges of multi-token inference and shed light on future work in this novel direction.
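The core idea in the abstract, turning a watch session into a cloze prompt and scoring candidate items by the probability the PLM assigns to their tokens at mask positions, can be sketched as below. This is a minimal illustration, not the paper's implementation: the prompt template, tokenization, and `toy_logprob` scorer are hypothetical stand-ins (a real system would query a masked language model such as BERT). Summing per-token log-probabilities also illustrates one multi-token inference challenge the abstract mentions: longer items accumulate more penalty terms.

```python
# Sketch of session-based recommendation as a multi-token cloze task.
# The template and scorer are illustrative assumptions, not the paper's exact method.

def make_cloze_prompt(session, candidate_len, mask_token="[MASK]"):
    """Turn a watch session into a cloze prompt, with one mask slot per
    sub-token of the candidate item (multi-token titles need several masks)."""
    history = ", ".join(session)
    masks = " ".join([mask_token] * candidate_len)
    return f"A user watched {history}. Now the user wants to watch {masks}."

def score_candidate(session, candidate_tokens, token_logprob):
    """Score a candidate by summing the log-probabilities a (hypothetical)
    PLM assigns to each of its tokens at the corresponding mask position."""
    prompt = make_cloze_prompt(session, len(candidate_tokens))
    return sum(token_logprob(prompt, i, tok)
               for i, tok in enumerate(candidate_tokens))

# Toy stand-in for a PLM's masked-token log-probability (assumption: a real
# recommender would call a model like BERT here).
def toy_logprob(prompt, position, token):
    return -1.0 if token.lower() in prompt.lower() else -5.0

session = ["The Matrix", "Inception"]
candidates = {"Interstellar": ["Inter", "stellar"], "Titanic": ["Titanic"]}
ranked = sorted(candidates,
                key=lambda c: score_candidate(session, candidates[c], toy_logprob),
                reverse=True)
print(ranked)
```

Note how the two-token candidate accrues two penalty terms while the single-token one accrues only one; this kind of length effect is one reason multi-token inference is harder than single-token cloze filling.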
