WR-ONE2SET: Towards Well-Calibrated Keyphrase Generation

2022-11-13

Binbin Xie, Xiangpeng Wei, Baosong Yang, Huan Lin, Jun Xie, Xiaoli Wang, Min Zhang, Jinsong Su


Code

github.com/deeplearnxmu/wr-one2set
Official · In paper · PyTorch · ★ 8

Abstract

Keyphrase generation aims to automatically generate short phrases summarizing an input document. The recently emerged ONE2SET paradigm (Ye et al., 2021) generates keyphrases as a set and has achieved competitive performance. Nevertheless, we observe serious calibration errors in the outputs of ONE2SET, especially the over-estimation of the ∅ token (which means "no corresponding keyphrase"). In this paper, we analyze this limitation in depth and identify two main reasons behind it: 1) the parallel generation has to introduce excessive ∅ as padding tokens into training instances; and 2) the training mechanism that assigns a target to each slot is unstable and further aggravates the ∅ token over-estimation. To make the model well-calibrated, we propose WR-ONE2SET, which extends ONE2SET with an adaptive instance-level cost Weighting strategy and a target Re-assignment mechanism. The former dynamically penalizes the over-estimated slots for different instances, thus smoothing the uneven training distribution. The latter refines the original inappropriate assignment and reduces the supervisory signals of over-estimated slots. Experimental results on commonly used datasets demonstrate the effectiveness and generality of our proposed paradigm.
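To make the notion of calibration error concrete, the following is a minimal sketch of expected calibration error (ECE), a standard way to compare a model's predicted confidence with how often it is actually right. It is not taken from the paper or its repository, and the abstract does not say which calibration metric the authors report; the function name `expected_calibration_error`, its parameters, and the equal-width binning are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Illustrative ECE with equal-width bins (hypothetical helper, not the paper's code).

    confidences: predicted probabilities in [0, 1], one per prediction.
    correct:     booleans, True where the prediction was right.
    Returns the bin-size-weighted mean of |accuracy - mean confidence|.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # A large gap means the model is over- or under-confident in this bin.
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece
```

Under this sketch, a model that predicts the ∅ token with high confidence for slots that should have held a real keyphrase would show low accuracy in the high-confidence bins, and therefore a large ECE.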

Tasks

Keyphrase Generation
