Speeding Up Entmax
2021-12-17 · ACL ARR December 2022
Anonymous
Abstract
Softmax is the de facto standard for normalizing logits in modern neural networks for language processing. However, because it produces a dense probability distribution, every token in the vocabulary has a nonzero chance of being selected at each generation step, which leads to a variety of reported problems in text generation. The α-entmax of Peters et al. (2019) solves this problem, but is unfortunately slower than softmax. In this paper, we propose an alternative to α-entmax that keeps its virtuous characteristics but is as fast as optimized softmax, and that achieves on-par or better performance on the machine translation task.
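As a concrete illustration of the contrast the abstract draws, here is a minimal sketch comparing softmax's dense output with the sparse output of α-entmax at α = 1.5, using PyTorch and the open-source `entmax` package released alongside Peters et al. (2019). This shows only the baseline behavior the abstract describes; the paper's proposed faster alternative is not detailed in this listing, so it is not reproduced here.

```python
# Contrast: softmax assigns nonzero probability to every token, while
# alpha-entmax (here alpha = 1.5 via entmax15) zeroes out low-scoring ones.
# Requires: pip install torch entmax
import torch
from entmax import entmax15

logits = torch.tensor([3.0, 1.5, 1.2, -0.5, -2.0])

dense = torch.softmax(logits, dim=-1)  # dense: every entry strictly positive
sparse = entmax15(logits, dim=-1)      # sparse: tail entries are exactly 0

assert (dense > 0).all()    # softmax never rules a token out
assert (sparse == 0).any()  # entmax assigns exactly zero mass to unlikely tokens
print(dense)
print(sparse)
```

The sparsity is what lets α-entmax avoid the degenerate-sampling problems attributed to softmax: tokens outside the support simply cannot be generated.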