Speech Disfluencies occur at Higher Perplexities

2020-12-01 · COLING (CogALex) 2020

Priyanka Sen

Abstract

Speech disfluencies have been hypothesized to occur before words that are less predictable and therefore more cognitively demanding. In this paper, we revisit this hypothesis by using OpenAI’s GPT-2 to calculate predictability of words as language model perplexity. Using the Switchboard corpus, we find that 51% of disfluencies occur at the highest, second highest, or within one token of the highest perplexity, and this distribution is not random. We also show that disfluencies precede words with significantly higher perplexity than fluent contexts. Based on our results, we offer new evidence that disfluencies are more likely to occur before less predictable words.
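The positional analysis described in the abstract can be illustrated with a minimal sketch. It assumes that per-token natural-log probabilities for an utterance are already available from a language model (the paper uses GPT-2; the numbers below are made up for illustration). The function names, the reading of "per-token perplexity" as exp(-log p), and the order in which the three categories are checked are assumptions of this sketch, not the paper's published procedure.

```python
import math


def token_perplexities(log_probs):
    """Convert natural-log token probabilities to per-token perplexities.

    Assumption: per-token perplexity is taken as exp(-log p), i.e. 1 / p.
    """
    return [math.exp(-lp) for lp in log_probs]


def disfluency_position_category(perplexities, index):
    """Classify a token position relative to the perplexity peaks.

    `index` is the position of the token that follows the disfluency
    (hypothetical convention). Categories are checked in this order:
    highest, second highest, within one token of the highest, other.
    """
    # Token positions sorted from highest to lowest perplexity.
    order = sorted(range(len(perplexities)),
                   key=lambda i: perplexities[i], reverse=True)
    highest = order[0]
    if index == highest:
        return "highest"
    if len(order) > 1 and index == order[1]:
        return "second_highest"
    if abs(index - highest) == 1:
        return "within_one_of_highest"
    return "other"


# Made-up log-probabilities for a five-token utterance.
example_log_probs = [-1.0, -4.0, -0.5, -2.0, -0.2]
example_ppl = token_perplexities(example_log_probs)

for position in range(len(example_ppl)):
    print(position, disfluency_position_category(example_ppl, position))
```

Counting how many disfluencies in a corpus fall into the first three categories gives the kind of proportion the abstract reports (51% on Switchboard); testing whether that distribution is non-random requires a separate significance test that this sketch does not cover.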
