Improved POS tagging for spontaneous, clinical speech using data augmentation

2023-07-11

Seth Kulick, Neville Ryant, David J. Irwin, Naomi Nevler, Sunghye Cho

Abstract

This paper addresses the problem of improving POS tagging of transcripts of speech from clinical populations. In contrast to prior work on parsing and POS tagging of transcribed speech, we do not use an in-domain treebank for training. Instead, we train on an out-of-domain treebank of newswire, using data augmentation techniques to make its structures resemble natural, spontaneous speech. We trained a parser with and without the augmented data and evaluated its performance against manually validated POS tags for clinical speech produced by patients with various types of neurodegenerative conditions.
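
The abstract does not say which augmentations were applied, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes, hypothetically, that making newswire "resemble speech" means dropping punctuation, lowercasing tokens, and randomly inserting filled pauses tagged with the Penn Treebank interjection tag `UH`. The function name `augment`, the `FILLERS` list, and the `filler_prob` parameter are all invented for this example.

```python
import random

# Penn Treebank tags for punctuation tokens (assumed tag set for the newswire data).
PUNCT_TAGS = {",", ".", ":", "``", "''", "-LRB-", "-RRB-"}
# Hypothetical filled pauses to insert; not taken from the paper.
FILLERS = ["uh", "um"]


def augment(tagged, filler_prob=0.1, seed=0):
    """Return a speech-like copy of a list of (token, tag) pairs.

    Punctuation is removed, tokens are lowercased, and before each
    remaining token a filled pause is inserted with probability filler_prob.
    """
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    out = []
    for token, tag in tagged:
        if tag in PUNCT_TAGS:
            continue  # transcribed speech typically has no punctuation tokens
        if rng.random() < filler_prob:
            out.append((rng.choice(FILLERS), "UH"))
        out.append((token.lower(), tag))
    return out


sentence = [
    ("The", "DT"), ("market", "NN"), ("fell", "VBD"), (",", ","),
    ("analysts", "NNS"), ("said", "VBD"), (".", "."),
]
augmented = augment(sentence, filler_prob=0.3, seed=1)
```

Under these assumptions, a tagger or parser would be trained on the union of the original and augmented sentences, and the comparison described in the abstract would contrast that model with one trained on the original newswire alone.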
