On the Relevance of Syntactic and Discourse Features for Author Profiling and Identification
2017-04-01 · EACL 2017
Juan Soler-Company, Leo Wanner
Abstract
Most approaches to author profiling and author identification focus mainly on lexical features, i.e., on the content of a text. We argue that syntactic and discourse features play a significantly more prominent role than they have been given in the past. We show that they achieve state-of-the-art performance in author and gender identification on a literary corpus while keeping the feature set small: it comprises only 188 features and still outperforms the winner of the PAN 2014 shared task on author verification in the literary genre.