GLAD: Groningen Lightweight Authorship Detection

2015-09-06 · Code Available

Manuela Hürlimann, Benno Weck, Esther van den Berg, Simon Šuster, and Malvina Nissim


Abstract

We present a simple and effective approach to authorship verification for Dutch, English, Spanish and Greek, which can be easily ported to other languages. We train a binary linear classifier both on the features describing known and unknown documents individually, and on the joint features comparing these two types of documents. The list of feature types includes, among others, character n-grams, lexical overlap, visual text properties and a compression measure. We obtain competitive results that outperform the baseline and position our system among the top PAN shared task participants.
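
To make the idea of joint features concrete, here is a minimal sketch of three comparison features for a (known, unknown) document pair: character n-gram similarity, lexical overlap, and a compression measure. The abstract does not give the exact formulas, so everything here is an assumption for illustration: the function names, the choice of trigrams, cosine similarity, Jaccard overlap, and the zlib-based ratio are all hypothetical and not taken from the GLAD code.

```python
import math
import zlib
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (n=3 is an assumed default)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexical_overlap(known, unknown):
    """Jaccard overlap of lowercased whitespace tokens (assumed definition)."""
    k = set(known.lower().split())
    u = set(unknown.lower().split())
    union = k | u
    return len(k & u) / len(union) if union else 0.0


def compression_ratio(known, unknown):
    """Compressed size of the concatenation relative to the parts.

    Lower values suggest the two texts share more redundancy.
    This particular ratio is an assumption, not the paper's measure.
    """
    def size(s):
        return len(zlib.compress(s.encode("utf-8")))
    return size(known + unknown) / (size(known) + size(unknown))


def joint_features(known, unknown):
    """Feature vector comparing a known and an unknown document."""
    return [
        cosine(char_ngrams(known), char_ngrams(unknown)),
        lexical_overlap(known, unknown),
        compression_ratio(known, unknown),
    ]
```

A vector like this, possibly concatenated with per-document features, could then be passed to any binary linear classifier; the training step is omitted here because the abstract does not specify it.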
