TwiSty: A Multilingual Twitter Stylometry Corpus for Gender and Personality Profiling

2016-05-01 · LREC 2016

Ben Verhoeven, Walter Daelemans, Barbara Plank

Abstract

Personality profiling is the task of detecting personality traits of authors based on writing style. Several personality typologies exist; however, the Myers-Briggs Type Indicator (MBTI) is particularly popular in the non-scientific community, and many people use it to analyse their own personality and talk about the results online. Therefore, large amounts of self-assessed data on MBTI are readily available on social-media platforms such as Twitter. We present a novel corpus of tweets annotated with the MBTI personality type and gender of their author for six Western European languages (Dutch, German, French, Italian, Portuguese and Spanish). We outline the corpus creation and annotation, show statistics of the obtained data distributions, and present first baselines on Myers-Briggs personality profiling and gender prediction for all six languages.
