Acquiring a Formality-Informed Lexical Resource for Style Analysis

2021-04-01 · EACL 2021 · Code Available

Elisabeth Eder, Ulrike Krieg-Holz, Udo Hahn

Abstract

To track different levels of formality in written discourse, we introduce a novel type of lexicon for the German language, with entries ordered by their degree of (in)formality. We start with a set of words extracted from traditional lexicographic resources, extend it by sentence-based similarity computations, and let crowdworkers assess the enlarged set of lexical items on a continuous informal-formal scale as a gold standard for evaluation. We submit this lexicon to an intrinsic evaluation related to the best regression models and their effect on predicting formality scores and complement our investigation by an extrinsic evaluation of formality on a German-language email corpus.
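The lexicon-extension step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes sentence-based similarity computations, whereas this example substitutes plain cosine similarity over word vectors. The vectors, the `expand` helper, and the 0.97 threshold are all hypothetical values chosen for illustration.

```python
import math

# Toy 3-dimensional word vectors. These numbers are invented for
# illustration and do not come from the paper or any trained model.
VECTORS = {
    "erhalten": [0.9, 0.1, 0.0],
    "bekommen": [0.7, 0.3, 0.1],
    "kriegen": [0.1, 0.9, 0.2],
    "abkriegen": [0.2, 0.8, 0.3],
    "empfangen": [0.8, 0.2, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand(seeds, vectors, threshold=0.97):
    """Return non-seed words that are close to at least one seed word.

    Hypothetical helper: a word becomes a candidate when its cosine
    similarity to any seed reaches the (assumed) threshold.
    """
    candidates = set()
    for word, vec in vectors.items():
        if word in seeds:
            continue
        if any(cosine(vec, vectors[s]) >= threshold for s in seeds):
            candidates.add(word)
    return sorted(candidates)


# Seed set standing in for words taken from lexicographic resources.
expanded = expand({"erhalten", "kriegen"}, VECTORS)
print(expanded)
```

In the workflow the abstract describes, candidates produced by a step like this would then be rated by crowdworkers on the informal-formal scale; the sketch stops before that stage.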
