A Corpus for Multilingual Analysis of Online Terms of Service

2021-11-01 · EMNLP (NLLP) 2021

Kasper Drawzeski, Andrea Galassi, Agnieszka Jablonowska, Francesca Lagioia, Marco Lippi, Hans Wolfgang Micklitz, Giovanni Sartor, Giacomo Tagiuri, Paolo Torroni

Abstract

We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service. The data set comprises a total of 100 contracts, obtained from 25 documents annotated in four different languages: English, German, Italian, and Polish. For each contract, potentially unfair clauses for the consumer are annotated, for nine different unfairness categories. We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages.
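The abstract does not spell out the projection procedure, so the following is only a minimal sketch of one way embedding-based annotation projection could work: each annotated source-language sentence is matched to its most similar target-language sentence by cosine similarity, and its labels are copied if the similarity clears a threshold. The function name `project_annotations`, the `threshold` value, the label `unfair_category_a`, and the hand-made three-dimensional vectors (stand-ins for the output of a multilingual sentence encoder) are all assumptions made for illustration, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def project_annotations(src_vecs, src_labels, tgt_vecs, threshold=0.8):
    """Copy labels from annotated source sentences to target sentences.

    src_vecs   -- one embedding per source-language sentence
    src_labels -- one set of category labels per source sentence
    tgt_vecs   -- one embedding per target-language sentence
    threshold  -- hypothetical minimum similarity for a transfer
    """
    projected = [set() for _ in tgt_vecs]
    if not tgt_vecs:
        return projected
    for s_vec, labels in zip(src_vecs, src_labels):
        if not labels:
            continue  # only annotated sentences need a counterpart
        sims = [cosine(s_vec, t_vec) for t_vec in tgt_vecs]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] >= threshold:
            projected[best] |= set(labels)
    return projected


# Toy data: hand-made vectors standing in for real sentence embeddings.
src_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
src_labels = [set(), {"unfair_category_a"}, set()]
tgt_vecs = [[0.9, 0.1, 0.0], [0.1, 0.95, 0.05], [0.0, 0.1, 0.9]]

projected = project_annotations(src_vecs, src_labels, tgt_vecs)
print(projected)
```

In this toy run only the second target sentence receives a label, because it is the nearest neighbour of the single annotated source sentence. In practice the vectors would come from an encoder that maps sentences from different languages into a shared space, and the threshold would need tuning on held-out annotated data.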
