Fairness in Representation for Multilingual NLP: Insights from Controlled Experiments on Conditional Language Modeling

2021-09-29 · ICLR 2022

Ada Wan

Abstract

We perform systematically and fairly controlled experiments with the 6-layer Transformer to investigate whether languages that have traditionally been considered morphologically rich (AR and RU) and poor (ZH) are equally hard to conditional-language-model. We evaluate through statistical comparisons across the 30 possible language directions among the 6 languages of the United Nations Parallel Corpus on 3 representation levels: character, byte, and word. Results show that performance is relative to the representation granularity of each of the languages, not to the language as a whole. By eliminating statistically significant performance disparity on the character and byte levels, we show that performance disparity is not a necessary condition. The disparity that mirrors the morphological complexity hierarchy is a byproduct of word segmentation. Evidence from data statistics, along with the fact that word segmentation is qualitatively indeterminate, renders a decades-long debate on morphological complexity (unless it is being intentionally modeled in a word-based, meaning-driven context) irrelevant in the context of computing. The intent of our work is to help effect more objectivity and adequacy in evaluation, as well as fairness and inclusivity in experimental setup, in the area of language and computing, so as to uphold diversity in ML and AI research. Multilinguality is real and relevant in computing not due to canonical, structural linguistic concepts such as morphology or "words" in our minds, but rather due to standards related to internationalization and localization, such as character encoding, something which has thus far been sorely overlooked in our discourse and curricula.
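To make the three representation levels and the count of 30 directions concrete, here is a minimal illustrative sketch. It is not the paper's code. The language codes other than AR, RU, and ZH, the helper names `directions` and `sequence_lengths`, the sample strings, the choice of UTF-8, and the use of whitespace splitting as a stand-in for word segmentation are all assumptions made for illustration.

```python
from itertools import permutations

# The six official UN languages (codes other than AR, RU, ZH are assumed here).
LANGS = ["AR", "EN", "ES", "FR", "RU", "ZH"]


def directions(langs):
    """Return every ordered (source, target) pair of distinct languages."""
    return list(permutations(langs, 2))


def sequence_lengths(text):
    """Length of the same text on three representation levels.

    'word' uses naive whitespace splitting, which is only a placeholder:
    it cannot segment unspaced scripts such as Chinese, so the word-level
    count depends on a segmentation choice rather than on the text alone.
    """
    return {
        "character": len(text),             # Unicode code points
        "byte": len(text.encode("utf-8")),  # UTF-8 encoded bytes (assumed encoding)
        "word": len(text.split()),          # whitespace tokens (assumption)
    }


if __name__ == "__main__":
    print(len(directions(LANGS)))  # 6 * 5 ordered pairs
    for sample in ("peace", "мир", "和平"):
        print(sample, sequence_lengths(sample))
```

Under these assumptions, the same short string has different lengths depending on the level: the Russian and Chinese samples are each 6 bytes in UTF-8 but 3 and 2 characters respectively, which is the kind of granularity-dependent difference the abstract refers to.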
