
Lexical Normalization

Lexical normalization is the task of transforming non-standard text into its standard register.

Example:

new pix comming tomoroe
new pictures coming tomorrow

Datasets usually consist of tweets, since these naturally contain a fair amount of non-standard language.

For lexical normalization, only replacements on the word level are annotated. Some corpora include annotation for 1-N and N-1 replacements. However, word insertion, deletion, and reordering are not part of the task.
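The word-level replacement setup above can be sketched with a small dictionary-based normalizer. This is a minimal illustration, not any published system: the dictionary entries are made-up examples, and real systems learn candidate replacements from data.

```python
# Hypothetical replacement dictionary mapping non-standard tokens to
# their standard forms. A 1-N replacement maps one token to several words.
NORM_DICT = {
    "pix": "pictures",            # 1-1 replacement
    "comming": "coming",          # 1-1 replacement (spelling)
    "tomoroe": "tomorrow",        # 1-1 replacement (spelling)
    "lol": "laughing out loud",   # 1-N replacement
}

def normalize(tokens):
    """Replace each non-standard token with its standard form, if known.

    Tokens absent from the dictionary pass through unchanged; insertion,
    deletion, and reordering are deliberately not modeled.
    """
    out = []
    for tok in tokens:
        out.extend(NORM_DICT.get(tok, tok).split())
    return out

print(normalize("new pix comming tomoroe".split()))
# ['new', 'pictures', 'coming', 'tomorrow']
```

N-1 replacements (several tokens merging into one word, e.g. a split compound) would require matching multi-token spans, which a per-token lookup like this cannot express.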

Papers

Showing 21-30 of 47 papers

- Norm It! Lexical Normalization for Italian and Its Downstream Effects for Dependency Parsing
- Sequence-to-Sequence Lexical Normalization with Multilingual Transformers
- Sesame Street to Mount Sinai: BERT-constrained character-level Moses models for multilingual lexical normalization
- Shared Tasks of the 2015 Workshop on Noisy User-generated Text: Twitter Lexical Normalization and Named Entity Recognition
- Synthetic Data for English Lexical Normalization: How Close Can We Get to Manually Annotated Data?
- The Denoised Web Treebank: Evaluating Dependency Parsing under Noisy Input Conditions
- Towards Shared Datasets for Normalization Research
- To What Extent Does Lexical Normalization Help English-as-a-Second Language Learners to Read Noisy English Texts?
- Tweet Normalization with Syllables
- Accurate Word Segmentation and POS Tagging for Japanese Microblogs: Corpus Annotation and Joint Modeling with Lexical Normalization

Benchmark Results

# | Model          | Metric   | Claimed | Verified | Status
1 | MoNoise        | Accuracy | 87.63   | n/a      | Unverified
2 | Syllable based | Accuracy | 86.08   | n/a      | Unverified
3 | TextNorm       | Accuracy | 83.94   | n/a      | Unverified
4 | unLOL          | Accuracy | 82.06   | n/a      | Unverified
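The accuracy figures above are word-level scores. A minimal sketch of how such a score can be computed, assuming the system output and gold annotation are already token-aligned (1-N and N-1 replacements would first need to be merged into aligned units, which this sketch does not handle):

```python
def word_accuracy(system, gold):
    """Fraction of aligned positions where the system's output word
    matches the gold normalization. Assumes a 1-1 token alignment."""
    if len(system) != len(gold):
        raise ValueError("system and gold must be token-aligned")
    correct = sum(s == g for s, g in zip(system, gold))
    return correct / len(gold)

# Toy example: the system fixed 3 of 4 tokens, missing "comming".
sys_out = ["new", "pictures", "comming", "tomorrow"]
gold    = ["new", "pictures", "coming", "tomorrow"]
print(word_accuracy(sys_out, gold))  # 0.75
```

Note that shared tasks differ in whether already-standard words count toward the score; the exact definition depends on the benchmark's evaluation script.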