SOTAVerified

Lexical Normalization

Lexical normalization is the task of transforming non-standard text into a standard register.

Example:

new pix comming tomoroe
new pictures coming tomorrow

Datasets usually consist of tweets, since these naturally contain a fair amount of non-standard language.

For lexical normalization, only replacements on the word level are annotated. Some corpora include annotation for 1-N and N-1 replacements. However, word insertion, deletion, and reordering are not part of the task.
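The word-level constraint can be illustrated with a minimal sketch. It assumes a small hand-written lookup table (`NORMALIZATION_LEXICON` is hypothetical, as is the `gonna` entry); actual systems typically generate and rank candidates rather than rely on a fixed dictionary. Each input token is replaced independently, so the output keeps one slot per input token even when a replacement expands to several words (1-N).

```python
# Hypothetical lookup table for illustration only; not taken from any corpus.
NORMALIZATION_LEXICON = {
    "pix": "pictures",
    "comming": "coming",
    "tomoroe": "tomorrow",
    "gonna": "going to",  # 1-N replacement: one token maps to two words
}


def normalize(tokens):
    """Replace each token independently.

    No tokens are inserted, deleted, or reordered, so the result always
    has the same number of slots as the input.
    """
    return [NORMALIZATION_LEXICON.get(tok.lower(), tok) for tok in tokens]


source = "new pix comming tomoroe".split()
normalized = normalize(source)
print(" ".join(normalized))
```

Because a 1-N replacement stays inside a single slot, the output remains aligned token by token with the input, which is what makes word-level annotation and evaluation straightforward.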

Papers

Showing 26-47 of 47 papers

Title | Status | Hype
Contrastive String Representation Learning using Synthetic Data |  | 0
Enhancing BERT for Lexical Normalization |  | 0
Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text |  | 0
IHS_RD: Lexical Normalization for English Tweets |  | 0
Lexical Normalization for Code-switched Data and its Effect on POS-tagging |  | 0
Lexical Normalization of User-Generated Medical Text |  | 0
Multilingual Sequence Labeling Approach to solve Lexical Normalization |  | 0
NCSU-SAS-Ning: Candidate Generation and Feature Engineering for Supervised Lexical Normalization |  | 0
Adapting Sequence to Sequence models for Text Normalization in Social Media | Code | 0
ViSoLex: An Open-Source Repository for Vietnamese Social Media Lexical Normalization | Code | 0
Increasing Robustness for Cross-domain Dialogue Act Classification on Social Media Data | Code | 0
A Multi-cascaded Deep Model for Bilingual SMS Classification | Code | 0
A Clustering Framework for Lexical Normalization of Roman Urdu | Code | 0
Lexical Normalization for Code-switched Data and its Effect on POS Tagging | Code | 0
Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text | Code | 0
Automatic Textual Normalization for Hate Speech Detection | Code | 0
Modeling Input Uncertainty in Neural Network Dependency Parsing | Code | 0
MoNoise: A Multi-lingual and Easy-to-use Lexical Normalization Tool | Code | 0
MoNoise: Modeling Noise Using a Modular Normalization System | Code | 0
DaN+: Danish Nested Named Entities and Lexical Normalization | Code | 0
MultiLexNorm: A Shared Task on Multilingual Lexical Normalization | Code | 0
User-Generated Text Corpus for Evaluating Japanese Morphological Analysis and Lexical Normalization | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MoNoise | Accuracy | 87.63 |  | Unverified
2 | Syllable based | Accuracy | 86.08 |  | Unverified
3 | TextNorm | Accuracy | 83.94 |  | Unverified
4 | unLOL | Accuracy | 82.06 |  | Unverified
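The page does not say how the accuracy figures above are computed. As a rough sketch of one plausible definition (an assumption, not the benchmark's documented protocol), word-level accuracy can be taken as the fraction of token slots where the predicted normalization matches the gold annotation. The function name `word_accuracy` and the sample data are hypothetical.

```python
def word_accuracy(predicted, gold):
    """Fraction of aligned token slots where prediction equals gold.

    Assumes both sequences are aligned one slot per source token, which
    holds when only word-level replacements are allowed.
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical system output that misses the last token.
gold = ["new", "pictures", "coming", "tomorrow"]
predicted = ["new", "pictures", "coming", "tomoroe"]
score = word_accuracy(predicted, gold)
print(f"{score:.2%}")
```

Some evaluations instead count only the tokens that need normalization, which yields different numbers for the same system, so scores are comparable only when the definition is the same.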