
Lemmatization

Lemmatization is the process of determining the base or dictionary form (lemma) of a given surface form. For languages with rich morphology in particular, normalizing words to their base forms is important for applications such as search engines and linguistic studies. The main difficulties in lemmatization arise from encountering previously unseen words at inference time and from disambiguating surface forms that can be inflected variants of several different base forms depending on the context.

Source: Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks
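
The two difficulties named in the description, ambiguous surface forms and unseen words, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `LEXICON`, the `SUFFIX_RULES` fallback, and the use of a part-of-speech tag as a stand-in for context. It is not the method of the cited paper, which uses a sequence-to-sequence model.

```python
# Toy lexicon (hypothetical): (surface form, part-of-speech tag) -> lemma.
# The POS tag stands in for context and resolves ambiguous surface forms.
LEXICON = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
    ("mice", "NOUN"): "mouse",
}

# Crude fallback for unseen words (hypothetical): (suffix, replacement),
# tried in order. Real systems learn such transformations from data.
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]


def lemmatize(form: str, pos: str) -> str:
    """Return a lemma for `form`, using `pos` to disambiguate."""
    word = form.lower()
    key = (word, pos)
    if key in LEXICON:
        return LEXICON[key]
    for suffix, replacement in SUFFIX_RULES:
        # Keep a stem of at least 3 characters to limit over-stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    # No rule applies: fall back to the lowercased surface form.
    return word


if __name__ == "__main__":
    print(lemmatize("saw", "VERB"))        # ambiguous form, verb reading
    print(lemmatize("saw", "NOUN"))        # same form, noun reading
    print(lemmatize("treebanks", "NOUN"))  # unseen word, suffix fallback
```

The same surface form "saw" maps to different lemmas depending on the tag, while "treebanks" is absent from the lexicon and is handled only by the fallback rule. The fallback is deliberately naive: it turns "running" into "runn", which is the kind of error that motivates learned, context-sensitive lemmatizers.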

Papers


Showing 41–50 of 351 papers

Title	Date	Tasks	Status	Score
Training Data Augmentation for Context-Sensitive Neural Lemmatization Using Inflection Tables and Raw Text	Apr 2, 2019	Data Augmentation, LEMMA	Code Available	5
Heidelberg-Boston @ SIGTYP 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers	May 30, 2024	Lemmatization, Morphological Tagging	Code Available	5
Development of a Hindi Lemmatizer	May 24, 2013	Lemmatization, Machine Translation	Code Available	5
Imitation Learning for Neural Morphological String Transduction	Aug 31, 2018	Imitation Learning, Lemmatization	Code Available	5
Integrated Sequence Tagging for Medieval Latin Using Deep Representation Learning	Mar 4, 2016	LEMMA, Lemmatization	Code Available	5
IUCM at SemEval-2018 Task 11: Similar-Topic Texts as a Comprehension Knowledge Source	Jun 1, 2018	Clustering, Lemmatization	Code Available	5
Evaluating Shortest Edit Script Methods for Contextual Lemmatization	Mar 25, 2024	LEMMA, Lemmatization	Code Available	5
Cross-Lingual Lemmatization and Morphology Tagging with Two-Stage Multilingual BERT Fine-Tuning	Aug 1, 2019	Lemmatization, Morphological Analysis	Code Available	5
CMU-01 at the SIGMORPHON 2019 Shared Task on Crosslinguality and Context in Morphology	Jul 23, 2019	LEMMA, Lemmatization	Code Available	5
Cross-lingual Named Entity Corpus for Slavic Languages	Mar 30, 2024	LEMMA, Lemmatization	Code Available	5



No leaderboard results yet.