Lemmatization

Lemmatization is the process of determining the base or dictionary form (lemma) of a given surface form. For languages with rich morphology in particular, normalizing words to their base forms matters for applications such as search engines and linguistic studies. The main difficulties arise from encountering previously unseen words at inference time and from disambiguating surface forms that can be inflected variants of several different base forms depending on context.

Source: Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks
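
To make the two difficulties named in the description concrete, here is a minimal Python sketch. It is a toy lookup-based lemmatizer, not the sequence-to-sequence model from the cited paper. The lexicon entries, the coarse part-of-speech tags, the suffix rules, and the function name `lemmatize` are all assumptions made up for illustration. The part-of-speech tag stands in for context when a surface form is ambiguous, and a crude suffix-stripping fallback stands in for handling unseen words.

```python
# Toy lexicon: (surface form, coarse POS tag) -> lemma.
# All entries are illustrative assumptions, not data from any treebank.
LEXICON = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
    ("mice", "NOUN"): "mouse",
}

# Crude suffix rules, tried in order, used only for forms missing from the lexicon.
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]


def lemmatize(form: str, pos: str) -> str:
    """Return a lemma for `form`, using `pos` to resolve ambiguous forms."""
    word = form.lower()
    # Known form: the POS tag selects among competing lemmas.
    if (word, pos) in LEXICON:
        return LEXICON[(word, pos)]
    # Unseen form: fall back to suffix stripping, keeping at least 3 characters.
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + replacement
    # Nothing applies: return the lowercased form unchanged.
    return word


if __name__ == "__main__":
    print(lemmatize("saw", "VERB"))        # ambiguous form, resolved as a verb
    print(lemmatize("saw", "NOUN"))        # same form, resolved as a noun
    print(lemmatize("Treebanks", "NOUN"))  # unseen form, handled by the fallback
```

The fallback rules here are deliberately naive and will produce wrong lemmas for many words. That weakness on unseen forms is one reason learned models are used for this task.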

Papers

Showing 301–310 of 351 papers

Title	Date	Tasks	Status
The Floating Arabic Dictionary: An Automatic Method for Updating a Lexical Database through the Detection and Lemmatization of Unknown Words	Dec 1, 2012	Lemmatization	Unverified
The goo300k corpus of historical Slovene	May 1, 2012	LEMMA, Lemmatization	Unverified
Transformers on Multilingual Clause-Level Morphology	Nov 3, 2022	Data Augmentation, Language Modelling	Code Available
Towards JointUD: Part-of-speech Tagging and Lemmatization using Recurrent Neural Networks	Sep 10, 2018	Dependency Parsing, Lemmatization	Code Available
SoMaJo: State-of-the-art tokenization for German web and social media texts	Aug 1, 2016	Lemmatization	Code Available
LemmaTag: Jointly Tagging and Lemmatizing for Morphologically-Rich Languages with BRNNs	Aug 10, 2018	Lemmatization, Part-Of-Speech Tagging	Code Available
LemmaTag: Jointly Tagging and Lemmatizing for Morphologically Rich Languages with BRNNs	Oct 1, 2018	Lemmatization, Machine Translation	Code Available
SoMeWeTa: A Part-of-Speech Tagger for German Social Media and Web Texts	May 1, 2018	Domain Adaptation, Lemmatization	Code Available
The Role of Interpretable Patterns in Deep Learning for Morphology	Dec 8, 2020	Decoder, Deep Learning	Code Available
Analyzing Pre-processing Settings for Urdu Single-document Extractive Summarization	May 1, 2016	Extractive Summarization, Lemmatization	Code Available

No leaderboard results yet.