Lemmatization

Lemmatization is the process of determining the base or dictionary form (lemma) of a given surface form. For languages with rich morphology in particular, normalizing words to their base forms is important for applications such as search engines and linguistic studies. The main difficulties arise from previously unseen words encountered at inference time and from ambiguous surface forms, which can be inflected variants of several different base forms depending on the context.

Source: Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks
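
The two difficulties named in the description, ambiguous surface forms and unseen words, can be illustrated with a minimal lookup-based sketch. This is not the sequence-to-sequence method of the cited paper; the toy `LEXICON`, the coarse part-of-speech tags used as context, and the `lemmatize` helper are all assumptions made for illustration only.

```python
from typing import Optional

# Toy lexicon (hypothetical, for illustration): (surface form, POS tag) -> lemma.
LEXICON = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
    ("running", "VERB"): "run",
}


def lemmatize(surface: str, pos: Optional[str] = None) -> str:
    """Return a lemma for `surface`, using a POS tag as the only context."""
    form = surface.lower()

    # Context resolves the ambiguity when the (form, tag) pair is known.
    if pos is not None and (form, pos) in LEXICON:
        return LEXICON[(form, pos)]

    # Without usable context, accept the lemma only if it is unambiguous.
    candidates = {lemma for (f, _), lemma in LEXICON.items() if f == form}
    if len(candidates) == 1:
        return next(iter(candidates))

    # Unseen word, or still ambiguous: fall back to the surface form itself.
    return form


print(lemmatize("saw", "VERB"))   # context selects the verb reading
print(lemmatize("saw", "NOUN"))   # context selects the noun reading
print(lemmatize("leaves"))        # ambiguous without context, falls back
print(lemmatize("blorfed"))       # unseen word, falls back
```

A pure lookup like this cannot generalize to words outside its lexicon, which is why learned models that generate the lemma from the surface form and its context are used in practice.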

Papers

Showing 331–340 of 351 papers

Title	Date	Tasks	Status
bleu2vec: the Painfully Familiar Metric on Continuous Vector Space Steroids	Sep 1, 2017	Lemmatization, Machine Translation	Unverified
Breaking the Fake News Barrier: Deep Learning Approaches in Bangla Language	Jan 30, 2025	Lemmatization	Unverified
Build Fast and Accurate Lemmatization for Arabic	Oct 18, 2017	Information Retrieval, Lemmatization	Unverified
Building a Lemmatizer and a Spell-checker for Sorani Kurdish	Sep 27, 2018	Language Modeling, Language Modelling	Unverified
Building a multilingual parallel corpus for human users	May 1, 2012	Lemmatization	Unverified
Building Large Monolingual Dictionaries at the Leipzig Corpora Collection: From 100 to 200 Languages	May 1, 2012	Lemmatization	Unverified
CBNU System for SIGMORPHON 2019 Shared Task 2: a Pipeline Model	Aug 1, 2019	LEMMA, Lemmatization	Unverified
CELI: An Experiment with Cross Language Textual Entailment	Jul 1, 2012	Lemmatization, Named Entity Recognition (NER)	Unverified
CEPLEXicon ― A Lexicon of Child European Portuguese	May 1, 2016	Lemmatization	Unverified
Character-level Supervision for Low-resource POS Tagging	Jul 1, 2018	Feature Engineering, LEMMA	Unverified

No leaderboard results yet.