Lemmatization

Lemmatization is the process of determining the base or dictionary form (lemma) of a given surface form. For languages with rich morphology in particular, normalizing words to their base forms is important for applications such as search engines and linguistic studies. The main difficulties in lemmatization are handling previously unseen words at inference time and disambiguating surface forms that, depending on context, can be inflected variants of several different base forms.

Source: Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks
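The two difficulties named in the description can be illustrated with a minimal sketch. This is a toy dictionary-and-rules lemmatizer, not the sequence-to-sequence model from the cited paper; the lexicon entries, the POS tag names, the suffix rules, and the function names `candidate_lemmas` and `lemmatize` are all assumptions made up for illustration.

```python
from typing import List, Optional

# Toy lexicon (hypothetical): (surface form, POS tag) -> lemma.
# The same surface form can map to different lemmas depending on context.
LEXICON = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("leaves", "VERB"): "leave",
    ("leaves", "NOUN"): "leaf",
    ("mice", "NOUN"): "mouse",
}

# Hypothetical fallback rules for unseen words: (suffix, replacement),
# tried in order. Real systems learn such transformations from data.
SUFFIX_RULES = [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")]


def candidate_lemmas(form: str) -> List[str]:
    """Return every lemma the lexicon lists for this surface form, sorted."""
    word = form.lower()
    return sorted({lemma for (w, _), lemma in LEXICON.items() if w == word})


def lemmatize(form: str, pos: Optional[str] = None) -> str:
    """Pick a lemma, using the POS tag as a stand-in for sentence context."""
    word = form.lower()

    # 1. Context (here: a POS tag) resolves an ambiguous surface form.
    if pos is not None and (word, pos) in LEXICON:
        return LEXICON[(word, pos)]

    # 2. Without context, only an unambiguous lexicon entry is safe to use.
    candidates = candidate_lemmas(word)
    if len(candidates) == 1:
        return candidates[0]
    if candidates:
        return word  # ambiguous and no context: leave the form unchanged

    # 3. Unseen word: fall back to crude suffix stripping,
    #    keeping a stem of at least three characters.
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word


print(lemmatize("saw", "VERB"))    # see
print(lemmatize("saw", "NOUN"))    # saw
print(candidate_lemmas("leaves"))  # ['leaf', 'leave']
print(lemmatize("studies"))        # study
```

The suffix fallback is deliberately naive: it turns "running" into "runn", for example. Failures of this kind on unseen words are one reason learned, context-aware lemmatizers are used instead of fixed rules.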

Papers

Showing 61–70 of 351 papers

Title	Date	Tasks	Status
CBNU System for SIGMORPHON 2019 Shared Task 2: a Pipeline Model	Aug 1, 2019	LEMMA, Lemmatization	Unverified
CELI: An Experiment with Cross Language Textual Entailment	Jul 1, 2012	Lemmatization, Named Entity Recognition (NER)	Unverified
ACE-2005-PT: Corpus for Event Extraction in Portuguese	Aug 29, 2024	Event Extraction, Lemmatization	Unverified
Abusive and Threatening Language Detection in Urdu using Supervised Machine Learning and Feature Combinations	Apr 6, 2022	Abusive Language, Lemmatization	Unverified
ASOBEK at SemEval-2016 Task 1: Sentence Representation with Character N-gram Embeddings for Semantic Textual Similarity	Jun 1, 2016	Language Modeling, Language Modelling	Unverified
Analysing cross-lingual transfer in lemmatisation for Indian languages	Dec 1, 2020	Cross-Lingual Transfer, Lemmatization	Unverified
A Simple Joint Model for Improved Contextual Neural Lemmatization	Apr 4, 2019	LEMMA, Lemmatization	Unverified
A set of open source tools for Turkish natural language processing	May 1, 2014	Grapheme-to-Phoneme Conversion, Lemmatization	Unverified
Analyse Automatique de l’Ancien Arménien. Évaluation d’une méthode hybride « dictionnaire » et « réseau de neurones » sur un Extrait de l’Adversus Haereses d’Irénée de Lyon	Jun 1, 2022	Lemmatization, Lexical Analysis	Unverified
Adapting the TTL Romanian POS Tagger to the Biomedical Domain	Sep 1, 2017	Chunking, Domain Adaptation	Unverified

No leaderboard results yet.