Lemmatization

Lemmatization is the process of determining the base or dictionary form (lemma) of a given surface form. For languages with rich morphology in particular, normalizing words to their base forms is important for applications such as search engines and linguistic studies. The main difficulties in lemmatization arise from encountering previously unseen words at inference time and from disambiguating surface forms that can be inflected variants of several different base forms depending on the context.

Source: Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks
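
The two difficulties named in the description can be illustrated with a minimal sketch. It is a toy dictionary lookup, not the sequence-to-sequence model of the cited paper; the lexicon entries, the coarse part-of-speech tags, and the function name `lemmatize` are all hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical toy lexicon: (surface form, coarse POS tag) -> lemma.
# A real lemmatizer would learn this mapping or use a full morphological lexicon.
TOY_LEXICON = {
    ("leaves", "NOUN"): "leaf",
    ("leaves", "VERB"): "leave",
    ("saw", "NOUN"): "saw",
    ("saw", "VERB"): "see",
    ("mice", "NOUN"): "mouse",
}


def lemmatize(surface: str, pos: Optional[str] = None) -> str:
    """Return a lemma for `surface`, using `pos` as a stand-in for context."""
    key = surface.lower()

    # Context (here reduced to a POS tag) resolves ambiguous surface forms.
    if pos is not None and (key, pos) in TOY_LEXICON:
        return TOY_LEXICON[(key, pos)]

    # Without usable context, accept the lemma only if it is unambiguous.
    candidates = sorted(
        {lemma for (form, _), lemma in TOY_LEXICON.items() if form == key}
    )
    if len(candidates) == 1:
        return candidates[0]

    # Unseen word, or ambiguous form with no context: fall back to the
    # lowercased surface form. This is the case a lookup cannot handle.
    return key
```

Calling `lemmatize("leaves", "NOUN")` and `lemmatize("leaves", "VERB")` gives different lemmas for the same surface form, while an out-of-lexicon word is returned unchanged. A pure lookup cannot do better on unseen words, which is one reason to generate lemmas with a learned model instead.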

Papers

Showing 311–320 of 351 papers

Title	Date	Tasks	Status
DBTagger: Multi-Task Learning for Keyword Mapping in NLIDBs Using Bi-Directional Recurrent Neural Networks	Jan 11, 2021	Lemmatization, Multi-Task Learning	Code Available
From Text to Lexicon: Bridging the Gap between Word Embeddings and Lexical Resources	Aug 1, 2018	Coreference Resolution, Lemmatization	Code Available
Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization	Sep 14, 2012	Lemmatization, Text Summarization	Code Available
Unsupervised Compound Splitting With Distributional Semantics Rivals Supervised Methods	Jun 1, 2016	Lemmatization	Code Available
Training Data Augmentation for Context-Sensitive Neural Lemmatization Using Inflection Tables and Raw Text	Apr 2, 2019	Data Augmentation, LEMMA	Code Available
Grammatical gender associations outweigh topical gender bias in crosslinguistic word embeddings	May 18, 2020	Cultural Vocal Bursts Intensity Prediction, Lemmatization	Code Available
Training Data Augmentation for Context-Sensitive Neural Lemmatizer Using Inflection Tables and Raw Text	Jun 1, 2019	Data Augmentation, LEMMA	Code Available
Neural Transition-based String Transduction for Limited-Resource Setting in Morphology	Aug 1, 2018	Lemmatization, Machine Translation	Code Available
Stylistic Fingerprints, POS-tags and Inflected Languages: A Case Study in Polish	Jun 5, 2022	Authorship Attribution, Lemmatization	Code Available
Heidelberg-Boston @ SIGTYP 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers	May 30, 2024	Lemmatization, Morphological Tagging	Code Available


No leaderboard results yet.