
Lemmatization

Lemmatization is the process of determining the base or dictionary form (lemma) of a given surface form. For languages with rich morphology in particular, it is important to be able to normalize words into their base forms, for example to better support search engines and linguistic studies. The main difficulties in lemmatization arise from encountering previously unseen words at inference time and from disambiguating surface forms that, depending on the context, can be inflected variants of several different base forms.

Source: Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks
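
The two difficulties named in the description above (ambiguous surface forms and previously unseen words) can be illustrated with a minimal sketch. This is not the method of the cited paper, which uses a sequence-to-sequence model; it is a toy lexicon-plus-rules lemmatizer. The lexicon entries, the suffix rules, the POS tag names, and the function name `lemmatize` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy lexicon: (lowercased surface form, POS tag) -> lemma.
# Keying on the POS tag is one simple way to use context for disambiguation.
LEXICON = {
    ("saw", "VERB"): "see",
    ("saw", "NOUN"): "saw",
    ("mice", "NOUN"): "mouse",
    ("running", "VERB"): "run",
}

# Hypothetical suffix rules (suffix, replacement), tried in order,
# used only as a fallback for forms missing from the lexicon.
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]


def lemmatize(form: str, pos: str) -> str:
    """Return a lemma for `form`, given a context-derived POS tag."""
    word = form.lower()

    # 1. Ambiguous forms: the same surface form maps to different lemmas
    #    depending on the tag supplied by the context.
    if (word, pos) in LEXICON:
        return LEXICON[(word, pos)]

    # 2. Unseen forms: strip the first matching suffix, but only if at
    #    least three characters of stem remain.
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)] + replacement

    # 3. Nothing applies: assume the form is already its own lemma.
    return word
```

For example, `lemmatize("saw", "VERB")` yields `"see"` while `lemmatize("saw", "NOUN")` yields `"saw"`, and the unseen form `"parties"` falls through to the suffix rules and becomes `"party"`. The fallback is deliberately naive: `"making"` is stripped to `"mak"`, which shows why unseen words remain a hard case for rule-based approaches.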

Papers

Showing 321–330 of 351 papers

Title	Date	Tasks	Status
Lexicon and Rule-based Word Lemmatization Approach for the Somali Language	Aug 3, 2023	Articles, Information Retrieval	Code Available
NLP-Cube: End-to-End Raw Text Processing With Neural Networks	Oct 1, 2018	Lemmatization, Sentence	Code Available
Sudachi: a Japanese Tokenizer for Business	May 1, 2018	Chunking, Lemmatization	Code Available
Evaluating Shortest Edit Script Methods for Contextual Lemmatization	Mar 25, 2024	LEMMA, Lemmatization	Code Available
Enhancing Sequence-to-Sequence Neural Lemmatization with External Resources	Jan 28, 2021	Data Augmentation, Decoder	Code Available
Resource-Size matters: Improving Neural Named Entity Recognition with Optimized Large Corpora	Jul 26, 2018	Lemmatization, named-entity-recognition	Code Available
Imitation Learning for Neural Morphological String Transduction	Aug 31, 2018	Imitation Learning, Lemmatization	Code Available
Revisiting NMT for Normalization of Early English Letters	Jun 1, 2019	Lemmatization, Machine Translation	Code Available
Improving Lemmatization of Non-Standard Languages with Joint Learning	Mar 16, 2019	Decoder, Language Modeling	Code Available
The Frankfurt Latin Lexicon: From Morphological Expansion and Word Embeddings to SemioGraphs	May 21, 2020	Lemmatization, Word Embeddings	Code Available


No leaderboard results yet.