Lemmatization

Lemmatization is the process of determining the base or dictionary form (lemma) of a given surface form. For languages with rich morphology in particular, normalizing words into their base forms is important for applications such as search engines and linguistic studies. The main difficulties in lemmatization are handling previously unseen words at inference time and disambiguating surface forms that can be inflected variants of several different base forms depending on context.

Source: Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks
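
The two difficulties named in the description, ambiguous surface forms and unseen words, can be illustrated with a minimal sketch. This is a toy dictionary-plus-rules lemmatizer, not the sequence-to-sequence model from the cited source; the lexicon entries, the suffix rules, and the `lemmatize` function name are all assumptions made for illustration.

```python
# Toy lexicon (illustrative entries only): (surface form, POS tag) -> lemma.
# Keying on the POS tag is one simple way to use context for disambiguation.
LEXICON = {
    ("leaves", "NOUN"): "leaf",
    ("leaves", "VERB"): "leave",
    ("saw", "NOUN"): "saw",
    ("saw", "VERB"): "see",
    ("mice", "NOUN"): "mouse",
}

# Hypothetical fallback rules for forms missing from the lexicon:
# (suffix to strip, replacement). Checked in order, first match wins.
SUFFIX_RULES = [
    ("ies", "y"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]


def lemmatize(form: str, pos: str) -> str:
    """Return a lemma for `form`, using `pos` to pick among ambiguous entries."""
    word = form.lower()
    # 1. Known form: the POS tag selects the right base form.
    if (word, pos) in LEXICON:
        return LEXICON[(word, pos)]
    # 2. Unseen form: fall back to suffix stripping, keeping a stem of
    #    at least three characters to limit over-stripping.
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    # 3. Nothing applies: return the lowercased form unchanged.
    return word
```

Here `lemmatize("leaves", "NOUN")` and `lemmatize("leaves", "VERB")` resolve the same surface form to different lemmas, while an out-of-lexicon form such as `"treebanks"` falls through to the suffix rules. The fallback is deliberately crude (it maps `"running"` to `"runn"`, for example), which is the kind of error that motivates learned lemmatizers.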

Papers


Showing 176–200 of 351 papers

Title	Date	Tasks	Status
Universal Morphologies for the Caucasus region	May 1, 2018	Lemmatization	Unverified
BioRo: The Biomedical Corpus for the Romanian Language	May 1, 2018	Lemmatization	Unverified
Coreference Resolution in FreeLing 4.0	May 1, 2018	Constituency Parsing, coreference-resolution	Unverified
Very Large-Scale Lexical Resources to Enhance Chinese and Japanese Machine Translation	May 1, 2018	Lemmatization, Machine Translation	Unverified
SoMeWeTa: A Part-of-Speech Tagger for German Social Media and Web Texts	May 1, 2018	Domain Adaptation, Lemmatization	Code Available
SentiArabic: A Sentiment Analyzer for Standard Arabic	May 1, 2018	Arabic Sentiment Analysis, Lemmatization	Unverified
TreeAnnotator: Versatile Visual Annotation of Hierarchical Text Relations	May 1, 2018	Lemmatization	Unverified
Automatic Categorization of Tagalog Documents Using Support Vector Machines	Nov 1, 2017	Document Classification, General Classification	Unverified
Build Fast and Accurate Lemmatization for Arabic	Oct 18, 2017	Information Retrieval, Lemmatization	Unverified
Fast and Accurate Decision Trees for Natural Language Processing Tasks	Sep 1, 2017	Attribute, BIG-bench Machine Learning	Unverified
Lemmatization of Multi-word Common Noun Phrases and Named Entities in Polish	Sep 1, 2017	Lemmatization	Unverified
An Extensible Multilingual Open Source Lemmatizer	Sep 1, 2017	Information Retrieval, LEMMA	Unverified
Adapting the TTL Romanian POS Tagger to the Biomedical Domain	Sep 1, 2017	Chunking, Domain Adaptation	Unverified
Automatically Acquired Lexical Knowledge Improves Japanese Joint Morphological and Dependency Analysis	Sep 1, 2017	Lemmatization, Morphological Analysis	Unverified
bleu2vec: the Painfully Familiar Metric on Continuous Vector Space Steroids	Sep 1, 2017	Lemmatization, Machine Translation	Unverified
Evaluation of Finite State Morphological Analyzers Based on Paradigm Extraction from Wiktionary	Sep 1, 2017	Language Modeling, Language Modelling	Unverified
Impact of Feature Selection on Micro-Text Classification	Aug 27, 2017	Classification, Clustering	Unverified
KeyXtract Twitter Model - An Essential Keywords Extraction Model for Twitter Designed using NLP Tools	Aug 9, 2017	Lemmatization, model	Unverified
Lexical Correction of Polish Twitter Political Data	Aug 1, 2017	Entity Extraction using GAN, Lemmatization	Unverified
Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe	Aug 1, 2017	Dependency Parsing, Lemmatization	Unverified
RACAI's Natural Language Processing pipeline for Universal Dependencies	Aug 1, 2017	Lemmatization, Sentence	Unverified
LABDA at SemEval-2017 Task 10: Relation Classification between keyphrases via Convolutional Neural Network	Aug 1, 2017	Articles, General Classification	Unverified
DT_Team at SemEval-2017 Task 1: Semantic Similarity Using Alignments, Sentence-Level Embeddings and Gaussian Mixture Model Output	Aug 1, 2017	Lemmatization, Semantic Similarity	Unverified
QLUT at SemEval-2017 Task 1: Semantic Textual Similarity Based on Word Embeddings	Aug 1, 2017	Lemmatization, Semantic Textual Similarity	Unverified
Oxford at SemEval-2017 Task 9: Neural AMR Parsing with Pointer-Augmented Attention	Aug 1, 2017	AMR Parsing, Decoder	Unverified


No leaderboard results yet.