Language Identification

Language identification is the task of determining the language of a text.
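
As a rough illustration of the task, the sketch below guesses a language by comparing character trigram counts against per-language profiles using cosine similarity. It is a toy baseline rather than the method of any paper listed here; the `SAMPLES` text, the language codes, and the function names are all assumptions made for this example.

```python
from collections import Counter
import math

# Hypothetical toy training samples; a real system would use far more text.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then runs through the forest",
    "de": "der schnelle braune fuchs springt über den faulen hund und läuft durch den wald",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y corre por el bosque",
}

def char_ngrams(text, n=3):
    """Count overlapping character n-grams, padding with spaces at both ends."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0

# One trigram profile per language, built from the toy samples above.
PROFILES = {lang: char_ngrams(text) for lang, text in SAMPLES.items()}

def identify(text):
    """Return the language code whose profile is most similar to the input."""
    query = char_ngrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))

print(identify("der hund und der wald"))
```

With profiles this small the guess is only reliable for inputs that share vocabulary with the samples; closely related languages and code-switched text, which many of the papers below address, need much richer models.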

Papers

Showing 201–250 of 794 papers

Title	Date	Tasks	Status
DeepAnalyzer at SemEval-2019 Task 6: A deep learning-based ensemble method for identifying offensive tweets	Jun 1, 2019	Language Identification, Part-Of-Speech Tagging	Unverified
Deep learning-based end-to-end spoken language identification system for domain-mismatched scenario	Jun 1, 2022	Language Identification, Speaker Verification	Unverified
Deep Models for Arabic Dialect Identification on Benchmarked Data	Aug 1, 2018	Deep Learning, Dialect Identification	Unverified
DELab@IIITSM at ICON-2021 Shared Task: Identification of Aggression and Biasness Using Decision Tree	Dec 1, 2021	Language Identification	Unverified
Demographic Dialectal Variation in Social Media: A Case Study of African-American English	Aug 31, 2016	Dependency Parsing, Language Identification	Unverified
Detecting Code-Switching in a Multilingual Alpine Heritage Corpus	Oct 1, 2014	Language Identification, Named Entity Recognition (NER)	Unverified
Ensemble Methods for Native Language Identification	Sep 1, 2017	Language Acquisition, Language Identification	Unverified
Detection of Similar Languages and Dialects Using Deep Supervised Autoencoder	Dec 1, 2020	Language Identification	Unverified
Detect Language of Transliterated Texts	Apr 26, 2020	Language Identification, Translation	Unverified
Developing Language-tagged Corpora for Code-switching Tweets	Jun 1, 2015	Language Identification	Unverified
Challenges of Computational Processing of Code-Switching	Oct 7, 2016	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Development of Text and Speech database for Hindi and Indian English specific to Mobile Communication environment	May 1, 2012	Language Identification, Speech Recognition	Unverified
Dialect Diversity in Text Summarization on Twitter	Jul 15, 2020	Attribute, Diversity	Unverified
Dialects Identification of Armenian Language	Jun 1, 2022	Dialect Identification, Language Identification	Unverified
Discovering Parallel Language Resources for Training MT Engines	May 1, 2018	Language Identification, Machine Translation	Unverified
Discriminating between Indo-Aryan Languages Using SVM Ensembles	Jul 9, 2018	Language Identification	Unverified
Discriminating between Mandarin Chinese and Swiss-German varieties using adaptive language models	Jun 1, 2019	Dialect Identification, Language Identification	Unverified
Discriminating between Similar Languages Using PPM	Sep 1, 2015	Language Identification	Unverified
Babler - Data Collection from the Web to Support Speech Recognition and Keyword Search	Aug 1, 2016	Automatic Speech Recognition (ASR), Language Identification	Unverified
Discriminating between Similar Languages and Arabic Dialect Identification: A Report on the Third DSL Shared Task	Dec 1, 2016	Dialect Identification, General Classification	Unverified
Discriminating between Similar Languages on Imbalanced Conversational Texts	May 1, 2018	Language Identification	Unverified
Discriminating between Similar Languages with Word-level Convolutional Neural Networks	Apr 1, 2017	Language Identification, Question Answering	Unverified
BERT-based Multi-Task Model for Country and Province Level Modern Standard Arabic and Dialectal Arabic Identification	Jun 23, 2021	Language Identification, Multi-Task Learning	Unverified
Discriminating Non-Native English with 350 Words	Jun 1, 2013	Language Acquisition, Language Identification	Unverified
Discriminating Similar Languages with Linear SVMs and Neural Networks	Dec 1, 2016	Deep Learning, Language Identification	Unverified
Discriminating Similar Languages with Token-Based Backoff	Sep 1, 2015	Language Identification	Unverified
Challenges in Neural Language Identification: NRC at VarDial 2020	Dec 1, 2020	Language Identification	Unverified
Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks	Dec 1, 2016	Dialect Identification, Information Retrieval	Unverified
Distinguishing Literal and Non-Literal Usage of German Particle Verbs	Jun 1, 2016	General Classification, Language Identification	Unverified
Distributed Representations of Words and Documents for Discriminating Similar Languages	Sep 1, 2015	Language Identification, Meta-Learning	Unverified
Distributional Interaction of Concreteness and Abstractness in Verb-Noun Subcategorisation	May 1, 2019	Language Identification, Object	Unverified
DKPro TC: A Java-based Framework for Supervised Learning Experiments on Textual Data	Jun 1, 2014	Language Identification, Part-Of-Speech Tagging	Unverified
DLRG@DravidianLangTech-EACL2021: Transformer based approach for Offensive Language Identification on Code-Mixed Tamil	Apr 1, 2021	Language Identification, Language Modeling	Unverified
Do Characters Abuse More Than Words?	Sep 1, 2016	Hate Speech Detection, Language Identification	Unverified
A Report on the VarDial Evaluation Campaign 2020	Dec 1, 2020	Dialect Identification, Language Identification	Unverified
Does the Phonology of L1 Show Up in L2 Texts?	Jun 1, 2014	Language Identification, Topic Models	Unverified
Domain Attentive Fusion for End-to-end Dialect Identification with Unknown Target Domain	Dec 4, 2018	Dialect Identification, Language Identification	Unverified
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition	Sep 27, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Ceasing hate with MoH: Hate Speech Detection in Hindi-English Code-Switched Language	Oct 18, 2021	Hate Speech Detection, Language Identification	Unverified
Duluth at SemEval-2020 Task 12: Offensive Tweet Identification in English with Logistic Regression	Jul 25, 2020	Language Identification, regression	Unverified
Dyn-ASR: Compact, Multilingual Speech Recognition via Spoken Language and Accent Identification	Aug 4, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Efficient Discrimination Between Closely Related Languages	Dec 1, 2012	Document Classification, Language Identification	Unverified
Efficiently Identifying Low-Quality Language Subsets in Multilingual Datasets: A Case Study on a Large-Scale Multilingual Audio Dataset	Oct 5, 2024	Language Identification	Unverified
Emad at SemEval-2019 Task 6: Offensive Language Identification using Traditional Machine Learning and Deep Learning approaches	Jun 1, 2019	Data Augmentation, Language Identification	Unverified
BRUMS at SemEval-2020 Task 12 : Transformer based Multilingual Offensive Language Identification in Social Media	Oct 13, 2020	Language Identification	Unverified
Arabic Dialect Identification in the Context of Bivalency and Code-Switching	May 1, 2018	Dialect Identification, Language Identification	Unverified
BRUMS at SemEval-2020 Task 12: Transformer Based Multilingual Offensive Language Identification in Social Media	Dec 1, 2020	Language Identification	Unverified
Arabic Language WEKA-Based Dialect Classifier for Arabic Automatic Speech Recognition Transcripts	Dec 1, 2016	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Enhancing Code-Switching ASR Leveraging Non-Peaky CTC Loss and Deep Language Posterior Injection	Nov 26, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Categorization of Turkish News Documents with Morphological Analysis	Aug 1, 2013	Information Retrieval, Language Identification	Unverified

Datasets, in the order of the benchmark tables below: VoxLingua107, GlotLID-C, Nordic Language Identification, OpenSubtitles, Universal Dependencies, VoxForge

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	wav2vec 2.0 LV-60K	Error rate	7.2	—	Unverified
2	XLS-R	Error rate	5.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GlotLID	Macro F1	0.98	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	FastText	Accuracy	0.97	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Apple bi-LSTM	Accuracy	91.37	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Apple bi-LSTM	Accuracy	86.93	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ConformerG-P	Accuracy	99.8	—	Unverified