Language Identification

Language identification is the task of determining the language of a text.
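As a rough illustration of the task, the sketch below guesses the language of a short string by comparing its character trigram counts against per-language profiles. It is a toy example and not the method of any paper listed here: the two training sentences, the names `SAMPLES`, `PROFILES`, `char_ngrams`, `cosine`, and `identify`, and the choice of cosine similarity are all assumptions made for illustration.

```python
from collections import Counter
import math

# Hypothetical, tiny training samples; a real system would use large corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat sleeps in the house",
    "de": "der schnelle braune fuchs springt über den faulen hund und die katze schläft im haus",
}


def char_ngrams(text, n=3):
    """Count character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(text, profiles):
    """Return the label of the profile most similar to the text."""
    query = char_ngrams(text)
    return max(profiles, key=lambda lang: cosine(query, profiles[lang]))


# One trigram profile per language, built from the samples above.
PROFILES = {lang: char_ngrams(sample) for lang, sample in SAMPLES.items()}

if __name__ == "__main__":
    print(identify("the dog is in the house", PROFILES))
    print(identify("der hund ist im haus", PROFILES))
```

With only one sentence per language, this is useful only for seeing the shape of the problem; short, code-switched, or closely related inputs are exactly where such a simple profile comparison tends to break down.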

Papers

Showing 101-150 of 794 papers

Title | Status | Hype
AfriHuBERT: A self-supervised speech representation model for African languages | Code | 0
Automatic Language Identification in Texts: A Survey | Code | 0
IIITK@DravidianLangTech-EACL2021: Offensive Language Identification and Meme Classification in Tamil, Malayalam and Kannada | Code | 0
Script-Agnostic Language Identification | Code | 0
Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking | Code | 0
SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval) | Code | 0
Aggressive Language Identification Using Word Embeddings and Sentiment Features | Code | 0
SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification | Code | 0
Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection | Code | 0
Ghmerti at SemEval-2019 Task 6: A Deep Word- and Character-based Approach to Offensive Language Identification | Code | 0
HeLI, a Word-Based Backoff Method for Language Identification | Code | 0
TAC at SemEval-2020 Task 12: Ensembling Approach for Multilingual Offensive Language Identification in Social Media | Code | 0
Joint UD Parsing of Norwegian Bokmål and Nynorsk | Code | 0
GeezSwitch: Language Identification in Typologically Related Low-resourced East African Languages | Code | 0
Towards Ethical Content-Based Detection of Online Influence Campaigns | Code | 0
Towards Offensive Language Identification for Dravidian Languages | Code | 0
Finding Structure in Text, Genome and Other Symbolic Sequences | Code | 0
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Code | 0
English Please: Evaluating Machine Translation with Large Language Models for Multilingual Bug Reports | Code | 0
End-to-end Language Identification using NetFV and NetVLAD | Code | 0
FBK-DH at SemEval-2020 Task 12: Using Multi-channel BERT for Multilingual Offensive Language Detection | Code | 0
Using Language Learner Data for Metaphor Detection | Code | 0
Geographic Adaptation of Pretrained Language Models | Code | 0
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices | Code | 0
CyberTronics at SemEval-2020 Task 12: Multilingual Offensive Language Identification over Social Media | Code | 0
DocLangID: Improving Few-Shot Training to Identify the Language of Historical Documents | Code | 0
Comparing the Performance of CNNs and Shallow Models for Language Identification | Code | 0
Code-Switched Language Identification is Harder Than You Think | Code | 0
AdelaideCyC at SemEval-2020 Task 12: Ensemble of Classifiers for Offensive Language Detection in Social Media | Code | 0
Combination of multiple Deep Learning architectures for Offensive Language Detection in Tweets | Code | 0
Cross-Domain Adaptation of Spoken Language Identification for Related Languages: The Curious Case of Slavic Languages | Code | 0
Cross-lingual Offensive Language Identification for Low Resource Languages: The Case of Marathi | Code | 0
Discriminating between Similar Languages using Weighted Subword Features | Code | 0
Discriminating Between Similar Nordic Languages | Code | 0
Crawling microblogging services to gather language-classified URLs. Workflow and case study | Code | 0
Embeddia at SemEval-2019 Task 6: Detecting Hate with Neural Network and Transfer Learning Approaches | Code | 0
DOSA: Dravidian Code-Mixed Offensive Span Identification Dataset | Code | 0
Enhance Language Identification using Dual-mode Model with Knowledge Distillation | Code | 0
Geographically-Informed Language Identification | Code | 0
FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech | Code | 0
From English to Code-Switching: Transfer Learning with Strong Morphological Clues | Code | 0
From N-grams to Pre-trained Multilingual Models For Language Identification | Code | 0
JU_ETCE_17_21 at SemEval-2019 Task 6: Efficient Machine Learning and Neural Network Approaches for Identifying and Categorizing Offensive Language in Tweets | Code | 0
On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition | Code | 0
Building a TOCFL Learner Corpus for Chinese Grammatical Error Diagnosis | - | 0
Building a learner corpus for Russian | - | 0
Arabic Native Language Identification | - | 0
{bs,hr,sr}WaC - Web Corpora of Bosnian, Croatian and Serbian | - | 0
BRUMS at SemEval-2020 Task 12: Transformer Based Multilingual Offensive Language Identification in Social Media | - | 0
Arabic Language WEKA-Based Dialect Classifier for Arabic Automatic Speech Recognition Transcripts | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | wav2vec 2.0 LV-60K | Error rate | 7.2 | - | Unverified
2 | XLS-R | Error rate | 5.7 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GlotLID | Macro F1 | 0.98 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | FastText | Accuracy | 0.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 91.37 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 86.93 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ConformerG-P | Accuracy | 99.8 | - | Unverified