SOTAVerified

Language Identification

Language identification is the task of determining the language of a text.

Papers

Showing 601-650 of 794 papers

| Title | Status | Hype |
|---|---|---|
| Transliteration for Low-Resource Code-Switching Texts: Building an Automatic Cyrillic-to-Latin Converter for Tatar | | 0 |
| TuGeBiC: A Turkish German Bilingual Code-Switching Corpus | | 0 |
| Turkish Native Language Identification | | 0 |
| TwistBytes - Identification of Cuneiform Languages and German Dialects at VarDial 2019 | | 0 |
| Twitter Language Identification Of Similar Languages And Dialects Without Ground Truth | | 0 |
| Twitter Universal Dependency Parsing for African-American and Mainstream American English | | 0 |
| Two LRL & Distractor Corpora from Web Information Retrieval and a Small Case Study in Language Identification without Training Corpora | | 0 |
| Two-stage Training for Chinese Dialect Recognition | | 0 |
| Typological Features for Multilingual Delexicalised Dependency Parsing | | 0 |
| UJNLP at SemEval-2020 Task 12: Detecting Offensive Language Using Bidirectional Transformers | | 0 |
| Universal and non-universal text statistics: Clustering coefficient for language identification | | 0 |
| Universal Dependencies Treebank for Tatar: Incorporating Intra-Word Code-Switching Information | | 0 |
| Unravelling Interlanguage Facts via Explainable Machine Learning | | 0 |
| Unsupervised Code-Switching for Multilingual Historical Document Transcription | | 0 |
| Unsupervised Deep Language and Dialect Identification for Short Texts | | 0 |
| Unsupervised Feature Learning for Visual Sign Language Identification | | 0 |
| Unsupervised neural adaptation model based on optimal transport for spoken language identification | | 0 |
| Unsupervised Personality-Aware Language Identification | | 0 |
| Unsupervised Preference-Aware Language Identification | | 0 |
| UNT Linguistics at SemEval-2020 Task 12: Linear SVC with Pre-trained Word Embeddings as Document Vectors and Targeted Linguistic Features | | 0 |
| Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpus | | 0 |
| Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpora | | 0 |
| URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors | | 0 |
| Using Classifier Features to Determine Language Transfer on Morphemes | | 0 |
| Using Maximum Entropy Models to Discriminate between Similar Languages and Varieties | | 0 |
| Using N-gram and Word Network Features for Native Language Identification | | 0 |
| Using Other Learner Corpora in the 2013 NLI Shared Task | | 0 |
| Using Shallow Syntactic Features to Measure Influences of L1 and Proficiency Level in EFL Writings | | 0 |
| Using Social Networks to Improve Language Variety Identification with Neural Networks | | 0 |
| Utterance-level end-to-end language identification using attention-based CNN-BLSTM | | 0 |
| Validating and Exploring Large Geographic Corpora | | 0 |
| Vanilla Classifiers for Distinguishing between Similar Languages | | 0 |
| VarClass: An Open-source Language Identification Tool for Language Varieties | | 0 |
| VAST: A Corpus of Video Annotation for Speech Technologies | | 0 |
| Vector Space Model as Cognitive Space for Text Classification | | 0 |
| Vers la correction automatique de textes bruités: Architecture générale et détermination de la langue d'un mot inconnu (Towards Automatic Spell-Checking of Noisy Texts : General Architecture and Language Identification for Unknown Words) [in French] | | 0 |
| Visual Script and Language Identification | | 0 |
| Vocabulary-Based Language Similarity using Web Corpora | | 0 |
| VOXLINGUA107: A DATASET FOR SPOKEN LANGUAGE RECOGNITION | | 0 |
| VTEX System Description for the NLI 2013 Shared Task | | 0 |
| Wavelet Scattering Transform for Improving Generalization in Low-Resourced Spoken Language Identification | | 0 |
| When Sparse Traditional Models Outperform Dense Neural Networks: the Curious Case of Discriminating between Similar Languages | | 0 |
| Whispy: Adapting STT Whisper Models to Real-Time Environments | | 0 |
| WLV-RIT at HASOC-Dravidian-CodeMix-FIRE2020: Offensive Language Identification in Code-switched YouTube Comments | | 0 |
| WOLI at SemEval-2020 Task 12: Arabic Offensive Language Identification on Different Twitter Datasets | | 0 |
| Word-Level Language Identification and Predicting Codeswitching Points in Swahili-English Language Data | | 0 |
| Word-level Language Identification in Bi-lingual Code-switched Texts | | 0 |
| Word Level Language Identification in English Telugu Code Mixed Data | | 0 |
| Word Level Language Identification in Online Multilingual Communication | | 0 |
| Word-level Language Identification using CRF: Code-switching Shared Task Report of MSR India System | | 0 |
Page 13 of 16

Benchmark Results

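| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | wav2vec 2.0 LV-60K | Error rate | 7.2 | | Unverified |
| 2 | XLS-R | Error rate | 5.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GlotLID | Macro F1 | 0.98 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FastText | Accuracy | 0.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Apple bi-LSTM | Accuracy | 91.37 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Apple bi-LSTM | Accuracy | 86.93 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ConformerG-P | Accuracy | 99.8 | | Unverified |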