SOTAVerified

Language Identification

Language identification is the task of determining the language of a text.

Papers

Showing 26–50 of 794 papers

| Title | Status | Hype |
| --- | --- | --- |
| XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale | Code | 1 |
| BERT-LID: Leveraging BERT to Improve Spoken Language Identification | Code | 1 |
| Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond | Code | 1 |
| Scaling Speech Technology to 1,000+ Languages | Code | 1 |
| PHO-LID: A Unified Model Incorporating Acoustic-Phonetic and Phonotactic Information for Language Identification | Code | 1 |
| A reproduction of Apple's bi-directional LSTM models for language identification in short strings | Code | 1 |
| Bhasha-Abhijnaanam: Native-script and romanized Language Identification for 22 Indic languages | Code | 1 |
| Common Voice: A Massively-Multilingual Speech Corpus | Code | 1 |
| FastSpell: the LangId Magic Spell | Code | 1 |
| KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media | Code | 1 |
| GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages | Code | 1 |
| Hyperseed: Unsupervised Learning with Vector Symbolic Architectures | Code | 1 |
| Improving Spoken Language Identification with Map-Mix | Code | 1 |
| Using Radio Archives for Low-Resource Speech Recognition: Towards an Intelligent Virtual Assistant for Illiterate Users | Code | 1 |
| AlexU-BackTranslation-TL at SemEval-2020 Task 12: Improving Offensive Language Detection Using Data Augmentation and Transfer Learning | | 0 |
| Albanian Language Identification in Text Documents | | 0 |
| A deep-learning based native-language classification by using a latent semantic analysis for the NLI Shared Task 2017 | | 0 |
| SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification | | 0 |
| A language model based approach towards large scale and lightweight language identification systems | | 0 |
| A Deep Generative Approach to Native Language Identification | | 0 |
| A Code-Switching Corpus of Turkish-German Conversations | | 0 |
| Addition of Code Mixed Features to Enhance the Sentiment Prediction of Song Lyrics | | 0 |
| Accurate Pinyin-English Codeswitched Language Identification | | 0 |
| A Federated Learning Approach to Privacy Preserving Offensive Language Identification | | 0 |
| A Dataset and Classifier for Recognizing Social Media English | | 0 |

Benchmark Results

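| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | wav2vec 2.0 LV-60K | Error rate | 7.2 | | Unverified |
| 2 | XLS-R | Error rate | 5.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GlotLID | Macro F1 | 0.98 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | FastText | Accuracy | 0.97 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Apple bi-LSTM | Accuracy | 91.37 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Apple bi-LSTM | Accuracy | 86.93 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ConformerG-P | Accuracy | 99.8 | | Unverified |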