SOTAVerified

Language Identification

Language identification is the task of determining the language of a text.
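
As a rough illustration of the task defined above, the sketch below guesses a language by counting stopword hits. The stopword lists and the function name identify_language are hypothetical and chosen only for this example. The systems in the papers listed here use far richer features and models.

```python
# Tiny illustrative stopword lists: an assumption for this sketch, not a real resource.
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "es": {"el", "la", "y", "es", "de", "en"},
    "de": {"der", "die", "und", "ist", "von", "das"},
}


def identify_language(text):
    """Return the code of the language whose stopwords occur most often in text."""
    # Lowercase, split on whitespace, and strip common trailing/leading punctuation.
    tokens = [tok.strip(".,!?;:") for tok in text.lower().split()]
    # Score each language by the number of tokens found in its stopword list.
    scores = {
        lang: sum(tok in words for tok in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    # With no stopword hits at all, refuse to guess.
    return best if scores[best] > 0 else "unknown"


print(identify_language("the cat is in the house"))
print(identify_language("el gato es de la casa"))
```

A heuristic like this tends to break down on short, code-mixed, or transliterated text, which is the setting many of the papers below address.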

Papers

Showing 201–250 of 794 papers

Title | Status | Hype
Hyperseed: Unsupervised Learning with Vector Symbolic Architectures | Code | 1
Mandarin-English Code-switching Speech Recognition with Self-supervised Speech Representation Models | - | 0
Pretrained Transformers for Offensive Language Identification in Tanglish | Code | 0
Is Attention always needed? A Case Study on Language Identification from Speech | - | 0
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition | - | 0
Language Identification with a Reciprocal Rank Classifier | Code | 0
UPV at CheckThat! 2021: Mitigating Cultural Differences for Identifying Multilingual Check-worthy Claims | Code | 0
Unsupervised Personality-Aware Language Identification | - | 0
The futility of STILTs for the classification of lexical borrowings in Spanish | - | 0
On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning | Code | 0
FBERT: A Neural Transformer for Identifying Offensive Content | - | 0
Cross-lingual Offensive Language Identification for Low Resource Languages: The Case of Marathi | Code | 0
A Pre-trained Transformer and CNN Model with Joint Language ID and Part-of-Speech Tagging for Code-Mixed Social-Media Text | - | 0
Fiction in Russian Translation: A Translationese Study | - | 0
Corpus Creation and Language Identification in Low-Resource Code-Mixed Telugu-English Text | - | 0
Offensive Language Identification in Low-resourced Code-mixed Dravidian languages using Pseudo-labeling | Code | 0
Towards Offensive Language Identification for Tamil Code-Mixed YouTube Comments and Posts | Code | 0
A Dual-Decoder Conformer for Multilingual Speech Recognition | - | 0
Dyn-ASR: Compact, Multilingual Speech Recognition via Spoken Language and Accent Identification | - | 0
OLR 2021 Challenge: Datasets, Rules and Baselines | - | 0
Improved Language Identification Through Cross-Lingual Self-Supervised Learning | - | 0
Oriental Language Recognition (OLR) 2020: Summary and Analysis | - | 0
Language Identification of Hindi-English tweets using code-mixed BERT | - | 0
Language Lexicons for Hindi-English Multilingual Text Processing | - | 0
A Simple and Efficient Probabilistic Language model for Code-Mixed Text | - | 0
BERT-based Multi-Task Model for Country and Province Level Modern Standard Arabic and Dialectal Arabic Identification | - | 0
DravidianCodeMix: Sentiment Analysis and Offensive Language Identification Dataset for Dravidian Languages in Code-Mixed Text | Code | 1
SpeechBrain: A General-Purpose Speech Toolkit | Code | 1
SIGTYP 2021 Shared Task: Robust Spoken Language Identification | - | 0
Active learning and negative evidence for language identification | - | 0
Self-Contextualized Attention for Abusive Language Identification | - | 0
Transliteration for Low-Resource Code-Switching Texts: Building an Automatic Cyrillic-to-Latin Converter for Tatar | - | 0
Much Gracias: Semi-supervised Code-switch Detection for Spanish-English: How far can we get? | - | 0
Anlirika: An LSTM–CNN Flow Twister for Spoken Language Identification | - | 0
Language ID Prediction from Speech Using Self-Attentive Pooling | - | 0
Data Filtering using Cross-Lingual Word Embeddings | - | 0
Singing Language Identification using a Deep Phonotactic Approach | Code | 0
Low-Resource Spoken Language Identification Using Self-Attentive Pooling and Deep 1D Time-Channel Separable Convolutions | - | 0
An Exploratory Analysis of the Relation Between Offensive Language and Mental Health | - | 0
Multilingual Offensive Language Identification for Low-resource Languages | - | 0
Cross-Corpora Language Recognition: A Preliminary Investigation with Indian Languages | - | 0
Using Radio Archives for Low-Resource Speech Recognition: Towards an Intelligent Virtual Assistant for Illiterate Users | Code | 1
Language ID Prediction from Speech Using Self-Attentive Pooling and 1D-Convolutions | - | 0
IIITK@DravidianLangTech-EACL2021: Offensive Language Identification and Meme Classification in Tamil, Malayalam and Kannada | Code | 0
BERT-based Multi-Task Model for Country and Province Level MSA and Dialectal Arabic Identification | - | 0
Optimizing a Supervised Classifier for a Difficult Language Identification Problem | - | 0
Findings of the VarDial Evaluation Campaign 2021 | - | 0
N-gram and Neural Models for Uralic Language Identification: NRC at VarDial 2021 | - | 0
Comparing the Performance of CNNs and Shallow Models for Language Identification | Code | 0
Simon @ DravidianLangTech-EACL2021: Detecting Offensive Content in Kannada Language | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | wav2vec 2.0 LV-60K | Error rate | 7.2 | - | Unverified
2 | XLS-R | Error rate | 5.7 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GlotLID | Macro F1 | 0.98 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | FastText | Accuracy | 0.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 91.37 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 86.93 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ConformerG-P | Accuracy | 99.8 | - | Unverified