SOTAVerified

Language Identification

Language identification is the task of determining the language of a text.
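
As a rough illustration of the task, the sketch below assigns a language label by comparing character trigram counts against per-language profiles. It is a toy under stated assumptions, not the method of any paper listed here: the `SAMPLES` sentences, the three language codes, and the helper names `trigrams`, `cosine`, and `identify` are all hypothetical, and real systems are trained on far larger corpora.

```python
from collections import Counter
import math

# Tiny, hypothetical training samples (assumption for illustration only).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is the language of the text",
    "de": "der schnelle braune fuchs springt über den faulen hund und das ist die sprache des textes",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y este es el idioma del texto",
}

def trigrams(text):
    """Count overlapping character trigrams of the lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a, b):
    """Cosine similarity between two trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# One trigram profile per language, built from the samples above.
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}

def identify(text):
    """Return the language code whose profile is most similar to the text."""
    query = trigrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))

if __name__ == "__main__":
    for sentence in ["the dog is lazy", "der hund ist faul", "el perro es perezoso"]:
        print(sentence, "->", identify(sentence))
```

Character n-grams are used here only because they need no tokenizer and work on short inputs; the profiles would need much more data to be reliable beyond sentences that resemble the samples.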

Papers

Showing 21-30 of 794 papers

Title | Status | Hype
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages | Code | 1
GlotLID: Language Identification for Low-Resource Languages | Code | 1
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters | Code | 1
KInIT at SemEval-2024 Task 8: Fine-tuned LLMs for Multilingual Machine-Generated Text Detection | Code | 1
AfroLID: A Neural Language Identification Tool for African Languages | Code | 1
L3Cube-HingCorpus and HingBERT: A Code Mixed Hindi-English Dataset and BERT Language Models | Code | 1
Language-Informed Beam Search Decoding for Multilingual Machine Translation | Code | 1
MaskLID: Code-Switching Language Identification through Iterative Masking | Code | 1
PALI: A Language Identification Benchmark for Perso-Arabic Scripts | Code | 1
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | wav2vec 2.0 LV-60K | Error rate | 7.2 | - | Unverified
2 | XLS-R | Error rate | 5.7 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GlotLID | Macro F1 | 0.98 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | FastText | Accuracy | 0.97 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 91.37 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 86.93 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ConformerG-P | Accuracy | 99.8 | - | Unverified