SOTAVerified

Language Identification

Language identification is the task of determining the language of a text.
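As a rough illustration of the task, the sketch below identifies the language of a short text by comparing its character trigram counts against per-language profiles using cosine similarity. This is a toy baseline under stated assumptions, not the method of any paper listed on this page: the function names (`char_ngrams`, `train_profiles`, `identify`) and the two-language training sample are hypothetical and chosen only for the example.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased, space-padded text."""
    padded = " " + text.lower() + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def train_profiles(samples, n=3):
    """Build one n-gram count profile per language from example sentences."""
    profiles = {}
    for lang, texts in samples.items():
        total = Counter()
        for sentence in texts:
            total.update(char_ngrams(sentence, n))
        profiles[lang] = total
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())  # missing keys count as 0
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(text, profiles, n=3):
    """Return the language whose profile is most similar to the text."""
    query = char_ngrams(text, n)
    return max(profiles, key=lambda lang: cosine(query, profiles[lang]))


# Hypothetical, hand-written training sample (far too small for real use).
samples = {
    "en": ["the quick brown fox jumps over the lazy dog",
           "this is a short sentence in english"],
    "de": ["der schnelle braune fuchs springt über den faulen hund",
           "dies ist ein kurzer satz auf deutsch"],
}
profiles = train_profiles(samples)

print(identify("where is the dog", profiles))
print(identify("wo ist der hund", profiles))
```

Character n-grams are a common starting point because they need no tokenizer and capture spelling patterns typical of a language. A realistic system would train on far more data and cover many more languages.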

Papers

Showing 51-100 of 794 papers

Title | Status | Hype
NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts | | 0
English Please: Evaluating Machine Translation with Large Language Models for Multilingual Bug Reports | Code | 0
Multi-label Scandinavian Language Identification (SLIDE) | Code | 0
On the use of Performer and Agent Attention for Spoken Language Identification | | 0
Evaluating Standard and Dialectal Frisian ASR: Multilingual Fine-tuning and Language Identification for Improved Low-resource Performance | | 0
Is It Navajo? Accurate Language Detection in Endangered Athabaskan Languages | Code | 0
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Code | 0
Indonesian-English Code-Switching Speech Synthesizer Utilizing Multilingual STEN-TTS and Bert LID | | 0
Enhancing Code-Switching ASR Leveraging Non-Peaky CTC Loss and Deep Language Posterior Injection | | 0
Exploring Facets of Language Generation in the Limit | | 0
Can adversarial attacks by large language models be attributed? | | 0
Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages | | 0
Computational Approaches to Arabic-English Code-Switching | | 0
Generation through the lens of learning theory | | 0
A Multi-Task Text Classification Pipeline with Natural Language Explanations: A User-Centric Evaluation in Sentiment Analysis and Offensive Language Identification in Greek Tweets | | 0
From N-grams to Pre-trained Multilingual Models For Language Identification | Code | 0
Efficiently Identifying Low-Quality Language Subsets in Multilingual Datasets: A Case Study on a Large-Scale Multilingual Audio Dataset | | 0
AfriHuBERT: A self-supervised speech representation model for African languages | Code | 0
Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking | Code | 0
Leveraging Open-Source Large Language Models for Native Language Identification | | 0
Enhancing Code-Switching Speech Recognition with LID-Based Collaborative Mixture of Experts Model | | 0
Literary and Colloquial Dialect Identification for Tamil using Acoustic Features | | 0
Towards Generalized Offensive Language Identification | | 0
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models | | 0
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR | | 0
Script-Agnostic Language Identification | Code | 0
Rapid Language Adaptation for Multilingual E2E Speech Recognition Using Encoder Prompting | | 0
Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech | | 0
ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets | | 0
Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation | | 0
Malayalam Sign Language Identification using Finetuned YOLOv8 and Computer Vision Techniques | | 0
Whispy: Adapting STT Whisper Models to Real-Time Environments | | 0
A Federated Learning Approach to Privacy Preserving Offensive Language Identification | | 0
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions | Code | 0
Geographically-Informed Language Identification | Code | 0
More than words: Advancements and challenges in speech recognition for singing | | 0
Validating and Exploring Large Geographic Corpora | | 0
Aligning Speech to Languages to Enhance Code-switching Speech Recognition | | 0
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | Code | 0
Code-Switched Language Identification is Harder Than You Think | Code | 0
Detecting Structured Language Alternations in Historical Documents by Combining Language Identification with Fourier Analysis | | 0
Acoustic characterization of speech rhythm: going beyond metrics with recurrent neural networks | | 0
Language Detection for Transliterated Content | | 0
Generative linguistic representation for spoken language identification | | 0
Cross-Linguistic Offensive Language Detection: BERT-Based Analysis of Bengali, Assamese, & Bodo Conversational Hateful Content from Social Media | | 0
Leveraging Language ID to Calculate Intermediate CTC Loss for Enhanced Code-Switching Speech Recognition | | 0
Attention-Guided Adaptation for Code-Switching Speech Recognition | | 0
Native Language Identification with Large Language Models | | 0
Self-supervised Adaptive Pre-training of Multilingual Speech Models for Language and Dialect Identification | | 0
A Text-to-Text Model for Multilingual Offensive Language Identification | | 0
Page 2 of 16

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | wav2vec 2.0 LV-60K | Error rate | 7.2 | | Unverified
2 | XLS-R | Error rate | 5.7 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | GlotLID | Macro F1 | 0.98 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | FastText | Accuracy | 0.97 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 91.37 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Apple bi-LSTM | Accuracy | 86.93 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ConformerG-P | Accuracy | 99.8 | | Unverified
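
The results above use three kinds of score (error rate, macro F1, accuracy) and mix scales: some values look like fractions (0.97, 0.98) and others like percentages (91.37, 99.8). This page does not say how each benchmark computes its score, so the sketch below only shows the standard textbook definitions on a made-up label set. Treating error rate as one minus accuracy is an assumption, and the `gold` and `pred` lists are hypothetical.

```python
def accuracy(gold, pred):
    """Fraction of examples whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def error_rate(gold, pred):
    """Assumed here to be 1 - accuracy; a benchmark may define it differently."""
    return 1.0 - accuracy(gold, pred)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1, so rare languages count as much as common ones."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold labels and predictions for six short texts.
gold = ["en", "en", "en", "de", "de", "fr"]
pred = ["en", "en", "de", "de", "de", "en"]

print(f"accuracy   = {accuracy(gold, pred):.4f}")
print(f"error rate = {error_rate(gold, pred):.4f}")
print(f"macro F1   = {macro_f1(gold, pred):.4f}")
```

In this example the single French text is misclassified, which pulls macro F1 well below accuracy. That gap is why the two metrics are not directly comparable across the tables.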