| Offensive Language Identification in Transliterated and Code-Mixed Bangla | Nov 25, 2023 | Language Identification | Unverified | 0 |
| The Obscure Limitation of Modular Multilingual Language Models | Nov 21, 2023 | Language Identification | Unverified | 0 |
| Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability | Nov 16, 2023 | Language Identification | Unverified | 0 |
| OffMix-3L: A Novel Code-Mixed Dataset in Bangla-English-Hindi for Offensive Language Identification | Oct 27, 2023 | Language Identification | Code Available | 0 |
| Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition | Oct 17, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond | Oct 9, 2023 | Language Identification, speech-recognition | Unverified | 0 |
| Wavelet Scattering Transform for Improving Generalization in Low-Resourced Spoken Language Identification | Oct 1, 2023 | Language Identification, Spoken language identification | Unverified | 0 |
| Multimodal Modeling For Spoken Language Identification | Sep 19, 2023 | Language Identification, Spoken language identification | Unverified | 0 |
| CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages | Sep 17, 2023 | Hallucination, Language Identification | Unverified | 0 |
| Native Language Identification with Big Bird Embeddings | Sep 13, 2023 | Computational Efficiency, Feature Engineering | Code Available | 0 |
| Robust Open-Set Spoken Language Identification and the CU MultiLang Dataset | Aug 29, 2023 | Language Identification, Spoken language identification | Unverified | 0 |
| Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual Predatory Chats and Abusive Texts | Aug 28, 2023 | Abusive Language, Fake News Detection | Unverified | 0 |
| Bilingual Streaming ASR with Grapheme units and Auxiliary Monolingual Loss | Aug 11, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Turkish Native Language Identification | Jul 27, 2023 | Language Identification, Native Language Identification | Unverified | 0 |
| MASR: Multi-label Aware Speech Representation | Jul 20, 2023 | Emotion Recognition, Language Identification | Unverified | 0 |
| Multilingual Speech-to-Speech Translation into Multiple Target Languages | Jul 17, 2023 | Language Identification, Speech-to-Speech Translation | Unverified | 0 |
| Towards spoken dialect identification of Irish | Jul 14, 2023 | Dialect Identification, Language Identification | Unverified | 0 |
| Confidence-based Ensembles of End-to-End Speech Recognition Models | Jun 27, 2023 | Language Identification, Model Selection | Unverified | 0 |
| My Boli: Code-mixed Marathi-English Corpora, Pretrained Language Models and Evaluation Benchmarks | Jun 24, 2023 | Benchmarking, Hate Speech Detection | Code Available | 0 |
| Unified model for code-switching speech recognition and language identification based on a concatenated tokenizer | Jun 14, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| RoBERTweet: A BERT Language Model for Romanian Tweets | Jun 11, 2023 | Language Identification, Language Modeling | Unverified | 0 |
| Leveraging Language Identification to Enhance Code-Mixed Text Classification | Jun 8, 2023 | Classification, Hate Speech Detection | Unverified | 0 |
| Label Aware Speech Representation Learning For Language Identification | Jun 7, 2023 | Language Identification, Missing Labels | Unverified | 0 |
| Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech | Jun 1, 2023 | Decoder, Language Identification | Code Available | 0 |
| Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning | May 31, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| MERLIon CCS Challenge Evaluation Plan | May 31, 2023 | Language Identification, Task 2 | Code Available | 0 |
| Investigating model performance in language identification: beyond simple error statistics | May 30, 2023 | Language Identification | Code Available | 0 |
| MERLIon CCS Challenge: A English-Mandarin code-switching child-directed speech corpus for language identification and diarization | May 30, 2023 | Language Identification | Code Available | 0 |
| Script Normalization for Unconventional Writing of Under-Resourced Languages in Bilingual Communities | May 25, 2023 | Language Identification, Machine Translation | Code Available | 0 |
| LIMIT: Language Identification, Misidentification, and Translation using Hierarchical Models in 350+ Languages | May 23, 2023 | Language Identification, Translation | Code Available | 0 |
| Multilingual Large Language Models Are Not (Yet) Code-Switchers | May 23, 2023 | Benchmarking, Language Identification | Unverified | 0 |
| ML-SUPERB: Multilingual Speech Universal PERformance Benchmark | May 18, 2023 | Automatic Speech Recognition, Language Identification | Unverified | 0 |
| DocLangID: Improving Few-Shot Training to Identify the Language of Historical Documents | May 3, 2023 | Few-Shot Learning, Language Identification | Code Available | 0 |
| Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding | May 2, 2023 | Automatic Speech Recognition, Language Identification | Unverified | 0 |
| Approaches to Corpus Creation for Low-Resource Language Technology: the Case of Southern Kurdish and Laki | Apr 3, 2023 | Language Identification | Code Available | 0 |
| MMT: A Multilingual and Multi-Topic Indian Social Media Dataset | Apr 2, 2023 | Diversity, Language Identification | Unverified | 0 |
| Joint unsupervised and supervised learning for context-aware language identification | Mar 29, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Language Variety Identification with True Labels | Mar 2, 2023 | Language Identification | Code Available | 0 |
| Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training | Mar 1, 2023 | Language Identification | Unverified | 0 |
| Augmented Transformers with Adaptive n-grams Embedding for Multilingual Scene Text Recognition | Feb 28, 2023 | Language Identification, Scene Text Recognition | Unverified | 0 |
| Language identification as improvement for lip-based biometric visual systems | Feb 27, 2023 | Language Identification | Unverified | 0 |
| Cross-Corpora Spoken Language Identification with Domain Diversification and Generalization | Feb 10, 2023 | Data Augmentation, Domain Generalization | Unverified | 0 |
| A Twitter BERT Approach for Offensive Language Detection in Marathi | Dec 20, 2022 | Data Augmentation, Language Identification | Unverified | 0 |
| An Overview of Indian Spoken Language Recognition from Machine Learning Perspective | Nov 30, 2022 | Language Identification, Spoken language identification | Unverified | 0 |
| Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts | Nov 26, 2022 | Language Identification | Unverified | 0 |
| Predicting the Type and Target of Offensive Social Media Posts in Marathi | Nov 22, 2022 | Language Identification | Code Available | 0 |
| Scaling Native Language Identification with Transformer Adapters | Nov 18, 2022 | Language Identification, Marketing | Unverified | 0 |
| Overview of the HASOC Subtrack at FIRE 2022: Offensive Language Identification in Marathi | Nov 18, 2022 | Language Identification | Unverified | 0 |
| CoLI-Machine Learning Approaches for Code-mixed Language Identification at the Word Level in Kannada-English Texts | Nov 17, 2022 | Language Identification, Sentence | Unverified | 0 |
| Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models | Nov 9, 2022 | Language Identification, Spoken language identification | Unverified | 0 |