| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | Nov 9, 2022 | Decoder, Language Modeling | Code Available | 4 |
| Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing | Jan 9, 2021 | Dependency Parsing, Language Modeling | Code Available | 1 |
| WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER | Nov 1, 2021 | Domain Adaptation, Multilingual Named Entity Recognition | Code Available | 1 |
| XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | May 19, 2023 | In-Context Learning, Multilingual NLP | Code Available | 1 |
| On Bilingual Lexicon Induction with Large Language Models | Oct 21, 2023 | Bilingual Lexicon Induction, Cross-Lingual Word Embeddings | Code Available | 1 |
| Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages | Apr 12, 2021 | Machine Translation, Multilingual NLP | Code Available | 1 |
| fugashi, a Tool for Tokenizing Japanese in Python | Oct 14, 2020 | Multilingual NLP | Code Available | 1 |
| HONEST: Measuring Hurtful Sentence Completion in Language Models | Jun 1, 2021 | Hate Speech Detection, Hurtful Sentence Completion | Code Available | 1 |
| PMIndia -- A Collection of Parallel Corpora of Languages of India | Jan 27, 2020 | Machine Translation, Multilingual NLP | Code Available | 1 |
| Unsupervised Cross-lingual Representation Learning at Scale | Nov 5, 2019 | Cross-Lingual Transfer, Language Modeling | Code Available | 1 |
| DetIE: Multilingual Open Information Extraction Inspired by Object Detection | Jun 24, 2022 | Multilingual NLP, Object | Code Available | 1 |
| Language-agnostic BERT Sentence Embedding | Jul 3, 2020 | Language Modeling, Language Modelling | Code Available | 1 |
| Improving Bilingual Lexicon Induction with Cross-Encoder Reranking | Oct 30, 2022 | Bilingual Lexicon Induction, Cross Encoder Reranking | Code Available | 1 |
| Improving Word Translation via Two-Stage Contrastive Learning | Mar 15, 2022 | Bilingual Lexicon Induction, Contrastive Learning | Code Available | 1 |
| Simultaneous Translation and Paraphrase for Language Education | Jul 1, 2020 | Machine Translation, Multilingual NLP | Code Available | 1 |
| Improving Word Translation via Two-Stage Contrastive Learning | Nov 16, 2021 | Bilingual Lexicon Induction, Contrastive Learning | Code Available | 1 |
| Discovering Representation Sprachbund For Multilingual Pre-Training | Sep 1, 2021 | Multilingual NLP | Unverified | 0 |
| Don't Touch My Diacritics | Oct 31, 2024 | Multilingual NLP | Unverified | 0 |
| Dravidian language family through Universal Dependencies lens | Jun 20, 2024 | Multilingual NLP | Unverified | 0 |
| Fairness in Representation for Multilingual NLP: Insights from Controlled Experiments on Conditional Language Modeling | Sep 29, 2021 | Fairness, Language Modeling | Unverified | 0 |
| HausaNLP: Current Status, Challenges and Future Directions for Hausa Natural Language Processing | May 20, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| How does a Multilingual LM Handle Multiple Languages? | Feb 6, 2025 | Multilingual NLP, Multilingual Word Embeddings | Unverified | 0 |
| How Do Multilingual Encoders Learn Cross-lingual Representation? | Jul 12, 2022 | Cross-Lingual Transfer, Multilingual NLP | Unverified | 0 |
| How Good is Your Wikipedia? Auditing Data Quality for Low-resource and Multilingual NLP | Nov 8, 2024 | Articles, Multilingual NLP | Unverified | 0 |
| Development of the Multilingual Semantic Annotation System | May 1, 2015 | Multilingual NLP | Unverified | 0 |
| A Multilingual Modeling Method for Span-Extraction Reading Comprehension | May 31, 2021 | Multilingual NLP, Reading Comprehension | Unverified | 0 |
| A Reproducibility Study on Quantifying Language Similarity: The Impact of Missing Values in the URIEL Knowledge Base | May 17, 2024 | Missing Values, Multilingual NLP | Unverified | 0 |
| A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems | Oct 19, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| Automatic Parallel Corpus Creation for Hindi-English News Translation Task | Jan 24, 2019 | Machine Translation, Multilingual NLP | Unverified | 0 |
| Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks | May 24, 2024 | Benchmarking, Decoder | Unverified | 0 |
| Beyond Static Models and Test Sets: Benchmarking the Potential of Pre-trained Models Across Tasks and Languages | May 12, 2022 | Benchmarking, Diversity | Unverified | 0 |
| Bias Beyond English: Evaluating Social Bias and Debiasing Methods in a Low-Resource Setting | Apr 15, 2025 | Fairness, Multilingual NLP | Unverified | 0 |
| Challenges and Considerations with Code-Mixed NLP for Multilingual Societies | Jun 15, 2021 | Management, Multilingual NLP | Unverified | 0 |
| Challenges and Strategies in Cross-Cultural NLP | Mar 18, 2022 | Cultural Vocal Bursts Intensity Prediction, Diversity | Unverified | 0 |
| ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning | Apr 12, 2023 | Multilingual NLP, Text Generation | Unverified | 0 |
| Code-Mixed Telugu-English Hate Speech Detection | Feb 15, 2025 | Hate Speech Detection, Multilingual NLP | Unverified | 0 |
| Coreference Strategies in English-German Translation | Dec 1, 2020 | Machine Translation, Multilingual NLP | Unverified | 0 |
| Cross-Linguistic Transfer in Multilingual NLP: The Role of Language Families and Morphology | May 20, 2025 | Cross-Lingual Transfer, Multilingual NLP | Unverified | 0 |
| DBpedia Abstracts: A Large-Scale, Open, Multilingual NLP Training Corpus | May 1, 2016 | Entity Linking, Multilingual NLP | Unverified | 0 |
| Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities | Jul 9, 2024 | Multilingual NLP, Question Answering | Unverified | 0 |
| I’ve got a construction looks funny – representing and recovering non-standard constructions in UD | Dec 1, 2020 | Multilingual NLP | Unverified | 0 |
| IXA pipeline: Efficient and Ready to Use Multilingual NLP tools | May 1, 2014 | Coreference Resolution, Multilingual NLP | Unverified | 0 |
| LAGO: Few-shot Crosslingual Embedding Inversion Attacks via Language Similarity-Aware Graph Optimization | May 21, 2025 | Distributed Optimization, Multilingual NLP | Unverified | 0 |
| Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora | Jun 19, 2024 | Multilingual NLP | Unverified | 0 |
| Multilinguality Does not Make Sense: Investigating Factors Behind Zero-Shot Transfer in Sense-Aware Tasks | May 30, 2025 | Cross-Lingual Transfer, Multilingual NLP | Unverified | 0 |
| Multilingual Prompt Engineering in Large Language Models: A Survey Across NLP Tasks | May 16, 2025 | Multilingual NLP, Prompt Engineering | Unverified | 0 |
| Multilingual Word Embeddings for Low-Resource Languages using Anchors and a Chain of Related Languages | Nov 21, 2023 | Bilingual Lexicon Induction, Multilingual NLP | Unverified | 0 |
| Multi-teacher Distillation for Multilingual Spelling Correction | Nov 20, 2023 | Multilingual NLP, Speech-to-Text | Unverified | 0 |
| nmT5 -- Is parallel data still relevant for pre-training massively multilingual language models? | Jun 3, 2021 | Language Modeling, Language Modelling | Unverified | 0 |
| nmT5 - Is parallel data still relevant for pre-training massively multilingual language models? | Aug 1, 2021 | Language Modeling, Language Modelling | Unverified | 0 |