| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | Nov 9, 2022 | Decoder, Language Modeling | Code Available | 4 | 5 |
| Improving Word Translation via Two-Stage Contrastive Learning | Nov 16, 2021 | Bilingual Lexicon Induction, Contrastive Learning | Code Available | 1 | 5 |
| Improving Bilingual Lexicon Induction with Cross-Encoder Reranking | Oct 30, 2022 | Bilingual Lexicon Induction, Cross Encoder Reranking | Code Available | 1 | 5 |
| WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER | Nov 1, 2021 | Domain Adaptation, Multilingual Named Entity Recognition | Code Available | 1 | 5 |
| HONEST: Measuring Hurtful Sentence Completion in Language Models | Jun 1, 2021 | Hate Speech Detection, Hurtful Sentence Completion | Code Available | 1 | 5 |
| Unsupervised Cross-lingual Representation Learning at Scale | Nov 5, 2019 | Cross-Lingual Transfer, Language Modeling | Code Available | 1 | 5 |
| fugashi, a Tool for Tokenizing Japanese in Python | Oct 14, 2020 | Multilingual NLP | Code Available | 1 | 5 |
| DetIE: Multilingual Open Information Extraction Inspired by Object Detection | Jun 24, 2022 | Multilingual NLP, Object | Code Available | 1 | 5 |
| Trankit: A Light-Weight Transformer-based Toolkit for Multilingual Natural Language Processing | Jan 9, 2021 | Dependency Parsing, Language Modeling | Code Available | 1 | 5 |
| XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | May 19, 2023 | In-Context Learning, Multilingual NLP | Code Available | 1 | 5 |
| Simultaneous Translation and Paraphrase for Language Education | Jul 1, 2020 | Machine Translation, Multilingual NLP | Code Available | 1 | 5 |
| Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages | Apr 12, 2021 | Machine Translation, Multilingual NLP | Code Available | 1 | 5 |
| PMIndia -- A Collection of Parallel Corpora of Languages of India | Jan 27, 2020 | Machine Translation, Multilingual NLP | Code Available | 1 | 5 |
| On Bilingual Lexicon Induction with Large Language Models | Oct 21, 2023 | Bilingual Lexicon Induction, Cross-Lingual Word Embeddings | Code Available | 1 | 5 |
| Improving Word Translation via Two-Stage Contrastive Learning | Mar 15, 2022 | Bilingual Lexicon Induction, Contrastive Learning | Code Available | 1 | 5 |
| Language-agnostic BERT Sentence Embedding | Jul 3, 2020 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| TeDDi Sample: Text Data Diversity Sample for Language Comparison and Multilingual NLP | Jun 1, 2022 | Diversity, Multilingual NLP | Code Available | 0 | 5 |
| A General-Purpose Multilingual Document Encoder | May 11, 2023 | Cross-Lingual Transfer, Document Classification | Code Available | 0 | 5 |
| A Measure for Transparent Comparison of Linguistic Diversity in Multilingual NLP Data Sets | Mar 6, 2024 | Diversity, Multilingual NLP | Code Available | 0 | 5 |
| Analysing The Impact Of Linguistic Features On Cross-Lingual Transfer | May 12, 2021 | Cross-Lingual Transfer, Multilingual NLP | Code Available | 0 | 5 |
| Analyzing Language Bias Between French and English in Conventional Multilingual Sentiment Analysis Models | May 7, 2024 | Multilingual NLP, Sentiment Analysis | Code Available | 0 | 5 |
| BaitBuster-Bangla: A Comprehensive Dataset for Clickbait Detection in Bangla with Multi-Feature and Multi-Modal Analysis | Oct 13, 2023 | Classification, Clickbait Detection | Code Available | 0 | 5 |
| Crosslingual Transfer Learning for Low-Resource Languages Based on Multilingual Colexification Graphs | May 22, 2023 | Multilingual NLP, Retrieval | Code Available | 0 | 5 |
| Cultural and Geographical Influences on Image Translatability of Words across Languages | Jun 1, 2021 | Cultural Vocal Bursts Intensity Prediction, Low Resource Neural Machine Translation | Code Available | 0 | 5 |
| Evaluating The Effectiveness of Capsule Neural Network in Toxic Comment Classification using Pre-trained BERT Embeddings | Oct 12, 2023 | Multilingual NLP, Natural Language Understanding | Code Available | 0 | 5 |
| HumSet: Dataset of Multilingual Information Extraction and Classification for Humanitarian Crisis Response | Oct 10, 2022 | Humanitarian, Multilabel Text Classification | Code Available | 0 | 5 |
| Improving Cross-Lingual Word Embeddings by Meeting in the Middle | Aug 27, 2018 | Cross-Lingual Word Embeddings, Multilingual NLP | Code Available | 0 | 5 |
| Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis | Dec 1, 2020 | Clustering, Multilingual NLP | Code Available | 0 | 5 |
| MMCR4NLP: Multilingual Multiway Corpora Repository for Natural Language Processing | Oct 3, 2017 | Machine Translation, Multilingual NLP | Code Available | 0 | 5 |
| Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca | Sep 16, 2023 | Instruction Following, Large Language Model | Code Available | 0 | 5 |
| News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation | Jun 18, 2024 | Cross-Lingual Transfer, Domain Adaptation | Code Available | 0 | 5 |
| PEACH: Pre-Training Sequence-to-Sequence Multilingual Models for Translation with Semi-Supervised Pseudo-Parallel Document Generation | Apr 3, 2023 | Denoising, Language Modeling | Code Available | 0 | 5 |
| PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India | May 15, 2023 | Cross-Lingual Abstractive Summarization, Multilingual NLP | Code Available | 0 | 5 |
| ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models | Jun 13, 2024 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer | Sep 19, 2023 | Cross-Lingual Transfer, Multilingual NLP | Code Available | 0 | 5 |
| Self-Augmented In-Context Learning for Unsupervised Word Translation | Feb 15, 2024 | Bilingual Lexicon Induction, Cross-Lingual Word Embeddings | Code Available | 0 | 5 |
| Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation | Jun 4, 2019 | Multilingual Named Entity Recognition, Multilingual NLP | Code Available | 0 | 5 |
| SICK-NL: A Dataset for Dutch Natural Language Inference | Apr 1, 2021 | Multilingual NLP, Natural Language Inference | Code Available | 0 | 5 |
| SICKNL: A Dataset for Dutch Natural Language Inference | Jan 14, 2021 | Multilingual NLP, Natural Language Inference | Code Available | 0 | 5 |
| UQA: Corpus for Urdu Question Answering | May 2, 2024 | Multilingual NLP, Question Answering | Code Available | 0 | 5 |
| What Drives Performance in Multilingual Language Models? | Apr 29, 2024 | Cross-Lingual Transfer, Multilingual NLP | Code Available | 0 | 5 |
| What is "Typological Diversity" in NLP? | Feb 6, 2024 | Diversity, Multilingual NLP | Code Available | 0 | 5 |
| XeroAlign: Zero-Shot Cross-lingual Transformer Alignment | May 6, 2021 | Multilingual NLP, Natural Language Understanding | Code Available | 0 | 5 |
| I’ve got a construction looks funny – representing and recovering non-standard constructions in UD | Dec 1, 2020 | Multilingual NLP | Unverified | 0 | 0 |
| IXA pipeline: Efficient and Ready to Use Multilingual NLP tools | May 1, 2014 | Coreference Resolution, Multilingual NLP | Unverified | 0 | 0 |
| LAGO: Few-shot Crosslingual Embedding Inversion Attacks via Language Similarity-Aware Graph Optimization | May 21, 2025 | Distributed Optimization, Multilingual NLP | Unverified | 0 | 0 |
| Challenges and Strategies in Cross-Cultural NLP | Mar 18, 2022 | Cultural Vocal Bursts Intensity Prediction, Diversity | Unverified | 0 | 0 |
| Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora | Jun 19, 2024 | Multilingual NLP | Unverified | 0 | 0 |
| A Reproducibility Study on Quantifying Language Similarity: The Impact of Missing Values in the URIEL Knowledge Base | May 17, 2024 | Missing Values, Multilingual NLP | Unverified | 0 | 0 |
| Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks | May 24, 2024 | Benchmarking, Decoder | Unverified | 0 | 0 |