| Introducing Syllable Tokenization for Low-resource Languages: A Case Study with Swahili | Mar 26, 2024 | Multilingual NLP, Text Generation | Unverified | 0 |
| Is Translation All You Need? A Study on Solving Multilingual Tasks with Large Language Models | Mar 15, 2024 | All, Multilingual NLP | Unverified | 0 |
| A Measure for Transparent Comparison of Linguistic Diversity in Multilingual NLP Data Sets | Mar 6, 2024 | Diversity, Multilingual NLP | Code Available | 0 |
| Self-Augmented In-Context Learning for Unsupervised Word Translation | Feb 15, 2024 | Bilingual Lexicon Induction, Cross-Lingual Word Embeddings | Code Available | 0 |
| What is "Typological Diversity" in NLP? | Feb 6, 2024 | Diversity, Multilingual NLP | Code Available | 0 |
| Patterns of Persistence and Diffusibility across the World's Languages | Jan 3, 2024 | Multilingual NLP | Unverified | 0 |
| Multilingual Word Embeddings for Low-Resource Languages using Anchors and a Chain of Related Languages | Nov 21, 2023 | Bilingual Lexicon Induction, Multilingual NLP | Unverified | 0 |
| Multi-teacher Distillation for Multilingual Spelling Correction | Nov 20, 2023 | Multilingual NLP, Speech-to-Text | Unverified | 0 |
| On Bilingual Lexicon Induction with Large Language Models | Oct 21, 2023 | Bilingual Lexicon Induction, Cross-Lingual Word Embeddings | Code Available | 1 |
| A Systematic Study of Performance Disparities in Multilingual Task-Oriented Dialogue Systems | Oct 19, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| BaitBuster-Bangla: A Comprehensive Dataset for Clickbait Detection in Bangla with Multi-Feature and Multi-Modal Analysis | Oct 13, 2023 | Classification, Clickbait Detection | Code Available | 0 |
| Evaluating The Effectiveness of Capsule Neural Network in Toxic Comment Classification using Pre-trained BERT Embeddings | Oct 12, 2023 | Multilingual NLP, Natural Language Understanding | Code Available | 0 |
| Zero-shot Cross-lingual Transfer without Parallel Corpus | Oct 7, 2023 | Cross-Lingual Transfer, Multilingual NLP | Unverified | 0 |
| Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer | Sep 19, 2023 | Cross-Lingual Transfer, Multilingual NLP | Code Available | 0 |
| Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca | Sep 16, 2023 | Instruction Following, Large Language Model | Code Available | 0 |
| Crosslingual Transfer Learning for Low-Resource Languages Based on Multilingual Colexification Graphs | May 22, 2023 | Multilingual NLP, Retrieval | Code Available | 0 |
| XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | May 19, 2023 | In-Context Learning, Multilingual NLP | Code Available | 1 |
| PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India | May 15, 2023 | Cross-Lingual Abstractive Summarization, Multilingual NLP | Code Available | 0 |
| A General-Purpose Multilingual Document Encoder | May 11, 2023 | Cross-Lingual Transfer, Document Classification | Code Available | 0 |
| ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning | Apr 12, 2023 | Multilingual NLP, Text Generation | Unverified | 0 |
| PEACH: Pre-Training Sequence-to-Sequence Multilingual Models for Translation with Semi-Supervised Pseudo-Parallel Document Generation | Apr 3, 2023 | Denoising, Language Modeling | Code Available | 0 |
| BLOOM: A 176B-Parameter Open-Access Multilingual Language Model | Nov 9, 2022 | Decoder, Language Modeling | Code Available | 4 |
| Improving Bilingual Lexicon Induction with Cross-Encoder Reranking | Oct 30, 2022 | Bilingual Lexicon Induction, Cross Encoder Reranking | Code Available | 1 |
| HumSet: Dataset of Multilingual Information Extraction and Classification for Humanitarian Crisis Response | Oct 10, 2022 | Humanitarian, Multilabel Text Classification | Code Available | 0 |
| How Do Multilingual Encoders Learn Cross-lingual Representation? | Jul 12, 2022 | Cross-Lingual Transfer, Multilingual NLP | Unverified | 0 |