| FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models | May 23, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| DUMB: A Benchmark for Smart Evaluation of Dutch Models | May 22, 2023 | XLM-R | Code Available | 1 |
| Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages | May 20, 2023 | Language Modelling, XLM-R | Code Available | 1 |
| ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for the Job Market Domain | May 20, 2023 | De-identification, Language Modeling | Code Available | 1 |
| LEXTREME: A Multi-Lingual and Multi-Task Benchmark for the Legal Domain | Jan 30, 2023 | XLM-R | Code Available | 1 |
| Multilingual Sentence Transformer as A Multilingual Word Aligner | Jan 28, 2023 | Sentence, Word Alignment | Code Available | 1 |
| ViHOS: Hate Speech Spans Detection for Vietnamese | Jan 24, 2023 | Hate Span Identification, Sequence-to-sequence Language Modeling | Code Available | 1 |
| Towards Leaving No Indic Language Behind: Building Monolingual Corpora, Benchmark and Models for Indic Languages | Dec 11, 2022 | Natural Language Understanding, XLM-R | Code Available | 1 |
| Improving Bilingual Lexicon Induction with Cross-Encoder Reranking | Oct 30, 2022 | Bilingual Lexicon Induction, Cross Encoder Reranking | Code Available | 1 |
| VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models | May 30, 2022 | Vietnamese Natural Language Understanding, XLM-R | Code Available | 1 |