SOTAVerified

Cross-Lingual Transfer

Cross-lingual transfer is a form of transfer learning in which data and models from a resource-rich language (e.g., English) are used to solve tasks in another, typically lower-resource, language.
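The definition above can be illustrated with a toy zero-shot sketch. This is not a method from any of the papers listed here: the tiny hand-written lexicon (`SHARED_LEXICON`) is a hypothetical stand-in for the shared multilingual representations that real systems learn, and the function names are assumptions made for illustration. A classifier is fit on labelled English sentences only, then applied unchanged to Spanish input.

```python
from collections import Counter

# Hypothetical shared "concept" space: a toy bilingual lexicon that maps
# English and Spanish words onto the same language-neutral ids. Real systems
# would use learned multilingual embeddings instead.
SHARED_LEXICON = {
    "good": "POS", "great": "POS", "bueno": "POS", "excelente": "POS",
    "bad": "NEG", "awful": "NEG", "malo": "NEG", "terrible": "NEG",
}


def featurize(text):
    """Map a sentence to counts over shared concept ids; unknown words are dropped."""
    words = text.lower().split()
    return Counter(SHARED_LEXICON[w] for w in words if w in SHARED_LEXICON)


def train(examples):
    """Sum feature counts per label; examples are (text, label) pairs."""
    profiles = {}
    for text, label in examples:
        profiles.setdefault(label, Counter()).update(featurize(text))
    return profiles


def predict(profiles, text):
    """Return the label whose profile overlaps most with the input features."""
    feats = featurize(text)

    def score(label):
        return sum(feats[c] * profiles[label][c] for c in feats)

    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(profiles), key=score)


# Source language (English): labelled data is available.
english_train = [
    ("a good great film", "positive"),
    ("a bad awful film", "negative"),
]
model = train(english_train)

# Target language (Spanish): no labelled data, so the model is applied zero-shot.
print(predict(model, "un film excelente"))
print(predict(model, "un film malo"))
```

The transfer works here only because both languages are projected into the same feature space before classification; words missing from the lexicon contribute nothing, which loosely mirrors how coverage gaps limit transfer to low-resource languages.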

Papers

Showing 1-25 of 782 papers

Title | Status | Hype
T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings | Code | 2
MIND Your Language: A Multilingual Dataset for Cross-lingual News Recommendation | Code | 2
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants | Code | 2
Crosslingual Generalization through Multitask Finetuning | Code | 2
Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings | Code | 2
Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation | Code | 2
mGPT: Few-Shot Learners Go Multilingual | Code | 2
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer | Code | 2
Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs | Code | 1
Bridging the Gap: Enhancing LLM Performance for Low-Resource African Languages with New Benchmarks, Fine-Tuning, and Cultural Adjustments | Code | 1
Multilingual Large Language Models: A Systematic Survey | Code | 1
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators | Code | 1
From One to Many: Expanding the Scope of Toxicity Mitigation in Language Models | Code | 1
AdaMergeX: Cross-Lingual Transfer with Large Language Models via Adaptive Adapter Merging | Code | 1
ColBERT-XM: A Modular Multi-Vector Representation Model for Zero-Shot Multilingual Information Retrieval | Code | 1
Investigating Cultural Alignment of Large Language Models | Code | 1
LEIA: Facilitating Cross-lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation | Code | 1
UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset | Code | 1
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed? | Code | 1
TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes | Code | 1
Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media | Code | 1
CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition | Code | 1
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects | Code | 1
mCLIP: Multilingual CLIP via Cross-lingual Transfer | Code | 1
Allophant: Cross-lingual Phoneme Recognition with Articulatory Attributes | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PaLM 2 (few-shot) | Accuracy | 94.4 | - | Unverified
2 | mT0-13B | Accuracy | 84.45 | - | Unverified
3 | RoBERTa Large (translate test) | Accuracy | 76.05 | - | Unverified
4 | BLOOMZ | Accuracy | 75.5 | - | Unverified
5 | MAD-X Base | Accuracy | 60.94 | - | Unverified
6 | mGPT | Accuracy | 55.5 | - | Unverified