Machine Translation

Machine translation is the task of translating a sentence in a source language to a different target language.

Approaches to machine translation range from rule-based to statistical to neural. More recently, attention-based encoder-decoder architectures such as the Transformer have brought major improvements in translation quality.
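As a rough illustration of the attention step in an attention-based encoder-decoder model, the sketch below computes scaled dot-product attention for a single decoder query over a list of encoder states. The scaled dot-product form and the names `softmax` and `attend` are assumptions made for this example; the text above does not specify a particular attention variant.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    query:  list of floats (decoder state, hypothetical)
    keys:   list of encoder key vectors, same dimension as query
    values: list of encoder value vectors, one per key
    Returns (attention weights, context vector).
    """
    scale = math.sqrt(len(query))
    # Similarity of the query to every encoder position.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Context vector: attention-weighted average of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context
```

When every key is equally similar to the query, the weights are uniform and the context vector is the plain average of the values; a key that matches the query more closely receives a larger weight.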

The WMT family of datasets is among the most popular benchmarks for machine translation systems. Commonly used evaluation metrics include BLEU, METEOR, and NIST.
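To make the BLEU metric concrete, here is a minimal sketch of unsmoothed, single-reference, sentence-level BLEU: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty. The function names and the whitespace tokenization are assumptions for illustration. Leaderboard scores are normally corpus-level and computed with standardized tools, so this sketch will not reproduce them exactly.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU for one tokenized candidate against one reference."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
candidate = "the cat sat on the mat today".split()
score = sentence_bleu(candidate, reference)  # equals (3/7) ** 0.25
```

In this example the four clipped precisions are 6/7, 5/6, 4/5, and 3/4, whose product is 3/7. The candidate is longer than the reference, so the brevity penalty is 1.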


Papers


Showing 401–450 of 10752 papers

Title	Date	Tasks	Status	Hype	Score
Bicleaner AI: Bicleaner Goes Neural	Jun 1, 2022	Binary Classification, Machine Translation	Code Available	1	5
Agent-SiMT: Agent-assisted Simultaneous Machine Translation with Large Language Models	Jun 11, 2024	Machine Translation, Sentence	Code Available	1	5
Decoder-only Streaming Transformer for Simultaneous Translation	Jun 6, 2024	Decoder, Machine Translation	Code Available	1	5
BIG-C: a Multimodal Multi-Purpose Dataset for Bemba	May 26, 2023	Machine Translation, speech-recognition	Code Available	1	5
A global analysis of metrics used for measuring performance in natural language processing	Apr 25, 2022	Benchmarking, Machine Translation	Code Available	1	5
BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation	May 23, 2023	Contrastive Learning, Machine Translation	Code Available	1	5
Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation	Jun 6, 2022	de-en, Machine Translation	Code Available	1	5
Improving Multilingual Translation by Representation and Gradient Regularization	Sep 10, 2021	Decoder, Machine Translation	Code Available	1	5
Improving Simultaneous Machine Translation with Monolingual Data	Dec 2, 2022	Hallucination, Knowledge Distillation	Code Available	1	5
Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation	May 26, 2021	Diversity, Machine Translation	Code Available	1	5
DeLighT: Deep and Light-weight Transformer	Aug 3, 2020	Language Modeling, Language Modelling	Code Available	1	5
Improving Transformer Optimization Through Better Initialization	Jan 1, 2020	Decoder, Language Modeling	Code Available	1	5
Binary and Ternary Natural Language Generation	Jun 2, 2023	Machine Translation, Quantization	Code Available	1	5
Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning	Aug 26, 2022	Cross-Modal Retrieval, Machine Translation	Code Available	1	5
Crosslingual Capabilities and Knowledge Barriers in Multilingual Large Language Models	Jun 23, 2024	Machine Translation, MMLU	Code Available	1	5
BLEU might be Guilty but References are not Innocent	Apr 13, 2020	Diversity, Machine Translation	Code Available	1	5
Block Pruning For Faster Transformers	Sep 10, 2021	Machine Translation, Question Answering	Code Available	1	5
Blockwise Parallel Decoding for Deep Autoregressive Models	Nov 7, 2018	Decoder, Image Super-Resolution	Code Available	1	5
Incorporating Terminology Constraints in Automatic Post-Editing	Oct 19, 2020	Automatic Post-Editing, Data Augmentation	Code Available	1	5
A Discriminative Hierarchical PLDA-based Model for Spoken Language Recognition	Jan 4, 2022	Machine Translation, speech-recognition	Code Available	1	5
Cross-lingual Visual Pre-training for Multimodal Machine Translation	Jan 25, 2021	Language Modelling, Machine Translation	Code Available	1	5
IndicBART: A Pre-trained Model for Indic Natural Language Generation	Sep 7, 2021	Extreme Summarization, Machine Translation	Code Available	1	5
Bridging the Gap between Different Vocabularies for LLM Ensemble	Apr 15, 2024	Arithmetic Reasoning, Data-to-Text Generation	Code Available	1	5
Bridging the Domain Gaps in Context Representations for k-Nearest Neighbor Neural Machine Translation	May 26, 2023	Domain Adaptation, Machine Translation	Code Available	1	5
IndicXNLI: Evaluating Multilingual Inference for Indian Languages	Apr 19, 2022	Cross-Lingual Transfer, Machine Translation	Code Available	1	5
IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation	Apr 16, 2021	Machine Translation, Question Answering	Code Available	1	5
Informative Language Representation Learning for Massively Multilingual Neural Machine Translation	Sep 4, 2022	Machine Translation, Navigate	Code Available	1	5
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues	Jul 23, 2020	Action Classification, Keyword Spotting	Code Available	1	5
INK: Injecting kNN Knowledge in Nearest Neighbor Machine Translation	Jun 10, 2023	Machine Translation, Translation	Code Available	1	5
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity	Feb 8, 2023	Code Generation, Hallucination	Code Available	1	5
Cross-Lingual Adaptation using Structural Correspondence Learning	Aug 4, 2010	Classification, Domain Adaptation	Code Available	1	5
Can Automatic Post-Editing Improve NMT?	Sep 30, 2020	Automatic Post-Editing, Machine Translation	Code Available	1	5
BPE-Dropout: Simple and Effective Subword Regularization	Oct 29, 2019	Machine Translation, Segmentation	Code Available	1	5
Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM	Mar 3, 2023	Cross-Lingual Transfer, Language Modeling	Code Available	1	5
CTC-based Non-autoregressive Textless Speech-to-Speech Translation	Jun 11, 2024	Knowledge Distillation, Machine Translation	Code Available	1	5
Is normalization indispensable for training deep neural network?	Dec 1, 2020	General Classification, image-classification	Code Available	1	5
It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information	May 5, 2020	Machine Translation, NMT	Code Available	1	5
Cached Transformers: Improving Transformers with Differentiable Memory Cache	Dec 20, 2023	image-classification, Image Classification	Code Available	1	5
Joint Optimization of Tokenization and Downstream Model	May 26, 2021	Machine Translation, model	Code Available	1	5
Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task?	Aug 1, 2021	Cognate Prediction, Machine Translation	Code Available	1	5
KazParC: Kazakh Parallel Corpus for Machine Translation	Mar 28, 2024	Machine Translation, Translation	Code Available	1	5
Can Language Models Make Fun? A Case Study in Chinese Comical Crosstalk	Jul 2, 2022	Benchmarking, Machine Translation	Code Available	1	5
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation	Dec 26, 2019	Benchmarking, Domain Adaptation	Code Available	1	5
AdaScale SGD: A User-Friendly Algorithm for Distributed Training	Jul 9, 2020	image-classification, Image Classification	Code Available	1	5
Can We Generate Shellcodes via Natural Language? An Empirical Study	Feb 8, 2022	Code Generation, Machine Translation	Code Available	1	5
Language and Speech Technology for Central Kurdish Varieties	Mar 4, 2024	Automatic Speech Recognition, Diversity	Code Available	1	5
CoVoST 2 and Massively Multilingual Speech-to-Text Translation	Jul 20, 2020	Machine Translation, speech-recognition	Code Available	1	5
Cascaded Head-colliding Attention	May 31, 2021	Language Modeling, Language Modelling	Code Available	1	5
A Multilingual Neural Machine Translation Model for Biomedical Data	Aug 6, 2020	Machine Translation, Translation	Code Available	1	5
A Benchmark for Evaluating Machine Translation Metrics on Dialects Without Standard Orthography	Nov 28, 2023	Machine Translation, Text Generation	Code Available	1	5


Datasets: WMT2014 English-German, WMT2014 English-French, IWSLT2014 German-English, ACES, WMT2016 English-Romanian, WMT2016 Romanian-English, WMT2014 German-English, IWSLT2015 German-English, WMT2016 English-German, IWSLT2015 English-Vietnamese, IWSLT2015 English-German, WMT2016 German-English

Benchmark Results

WMT2014 English-German

#	Model	Metric	Claimed	Verified	Status
1	Transformer Cycle (Rev)	BLEU score	35.14	—	Unverified
2	Noisy back-translation	BLEU score	35	—	Unverified
3	Transformer+Rep(Uni)	BLEU score	33.89	—	Unverified
4	T5-11B	BLEU score	32.1	—	Unverified
5	BiBERT	BLEU score	31.26	—	Unverified
6	Transformer + R-Drop	BLEU score	30.91	—	Unverified
7	Bi-SimCut	BLEU score	30.78	—	Unverified
8	BERT-fused NMT	BLEU score	30.75	—	Unverified
9	Data Diversification - Transformer	BLEU score	30.7	—	Unverified
10	SimCut	BLEU score	30.56	—	Unverified

WMT2014 English-French

#	Model	Metric	Claimed	Verified	Status
1	Transformer+BT (ADMIN init)	BLEU score	46.4	—	Unverified
2	Noisy back-translation	BLEU score	45.6	—	Unverified
3	mRASP+Fine-Tune	BLEU score	44.3	—	Unverified
4	Transformer + R-Drop	BLEU score	43.95	—	Unverified
5	Admin	BLEU score	43.8	—	Unverified
6	Transformer (ADMIN init)	BLEU score	43.8	—	Unverified
7	BERT-fused NMT	BLEU score	43.78	—	Unverified
8	MUSE (Parallel Multi-scale Attention)	BLEU score	43.5	—	Unverified
9	T5	BLEU score	43.4	—	Unverified
10	Local Joint Self-attention	BLEU score	43.3	—	Unverified

IWSLT2014 German-English

#	Model	Metric	Claimed	Verified	Status
1	PiNMT	BLEU score	40.43	—	Unverified
2	BiBERT	BLEU score	38.61	—	Unverified
3	Bi-SimCut	BLEU score	38.37	—	Unverified
4	Cutoff + Relaxed Attention + LM	BLEU score	37.96	—	Unverified
5	DRDA	BLEU score	37.95	—	Unverified
6	Transformer + R-Drop + Cutoff	BLEU score	37.9	—	Unverified
7	SimCut	BLEU score	37.81	—	Unverified
8	Cutoff+Knee	BLEU score	37.78	—	Unverified
9	Cutoff	BLEU score	37.6	—	Unverified
10	CipherDAug	BLEU score	37.53	—	Unverified

ACES

#	Model	Metric	Claimed	Verified	Status
1	HWTSC-Teacher-Sim	Score	19.97	—	Unverified
2	MS-COMET-22	Score	19.89	—	Unverified
3	MS-COMET-QE-22	Score	19.76	—	Unverified
4	KG-BERTScore	Score	17.28	—	Unverified
5	metricx_xl_DA_2019	Score	17.17	—	Unverified
6	COMET-QE	Score	16.8	—	Unverified
7	COMET-22	Score	16.31	—	Unverified
8	UniTE-src	Score	15.68	—	Unverified
9	UniTE-ref	Score	15.38	—	Unverified
10	metricx_xxl_DA_2019	Score	15.24	—	Unverified