Machine Translation

Machine translation is the task of translating a sentence in a source language to a different target language.

Approaches to machine translation range from rule-based to statistical to neural. More recently, attention-based encoder-decoder architectures such as the Transformer have brought major improvements in machine translation.
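As a rough illustration of what "attention-based" means in an encoder-decoder model, the sketch below computes scaled dot-product attention (the variant used in Transformer models) for a single decoder query over a set of encoder states. It is a minimal pure-Python sketch, not any particular system's implementation; the function names `softmax` and `attend` and the toy vectors are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one decoder query (illustrative).

    query:  list of d floats (a decoder state)
    keys:   one d-dimensional vector per source position (encoder states)
    values: one vector per source position
    Returns (context_vector, attention_weights).
    """
    scale = math.sqrt(len(query))
    # Similarity between the query and every source position.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The context vector is the attention-weighted average of the values.
    dim = len(values[0])
    context = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return context, weights


# Toy example: the query matches the first source position more closely.
context, weights = attend(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

In a translation model, the weights indicate how strongly each source position contributes to the next target token; real systems compute this in batched matrix form with learned projections.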

The WMT family of datasets is among the most popular benchmarks for machine translation systems. Commonly used evaluation metrics include BLEU, METEOR, and NIST.
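To make the BLEU metric concrete, here is a minimal sketch of unsmoothed sentence-level BLEU against a single reference: the geometric mean of clipped n-gram precisions (n = 1 to 4) multiplied by a brevity penalty. The helper names `ngrams` and `sentence_bleu` are assumptions for this example. Benchmark figures are typically corpus-level BLEU reported on a 0 to 100 scale and computed with standard tooling, so this sketch (which returns a value between 0 and 1) is not expected to reproduce them.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference (illustrative)."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty punishes candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
exact = sentence_bleu("the cat sat on the mat".split(), reference)
truncated = sentence_bleu("the cat sat on".split(), reference)
```

Here `exact` is a perfect match, while `truncated` has perfect n-gram precision but is shorter than the reference, so only the brevity penalty lowers its score.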


Papers


Showing 551–600 of 10752 papers

Title	Date	Tasks	Status	Hype
Non-Autoregressive Text Generation with Pre-trained Language Models	Feb 16, 2021	Machine Translation, Sentence	Code Available	1
GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training	Feb 16, 2021	Image Classification, Language Modeling	Code Available	1
The first large scale collection of diverse Hausa language datasets	Feb 13, 2021	Machine Translation	Code Available	1
Cross-lingual Visual Pre-training for Multimodal Machine Translation	Jan 25, 2021	Language Modelling, Machine Translation	Code Available	1
Towards a Better Integration of Fuzzy Matches in Neural Machine Translation through Data Augmentation	Jan 24, 2021	Data Augmentation, Machine Translation	Code Available	1
A Reinforcement Learning Based Encoder-Decoder Framework for Learning Stock Trading Rules	Jan 8, 2021	Decoder, Deep Reinforcement Learning	Code Available	1
N-Bref : A High-fidelity Decompiler Exploiting Programming Structures	Jan 1, 2021	Code Generation, Decoder	Code Available	1
Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers	Jan 1, 2021	Abstractive Text Summarization, Language Modeling	Code Available	1
Vocabulary Learning via Optimal Transport for Neural Machine Translation	Dec 31, 2020	GPU, Machine Translation	Code Available	1
Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade	Dec 31, 2020	Machine Translation, Translation	Code Available	1
Improving BERT with Syntax-aware Local Attention	Dec 30, 2020	Machine Translation, Question Answering	Code Available	1
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning	Dec 29, 2020	Decoder, Grammatical Error Correction	Code Available	1
Learning Light-Weight Translation Models from Deep Transformer	Dec 27, 2020	Knowledge Distillation, Machine Translation	Code Available	1
RealFormer: Transformer Likes Residual Attention	Dec 21, 2020	Language Modeling, Language Modelling	Code Available	1
Finding Sparse Structures for Domain Specific Neural Machine Translation	Dec 19, 2020	Domain Adaptation, Machine Translation	Code Available	1
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation	Dec 14, 2020	Conditional Text Generation, Contrastive Learning	Code Available	1
Mask-Align: Self-Supervised Neural Word Alignment	Dec 13, 2020	Machine Translation, Translation	Code Available	1
ParsiNLU: A Suite of Language Understanding Challenges for Persian	Dec 11, 2020	Machine Translation, Natural Language Inference	Code Available	1
Document-aligned Japanese-English Conversation Parallel Corpus	Dec 11, 2020	Machine Translation, Sentence	Code Available	1
Globetrotter: Connecting Languages by Connecting Images	Dec 8, 2020	Machine Translation, Retrieval	Code Available	1
SemMT: A Semantic-based Testing Approach for Machine Translation Systems	Dec 3, 2020	Machine Translation, Semantic Similarity	Code Available	1
SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP	Dec 1, 2020	Machine Translation, Sentence	Code Available	1
Is normalization indispensable for training deep neural network?	Dec 1, 2020	General Classification, image-classification	Code Available	1
EDITOR: an Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints	Nov 13, 2020	Imitation Learning, Machine Translation	Code Available	1
An Unsupervised method for OCR Post-Correction and Spelling Normalisation for Finnish	Nov 6, 2020	Machine Translation, NMT	Code Available	1
Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation	Nov 6, 2020	Machine Translation, Style Transfer	Code Available	1
Detecting Hallucinated Content in Conditional Neural Sequence Generation	Nov 5, 2020	Abstractive Text Summarization, Hallucination	Code Available	1
MK-SQuIT: Synthesizing Questions using Iterative Template-filling	Nov 4, 2020	Dataset Generation, Machine Translation	Code Available	1
PheMT: A Phenomenon-wise Dataset for Machine Translation Robustness on User-Generated Contents	Nov 4, 2020	Machine Translation, NMT	Code Available	1
Emergent Communication Pretraining for Few-Shot Machine Translation	Nov 2, 2020	Machine Translation, NMT	Code Available	1
Bifixer and Bicleaner: two open-source tools to clean your parallel data	Nov 1, 2020	Machine Translation, Translation	Code Available	1
Tencent AI Lab Machine Translation Systems for the WMT20 Biomedical Translation Task	Nov 1, 2020	Machine Translation, Translation	Code Available	1
Filtering Noisy Parallel Corpus using Transformers with Proxy Task Learning	Nov 1, 2020	Language Modeling, Language Modelling	Code Available	1
IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation	Nov 1, 2020	Dynamic Time Warping, Machine Translation	Code Available	1
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages	Nov 1, 2020	Machine Translation, Translation	Code Available	1
Effective Deep Learning Models for Automatic Diacritization of Arabic Text	Nov 1, 2020	Arabic Text Diacritization, Decoder	Code Available	1
Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation	Oct 24, 2020	Knowledge Distillation, Machine Translation	Code Available	1
Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions	Oct 24, 2020	Machine Translation, Object Recognition	Code Available	1
XOR QA: Cross-lingual Open-Retrieval Question Answering	Oct 22, 2020	Articles, Machine Translation	Code Available	1
Beyond English-Centric Multilingual Machine Translation	Oct 21, 2020	Machine Translation, Translation	Code Available	1
Multi-Unit Transformers for Neural Machine Translation	Oct 21, 2020	Machine Translation, Translation	Code Available	1
On the Potential of Lexico-logical Alignments for Semantic Parsing to SQL Queries	Oct 21, 2020	Decoder, Machine Translation	Code Available	1
Analyzing the Source and Target Contributions to Predictions in Neural Machine Translation	Oct 21, 2020	Language Modeling, Language Modelling	Code Available	1
Exploring Sequence-to-Sequence Models for SPARQL Pattern Composition	Oct 21, 2020	Machine Translation, Question Answering	Code Available	1
WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information	Oct 21, 2020	Audio captioning, Decoder	Code Available	1
Bayesian Attention Modules	Oct 20, 2020	Image Captioning, Machine Translation	Code Available	1
Human-Paraphrased References Improve Neural Machine Translation	Oct 20, 2020	Machine Translation, NMT	Code Available	1
Incorporating Terminology Constraints in Automatic Post-Editing	Oct 19, 2020	Automatic Post-Editing, Data Augmentation	Code Available	1
Rethinking Document-level Neural Machine Translation	Oct 18, 2020	Document Translation, Machine Translation	Code Available	1
Incorporating BERT into Parallel Sequence Decoding with Adapters	Oct 13, 2020	Machine Translation, Natural Language Understanding	Code Available	1


Datasets: WMT2014 English-German, WMT2014 English-French, IWSLT2014 German-English, ACES, WMT2016 English-Romanian, WMT2016 Romanian-English, WMT2014 German-English, IWSLT2015 German-English, WMT2016 English-German, IWSLT2015 English-Vietnamese, IWSLT2015 English-German, WMT2016 German-English

Benchmark Results

WMT2014 English-German
#	Model	Metric	Claimed	Verified	Status
1	Transformer Cycle (Rev)	BLEU score	35.14	—	Unverified
2	Noisy back-translation	BLEU score	35	—	Unverified
3	Transformer+Rep(Uni)	BLEU score	33.89	—	Unverified
4	T5-11B	BLEU score	32.1	—	Unverified
5	BiBERT	BLEU score	31.26	—	Unverified
6	Transformer + R-Drop	BLEU score	30.91	—	Unverified
7	Bi-SimCut	BLEU score	30.78	—	Unverified
8	BERT-fused NMT	BLEU score	30.75	—	Unverified
9	Data Diversification - Transformer	BLEU score	30.7	—	Unverified
10	SimCut	BLEU score	30.56	—	Unverified

WMT2014 English-French
#	Model	Metric	Claimed	Verified	Status
1	Transformer+BT (ADMIN init)	BLEU score	46.4	—	Unverified
2	Noisy back-translation	BLEU score	45.6	—	Unverified
3	mRASP+Fine-Tune	BLEU score	44.3	—	Unverified
4	Transformer + R-Drop	BLEU score	43.95	—	Unverified
5	Transformer (ADMIN init)	BLEU score	43.8	—	Unverified
6	Admin	BLEU score	43.8	—	Unverified
7	BERT-fused NMT	BLEU score	43.78	—	Unverified
8	MUSE (Parallel Multi-scale Attention)	BLEU score	43.5	—	Unverified
9	T5	BLEU score	43.4	—	Unverified
10	Local Joint Self-attention	BLEU score	43.3	—	Unverified

IWSLT2014 German-English
#	Model	Metric	Claimed	Verified	Status
1	PiNMT	BLEU score	40.43	—	Unverified
2	BiBERT	BLEU score	38.61	—	Unverified
3	Bi-SimCut	BLEU score	38.37	—	Unverified
4	Cutoff + Relaxed Attention + LM	BLEU score	37.96	—	Unverified
5	DRDA	BLEU score	37.95	—	Unverified
6	Transformer + R-Drop + Cutoff	BLEU score	37.9	—	Unverified
7	SimCut	BLEU score	37.81	—	Unverified
8	Cutoff+Knee	BLEU score	37.78	—	Unverified
9	Cutoff	BLEU score	37.6	—	Unverified
10	CipherDAug	BLEU score	37.53	—	Unverified

ACES
#	Model	Metric	Claimed	Verified	Status
1	HWTSC-Teacher-Sim	Score	19.97	—	Unverified
2	MS-COMET-22	Score	19.89	—	Unverified
3	MS-COMET-QE-22	Score	19.76	—	Unverified
4	KG-BERTScore	Score	17.28	—	Unverified
5	metricx_xl_DA_2019	Score	17.17	—	Unverified
6	COMET-QE	Score	16.8	—	Unverified
7	COMET-22	Score	16.31	—	Unverified
8	UniTE-src	Score	15.68	—	Unverified
9	UniTE-ref	Score	15.38	—	Unverified
10	metricx_xxl_DA_2019	Score	15.24	—	Unverified