Machine Translation

Machine translation is the task of translating a sentence from a source language into a different target language.

Approaches to machine translation range from rule-based to statistical to neural. More recently, attention-based encoder-decoder architectures such as the Transformer have brought major improvements in translation quality.

The WMT family of datasets is among the most popular benchmarks for machine translation systems. Commonly used evaluation metrics include BLEU, METEOR, and NIST.
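Since most of the benchmark results on this page are reported as BLEU, a rough sketch of how that score is built may help. The function below is a simplified, single-reference, sentence-level variant that assumes whitespace tokenization and applies no smoothing; the names `ngrams` and `sentence_bleu` are illustrative and do not belong to any particular library. Leaderboard numbers are normally corpus-level scores on a 0 to 100 scale computed with standard tools such as sacreBLEU, so this sketch will not reproduce them exactly.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU in [0, 1] (illustrative only)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Modified precision: clip each candidate n-gram count
        # by how often that n-gram occurs in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

Multiplying the result by 100 gives the scale used in the tables below. Because there is no smoothing, any candidate that shares no 4-gram with the reference scores zero, which is one reason BLEU is usually aggregated over a whole test set.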

(Image credit: Google seq2seq)

Papers


Showing 9601–9650 of 10752 papers

Title	Date	Tasks	Status
Embedding Projection for Targeted Cross-Lingual Sentiment: Model Comparisons and a Real-World Study	Jun 24, 2019	Machine Translation, Sentence	Code Available
Semantic Textual Similarity Assessment in Chest X-ray Reports Using a Domain-Specific Cosine-Based Metric	Feb 19, 2024	Diagnostic, Machine Translation	Code Available
Recovering Dropped Pronouns in Chinese Conversations via Modeling Their Referents	May 17, 2019	Machine Translation, Sentence	Code Available
Neural Machine Translation with Byte-Level Subwords	Sep 7, 2019	Machine Translation, Translation	Code Available
Mapping Supervised Bilingual Word Embeddings from English to low-resource languages	Oct 14, 2019	Machine Translation, Retrieval	Code Available
Fast Sampling via Discrete Non-Markov Diffusion Models with Predetermined Transition Time	Dec 14, 2023	Image Generation, Machine Translation	Code Available
Neural Machine Translation with Characters and Hierarchical Encoding	Oct 20, 2016	Decoder, Machine Translation	Code Available
Investigating the translation capabilities of Large Language Models trained on parallel data only	Jun 13, 2024	Decoder, Language Modeling	Code Available
Fast-Slow Recurrent Neural Networks	May 24, 2017	Language Modeling, Language Modelling	Code Available
Fast Structured Decoding for Sequence Models	Oct 25, 2019	Machine Translation, Sentence	Code Available
FASTSUBS: An Efficient and Exact Procedure for Finding the Most Likely Lexical Substitutes Based on an N-gram Language Model	May 24, 2012	Language Modeling, Language Modelling	Code Available
Curriculum learning for language modeling	Aug 4, 2021	Language Modeling, Language Modelling	Code Available
BEER 1.1: ILLC UvA submission to metrics and tuning task	Sep 1, 2015	Learning-To-Rank, Machine Translation	Code Available
FastTrees: Parallel Latent Tree-Induction for Faster Sequence Encoding	Nov 28, 2021	Language Modeling, Language Modelling	Code Available
Marathi To English Neural Machine Translation With Near Perfect Corpus And Transformers	Feb 26, 2020	Machine Translation, Translation	Code Available
Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation	Sep 22, 2021	Machine Translation, Translation	Code Available
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings	Nov 3, 2018	Cross-Lingual Bitext Mining, Machine Translation	Code Available
CLIReval: Evaluating Machine Translation as a Cross-Lingual Information Retrieval Task	Jul 1, 2020	Cross-Lingual Information Retrieval, Document Translation	Code Available
FBK-DH at SemEval-2020 Task 12: Using Multi-channel BERT for Multilingual Offensive Language Detection	Dec 1, 2020	Language Identification, Machine Translation	Code Available
Marian: Cost-effective High-Quality Neural Machine Translation in C++	May 30, 2018	CPU, GPU	Code Available
Marian: Fast Neural Machine Translation in C++	Apr 1, 2018	Decoder, Machine Translation	Code Available
Putting Evaluation in Context: Contextual Embeddings Improve Machine Translation Evaluation	Jul 1, 2019	Machine Translation, Sentence	Code Available
MARMOT: A Toolkit for Translation Quality Estimation at the Word Level	May 1, 2016	Machine Translation, Sentence	Code Available
ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change	Jan 17, 2024	Machine Translation, Retrieval	Code Available
Classification of telicity using cross-linguistic annotation projection	Sep 1, 2017	Classification, General Classification	Code Available
MASIVE: Open-Ended Affective State Identification in English and Spanish	Jul 16, 2024	Emotion Recognition, Machine Translation	Code Available
Ckylark: A More Robust PCFG-LA Parser	Jun 1, 2015	Machine Translation, Natural Language Inference	Code Available
Curated Datasets and Neural Models for Machine Translation of Informal Registers between Mayan and Spanish Vernaculars	Apr 11, 2024	Machine Translation, Translation	Code Available
Chunk-based Nearest Neighbor Machine Translation	May 24, 2022	Domain Adaptation, Language Modeling	Code Available
Neural Machine Translation with Error Correction	Jul 21, 2020	Decoder, Machine Translation	Code Available
FGraDA: A Dataset and Benchmark for Fine-Grained Domain Adaptation in Machine Translation	Dec 31, 2020	Autonomous Vehicles, Diversity	Code Available
Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme	Jul 26, 2024	Machine Translation	Code Available
A Copy Mechanism for Handling Knowledge Base Elements in SPARQL Neural Machine Translation	Nov 18, 2022	Machine Translation, NMT	Code Available
Massive Exploration of Neural Machine Translation Architectures	Mar 11, 2017	GPU, Machine Translation	Code Available
Massively Multilingual Document Alignment with Cross-lingual Sentence-Mover's Distance	Jan 31, 2020	Machine Translation, Sentence	Code Available
An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation	Aug 28, 2023	Machine Translation, NMT	Code Available
A Statistical Extension of Byte-Pair Encoding	Aug 1, 2021	Data Compression, Machine Translation	Code Available
Neural Machine Translation with Heterogeneous Topic Knowledge Embeddings	Nov 1, 2021	Decoder, Machine Translation	Code Available
Finding the Optimal Vocabulary Size for Neural Machine Translation	Apr 5, 2020	Classification, General Classification	Code Available
Cultural and Geographical Influences on Image Translatability of Words across Languages	Jun 1, 2021	Cultural Vocal Bursts Intensity Prediction, Low Resource Neural Machine Translation	Code Available
Massively Translingual Compound Analysis and Translation Discovery	May 1, 2018	Machine Translation, Translation	Code Available
Chunk-Based Bi-Scale Decoder for Neural Machine Translation	May 3, 2017	Decoder, Machine Translation	Code Available
Choosing the Right Word: Using Bidirectional LSTM Tagger for Writing Support Systems	Jan 8, 2019	Grammatical Error Correction, Machine Translation	Code Available
Putting words into the system's mouth: A targeted attack on neural machine translation using monolingual data poisoning	Jul 12, 2021	Data Poisoning, Machine Translation	Code Available
BatchGEMBA: Token-Efficient Machine Translation Evaluation with Batched Prompting and Prompt Compression	Mar 4, 2025	Large Language Model, Machine Translation	Code Available
Putting words into the system’s mouth: A targeted attack on neural machine translation using monolingual data poisoning	Aug 1, 2021	Data Poisoning, Machine Translation	Code Available
Embedding-Enhanced Giza++: Improving Alignment in Low- and High- Resource Scenarios Using Embedding Space Geometry	Apr 18, 2021	de-en, Machine Translation	Code Available
An Empirical Study of Building a Strong Baseline for Constituency Parsing	Jul 1, 2018	Abstractive Text Summarization, Constituency Parsing	Code Available
Baselines and test data for cross-lingual inference	Apr 18, 2017	Cross-Lingual Word Embeddings, Machine Translation	Code Available
Few-shot learning through contextual data augmentation	Mar 31, 2021	Data Augmentation, Few-Shot Learning	Code Available


Datasets: WMT2014 English-German, WMT2014 English-French, IWSLT2014 German-English, ACES, WMT2016 English-Romanian, WMT2016 Romanian-English, WMT2014 German-English, IWSLT2015 German-English, WMT2016 English-German, IWSLT2015 English-Vietnamese, IWSLT2015 English-German, WMT2016 German-English

Benchmark Results

WMT2014 English-German

#	Model	Metric	Claimed	Verified	Status
1	Transformer Cycle (Rev)	BLEU score	35.14	—	Unverified
2	Noisy back-translation	BLEU score	35	—	Unverified
3	Transformer+Rep(Uni)	BLEU score	33.89	—	Unverified
4	T5-11B	BLEU score	32.1	—	Unverified
5	BiBERT	BLEU score	31.26	—	Unverified
6	Transformer + R-Drop	BLEU score	30.91	—	Unverified
7	Bi-SimCut	BLEU score	30.78	—	Unverified
8	BERT-fused NMT	BLEU score	30.75	—	Unverified
9	Data Diversification - Transformer	BLEU score	30.7	—	Unverified
10	SimCut	BLEU score	30.56	—	Unverified

WMT2014 English-French

#	Model	Metric	Claimed	Verified	Status
1	Transformer+BT (ADMIN init)	BLEU score	46.4	—	Unverified
2	Noisy back-translation	BLEU score	45.6	—	Unverified
3	mRASP+Fine-Tune	BLEU score	44.3	—	Unverified
4	Transformer + R-Drop	BLEU score	43.95	—	Unverified
5	Transformer (ADMIN init)	BLEU score	43.8	—	Unverified
6	Admin	BLEU score	43.8	—	Unverified
7	BERT-fused NMT	BLEU score	43.78	—	Unverified
8	MUSE (Parallel Multi-scale Attention)	BLEU score	43.5	—	Unverified
9	T5	BLEU score	43.4	—	Unverified
10	Local Joint Self-attention	BLEU score	43.3	—	Unverified

IWSLT2014 German-English

#	Model	Metric	Claimed	Verified	Status
1	PiNMT	BLEU score	40.43	—	Unverified
2	BiBERT	BLEU score	38.61	—	Unverified
3	Bi-SimCut	BLEU score	38.37	—	Unverified
4	Cutoff + Relaxed Attention + LM	BLEU score	37.96	—	Unverified
5	DRDA	BLEU score	37.95	—	Unverified
6	Transformer + R-Drop + Cutoff	BLEU score	37.9	—	Unverified
7	SimCut	BLEU score	37.81	—	Unverified
8	Cutoff+Knee	BLEU score	37.78	—	Unverified
9	Cutoff	BLEU score	37.6	—	Unverified
10	CipherDAug	BLEU score	37.53	—	Unverified

ACES

#	Model	Metric	Claimed	Verified	Status
1	HWTSC-Teacher-Sim	Score	19.97	—	Unverified
2	MS-COMET-22	Score	19.89	—	Unverified
3	MS-COMET-QE-22	Score	19.76	—	Unverified
4	KG-BERTScore	Score	17.28	—	Unverified
5	metricx_xl_DA_2019	Score	17.17	—	Unverified
6	COMET-QE	Score	16.8	—	Unverified
7	COMET-22	Score	16.31	—	Unverified
8	UniTE-src	Score	15.68	—	Unverified
9	UniTE-ref	Score	15.38	—	Unverified
10	metricx_xxl_DA_2019	Score	15.24	—	Unverified