Machine Translation

Machine translation is the task of translating a sentence from a source language into a different target language.

Approaches to machine translation range from rule-based to statistical to neural. More recently, attention-based encoder-decoder architectures such as the Transformer have produced major improvements in machine translation.

The WMT family of datasets is among the most popular benchmarks for machine translation systems. Commonly used evaluation metrics include BLEU, METEOR, and NIST.
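Since most of the benchmark results on this page are BLEU scores, a minimal sketch of the metric may help. It assumes whitespace tokenization, a single reference per hypothesis, and no smoothing; the function names `ngrams` and `corpus_bleu` are illustrative and do not come from any library. Published scores are normally computed with standard tooling and its own tokenization, so this sketch is not expected to reproduce leaderboard numbers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis, no smoothing.

    Returns a score in [0, 1]; leaderboards usually report it times 100.
    """
    matches = [0] * max_n   # clipped n-gram matches, per n-gram order
    totals = [0] * max_n    # hypothesis n-gram counts, per n-gram order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()  # assumption: whitespace tokens
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # unsmoothed BLEU is zero if any n-gram order has no match

    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyps = ["the cat sat on the mat"]
    refs = ["the cat sat on the red mat"]
    print(f"BLEU = {100 * corpus_bleu(hyps, refs):.2f}")
```

The clipped counts keep a system from being rewarded for repeating a correct word, and the brevity penalty keeps it from scoring well by emitting only a few safe words.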


Papers


Showing 651–700 of 10752 papers

Title	Date	Tasks	Status	Hype
Beyond MLE: Investigating SEARNN for Low-Resourced Neural Machine Translation	May 20, 2024	Machine Translation, Structured Prediction	Unverified	0
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus	May 20, 2024	Machine Translation, Natural Language Inference	Code Available	0
Chasing COMET: Leveraging Minimum Bayes Risk Decoding for Self-Improving Machine Translation	May 20, 2024	Domain Adaptation, Machine Translation	Unverified	0
Cyber Risks of Machine Translation Critical Errors : Arabic Mental Health Tweets as a Case Study	May 19, 2024	Machine Translation, NMT	Unverified	0
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset	May 17, 2024	16k, Benchmarking	Code Available	3
Word Alignment as Preference for Machine Translation	May 15, 2024	Hallucination, Language Modelling	Unverified	0
LLM-Assisted Rule Based Machine Translation for Low/No-Resource Languages	May 14, 2024	Machine Translation, Sentence	Unverified	0
Enhancing Gender-Inclusive Machine Translation with Neomorphemes and Large Language Models	May 14, 2024	Machine Translation, Translation	Unverified	0
An Empirical Study on the Robustness of Massively Multilingual Neural Machine Translation	May 13, 2024	Machine Translation, Translation	Code Available	0
CANTONMT: Investigating Back-Translation and Model-Switch Mechanisms for Cantonese-English Neural Machine Translation	May 13, 2024	Machine Translation, Translation	Code Available	0
Learning to Solve Geometry Problems via Simulating Human Dual-Reasoning Process	May 10, 2024	Geometry Problem Solving, Machine Translation	Code Available	0
Using Machine Translation to Augment Multilingual Classification	May 9, 2024	Classification, Image Captioning	Unverified	0
Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages	May 8, 2024	Diversity, Machine Translation	Code Available	0
Automated Program Repair: Emerging trends pose and expose problems for benchmarks	May 8, 2024	Machine Translation, Program Repair	Unverified	0
Fine-tuning Pre-trained Named Entity Recognition Models For Indian Languages	May 8, 2024	Information Retrieval, Machine Translation	Unverified	0
Guylingo: The Republic of Guyana Creole Corpora	May 6, 2024	Diversity, Machine Translation	Code Available	0
Comparative study of models trained on synthetic data for Ukrainian grammatical error correction	May 5, 2024	Grammatical Error Correction, Machine Translation	Code Available	0
Sentiment Analysis Across Languages: Evaluation Before and After Machine Translation to English	May 5, 2024	Diversity, Machine Translation	Code Available	0
Relay Decoding: Concatenating Large Language Models for Machine Translation	May 5, 2024	Machine Translation, Translation	Unverified	0
The Call for Socially Aware Language Technologies	May 3, 2024	Machine Translation, Sentiment Analysis	Unverified	0
Reinforcement Learning for Edit-Based Non-Autoregressive Neural Machine Translation	May 2, 2024	Machine Translation, NMT	Unverified	0
The IgboAPI Dataset: Empowering Igbo Language Technologies through Multi-dialectal Enrichment	May 2, 2024	Machine Translation, Translation	Unverified	0
Efficient Sample-Specific Encoder Perturbations	May 1, 2024	Attribute, Decoder	Unverified	0
Context-Aware Machine Translation with Source Coreference Explanation	Apr 30, 2024	Machine Translation, Translation	Code Available	0
Does Generative AI speak Nigerian-Pidgin?: Issues about Representativeness and Bias for Multilingualism in LLMs	Apr 30, 2024	Machine Translation, Translation	Code Available	0
Suvach -- Generated Hindi QA benchmark	Apr 30, 2024	Machine Translation, Question Answering	Unverified	0
3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset	Apr 29, 2024	Machine Translation, Multimodal Machine Translation	Code Available	1
Modeling Orthographic Variation Improves NLP Performance for Nigerian Pidgin	Apr 28, 2024	Data Augmentation, Machine Translation	Unverified	0
Quality Estimation with k-nearest Neighbors and Automatic Evaluation for Model-specific Quality Estimation	Apr 27, 2024	Machine Translation, Translation	Unverified	0
Scaffold-BPE: Enhancing Byte Pair Encoding for Large Language Models with Simple and Effective Scaffold Token Removal	Apr 27, 2024	Language Modeling, Language Modelling	Unverified	0
I Have an Attention Bridge to Sell You: Generalization Capabilities of Modular Translation Architectures	Apr 27, 2024	Machine Translation, Translation	Unverified	0
Usefulness of Emotional Prosody in Neural Machine Translation	Apr 27, 2024	Emotion Recognition, Machine Translation	Unverified	0
TIGQA:An Expert Annotated Question Answering Dataset in Tigrinya	Apr 26, 2024	Machine Translation, Question Answering	Unverified	0
Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model	Apr 25, 2024	Language Modeling, Language Modelling	Unverified	0
Translation of Multifaceted Data without Re-Training of Machine Translation Systems	Apr 25, 2024	Machine Translation, Question Generation	Unverified	0
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages	Apr 25, 2024	Cross-Lingual Question Answering, Diversity	Code Available	2
BERT vs GPT for financial engineering	Apr 24, 2024	Machine Translation, Question Answering	Unverified	0
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges	Apr 24, 2024	Drug Design, Inductive Bias	Code Available	2
Neural Proto-Language Reconstruction	Apr 24, 2024	Data Augmentation, Machine Translation	Unverified	0
Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation	Apr 23, 2024	Knowledge Distillation, Machine Translation	Unverified	0
Setting up the Data Printer with Improved English to Ukrainian Machine Translation	Apr 23, 2024	Decoder, Language Modeling	Code Available	1
Automated Multi-Language to English Machine Translation Using Generative Pre-Trained Transformers	Apr 23, 2024	Machine Translation, Sentence	Unverified	0
Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice?	Apr 22, 2024	Machine Translation, Translation	Unverified	0
From LLM to NMT: Advancing Low-Resource Machine Translation with Claude	Apr 22, 2024	Knowledge Distillation, Language Modeling	Unverified	0
Evaluation of Machine Translation Based on Semantic Dependencies and Keywords	Apr 20, 2024	Information Retrieval, Machine Translation	Unverified	0
NLP-enabled Trajectory Map-matching in Urban Road Networks using a Transformer-based Encoder-decoder	Apr 18, 2024	Decoder, Machine Translation	Unverified	0
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair	Apr 18, 2024	Machine Translation, Speech-to-Text	Code Available	0
Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory	Apr 18, 2024	Machine Translation, Mathematical Reasoning	Unverified	0
Neuron Specialization: Leveraging intrinsic task modularity for multilingual machine translation	Apr 17, 2024	Cross-Lingual Transfer, Machine Translation	Unverified	0
Bridging the Gap between Different Vocabularies for LLM Ensemble	Apr 15, 2024	Arithmetic Reasoning, Data-to-Text Generation	Code Available	1



Datasets: WMT2014 English-German, WMT2014 English-French, IWSLT2014 German-English, ACES, WMT2016 English-Romanian, WMT2016 Romanian-English, WMT2014 German-English, IWSLT2015 German-English, WMT2016 English-German, IWSLT2015 English-Vietnamese, IWSLT2015 English-German, WMT2016 German-English

Benchmark Results

WMT2014 English-German

#	Model	Metric	Claimed	Verified	Status
1	Transformer Cycle (Rev)	BLEU score	35.14	—	Unverified
2	Noisy back-translation	BLEU score	35	—	Unverified
3	Transformer+Rep(Uni)	BLEU score	33.89	—	Unverified
4	T5-11B	BLEU score	32.1	—	Unverified
5	BiBERT	BLEU score	31.26	—	Unverified
6	Transformer + R-Drop	BLEU score	30.91	—	Unverified
7	Bi-SimCut	BLEU score	30.78	—	Unverified
8	BERT-fused NMT	BLEU score	30.75	—	Unverified
9	Data Diversification - Transformer	BLEU score	30.7	—	Unverified
10	SimCut	BLEU score	30.56	—	Unverified

WMT2014 English-French

#	Model	Metric	Claimed	Verified	Status
1	Transformer+BT (ADMIN init)	BLEU score	46.4	—	Unverified
2	Noisy back-translation	BLEU score	45.6	—	Unverified
3	mRASP+Fine-Tune	BLEU score	44.3	—	Unverified
4	Transformer + R-Drop	BLEU score	43.95	—	Unverified
5	Admin	BLEU score	43.8	—	Unverified
6	Transformer (ADMIN init)	BLEU score	43.8	—	Unverified
7	BERT-fused NMT	BLEU score	43.78	—	Unverified
8	MUSE (Parallel Multi-scale Attention)	BLEU score	43.5	—	Unverified
9	T5	BLEU score	43.4	—	Unverified
10	Local Joint Self-attention	BLEU score	43.3	—	Unverified

IWSLT2014 German-English

#	Model	Metric	Claimed	Verified	Status
1	PiNMT	BLEU score	40.43	—	Unverified
2	BiBERT	BLEU score	38.61	—	Unverified
3	Bi-SimCut	BLEU score	38.37	—	Unverified
4	Cutoff + Relaxed Attention + LM	BLEU score	37.96	—	Unverified
5	DRDA	BLEU score	37.95	—	Unverified
6	Transformer + R-Drop + Cutoff	BLEU score	37.9	—	Unverified
7	SimCut	BLEU score	37.81	—	Unverified
8	Cutoff+Knee	BLEU score	37.78	—	Unverified
9	Cutoff	BLEU score	37.6	—	Unverified
10	CipherDAug	BLEU score	37.53	—	Unverified

ACES

#	Model	Metric	Claimed	Verified	Status
1	HWTSC-Teacher-Sim	Score	19.97	—	Unverified
2	MS-COMET-22	Score	19.89	—	Unverified
3	MS-COMET-QE-22	Score	19.76	—	Unverified
4	KG-BERTScore	Score	17.28	—	Unverified
5	metricx_xl_DA_2019	Score	17.17	—	Unverified
6	COMET-QE	Score	16.8	—	Unverified
7	COMET-22	Score	16.31	—	Unverified
8	UniTE-src	Score	15.68	—	Unverified
9	UniTE-ref	Score	15.38	—	Unverified
10	metricx_xxl_DA_2019	Score	15.24	—	Unverified