SOTAVerified

Multimodal Machine Translation

Multimodal machine translation is the task of translating text with the help of one or more additional modalities, most commonly an image. For example, a system receives the English sentence "a bird is flying over water" together with an image of a bird over water and produces the German translation.

(Image credit: Findings of the Third Shared Task on Multimodal Machine Translation)

Papers

Showing 1-50 of 108 papers

| Title | Status | Hype |
| --- | --- | --- |
| Seamless: Multilingual Expressive and Streaming Speech Translation | Code | 6 |
| Attention Is All You Need | Code | 3 |
| Self-Knowledge Distillation with Progressive Refinement of Targets | Code | 1 |
| Scene Graph as Pivoting: Inference-time Image-free Unsupervised Multimodal Machine Translation with Visual Scene Hallucination | Code | 1 |
| VALHALLA: Visual Hallucination for Machine Translation | Code | 1 |
| BERTGEN: Multi-task Generation through BERT | Code | 1 |
| M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training | Code | 1 |
| MSCTD: A Multimodal Sentiment Chat Translation Dataset | Code | 1 |
| CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation | Code | 1 |
| Dynamic Context-guided Capsule Network for Multimodal Machine Translation | Code | 1 |
| BigVideo: A Large-scale Video Subtitle Translation Dataset for Multimodal Machine Translation | Code | 1 |
| Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation | Code | 1 |
| On Vision Features in Multimodal Machine Translation | Code | 1 |
| VISA: An Ambiguous Subtitles Dataset for Visual Scene-Aware Machine Translation | Code | 1 |
| Cross-lingual Visual Pre-training for Multimodal Machine Translation | Code | 1 |
| Neural Machine Translation with Phrase-Level Universal Visual Representations | Code | 1 |
| Multimodal Transformer for Multimodal Machine Translation | Code | 1 |
| 3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset | Code | 1 |
| Distill the Image to Nowhere: Inversion Knowledge Distillation for Multimodal Machine Translation | Code | 1 |
| Bridging the Gap between Synthetic and Authentic Images for Multimodal Machine Translation | Code | 0 |
| UMONS Submission for WMT18 Multimodal Translation Task | Code | 0 |
| Does Multimodality Help Human and Machine for Translation and Image Captioning? | Code | 0 |
| HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language | Code | 0 |
| Distilling Translations with Visual Awareness | Code | 0 |
| Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation | Code | 0 |
| Video-Helpful Multimodal Machine Translation | Code | 0 |
| TopicVD: A Topic-Based Dataset of Video-Guided Multimodal Machine Translation for Documentaries | Code | 0 |
| Towards Zero-Shot Multimodal Machine Translation | Code | 0 |
| Findings of the Third Shared Task on Multimodal Machine Translation | Code | 0 |
| NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems | Code | 0 |
| Vision Matters When It Should: Sanity Checking Multimodal Machine Translation Models | Code | 0 |
| Multimodal Machine Translation with Embedding Prediction | Code | 0 |
| A Visual Attention Grounding Neural Model for Multimodal Machine Translation | Code | 0 |
| Multi30K: Multilingual English-German Image Descriptions | Code | 0 |
| Latent Variable Model for Multi-modal Translation | Code | 0 |
| Incorporating Probing Signals into Multimodal Machine Translation via Visual Question-Answering Pairs | Code | 0 |
| Cultural and Geographical Influences on Image Translatability of Words across Languages | Code | 0 |
| Multimodal Lexical Translation | Code | 0 |
| ViTA: Visual-Linguistic Translation by Aligning Object Tags | Code | 0 |
| Efficient Object-Level Visual Context Modeling for Multimodal Machine Translation: Masking Irrelevant Objects Helps Grounding | | 0 |
| Adaptive Fusion Techniques for Multimodal Data | | 0 |
| CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation | | 0 |
| Doubly Attentive Transformer Machine Translation | | 0 |
| A Survey of Vision-Language Pre-training from the Lens of Multimodal Machine Translation | | 0 |
| Doubly-Attentive Decoder for Multi-modal Neural Machine Translation | | 0 |
| Gumbel-Attention for Multi-modal Machine Translation | | 0 |
| Grounded Word Sense Translation | | 0 |
| A Shared Task on Multimodal Machine Translation and Crosslingual Image Description | | 0 |
| A Dataset and Reranking Method for Multimodal MT of User-Generated Image Captions | | 0 |
| Good for Misconceived Reasons: Revisiting Neural Multimodal Machine Translation | | 0 |

Benchmark Results

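| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | del | Meteor (EN-FR) | 74.6 | | Unverified |
| 2 | ERNIE-UniX2 | BLEU (EN-DE) | 49.3 | | Unverified |
| 3 | IKD-MMT | BLEU (EN-DE) | 41.28 | | Unverified |
| 4 | DCCN | BLEU (EN-DE) | 39.7 | | Unverified |
| 5 | Caglayan | BLEU (EN-DE) | 39.4 | | Unverified |
| 6 | Gumbel-Attention MMT | BLEU (EN-DE) | 39.2 | | Unverified |
| 7 | Multimodal Transformer | BLEU (EN-DE) | 38.7 | | Unverified |
| 8 | ImagiT | BLEU (EN-DE) | 38.4 | | Unverified |
| 9 | del+obj | BLEU (EN-DE) | 38 | | Unverified |
| 10 | VMMTF | BLEU (EN-DE) | 37.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ViTA | BLEU (EN-HI) | 51.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ViTA | BLEU (EN-HI) | 44.6 | | Unverified |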