| DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought | Dec 23, 2024 | Machine Translation, Math | Code Available | 3 |
| Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation | Dec 16, 2024 | Translation | Code Available | 3 |
| Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation | Jun 14, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing | Feb 23, 2024 | Lipreading, Lip Reading | Code Available | 3 |
| SALMONN: Towards Generic Hearing Abilities for Large Language Models | Oct 20, 2023 | Audio captioning, Automatic Speech Recognition | Code Available | 3 |
| Accelerating Transformer Inference for Translation via Parallel Decoding | May 17, 2023 | Machine Translation, Translation | Code Available | 3 |
| Zero-shot Image-to-Image Translation | Feb 6, 2023 | Image-to-Image Translation, Text-based Image Editing | Code Available | 3 |
| Bird-Eye Transformers for Text Generation Models | Oct 8, 2022 | Attribute, Inductive Bias | Code Available | 3 |
| Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow | Sep 7, 2022 | Domain Adaptation, Image Generation | Code Available | 3 |
| Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates | Sep 27, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |