| (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts | May 20, 2024 | Machine Translation, Translation | Code Available | 9 |
| One-Step Image Translation with Text-to-Image Models | Mar 18, 2024 | Denoising, Translation | Code Available | 7 |
| MaskSketch: Unpaired Structure-guided Masked Image Generation | Feb 10, 2023 | Conditional Image Generation, Diversity | Code Available | 7 |
| Seamless: Multilingual Expressive and Streaming Speech Translation | Dec 8, 2023 | automatic-speech-translation, Machine Translation | Code Available | 6 |
| h2oGPT: Democratizing Large Language Models | Jun 13, 2023 | Chatbot, Fairness | Code Available | 6 |
| ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages | Dec 13, 2022 | Code Summarization, Language Modeling | Code Available | 6 |
| High-Fidelity Simultaneous Speech-To-Speech Translation | Feb 5, 2025 | Decoder, Simultaneous Speech-to-Speech Translation | Code Available | 5 |
| StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning | Jun 5, 2024 | Automatic Speech Recognition (ASR), de-en | Code Available | 5 |
| How to Design Translation Prompts for ChatGPT: An Empirical Study | Apr 5, 2023 | Machine Translation, Natural Language Understanding | Code Available | 5 |
| Multi-head Temporal Latent Attention | May 19, 2025 | GPU, speech-recognition | Code Available | 4 |
| LBM: Latent Bridge Matching for Fast Image-to-Image Translation | Mar 10, 2025 | Depth Estimation, Image Relighting | Code Available | 4 |
| Looking Backward: Streaming Video-to-Video Translation with Feature Banks | May 24, 2024 | GPU, Translation | Code Available | 4 |
| FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation | Mar 19, 2024 | Translation, valid | Code Available | 4 |
| Tower: An Open Multilingual Large Language Model for Translation-Related Tasks | Feb 27, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation | Jan 16, 2024 | Decoder, Machine Translation | Code Available | 4 |
| Turning Whisper into Real-Time Transcription System | Jul 27, 2023 | speech-recognition, Speech Recognition | Code Available | 4 |
| Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation | Jun 13, 2023 | Patch Matching, Translation | Code Available | 4 |
| Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation | Mar 29, 2022 | Binary Classification, Segmentation | Code Available | 4 |
| ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning | May 19, 2025 | Machine Translation, reinforcement-learning | Code Available | 3 |
| Deep Reasoning Translation via Reinforcement Learning | Apr 14, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 3 |
| DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought | Dec 23, 2024 | Machine Translation, Math | Code Available | 3 |
| Findings of the WMT 2024 Shared Task on Discourse-Level Literary Translation | Dec 16, 2024 | Translation | Code Available | 3 |
| Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation | Jun 14, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing | Feb 23, 2024 | Lipreading, Lip Reading | Code Available | 3 |
| SALMONN: Towards Generic Hearing Abilities for Large Language Models | Oct 20, 2023 | Audio captioning, Automatic Speech Recognition | Code Available | 3 |
| Accelerating Transformer Inference for Translation via Parallel Decoding | May 17, 2023 | Machine Translation, Translation | Code Available | 3 |
| Zero-shot Image-to-Image Translation | Feb 6, 2023 | Image-to-Image Translation, Text-based Image Editing | Code Available | 3 |
| Bird-Eye Transformers for Text Generation Models | Oct 8, 2022 | Attribute, Inductive Bias | Code Available | 3 |
| Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow | Sep 7, 2022 | Domain Adaptation, Image Generation | Code Available | 3 |
| Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates | Sep 27, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora | Dec 31, 2020 | Sentence, Translation | Code Available | 3 |
| Towards Fully Automated Manga Translation | Dec 28, 2020 | Machine Translation, Translation | Code Available | 3 |
| CodeBLEU: a Method for Automatic Evaluation of Code Synthesis | Sep 22, 2020 | Code Translation, Translation | Code Available | 3 |
| Old Photo Restoration via Deep Latent Space Translation | Sep 14, 2020 | Image Restoration, Translation | Code Available | 3 |
| Bringing Old Photos Back to Life | Apr 20, 2020 | Image Restoration, Translation | Code Available | 3 |
| Attention Is All You Need | Jun 12, 2017 | Abstractive Text Summarization, All | Code Available | 3 |
| MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices | Apr 16, 2025 | Pose Estimation, Translation | Code Available | 2 |
| MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning | Apr 14, 2025 | Machine Translation, Reinforcement Learning (RL) | Code Available | 2 |
| MultiMed-ST: Large-scale Many-to-many Multilingual Medical Speech Translation | Apr 4, 2025 | Machine Translation, Translation | Code Available | 2 |
| CrackSQL: A Hybrid SQL Dialect Translation System Powered by Large Language Models | Apr 1, 2025 | Large Language Model, Translation | Code Available | 2 |
| DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory | Oct 10, 2024 | Document Translation, Machine Translation | Code Available | 2 |
| MetricX-24: The Google Submission to the WMT 2024 Metrics Shared Task | Oct 4, 2024 | Translation | Code Available | 2 |
| Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent | Jul 31, 2024 | Translation, valid | Code Available | 2 |
| 6DoF Head Pose Estimation through Explicit Bidirectional Interaction with Face Geometry | Jul 19, 2024 | Head Pose Estimation, Pose Estimation | Code Available | 2 |
| LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages | Jul 8, 2024 | Data Augmentation, Translation | Code Available | 2 |
| Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Bridge Diffusion Model | Jul 6, 2024 | Image-to-Image Translation, Translation | Code Available | 2 |
| Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level | Jun 22, 2024 | Machine Translation, Translation | Code Available | 2 |
| A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Speech Translation | Jun 11, 2024 | Decoder, Simultaneous Speech-to-Speech Translation | Code Available | 2 |
| Efficient Minimum Bayes Risk Decoding using Low-Rank Matrix Completion Algorithms | Jun 5, 2024 | Low-Rank Matrix Completion, Machine Translation | Code Available | 2 |
| TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation | May 28, 2024 | Machine Translation, speech-recognition | Code Available | 2 |