| ArzEn-LLM: Code-Switched Egyptian Arabic-English Translation and Speech Recognition Using LLMs | Jun 26, 2024 | ArzEn Code-switched Translation to ara, ArzEn Code-switched Translation to eng | Code Available | 1 |
| ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation | May 24, 2023 | GPU, Language Modeling | Code Available | 1 |
| Deep Reinforcement Learning For Sequence to Sequence Models | May 24, 2018 | Abstractive Text Summarization, Caption Generation | Code Available | 1 |
| End-to-end Speech Translation via Cross-modal Progressive Training | Apr 21, 2021 | Machine Translation, Speech-to-Text | Code Available | 1 |
| Information-Transport-based Policy for Simultaneous Translation | Oct 22, 2022 | Machine Translation, Speech-to-Text | Code Available | 1 |
| Cross-modal Contrastive Learning for Speech Translation | May 5, 2022 | Contrastive Learning, Retrieval | Code Available | 1 |
| Challenges and Opportunities of Speech Recognition for Bengali Language | Sep 27, 2021 | Automatic Speech Recognition (ASR) | Unverified | 0 |
| Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? | Jun 11, 2024 | Contrastive Learning, Speech Synthesis | Unverified | 0 |
| Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization | Oct 29, 2024 | GPU, Retrieval | Unverified | 0 |
| A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks | Oct 21, 2020 | Automatic Speech Recognition (ASR) | Unverified | 0 |