| LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions | May 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Brilla AI: AI Contestant for the National Science and Maths Quiz | Mar 4, 2024 | Math, Question Answering | Code Available | 1 |
| Pushing the Limits of Zero-shot End-to-End Speech Translation | Feb 16, 2024 | Speech-to-Text, Speech-to-Text Translation | Code Available | 1 |
| Benchmarking Large Multimodal Models against Common Corruptions | Jan 22, 2024 | Benchmarking, Image to text | Code Available | 1 |
| End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation | Nov 1, 2023 | Automatic Speech Recognition, speech-recognition | Code Available | 1 |
| Towards an AI to Win Ghana's National Science and Maths Quiz | Aug 8, 2023 | Math, Question Answering | Code Available | 1 |
| ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation | May 24, 2023 | GPU, Language Modeling | Code Available | 1 |
| DUB: Discrete Unit Back-translation for Speech Translation | May 19, 2023 | Machine Translation, Speech-to-Text | Code Available | 1 |
| A Whisper transformer for audio captioning trained with synthetic captions and transfer learning | May 15, 2023 | Audio captioning, Speech-to-Text | Code Available | 1 |
| Back Translation for Speech-to-text Translation Without Transcripts | May 15, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| PSST! Prosodic Speech Segmentation with Transformers | Feb 3, 2023 | Segmentation, Speech-to-Text | Code Available | 1 |
| Pre-training for Speech Translation: CTC Meets Optimal Transport | Jan 27, 2023 | Multi-Task Learning, Speech-to-Text | Code Available | 1 |
| Information-Transport-based Policy for Simultaneous Translation | Oct 22, 2022 | Machine Translation, Speech-to-Text | Code Available | 1 |
| JoeyS2T: Minimalistic Speech-to-Text Modeling with JoeyNMT | Oct 5, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Cross-modal Contrastive Learning for Speech Translation | May 5, 2022 | Contrastive Learning, Retrieval | Code Available | 1 |
| Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages | May 2, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation | Mar 20, 2022 | Machine Translation, Speech-to-Text | Code Available | 1 |
| A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing | Mar 18, 2022 | Representation Learning, Speaker Verification | Code Available | 1 |
| Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement | Dec 21, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| X-Vector based voice activity detection for multi-genre broadcast speech-to-text | Dec 9, 2021 | Action Detection, Activity Detection | Code Available | 1 |
| Cross Attention Augmented Transducer Networks for Simultaneous Translation | Nov 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| EdiTTS: Score-based Editing for Controllable Text-to-Speech | Oct 6, 2021 | Speech Synthesis, Speech-to-Text | Code Available | 1 |
| Late reverberation suppression using U-nets | Oct 5, 2021 | Decoder, Speech Dereverberation | Code Available | 1 |
| Speech Emotion Recognition with Multi-Task Learning | Sep 6, 2021 | Emotion Classification, Emotion Recognition | Code Available | 1 |
| One TTS Alignment To Rule Them All | Aug 23, 2021 | All, Speech Synthesis | Code Available | 1 |