| LibriS2S: A German-English Speech-to-Speech Translation Corpus | Apr 22, 2022 | Speech-to-Speech Translation, Speech-to-Text | Code Available | 0 | 5 |
| Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models | Jul 9, 2024 | coreference-resolution, Coreference Resolution | Code Available | 0 | 5 |
| Let's Give a Voice to Conversational Agents in Virtual Reality | Aug 4, 2023 | Speech-to-Text, text-to-speech | Code Available | 0 | 5 |
| Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation | Dec 6, 2016 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| A Change of Heart: Improving Speech Emotion Recognition through Speech-to-Text Modality Conversion | Jul 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| Contextualized Translation of Automatically Segmented Speech | Aug 5, 2020 | Segmentation, Sentence | Code Available | 0 | 5 |
| Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset | Nov 29, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| mask-Net: Learning Context Aware Invariant Features using Adversarial Forgetting (Student Abstract) | Nov 25, 2020 | Speech-to-Text | Code Available | 0 | 5 |
| Audio Adversarial Examples: Targeted Attacks on Speech-to-Text | Jan 5, 2018 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Jun 20, 2024 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach | Sep 13, 2024 | In-Context Learning, Retrieval | Code Available | 0 | 5 |
| Attentively Embracing Noise for Robust Latent Representation in BERT | Dec 1, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 | 5 |
| Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision | Dec 30, 2023 | Speech-to-Text, Speech-to-Text Translation | Code Available | 0 | 5 |
| SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition | Apr 5, 2021 | speech-recognition, Speech Recognition | Code Available | 0 | 5 |
| InstaIndoor and Multi-modal Deep Learning for Indoor Scene Recognition | Dec 23, 2021 | Benchmarking, Deep Learning | Code Available | 0 | 5 |
| Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Jul 24, 2023 | Automatic Speech Recognition, Sentiment Analysis | Code Available | 0 | 5 |
| Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation | Oct 24, 2022 | Segmentation, Speech-to-Text | Code Available | 0 | 5 |
| Infusing Future Information into Monotonic Attention Through Language Models | Sep 7, 2021 | Language Modeling, Language Modelling | Code Available | 0 | 5 |
| Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning | Sep 21, 2016 | Decoder, Multi-Task Learning | Code Available | 0 | 5 |
| Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Jan 10, 2025 | Automatic Speech Recognition, Classification | Code Available | 0 | 5 |
| Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models | Jun 29, 2022 | Intent Classification, Slot Filling | Code Available | 0 | 5 |
| Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions | Feb 13, 2018 | BIG-bench Machine Learning, Management | Code Available | 0 | 5 |
| fairseq S2T: Fast Speech-to-Text Modeling with fairseq | Oct 11, 2020 | Machine Translation, Multi-Task Learning | Code Available | 0 | 5 |
| FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild | Jan 8, 2024 | Language Modelling, Large Language Model | Code Available | 0 | 5 |
| End-to-End Automatic Speech Translation of Audiobooks | Feb 12, 2018 | automatic-speech-translation, Speech-to-Text | Code Available | 0 | 5 |