| Let's Give a Voice to Conversational Agents in Virtual Reality | Aug 4, 2023 | Speech-to-Texttext-to-speech | CodeCode Available | 0 |
| Transformer-Based Named Entity Recognition for Automated Server Provisioning | Apr 1, 2025 | named-entity-recognitionNamed Entity Recognition | CodeCode Available | 0 |
| End to End ASR System with Automatic Punctuation Insertion | Dec 3, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Pre-training on high-resource speech recognition improves low-resource speech-to-text translation | Sep 5, 2018 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| LibriS2S: A German-English Speech-to-Speech Translation Corpus | Apr 22, 2022 | Speech-to-Speech TranslationSpeech-to-Text | CodeCode Available | 0 |
| Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy | Oct 13, 2022 | Generative Adversarial NetworkSpeaker anonymization | CodeCode Available | 0 |
| Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models | Jul 9, 2024 | coreference-resolutionCoreference Resolution | CodeCode Available | 0 |
| Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation | Dec 6, 2016 | Speech-to-TextSpeech-to-Text Translation | CodeCode Available | 0 |
| FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild | Jan 8, 2024 | Language ModellingLarge Language Model | CodeCode Available | 0 |
| Efficient Speech Translation with Dynamic Latent Perceivers | Oct 28, 2022 | Speech-to-TextSpeech-to-Text Translation | CodeCode Available | 0 |
| The Warmup Dilemma: How Learning Rate Strategies Impact Speech-to-Text Model Convergence | May 29, 2025 | Speech-to-Text | CodeCode Available | 0 |
| Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation | Oct 24, 2022 | SegmentationSpeech-to-Text | CodeCode Available | 0 |
| Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation | Feb 9, 2018 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation | Jul 3, 2022 | DecoderSpeech-to-Text | CodeCode Available | 0 |
| An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation | Aug 28, 2023 | Machine TranslationNMT | CodeCode Available | 0 |
| Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Feb 19, 2025 | Automatic Speech Recognitionspeech-recognition | CodeCode Available | 0 |
| Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Jan 10, 2025 | Automatic Speech RecognitionClassification | CodeCode Available | 0 |
| Direct speech-to-speech translation with a sequence-to-sequence model | Apr 12, 2019 | Speech SynthesisSpeech-to-Speech Translation | CodeCode Available | 0 |
| MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition | Nov 29, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Tools and resources for Romanian text-to-speech and speech-to-text applications | Feb 15, 2018 | speech-recognitionSpeech Recognition | CodeCode Available | 0 |
| Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models | Jun 29, 2022 | Intent ClassificationSlot Filling | CodeCode Available | 0 |
| Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation | Dec 6, 2019 | FormMachine Translation | CodeCode Available | 0 |
| Revisiting End-to-End Speech-to-Text Translation From Scratch | Jun 9, 2022 | Decoderspeech-recognition | CodeCode Available | 0 |
| mask-Net: Learning Context Aware Invariant Features using Adversarial Forgetting (Student Abstract) | Nov 25, 2020 | Speech-to-Text | CodeCode Available | 0 |
| Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions | Feb 13, 2018 | BIG-bench Machine LearningManagement | CodeCode Available | 0 |
| Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects | Jun 27, 2024 | Automatic Speech RecognitionMachine Translation | CodeCode Available | 0 |
| SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training | Oct 7, 2022 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation | Nov 3, 2024 | speech-recognitionSpeech Recognition | CodeCode Available | 0 |
| SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition | Apr 5, 2021 | speech-recognitionSpeech Recognition | CodeCode Available | 0 |
| A Change of Heart: Improving Speech Emotion Recognition through Speech-to-Text Modality Conversion | Jul 21, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation | Oct 13, 2023 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System | May 29, 2025 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Scribosermo: Fast Speech-to-Text models for German and other Languages | Oct 15, 2021 | Speech RecognitionSpeech-to-Text | CodeCode Available | 0 |
| CoVoSwitch: Machine Translation of Synthetic Code-Switched Text Based on Intonation Units | Jul 19, 2024 | Machine TranslationSpeech-to-Text | CodeCode Available | 0 |
| Contextualized Translation of Automatically Segmented Speech | Aug 5, 2020 | SegmentationSentence | CodeCode Available | 0 |
| A wearable sensor vest for social humanoid robots with GPGPU, IoT, and modular software architecture | Jan 6, 2022 | Speech-to-Texttext-to-speech | CodeCode Available | 0 |
| Audio Adversarial Examples: Targeted Attacks on Speech-to-Text | Jan 5, 2018 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection | Jun 10, 2024 | Speech-to-TextSpeech-to-Text Translation | CodeCode Available | 0 |
| Streaming Sequence Transduction through Dynamic Compression | Feb 2, 2024 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Jul 24, 2023 | Automatic Speech RecognitionSentiment Analysis | CodeCode Available | 0 |
| fairseq S2T: Fast Speech-to-Text Modeling with fairseq | Oct 11, 2020 | Machine TranslationMulti-Task Learning | CodeCode Available | 0 |
| Attentively Embracing Noise for Robust Latent Representation in BERT | Dec 1, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |
| SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Jun 20, 2024 | Speech-to-TextSpeech-to-Text Translation | CodeCode Available | 0 |
| Automatic Quality Assessment for Speech Translation Using Joint ASR and MT Features | Sep 20, 2016 | Speech-to-TextTranslation | CodeCode Available | 0 |
| Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair | Apr 18, 2024 | Machine TranslationSpeech-to-Text | CodeCode Available | 0 |
| Spanish and English Phoneme Recognition by Training on Simulated Classroom Audio Recordings of Collaborative Learning Environments | Feb 21, 2022 | Data AugmentationPhoneme Recognition | CodeCode Available | 0 |
| ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit | Apr 10, 2023 | BenchmarkingSimultaneous Speech-to-Text Translation | CodeCode Available | 0 |
| SparQLe: Speech Queries to Text Translation Through LLMs | Feb 13, 2025 | Speech-to-TextSpeech-to-Text Translation | CodeCode Available | 0 |
| Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach | Sep 13, 2024 | In-Context LearningRetrieval | CodeCode Available | 0 |
| End-to-End Learning of Speech 2D Feature-Trajectory for Prosthetic Hands | Sep 22, 2020 | Automatic Speech RecognitionAutomatic Speech Recognition (ASR) | CodeCode Available | 0 |