| MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition | Nov 29, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Modular Speech-to-Text Translation for Zero-Shot Cross-Modal Transfer | Oct 5, 2023 | Speech-to-Text, Speech-to-Text Translation | —Unverified | 0 |
| Multi-Discriminator Sobolev Defense-GAN Against Adversarial Attacks for End-to-End Speech Systems | Mar 15, 2021 | Speech-to-Text | —Unverified | 0 |
| Multilingual Speech Emotion Recognition With Multi-Gating Mechanism and Neural Architecture Search | Oct 31, 2022 | Emotion Recognition, Neural Architecture Search | —Unverified | 0 |
| Multilingual Speech Translation from Efficient Finetuning of Pretrained Models | Aug 1, 2021 | Decoder, Speech-to-Text | —Unverified | 0 |
| Multi-teacher Distillation for Multilingual Spelling Correction | Nov 20, 2023 | Multilingual NLP, Speech-to-Text | —Unverified | 0 |
| NAIST Simultaneous Speech-to-Text Translation System for IWSLT 2022 | May 1, 2022 | Segmentation, Simultaneous Speech-to-Text Translation | —Unverified | 0 |
| NAIST Simultaneous Speech Translation System for IWSLT 2024 | Jun 30, 2024 | Speech-to-Speech Translation, Speech-to-Text | —Unverified | 0 |
| Named Entity Detection and Injection for Direct Speech Translation | Oct 21, 2022 | Sentence, Speech-to-Text | —Unverified | 0 |
| Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data | Feb 8, 2024 | named-entity-recognition, Named Entity Recognition | —Unverified | 0 |
| Natural Language Interactions in Autonomous Vehicles: Intent Detection and Slot Filling from Passenger Utterances | Apr 23, 2019 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| Natural Language Robot Programming: NLP integrated with autonomous robotic grasping | Apr 6, 2023 | Robotic Grasping, Speech-to-Text | —Unverified | 0 |
| NaturalTurn: A Method to Segment Transcripts into Naturalistic Conversational Turns | Mar 22, 2024 | Speech-to-Text | —Unverified | 0 |
| NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts | Nov 8, 2024 | Mixture-of-Experts, Optical Character Recognition (OCR) | —Unverified | 0 |
| Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | —Unverified | 0 |
| N-gram Boosting: Improving Contextual Biasing with Normalized N-gram Targets | Aug 4, 2023 | Speech-to-Text | —Unverified | 0 |
| Noise in Speech-to-Text Voice: Analysis of Errors and Feasibility of Phonetic Similarity for Their Correction | Dec 1, 2013 | Decision Making, Speech Recognition | —Unverified | 0 |
| Numerically Grounded Language Models for Semantic Error Correction | Aug 14, 2016 | Fact Checking, Grammatical Error Correction | —Unverified | 0 |
| Advancing STT for Low-Resource Real-World Speech | Jun 10, 2025 | Sentence, Speech-to-Text | —Unverified | 0 |
| OAVA: the open audio-visual archives aggregator | Dec 16, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| On decoder-only architecture for speech-to-text and large language model integration | Jul 8, 2023 | Decoder, Language Modeling | —Unverified | 0 |
| Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture | Jul 5, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | —Unverified | 0 |
| On the Design of Strategic Task Recommendations for Sustainable Crowdsourcing-Based Content Moderation | Jun 4, 2021 | Recommendation Systems, Speech-to-Text | —Unverified | 0 |
| On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models | Jun 13, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| On the Feasibility of Fully AI-automated Vishing Attacks | Sep 20, 2024 | Large Language Model, Speech-to-Text | —Unverified | 0 |