SOTAVerified

Automatic Speech Recognition

Papers

Showing 401-450 of 3174 papers

Title | Status | Hype
Comparing Discrete and Continuous Space LLMs for Speech Recognition | | 0
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition | | 0
Speaker Tagging Correction With Non-Autoregressive Language Models | | 0
Developing an End-to-End Framework for Predicting the Social Communication Severity Scores of Children with Autism Spectrum Disorder | | 0
Advancing Multi-talker ASR Performance with Large Language Models | | 0
Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction | | 0
Measuring the Accuracy of Automatic Speech Recognition Solutions | Code | 0
Beyond Levenshtein: Leveraging Multiple Algorithms for Robust Word Error Rate Computations And Granular Error Classifications | Code | 0
Literary and Colloquial Dialect Identification for Tamil using Acoustic Features | | 0
Automatic recognition and detection of aphasic natural speech | | 0
Research Advances and New Paradigms for Biology-inspired Spiking Neural Networks | | 0
Self-supervised Speech Representations Still Struggle with African American Vernacular English | Code | 0
MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues | | 0
Focused Discriminative Training For Streaming CTC-Trained Automatic Speech Recognition Models | | 0
Developing vocal system impaired patient-aimed voice quality assessment approach using ASR representation-included multiple features | | 0
The State of Commercial Automatic French Legal Speech Recognition Systems and their Impact on Court Reporters et al | | 0
Parameter-Efficient Transfer Learning under Federated Learning for Automatic Speech Recognition | | 0
Recording for Eyes, Not Echoing to Ears: Contextualized Spoken-to-Written Conversion of ASR Transcripts | | 0
Enhancing Large Language Model-based Speech Recognition by Contextualization for Rare and Ambiguous Words | | 0
DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement | | 0
SER Evals: In-domain and Out-of-domain Benchmarking for Speech Emotion Recognition | Code | 1
Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation | | 0
Enhancing Dialogue Speech Recognition with Robust Contextual Awareness via Noise Representation Learning | | 0
Audio Enhancement for Computer Audition -- An Iterative Training Paradigm Using Sample Importance | | 0
LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition | Code | 1
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing | | 0
Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text | | 0
MooER: LLM-based Speech Recognition and Translation Models from Moore Threads | Code | 3
Preserving spoken content in voice anonymisation with character-level vocoder conditioning | Code | 0
HydraFormer: One Encoder For All Subsampling Rates | Code | 0
wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech | Code | 2
MathBridge: A Large Corpus Dataset for Translating Spoken Mathematical Expressions into LaTeX Formulas for Improved Readability | | 0
ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval | | 0
Self-Supervised Learning for Multi-Channel Neural Transducer | | 0
StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion | | 0
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Code | 1
SynesLM: A Unified Approach for Audio-visual Speech Recognition and Translation via Language Model and Synthetic Data | | 0
Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation | | 0
Towards interfacing large language models with ASR systems using confidence measures and prompting | | 0
On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition | | 0
Leveraging Self-Supervised Models for Automatic Whispered Speech Recognition | Code | 0
Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses | | 0
On the Effect of Purely Synthetic Training Data for Different Automatic Speech Recognition Architectures | | 0
Scaling A Simple Approach to Zero-Shot Speech Recognition | | 0
Improving Domain-Specific ASR with LLM-Generated Contextual Descriptions | | 0
Sentiment Reasoning for Healthcare | Code | 3
A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives | | 0
The CHiME-8 DASR Challenge for Generalizable and Array Agnostic Distant Automatic Speech Recognition and Diarization | | 0
Quantifying the Role of Textual Predictability in Automatic Speech Recognition | | 0
Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction | Code | 1
