SOTAVerified

Automatic Speech Recognition

Papers

Showing 3001-3050 of 3174 papers

Title | Status | Hype
LIP-RTVE: An Audiovisual Database for Continuous Spanish in the Wild | Code | 0
Guiding Frame-Level CTC Alignments Using Self-knowledge Distillation | Code | 0
Seq2seq for Automatic Paraphasia Detection in Aphasic Speech | Code | 0
Listening and Seeing Again: Generative Error Correction for Audio-Visual Speech Recognition | Code | 0
End-to-End Learning of Speech 2D Feature-Trajectory for Prosthetic Hands | Code | 0
Detecting and Defending Against Adversarial Attacks on Automatic Speech Recognition via Diffusion Models | Code | 0
Sequence Labeling Approach to the Task of Sentence Boundary Detection | Code | 0
Splitformer: An improved early-exit architecture for automatic speech recognition on edge devices | Code | 0
Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR | Code | 0
Spoken English Intelligibility Remediation with PocketSphinx Alignment and Feature Extraction Improves Substantially over the State of the Art | Code | 0
Two-stage Textual Knowledge Distillation for End-to-End Spoken Language Understanding | Code | 0
Growing Trees on Sounds: Assessing Strategies for End-to-End Dependency Parsing of Speech | Code | 0
Realizing Petabyte Scale Acoustic Modeling | Code | 0
LLM-based Generative Error Correction for Rare Words with Synthetic Data and Phonetic Context | Code | 0
Spoken Language Intent Detection using Confusion2Vec | Code | 0
Addressing Pitfalls in Auditing Practices of Automatic Speech Recognition Technologies: A Case Study of People with Aphasia | Code | 0
ASR Benchmarking: Need for a More Representative Conversational Dataset | Code | 0
End to End ASR System with Automatic Punctuation Insertion | Code | 0
Detecting Adversarial Examples for Speech Recognition via Uncertainty Quantification | Code | 0
BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System | Code | 0
Deep Spiking Neural Networks for Large Vocabulary Automatic Speech Recognition | Code | 0
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR | Code | 0
A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures | Code | 0
ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models | Code | 0
Textless Dependency Parsing by Labeled Sequence Prediction | Code | 0
Vedavani: A Benchmark Corpus for ASR on Vedic Sanskrit Poetry | Code | 0
Sequential Randomized Smoothing for Adversarially Robust Speech Recognition | Code | 0
OLISIA: a Cascade System for Spoken Dialogue State Tracking | Code | 0
Greek2MathTex: A Greek Speech-to-Text Framework for LaTeX Equations Generation | Code | 0
Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Code | 0
Efficient Ensemble for Multimodal Punctuation Restoration using Time-Delay Neural Network | Code | 0
SSR7000: A Synchronized Corpus of Ultrasound Tongue Imaging for End-to-End Silent Speech Recognition | Code | 0
A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization | Code | 0
Recurrent DNNs and its Ensembles on the TIMIT Phone Recognition Task | Code | 0
CHSER: A Dataset and Case Study on Generative Speech Error Correction for Child ASR | Code | 0
A low latency attention module for streaming self-supervised speech representation learning | Code | 0
Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization | Code | 0
RED-ACE: Robust Error Detection for ASR using Confidence Embeddings | Code | 0
Character-Level Neural Translation for Multilingual Media Monitoring in the SUMMA Project | Code | 0
Automatic Speech Recognition and Query By Example for Creole Languages Documentation | Code | 0
Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization | Code | 0
Stable Distillation: Regularizing Continued Pre-training for Low-Resource Automatic Speech Recognition | Code | 0
Topic Identification For Spontaneous Speech: Enriching Audio Features With Embedded Linguistic Information | Code | 0
LSTM Benchmarks for Deep Learning Frameworks | Code | 0
Thai Wav2Vec2.0 with CommonVoice V8 | Code | 0
LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring | Code | 0
VenoMave: Targeted Poisoning Against Speech Recognition | Code | 0
Unsupervised Rhythm and Voice Conversion to Improve ASR on Dysarthric Speech | Code | 0
AequeVox: Automated Fairness Testing of Speech Recognition Systems | Code | 0
Deep Learning for Audio Signal Processing | Code | 0

No leaderboard results yet.