SOTAVerified

Automatic Speech Recognition

Papers

Showing 651–700 of 3174 papers

Title | Status | Hype
Mel-FullSubNet: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR | - | 0
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | - | 0
How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena | - | 0
Ain't Misbehavin' -- Using LLMs to Generate Expressive Robot Behavior in Conversations with the Tabletop Robot Haru | - | 0
UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models | - | 0
An Embarrassingly Simple Approach for LLM with Strong ASR Capacity | Code | 2
The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models | - | 0
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension | Code | 2
The Balancing Act: Unmasking and Alleviating ASR Biases in Portuguese | - | 0
Self-consistent context aware conformer transducer for speech recognition | - | 0
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition | Code | 1
Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation | Code | 2
Progressive unsupervised domain adaptation for ASR using ensemble models and multi-stage training | - | 0
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR | Code | 1
Resolving Transcription Ambiguity in Spanish: A Hybrid Acoustic-Lexical System for Punctuation Restoration | - | 0
A Comprehensive Study of the Current State-of-the-Art in Nepali Automatic Speech Recognition Systems | - | 0
Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens | - | 0
Whispering in Norwegian: Navigating Orthographic and Dialectic Challenges | - | 0
Digits micro-model for accurate and secure transactions | - | 0
Streaming Sequence Transduction through Dynamic Compression | Code | 0
AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents | - | 0
Byte Pair Encoding Is All You Need For Automatic Bengali Speech Recognition | - | 0
Toward Practical Automatic Speech Recognition and Post-Processing: a Call for Explainable Error Benchmark Guideline | - | 0
MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, Asr Error Detection, and Asr Error Correction | - | 0
Locality enhanced dynamic biasing and sampling strategies for contextual ASR | - | 0
Consistency Based Unsupervised Self-training For ASR Personalisation | - | 0
Keep Decoding Parallel with Effective Knowledge Distillation from Language Models to End-to-end Speech Recognisers | - | 0
Using Large Language Model for End-to-End Chinese ASR and NER | - | 0
Word-Level ASR Quality Estimation for Efficient Corpus Sampling and Post-Editing through Analyzing Attentions of a Reference-Free Metric | Code | 1
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition | Code | 2
Contextualized Automatic Speech Recognition with Attention-Based Bias Phrase Boosted Beam Search | - | 0
Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks | - | 0
AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition | - | 0
SlideAVSR: A Dataset of Paper Explanation Videos for Audio-Visual Speech Recognition | - | 0
On Speech Pre-emphasis as a Simple and Inexpensive Method to Boost Speech Enhancement | - | 0
Multi-Input Multi-Output Target-Speaker Voice Activity Detection For Unified, Flexible, and Robust Audio-Visual Speaker Diarization | - | 0
Improving ASR Contextual Biasing with Guided Attention | - | 0
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription | - | 0
SeMaScore : a new evaluation metric for automatic speech recognition tasks | - | 0
Cascaded Cross-Modal Transformer for Audio-Textual Classification | Code | 0
Promptformer: Prompted Conformer Transducer for ASR | - | 0
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization | - | 0
Transcending Controlled Environments: Assessing the Transferability of ASR-Robust NLU Models to Real-World Applications | - | 0
XLS-R Deep Learning Model for Multilingual ASR on Low-Resource Languages: Indonesian, Javanese, and Sundanese | - | 0
UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction | - | 0
End to end Hindi to English speech conversion using Bark, mBART and a finetuned XLSR Wav2Vec2 | - | 0
Useful Blunders: Can Automated Speech Recognition Errors Improve Downstream Dementia Classification? | - | 0
Continuously Learning New Words in Automatic Speech Recognition | - | 0
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR | - | 0
BS-PLCNet: Band-split Packet Loss Concealment Network with Multi-task Learning Framework and Multi-discriminators | - | 0
Page 14 of 64

No leaderboard results yet.