SOTAVerified

Speech-to-Text

Papers

Showing 101–150 of 403 papers

| Title | Status | Hype |
| --- | --- | --- |
| Revisiting End-to-End Speech-to-Text Translation From Scratch | Code | 0 |
| Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions | Code | 0 |
| Audio Adversarial Examples: Targeted Attacks on Speech-to-Text | Code | 0 |
| Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation | Code | 0 |
| Towards End-to-End Training of Automatic Speech Recognition for Nigerian Pidgin | Code | 0 |
| SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Code | 0 |
| SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition | Code | 0 |
| Attentively Embracing Noise for Robust Latent Representation in BERT | Code | 0 |
| OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification | Code | 0 |
| Pre-training on high-resource speech recognition improves low-resource speech-to-text translation | Code | 0 |
| MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition | Code | 0 |
| Optimizing Rare Word Accuracy in Direct Speech Translation with a Retrieval-and-Demonstration Approach | Code | 0 |
| M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation | Code | 0 |
| Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks | Code | 0 |
| Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation | Code | 0 |
| Code-Switched Urdu ASR for Noisy Telephonic Environment using Data Centric Approach with Hybrid HMM and CNN-TDNN | Code | 0 |
| Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset | Code | 0 |
| Let's Give a Voice to Conversational Agents in Virtual Reality | Code | 0 |
| ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit | Code | 0 |
| Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision | Code | 0 |
| Joint CTC-Attention based End-to-End Speech Recognition using Multi-task Learning | Code | 0 |
| LibriS2S: A German-English Speech-to-Speech Translation Corpus | Code | 0 |
| Greek2MathTex: A Greek Speech-to-Text Framework for LaTeX Equations Generation | Code | 0 |
| End to End ASR System with Automatic Punctuation Insertion | Code | 0 |
| End-to-End Automatic Speech Translation of Audiobooks | Code | 0 |
| End-to-End Learning of Speech 2D Feature-Trajectory for Prosthetic Hands | Code | 0 |
| Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding | Code | 0 |
| Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models | Code | 0 |
| FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild | Code | 0 |
| Infusing Future Information into Monotonic Attention Through Language Models | Code | 0 |
| Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models | Code | 0 |
| Challenges and Opportunities of Speech Recognition for Bengali Language | | 0 |
| Efficient Monotonic Multihead Attention | | 0 |
| Effectively pretraining a speech translation decoder with Machine Translation data | | 0 |
| Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? | | 0 |
| Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization | | 0 |
| A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks | | 0 |
| BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text | | 0 |
| Direct Simultaneous Speech-to-Text Translation Assisted by Synchronized Streaming ASR | | 0 |
| Application-Agnostic Language Modeling for On-Device ASR | | 0 |
| Direct Punjabi to English speech translation using discrete units | | 0 |
| Digits micro-model for accurate and secure transactions | | 0 |
| Bridging the Modality Gap for Speech-to-Text Translation | | 0 |
| Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum | | 0 |
| Development of Natural Language Processing Tools for Cook Islands Māori | | 0 |
| Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models | | 0 |
| AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR | | 0 |
| A Comparative Study on End-to-end Speech to Text Translation | | 0 |
| Developing automatic verbatim transcripts for international multilingual meetings: an end-to-end solution | | 0 |
| Developing a Speech Recognition System for Recognizing Tonal Speech Signals Using a Convolutional Neural Network | | 0 |

No leaderboard results yet.