SOTAVerified

Speech-to-Text

Papers

Showing 201-250 of 403 papers

| Title | Status | Hype |
|---|---|---|
| Improving Metrics for Speech Translation | | 0 |
| Application-Agnostic Language Modeling for On-Device ASR | | 0 |
| Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks | | 0 |
| Improving Autoregressive NLP Tasks via Modular Linearized Attention | | 0 |
| Enhancing Speech-to-Speech Translation with Multiple TTS Targets | | 0 |
| ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit | Code | 0 |
| Natural Language Robot Programming: NLP integrated with autonomous robotic grasping | | 0 |
| Improving the previous state-of-the-art Frisian ASR by fine-tuning XLS-R | | 0 |
| wav2vec and its current potential to Automatic Speech Recognition in German for the usage in Digital History: A comparative assessment of available ASR-technologies for the use in cultural heritage contexts | | 0 |
| Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages | | 0 |
| Improving Medical Speech-to-Text Accuracy with Vision-Language Pre-training Model | | 0 |
| PATCorrect: Non-autoregressive Phoneme-augmented Transformer for ASR Error Correction | | 0 |
| Characterizing Financial Market Coverage using Artificial Intelligence | | 0 |
| Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition | | 0 |
| Pushing the performances of ASR models on English and Spanish accents | | 0 |
| WACO: Word-Aligned Contrastive Learning for Speech Translation | Code | 0 |
| M3ST: Mix at Three Levels for Speech Translation | | 0 |
| MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition | Code | 0 |
| Handling and extracting key entities from customer conversations using Speech recognition and Named Entity recognition | | 0 |
| Multilingual Speech Emotion Recognition With Multi-Gating Mechanism and Neural Architecture Search | | 0 |
| Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili | | 0 |
| Efficient Speech Translation with Dynamic Latent Perceivers | Code | 0 |
| Don't Discard Fixed-Window Audio Segmentation in Speech-to-Text Translation | Code | 0 |
| Named Entity Detection and Injection for Direct Speech Translation | | 0 |
| Improving Semi-supervised End-to-end Automatic Speech Recognition using CycleGAN and Inter-domain Losses | | 0 |
| Simple and Effective Unsupervised Speech Translation | | 0 |
| Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy | Code | 0 |
| CTC Alignments Improve Autoregressive Translation | | 0 |
| SpeechUT: Bridging Speech and Text with Hidden-Unit for Encoder-Decoder Based Speech-Text Pre-training | Code | 0 |
| Speech-to-Text and Evaluation of Multiple Machine Translation Systems | | 0 |
| Kencorpus: A Kenyan Language Corpus of Swahili, Dholuo and Luhya for Natural Language Processing Tasks | | 0 |
| Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech | | 0 |
| Extending RNN-T-based speech recognition systems with emotion and language classification | | 0 |
| RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks | | 0 |
| M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation | Code | 0 |
| System Description on Automatic Simultaneous Translation Workshop | | 0 |
| Swiss German Speech to Text system evaluation | | 0 |
| Findings of the Third Workshop on Automatic Simultaneous Translation | | 0 |
| Language Model Augmented Monotonic Attention for Simultaneous Translation | | 0 |
| Finstreder: Simple and fast Spoken Language Understanding with Finite State Transducers using modern Speech-to-Text models | Code | 0 |
| Developing a Speech Recognition System for Recognizing Tonal Speech Signals Using a Convolutional Neural Network | | 0 |
| Revisiting End-to-End Speech-to-Text Translation From Scratch | Code | 0 |
| Towards Large Vocabulary Kazakh-Russian Sign Language Dataset: KRSL-OnlineSchool | | 0 |
| A Semi-Automated Live Interlingual Communication Workflow Featuring Intralingual Respeaking: Evaluation and Benchmarking | | 0 |
| The Nós Project: Opening routes for the Galician language in the field of language technologies | | 0 |
| Clinical Dialogue Transcription Error Correction using Seq2Seq Models | | 0 |
| Semantic-preserved Communication System for Highly Efficient Speech Transmission | | 0 |
| SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation | | 0 |
| Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language | | 0 |
| Design of a novel Korean learning application for efficient pronunciation correction | | 0 |

No leaderboard results yet.