SOTAVerified

Speech-to-Text

Papers

Showing 126–150 of 403 papers

Title | Status | Hype
CoVoSwitch: Machine Translation of Synthetic Code-Switched Text Based on Intonation Units | Code | 0
AI-Powered Immersive Assistance for Interactive Task Execution in Industrial Environments | - | 0
Evaluating Voice Command Pipelines for Drone Control: From STT and LLM to Direct Classification and Siamese Networks | - | 0
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models | Code | 0
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation | - | 0
Investigating Decoder-only Large Language Models for Speech-to-text Translation | - | 0
Towards Unsupervised Speaker Diarization System for Multilingual Telephone Calls Using Pre-trained Whisper Model and Mixture of Sparse Autoencoders | - | 0
NAIST Simultaneous Speech Translation System for IWSLT 2024 | - | 0
Calibrated SVM for Probabilistic Classification of In-Vehicle Voices into Vehicle Commands via Voice-to-Text LLM Transformation | Code | 0
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects | Code | 0
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation | Code | 0
Transferable speech-to-text large language model alignment module | - | 0
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving | - | 0
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models | - | 0
Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? | - | 0
Synthetic Query Generation using Large Language Models for Virtual Assistants | - | 0
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection | Code | 0
VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications | - | 0
Semantic MIMO Systems for Speech-to-Text Transmission | - | 0
A Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach (A use case of riot or violent context detection) | - | 0
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair | Code | 0
NaturalTurn: A Method to Segment Transcripts into Naturalistic Conversational Turns | - | 0
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking | - | 0
Robust Semantic Communications for Speech Transmission | - | 0
Compact Speech Translation Models via Discrete Speech Units Pretraining | - | 0
Page 6 of 17

No leaderboard results yet.