Speech-to-Text

Papers

Showing 126–150 of 403 papers

Title	Date	Tasks	Status
CoVoSwitch: Machine Translation of Synthetic Code-Switched Text Based on Intonation Units	Jul 19, 2024	Machine Translation, Speech-to-Text	Code Available
AI-Powered Immersive Assistance for Interactive Task Execution in Industrial Environments	Jul 12, 2024	Language Modeling	Unverified
Evaluating Voice Command Pipelines for Drone Control: From STT and LLM to Direct Classification and Siamese Networks	Jul 10, 2024	Language Modeling	Unverified
Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models	Jul 9, 2024	Coreference Resolution	Code Available
Finetuning End-to-End Models for Estonian Conversational Spoken Language Translation	Jul 4, 2024	Machine Translation, Speech Recognition	Unverified
Investigating Decoder-only Large Language Models for Speech-to-text Translation	Jul 3, 2024	Decoder, Parameter-Efficient Fine-Tuning	Unverified
Towards Unsupervised Speaker Diarization System for Multilingual Telephone Calls Using Pre-trained Whisper Model and Mixture of Sparse Autoencoders	Jul 2, 2024	Clustering, Speaker Diarization	Unverified
NAIST Simultaneous Speech Translation System for IWSLT 2024	Jun 30, 2024	Speech-to-Speech Translation, Speech-to-Text	Unverified
Calibrated SVM for Probabilistic Classification of In-Vehicle Voices into Vehicle Commands via Voice-to-Text LLM Transformation	Jun 28, 2024	Speech-to-Text, Text Classification	Code Available
Voices Unheard: NLP Resources and Models for Yorùbá Regional Dialects	Jun 27, 2024	Automatic Speech Recognition, Machine Translation	Code Available
SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation	Jun 20, 2024	Speech-to-Text, Speech-to-Text Translation	Unverified
Transferable speech-to-text large language model alignment module	Jun 19, 2024	Language Modeling	Unverified
CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving	Jun 16, 2024	Automatic Speech Recognition (ASR)	Unverified
On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models	Jun 13, 2024	Language Modeling	Unverified
Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?	Jun 11, 2024	Contrastive Learning, Speech Synthesis	Unverified
Synthetic Query Generation using Large Language Models for Virtual Assistants	Jun 10, 2024	Information Retrieval, Speech Recognition	Unverified
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection	Jun 10, 2024	Speech-to-Text, Speech-to-Text Translation	Unverified
VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications	May 19, 2024	Language Modeling	Unverified
Semantic MIMO Systems for Speech-to-Text Transmission	May 13, 2024	Semantic Communication, Speech-to-Text	Unverified
A Toolchain for Comprehensive Audio/Video Analysis Using Deep Learning Based Multimodal Approach (A use case of riot or violent context detection)	May 2, 2024	Acoustic Scene Classification, Event Detection	Unverified
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair	Apr 18, 2024	Machine Translation, Speech-to-Text	Code Available
NaturalTurn: A Method to Segment Transcripts into Naturalistic Conversational Turns	Mar 22, 2024	Speech-to-Text	Unverified
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking	Mar 13, 2024	Chinese Spell Checking, In-Context Learning	Unverified
Robust Semantic Communications for Speech Transmission	Mar 8, 2024	Generative Adversarial Network, Semantic Communication	Unverified
Compact Speech Translation Models via Discrete Speech Units Pretraining	Feb 29, 2024	Decoder, Self-Supervised Learning	Unverified