Speech-to-Text

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 101–125 of 403 papers

Title	Date	Tasks	Status	Hype
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking	Mar 13, 2024	Chinese Spell CheckingIn-Context Learning	—Unverified	0
Robust Semantic Communications for Speech Transmission	Mar 8, 2024	Generative Adversarial NetworkSemantic Communication	—Unverified	0
Brilla AI: AI Contestant for the National Science and Maths Quiz	Mar 4, 2024	MathQuestion Answering	CodeCode Available	1
Compact Speech Translation Models via Discrete Speech Units Pretraining	Feb 29, 2024	DecoderSelf-Supervised Learning	—Unverified	0
Direct Punjabi to English speech translation using discrete units	Feb 25, 2024	Speech-to-Speech TranslationSpeech-to-Text	—Unverified	0
Hands-Free VR	Feb 23, 2024	DiversityLanguage Modelling	—Unverified	0
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification	Feb 20, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	0
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?	Feb 19, 2024	Speech-to-TextSpeech-to-Text Translation	—Unverified	0
Pushing the Limits of Zero-shot End-to-End Speech Translation	Feb 16, 2024	Speech-to-TextSpeech-to-Text Translation	CodeCode Available	1
Syllable based DNN-HMM Cantonese Speech to Text System	Feb 13, 2024	speech-recognitionSpeech Recognition	—Unverified	0
Careless Whisper: Speech-to-Text Hallucination Harms	Feb 12, 2024	HallucinationLanguage Modeling	CodeCode Available	0
Named Entity Recognition for Address Extraction in Speech-to-Text Transcriptions Using Synthetic Data	Feb 8, 2024	named-entity-recognitionNamed Entity Recognition	—Unverified	0
Digits micro-model for accurate and secure transactions	Feb 2, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
A Case Study on Filtering for End-to-End Speech Translation	Feb 2, 2024	Speech-to-Speech TranslationSpeech-to-Text	—Unverified	0
Streaming Sequence Transduction through Dynamic Compression	Feb 2, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	0
Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases	Feb 1, 2024	speech-recognitionSpeech Recognition	—Unverified	0
Benchmarking Large Multimodal Models against Common Corruptions	Jan 22, 2024	BenchmarkingImage to text	CodeCode Available	1
Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks	Jan 18, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild	Jan 8, 2024	Language ModellingLarge Language Model	CodeCode Available	0
Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision	Dec 30, 2023	Speech-to-TextSpeech-to-Text Translation	CodeCode Available	0
OAVA: the open audio-visual archives aggregator	Dec 16, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
Revisiting the Entropy Semiring for Neural Speech Recognition	Dec 13, 2023	speech-recognitionSpeech Recognition	—Unverified	0
Efficient Monotonic Multihead Attention	Dec 7, 2023	Simultaneous Speech-to-Text TranslationSpeech-to-Text	—Unverified	0
End-to-End Speech-to-Text Translation: A Survey	Dec 2, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
Multi-teacher Distillation for Multilingual Spelling Correction	Nov 20, 2023	Multilingual NLPSpeech-to-Text	—Unverified	0

Show:10 25 50

← PrevPage 5 of 17Next →

No leaderboard results yet.