Speech-to-Text

Papers


Showing 26–50 of 403 papers

Title	Date	Tasks	Status	Hype
DuplexMamba: Enhancing Real-time Speech Conversations with Duplex and Streaming Capabilities	Feb 16, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
SparQLe: Speech Queries to Text Translation Through LLMs	Feb 13, 2025	Speech-to-Text, Speech-to-Text Translation	Code Available	0
Speech to Speech Translation with Translatotron: A State of the Art Review	Feb 9, 2025	speech-recognition, Speech Recognition	Unverified	0
High-Fidelity Simultaneous Speech-To-Speech Translation	Feb 5, 2025	Decoder, Simultaneous Speech-to-Speech Translation	Code Available	5
When End-to-End is Overkill: Rethinking Cascaded Speech-to-Text Translation	Feb 1, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia	Jan 23, 2025	Emotion Recognition, Event Detection	Code Available	3
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning	Jan 15, 2025	cross-modal alignment, Language Modeling	Code Available	1
MinMo: A Multimodal Large Language Model for Seamless Voice Interaction	Jan 10, 2025	Instruction Following, Language Modeling	Unverified	0
Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding	Jan 10, 2025	Automatic Speech Recognition, Classification	Code Available	0
Existential Crisis: A Social Robot's Reason for Being	Jan 6, 2025	Speech-to-Text	Unverified	0
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison	Jan 4, 2025	Decoder, Knowledge Distillation	Unverified	0
Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages	Dec 31, 2024	Automatic Speech Recognition, Data Augmentation	Unverified	0
How "Real" is Your Real-Time Simultaneous Speech-to-Text Translation System?	Dec 24, 2024	Simultaneous Speech-to-Text Translation, Speech-to-Text	Unverified	0
Fine-tuning Whisper on Low-Resource Languages for Real-World Applications	Dec 20, 2024	Form, Sentence	Code Available	1
Greek2MathTex: A Greek Speech-to-Text Framework for LaTeX Equations Generation	Dec 11, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	0
Representation Purification for End-to-End Speech Translation	Dec 5, 2024	Machine Translation, Rhythm	Unverified	0
Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of a Virtual Campus Environment with OpenAI GPT Integration with Unity 3D	Nov 19, 2024	Speech-to-Text, text-to-speech	Unverified	0
Whisper Finetuning on Nepali Language	Nov 19, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Isochrony-Controlled Speech-to-Text Translation: A study on translating from Sino-Tibetan to Indo-European Languages	Nov 11, 2024	Decoder, Machine Translation	Unverified	0
NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts	Nov 8, 2024	Mixture-of-Experts, Optical Character Recognition (OCR)	Unverified	0
CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR	Nov 7, 2024	Language Modelling, Large Language Model	Unverified	0
LASER: Attention with Exponential Transformation	Nov 5, 2024	Speech-to-Text	Unverified	0
SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation	Nov 3, 2024	speech-recognition, Speech Recognition	Unverified	0
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?	Oct 31, 2024	Rhythm, speech-recognition	Unverified	0
Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization	Oct 29, 2024	GPU, Retrieval	Unverified	0



No leaderboard results yet.