Speech Tokenization

Speech tokenization is the task of representing a speech signal as a sequence of discrete units. These units can then be used in downstream tasks such as automatic speech recognition and text-to-speech, and they serve as the basis of speech language models.
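As a rough illustration of what "discrete units" can mean in practice, the sketch below assigns each frame of a feature sequence to its nearest entry in a codebook and optionally merges consecutive repeats. This is only one common approach (nearest-centroid quantization), not the method of any paper listed here; the function names `tokenize` and `collapse_repeats`, the toy codebook, and the toy frames are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def tokenize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each feature frame to the index of its nearest codebook vector."""
    tokens = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook entry.
        distances = [
            sum((x - c) ** 2 for x, c in zip(frame, code))
            for code in codebook
        ]
        tokens.append(distances.index(min(distances)))
    return tokens


def collapse_repeats(tokens: Sequence[int]) -> List[int]:
    """Merge runs of identical consecutive units into a single unit."""
    collapsed: List[int] = []
    for token in tokens:
        if not collapsed or collapsed[-1] != token:
            collapsed.append(token)
    return collapsed


# Toy data (assumed): three 2-D codebook vectors and four feature frames.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [0.8, 0.9], [0.1, 0.8]]

units = tokenize(frames, codebook)
print(units)                    # [0, 1, 1, 2]
print(collapse_repeats(units))  # [0, 1, 2]
```

In a real system the frames would typically come from a learned speech encoder and the codebook from clustering or a trained quantizer; both are replaced by hand-written toy values here.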

Papers

Showing 1–10 of 21 papers

Title	Date	Tasks	Status	Hype	Score
Sylber: Syllabic Embedding Representation of Speech from Raw Audio	Oct 9, 2024	Language Modeling, Language Modelling	Code Available	2	5
DM-Codec: Distilling Multimodal Representations for Speech Tokenization	Oct 19, 2024	Self-Supervised Learning, Speech Tokenization	Code Available	2	5
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling	Apr 9, 2025	Language Modeling, Language Modelling	Code Available	2	5
SyllableLM: Learning Coarse Semantic Units for Speech Language Models	Oct 5, 2024	Clustering, Language Modeling	Code Available	2	5
dMel: Speech Tokenization made Simple	Jul 22, 2024	Decoder, Language Modeling	Code Available	1	5
Audio Jailbreak Attacks: Exposing Vulnerabilities in SpeechGPT in a White-Box Framework	May 24, 2025	Adversarial Attack, Speech Tokenization	Code Available	1	5
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT	Sep 16, 2024	Acoustic Unit Discovery, Clustering	Code Available	1	5
RepCodec: A Speech Representation Codec for Speech Tokenization	Aug 31, 2023	Language Modeling, Language Modelling	Code Available	1	5
BEST-STD: Bidirectional Mamba-Enhanced Speech Tokenization for Spoken Term Detection	Nov 21, 2024	Mamba, Self-Supervised Learning	Code Available	0	5
Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English	May 20, 2025	Automatic Speech Recognition, speech-recognition	Unverified	0	0

No leaderboard results yet.