
Speech Tokenization

Speech tokenization is the task of representing a speech signal as a sequence of discrete units. These representations can then be used in downstream tasks such as automatic speech recognition and text-to-speech, and they serve as the basis of speech language models.
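As a rough illustration of what "a sequence of discrete units" means, the sketch below cuts a waveform into frames, computes a toy feature vector per frame, and maps each frame to the index of its nearest codebook entry. Everything here is an assumption made for illustration: the function names, the two hand-picked features (RMS energy and zero-crossing rate), and the two-entry codebook are hypothetical and are not taken from any paper listed on this page. Real tokenizers typically quantize learned representations rather than hand-crafted features.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into fixed-length frames, advancing by `hop`."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_features(frame):
    """Toy 2-D feature (RMS energy, zero-crossing rate); a stand-in for learned features."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (len(frame) - 1))

def nearest_unit(vec, codebook):
    """Index of the codebook entry closest to `vec` in squared Euclidean distance."""
    return min(range(len(codebook)),
               key=lambda k: sum((v - c) ** 2 for v, c in zip(vec, codebook[k])))

def tokenize(samples, codebook, frame_len=160, hop=160):
    """Map a waveform to one discrete unit index per frame."""
    return [nearest_unit(frame_features(f), codebook)
            for f in frame_signal(samples, frame_len, hop)]

# Hypothetical input: 20 ms of silence followed by 20 ms of a 440 Hz tone at 16 kHz.
samples = [0.0] * 320 + [math.sin(2 * math.pi * 440 * n / 16000) for n in range(320)]
# Hypothetical codebook: unit 0 is "silence-like", unit 1 is "tone-like".
codebook = [(0.0, 0.0), (0.7, 0.05)]
units = tokenize(samples, codebook)
print(units)
```

With this input the two silent frames map to unit 0 and the two tone frames map to unit 1, so the waveform is reduced to a short integer sequence that a downstream model could consume as tokens.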

Papers


Showing 11–20 of 21 papers

Title	Date	Tasks	Status	Hype
DM-Codec: Distilling Multimodal Representations for Speech Tokenization	Oct 19, 2024	Self-Supervised Learning, Speech Tokenization	Code Available	2
Sylber: Syllabic Embedding Representation of Speech from Raw Audio	Oct 9, 2024	Language Modeling, Language Modelling	Code Available	2
SyllableLM: Learning Coarse Semantic Units for Speech Language Models	Oct 5, 2024	Clustering, Language Modeling	Code Available	2
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT	Sep 16, 2024	Acoustic Unit Discovery, Clustering	Code Available	1
LAST: Language Model Aware Speech Tokenization	Sep 5, 2024	Language Modeling, Language Modelling	Unverified	0
STAB: Speech Tokenizer Assessment Benchmark	Sep 4, 2024	Speech Tokenization	Unverified	0
dMel: Speech Tokenization made Simple	Jul 22, 2024	Decoder, Language Modeling	Code Available	1
Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing	Jun 4, 2024	Decoder, Language Modeling	Unverified	0
Scaling Properties of Speech Language Models	Mar 31, 2024	Speech Tokenization	Unverified	0
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data	Feb 12, 2024	Decoder, Disentanglement	Unverified	0



No leaderboard results yet.