Speech Tokenization

Speech tokenization is the task of representing a speech signal as a sequence of discrete units. These representations can then be used in downstream tasks such as automatic speech recognition and text-to-speech, and they serve as the basis of speech language models.
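
As a rough illustration of what "discrete units" means here, the sketch below maps each frame of continuous features to the index of its nearest codebook vector. It is a minimal nearest-neighbour quantizer, not the method of any paper listed on this page; the names `tokenize_frames` and `squared_distance`, the 2-dimensional toy features, and the codebook values are all assumptions made for the example.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tokenize_frames(
    frames: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> List[int]:
    """Assign each frame the index of its closest codebook entry.

    Hypothetical helper: real tokenizers typically learn the codebook
    (for example with k-means or a neural quantizer) from encoder
    features; here it is simply given.
    """
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]


# Toy codebook with three entries (assumed values, for illustration only).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Five frames of made-up 2-dimensional features.
frames = [[0.1, -0.1], [0.2, 0.1], [0.9, 1.2], [3.8, 0.3], [4.1, -0.2]]

tokens = tokenize_frames(frames, codebook)
print(tokens)  # [0, 0, 1, 2, 2]
```

The resulting list of integers is the kind of token sequence a downstream model could consume in place of the raw waveform.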

Papers

Showing 11–20 of 21 papers

Title	Date	Tasks	Status
Scaling Properties of Speech Language Models	Mar 31, 2024	Speech Tokenization	Unverified
STAB: Speech Tokenizer Assessment Benchmark	Sep 4, 2024	Speech Tokenization	Unverified
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation	Mar 2, 2025	Decoder, Representation Learning	Unverified
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data	Feb 12, 2024	Decoder, Disentanglement	Unverified
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models	Oct 31, 2024	Decoder, Resynthesis	Unverified
Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing	Jun 4, 2024	Decoder, Language Modeling	Unverified
Exploring the Effect of Segmentation and Vocabulary Size on Speech Tokenization for Speech Language Models	May 23, 2025	Speech Tokenization, Spoken Language Understanding	Unverified
Factorized RVQ-GAN For Disentangled Speech Tokenization	Jun 18, 2025	Disentanglement, Knowledge Distillation	Unverified
Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English	May 20, 2025	Automatic Speech Recognition, speech-recognition	Unverified
LAST: Language Model Aware Speech Tokenization	Sep 5, 2024	Language Modeling, Language Modelling	Unverified

No leaderboard results yet.