
Speech Tokenization

Speech tokenization is the task of representing a speech signal as a sequence of discrete units. These representations can then be used for downstream tasks such as automatic speech recognition and text-to-speech, and they serve as the basis of Speech Language Models.
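As a rough illustration of what "a sequence of discrete units" can mean, here is a minimal sketch that assigns each frame-level feature vector to its nearest entry in a codebook. The codebook, the toy 2-D frames, and the function names `quantize` and `collapse_repeats` are all assumptions made for this example; real tokenizers typically learn the codebook from data and work on much higher-dimensional features, and this is not the method of any paper listed here.

```python
from itertools import groupby
from typing import List, Sequence


def quantize(frames: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each frame vector to the index of its nearest codebook entry."""
    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        # Squared Euclidean distance between two equal-length vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))
        for frame in frames
    ]


def collapse_repeats(units: Sequence[int]) -> List[int]:
    """Merge runs of identical consecutive units into a single unit."""
    return [unit for unit, _ in groupby(units)]


# Hypothetical 3-entry codebook and 5 toy frames (illustrative values only).
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
frames = [[0.1, -0.1], [0.2, 0.1], [0.9, 1.2], [1.1, 0.8], [0.1, 1.9]]

units = quantize(frames, codebook)    # one unit ID per frame
deduped = collapse_repeats(units)     # shorter sequence without repeats
print(units, deduped)
```

Collapsing repeated units is optional; whether to do it is a design choice, since it shortens the sequence but discards duration information.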

Papers


Showing 11–20 of 21 papers

Title	Date	Tasks	Status
Exploring the Effect of Segmentation and Vocabulary Size on Speech Tokenization for Speech Language Models	May 23, 2025	Speech Tokenization, Spoken Language Understanding	Unverified
Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English	May 20, 2025	Automatic Speech Recognition, speech-recognition	Unverified
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation	Mar 2, 2025	Decoder, Representation Learning	Unverified
Recent Advances in Discrete Speech Tokens: A Review	Feb 10, 2025	Language Modeling, Language Modelling	Unverified
BEST-STD: Bidirectional Mamba-Enhanced Speech Tokenization for Spoken Term Detection	Nov 21, 2024	Mamba, Self-Supervised Learning	Code Available
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models	Oct 31, 2024	Decoder, Resynthesis	Unverified
LAST: Language Model Aware Speech Tokenization	Sep 5, 2024	Language Modeling, Language Modelling	Unverified
STAB: Speech Tokenizer Assessment Benchmark	Sep 4, 2024	Speech Tokenization	Unverified
Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing	Jun 4, 2024	Decoder, Language Modeling	Unverified
Scaling Properties of Speech Language Models	Mar 31, 2024	Speech Tokenization	Unverified



No leaderboard results yet.