SOTAVerified

Speech Tokenization

Speech tokenization is the task of representing speech signals as a sequence of discrete units. These representations can then be used for downstream tasks such as automatic speech recognition and text-to-speech, and they serve as the basis of speech language models.
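
As a rough illustration of what "a sequence of discrete units" means, the sketch below cuts a toy signal into frames and maps each frame to the index of its nearest entry in a small codebook. This is only a minimal nearest-centroid quantizer under stated assumptions: the function names (`frame_signal`, `tokenize`), the hand-written codebook, and the use of raw samples as features are all hypothetical choices made for the example. Practical tokenizers typically quantize learned features with a learned codebook rather than raw samples.

```python
from typing import List, Sequence


def frame_signal(samples: Sequence[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split a 1-D signal into frames of frame_len samples, advancing by hop.

    Any incomplete frame at the end of the signal is dropped.
    """
    last_start = len(samples) - frame_len
    return [list(samples[i:i + frame_len]) for i in range(0, last_start + 1, hop)]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tokenize(frames: Sequence[Sequence[float]], codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each frame to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(frame, codebook[k]))
        for frame in frames
    ]


# Toy data (assumed for illustration): an 8-sample "signal" and a 2-entry codebook.
signal = [0.0, 0.1, 0.9, 1.0, 0.1, 0.0, 1.0, 0.9]
codebook = [[0.0, 0.0], [1.0, 1.0]]

frames = frame_signal(signal, frame_len=2, hop=2)
tokens = tokenize(frames, codebook)
print(tokens)  # [0, 1, 0, 1]
```

The resulting list of integers is the discrete sequence that a downstream model would consume in place of the continuous signal.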

Papers

Showing 11–20 of 21 papers

Title | Status | Hype
Exploring the Effect of Segmentation and Vocabulary Size on Speech Tokenization for Speech Language Models | - | 0
Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English | - | 0
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation | - | 0
Recent Advances in Discrete Speech Tokens: A Review | - | 0
BEST-STD: Bidirectional Mamba-Enhanced Speech Tokenization for Spoken Term Detection | Code | 0
DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models | - | 0
LAST: Language Model Aware Speech Tokenization | - | 0
STAB: Speech Tokenizer Assessment Benchmark | - | 0
Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing | - | 0
Scaling Properties of Speech Language Models | - | 0

No leaderboard results yet.