SOTAVerified

Target Speaker Extraction

Extract the speech of a specified target speaker from a multi-speaker conversation.

Papers

Showing 1-10 of 55 papers

Title | Status | Hype
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training | Code | 9
Multi-Level Speaker Representation for Target Speaker Extraction | Code | 3
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction | Code | 3
TSELM: Target Speaker Extraction using Discrete Tokens and Language Models | Code | 2
LauraTSE: Target Speaker Extraction using Auto-Regressive Decoder-Only Language Models | Code | 1
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction | Code | 1
AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling | Code | 1
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention | Code | 1
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction | Code | 1
RTFS-Net: Recurrent Time-Frequency Modelling for Efficient Audio-Visual Speech Separation | Code | 1

No leaderboard results yet.