SOTAVerified

Target Speaker Extraction

Extract the speech of a specified target speaker from a multi-person dialogue.

Papers

Showing 26–50 of 55 papers

Title | Status | Hype
STCON System for the CHiME-8 Challenge | - | 0
Wanna hear your voice? A sample is all we need! | - | 0
Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions | - | 0
Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration | Code | 0
Spectron: Target Speaker Extraction using Conditional Transformer with Adversarial Refinement | Code | 0
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning | - | 0
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling | - | 0
Binaural Selective Attention Model for Target Speaker Extraction | - | 0
Target Speaker Extraction with Curriculum Learning | - | 0
Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training | - | 0
Target Speaker Extraction by Directly Exploiting Contextual Information in the Time-Frequency Domain | - | 0
Listening to Multi-talker Conversations: Modular and End-to-end Perspectives | - | 0
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction | - | 0
Conditional Diffusion Model for Target Speaker Extraction | - | 0
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction | - | 0
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer | - | 0
Beamformer-Guided Target Speaker Extraction | - | 0
Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge | - | 0
Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings | - | 0
ExARN: self-attending RNN for target speaker extraction | - | 0
Adapting self-supervised models to multi-talker speech recognition using speaker embeddings | - | 0
ImagineNET: Target Speaker Extraction with Intermittent Visual Cue through Embedding Inpainting | Code | 0
Exploiting spatial information with the informed complex-valued spatial autoencoder for target speaker extraction | - | 0
Semi-supervised Time Domain Target Speaker Extraction with Attention | - | 0
Speaker-conditioning Single-channel Target Speaker Extraction using Conformer-based Architectures | - | 0

No leaderboard results yet.