SOTAVerified

Target Speaker Extraction

Extract the speech of a specified target speaker from a multi-speaker conversation.

Papers

Showing 26–50 of 55 papers

Title | Status | Hype
Speaker-conditioning Single-channel Target Speaker Extraction using Conformer-based Architectures | | 0
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer | | 0
STCON System for the CHiME-8 Challenge | | 0
Target Speaker Extraction by Directly Exploiting Contextual Information in the Time-Frequency Domain | | 0
Target Speaker Extraction through Comparing Noisy Positive and Negative Audio Enrollments | | 0
Target Speaker Extraction with Curriculum Learning | | 0
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction | | 0
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning | | 0
AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement | | 0
A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction | | 0
Beamformer-Guided Target Speaker Extraction | | 0
Binaural Selective Attention Model for Target Speaker Extraction | | 0
C^2AV-TSE: Context and Confidence-aware Audio Visual Target Speaker Extraction | | 0
Coarse-to-Fine Recursive Speech Separation for Unknown Number of Speakers | | 0
Conditional Diffusion Model for Target Speaker Extraction | | 0
Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training | | 0
ExARN: self-attending RNN for target speaker extraction | | 0
Exploiting spatial information with the informed complex-valued spatial autoencoder for target speaker extraction | | 0
FlowTSE: Target Speaker Extraction with Flow Matching | | 0
Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings | | 0
Incorporating Linguistic Constraints from External Knowledge Source for Audio-Visual Target Speech Extraction | | 0
Listening to Multi-talker Conversations: Modular and End-to-end Perspectives | | 0
Listen to Extract: Onset-Prompted Target Speaker Extraction | | 0
Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions | | 0
Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection | | 0

No leaderboard results yet.