SOTAVerified

Speaker Identification

Papers

Showing 151-200 of 248 papers

Title | Status | Hype
Symmetric Saliency-based Adversarial Attack To Speaker Identification | | 0
Test-Time Training for Speech | | 0
Text-based Speaker Identification on Multiparty Dialogues Using Multi-document Convolutional Neural Networks | | 0
Text Independent Speaker Identification System for Access Control | | 0
The Deterministic plus Stochastic Model of the Residual Signal and its Applications | | 0
The DIRHA simulated corpus | | 0
The exploitation of Multiple Feature Extraction Techniques for Speaker Identification in Emotional States under Disguised Voices | | 0
SoK: The Faults in our ASRs: An Overview of Attacks against Automatic Speech Recognition and Speaker Identification Systems | | 0
The RATS Collection: Supporting HLT Research with Degraded Audio Data | | 0
TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches | | 0
Towards Advanced Speech Signal Processing: A Statistical Perspective on Convolution-Based Architectures and its Applications | | 0
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR | | 0
Triplet loss based embeddings for forensic speaker identification in Spanish | | 0
T-vectors: Weakly Supervised Speaker Identification Using Hierarchical Transformer Model | | 0
Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction | | 0
Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification | | 0
VAST: A Corpus of Video Annotation for Speech Technologies | | 0
VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution | | 0
Voice Privacy with Smart Digital Assistants in Educational Settings | | 0
Voxceleb-ESP: preliminary experiments detecting Spanish celebrities from their voices | | 0
VoxWatch: An open-set speaker recognition benchmark on VoxCeleb | | 0
WaBERT: A Low-resource End-to-end Model for Spoken Language Understanding and Speech-to-BERT Alignment | | 0
Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification | | 0
Weakly Supervised Training of Speaker Identification Models | | 0
Supervised Speaker Embedding De-Mixing in Two-Speaker Environment | | 0
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement | | 0
Adaptive blind audio source extraction supervised by dominant speaker identification using x-vectors | | 0
Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition | | 0
Advanced Rich Transcription System for Estonian Speech | | 0
Advances in Online Audio-Visual Meeting Transcription | | 0
AdvEst: Adversarial Perturbation Estimation to Classify and Detect Adversarial Attacks against Speaker Identification | | 0
A Joint Model for Quotation Attribution and Coreference Resolution | | 0
A Lightweight Speaker Recognition System Using Timbre Properties | | 0
A Multi Level Data Fusion Approach for Speaker Identification on Telephone Speech | | 0
A Novel Minimum Divergence Approach to Robust Speaker Identification | | 0
An Unsupervised Speaker Clustering Technique based on SOM and I-vectors for Speech Recognition Systems | | 0
基於聽覺感知模型之類神經網路及其在語者識別上之應用 (Two-stage Attentional Auditory Model Inspired Neural Network and Its Application to Speaker Identification) [In Chinese] | | 0
A Preliminary Exploration with GPT-4o Voice Mode | | 0
A Real-time Speaker Diarization System Based on Spatial Spectrum | | 0
A Study of Acoustic Features in Arabic Speaker Identification under Noisy Environmental Conditions | | 0
A Study of Few-Shot Audio Classification | | 0
A Survey on Paralinguistics in Tamil Speech Processing | | 0
A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR | | 0
A user study to compare two conversational assistants designed for people with hearing impairments | | 0
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification | | 0
Can Musical Emotion Be Quantified With Neural Jitter Or Shimmer? A Novel EEG Based Study With Hindustani Classical Music | | 0
CASA-Based Speaker Identification Using Cascaded GMM-CNN Classifier in Noisy and Emotional Talking Conditions | | 0
Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models | | 0
Comparison of Gender- and Speaker-adaptive Emotion Recognition | | 0
Comparison of Multiple Features and Modeling Methods for Text-dependent Speaker Verification | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MSM-MAE | Top-1 (%) | 96.6 | | Unverified
2 | M2D/0.6 | Top-1 (%) | 96.5 | | Unverified
3 | M2D/0.7 | Top-1 (%) | 96.3 | | Unverified
4 | M2D ratio=0.6 | Top-1 (%) | 94.8 | | Unverified
5 | AudioMAE (local) | Top-1 (%) | 94.8 | | Unverified
6 | ATST Base (ours) | Top-1 (%) | 94.3 | | Unverified
7 | AudioMAE (global) | Top-1 (%) | 94.1 | | Unverified
8 | AutoSpeech (N=8,C=128) | Top-1 (%) | 87.66 | | Unverified
9 | SSAST-FRAME | Top-1 (%) | 80.8 | | Unverified
10 | SSAMBA | Top-1 (%) | 70.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 67.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 80.83 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 95.13 | | Unverified