Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Jun 23, 2025 | Domain Adaptation, GPU | Code Available | 3
M3SD: Multi-modal, Multi-scenario and Multi-language Speaker Diarization Dataset | Jun 17, 2025 | Domain Adaptation, speaker-diarization | Unverified | 0
Exploring Speaker Diarization with Mixture of Experts | Jun 17, 2025 | Mixture-of-Experts, speaker-diarization | Unverified | 0
Seewo's Submission to MLC-SLM: Lessons learned from Speech Reasoning Language Models | Jun 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition | Jun 15, 2025 | Decoder, speaker-diarization | Unverified | 0
Diarization-Aware Multi-Speaker Automatic Speech Recognition via Large Language Models | Jun 6, 2025 | Automatic Speech Recognition, speaker-diarization | Unverified | 0
Improving Neural Diarization through Speaker Attribute Attractors and Local Dependency Modeling | Jun 5, 2025 | Attribute, Decoder | Unverified | 0
Speaker Diarization with Overlapping Community Detection Using Graph Attention Networks and Label Propagation Algorithm | Jun 3, 2025 | Action Detection, Activity Detection | Code Available | 1
Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization | May 30, 2025 | GPU, Knowledge Distillation | Unverified | 0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization | May 30, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
VoxRAG: A Step Toward Transcription-Free RAG Systems in Spoken Question Answering | May 22, 2025 | Question Answering, RAG | Unverified | 0
HPP-Voice: A Large-Scale Evaluation of Speech Embeddings for Multi-Phenotypic Classification | May 22, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge | May 22, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition | May 20, 2025 | Audio-Visual Speech Recognition, speaker-diarization | Unverified | 0
Multi-Stage Speaker Diarization for Noisy Classrooms | May 16, 2025 | Action Detection, Activity Detection | Code Available | 0
Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning | Apr 23, 2025 | Self-Supervised Learning, speaker-diarization | Unverified | 0
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors | Mar 20, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge | Feb 14, 2025 | Action Detection, Activity Detection | Unverified | 0
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond | Feb 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Language Modelling for Speaker Diarization in Telephonic Interviews | Jan 28, 2025 | Acoustic Modelling, Language Modelling | Unverified | 0
SCDiar: a streaming diarization system based on speaker change detection and speech recognition | Jan 28, 2025 | Change Detection, speaker-diarization | Unverified | 0
SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models | Jan 14, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection | Jan 7, 2025 | Action Detection, Activity Detection | Unverified | 0
Unsupervised Speech Segmentation: A General Approach Using Speech Language Models | Jan 7, 2025 | Boundary Detection, Segmentation | Code Available | 1
DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition | Dec 30, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2
TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch | Dec 11, 2024 | Denoising, speaker-diarization | Unverified | 0
Comprehensive Audio Query Handling System with Integrated Expert Models and Contextual Understanding | Dec 5, 2024 | Audio Generation, Automatic Speech Recognition | Unverified | 0
Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment | Dec 1, 2024 | Action Detection, Activity Detection | Code Available | 0
Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation | Nov 26, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Sequence-to-Sequence Neural Diarization with Automatic Speaker Detection and Representation | Nov 21, 2024 | Action Detection, Activity Detection | Unverified | 0
DCF-DS: Deep Cascade Fusion of Diarization and Separation for Speech Recognition under Realistic Single-Channel Conditions | Nov 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Guided Speaker Embedding | Oct 16, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings | Sep 25, 2024 | Clustering, speaker-diarization | Unverified | 0
On the calibration of powerset speaker diarization models | Sep 24, 2024 | speaker-diarization, Speaker Diarization | Code Available | 0
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR | Sep 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Self-Tuning Spectral Clustering for Speaker Diarization | Sep 16, 2024 | Clustering, speaker-diarization | Code Available | 0
TCG CREST System Description for the Second DISPLACE Challenge | Sep 16, 2024 | Action Detection, Activity Detection | Unverified | 0
Leveraging Self-Supervised Learning for Speaker Diarization | Sep 14, 2024 | Self-Supervised Learning, speaker-diarization | Code Available | 3
Unified Audio Event Detection | Sep 13, 2024 | Event Detection, Sound Event Detection | Unverified | 0
Data Efficient Child-Adult Speaker Diarization with Simulated Conversations | Sep 13, 2024 | speaker-diarization, Speaker Diarization | Code Available | 1
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens | Sep 10, 2024 | speaker-diarization, Speaker Diarization | Code Available | 0
A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR | Sep 9, 2024 | Automatic Speech Recognition, speaker-diarization | Unverified | 0
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization | Sep 1, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
Speaker Tagging Correction With Non-Autoregressive Language Models | Aug 30, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings | Aug 30, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization | Aug 22, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
An approach to optimize inference of the DIART speaker diarization pipeline | Aug 5, 2024 | Inference Optimization, Knowledge Distillation | Unverified | 0
Long-Term Conversation Analysis: Privacy-Utility Trade-off under Noise and Reverberation | Aug 1, 2024 | Action Detection, Activity Detection | Unverified | 0
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization | Jul 25, 2024 | speaker-diarization, Speaker Diarization | Code Available | 1
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning | Jul 21, 2024 | Representation Learning, Self-Supervised Learning | Unverified | 0