End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification | Feb 24, 2020 | Clustering, General Classification | Code Available | 1
Speaker Diarization with Region Proposal Network | Feb 14, 2020 | Region Proposal, speaker-diarization | Code Available | 1
Phoneme Boundary Detection using Learnable Segmental Features | Feb 11, 2020 | Boundary Detection, Keyword Spotting | Code Available | 1
End-to-End Neural Speaker Diarization with Self-attention | Sep 13, 2019 | Clustering, speaker-diarization | Code Available | 1
AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection | Jan 5, 2019 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available | 1
Speaker Diarization with LSTM | Oct 28, 2017 | Clustering, speaker-diarization | Code Available | 1
Exploring Speaker Diarization with Mixture of Experts | Jun 17, 2025 | Mixture-of-Experts, speaker-diarization | Unverified | 0
M3SD: Multi-modal, Multi-scenario and Multi-language Speaker Diarization Dataset | Jun 17, 2025 | Domain Adaptation, speaker-diarization | Unverified | 0
Seewo's Submission to MLC-SLM: Lessons learned from Speech Reasoning Language Models | Jun 16, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition | Jun 15, 2025 | Decoder, speaker-diarization | Unverified | 0
Diarization-Aware Multi-Speaker Automatic Speech Recognition via Large Language Models | Jun 6, 2025 | Automatic Speech Recognition, speaker-diarization | Unverified | 0
Improving Neural Diarization through Speaker Attribute Attractors and Local Dependency Modeling | Jun 5, 2025 | Attribute, Decoder | Unverified | 0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization | May 30, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization | May 30, 2025 | GPU, Knowledge Distillation | Unverified | 0
VoxRAG: A Step Toward Transcription-Free RAG Systems in Spoken Question Answering | May 22, 2025 | Question Answering, RAG | Unverified | 0
Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge | May 22, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
HPP-Voice: A Large-Scale Evaluation of Speech Embeddings for Multi-Phenotypic Classification | May 22, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
The Multimodal Information Based Speech Processing (MISP) 2025 Challenge: Audio-Visual Diarization and Recognition | May 20, 2025 | Audio-Visual Speech Recognition, speaker-diarization | Unverified | 0
Multi-Stage Speaker Diarization for Noisy Classrooms | May 16, 2025 | Action Detection, Activity Detection | Code Available | 0
Speaker Diarization for Low-Resource Languages Through Wav2vec Fine-Tuning | Apr 23, 2025 | Self-Supervised Learning, speaker-diarization | Unverified | 0
SeniorTalk: A Chinese Conversation Dataset with Rich Annotations for Super-Aged Seniors | Mar 20, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
Microphone Array Geometry Independent Multi-Talker Distant ASR: NTT System for the DASR Task of the CHiME-8 Challenge | Feb 14, 2025 | Action Detection, Activity Detection | Unverified | 0
Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond | Feb 6, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
SCDiar: a streaming diarization system based on speaker change detection and speech recognition | Jan 28, 2025 | Change Detection, speaker-diarization | Unverified | 0
Language Modelling for Speaker Diarization in Telephonic Interviews | Jan 28, 2025 | Acoustic Modelling, Language Modelling | Unverified | 0
SEAL: Speaker Error Correction using Acoustic-conditioned Large Language Models | Jan 14, 2025 | speaker-diarization, Speaker Diarization | Unverified | 0
Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection | Jan 7, 2025 | Action Detection, Activity Detection | Unverified | 0
TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch | Dec 11, 2024 | Denoising, speaker-diarization | Unverified | 0
Comprehensive Audio Query Handling System with Integrated Expert Models and Contextual Understanding | Dec 5, 2024 | Audio Generation, Automatic Speech Recognition | Unverified | 0
Automating Feedback Analysis in Surgical Training: Detection, Categorization, and Assessment | Dec 1, 2024 | Action Detection, Activity Detection | Code Available | 0
Disentangled-Transformer: An Explainable End-to-End Automatic Speech Recognition Model with Speech Content-Context Separation | Nov 26, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Sequence-to-Sequence Neural Diarization with Automatic Speaker Detection and Representation | Nov 21, 2024 | Action Detection, Activity Detection | Unverified | 0
DCF-DS: Deep Cascade Fusion of Diarization and Separation for Speech Recognition under Realistic Single-Channel Conditions | Nov 11, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Guided Speaker Embedding | Oct 16, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
Incorporating Spatial Cues in Modular Speaker Diarization for Multi-channel Multi-party Meetings | Sep 25, 2024 | Clustering, speaker-diarization | Unverified | 0
On the calibration of powerset speaker diarization models | Sep 24, 2024 | speaker-diarization, Speaker Diarization | Code Available | 0
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR | Sep 18, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
TCG CREST System Description for the Second DISPLACE Challenge | Sep 16, 2024 | Action Detection, Activity Detection | Unverified | 0
Self-Tuning Spectral Clustering for Speaker Diarization | Sep 16, 2024 | Clustering, speaker-diarization | Code Available | 0
Unified Audio Event Detection | Sep 13, 2024 | Event Detection, Sound Event Detection | Unverified | 0
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens | Sep 10, 2024 | speaker-diarization, Speaker Diarization | Code Available | 0
A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR | Sep 9, 2024 | Automatic Speech Recognition, speaker-diarization | Unverified | 0
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization | Sep 1, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings | Aug 30, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
Speaker Tagging Correction With Non-Autoregressive Language Models | Aug 30, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization | Aug 22, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0
An approach to optimize inference of the DIART speaker diarization pipeline | Aug 5, 2024 | Inference Optimization, Knowledge Distillation | Unverified | 0
Long-Term Conversation Analysis: Privacy-Utility Trade-off under Noise and Reverberation | Aug 1, 2024 | Action Detection, Activity Detection | Unverified | 0
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning | Jul 21, 2024 | Representation Learning, Self-Supervised Learning | Unverified | 0
TalTech-IRIT-LIS Speaker and Language Diarization Systems for DISPLACE 2024 | Jul 17, 2024 | speaker-diarization, Speaker Diarization | Unverified | 0