PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available 6
Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Jun 23, 2025 | Domain Adaptation, GPU | Code Available 3
Leveraging Self-Supervised Learning for Speaker Diarization | Sep 14, 2024 | Self-Supervised Learning, speaker-diarization | Code Available 3
DiarizationLM: Speaker Diarization Post-Processing with Large Language Models | Jan 7, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 3
pyannote.audio: neural building blocks for speaker diarization | Nov 4, 2019 | Action Detection, Activity Detection | Code Available 3
DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition | Dec 30, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 2
PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings | Mar 4, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 2
Highly Efficient Real-Time Streaming and Fully On-Device Speaker Diarization with Multi-Stage Clustering | Oct 25, 2022 | Clustering, CPU | Code Available 2
Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation | Sep 14, 2021 | Clustering, Segmentation | Code Available 2
Speaker Diarization with Overlapping Community Detection Using Graph Attention Networks and Label Propagation Algorithm | Jun 3, 2025 | Action Detection, Activity Detection | Code Available 1
Unsupervised Speech Segmentation: A General Approach Using Speech Language Models | Jan 7, 2025 | Boundary Detection, Segmentation | Code Available 1
Data Efficient Child-Adult Speaker Diarization with Simulated Conversations | Sep 13, 2024 | speaker-diarization, Speaker Diarization | Code Available 1
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization | Jul 25, 2024 | speaker-diarization, Speaker Diarization | Code Available 1
Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions | Jun 12, 2024 | speaker-diarization, Speaker Diarization | Code Available 1
LLM-based speaker diarization correction: A generalizable approach | Jun 7, 2024 | speaker-diarization, Speaker Diarization | Code Available 1
Online speaker diarization of meetings guided by speech separation | Jan 30, 2024 | Action Detection, Activity Detection | Code Available 1
DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors | Dec 7, 2023 | Decoder, speaker-diarization | Code Available 1
Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors | Sep 25, 2023 | Decoder, speaker-diarization | Code Available 1
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture | Sep 17, 2023 | speaker-diarization, Speaker Diarization | Code Available 1
DiaCorrect: Error Correction Back-end For Speaker Diarization | Sep 15, 2023 | Automatic Speech Recognition, Decoder | Code Available 1
DiariST: Streaming Speech Translation with Speaker Diarization | Sep 14, 2023 | speaker-diarization, Speaker Diarization | Code Available 1
Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach | Sep 11, 2023 | speaker-diarization, Speaker Diarization | Code Available 1
Speech Emotion Diarization: Which Emotion Appears When? | Jun 22, 2023 | Emotion Recognition, speaker-diarization | Code Available 1
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks | Jun 7, 2023 | Audio Classification, Audio Tagging | Code Available 1
A Light Weight Model for Active Speaker Detection | Mar 8, 2023 | Active Speaker Detection, Audio-Visual Active Speaker Detection | Code Available 1
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge | Feb 20, 2023 | Speaker Diarization, Speaker Recognition | Code Available 1
BER: Balanced Error Rate For Speaker Diarization | Nov 8, 2022 | speaker-diarization, Speaker Diarization | Code Available 1
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Oct 24, 2022 | Action Detection, Activity Detection | Code Available 1
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines | Aug 17, 2022 | Machine Translation, speaker-diarization | Code Available 1
Utterance-by-utterance overlap-aware neural diarization with Graph-PIT | Jul 28, 2022 | Clustering, Segmentation | Code Available 1
Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization | Apr 24, 2022 | speaker-diarization, Speaker Diarization | Code Available 1
Low-Latency Speech Separation Guided Diarization for Telephone Conversations | Apr 5, 2022 | Action Detection, Activity Detection | Code Available 1
From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization | Apr 2, 2022 | speaker-diarization, Speaker Diarization | Code Available 1
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings | Mar 30, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 1
AVA-AVD: Audio-Visual Speaker Diarization in the Wild | Nov 29, 2021 | Relation Network, speaker-diarization | Code Available 1
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Oct 12, 2021 | Action Detection, Activity Detection | Code Available 1
TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context | Oct 8, 2021 | speaker-diarization, Speaker Diarization | Code Available 1
Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection | Sep 23, 2021 | Clustering, speaker-diarization | Code Available 1
Encoder-Decoder Based Attractors for End-to-End Neural Diarization | Jun 20, 2021 | Decoder, speaker-diarization | Code Available 1
Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech | May 19, 2021 | Clustering, Constrained Clustering | Code Available 1
End-to-end speaker segmentation for overlap-aware resegmentation | Apr 8, 2021 | Action Detection, Activity Detection | Code Available 1
Reformulating DOVER-Lap Label Mapping as a Graph Partitioning Problem | Apr 5, 2021 | graph partitioning, speaker-diarization | Code Available 1
The Third DIHARD Diarization Challenge | Dec 2, 2020 | speaker-diarization, Speaker Diarization | Code Available 1
VoxLingua107: a Dataset for Spoken Language Recognition | Nov 25, 2020 | Action Detection, Activity Detection | Code Available 1
Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm | Oct 21, 2020 | speaker-diarization, Speaker Diarization | Code Available 1
Speaker Diarization: Using Recurrent Neural Networks | Jun 10, 2020 | speaker-diarization, Speaker Diarization | Code Available 1
Speaker Diarization as a Fully Online Learning Problem in MiniVox | Jun 8, 2020 | Self-Supervised Learning, speaker-diarization | Code Available 1
End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors | May 20, 2020 | Clustering, Decoder | Code Available 1
Speech Recognition and Multi-Speaker Diarization of Long Conversations | May 16, 2020 | Data Augmentation, speaker-diarization | Code Available 1
Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap | Mar 5, 2020 | Clustering, speaker-diarization | Code Available 1