PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit May 20, 2022 All Automatic Speech Recognition (ASR)
Code Code Available 65 pyannote.audio: neural building blocks for speaker diarization Nov 4, 2019 Action Detection Activity Detection
Code Code Available 35 Leveraging Self-Supervised Learning for Speaker Diarization Sep 14, 2024 Self-Supervised Learning speaker-diarization
Code Code Available 35 DiarizationLM: Speaker Diarization Post-Processing with Large Language Models Jan 7, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Code Code Available 35 Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models Jun 23, 2025 Domain Adaptation GPU
Code Code Available 35 Highly Efficient Real-Time Streaming and Fully On-Device Speaker Diarization with Multi-Stage Clustering Oct 25, 2022 Clustering CPU
Code Code Available 25 PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings Mar 4, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Code Code Available 25 Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation Sep 14, 2021 Clustering Segmentation
Code Code Available 25 DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition Dec 30, 2024 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Code Code Available 25 Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings Mar 30, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Code Code Available 15 Low-Latency Speech Separation Guided Diarization for Telephone Conversations Apr 5, 2022 Action Detection Activity Detection
Code Code Available 15 Speaker Diarization: Using Recurrent Neural Networks Jun 10, 2020 speaker-diarization Speaker Diarization
Code Code Available 15 Speaker Diarization with Region Proposal Network Feb 14, 2020 Region Proposal speaker-diarization
Code Code Available 15 Speech Recognition and Multi-Speaker Diarization of Long Conversations May 16, 2020 Data Augmentation speaker-diarization
Code Code Available 15 Speaker Diarization with LSTM Oct 28, 2017 Clustering speaker-diarization
Code Code Available 15 The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines Aug 17, 2022 Machine Translation speaker-diarization
Code Code Available 15 Exploring Speech Foundation Models for Speaker Diarization in Child-Adult Dyadic Interactions Jun 12, 2024 speaker-diarization Speaker Diarization
Code Code Available 15 End-to-End Neural Speaker Diarization with Self-attention Sep 13, 2019 Clustering speaker-diarization
Code Code Available 15 Frame-wise streaming end-to-end speaker diarization with non-autoregressive self-attention-based attractors Sep 25, 2023 Decoder speaker-diarization
Code Code Available 15 Reformulating DOVER-Lap Label Mapping as a Graph Partitioning Problem Apr 5, 2021 graph partitioning speaker-diarization
Code Code Available 15 End-to-end speaker segmentation for overlap-aware resegmentation Apr 8, 2021 Action Detection Activity Detection
Code Code Available 15 Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach Sep 11, 2023 speaker-diarization Speaker Diarization
Code Code Available 15 LLM-based speaker diarization correction: A generalizable approach Jun 7, 2024 speaker-diarization Speaker Diarization
Code Code Available 15 Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization Apr 24, 2022 speaker-diarization Speaker Diarization
Code Code Available 15 A Light Weight Model for Active Speaker Detection Mar 8, 2023 Active Speaker Detection Audio-Visual Active Speaker Detection
Code Code Available 15 Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks Jun 7, 2023 Audio Classification Audio Tagging
Code Code Available 15 Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture Sep 17, 2023 speaker-diarization Speaker Diarization
Code Code Available 15 Speaker Diarization with Overlapping Community Detection Using Graph Attention Networks and Label Propagation Algorithm Jun 3, 2025 Action Detection Activity Detection
Code Code Available 15 Speech Emotion Diarization: Which Emotion Appears When? Jun 22, 2023 Emotion Recognition speaker-diarization
Code Code Available 15 Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization Jul 25, 2024 speaker-diarization Speaker Diarization
Code Code Available 15 DiariST: Streaming Speech Translation with Speaker Diarization Sep 14, 2023 speaker-diarization Speaker Diarization
Code Code Available 15 DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors Dec 7, 2023 Decoder speaker-diarization
Code Code Available 15 Phoneme Boundary Detection using Learnable Segmental Features Feb 11, 2020 Boundary Detection Keyword Spotting
Code Code Available 15 Data Efficient Child-Adult Speaker Diarization with Simulated Conversations Sep 13, 2024 speaker-diarization Speaker Diarization
Code Code Available 15 Online speaker diarization of meetings guided by speech separation Jan 30, 2024 Action Detection Activity Detection
Code Code Available 15 DiaCorrect: Error Correction Back-end For Speaker Diarization Sep 15, 2023 Automatic Speech Recognition Decoder
Code Code Available 15 Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech May 19, 2021 Clustering Constrained Clustering
Code Code Available 15 From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization Apr 2, 2022 speaker-diarization Speaker Diarization
Code Code Available 15 Encoder-Decoder Based Attractors for End-to-End Neural Diarization Jun 20, 2021 Decoder speaker-diarization
Code Code Available 15 Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap Mar 5, 2020 Clustering speaker-diarization
Code Code Available 15 End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification Feb 24, 2020 Clustering General Classification
Code Code Available 15 AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection Jan 5, 2019 Active Speaker Detection Audio-Visual Active Speaker Detection
Code Code Available 15 AVA-AVD: Audio-Visual Speaker Diarization in the Wild Nov 29, 2021 Relation Network speaker-diarization
Code Code Available 15 End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors May 20, 2020 Clustering Decoder
Code Code Available 15 BER: Balanced Error Rate For Speaker Diarization Nov 8, 2022 speaker-diarization Speaker Diarization
Code Code Available 15 BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications Oct 12, 2021 Action Detection Activity Detection
Code Code Available 15 Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm Oct 21, 2020 speaker-diarization Speaker Diarization
Code Code Available 15 Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation Oct 24, 2022 Action Detection Activity Detection
Code Code Available 15 Speaker Diarization as a Fully Online Learning Problem in MiniVox Jun 8, 2020 Self-Supervised Learning speaker-diarization
Code Code Available 15 The Third DIHARD Diarization Challenge Dec 2, 2020 speaker-diarization Speaker Diarization