Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Oct 24, 2022 | Action Detection, Activity Detection | Code Available 1
Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization | Oct 7, 2022 | Knowledge Distillation, Speaker Diarization | Unverified 0
Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting | Sep 24, 2022 | Speaker Diarization | Unverified 0
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization | Aug 27, 2022 | Action Detection, Activity Detection | Unverified 0
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines | Aug 17, 2022 | Machine Translation, Speaker Diarization | Code Available 1
Chronological Self-Training for Real-Time Speaker Diarization | Aug 5, 2022 | Speaker Diarization | Unverified 0
Utterance-by-utterance overlap-aware neural diarization with Graph-PIT | Jul 28, 2022 | Clustering, Segmentation | Code Available 1
Unsupervised Speaker Diarization that is Agnostic to Language, Overlap-Aware, and Tuning Free | Jul 25, 2022 | Speaker Diarization | Unverified 0
Online Target Speaker Voice Activity Detection for Speaker Diarization | Jul 13, 2022 | Action Detection, Activity Detection | Unverified 0
Speaker Diarization and Identification from Single-Channel Classroom Audio Recording Using Virtual Microphones | Jul 1, 2022 | Speaker Diarization | Unverified 0
Interrelate Training and Searching: A Unified Online Clustering Framework for Speaker Diarization | Jun 28, 2022 | Clustering, Online Clustering | Unverified 0
Simultaneous Speech Extraction for Multiple Target Speakers under the Meeting Scenarios | Jun 17, 2022 | Action Detection, Activity Detection | Unverified 0
Audio-video fusion strategies for active speaker detection in meetings | Jun 9, 2022 | Active Speaker Detection, Management | Unverified 0
Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors | Jun 6, 2022 | Multi-Label Classification | Unverified 0
A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification. | Jun 1, 2022 | Speaker Diarization | Unverified 0
Bazinga! A Dataset for Multi-Party Dialogues Structuring | Jun 1, 2022 | Entity Linking, Punctuation Restoration | Unverified 0
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available 6
Bi-LSTM Scoring Based Similarity Measurement with Agglomerative Hierarchical Clustering (AHC) for Speaker Diarization | May 19, 2022 | Clustering, Speaker Diarization | Unverified 0
Reformulating Speaker Diarization as Community Detection With Emphasis On Topological Structure | Apr 26, 2022 | Clustering, Community Detection | Unverified 0
Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization | Apr 24, 2022 | Speaker Diarization | Code Available 1
Self-supervised Speaker Diarization | Apr 8, 2022 | Speaker Diarization | Unverified 0
Low-Latency Speech Separation Guided Diarization for Telephone Conversations | Apr 5, 2022 | Action Detection, Activity Detection | Code Available 1
From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization | Apr 2, 2022 | Speaker Diarization | Code Available 1
Multimodal Clustering with Role Induced Constraints for Speaker Diarization | Apr 1, 2022 | Clustering, Speaker Diarization | Unverified 0
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers | Mar 31, 2022 | Decoder, Speaker Diarization | Code Available 0
Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset | Mar 31, 2022 | Automatic Speech Recognition (ASR) | Unverified 0
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly | Mar 30, 2022 | Speaker Diarization | Unverified 0
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings | Mar 30, 2022 | Automatic Speech Recognition (ASR) | Code Available 1
Multi-scale Speaker Diarization with Dynamic Scale Weighting | Mar 30, 2022 | Decoder, Speaker Diarization | Unverified 0
Using Active Speaker Faces for Diarization in TV shows | Mar 30, 2022 | Face Clustering, Face Detection | Unverified 0
Training Speaker Embedding Extractors Using Multi-Speaker Audio with Unknown Speaker Boundaries | Mar 29, 2022 | Speaker Diarization | Unverified 0
Visualizations of Complex Sequences of Family-Infant Vocalizations Using Bag-of-Audio-Words Approach Based on Wav2vec 2.0 Features | Mar 29, 2022 | Speaker Diarization | Code Available 0
Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios | Mar 18, 2022 | Action Detection, Activity Detection | Code Available 0
Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model | Feb 14, 2022 | Clustering, Speaker Diarization | Unverified 0
The xmuspeech system for multi-channel multi-party meeting transcription challenge | Feb 11, 2022 | Speaker Diarization | Unverified 0
The USTC-Ximalaya system for the ICASSP 2022 multi-channel multi-party meeting transcription (M2MeT) challenge | Feb 10, 2022 | Action Detection, Activity Detection | Unverified 0
Royalflush Speaker Diarization System for ICASSP 2022 Multi-channel Multi-party Meeting Transcription Challenge | Feb 10, 2022 | Speaker Diarization | Unverified 0
The Volcspeech system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge | Feb 9, 2022 | Data Augmentation, Language Modelling | Unverified 0
Cross-Channel Attention-Based Target Speaker Voice Activity Detection: Experimental Results for M2MeT Challenge | Feb 6, 2022 | Action Detection, Activity Detection | Unverified 0
The CUHK-TENCENT speaker diarization system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge | Feb 4, 2022 | Action Detection, Activity Detection | Unverified 0
AVA-AVD: Audio-Visual Speaker Diarization in the Wild | Nov 29, 2021 | Relation Network, Speaker Diarization | Code Available 1
Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information | Nov 28, 2021 | Action Detection, Activity Detection | Code Available 0
Low-Latency Online Speaker Diarization with Graph-Based Label Generation | Nov 27, 2021 | Clustering, Speaker Diarization | Unverified 0
Auxiliary Loss of Transformer with Residual Connection for End-to-End Speaker Diarization | Oct 14, 2021 | Speaker Diarization | Unverified 0
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Oct 12, 2021 | Action Detection, Activity Detection | Code Available 1
Multi-Channel End-to-End Neural Diarization with Distributed Microphones | Oct 10, 2021 | Speaker Diarization | Unverified 0
TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context | Oct 8, 2021 | Speaker Diarization | Code Available 1
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR | Oct 7, 2021 | Action Detection, Activity Detection | Unverified 0
North America Bixby Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2021 | Sep 28, 2021 | Clustering, Speaker Diarization | Unverified 0
Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection | Sep 23, 2021 | Clustering, Speaker Diarization | Code Available 1