Title | Date | Tasks | Code
psifx -- Psychological and Social Interactions Feature Extraction Package | Jul 14, 2024 | Pose Estimation speaker-diarization | Unverified
A Benchmark for Multi-speaker Anonymization | Jul 8, 2024 | Benchmarking Disentanglement | Unverified
Systematic Evaluation of Online Speaker Diarization Systems Regarding their Latency | Jul 5, 2024 | Online Clustering Segmentation | Unverified
The USTC-NERCSLIP Systems for The ICMC-ASR Challenge | Jul 2, 2024 | Automatic Speech Recognition Pseudo Label | Unverified
Towards Unsupervised Speaker Diarization System for Multilingual Telephone Calls Using Pre-trained Whisper Model and Mixture of Sparse Autoencoders | Jul 2, 2024 | Clustering speaker-diarization | Unverified
Audio-Visual Approach For Multimodal Concurrent Speaker Detection | Jul 1, 2024 | Multimodal Deep Learning speaker-diarization | Unverified
Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios | Jul 1, 2024 | speaker-diarization Speaker Diarization | Unverified
From Modular to End-to-End Speaker Diarization | Jun 27, 2024 | speaker-diarization Speaker Diarization | Unverified
Speakers Unembedded: Embedding-free Approach to Long-form Neural Diarization | Jun 26, 2024 | Clustering Form | Unverified
AG-LSEC: Audio Grounded Lexical Speaker Error Correction | Jun 25, 2024 | Language Modeling Language Modelling | Unverified
Investigating Confidence Estimation Measures for Speaker Diarization | Jun 24, 2024 | speaker-diarization Speaker Diarization | Unverified
A Review of Common Online Speaker Diarization Methods | Jun 20, 2024 | speaker-diarization Speaker Diarization | Unverified
System Description for the Displace Speaker Diarization Challenge 2023 | Jun 20, 2024 | Clustering speaker-diarization | Unverified
Joint vs Sequential Speaker-Role Detection and Automatic Speech Recognition for Air-traffic Control | Jun 19, 2024 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified
The BabyView dataset: High-resolution egocentric videos of infants' and young children's everyday experiences | Jun 14, 2024 | Depth Estimation Image Segmentation | Unverified
Exploring Spoken Language Identification Strategies for Automatic Transcription of Multilingual Broadcast and Institutional Speech | Jun 13, 2024 | Language Identification speaker-diarization | Unverified
The Second DISPLACE Challenge : DIarization of SPeaker and LAnguage in Conversational Environments | Jun 13, 2024 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified
Neural Blind Source Separation and Diarization for Distant Speech Recognition | Jun 12, 2024 | blind source separation Distant Speech Recognition | Unverified
Target Speech Diarization with Multimodal Prompts | Jun 11, 2024 | speaker-diarization Speaker Diarization | Unverified
ASoBO: Attentive Beamformer Selection for Distant Speaker Diarization in Meetings | Jun 5, 2024 | speaker-diarization Speaker Diarization | Unverified
Speaker Embeddings With Weakly Supervised Voice Activity Detection For Efficient Speaker Diarization | May 15, 2024 | Action Detection Activity Detection | Unverified
A Semi-Automatic Approach to Create Large Gender- and Age-Balanced Speaker Corpora: Usefulness of Speaker Diarization & Identification | Apr 26, 2024 | speaker-diarization Speaker Diarization | Unverified
Unsupervised Speaker Diarization in Distributed IoT Networks Using Federated Learning | Apr 16, 2024 | Change Detection Federated Learning | Unverified
3D-Speaker-Toolkit: An Open-Source Toolkit for Multimodal Speaker Verification and Diarization | Mar 29, 2024 | Self-Supervised Learning speaker-diarization | Code Available
Assessing the Robustness of Spectral Clustering for Deep Speaker Diarization | Mar 21, 2024 | Clustering speaker-diarization | Unverified
Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications | Mar 11, 2024 | Action Detection Activity Detection | Unverified
Listening to Multi-talker Conversations: Modular and End-to-end Perspectives | Feb 14, 2024 | GPU speaker-diarization | Unverified
Channel-Combination Algorithms for Robust Distant Voice Activity and Overlapped Speech Detection | Feb 13, 2024 | Action Detection Activity Detection | Unverified
The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models | Feb 12, 2024 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified
Spatial-Temporal Activity-Informed Diarization and Separation | Jan 30, 2024 | speaker-diarization Speaker Diarization | Unverified
End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization | Jan 23, 2024 | Clustering Graph Clustering | Code Available
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription | Jan 16, 2024 | Automatic Speech Recognition Benchmarking | Unverified
Multi-Input Multi-Output Target-Speaker Voice Activity Detection For Unified, Flexible, and Robust Audio-Visual Speaker Diarization | Jan 16, 2024 | Action Detection Activity Detection | Unverified
Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation | Jan 7, 2024 | Audio-Visual Speech Recognition Automatic Speech Recognition | Code Available
Uncertainty Quantification in Machine Learning for Joint Speaker Diarization and Identification | Dec 28, 2023 | speaker-diarization Speaker Diarization | Unverified
Speaker Mask Transformer for Multi-talker Overlapped Speech Recognition | Dec 18, 2023 | speaker-diarization Speaker Diarization | Unverified
EEND-DEMUX: End-to-End Neural Speaker Diarization via Demultiplexed Speaker Embeddings | Dec 11, 2023 | speaker-diarization Speaker Diarization | Unverified
Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization | Dec 7, 2023 | Decoder speaker-diarization | Unverified
Summary of the DISPLACE Challenge 2023 -- DIarization of SPeaker and LAnguage in Conversational Environments | Nov 21, 2023 | speaker-diarization Speaker Diarization | Unverified
UniX-Encoder: A Universal X-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing | Oct 25, 2023 | speaker-diarization Speaker Diarization | Unverified
EmoDiarize: Speaker Diarization and Emotion Identification from Speech Signals using Convolutional Neural Networks | Oct 19, 2023 | Data Augmentation Emotion Recognition | Unverified
Powerset multi-class cross entropy loss for neural speaker diarization | Oct 19, 2023 | Multi-class Classification Multi-Label Classification | Code Available
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System | Oct 18, 2023 | Automatic Speech Recognition speaker-diarization | Unverified
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation | Oct 18, 2023 | Action Detection Activity Detection | Unverified
End-to-end Online Speaker Diarization with Target Speaker Tracking | Oct 12, 2023 | Action Detection Activity Detection | Unverified
One model to rule them all ? Towards End-to-End Joint Speaker Diarization and Speech Recognition | Oct 2, 2023 | All Automatic Speech Recognition | Unverified
NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization | Sep 22, 2023 | Automatic Speech Recognition speaker-diarization | Unverified
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation | Sep 19, 2023 | speaker-diarization Speaker Diarization | Unverified
Towards Word-Level End-to-End Neural Speaker Diarization with Auxiliary Network | Sep 15, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified
Aligning Speakers: Evaluating and Visualizing Text-based Diarization Using Efficient Multiple Sequence Alignment (Extended Version) | Sep 14, 2023 | Multiple Sequence Alignment speaker-diarization | Unverified