| Moshi: a speech-text foundation model for real-time dialogue | Sep 17, 2024 | Action Detection, Activity Detection | Code Available | 9 | 5 |
| pyannote.audio: neural building blocks for speaker diarization | Nov 4, 2019 | Action Detection, Activity Detection | Code Available | 3 | 5 |
| audino: A Modern Annotation Tool for Audio and Speech | Jun 9, 2020 | Action Detection, Activity Detection | Code Available | 2 | 5 |
| AV Taris: Online Audio-Visual Speech Recognition | Dec 14, 2020 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| WASE: Learning When to Attend for Speaker Extraction in Cocktail Party Environments | Jun 13, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Oct 12, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| X-Vector based voice activity detection for multi-genre broadcast speech-to-text | Dec 9, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering | Jun 27, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Learning spectro-temporal representations of complex sounds with parameterized neural networks | Mar 12, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| MM-ALT: A Multimodal Automatic Lyric Transcription System | Jul 13, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| HGCN: Harmonic gated compensation network for speech enhancement | Jan 30, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Harvesting Ambient RF for Presence Detection Through Deep Learning | Feb 13, 2020 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Exploiting Temporal Side Information in Massive IoT Connectivity | Jan 5, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| End-to-end speaker segmentation for overlap-aware resegmentation | Apr 8, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| ROAD: The ROad event Awareness Dataset for Autonomous Driving | Feb 23, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence | Nov 2, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Oct 24, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings | Mar 7, 2023 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Classification of Abnormal Hand Movement for Aiding in Autism Detection: Machine Learning Study | Aug 18, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| WiFi CSI Based Temporal Activity Detection via Dual Pyramid Network | Dec 19, 2024 | Action Detection, Action Recognition | Code Available | 1 | 5 |
| InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation | Jun 6, 2024 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization | Nov 12, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Online speaker diarization of meetings guided by speech separation | Jan 30, 2024 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| SG-VAD: Stochastic Gates Based Speech Activity Detection | Oct 28, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0 | Oct 26, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| NAS-VAD: Neural Architecture Search for Voice Activity Detection | Jan 22, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| A Hybrid CNN-BiLSTM Voice Activity Detector | Mar 5, 2021 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| VoxLingua107: a Dataset for Spoken Language Recognition | Nov 25, 2020 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| VANPY: Voice Analysis Framework | Feb 17, 2025 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Speaker Diarization with Overlapping Community Detection Using Graph Attention Networks and Label Propagation Algorithm | Jun 3, 2025 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development | Jul 17, 2023 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| A semi-supervised methodology for fishing activity detection using the geometry behind the trajectory of multiple vessels | Jul 12, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Low-Latency Speech Separation Guided Diarization for Telephone Conversations | Apr 5, 2022 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| An End-to-End Architecture for Keyword Spotting and Voice Activity Detection | Nov 28, 2016 | Action Detection, Activity Detection | Code Available | 1 | 5 |
| Personal VAD: Speaker-Conditioned Voice Activity Detection | Aug 12, 2019 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Activity Detection for Massive Connectivity in Cell-free Networks with Unknown Large-scale Fading, Channel Statistics, Noise Variance, and Activity Probability: A Bayesian Approach | Jan 30, 2024 | Activity Detection, Variational Inference | Code Available | 0 | 5 |
| Pre-Equalization Aided Grant-Free Massive Access in Massive MIMO System | Feb 10, 2025 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Optimizing Large Language Models for ESG Activity Detection in Financial Texts | Feb 28, 2025 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Personalized Activity Recognition with Deep Triplet Embeddings | Jan 15, 2020 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Protest Activity Detection and Perceived Violence Estimation from Social Media Images | Sep 18, 2017 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| A Framework for Adapting Human-Robot Interaction to Diverse User Groups | Oct 15, 2024 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Adversarial Multi-Task Deep Learning for Noise-Robust Voice Activity Detection with Low Algorithmic Delay | Jul 4, 2022 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Long-term Conversation Analysis: Exploring Utility and Privacy | Jun 28, 2023 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Argus: Efficient Activity Detection System for Extended Video Analysis | Mar 2, 2020 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations | Aug 14, 2023 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| A Pursuit of Temporal Accuracy in General Activity Detection | Mar 8, 2017 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Learning Latent Super-Events to Detect Multiple Activities in Videos | Dec 5, 2017 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| Fine-Grained Classroom Activity Detection from Audio with Neural Networks | Jul 29, 2021 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| A Convolutional Neural Network Smartphone App for Real-Time Voice Activity Detection | Feb 1, 2018 | Action Detection, Activity Detection | Code Available | 0 | 5 |
| FunASR: A Fundamental End-to-End Speech Recognition Toolkit | May 18, 2023 | Action Detection, Activity Detection | Code Available | 0 | 5 |