SOTAVerified

Activity Detection

Detecting activities in extended videos.

Papers

Showing 1-50 of 380 papers

Title | Status | Hype
Moshi: a speech-text foundation model for real-time dialogue | Code | 9
pyannote.audio: neural building blocks for speaker diarization | Code | 3
audino: A Modern Annotation Tool for Audio and Speech | Code | 2
Speaker Diarization with Overlapping Community Detection Using Graph Attention Networks and Label Propagation Algorithm | Code | 1
VANPY: Voice Analysis Framework | Code | 1
WiFi CSI Based Temporal Activity Detection via Dual Pyramid Network | Code | 1
InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation | Code | 1
Online speaker diarization of meetings guided by speech separation | Code | 1
ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development | Code | 1
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings | Code | 1
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization | Code | 1
SG-VAD: Stochastic Gates Based Speech Activity Detection | Code | 1
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0 | Code | 1
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Code | 1
MM-ALT: A Multimodal Automatic Lyric Transcription System | Code | 1
A semi-supervised methodology for fishing activity detection using the geometry behind the trajectory of multiple vessels | Code | 1
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering | Code | 1
Low-Latency Speech Separation Guided Diarization for Telephone Conversations | Code | 1
HGCN: Harmonic gated compensation network for speech enhancement | Code | 1
NAS-VAD: Neural Architecture Search for Voice Activity Detection | Code | 1
Exploiting Temporal Side Information in Massive IoT Connectivity | Code | 1
X-Vector based voice activity detection for multi-genre broadcast speech-to-text | Code | 1
AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence | Code | 1
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Code | 1
Classification of Abnormal Hand Movement for Aiding in Autism Detection: Machine Learning Study | Code | 1
WASE: Learning When to Attend for Speaker Extraction in Cocktail Party Environments | Code | 1
End-to-end speaker segmentation for overlap-aware resegmentation | Code | 1
Learning spectro-temporal representations of complex sounds with parameterized neural networks | Code | 1
A Hybrid CNN-BiLSTM Voice Activity Detector | Code | 1
ROAD: The ROad event Awareness Dataset for Autonomous Driving | Code | 1
AV Taris: Online Audio-Visual Speech Recognition | Code | 1
VoxLingua107: a Dataset for Spoken Language Recognition | Code | 1
Harvesting Ambient RF for Presence Detection Through Deep Learning | Code | 1
An End-to-End Architecture for Keyword Spotting and Voice Activity Detection | Code | 1
CBF-AFA: Chunk-Based Multi-SSL Fusion for Automatic Fluency Assessment | - | 0
Distributed Activity Detection for Cell-Free Hybrid Near-Far Field Communications | - | 0
Attention Is Not Always the Answer: Optimizing Voice Activity Detection with Simple Feature Fusion | - | 0
Joint Activity Detection and Channel Estimation for Massive Connectivity: Where Message Passing Meets Score-Based Generative Priors | - | 0
Towards Robust Overlapping Speech Detection: A Speaker-Aware Progressive Approach Using WavLM | - | 0
Robust Activity Detection for Massive Random Access | - | 0
Improving endpoint detection in end-to-end streaming ASR for conversational speech | - | 0
Multi-Stage Speaker Diarization for Noisy Classrooms | Code | 0
MicroNAS: An Automated Framework for Developing a Fall Detection System | - | 0
Fast MLE and MAPE-Based Device Activity Detection for Grant-Free Access via PSCA and PSCA-Net | - | 0
Federated Learning for Secure and Efficient Device Activity Detection in mMTC Networks | - | 0
Lightweight Learning for Grant-Free Activity Detection in Cell-Free Massive MIMO Networks | - | 0
Robust Learning-Based Sparse Recovery for Device Activity Detection in Grant-Free Random Access Cell-Free Massive MIMO: Enhancing Resilience to Impairments | - | 0
CADDI: An in-Class Activity Detection Dataset using IMU data from low-cost sensors | - | 0
Optimizing Large Language Models for ESG Activity Detection in Financial Texts | Code | 0
Mixture of Experts-augmented Deep Unfolding for Activity Detection in IRS-aided Systems | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CNN-BiLSTM_best | ROC-AUC | 95.14 | - | Unverified
2 | CNN-BiLSTM_small | ROC-AUC | 95.13 | - | Unverified
3 | SG-VAD (ours) | ROC-AUC | 94.3 | - | Unverified
4 | ADA-VAD | ROC-AUC | 79.1 | - | Unverified