SOTAVerified

Activity Detection

Detecting activities in extended videos.

Papers

Showing 1-25 of 380 papers

Title | Status | Hype
Moshi: a speech-text foundation model for real-time dialogue | Code | 9
pyannote.audio: neural building blocks for speaker diarization | Code | 3
audino: A Modern Annotation Tool for Audio and Speech | Code | 2
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0 | Code | 1
Exploiting Temporal Side Information in Massive IoT Connectivity | Code | 1
Online speaker diarization of meetings guided by speech separation | Code | 1
MM-ALT: A Multimodal Automatic Lyric Transcription System | Code | 1
InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation | Code | 1
AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence | Code | 1
Low-Latency Speech Separation Guided Diarization for Telephone Conversations | Code | 1
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Code | 1
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization | Code | 1
End-to-end speaker segmentation for overlap-aware resegmentation | Code | 1
NAS-VAD: Neural Architecture Search for Voice Activity Detection | Code | 1
AV Taris: Online Audio-Visual Speech Recognition | Code | 1
A Hybrid CNN-BiLSTM Voice Activity Detector | Code | 1
An End-to-End Architecture for Keyword Spotting and Voice Activity Detection | Code | 1
HGCN: Harmonic gated compensation network for speech enhancement | Code | 1
Classification of Abnormal Hand Movement for Aiding in Autism Detection: Machine Learning Study | Code | 1
A semi-supervised methodology for fishing activity detection using the geometry behind the trajectory of multiple vessels | Code | 1
Harvesting Ambient RF for Presence Detection Through Deep Learning | Code | 1
Learning spectro-temporal representations of complex sounds with parameterized neural networks | Code | 1
ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development | Code | 1
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Code | 1
ROAD: The ROad event Awareness Dataset for Autonomous Driving | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CNN-BiLSTM_best | ROC-AUC | 95.14 | - | Unverified
2 | CNN-BiLSTM_small | ROC-AUC | 95.13 | - | Unverified
3 | SG-VAD (ours) | ROC-AUC | 94.3 | - | Unverified
4 | ADA-VAD | ROC-AUC | 79.1 | - | Unverified