SOTAVerified

Activity Detection

Detecting activities in extended videos.

Papers

Showing 1-25 of 380 papers

Title | Status | Hype
Moshi: a speech-text foundation model for real-time dialogue | Code | 9
pyannote.audio: neural building blocks for speaker diarization | Code | 3
audino: A Modern Annotation Tool for Audio and Speech | Code | 2
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization | Code | 1
MM-ALT: A Multimodal Automatic Lyric Transcription System | Code | 1
Online speaker diarization of meetings guided by speech separation | Code | 1
InaGVAD : a Challenging French TV and Radio Corpus Annotated for Speech Activity Detection and Speaker Gender Segmentation | Code | 1
AV Taris: Online Audio-Visual Speech Recognition | Code | 1
End-to-end speaker segmentation for overlap-aware resegmentation | Code | 1
HGCN: Harmonic gated compensation network for speech enhancement | Code | 1
Learning spectro-temporal representations of complex sounds with parameterized neural networks | Code | 1
An End-to-End Architecture for Keyword Spotting and Voice Activity Detection | Code | 1
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0 | Code | 1
NAS-VAD: Neural Architecture Search for Voice Activity Detection | Code | 1
A Hybrid CNN-BiLSTM Voice Activity Detector | Code | 1
A semi-supervised methodology for fishing activity detection using the geometry behind the trajectory of multiple vessels | Code | 1
BERTraffic: BERT-based Joint Speaker Role and Speaker Change Detection for Air Traffic Control Communications | Code | 1
AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence | Code | 1
Classification of Abnormal Hand Movement for Aiding in Autism Detection: Machine Learning Study | Code | 1
Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Code | 1
Exploiting Temporal Side Information in Massive IoT Connectivity | Code | 1
Harvesting Ambient RF for Presence Detection Through Deep Learning | Code | 1
ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development | Code | 1
Low-Latency Speech Separation Guided Diarization for Telephone Conversations | Code | 1
ROAD: The ROad event Awareness Dataset for Autonomous Driving | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | CNN-BiLSTM_best | ROC-AUC | 95.14 |  | Unverified
2 | CNN-BiLSTM_small | ROC-AUC | 95.13 |  | Unverified
3 | SG-VAD (ours) | ROC-AUC | 94.3 |  | Unverified
4 | ADA-VAD | ROC-AUC | 79.1 |  | Unverified