SOTAVerified

Activity Detection

Detecting activities in extended videos.

Papers

Showing 101-150 of 380 papers

| Title | Status | Hype |
|---|---|---|
| In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms | | 0 |
| The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023 | | 0 |
| Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations | Code | 0 |
| An enhanced system for the detection and active cancellation of snoring signals | | 0 |
| ivrit.ai: A Comprehensive Dataset of Hebrew Speech for AI Research and Development | Code | 1 |
| Long-term Conversation Analysis: Exploring Utility and Privacy | Code | 0 |
| Multi-microphone Automatic Speech Segmentation in Meetings Based on Circular Harmonics Features | | 0 |
| Parallel Neurosymbolic Integration with Concordia | | 0 |
| SVVAD: Personal Voice Activity Detection for Speaker Verification | | 0 |
| Building Accurate Low Latency ASR for Streaming Voice Search | | 0 |
| Joint Activity-Delay Detection and Channel Estimation for Asynchronous Massive Random Access | | 0 |
| Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction | | 0 |
| FunASR: A Fundamental End-to-End Speech Recognition Toolkit | | 0 |
| Deep Learning for Asynchronous Massive Access with Data Frame Length Diversity | | 0 |
| Joint Activity Detection and Channel Estimation for Clustered Massive Machine Type Communications | | 0 |
| Cooperative Multi-Cell Massive Access with Temporally Correlated Activity | | 0 |
| Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence | | 0 |
| Grant-free Massive Random Access with Retransmission: Receiver Optimization and Performance Analysis | | 0 |
| Evaluation of Noise Reduction Methods for Sentence Recognition by Sinhala Speaking Listeners | Code | 0 |
| Better Together: Dialogue Separation and Voice Activity Detection for Audio Personalization in TV | | 0 |
| End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations | | 0 |
| A processing framework to access large quantities of whispered speech found in ASMR | | 0 |
| Multi-Task Sub-Band Network For Deep Residual Echo Suppression | | 0 |
| TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings | Code | 1 |
| Improving Transformer-based End-to-End Speaker Diarization by Assigning Auxiliary Losses to Attention Heads | | 0 |
| Learnable Frontends that do not Learn: Quantifying Sensitivity to Filterbank Initialisation | | 0 |
| The Newsbridge -Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description | | 0 |
| KIDS: kinematics-based (in)activity detection and segmentation in a sleep case study | | 0 |
| Activity Detection for Grant-Free NOMA in Massive IoT Networks | | 0 |
| Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks | | 0 |
| Trajectory-User Linking Is Easier Than You Think | | 0 |
| BC-VAD: A Robust Bone Conduction Voice Activity Detection | | 0 |
| Proximal Gradient-Based Unfolding for Massive Random Access in IoT Networks | | 0 |
| Joint Estimation of Clustered User Activity and Correlated Channels with Unknown Covariance in mMTC | | 0 |
| Multi-timescale Event Detection in Nonintrusive Load Monitoring based on MDL Principle | | 0 |
| Token Turing Machines | | 0 |
| On using the UA-Speech and TORGO databases to validate automatic dysarthric speech classification approaches | | 0 |
| Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization | Code | 1 |
| Two-stream Multi-dimensional Convolutional Network for Real-time Violence Detection | | 0 |
| OFDM-Based Massive Connectivity for LEO Satellite Internet of Things | | 0 |
| Target-Speaker Voice Activity Detection via Sequence-to-Sequence Prediction | | 0 |
| Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition | | 0 |
| SG-VAD: Stochastic Gates Based Speech Activity Detection | Code | 1 |
| Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0 | Code | 1 |
| TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge | | 0 |
| Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation | Code | 1 |
| Intel Labs at Ego4D Challenge 2022: A Better Baseline for Audio-Visual Diarization | | 0 |
| The DKU-DukeECE Diarization System for the VoxCeleb Speaker Recognition Challenge 2022 | | 0 |
| Learnable Acoustic Frontends in Bird Activity Detection | | 0 |
| Signed Latent Factors for Spamming Activity Detection | | 0 |

Benchmark Results

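| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CNN-BiLSTM_best | ROC-AUC | 95.14 | | Unverified |
| 2 | CNN-BiLSTM_small | ROC-AUC | 95.13 | | Unverified |
| 3 | SG-VAD (ours) | ROC-AUC | 94.3 | | Unverified |
| 4 | ADA-VAD | ROC-AUC | 79.1 | | Unverified |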