SOTAVerified

audio-visual event localization

Papers

Showing 11–20 of 26 papers

| Title | Status | Hype |
|---|---|---|
| Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline | Code | 1 |
| CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization | Code | 0 |
| Leveraging the Video-level Semantic Consistency of Event for Audio-visual Event Localization | Code | 0 |
| Label-anticipated Event Disentanglement for Audio-Visual Video Parsing | | 0 |
| Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-wise Pseudo Labeling | | 0 |
| AVE-CLIP: AudioCLIP-based Multi-window Temporal Transformer for Audio Visual Event Localization | | 0 |
| Audio-Visual Semantic Graph Network for Audio-Visual Event Localization | | 0 |
| MPN: Multimodal Parallel Network for Audio-Visual Event Localization | | 0 |
| Multimodal Trustworthy Semantic Communication for Audio-Visual Event Localization | | 0 |
| Multi-Modulation Network for Audio-Visual Event Localization | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UnAV | mAP | 47.8 | | Unverified |
| 2 | ActionFormer | mAP | 42.2 | | Unverified |