SOTAVerified

audio-visual event localization

Papers

Showing 1-10 of 26 papers

Title | Status | Hype
ActionFormer: Localizing Moments of Actions with Transformers | Code | 2
Positive Sample Propagation along the Audio-Visual Event Line | Code | 1
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline | Code | 1
Audio-Visual Event Localization in Unconstrained Videos | Code | 1
MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing | Code | 1
Modality-Independent Teachers Meet Weakly-Supervised Audio-Visual Event Parser | Code | 1
Dual-modality seq2seq network for audio-visual event localization | Code | 1
Cross-Modal Background Suppression for Audio-Visual Event Localization | Code | 1
Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration | Code | 1
Towards Open-Vocabulary Audio-Visual Event Localization | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | UnAV | mAP | 47.8 | - | Unverified
2 | ActionFormer | mAP | 42.2 | - | Unverified