TriDet: Temporal Action Detection with Relative Boundary Modeling

2023-03-13 · CVPR 2023 · Code Available

Dingfeng Shi, Yujie Zhong, Qiong Cao, Lin Ma, Jia Li, Dacheng Tao


Code

github.com/dingfengshi/tridet
Official · In paper · PyTorch · ★ 211

Abstract

In this paper, we present TriDet, a one-stage framework for temporal action detection. Existing methods often suffer from imprecise boundary predictions due to the ambiguous action boundaries in videos. To alleviate this problem, we propose a novel Trident-head that models the action boundary via an estimated relative probability distribution around the boundary. In the feature pyramid of TriDet, we propose an efficient Scalable-Granularity Perception (SGP) layer to mitigate the rank loss problem of self-attention that takes place in the video features and to aggregate information across different temporal granularities. Benefiting from the Trident-head and the SGP-based feature pyramid, TriDet achieves state-of-the-art performance on three challenging benchmarks (THUMOS14, HACS and EPIC-KITCHENS-100) with lower computational costs than previous methods. For example, TriDet reaches an average mAP of 69.3% on THUMOS14, outperforming the previous best by 2.5%, but with only 74.6% of its latency. The code is available at https://github.com/sssste/TriDet.
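The idea of reading a boundary from a relative probability distribution can be illustrated with a minimal sketch. It is not the authors' implementation: it only assumes that a head produces one score per candidate offset bin near a time step, turns those scores into a distribution with a softmax, and takes the expected bin index as the boundary offset. The names `expected_offset` and `decode_segment`, and the bin layout, are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_offset(logits):
    """Expected bin index under the softmax distribution.

    Assumption: bin i means "the boundary lies i steps away from
    the current time step".
    """
    probs = softmax(logits)
    return sum(i * p for i, p in enumerate(probs))


def decode_segment(t, start_logits, end_logits):
    """Turn per-bin scores at time step t into a (start, end) pair."""
    start = t - expected_offset(start_logits)
    end = t + expected_offset(end_logits)
    return start, end


# Uniform scores over 3 and 5 bins give expected offsets of 1.0 and 2.0.
segment = decode_segment(10, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0])
```

Because the offset is an expectation rather than a single regressed value, an ambiguous boundary shows up as a spread-out distribution instead of one noisy point estimate.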

Tasks

Action Detection, Temporal Action Localization

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
ActivityNet-1.3	TriDet (TSP features)	mAP	36.8	—	Unverified
EPIC-KITCHENS-100	TriDet (verb)	Avg mAP (0.1-0.5)	25.4	—	Unverified
HACS	TriDet (SlowFast)	Average-mAP	38.6	—	Unverified
HACS	TriDet (I3D RGB)	Average-mAP	36.8	—	Unverified
THUMOS14	TriDet (I3D features)	Avg mAP (0.3:0.7)	69.3	—	Unverified
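The mAP metrics in the table are computed at temporal IoU (tIoU) thresholds. The sketch below shows how the overlap between a predicted and a ground-truth segment might be measured and checked against a threshold range. It assumes that "0.3:0.7" means thresholds from 0.3 to 0.7 in steps of 0.1; the function names and the example segments are hypothetical and are not taken from the paper.

```python
def temporal_iou(pred, gt):
    """Intersection over union of two (start, end) segments in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def matched_thresholds(pred, gt, thresholds=(0.3, 0.4, 0.5, 0.6, 0.7)):
    """Thresholds at which the prediction would count as a correct detection."""
    iou = temporal_iou(pred, gt)
    return [t for t in thresholds if iou >= t]


# Toy segments: 2 s of overlap out of a 6 s union gives tIoU = 1/3.
iou = temporal_iou((2.0, 6.0), (4.0, 8.0))
hits = matched_thresholds((2.0, 6.0), (4.0, 8.0))
```

An average mAP then corresponds to computing mAP separately at each threshold and taking the mean, so imprecise boundaries are penalised most at the stricter thresholds.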
