Temporal Action Localization
Temporal Action Localization aims to detect activities in a video stream and output the start and end timestamps of each detected activity. It is closely related to Temporal Action Proposal Generation.
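Most of the metrics reported on this page are mAP scores evaluated at one or more temporal IoU (tIoU) thresholds, for example "Avg mAP (0.3:0.7)". The page does not define them, so the following is only a minimal sketch of the underlying overlap measure. It assumes segments are `(start, end)` pairs in seconds and that "0.3:0.7" means the thresholds 0.3, 0.4, 0.5, 0.6, 0.7, which is the usual convention but is not stated here. The function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: a segment is (start_seconds, end_seconds) with start <= end.
Segment = Tuple[float, float]

# Assumption: "0.3:0.7" denotes these five thresholds (step 0.1).
THRESHOLDS: List[float] = [0.3, 0.4, 0.5, 0.6, 0.7]


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two time intervals."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def matches_at_thresholds(
    pred: Segment, gt: Segment, thresholds: List[float]
) -> Dict[float, bool]:
    """For each threshold, report whether the prediction overlaps enough
    with the ground-truth segment to count as a correct detection."""
    iou = temporal_iou(pred, gt)
    return {t: iou >= t for t in thresholds}


if __name__ == "__main__":
    predicted = (14.0, 22.0)     # hypothetical model output
    ground_truth = (10.0, 20.0)  # hypothetical annotation
    print(temporal_iou(predicted, ground_truth))
    print(matches_at_thresholds(predicted, ground_truth, THRESHOLDS))
```

A full mAP computation additionally ranks predictions by confidence, computes average precision per class at each threshold, and averages the results; that part is omitted from this sketch.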
Datasets: THUMOS14, ActivityNet-1.3, HACS, FineAction, MultiTHUMOS, CrossTask, EPIC-KITCHENS-100, MUSES, ActivityNet-1.2, Ego4D MQ test, Ego4D MQ val, MEXaction2
Benchmark Results
THUMOS14

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AdaTAD (VideoMAEv2-giant) | Avg mAP (0.3:0.7) | 76.9 | — | Unverified |
| 2 | RDFA-S6 (InternVideo2-6B) | Avg mAP (0.3:0.7) | 74.2 | — | Unverified |
| 3 | ActionMamba (InternVideo2-6B) | Avg mAP (0.3:0.7) | 72.72 | — | Unverified |
| 4 | GCM | mAP IoU@0.1 | 72.5 | — | Unverified |
| 5 | AGT | mAP IoU@0.1 | 72.1 | — | Unverified |
| 6 | InternVideo2-6B | Avg mAP (0.3:0.7) | 72 | — | Unverified |
| 7 | ActionFormer (InternVideo features) | Avg mAP (0.3:0.7) | 71.58 | — | Unverified |
| 8 | TriDet (VideoMAE v2-g feature) | Avg mAP (0.3:0.7) | 70.1 | — | Unverified |
| 9 | InternVideo2-1B | Avg mAP (0.3:0.7) | 69.8 | — | Unverified |
| 10 | ActionFormer (VideoMAE V2-g features) | Avg mAP (0.3:0.7) | 69.6 | — | Unverified |

ActivityNet-1.3

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UnLoc-L | mAP IoU@0.5 | 59.3 | — | Unverified |
| 2 | RDFA-S6 (InternVideo2-6B) | mAP | 42.9 | — | Unverified |
| 3 | ActionMamba (InternVideo2-6B) | mAP | 42.02 | — | Unverified |
| 4 | PRN+BMN (ensemble) | mAP | 42 | — | Unverified |
| 5 | AdaTAD (VideoMAEv2-giant) | mAP | 41.93 | — | Unverified |
| 6 | InternVideo2-6B | mAP | 41.2 | — | Unverified |
| 7 | InternVideo2-1B | mAP | 40.4 | — | Unverified |
| 8 | UniMD+Sync. | mAP | 39.83 | — | Unverified |
| 9 | PRN (CSN) | mAP | 39.4 | — | Unverified |
| 10 | InternVideo | mAP | 39 | — | Unverified |

HACS

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | RDFA-S6 (InternVideo2-6B) | Average-mAP | 45.8 | — | Unverified |
| 2 | ActionMamba (InternVideo2-6B) | Average-mAP | 44.56 | — | Unverified |
| 3 | DyFADet (VideoMAEv2) | Average-mAP | 44.3 | — | Unverified |
| 4 | InternVideo2-6B | Average-mAP | 43.3 | — | Unverified |
| 5 | TriDet (VideoMAEv2) | Average-mAP | 43.1 | — | Unverified |
| 6 | InternVideo2-1B | Average-mAP | 42.4 | — | Unverified |
| 7 | InternVideo | Average-mAP | 41.55 | — | Unverified |
| 8 | TriDet (SlowFast) | Average-mAP | 38.6 | — | Unverified |
| 9 | TriDet (I3D RGB) | Average-mAP | 36.8 | — | Unverified |
| 10 | TadTr (I3D RGB) | Average-mAP | 32.09 | — | Unverified |

FineAction

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | RDFA-S6 (InternVideo2-6B) | mAP | 29.6 | — | Unverified |
| 2 | ActionMamba (InternVideo2-6B) | mAP | 29.04 | — | Unverified |
| 3 | InternVideo2-6B | mAP | 27.7 | — | Unverified |
| 4 | DyFADet (VideoMAE v2-g) | mAP | 23.8 | — | Unverified |
| 5 | VideoMAE V2-g | mAP | 18.24 | — | Unverified |
| 6 | InternVideo | mAP | 17.57 | — | Unverified |
| 7 | BMN (i3d feature) | mAP | 9.25 | — | Unverified |
| 8 | G-TAD (i3d feature) | mAP | 9.06 | — | Unverified |
| 9 | DBG (i3d feature) | mAP | 6.75 | — | Unverified |

MultiTHUMOS

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TriDet (VideoMAEv2) | Average mAP | 37.5 | — | Unverified |
| 2 | DualDETR (I3D-rgb) | Average mAP | 32.64 | — | Unverified |
| 3 | TriDet (I3D-rgb) | Average mAP | 30.7 | — | Unverified |
| 4 | TemporalMaxer | Average mAP | 29.9 | — | Unverified |
| 5 | PointTAD | Average mAP | 23.5 | — | Unverified |
| 6 | PDAN | Average mAP | 17.3 | — | Unverified |
| 7 | MS-TCT | Average mAP | 16.2 | — | Unverified |
| 8 | MLAD | Average mAP | 14.2 | — | Unverified |

CrossTask

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | VideoCLIP | Recall | 47.3 | — | Unverified |
| 2 | VLM | Recall | 46.5 | — | Unverified |
| 3 | TACo | Recall | 42.5 | — | Unverified |
| 4 | Text-Video Embedding | Recall | 33.6 | — | Unverified |
| 5 | Fully-supervised upper-bound | Recall | 31.6 | — | Unverified |
| 6 | Zhukov | Recall | 22.4 | — | Unverified |
| 7 | Alayrac | Recall | 13.3 | — | Unverified |

EPIC-KITCHENS-100

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | AdaTAD (verb, VideoMAE-L) | Avg mAP (0.1-0.5) | 29.3 | — | Unverified |
| 2 | TriDet (verb) | Avg mAP (0.1-0.5) | 25.4 | — | Unverified |
| 3 | TemporalMaxer (verb) | Avg mAP (0.1-0.5) | 24.5 | — | Unverified |
| 4 | ActionFormer (verb) | Avg mAP (0.1-0.5) | 23.5 | — | Unverified |
| 5 | G-TAD (verb) | Avg mAP (0.1-0.5) | 9.4 | — | Unverified |
| 6 | BMN (verb) | Avg mAP (0.1-0.5) | 8.4 | — | Unverified |

MUSES

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TemporalMaxer | mAP | 27.2 | — | Unverified |
| 2 | MUSES | mAP | 18.6 | — | Unverified |

ActivityNet-1.2

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DeepMetricLearner | mAP IoU@0.5 | 35.2 | — | Unverified |

Ego4D MQ test

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ActionFormer (SlowFast+Omnivore+EgoVLP) | Average mAP | 21.76 | — | Unverified |

Ego4D MQ val

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ActionFormer (SlowFast+Omnivore+EgoVLP) | Average mAP | 21.4 | — | Unverified |

MEXaction2

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | S-CNN | mAP | 7.4 | — | Unverified |