Action Segmentation
Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, it aims to split a temporally untrimmed video into segments along the time axis and to assign each segment one label from a predefined set of actions. The resulting segmentation can then serve as input to downstream applications such as video-to-text and action localization.
Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
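As an illustration of the task's output, per-frame action labels can be collapsed into labeled temporal segments. The following is a minimal sketch, not taken from the cited paper: the function name `frames_to_segments`, the label strings, and the half-open `(label, start, end)` convention are all assumptions made for the example.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse a per-frame label sequence into (label, start, end) segments.

    `start` is inclusive and `end` is exclusive, both measured in frames.
    This convention is an assumption for illustration only.
    """
    segments = []
    start = 0
    # groupby yields one group per run of consecutive identical labels.
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a 7-frame video.
frame_labels = ["background", "background", "crack_egg", "crack_egg",
                "crack_egg", "stir", "stir"]
segments = frames_to_segments(frame_labels)
print(segments)
```

Each tuple in `segments` describes one contiguous stretch of the video carrying a single action label, which is the form that downstream applications would typically consume.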
Datasets: Breakfast, 50 Salads, GTEA, COIN, Assembly101, JIGSAWS, Youtube INRIA Instructional, MPII Cooking 2 Dataset
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UnLoc-L | Frame accuracy | 72.8 | — | Unverified |
| 2 | UniVL | Frame accuracy | 70 | — | Unverified |
| 3 | Norton | Frame accuracy | 69.8 | — | Unverified |
| 4 | VideoCLIP | Frame accuracy | 68.7 | — | Unverified |
| 5 | VLM | Frame accuracy | 68.4 | — | Unverified |
| 6 | TACo | Frame accuracy | 68.4 | — | Unverified |
| 7 | MIL-NCE | Frame accuracy | 61 | — | Unverified |
| 8 | ActBERT | Frame accuracy | 57 | — | Unverified |
| 9 | CBT | Frame accuracy | 53.9 | — | Unverified |
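
The frame accuracy metric reported above is commonly understood as the percentage of frames whose predicted label matches the ground-truth label. The sketch below shows that computation under this assumption; the function name `frame_accuracy` and the example labels are hypothetical, and individual benchmarks may apply extra rules (for example, excluding background frames) that are not modeled here.

```python
def frame_accuracy(predicted, ground_truth):
    """Return the percentage of frames whose predicted label is correct.

    Assumes both sequences hold one label per frame for the same video.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(p == g for p, g in zip(predicted, ground_truth))
    return 100.0 * correct / len(ground_truth)


# Hypothetical 5-frame example: frames 0, 2 and 3 are labeled correctly.
predicted = ["pour", "pour", "stir", "stir", "stir"]
ground_truth = ["pour", "stir", "stir", "stir", "idle"]
score = frame_accuracy(predicted, ground_truth)
print(score)
```

Because every frame counts equally, long actions dominate this score, which is one reason segment-level metrics are often reported alongside it.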