SOTAVerified

Action Segmentation

Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, it divides a temporally untrimmed video into consecutive time segments and assigns each segment one label from a predefined set of actions. The resulting segmentation can serve as input to downstream applications such as video-to-text and action localization.

Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
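
As a rough illustration of the task definition above, the sketch below collapses a per-frame label sequence into labeled time segments. The function name `frames_to_segments`, the half-open `(label, start, end)` convention, and the example labels are assumptions made for illustration; they are not taken from the TricorNet paper or any listed method.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) runs.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a six-frame video.
frames = ["pour", "pour", "pour", "stir", "stir", "pour"]
print(frames_to_segments(frames))
# [('pour', 0, 3), ('stir', 3, 5), ('pour', 5, 6)]
```

Note that the same action label can appear in more than one segment, as "pour" does here.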

Papers

Showing 1–10 of 219 papers

| Title | Status | Hype |
|---|---|---|
| Self-supervised pretraining of vision transformers for animal behavioral analysis and neural encoding | | 0 |
| HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios | Code | 0 |
| EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models | Code | 1 |
| M2R2: MulitModal Robotic Representation for Temporal Action Segmentation | | 0 |
| Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos | | 0 |
| Towards Generalizing Temporal Action Segmentation to Unseen Views | | 0 |
| What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning | | 0 |
| Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation | Code | 0 |
| Condensing Action Segmentation Datasets via Generative Network Inversion | | 0 |
| End-to-End Action Segmentation Transformer | | 0 |

Benchmark Results

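| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | RL+Tree | Edit Distance | 88.53 | | Unverified |
| 2 | RL (full) | Edit Distance | 87.96 | | Unverified |
| 3 | TricorNet | Edit Distance | 86.8 | | Unverified |
| 4 | SDL+SC-CRF | Edit Distance | 86.21 | | Unverified |
| 5 | TCN | Edit Distance | 83.1 | | Unverified |
| 6 | ST-CNN+Seg | Edit Distance | 66.56 | | Unverified |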
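
The page does not define how the Edit Distance metric is computed. A minimal sketch of one common formulation, the segmental edit score, is given below: per-frame labels are collapsed into an ordered list of segment labels, and the Levenshtein distance between the predicted and ground-truth lists is normalized to a 0 to 100 scale where higher is better. The helper names (`segment_labels`, `levenshtein`, `edit_score`) and the normalization are assumptions; the listed benchmark entries may use a different variant.

```python
from itertools import groupby


def segment_labels(frame_labels):
    """Collapse per-frame labels into the ordered list of segment labels."""
    return [label for label, _ in groupby(frame_labels)]


def levenshtein(a, b):
    """Levenshtein distance between two sequences (unit insert/delete/substitute costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_score(pred_frames, true_frames):
    """Assumed normalization: 100 * (1 - distance / longer segment list)."""
    pred = segment_labels(pred_frames)
    true = segment_labels(true_frames)
    longest = max(len(pred), len(true))
    if longest == 0:
        return 100.0  # two empty sequences are treated as a perfect match
    return 100.0 * (longest - levenshtein(pred, true)) / longest


# Hypothetical labels: the second prediction over-segments the first action.
truth = ["a", "a", "b", "b", "c", "c"]
print(edit_score(["a", "a", "b", "c", "c", "c"], truth))  # 100.0
print(edit_score(["a", "b", "a", "b", "c", "c"], truth))  # 60.0
```

Because the score depends only on the order of segment labels, the first prediction scores 100 despite a misplaced boundary, while the second is penalized for its two spurious segments.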