Action Segmentation

Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, Action Segmentation aims to divide a temporally untrimmed video into segments along the time axis and to label each segment with one of a set of pre-defined action labels. The results of Action Segmentation can then be used as input to various applications, such as video-to-text and action localization.
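As a rough illustration of the task output described above, the sketch below collapses a per-frame label sequence into labeled temporal segments. The function name `frames_to_segments`, the label strings, and the half-open `(label, start, end)` convention are assumptions made for this example; they are not taken from any particular paper or library.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) runs.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a 10-frame clip.
frames = ["pour", "pour", "pour", "stir", "stir",
          "background", "background", "stir", "stir", "stir"]
segments = frames_to_segments(frames)
for label, start, end in segments:
    print(f"{label}: frames {start} to {end - 1}")
```

The same action label can appear in several segments, as `stir` does here, which is why the output is a list of runs and not a mapping from label to time span.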

Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation

Papers

Showing 1–10 of 219 papers

Title	Date	Tasks	Status	Hype
Self-supervised pretraining of vision transformers for animal behavioral analysis and neural encoding	Jul 13, 2025	Action Segmentation, Contrastive Learning	Unverified	0
HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios	Jun 11, 2025	Action Recognition, Action Segmentation	Code Available	0
EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models	Jun 2, 2025	Action Recognition, Action Segmentation	Code Available	1
M2R2: MulitModal Robotic Representation for Temporal Action Segmentation	Apr 25, 2025	Action Segmentation, Temporal Action Segmentation	Unverified	0
Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos	Apr 7, 2025	Action Segmentation, Representation Learning	Unverified	0
Towards Generalizing Temporal Action Segmentation to Unseen Views	Apr 3, 2025	Action Segmentation, Segmentation	Unverified	0
What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning	Mar 27, 2025	Action Segmentation, counterfactual	Unverified	0
Cost-Sensitive Learning for Long-Tailed Temporal Action Segmentation	Mar 24, 2025	Action Segmentation, Segmentation	Code Available	0
Condensing Action Segmentation Datasets via Generative Network Inversion	Mar 18, 2025	Action Segmentation, Incremental Learning	Unverified	0
End-to-End Action Segmentation Transformer	Mar 8, 2025	Action Segmentation, Data Augmentation	Unverified	0


Datasets: Breakfast, 50 Salads, GTEA, COIN, Assembly101, JIGSAWS, Youtube INRIA Instructional, 50Salads, MPII Cooking 2 Dataset

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AdaFocus (newly extracted I3D-features, LT-Context model)	Average F1	76.2	—	Unverified
2	FACT (efficient hybrid of convolution and transformer model)	Average F1	74.7	—	Unverified
3	ASQuery	Average F1	74.6	—	Unverified
4	BIT	Average F1	73.7	—	Unverified
5	DiffAct	Average F1	73.6	—	Unverified
6	BaFormer	Average F1	72.4	—	Unverified
7	CETNet	Average F1	71.8	—	Unverified
8	SF-TMN(ASFormer)	Average F1	71.6	—	Unverified
9	RF++-SSTDA	Acc	70.8	—	Unverified
10	ASPnet	Average F1	70.6	—	Unverified
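
The table reports "Acc" and "Average F1" without stating the evaluation protocol. The sketch below shows one common way such numbers are computed in the action segmentation literature: frame-wise accuracy, and a segmental F1 score at an IoU threshold that is then averaged over several thresholds. Treating "Average F1" as the mean of F1 at thresholds 0.10, 0.25 and 0.50 is an assumption, as are all function names and the toy label sequences; the leaderboard's exact definition may differ.

```python
def to_segments(labels):
    """Collapse per-frame labels into (label, start, end) runs; end is exclusive."""
    segments, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((labels[start], start, i))
            start = i
    return segments


def frame_accuracy(pred, truth):
    """Percentage of frames whose predicted label equals the ground truth."""
    correct = sum(p == t for p, t in zip(pred, truth))
    return 100.0 * correct / len(truth)


def segmental_f1(pred, truth, iou_threshold):
    """F1 over segments: a predicted segment is a true positive if its best
    same-label ground-truth segment has IoU >= threshold and is still unmatched."""
    pred_segs, true_segs = to_segments(pred), to_segments(truth)
    matched = [False] * len(true_segs)
    tp = fp = 0
    for label, ps, pe in pred_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (tl, ts, te) in enumerate(true_segs):
            if tl != label:
                continue
            inter = max(0, min(pe, te) - max(ps, ts))
            union = max(pe, te) - min(ps, ts)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1
    fn = len(true_segs) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


def average_f1(pred, truth, thresholds=(0.10, 0.25, 0.50)):
    """Mean segmental F1 over the given IoU thresholds (assumed protocol)."""
    return sum(segmental_f1(pred, truth, t) for t in thresholds) / len(thresholds)


# Toy example: the prediction places the first boundary one frame too early.
truth = ["a"] * 4 + ["b"] * 4 + ["a"] * 2
pred = ["a"] * 3 + ["b"] * 5 + ["a"] * 2
print(frame_accuracy(pred, truth), average_f1(pred, truth))
```

Frame-wise accuracy rewards long, correctly labeled segments and barely penalizes over-segmentation, whereas segmental F1 counts each spurious short segment as a false positive. That difference is one reason the two metrics in the table are not directly comparable.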