Action Segmentation

Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, Action Segmentation aims to split a temporally untrimmed video into segments along the time axis and to label each segment with one of a set of predefined action labels. The results of Action Segmentation can then serve as input to other applications, such as video-to-text and action localization.

Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
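
As a minimal sketch of the task's output format, the snippet below collapses a per-frame label sequence into labeled temporal segments. The function name `frames_to_segments`, the label strings, and the half-open `(label, start, end)` convention are illustrative assumptions and are not taken from any of the listed papers.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) runs.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    This convention is an assumption made for this sketch.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a 7-frame clip.
frames = ["pour", "pour", "pour", "stir", "stir", "background", "pour"]
print(frames_to_segments(frames))
# [('pour', 0, 3), ('stir', 3, 5), ('background', 5, 6), ('pour', 6, 7)]
```

Many models predict one label per frame, so a conversion of this kind is a common step before segment-level evaluation.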

Papers

Showing 51–75 of 219 papers

Title	Date	Tasks	Status	Hype
Alleviating Over-segmentation Errors by Detecting Action Boundaries	Jul 14, 2020	Action Classification, Action Segmentation	Code Available	1
Automated freezing of gait assessment with marker-based motion capture and multi-stage spatial-temporal graph convolutional neural networks	Mar 29, 2021	Action Segmentation, Segmentation	Code Available	1
LOGO: A Long-Form Video Dataset for Group Action Quality Assessment	Apr 7, 2024	Action Quality Assessment, Action Segmentation	Code Available	1
Refining Action Segmentation With Hierarchical Video Representations	Jan 1, 2021	Action Segmentation, Segmentation	Code Available	1
Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation	Mar 20, 2021	Action Segmentation, Clustering	Code Available	1
EPFL-Smart-Kitchen-30: Densely annotated cooking dataset with 3D kinematics to challenge video and language models	Jun 2, 2025	Action Recognition, Action Segmentation	Code Available	1
An End-to-end 3D Convolutional Neural Network for Action Detection and Segmentation in Videos	Nov 30, 2017	Action Detection, Action Segmentation	Unverified	0
Action Understanding with Multiple Classes of Actors	Apr 27, 2017	Action Recognition, Action Segmentation	Unverified	0
An Efficient Framework for Few-shot Skeleton-based Temporal Action Segmentation	Jul 20, 2022	Action Segmentation, Data Augmentation	Unverified	0
ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation	Dec 5, 2024	Action Anticipation, Action Segmentation	Unverified	0
DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation	Jul 31, 2023	Action Segmentation, Human-Object Interaction Detection	Unverified	0
Distill and Collect for Semi-Supervised Temporal Action Segmentation	Nov 2, 2022	Action Segmentation, Segmentation	Unverified	0
An Efficient 3D CNN for Action/Object Segmentation in Video	Jul 21, 2019	Action Segmentation, Decoder	Unverified	0
DIR-AS: Decoupling Individual Identification and Temporal Reasoning for Action Segmentation	Apr 4, 2023	Action Recognition, Action Segmentation	Unverified	0
Dilated Temporal Fully-Convolutional Network for Semantic Segmentation of Motion Capture Data	Jun 24, 2018	Action Segmentation, Motion Synthesis	Unverified	0
Anchor-Constrained Viterbi for Set-Supervised Action Segmentation	Apr 5, 2021	Action Segmentation, Segmentation	Unverified	0
Action Shuffle Alternating Learning for Unsupervised Action Segmentation	Apr 5, 2021	Action Segmentation, Segmentation	Unverified	0
Depthwise Separable Temporal Convolutional Network for Action Segmentation	Jan 19, 2021	Action Segmentation, Decoder	Unverified	0
2by2: Weakly-Supervised Learning for Global Action Segmentation	Dec 17, 2024	Action Segmentation, Weakly-supervised Learning	Unverified	0
Depth Over RGB: Automatic Evaluation of Open Surgery Skills Using Depth Camera	Jan 18, 2024	Action Segmentation, Data Compression	Unverified	0
Grasp Type Revisited: A Modern Perspective on a Classical Feature for Vision	Jun 1, 2015	Action Segmentation, Action Understanding	Unverified	0
Hand Guided High Resolution Feature Enhancement for Fine-Grained Atomic Action Segmentation within Complex Human Assemblies	Nov 24, 2022	Action Classification, Action Recognition	Unverified	0
Coupled Generative Adversarial Network for Continuous Fine-grained Action Segmentation	Sep 20, 2019	Action Segmentation, Generative Adversarial Network	Unverified	0
Friends Across Time: Multi-Scale Action Segmentation Transformer for Surgical Phase Recognition	Jan 22, 2024	Action Segmentation, Offline surgical phase recognition	Unverified	0
A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation	Jun 3, 2019	Action Recognition, Action Segmentation	Unverified	0

Datasets: Breakfast, 50 Salads, GTEA, COIN, Assembly101, JIGSAWS, Youtube INRIA Instructional, 50Salads, MPII Cooking 2 Dataset

Benchmark Results
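
The tables in this section report metrics such as frame accuracy, edit distance, and F1 at an overlap threshold (for example F1@50%). The sketch below shows one common way these are computed from per-frame predictions and ground truth. It assumes the widely used segmental definitions (a normalized Levenshtein score over segment label sequences, and F1 with IoU-based matching of same-label segments); individual leaderboard entries may follow different protocols, and all function names here are hypothetical.

```python
from itertools import groupby


def to_segments(labels):
    """Per-frame labels -> list of (label, start, end), end exclusive."""
    segs, start = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        segs.append((label, start, start + n))
        start += n
    return segs


def frame_accuracy(pred, gt):
    """Percentage of frames whose predicted label equals the ground truth."""
    correct = sum(p == g for p, g in zip(pred, gt))
    return 100.0 * correct / len(gt)


def edit_score(pred, gt):
    """100 * (1 - Levenshtein distance / max length) over segment label sequences."""
    p = [s[0] for s in to_segments(pred)]
    g = [s[0] for s in to_segments(gt)]
    prev = list(range(len(g) + 1))  # distances for the empty prediction prefix
    for i in range(1, len(p) + 1):
        cur = [i] + [0] * len(g)
        for j in range(1, len(g) + 1):
            cost = 0 if p[i - 1] == g[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
        prev = cur
    return 100.0 * (1 - prev[-1] / max(len(p), len(g), 1))


def f1_at_iou(pred, gt, threshold=0.5):
    """Segmental F1: a predicted segment is a true positive if its best
    same-label ground-truth segment has IoU >= threshold and is unmatched."""
    p_segs, g_segs = to_segments(pred), to_segments(gt)
    matched = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            span = max(pe, ge) - min(ps, gs)  # equals the union when segments overlap
            iou = inter / span
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx >= 0 and best_iou >= threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1
    fn = matched.count(False)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: the prediction has one wrong frame that splits a segment.
gt = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
pred = [0, 0, 1, 0, 1, 1, 1, 1, 2, 2]
print(frame_accuracy(pred, gt), edit_score(pred, gt), f1_at_iou(pred, gt, 0.5))
```

In this toy example a single mislabeled frame costs only 10 points of frame accuracy, while the edit score falls to about 60 and F1@50% to about 75. This is why segment-level metrics are usually reported alongside frame accuracy: they penalize over-segmentation that frame accuracy barely registers.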

#	Model	Metric	Claimed	Verified	Status
1	AdaFocus (newly extracted I3D-features, LT-Context model)	Average F1	76.2	—	Unverified
2	FACT (efficient hybrid of convolution and transformer model)	Average F1	74.7	—	Unverified
3	ASQuery	Average F1	74.6	—	Unverified
4	BIT	Average F1	73.7	—	Unverified
5	DiffAct	Average F1	73.6	—	Unverified
6	BaFormer	Average F1	72.4	—	Unverified
7	CETNet	Average F1	71.8	—	Unverified
8	SF-TMN(ASFormer)	Average F1	71.6	—	Unverified
9	RF++-SSTDA	Acc	70.8	—	Unverified
10	ASPnet	Average F1	70.6	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Br-Prompt+ASPnet (RGB, flow, accelerometer)	F1@50%	88.5	—	Unverified
2	Semantic2Graph	F1@50%	87.3	—	Unverified
3	BaFormer	F1@50%	83.9	—	Unverified
4	DiffAct	F1@50%	83.7	—	Unverified
5	SF-TMN(ASFormer)	F1@50%	82.9	—	Unverified
6	LTContext	F1@50%	82	—	Unverified
7	UVAST	F1@50%	81.7	—	Unverified
8	Br-Prompt+ASFormer	F1@50%	81.3	—	Unverified
9	EUT	F1@50%	81	—	Unverified
10	CETNet	F1@50%	80.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Semantic2Graph	F1@50%	91.3	—	Unverified
2	FACT	F1@50%	87.5	—	Unverified
3	DiffAct	F1@50%	84.7	—	Unverified
4	BaFormer	F1@50%	83.5	—	Unverified
5	SF-TMN(ASFormer)	F1@50%	83.1	—	Unverified
6	Br-Prompt+ASFormer	F1@50%	83	—	Unverified
7	DPRN	F1@50%	82.9	—	Unverified
8	BIT	F1@50%	82.6	—	Unverified
9	CETNet	F1@50%	81.3	—	Unverified
10	UVAST	F1@50%	81	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	UnLoc-L	Frame accuracy	72.8	—	Unverified
2	Univl	Frame accuracy	70	—	Unverified
3	Norton	Frame accuracy	69.8	—	Unverified
4	VideoClip	Frame accuracy	68.7	—	Unverified
5	TACo	Frame accuracy	68.4	—	Unverified
6	VLM	Frame accuracy	68.4	—	Unverified
7	MIL-NCE	Frame accuracy	61	—	Unverified
8	ActBERT	Frame accuracy	57	—	Unverified
9	CBT	Frame accuracy	53.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ASQuery	F1@10%	37.8	—	Unverified
2	LTContext	F1@10%	33.9	—	Unverified
3	ASFormer	F1@10%	33.4	—	Unverified
4	C2F-TCN	F1@10%	33.3	—	Unverified
5	UVAST	F1@10%	32.1	—	Unverified
6	MS-TCN++	F1@10%	31.6	—	Unverified
7	ProTAS(Offline)	F1@10%	28.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RL+Tree	Edit Distance	88.53	—	Unverified
2	RL (full)	Edit Distance	87.96	—	Unverified
3	TricorNet	Edit Distance	86.8	—	Unverified
4	SDL+SC-CRF	Edit Distance	86.21	—	Unverified
5	TCN	Edit Distance	83.1	—	Unverified
6	ST-CNN+Seg	Edit Distance	66.56	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TSA (FINCH)	Acc	62.4	—	Unverified
2	TSA (Kmeans)	Acc	59.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	EUT	Acc	87.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Unsup. TW-FINCH (K=avg/activity)	Accuracy	42	—	Unverified