Action Segmentation

Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, Action Segmentation aims to split a temporally untrimmed video into segments along the time axis and to label each segment with one of a set of pre-defined action labels. The results of Action Segmentation can then be used as input to various applications, such as video-to-text and action localization.

Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
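
To make the task definition concrete, here is a minimal sketch of how a per-frame labelling of an untrimmed video collapses into labelled temporal segments. The function name `frames_to_segments`, the `Segment` tuple layout, and the example labels are hypothetical choices made for illustration; they are not taken from TricorNet or from any paper listed on this page.

```python
from itertools import groupby
from typing import List, Tuple

# Hypothetical segment layout: (label, start_frame, end_frame_exclusive).
Segment = Tuple[str, int, int]


def frames_to_segments(frame_labels: List[str]) -> List[Segment]:
    """Collapse runs of identical per-frame labels into labelled segments."""
    segments: List[Segment] = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Made-up per-frame labels for a 10-frame video (illustrative only).
frame_labels = ["background"] * 2 + ["cut_tomato"] * 5 + ["place_in_bowl"] * 3
segments = frames_to_segments(frame_labels)
for label, start, end in segments:
    print(f"{label}: frames {start} to {end - 1}")
```

A segmentation model typically predicts one label per frame; a grouping step like this one turns that output into the segment list that segment-level metrics and downstream applications consume.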

Papers

Showing 151–200 of 219 papers

Title	Date	Tasks	Status
SFGANS Self-supervised Future Generator for human ActioN Segmentation	Dec 31, 2023	Action Segmentation, Segmentation	Unverified
SF-TMN: SlowFast Temporal Modeling Network for Surgical Phase Recognition	Jun 15, 2023	Action Segmentation, Surgical phase recognition	Unverified
ViSTec: Video Modeling for Sports Technique Recognition and Tactical Analysis	Feb 25, 2024	Action Segmentation, Inductive Bias	Unverified
Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition	Nov 28, 2023	Action Classification, Action Recognition	Unverified
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding	May 20, 2021	Action Segmentation, Language Modeling	Unverified
Watch-Bot: Unsupervised Learning for Reminding Humans of Forgotten Actions	Dec 14, 2015	Action Segmentation, Object	Unverified
Stacked Spatio-Temporal Graph Convolutional Networks for Action Segmentation	Nov 26, 2018	Action Recognition, Action Segmentation	Unverified
Stitch Contrast and Segment: Learning a Human Action Segmentation Model Using Trimmed Skeleton Videos	Dec 19, 2024	Action Classification, Action Localization	Unverified
Actor and Action Modular Network for Text-based Video Segmentation	Nov 2, 2020	Action Segmentation, Action Understanding	Unverified
Surgical Phase Recognition in Laparoscopic Cholecystectomy	Jun 14, 2022	Action Segmentation, Surgical phase recognition	Unverified
Watch-n-Patch: Unsupervised Learning of Actions and Relations	Mar 11, 2016	Action Segmentation, Clustering	Unverified
TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment	Aug 23, 2021	Action Segmentation, Contrastive Learning	Unverified
TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering	Mar 9, 2023	Action Segmentation, Clustering	Unverified
Temporal2Seq: A Unified Framework for Temporal Video Understanding Tasks	Sep 27, 2024	Action Detection, Action Segmentation	Unverified
Actor-Action Semantic Segmentation with Region Masks	Jul 23, 2018	Action Segmentation, Instance Segmentation	Unverified
Action Understanding with Multiple Classes of Actors	Apr 27, 2017	Action Recognition, Action Segmentation	Unverified
Temporal Action Segmentation with High-level Complex Activity Labels	Aug 15, 2021	Action Recognition, Action Segmentation	Unverified
Action Shuffle Alternating Learning for Unsupervised Action Segmentation	Apr 5, 2021	Action Segmentation, Segmentation	Unverified
Temporal Context Consistency Above All: Enhancing Long-Term Anticipation by Learning and Enforcing Temporal Constraints	Dec 27, 2024	Action Anticipation, Action Segmentation	Unverified
Watch-n-Patch: Unsupervised Understanding of Actions and Relations	Jun 1, 2015	Action Segmentation, Unsupervised Action Segmentation	Unverified
Action Segmentation with Mixed Temporal Domain Adaptation	Apr 15, 2021	Action Segmentation, Domain Adaptation	Unverified
Temporal Deformable Residual Networks for Action Segmentation in Videos	Jun 1, 2018	Action Segmentation, Segmentation	Unverified
Weakly-Supervised Online Action Segmentation in Multi-View Instructional Videos	Mar 24, 2022	Action Segmentation, Segmentation	Unverified
Action Segmentation Using 2D Skeleton Heatmaps and Multi-Modality Fusion	Sep 12, 2023	Action Segmentation, Activity Recognition	Unverified
Action parsing using context features	May 20, 2022	Action Parsing, Action Segmentation	Unverified
Action in Mind: A Neural Network Approach to Action Recognition and Segmentation	Apr 30, 2021	Action Recognition, Action Segmentation	Unverified
Temporal Segment Transformer for Action Segmentation	Feb 25, 2023	Action Segmentation, Denoising	Unverified
Therbligs in Action: Video Understanding through Motion Primitives	Apr 6, 2023	Action Anticipation, Action Recognition	Unverified
TimeLogic: A Temporal Logic Benchmark for Video QA	Jan 13, 2025	2k, Action Segmentation	Unverified
Weakly Supervised Actor-Action Segmentation via Robust Multi-Task Ranking	Jul 1, 2017	Action Classification, Action Segmentation	Unverified
Timestamp-Supervised Action Segmentation with Graph Convolutional Networks	Jun 30, 2022	Action Segmentation, Segmentation	Unverified
Towards Generalizing Temporal Action Segmentation to Unseen Views	Apr 3, 2025	Action Segmentation, Segmentation	Unverified
ActFusion: a Unified Diffusion Model for Action Segmentation and Anticipation	Dec 5, 2024	Action Anticipation, Action Segmentation	Unverified
Transformers in Action: Weakly Supervised Action Segmentation	Jan 14, 2022	Action Segmentation	Unverified
Weakly-Supervised Action Segmentation and Unseen Error Detection in Anomalous Instructional Videos	Jan 1, 2023	Action Segmentation, Segmentation	Unverified
TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation	May 22, 2017	Action Segmentation, Decoder	Unverified
Turning to a Teacher for Timestamp Supervised Temporal Action Segmentation	Jul 2, 2022	Action Segmentation, Model Optimization	Unverified
Understanding Multi-Task Activities from Single-Task Videos	Jan 1, 2025	Action Segmentation, Segmentation	Unverified
Understanding Spatio-Temporal Relations in Human-Object Interaction using Pyramid Graph Convolutional Network	Oct 10, 2024	Action Recognition, Action Segmentation	Unverified
Uni4D: A Unified Self-Supervised Learning Framework for Point Cloud Videos	Apr 7, 2025	Action Segmentation, Representation Learning	Unverified
Enhancing Transformer Backbone for Egocentric Video Action Segmentation	May 19, 2023	Action Segmentation, Decoder	Unverified
Error Detection in Egocentric Procedural Task Videos	Jan 1, 2024	Action Segmentation, Active Object Detection	Unverified
Exploring Temporally Dynamic Data Augmentation for Video Recognition	Jun 30, 2022	Action Localization, Action Segmentation	Unverified
End-to-End Fine-Grained Action Segmentation and Recognition Using Conditional Random Field Models and Discriminative Sparse Coding	Jan 29, 2018	Action Classification, Action Segmentation	Unverified
Fast and Unsupervised Action Boundary Detection for Action Segmentation	Jan 1, 2022	Action Segmentation, Boundary Detection	Unverified
End-to-End Action Segmentation Transformer	Mar 8, 2025	Action Segmentation, Data Augmentation	Unverified
FIFA: Fast Inference Approximation for Action Segmentation	Aug 9, 2021	Action Segmentation, Segmentation	Unverified
Fine-grained Action Segmentation using the Semi-Supervised Action GAN	Sep 20, 2019	Action Classification, Action Segmentation	Unverified
Fine-Grained Semantic Segmentation of Motion Capture Data using Dilated Temporal Fully-Convolutional Networks	Mar 2, 2019	Action Segmentation, Image Segmentation	Unverified
Friends Across Time: Multi-Scale Action Segmentation Transformer for Surgical Phase Recognition	Jan 22, 2024	Action Segmentation, Offline surgical phase recognition	Unverified

Datasets: Breakfast, 50 Salads, GTEA, COIN, Assembly101, JIGSAWS, Youtube INRIA Instructional, 50Salads, MPII Cooking 2 Dataset

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AdaFocus (newly extracted I3D-features, LT-Context model)	Average F1	76.2	—	Unverified
2	FACT (efficient hybrid of convolution and transformer model)	Average F1	74.7	—	Unverified
3	ASQuery	Average F1	74.6	—	Unverified
4	BIT	Average F1	73.7	—	Unverified
5	DiffAct	Average F1	73.6	—	Unverified
6	BaFormer	Average F1	72.4	—	Unverified
7	CETNet	Average F1	71.8	—	Unverified
8	SF-TMN(ASFormer)	Average F1	71.6	—	Unverified
9	RF++-SSTDA	Acc	70.8	—	Unverified
10	ASPnet	Average F1	70.6	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Br-Prompt+ASPnet (RGB, flow, accelerometer)	F1@50%	88.5	—	Unverified
2	Semantic2Graph	F1@50%	87.3	—	Unverified
3	BaFormer	F1@50%	83.9	—	Unverified
4	DiffAct	F1@50%	83.7	—	Unverified
5	SF-TMN(ASFormer)	F1@50%	82.9	—	Unverified
6	LTContext	F1@50%	82	—	Unverified
7	UVAST	F1@50%	81.7	—	Unverified
8	Br-Prompt+ASFormer	F1@50%	81.3	—	Unverified
9	EUT	F1@50%	81	—	Unverified
10	CETNet	F1@50%	80.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Semantic2Graph	F1@50%	91.3	—	Unverified
2	FACT	F1@50%	87.5	—	Unverified
3	DiffAct	F1@50%	84.7	—	Unverified
4	BaFormer	F1@50%	83.5	—	Unverified
5	SF-TMN(ASFormer)	F1@50%	83.1	—	Unverified
6	Br-Prompt+ASFormer	F1@50%	83	—	Unverified
7	DPRN	F1@50%	82.9	—	Unverified
8	BIT	F1@50%	82.6	—	Unverified
9	CETNet	F1@50%	81.3	—	Unverified
10	UVAST	F1@50%	81	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	UnLoc-L	Frame accuracy	72.8	—	Unverified
2	Univl	Frame accuracy	70	—	Unverified
3	Norton	Frame accuracy	69.8	—	Unverified
4	VideoClip	Frame accuracy	68.7	—	Unverified
5	TACo	Frame accuracy	68.4	—	Unverified
6	VLM	Frame accuracy	68.4	—	Unverified
7	MIL-NCE	Frame accuracy	61	—	Unverified
8	ActBERT	Frame accuracy	57	—	Unverified
9	CBT	Frame accuracy	53.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ASQuery	F1@10%	37.8	—	Unverified
2	LTContext	F1@10%	33.9	—	Unverified
3	ASFormer	F1@10%	33.4	—	Unverified
4	C2F-TCN	F1@10%	33.3	—	Unverified
5	UVAST	F1@10%	32.1	—	Unverified
6	MS-TCN++	F1@10%	31.6	—	Unverified
7	ProTAS(Offline)	F1@10%	28.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RL+Tree	Edit Distance	88.53	—	Unverified
2	RL (full)	Edit Distance	87.96	—	Unverified
3	TricorNet	Edit Distance	86.8	—	Unverified
4	SDL+SC-CRF	Edit Distance	86.21	—	Unverified
5	TCN	Edit Distance	83.1	—	Unverified
6	ST-CNN+Seg	Edit Distance	66.56	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TSA (FINCH)	Acc	62.4	—	Unverified
2	TSA (Kmeans)	Acc	59.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	EUT	Acc	87.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Unsup. TW-FINCH (K=avg/activity)	Accuracy	42	—	Unverified