Action Segmentation

Action Segmentation is a challenging problem in high-level video understanding. In its simplest form, it aims to divide a temporally untrimmed video into segments along the time axis and to label each segment with one of a set of pre-defined action labels. The results can then serve as input to downstream applications such as video-to-text and action localization.

Source: TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation
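
As a small illustration of the task definition above, the sketch below collapses a per-frame label sequence into labeled temporal segments. The function name `frames_to_segments` and the labels ("pour", "stir", "background") are hypothetical and not taken from any dataset or paper listed on this page.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) segments.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = sum(1 for _ in run)  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a 7-frame video.
frame_labels = ["pour", "pour", "pour", "stir", "stir", "background", "pour"]
segments = frames_to_segments(frame_labels)
for label, start, end in segments:
    print(f"{label}: frames {start}..{end - 1}")
```

A segmentation model typically produces the per-frame labels; the segment view is what segment-level metrics such as F1@k and the edit score operate on.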

Papers

Showing 51–100 of 219 papers

Title	Date	Tasks	Status	Hype
Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos	Jan 1, 2024	Action Segmentation, Segmentation	Code Available	1
FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Action Segmentation	Jan 1, 2024	Action Segmentation, Segmentation	Code Available	2
Error Detection in Egocentric Procedural Task Videos	Jan 1, 2024	Action Segmentation, Active Object Detection	Unverified	0
SFGANS Self-supervised Future Generator for human ActioN Segmentation	Dec 31, 2023	Action Segmentation, Segmentation	Unverified	0
SMC-NCA: Semantic-guided Multi-level Contrast for Semi-supervised Temporal Action Segmentation	Dec 19, 2023	Action Segmentation, Contrastive Learning	Code Available	0
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-modal Knowledge Transfer	Dec 12, 2023	Action Recognition, Action Segmentation	Code Available	0
A Decoupled Spatio-Temporal Framework for Skeleton-based Action Segmentation	Dec 10, 2023	Action Segmentation, Skeleton Based Action Segmentation	Code Available	1
Activity Grammars for Temporal Action Segmentation	Dec 7, 2023	Action Segmentation, Segmentation	Code Available	1
Synchronization is All You Need: Exocentric-to-Egocentric Transfer for Temporal Action Segmentation with Unlabeled Synchronized Video Pairs	Dec 5, 2023	Action Segmentation, All	Code Available	0
SigFormer: Sparse Signal-Guided Transformer for Multi-Modal Human Action Segmentation	Nov 29, 2023	Action Segmentation, Optical Flow Estimation	Code Available	0
Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition	Nov 28, 2023	Action Classification, Action Recognition	Unverified	0
CASR: Refining Action Segmentation via Marginalizing Frame-level Causal Relationships	Nov 21, 2023	Action Segmentation, Causal Discovery	Unverified	0
Is Weakly-supervised Action Segmentation Ready For Human-Robot Interaction? No, Let's Improve It With Action-union Learning	Oct 22, 2023	Action Recognition, Action Segmentation	Code Available	2
NSM4D: Neural Scene Model Based Online 4D Point Cloud Sequence Understanding	Oct 12, 2023	Action Segmentation, Autonomous Driving	Unverified	0
End-to-End Streaming Video Temporal Action Segmentation with Reinforce Learning	Sep 27, 2023	Action Recognition, Action Segmentation	Code Available	1
Action Segmentation Using 2D Skeleton Heatmaps and Multi-Modality Fusion	Sep 12, 2023	Action Segmentation, Activity Recognition	Unverified	0
OTAS: Unsupervised Boundary Detection for Object-Centric Temporal Action Segmentation	Sep 12, 2023	Action Segmentation, Boundary Detection	Code Available	0
Prompt-enhanced Hierarchical Transformer Elevating Cardiopulmonary Resuscitation Instruction via Temporal Action Segmentation	Aug 31, 2023	Action Segmentation, Segmentation	Unverified	0
BIT: Bi-Level Temporal Modeling for Efficient Supervised Action Segmentation	Aug 28, 2023	Action Segmentation, Segmentation	Unverified	0
LAC: Latent Action Composition for Skeleton-based Action Segmentation	Aug 28, 2023	Action Segmentation, Contrastive Learning	Unverified	0
How Much Temporal Long-Term Context is Needed for Action Segmentation?	Aug 22, 2023	Action Segmentation, Segmentation	Code Available	1
UnLoc: A Unified Framework for Video Localization Tasks	Aug 21, 2023	Action Segmentation, Moment Retrieval	Unverified	0
DPMix: Mixture of Depth and Point Cloud Video Experts for 4D Action Segmentation	Jul 31, 2023	Action Segmentation, Human-Object Interaction Detection	Unverified	0
HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding	Jul 9, 2023	Action Recognition, Action Segmentation	Code Available	0
SF-TMN: SlowFast Temporal Modeling Network for Surgical Phase Recognition	Jun 15, 2023	Action Segmentation, Surgical phase recognition	Unverified	0
Permutation-Aware Action Segmentation via Unsupervised Frame-to-Segment Alignment	May 31, 2023	Action Segmentation, Decoder	Code Available	1
Enhancing Transformer Backbone for Egocentric Video Action Segmentation	May 19, 2023	Action Segmentation, Decoder	Unverified	0
Pretrained Language Models as Visual Planners for Human Assistance	Apr 17, 2023	Action Segmentation, Language Modelling	Code Available	1
Leveraging triplet loss for unsupervised action segmentation	Apr 13, 2023	Action Segmentation, Clustering	Code Available	1
MED-VT++: Unifying Multimodal Learning with a Multiscale Encoder-Decoder Video Transformer	Apr 12, 2023	Action Segmentation, Decoder	Unverified	0
Therbligs in Action: Video Understanding through Motion Primitives	Apr 6, 2023	Action Anticipation, Action Recognition	Unverified	0
DIR-AS: Decoupling Individual Identification and Temporal Reasoning for Action Segmentation	Apr 4, 2023	Action Recognition, Action Segmentation	Unverified	0
Diffusion Action Segmentation	Mar 31, 2023	Action Segmentation, Denoising	Code Available	1
MS-TCRNet: Multi-Stage Temporal Convolutional Recurrent Networks for Action Segmentation Using Sensor-Augmented Kinematics	Mar 14, 2023	Action Segmentation, Data Augmentation	Code Available	0
TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering	Mar 9, 2023	Action Segmentation, Clustering	Unverified	0
Temporal Segment Transformer for Action Segmentation	Feb 25, 2023	Action Segmentation, Denoising	Unverified	0
Video Action Segmentation via Contextually Refined Temporal Keypoints	Jan 1, 2023	Action Segmentation, Graph Matching	Unverified	0
LAC - Latent Action Composition for Skeleton-based Action Segmentation	Jan 1, 2023	Action Segmentation, Contrastive Learning	Unverified	0
Weakly-Supervised Action Segmentation and Unseen Error Detection in Anomalous Instructional Videos	Jan 1, 2023	Action Segmentation, Segmentation	Unverified	0
Markov Game Video Augmentation for Action Segmentation	Jan 1, 2023	Action Segmentation, Data Augmentation	Unverified	0
Reducing the Label Bias for Timestamp Supervised Temporal Action Segmentation	Jan 1, 2023	Action Segmentation, Temporal Action Segmentation	Unverified	0
ASPnet: Action Segmentation With Shared-Private Representation of Multiple Data Sources	Jan 1, 2023	Action Segmentation, Disentanglement	Unverified	0
Timestamp-Supervised Action Segmentation from the Perspective of Clustering	Dec 22, 2022	Action Segmentation, Clustering	Code Available	0
C2F-TCN: A Framework for Semi and Fully Supervised Temporal Action Segmentation	Dec 20, 2022	Action Segmentation, Decoder	Unverified	0
Hand Guided High Resolution Feature Enhancement for Fine-Grained Atomic Action Segmentation within Complex Human Assemblies	Nov 24, 2022	Action Classification, Action Recognition	Unverified	0
Distill and Collect for Semi-Supervised Temporal Action Segmentation	Nov 2, 2022	Action Segmentation, Segmentation	Unverified	0
Temporal Action Segmentation: An Analysis of Modern Techniques	Oct 19, 2022	Action Segmentation, Segmentation	Code Available	2
Robust Action Segmentation from Timestamp Supervision	Oct 12, 2022	Action Segmentation, Segmentation	Unverified	0
Streaming Video Temporal Action Segmentation In Real Time	Sep 28, 2022	Action Segmentation, Language Modelling	Code Available	1
Semantic2Graph: Graph-based Multi-modal Feature Fusion for Action Segmentation in Videos	Sep 13, 2022	Action Segmentation, Graph Neural Network	Unverified	0

Datasets: Breakfast, 50 Salads, GTEA, COIN, Assembly101, JIGSAWS, Youtube INRIA Instructional, 50Salads, MPII Cooking 2 Dataset

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AdaFocus (newly extracted I3D-features, LT-Context model)	Average F1	76.2	—	Unverified
2	FACT (efficient hybrid of convolution and transformer model)	Average F1	74.7	—	Unverified
3	ASQuery	Average F1	74.6	—	Unverified
4	BIT	Average F1	73.7	—	Unverified
5	DiffAct	Average F1	73.6	—	Unverified
6	BaFormer	Average F1	72.4	—	Unverified
7	CETNet	Average F1	71.8	—	Unverified
8	SF-TMN(ASFormer)	Average F1	71.6	—	Unverified
9	RF++-SSTDA	Acc	70.8	—	Unverified
10	ASPnet	Average F1	70.6	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Br-Prompt+ASPnet (RGB, flow, accelerometer)	F1@50%	88.5	—	Unverified
2	Semantic2Graph	F1@50%	87.3	—	Unverified
3	BaFormer	F1@50%	83.9	—	Unverified
4	DiffAct	F1@50%	83.7	—	Unverified
5	SF-TMN(ASFormer)	F1@50%	82.9	—	Unverified
6	LTContext	F1@50%	82	—	Unverified
7	UVAST	F1@50%	81.7	—	Unverified
8	Br-Prompt+ASFormer	F1@50%	81.3	—	Unverified
9	EUT	F1@50%	81	—	Unverified
10	CETNet	F1@50%	80.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Semantic2Graph	F1@50%	91.3	—	Unverified
2	FACT	F1@50%	87.5	—	Unverified
3	DiffAct	F1@50%	84.7	—	Unverified
4	BaFormer	F1@50%	83.5	—	Unverified
5	SF-TMN(ASFormer)	F1@50%	83.1	—	Unverified
6	Br-Prompt+ASFormer	F1@50%	83	—	Unverified
7	DPRN	F1@50%	82.9	—	Unverified
8	BIT	F1@50%	82.6	—	Unverified
9	CETNet	F1@50%	81.3	—	Unverified
10	UVAST	F1@50%	81	—	Unverified
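
The F1@50% and F1@10% columns in these tables are segment-level scores. A minimal sketch of how such a score is commonly computed follows, assuming the usual segmental F1@k convention: a predicted segment counts as a true positive when its IoU with a not-yet-matched ground-truth segment of the same label reaches the threshold. The helper names and toy labels are hypothetical, and individual papers may differ in details such as background handling.

```python
def to_segments(frame_labels):
    """Group consecutive identical frame labels into [label, start, end) segments."""
    segments = []
    for t, label in enumerate(frame_labels):
        if segments and segments[-1][0] == label:
            segments[-1][2] = t + 1  # extend the current segment
        else:
            segments.append([label, t, t + 1])
    return segments


def segmental_f1(pred_frames, true_frames, iou_threshold=0.5):
    """Segmental F1 (in percent) at a given IoU threshold."""
    pred = to_segments(pred_frames)
    true = to_segments(true_frames)
    matched = [False] * len(true)
    tp = fp = 0
    for label, p_start, p_end in pred:
        best_iou, best_idx = 0.0, -1
        for i, (t_label, t_start, t_end) in enumerate(true):
            if t_label != label:
                continue
            inter = min(p_end, t_end) - max(p_start, t_start)
            union = max(p_end, t_end) - min(p_start, t_start)
            iou = max(inter, 0) / union
            if iou > best_iou:
                best_iou, best_idx = iou, i
        if best_idx >= 0 and best_iou >= iou_threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True  # each ground-truth segment matches at most once
        else:
            fp += 1
    fn = matched.count(False)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: the prediction contains two spurious short segments.
true_frames = ["A", "A", "A", "A", "B", "B", "B", "B"]
pred_frames = ["A", "B", "A", "A", "B", "B", "B", "B"]
f1_at_50 = segmental_f1(pred_frames, true_frames, iou_threshold=0.5)
print(round(f1_at_50, 2))
```

In this toy case only one of eight frames is wrong, yet the two spurious segments count as false positives, which is why segmental F1 penalizes over-segmentation more strongly than frame-level accuracy does.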

#	Model	Metric	Claimed	Verified	Status
1	UnLoc-L	Frame accuracy	72.8	—	Unverified
2	Univl	Frame accuracy	70	—	Unverified
3	Norton	Frame accuracy	69.8	—	Unverified
4	VideoClip	Frame accuracy	68.7	—	Unverified
5	TACo	Frame accuracy	68.4	—	Unverified
6	VLM	Frame accuracy	68.4	—	Unverified
7	MIL-NCE	Frame accuracy	61	—	Unverified
8	ActBERT	Frame accuracy	57	—	Unverified
9	CBT	Frame accuracy	53.9	—	Unverified
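
Frame accuracy, the metric in the table above, is usually the percentage of frames whose predicted label equals the ground-truth label. The sketch below assumes that plain definition; the function name and toy labels are hypothetical, and some benchmarks apply extra conventions (for example, excluding background frames) that are not modeled here.

```python
def frame_accuracy(pred_frames, true_frames):
    """Percentage of frames whose predicted label matches the ground truth."""
    if len(pred_frames) != len(true_frames):
        raise ValueError("prediction and ground truth must have the same length")
    if not true_frames:
        return 0.0
    correct = sum(p == t for p, t in zip(pred_frames, true_frames))
    return 100.0 * correct / len(true_frames)


# Toy example: one of eight frames is mislabeled.
true_frames = ["A", "A", "A", "A", "B", "B", "B", "B"]
pred_frames = ["A", "B", "A", "A", "B", "B", "B", "B"]
accuracy = frame_accuracy(pred_frames, true_frames)
print(accuracy)
```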

#	Model	Metric	Claimed	Verified	Status
1	ASQuery	F1@10%	37.8	—	Unverified
2	LTContext	F1@10%	33.9	—	Unverified
3	ASFormer	F1@10%	33.4	—	Unverified
4	C2F-TCN	F1@10%	33.3	—	Unverified
5	UVAST	F1@10%	32.1	—	Unverified
6	MS-TCN++	F1@10%	31.6	—	Unverified
7	ProTAS(Offline)	F1@10%	28.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RL+Tree	Edit Distance	88.53	—	Unverified
2	RL (full)	Edit Distance	87.96	—	Unverified
3	TricorNet	Edit Distance	86.8	—	Unverified
4	SDL+SC-CRF	Edit Distance	86.21	—	Unverified
5	TCN	Edit Distance	83.1	—	Unverified
6	ST-CNN+Seg	Edit Distance	66.56	—	Unverified
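
The "Edit Distance" values above are on a 0 to 100 scale where higher is better, which is consistent with the normalized edit score commonly used for action segmentation. The sketch below assumes that convention: the Levenshtein distance between the predicted and ground-truth segment label sequences, normalized by the longer sequence and reported as a percentage. Function names and toy labels are hypothetical.

```python
def label_sequence(frame_labels):
    """Collapse per-frame labels into the ordered sequence of segment labels."""
    sequence = []
    for label in frame_labels:
        if not sequence or sequence[-1] != label:
            sequence.append(label)
    return sequence


def levenshtein(a, b):
    """Classic edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def edit_score(pred_frames, true_frames):
    """Normalized segmental edit score in percent (100 means identical ordering)."""
    pred_seq = label_sequence(pred_frames)
    true_seq = label_sequence(true_frames)
    longest = max(len(pred_seq), len(true_seq))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(pred_seq, true_seq) / longest)


# Toy example: the prediction inserts two extra segments.
true_frames = ["A", "A", "A", "A", "B", "B", "B", "B"]
pred_frames = ["A", "B", "A", "A", "B", "B", "B", "B"]
score = edit_score(pred_frames, true_frames)
print(score)
```

Because the score looks only at the order of segment labels and ignores their durations, it isolates ordering and over-segmentation errors from boundary-timing errors.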

#	Model	Metric	Claimed	Verified	Status
1	TSA (FINCH)	Acc	62.4	—	Unverified
2	TSA (Kmeans)	Acc	59.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	EUT	Acc	87.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Unsup. TW-FINCH (K=avg/activity)	Accuracy	42	—	Unverified