SOTAVerified

Action Anticipation

Next-action anticipation is defined as observing frames 1, ..., T of a video and predicting the action that begins after a gap of T_a seconds. Note that the anticipated action is a new action, starting after the T_a-second gap, that does not appear in the observed frames. Here, T_a = 1 second.
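The timing convention above can be sketched in a few lines. This is an illustrative helper (the function name, frame rate, and defaults are assumptions, not part of the benchmark definition), mapping the observed window and the anticipation gap T_a to frame indices:

```python
# Hypothetical sketch of the anticipation setup: observe frames 1..T,
# then predict the action that starts T_a seconds after the last
# observed frame. FPS and names are illustrative assumptions.

def anticipation_window(T: int, fps: float, t_a: float = 1.0):
    """Return (last observed frame, first frame of the anticipated action)."""
    gap_frames = int(round(t_a * fps))
    last_observed = T
    action_start = T + gap_frames  # the new action begins after the T_a-second gap
    return last_observed, action_start

# Example: 30 fps video, 60 observed frames, 1-second anticipation gap.
print(anticipation_window(60, 30.0))  # (60, 90)
```

The model never sees frames in the gap or the action itself; it must predict the upcoming action label from the observed window alone.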

Papers

Showing 1–50 of 110 papers

Title | Status | Hype
V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | Code | 7
EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World | Code | 2
EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation | Code | 2
Learning State-Aware Visual Representations from Audible Interactions | Code | 1
Future Transformer for Long-term Action Anticipation | Code | 1
MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Action Anticipation | Code | 1
MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Code | 1
Real-time Online Video Detection with Temporal Smoothing Transformers | Code | 1
Rethinking Learning Approaches for Long-Term Action Anticipation | Code | 1
Pedestrian 3D Bounding Box Prediction | Code | 1
Action Anticipation with Goal Consistency | Code | 1
Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation | Code | 1
AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos? | Code | 1
Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation | Code | 1
What Would You Expect? Anticipating Egocentric Actions with Rolling-Unrolling LSTMs and Modality Attention | Code | 1
Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video | Code | 1
Technical Report: Temporal Aggregate Representations | Code | 1
Intention-Conditioned Long-Term Human Egocentric Action Forecasting | Code | 1
Multimodal Large Models Are Effective Action Anticipators | Code | 1
Pedestrian Action Anticipation using Contextual Feature Fusion in Stacked RNNs | Code | 1
Action Scene Graphs for Long-Form Understanding of Egocentric Videos | Code | 1
Palm: Predicting Actions through Language Models @ Ego4D Long-Term Action Anticipation Challenge 2023 | Code | 1
A Dynamic Spatial-temporal Attention Network for Early Anticipation of Traffic Accidents | Code | 1
Semantically Guided Representation Learning For Action Anticipation | Code | 1
Anticipative Video Transformer | Code | 1
Temporal Aggregate Representations for Long-Range Video Understanding | Code | 1
Rescaling Egocentric Vision | Code | 1
Higher Order Recurrent Space-Time Transformer for Video Action Prediction | Code | 1
Video + CLIP Baseline for Ego4D Long-term Action Anticipation | Code | 1
Video Representation Learning with Visual Tempo Consistency | Code | 1
Enhancing Next Active Object-based Egocentric Action Anticipation with Guided Attention | Code | 0
Encouraging LSTMs to Anticipate Actions Very Early | Code | 0
Text-Derived Knowledge Helps Vision: A Simple Cross-modal Distillation for Video-based Action Anticipation | Code | 0
TransAction: ICL-SJTU Submission to EPIC-Kitchens Action Anticipation Challenge 2021 | Code | 0
Technical Report for Ego4D Long Term Action Anticipation Challenge 2023 | Code | 0
Interaction Region Visual Transformer for Egocentric Action Anticipation | Code | 0
Scaling Egocentric Vision: The EPIC-KITCHENS Dataset | Code | 0
Unified Recurrence Modeling for Video Action Anticipation | Code | 0
RED: Reinforced Encoder-Decoder Networks for Action Anticipation | Code | 0
Predicting the Next Action by Modeling the Abstract Goal | Code | 0
QuIIL at T3 challenge: Towards Automation in Life-Saving Intervention Procedures from First-Person View | Code | 0
Object-centric Video Representation for Long-term Action Anticipation | Code | 0
Hierarchical and Multimodal Data for Daily Activity Understanding | Code | 0
HalluciNet-ing Spatiotemporal Representations Using a 2D-CNN | Code | 0
Assembly101: A Large-Scale Multi-View Video Dataset for Understanding Procedural Activities | Code | 0
Mamba Fusion: Learning Actions Through Questioning | Code | 0
Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video | Code | 0
Action Anticipation from SoccerNet Football Video Broadcasts | Code | 0
Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos | Code | 0
From Recognition to Prediction: Leveraging Sequence Reasoning for Action Anticipation | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | PlausiVL | Recall@5 | 27.6 | - | Unverified
2 | InAViT | Recall@5 | 25.89 | - | Unverified
3 | UADT | Recall@5 | 23 | - | Unverified
4 | S-GEAR | Recall@5 | 19.9 | - | Unverified
5 | AFFT | Recall@5 | 18.5 | - | Unverified
6 | MeMViT-24 | Recall@5 | 17.7 | - | Unverified
7 | AVT+ | Recall@5 | 15.9 | - | Unverified
8 | TempAgg | Recall@5 | 14.73 | - | Unverified
9 | RU-LSTM | Recall@5 | 13.94 | - | Unverified
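For reference, "Recall@5" on anticipation leaderboards like these is commonly computed as class-mean top-5 recall. The sketch below is an assumption about that convention, not a definition taken from this page:

```python
# Hedged sketch: "Recall@5" as class-mean top-5 recall. The function
# name and this exact formulation are illustrative assumptions.
import numpy as np

def class_mean_top5_recall(scores: np.ndarray, labels: np.ndarray) -> float:
    """Average, over classes, of the fraction of that class's samples
    whose true label appears among the 5 highest-scoring predictions."""
    top5 = np.argsort(scores, axis=1)[:, -5:]        # top-5 class indices per sample
    hit = (top5 == labels[:, None]).any(axis=1)      # per-sample top-5 hit
    per_class = [hit[labels == c].mean() for c in np.unique(labels)]
    return float(np.mean(per_class)) * 100           # reported as a percentage

# Deterministic toy example: 2 samples, 6 classes, true class = 1.
scores = np.array([[0.9, 0.1, 0.2, 0.3, 0.4, 0.5],   # class 1 ranks last  -> miss
                   [0.1, 0.9, 0.2, 0.3, 0.4, 0.5]])  # class 1 ranks first -> hit
labels = np.array([1, 1])
print(class_mean_top5_recall(scores, labels))  # 50.0
```

Averaging per class (rather than per sample) prevents frequent action classes from dominating the score, which matters on long-tailed action vocabularies.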
# | Model | Metric | Claimed | Verified | Status
1 | InAViT | Recall@5 | 23.75 | - | Unverified
2 | AVT++ | Recall@5 | 16.7 | - | Unverified
3 | AFFT | Recall@5 | 14.9 | - | Unverified
4 | Abstract Goal | Recall@5 | 14.29 | - | Unverified
5 | AVT+ | Recall@5 | 12.6 | - | Unverified
6 | TempAgg | Recall@5 | 12.6 | - | Unverified
7 | RULSTM | Recall@5 | 11.2 | - | Unverified
8 | TBN | Recall@5 | 11 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | Abstract Goal | Top-1 Accuracy - Act. | 22.03 | - | Unverified
2 | AVT+ | Top-1 Accuracy - Act. | 16.84 | - | Unverified
3 | ImagineRNN | Top-1 Accuracy - Act. | 14.66 | - | Unverified
4 | RULSTM [24, 23] | Top-1 Accuracy - Act. | 14.39 | - | Unverified
5 | ED | Top-1 Accuracy - Act. | 8.08 | - | Unverified
6 | ATSN | Top-1 Accuracy - Act. | 6 | - | Unverified
7 | 2SCNN | Top-1 Accuracy - Act. | 4.32 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | Abstract Goal | Top-1 Accuracy - Act. | 13.28 | - | Unverified
2 | AVT+ | Top-1 Accuracy - Act. | 10.41 | - | Unverified
3 | ImagineRNN | Top-1 Accuracy - Act. | 9.25 | - | Unverified
4 | RULSTM [24, 23] | Top-1 Accuracy - Act. | 8.16 | - | Unverified
5 | ED | Top-1 Accuracy - Act. | 2.65 | - | Unverified
6 | ATSN | Top-1 Accuracy - Act. | 2.39 | - | Unverified
7 | 2SCNN | Top-1 Accuracy - Act. | 2.29 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | UADT | Top-1 Accuracy | 68.4 | - | Unverified
2 | InAViT | Top-1 Accuracy | 67.8 | - | Unverified
3 | Abstract Goal | Top-1 Accuracy | 49.8 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | Goal Consistency | Verbs Recall@5 | 60.04 | - | Unverified
2 | TempAgg | Verbs Recall@5 | 59.11 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | Action anticipation baseline (co-training, with gaze) | Accuracy | 45.45 | - | Unverified
2 | Action anticipation baseline (co-training, no gaze) | Accuracy | 38.7 | - | Unverified
# | Model | Metric | Claimed | Verified | Status
1 | UADT | Top-1 Accuracy | 62.7 | - | Unverified