SOTAVerified

Action Anticipation

Next action anticipation is defined as observing frames 1, ..., T of a video and predicting the action that begins T_a seconds after the last observed frame. Note that the anticipated action is a new action that has not yet started in the observed frames. Here T_a = 1 second.
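The definition above can be sketched in code. This is a minimal illustration of the evaluation setup, not any particular paper's implementation; the helper name, frame rate, and observation-window length are assumptions for the example.

```python
# Sketch of the anticipation setup: the model may only observe frames up to
# T_A seconds before an annotated action starts, then must predict its label.

T_A = 1.0   # anticipation gap in seconds, as used on this benchmark
FPS = 30.0  # assumed frame rate (illustrative)

def observed_frame_range(action_start_s: float, n_obs_frames: int = 64):
    """Return (first, last) frame indices the model may observe for an
    action that starts at `action_start_s` seconds into the video."""
    last = int((action_start_s - T_A) * FPS)   # last frame before the gap
    first = max(0, last - n_obs_frames + 1)    # fixed-length observed window
    return first, last

# e.g. for an action annotated to start at 10.0 s:
first, last = observed_frame_range(10.0)
```

At 30 fps, an action starting at 10.0 s leaves frames up to index 270 observable; the model never sees the gap or the action itself.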

Papers

Showing 1–50 of 110 papers

| Title | Status | Hype |
|---|---|---|
| V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | Code | 7 |
| EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World | Code | 2 |
| EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation | Code | 2 |
| MeMViT: Memory-Augmented Multiscale Vision Transformer for Efficient Long-Term Video Recognition | Code | 1 |
| Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation | Code | 1 |
| Multimodal Large Models Are Effective Action Anticipators | Code | 1 |
| Action Scene Graphs for Long-Form Understanding of Egocentric Videos | Code | 1 |
| Rethinking Learning Approaches for Long-Term Action Anticipation | Code | 1 |
| Technical Report: Temporal Aggregate Representations | Code | 1 |
| What Would You Expect? Anticipating Egocentric Actions with Rolling-Unrolling LSTMs and Modality Attention | Code | 1 |
| Temporal Aggregate Representations for Long-Range Video Understanding | Code | 1 |
| Anticipative Video Transformer | Code | 1 |
| AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos? | Code | 1 |
| Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation | Code | 1 |
| Pedestrian 3D Bounding Box Prediction | Code | 1 |
| Palm: Predicting Actions through Language Models @ Ego4D Long-Term Action Anticipation Challenge 2023 | Code | 1 |
| Real-time Online Video Detection with Temporal Smoothing Transformers | Code | 1 |
| Rescaling Egocentric Vision | Code | 1 |
| Higher Order Recurrent Space-Time Transformer for Video Action Prediction | Code | 1 |
| Video Representation Learning with Visual Tempo Consistency | Code | 1 |
| A Dynamic Spatial-temporal Attention Network for Early Anticipation of Traffic Accidents | Code | 1 |
| Intention-Conditioned Long-Term Human Egocentric Action Forecasting | Code | 1 |
| Pedestrian Action Anticipation using Contextual Feature Fusion in Stacked RNNs | Code | 1 |
| Semantically Guided Representation Learning For Action Anticipation | Code | 1 |
| Learning State-Aware Visual Representations from Audible Interactions | Code | 1 |
| Action Anticipation with Goal Consistency | Code | 1 |
| Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video | Code | 1 |
| Video + CLIP Baseline for Ego4D Long-term Action Anticipation | Code | 1 |
| MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Action Anticipation | Code | 1 |
| Future Transformer for Long-term Action Anticipation | Code | 1 |
| Action Anticipation for Collaborative Environments: The Impact of Contextual Information and Uncertainty-Based Prediction | — | 0 |
| Anticipating human actions by correlating past with the future with Jaccard similarity measures | — | 0 |
| Leveraging Temporal Context in Low Representational Power Regimes | — | 0 |
| Egocentric Object Manipulation Graphs | — | 0 |
| Analysis over vision-based models for pedestrian action anticipation | — | 0 |
| Action Anticipation By Predicting Future Dynamic Images | — | 0 |
| DiffAnt: Diffusion Models for Action Anticipation | — | 0 |
| Delving into 3D Action Anticipation from Streaming Videos | — | 0 |
| User-in-the-loop Evaluation of Multimodal LLMs for Activity Assistance | — | 0 |
| Leveraging Next-Active Objects for Context-Aware Anticipation in Egocentric Videos | — | 0 |
| MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain | — | 0 |
| Intention Action Anticipation Model with Guide-Feedback Loop Mechanism | — | 0 |
| Deep Sequence Learning for Video Anticipation: From Discrete and Deterministic to Continuous and Stochastic | — | 0 |
| Inductive Attention for Video Action Anticipation | — | 0 |
| ICPR 2024 Competition on Rider Intention Prediction | — | 0 |
| Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models | — | 0 |
| VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought | — | 0 |
| JOADAA: joint online action detection and action anticipation | — | 0 |
| Knowledge Distillation for Action Anticipation via Label Smoothing | — | 0 |
| Human Action Anticipation: A Survey | — | 0 |


Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PlausiVL | Recall@5 | 27.6 | — | Unverified |
| 2 | InAViT | Recall@5 | 25.89 | — | Unverified |
| 3 | UADT | Recall@5 | 23 | — | Unverified |
| 4 | S-GEAR | Recall@5 | 19.9 | — | Unverified |
| 5 | AFFT | Recall@5 | 18.5 | — | Unverified |
| 6 | MeMViT-24 | Recall@5 | 17.7 | — | Unverified |
| 7 | AVT+ | Recall@5 | 15.9 | — | Unverified |
| 8 | TempAgg | Recall@5 | 14.73 | — | Unverified |
| 9 | RU-LSTM | Recall@5 | 13.94 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | InAViT | Recall@5 | 23.75 | — | Unverified |
| 2 | AVT++ | Recall@5 | 16.7 | — | Unverified |
| 3 | AFFT | Recall@5 | 14.9 | — | Unverified |
| 4 | Abstract Goal | Recall@5 | 14.29 | — | Unverified |
| 5 | AVT+ | Recall@5 | 12.6 | — | Unverified |
| 6 | TempAgg | Recall@5 | 12.6 | — | Unverified |
| 7 | RULSTM | Recall@5 | 11.2 | — | Unverified |
| 8 | TBN | Recall@5 | 11 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Abstract Goal | Top-1 Accuracy (Act.) | 22.03 | — | Unverified |
| 2 | AVT+ | Top-1 Accuracy (Act.) | 16.84 | — | Unverified |
| 3 | ImagineRNN | Top-1 Accuracy (Act.) | 14.66 | — | Unverified |
| 4 | RULSTM [24, 23] | Top-1 Accuracy (Act.) | 14.39 | — | Unverified |
| 5 | ED | Top-1 Accuracy (Act.) | 8.08 | — | Unverified |
| 6 | ATSN | Top-1 Accuracy (Act.) | 6 | — | Unverified |
| 7 | 2SCNN | Top-1 Accuracy (Act.) | 4.32 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Abstract Goal | Top-1 Accuracy (Act.) | 13.28 | — | Unverified |
| 2 | AVT+ | Top-1 Accuracy (Act.) | 10.41 | — | Unverified |
| 3 | ImagineRNN | Top-1 Accuracy (Act.) | 9.25 | — | Unverified |
| 4 | RULSTM [24, 23] | Top-1 Accuracy (Act.) | 8.16 | — | Unverified |
| 5 | ED | Top-1 Accuracy (Act.) | 2.65 | — | Unverified |
| 6 | ATSN | Top-1 Accuracy (Act.) | 2.39 | — | Unverified |
| 7 | 2SCNN | Top-1 Accuracy (Act.) | 2.29 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UADT | Top-1 Accuracy | 68.4 | — | Unverified |
| 2 | InAViT | Top-1 Accuracy | 67.8 | — | Unverified |
| 3 | Abstract Goal | Top-1 Accuracy | 49.8 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Goal Consistency | Verbs Recall@5 | 60.04 | — | Unverified |
| 2 | TempAgg | Verbs Recall@5 | 59.11 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Action anticipation baseline (co-training, with gaze) | Accuracy | 45.45 | — | Unverified |
| 2 | Action anticipation baseline (co-training, no gaze) | Accuracy | 38.7 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UADT | Top-1 Accuracy | 62.7 | — | Unverified |
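The Recall@5 numbers in the leaderboards above are conventionally class-mean top-5 recall: for each class, the fraction of its test samples whose true label appears among the model's five highest-scoring predictions, averaged over classes. A minimal sketch of that computation (the function name is an assumption, not from any benchmark toolkit):

```python
import numpy as np

def mean_topk_recall(scores: np.ndarray, labels: np.ndarray, k: int = 5) -> float:
    """Class-mean top-k recall.

    scores: (N, C) prediction scores; labels: (N,) ground-truth class ids.
    For each class present in `labels`, compute the fraction of its samples
    whose true label is among the k highest-scoring classes, then average
    over classes (so rare classes weigh as much as frequent ones).
    """
    topk = np.argsort(-scores, axis=1)[:, :k]    # (N, k) top-k class ids
    hit = (topk == labels[:, None]).any(axis=1)  # (N,) true label in top-k?
    classes = np.unique(labels)
    return float(np.mean([hit[labels == c].mean() for c in classes]))
```

Averaging per class rather than per sample is what distinguishes this metric from plain top-5 accuracy on long-tailed action vocabularies.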