SOTAVerified

Action Parsing

Action parsing is the task of labeling each frame of a video, or a single still image, with the action it depicts.
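As a rough illustration of what a per-frame labeling looks like, the sketch below collapses a sequence of frame labels into contiguous action segments. The function name `frames_to_segments` and the example labels are hypothetical and are not taken from any of the listed papers; this is only one plausible way to represent the output of such a task.

```python
from itertools import groupby


def frames_to_segments(frame_labels):
    """Collapse per-frame labels into (label, start, end) runs.

    `start` is inclusive and `end` is exclusive, both in frame indices.
    """
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))  # number of consecutive frames with this label
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-frame predictions for a six-frame clip.
frame_labels = ["reach", "reach", "grasp", "grasp", "grasp", "lift"]
segments = frames_to_segments(frame_labels)
print(segments)
```

A still image is the degenerate case of a single frame, which yields a single segment of length one.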

Papers

Showing 1-15 of 15 papers

| Title | Status | Hype |
|---|---|---|
| Local Temporal Bilinear Pooling for Fine-grained Action Parsing | Code | 1 |
| Modeling Worlds in Text | Code | 1 |
| Action Recognition by Hierarchical Mid-level Action Elements | | 0 |
| An Expressive Deep Model for Human Action Parsing from A Single Image | | 0 |
| DAP3D-Net: Where, What and How Actions Occur in Videos? | | 0 |
| IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles | | 0 |
| Intra- and Inter-Action Understanding via Temporal Action Parsing | | 0 |
| Learning Knowledge Graph-based World Models of Textual Environments | | 0 |
| SSCAP: Self-supervised Co-occurrence Action Parsing for Unsupervised Temporal Action Segmentation | | 0 |
| Action parsing using context features | | 0 |
| A Baseline Framework for Part-level Action Parsing and Action Recognition | | 0 |
| Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework | | 0 |
| PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation | | 0 |
| Technical Report: Disentangled Action Parsing Networks for Accurate Part-level Action Parsing | | 0 |
| Frontal Low-rank Random Tensors for Fine-grained Action Segmentation | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Seq2Seq | Set accuracy | 18.1 | | Unverified |
| 2 | CALM | Set accuracy | 13.79 | | Unverified |