SOTAVerified

Action Understanding

Papers

Showing 51-88 of 88 papers

Title | Status | Hype
VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment |  | 0
Weakly Supervised Actor-Action Segmentation via Robust Multi-Task Ranking |  | 0
Who is Mistaken? |  | 0
The SkatingVerse Workshop & Challenge: Methods and Results |  | 0
Action Understanding with Multiple Classes of Actors |  | 0
Actor and Action Modular Network for Text-based Video Segmentation |  | 0
Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition |  | 0
A Hierarchical Pose-Based Approach to Complex Action Understanding Using Dictionaries of Actionlets and Motion Poselets |  | 0
An Expressive Deep Model for Human Action Parsing from A Single Image |  | 0
ATTACH Dataset: Annotated Two-Handed Assembly Actions for Human Action Understanding |  | 0
Boundary Content Graph Neural Network for Temporal Action Proposal Generation |  | 0
Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery |  | 0
Can Humans Fly? Action Understanding With Multiple Classes of Actors |  | 0
CathAction: A Benchmark for Endovascular Intervention Understanding |  | 0
Comparing Machines and Children: Using Developmental Psychology Experiments to Assess the Strengths and Weaknesses of LaMDA Responses |  | 0
Compositional Structure Learning for Action Understanding |  | 0
Cortical Mirror-System Activation During Real-Life Game Playing: An Intracranial Electroencephalography (EEG) Study |  | 0
DAP3D-Net: Where, What and How Actions Occur in Videos? |  | 0
Enhancing Video Transformers for Action Understanding with VLM-aided Training |  | 0
Event-based Timestamp Image Encoding Network for Human Action Recognition and Anticipation |  | 0
Explainable robotic systems: Understanding goal-driven actions in a reinforcement learning scenario |  | 0
Exploring Uncertainty in Conditional Multi-Modal Retrieval Systems |  | 0
FineGym: A Hierarchical Video Dataset for Fine-grained Action Understanding |  | 0
From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding |  | 0
Grasp Type Revisited: A Modern Perspective on a Classical Feature for Vision |  | 0
GTA: Global Temporal Attention for Video Action Understanding |  | 0
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models |  | 0
Heterogeneous Skeleton-Based Action Representation Learning |  | 0
Hierarchical Attention Network for Action Recognition in Videos |  | 0
Human Action Segmentation With Hierarchical Supervoxel Consistency |  | 0
HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding |  | 0
Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study |  | 0
Intra- and Inter-Action Understanding via Temporal Action Parsing |  | 0
Invisible-to-Visible: Privacy-Aware Human Instance Segmentation using Airborne Ultrasound via Collaborative Learning Variational Autoencoder |  | 0
JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection |  | 0
Kantian Deontology Meets AI Alignment: Towards Morally Grounded Fairness Metrics |  | 0
MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion |  | 0
MMAct: A Large-Scale Dataset for Cross Modal Human Action Understanding |  | 0

No leaderboard results yet.