SOTAVerified

Action Understanding

Papers

Showing 26-50 of 88 papers

| Title | Status | Hype |
| --- | --- | --- |
| YouMakeup VQA Challenge: Towards Fine-grained Action Understanding in Domain-Specific Videos | Code | 1 |
| LLaVA-Pose: Enhancing Human Pose and Action Understanding via Keypoint-Integrated Instruction Tuning | Code | 0 |
| The Role of Video Generation in Enhancing Data-Limited Action Understanding | | 0 |
| PCBEAR: Pose Concept Bottleneck for Explainable Action Recognition | | 0 |
| RoboAct-CLIP: Video-Driven Pre-training of Atomic Action Understanding for Robotics | | 0 |
| Can DeepSeek Reason Like a Surgeon? An Empirical Evaluation for Vision-Language Understanding in Robotic-Assisted Surgery | | 0 |
| ScreenLLM: Stateful Screen Schema for Efficient Action Understanding and Prediction | | 0 |
| HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models | | 0 |
| HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding | | 0 |
| STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding | | 0 |
| Heterogeneous Skeleton-Based Action Representation Learning | | 0 |
| About Time: Advances, Challenges, and Outlooks of Action Understanding | | 0 |
| MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion | | 0 |
| CathAction: A Benchmark for Endovascular Intervention Understanding | | 0 |
| Probing Fine-Grained Action Understanding and Cross-View Generalization of Foundation Models | | 0 |
| Region-aware Image-based Human Action Retrieval with Transformers | | 0 |
| VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment | | 0 |
| Self-Supervised Skeleton-Based Action Representation Learning: A Benchmark and Beyond | Code | 0 |
| The SkatingVerse Workshop & Challenge: Methods and Results | | 0 |
| Social-MAE: Social Masked Autoencoder for Multi-person Motion Representation Learning | | 0 |
| Enhancing Video Transformers for Action Understanding with VLM-aided Training | | 0 |
| Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study | | 0 |
| Multitask Learning in Minimally Invasive Surgical Vision: A Review | | 0 |
| Towards Weakly Supervised End-to-end Learning for Long-video Action Recognition | | 0 |
| Kantian Deontology Meets AI Alignment: Towards Morally Grounded Fairness Metrics | | 0 |

No leaderboard results yet.