SOTAVerified

Vision-Language-Action

Papers

Showing 26-50 of 157 papers

Title | Status | Hype
----- | ------ | ----
Hybrid Reasoning for Perception, Explanation, and Autonomous Action in Manufacturing | - | 0
Real-Time Execution of Action Chunking Flow Policies | Code | 3
BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models | - | 0
BitVLA: 1-bit Vision-Language-Action Models for Robotics Manipulation | Code | 2
Surgeon Style Fingerprinting and Privacy Risk Quantification via Discrete Diffusion Models in a Vision-Language-Action Framework | Code | 0
Robotic Policy Learning via Human-assisted Action Preference Optimization | - | 0
RoboCerebra: A Large-scale Benchmark for Long-horizon Robotic Manipulation Evaluation | - | 0
DriveAction: A Benchmark for Exploring Human-like Driving Decisions in VLA Models | - | 0
Adversarial Attacks on Robotic Vision Language Action Models | Code | 1
ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding | - | 0
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Code | 11
OG-VLA: 3D-Aware Vision Language Action Model via Orthographic Image Generation | - | 0
LoHoVLA: A Unified Vision-Language-Action Model for Long-Horizon Embodied Tasks | - | 0
Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction | - | 0
Impromptu VLA: Open Weights and Open Data for Driving Vision-Language-Action Models | Code | 3
TrackVLA: Embodied Visual Tracking in the Wild | - | 0
Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better | - | 0
ForceVLA: Enhancing VLA Models with a Force-aware MoE for Contact-rich Manipulation | - | 0
ChatVLA-2: Vision-Language-Action Model with Open-World Embodied Reasoning from Pretrained Knowledge | Code | 1
Hume: Introducing System-2 Thinking in Visual-Language-Action Model | - | 0
Embodied AI with Foundation Models for Mobile Service Robots: A Systematic Review | - | 0
What Can RL Bring to VLA Generalization? An Empirical Study | - | 0
VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning | Code | 3
Interactive Post-Training for Vision-Language-Action Models | - | 0
DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving | - | 0
Page 2 of 7

No leaderboard results yet.