| Zero-Shot Monocular Scene Flow Estimation in the Wild | Jan 17, 2025 | Depth Estimation, Prediction | Unverified | 0 |
| StereoGen: High-quality Stereo Image Generation from a Single Image | Jan 15, 2025 | Depth Estimation, Image Generation | Unverified | 0 |
| Capability-Aware Shared Hypernetworks for Flexible Heterogeneous Multi-Robot Coordination | Jan 10, 2025 | Diversity, Imitation Learning | Code Available | 0 |
| Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation | Jan 8, 2025 | Code Generation, Language Modeling | Unverified | 0 |
| MADation: Face Morphing Attack Detection with Foundation Models | Jan 7, 2025 | Face Morphing Attack Detection, Face Recognition | Code Available | 0 |
| Spot Risks Before Speaking! Unraveling Safety Attention Heads in Large Vision-Language Models | Jan 3, 2025 | Zero-shot Generalization | Code Available | 0 |
| On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach | Jan 1, 2025 | Adversarial Robustness, Zero-shot Generalization | Unverified | 0 |
| On the Out-Of-Distribution Generalization of Large Multimodal Models | Jan 1, 2025 | In-Context Learning, Out-of-Distribution Generalization | Unverified | 0 |
| From Pixels to Predicates: Learning Symbolic World Models via Pretrained Vision-Language Models | Dec 31, 2024 | Decision Making, Zero-shot Generalization | Unverified | 0 |
| EC-Diffuser: Multi-Object Manipulation via Entity-Centric Behavior Generation | Dec 25, 2024 | Object, Zero-shot Generalization | Unverified | 0 |