| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Jun 16, 2025 | Action Generation, Autonomous Driving | Code Available | 3 |
| ConTextTab: A Semantics-Aware Tabular In-Context Learner | Jun 12, 2025 | In-Context Learning, World Knowledge | Code Available | 2 |
| MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning | Jun 12, 2025 | Image Generation, Multimodal Reasoning | Unverified | 0 |
| RoCA: Robust Cross-Domain End-to-End Autonomous Driving | Jun 11, 2025 | Autonomous Driving, Domain Adaptation | Unverified | 0 |
| Serendipitous Recommendation with Multimodal LLM | Jun 9, 2025 | Recommendation Systems, World Knowledge | Unverified | 0 |
| ReCogDrive: A Reinforced Cognitive Framework for End-to-End Autonomous Driving | Jun 9, 2025 | Autonomous Driving, Imitation Learning | Unverified | 0 |
| Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation | Jun 6, 2025 | Computational Efficiency, World Knowledge | Unverified | 0 |
| Quantifying Cross-Modality Memorization in Vision-Language Models | Jun 5, 2025 | Machine Unlearning, Memorization | Unverified | 0 |
| TIIF-Bench: How Does Your T2I Model Follow Your Instructions? | Jun 2, 2025 | Benchmarking, Instruction Following | Unverified | 0 |
| From Words to Waves: Analyzing Concept Formation in Speech and Text-Based Foundation Models | Jun 1, 2025 | World Knowledge | Unverified | 0 |