| Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning | Nov 4, 2021 | Hierarchical Reinforcement Learning, reinforcement-learning | —Unverified | 0 |
| Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models | Jul 8, 2025 | Future prediction, Large Language Model | —Unverified | 0 |
| Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis | Nov 26, 2024 | Decoder, multimodal generation | —Unverified | 0 |
| VisLanding: Monocular 3D Perception for UAV Safe Landing via Depth-Normal Synergy | Jun 17, 2025 | Decision Making, Semantic Segmentation | —Unverified | 0 |
| Visual Image Reconstruction from Brain Activity via Latent Representation | May 13, 2025 | Early Classification, Image Reconstruction | —Unverified | 0 |
| ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers | May 26, 2025 | cross-modal alignment, Position | —Unverified | 0 |
| VQ-AR: Vector Quantized Autoregressive Probabilistic Time Series Forecasting | May 31, 2022 | Decision Making, Inductive Bias | —Unverified | 0 |
| WeLM: A Well-Read Pre-trained Language Model for Chinese | Sep 21, 2022 | Language Modeling, Language Modelling | —Unverified | 0 |
| What Matters for Model Merging at Scale? | Oct 4, 2024 | model, Task Arithmetic | —Unverified | 0 |
| What Matters to You? Towards Visual Representation Alignment for Robot Learning | Oct 11, 2023 | Zero-shot Generalization | —Unverified | 0 |