| Video Event Reasoning and Prediction by Fusing World Knowledge from LLMs with Vision Foundation Models | Jul 8, 2025 | Future prediction, Large Language Model | —Unverified | 0 |
| Distributed Poisson multi-Bernoulli filtering via generalised covariance intersection | Jun 23, 2025 | Future prediction | —Unverified | 0 |
| DySS: Dynamic Queries and State-Space Learning for Efficient 3D Object Detection from Multi-Camera Videos | Jun 11, 2025 | 3D Object Detection, Autonomous Driving | —Unverified | 0 |
| Looking Beyond Visible Cues: Implicit Video Question Answering via Dual-Clue Reasoning | Jun 9, 2025 | Future prediction, Question Answering | Code Available | 0 |
| LETS Forecast: Learning Embedology for Time Series Forecasting | Jun 6, 2025 | Future prediction, Time Series | Code Available | 1 |
| Are Statistical Methods Obsolete in the Era of Deep Learning? | May 27, 2025 | Deep Learning, Epidemiology | —Unverified | 0 |
| ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos | May 24, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 |
| Learning from Streaming Video with Orthogonal Gradients | Apr 2, 2025 | Future prediction, Representation Learning | —Unverified | 0 |
| AdaWorld: Learning Adaptable World Models with Latent Actions | Mar 24, 2025 | Future prediction | Code Available | 3 |
| PhysTwin: Physics-Informed Reconstruction and Simulation of Deformable Objects from Videos | Mar 23, 2025 | 4D reconstruction, Deformable Object Manipulation | Code Available | 3 |