| No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance | Apr 4, 2024 | Benchmarking, Image Generation | Code Available | 2 |
| Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning | Apr 4, 2024 | 3D Scene Reconstruction, Depth Estimation | Code Available | 2 |
| Decision Transformer as a Foundation Model for Partially Observable Continuous Control | Apr 3, 2024 | Continuous Control | Unverified | 0 |
| Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Apr 3, 2024 | Image Generation, Image Reconstruction | Code Available | 9 |
| Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation | Apr 2, 2024 | Zero-shot Generalization | Code Available | 1 |
| F^2Depth: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis | Mar 27, 2024 | Depth Estimation, Indoor Monocular Depth Estimation | Unverified | 0 |
| Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation | Mar 22, 2024 | Depth Estimation, Surface Normal Estimation | Code Available | 7 |
| Federated reinforcement learning for robot motion planning with zero-shot generalization | Mar 20, 2024 | Motion Planning, Zero-shot Generalization | Unverified | 0 |
| Quantifying uncertainty in lung cancer segmentation with foundation models applied to mixed-domain datasets | Mar 19, 2024 | Computed Tomography (CT), Segmentation | Unverified | 0 |
| Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models | Mar 19, 2024 | Image Classification | Code Available | 1 |