| Large Language Models as Foundations for Next-Gen Dense Retrieval: A Comprehensive Empirical Assessment | Aug 22, 2024 | Multi-Task Learning, Retrieval | Unverified | 0 |
| Generalizable Facial Expression Recognition | Aug 20, 2024 | Domain Adaptation, Facial Expression Recognition | Code Available | 1 |
| Zero-Shot Object-Centric Representation Learning | Aug 17, 2024 | Object, Object Discovery | Unverified | 0 |
| OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction | Aug 16, 2024 | Prediction, Traffic Prediction | Code Available | 2 |
| One Shot is Enough for Sequential Infrared Small Target Segmentation | Aug 9, 2024 | One-Shot Segmentation, Segmentation | Code Available | 0 |
| Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation | Aug 7, 2024 | Adversarial Robustness, Image Segmentation | Unverified | 0 |
| Visual Grounding for Object-Level Generalization in Reinforcement Learning | Aug 4, 2024 | Language Modelling, Object | Code Available | 1 |
| HeteroMorpheus: Universal Control Based on Morphological Heterogeneity Modeling | Aug 2, 2024 | Diversity, Zero-shot Generalization | Code Available | 0 |
| Segment Anything for Videos: A Systematic Survey | Jul 31, 2024 | Image Segmentation, Robot Manipulation Generalization | Code Available | 5 |
| HybridDepth: Robust Metric Depth Fusion by Leveraging Depth from Focus and Single-Image Priors | Jul 26, 2024 | Depth Estimation, GPU | Code Available | 2 |