| A Multimodal Approach Combining Structural and Cross-domain Textual Guidance for Weakly Supervised OCT Segmentation | Nov 19, 2024 | Descriptive, Diagnostic | Code Available | 0 |
| MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT | Nov 18, 2024 | Contrastive Learning, Descriptive | —Unverified | 0 |
| Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | Nov 15, 2024 | Descriptive, Object | —Unverified | 0 |
| Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | Nov 15, 2024 | Benchmarking, counterfactual | —Unverified | 0 |
| Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions | Nov 13, 2024 | Descriptive, Hallucination | Code Available | 0 |
| BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions | Nov 12, 2024 | Descriptive, Image Captioning | —Unverified | 0 |
| Collaborative and Federated Black-box Optimization: A Bayesian Optimization Perspective | Nov 12, 2024 | Bayesian Optimization, Decision Making | —Unverified | 0 |
| An Empirical Implementation of the Shadow Riskless Rate | Nov 11, 2024 | Descriptive | —Unverified | 0 |
| UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors | Nov 8, 2024 | Denoising, Descriptive | Code Available | 0 |
| Knowledge Distillation Neural Network for Predicting Car-following Behaviour of Human-driven and Autonomous Vehicles | Nov 8, 2024 | Autonomous Vehicles, Descriptive | —Unverified | 0 |