| Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension | Dec 4, 2024 | Descriptive, Language Modeling | Code Available | 1 |
| Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey | Dec 3, 2024 | Change Detection, Descriptive | Code Available | 3 |
| Analyzing the Impact of AI Tools on Student Study Habits and Academic Performance | Dec 3, 2024 | Descriptive | Unverified | 0 |
| SelfPrompt: Autonomously Evaluating LLM Robustness via Domain-Constrained Knowledge Guidelines and Refined Adversarial Prompts | Dec 1, 2024 | Descriptive, Knowledge Graphs | Unverified | 0 |
| EventGPT: Event Stream Understanding with Multimodal Large Language Models | Dec 1, 2024 | Descriptive | Unverified | 0 |
| Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints | Nov 28, 2024 | Descriptive | Unverified | 0 |
| TechCoach: Towards Technical-Point-Aware Descriptive Action Coaching | Nov 26, 2024 | Action Assessment, Descriptive | Unverified | 0 |
| What's in the Image? A Deep-Dive into the Vision of Vision Language Models | Nov 26, 2024 | Attribute, Descriptive | Unverified | 0 |
| SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis | Nov 25, 2024 | Descriptive, Form | Unverified | 0 |
| Utilization and Profitability of Tractor Services for Maize Farming in Ejura-Sekyedumase Municipality, Ghana | Nov 24, 2024 | Descriptive | Unverified | 0 |
| From MTEB to MTOB: Retrieval-Augmented Classification for Descriptive Grammars | Nov 23, 2024 | Descriptive, In-Context Learning | Code Available | 0 |
| MolReFlect: Towards Fine-grained In-Context Alignment between Molecules and Texts | Nov 22, 2024 | Descriptive, Molecule Captioning | Unverified | 0 |
| The Explabox: Model-Agnostic Machine Learning Transparency & Analysis | Nov 22, 2024 | Descriptive, Fairness | Unverified | 0 |
| Proportional infinite-width infinite-depth limit for deep linear neural networks | Nov 22, 2024 | Descriptive | Unverified | 0 |
| Omni-IML: Towards Unified Image Manipulation Localization | Nov 22, 2024 | Decoder, Descriptive | Unverified | 0 |
| MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts | Nov 22, 2024 | Descriptive | Unverified | 0 |
| Uterine Ultrasound Image Captioning Using Deep Learning Techniques | Nov 21, 2024 | Deep Learning, Descriptive | Unverified | 0 |
| A Multimodal Approach Combining Structural and Cross-domain Textual Guidance for Weakly Supervised OCT Segmentation | Nov 19, 2024 | Descriptive, Diagnostic | Code Available | 0 |
| MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT | Nov 18, 2024 | Contrastive Learning, Descriptive | Unverified | 0 |
| Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | Nov 15, 2024 | Benchmarking, counterfactual | Unverified | 0 |
| Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | Nov 15, 2024 | Descriptive, Object | Unverified | 0 |
| Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions | Nov 13, 2024 | Descriptive, Hallucination | Code Available | 0 |
| BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions | Nov 12, 2024 | Descriptive, Image Captioning | Unverified | 0 |
| Collaborative and Federated Black-box Optimization: A Bayesian Optimization Perspective | Nov 12, 2024 | Bayesian Optimization, Decision Making | Unverified | 0 |
| An Empirical Implementation of the Shadow Riskless Rate | Nov 11, 2024 | Descriptive | Unverified | 0 |