| Towards Understanding the Use of MLLM-Enabled Applications for Visual Interpretation by Blind and Low Vision People | Mar 7, 2025 | Descriptive | Unverified | 0 |
| A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning | Mar 6, 2025 | Descriptive, Image Captioning | Code Available | 0 |
| Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs | Mar 5, 2025 | Computational Efficiency, Descriptive | Code Available | 1 |
| Text2Scenario: Text-Driven Scenario Generation for Autonomous Driving Test | Mar 4, 2025 | Autonomous Driving, Descriptive | Unverified | 0 |
| Enhancing Monocular 3D Scene Completion with Diffusion Model | Mar 2, 2025 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 1 |
| Assessing Large Language Models in Agentic Multilingual National Bias | Feb 25, 2025 | Decision Making, Descriptive | Unverified | 0 |
| Software implemented fault diagnosis of natural gas pumping unit based on feedforward neural network | Feb 25, 2025 | Descriptive, Diagnostic | Unverified | 0 |
| Dataset Featurization: Uncovering Natural Language Features through Unsupervised Data Reconstruction | Feb 24, 2025 | Descriptive | Unverified | 0 |
| CLIP-SENet: CLIP-based Semantic Enhancement Network for Vehicle Re-identification | Feb 24, 2025 | Descriptive, Vehicle Re-Identification | Unverified | 0 |
| Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics | Feb 20, 2025 | Descriptive | Code Available | 0 |
| ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models | Feb 17, 2025 | Code Generation, Descriptive | Code Available | 0 |
| ChordFormer: A Conformer-Based Architecture for Large-Vocabulary Audio Chord Recognition | Feb 17, 2025 | Chord Recognition, Descriptive | Unverified | 0 |
| FE-LWS: Refined Image-Text Representations via Decoder Stacking and Fused Encodings for Remote Sensing Image Captioning | Feb 13, 2025 | Caption Generation, Decoder | Unverified | 0 |
| PathFinder: A Multi-Modal Multi-Agent System for Medical Diagnostic Decision-Making Applied to Histopathology | Feb 13, 2025 | Decision Making, Descriptive | Unverified | 0 |
| ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification | Feb 12, 2025 | Decoder, Descriptive | Code Available | 2 |
| A Multimodal PDE Foundation Model for Prediction and Scientific Text Descriptions | Feb 9, 2025 | Descriptive, Multimodal Deep Learning | Code Available | 0 |
| Augmented Conditioning Is Enough For Effective Training Image Generation | Feb 6, 2025 | Conditional Image Generation, Descriptive | Unverified | 0 |
| Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs | Feb 4, 2025 | 16k, Descriptive | Code Available | 1 |
| Combining physics-based and data-driven models: advancing the frontiers of research with Scientific Machine Learning | Jan 30, 2025 | Descriptive | Unverified | 0 |
| Towards Recommender Systems LLMs Playground (RecSysLLMsP): Exploring Polarization and Engagement in Simulated Social Networks | Jan 29, 2025 | Descriptive, Recommendation Systems | Unverified | 0 |
| Audio Large Language Models Can Be Descriptive Speech Quality Evaluators | Jan 27, 2025 | Descriptive | Code Available | 0 |
| Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM | Jan 27, 2025 | Descriptive, Event Detection | Code Available | 0 |
| Addressing Out-of-Label Hazard Detection in Dashcam Videos: Insights from the COOOL Challenge | Jan 27, 2025 | Anomaly Detection, Autonomous Driving | Code Available | 0 |
| Query-based versus resource-based cache strategies in tag-based browsing systems | Jan 26, 2025 | Descriptive, TAG | Unverified | 0 |
| Measuring and Mitigating Hallucinations in Vision-Language Dataset Generation for Remote Sensing | Jan 24, 2025 | Caption Generation, Dataset Generation | Unverified | 0 |
| Attribute-based Visual Reprogramming for Image Classification with CLIP | Jan 23, 2025 | Attribute, Descriptive | Code Available | 0 |
| Optimizing Portfolios with Pakistan-Exposed ETFs: Risk and Performance Insight | Jan 23, 2025 | Descriptive | Unverified | 0 |
| A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs | Jan 23, 2025 | Descriptive, Diagnostic | Unverified | 0 |
| Investigation of the Privacy Concerns in AI Systems for Young Digital Citizens: A Comparative Stakeholder Analysis | Jan 23, 2025 | Descriptive, valid | Unverified | 0 |
| A Resource-Efficient Training Framework for Remote Sensing Text--Image Retrieval | Jan 18, 2025 | Descriptive, Image Retrieval | Code Available | 0 |
| Augmenting a Large Language Model with a Combination of Text and Visual Data for Conversational Visualization of Global Geospatial Data | Jan 16, 2025 | Data Interaction, Descriptive | Unverified | 0 |
| Electronic Health Records: Towards Digital Twins in Healthcare | Jan 16, 2025 | Descriptive | Unverified | 0 |
| Empirical Study on the Factors Influencing Stock Market Volatility in China | Jan 15, 2025 | Descriptive | Unverified | 0 |
| A Comparative Analysis of DNN-based White-Box Explainable AI Methods in Network Security | Jan 14, 2025 | Descriptive, Intrusion Detection | Code Available | 0 |
| A Preliminary Survey of Semantic Descriptive Model for Images | Jan 13, 2025 | Descriptive, Image Description | Unverified | 0 |
| FinerWeb-10BT: Refining Web Data with LLM-Based Line-Level Filtering | Jan 13, 2025 | Descriptive, HellaSwag | Code Available | 0 |
| Initial Findings on Sensor based Open Vocabulary Activity Recognition via Text Embedding Inversion | Jan 13, 2025 | Activity Recognition, Descriptive | Unverified | 0 |
| Effect of Information Technology on Job Creation to Support Economic: Case Studies of Graduates in Universities (2023-2024) of the KRG of Iraq | Jan 8, 2025 | Descriptive | Unverified | 0 |
| Visual question answering: from early developments to recent advances -- a survey | Jan 7, 2025 | Descriptive, Natural Language Understanding | Unverified | 0 |
| SMIR: Efficient Synthetic Data Pipeline To Improve Multi-Image Reasoning | Jan 7, 2025 | Descriptive, Synthetic Data Generation | Code Available | 0 |
| Exploring a Datasets Statistical Effect Size Impact on Model Performance, and Data Sample-Size Sufficiency | Jan 5, 2025 | Descriptive, Experimental Design | Unverified | 0 |
| Time Series Language Model for Descriptive Caption Generation | Jan 3, 2025 | Caption Generation, Denoising | Unverified | 0 |
| MDSF: Context-Aware Multi-Dimensional Data Storytelling Framework based on Large language Model | Jan 2, 2025 | Descriptive, Language Modeling | Unverified | 0 |
| Quality of life and perceived care of patients in advanced chronic kidney disease consultations: a cross-sectional descriptive study | Jan 2, 2025 | Descriptive | Unverified | 0 |
| A redescription mining framework for post-hoc explaining and relating deep learning models | Jan 2, 2025 | Descriptive | Unverified | 0 |
| Semantic and Expressive Variations in Image Captions Across Languages | Jan 1, 2025 | Descriptive, Image Captioning | Unverified | 0 |
| FlashSloth : Lightning Multimodal Large Language Models via Embedded Visual Compression | Jan 1, 2025 | Descriptive | Code Available | 2 |
| DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension | Jan 1, 2025 | Descriptive, Referring Expression | Unverified | 0 |
| HSI-GPT: A General-Purpose Large Scene-Motion-Language Model for Human Scene Interaction | Jan 1, 2025 | Descriptive, Instruction Following | Unverified | 0 |
| TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification | Dec 31, 2024 | Audio Classification, Classification | Unverified | 0 |