SOTAVerified

Descriptive

Papers

Showing 101-150 of 1477 papers

| Title | Status | Hype |
|---|---|---|
| Towards Understanding the Use of MLLM-Enabled Applications for Visual Interpretation by Blind and Low Vision People | | 0 |
| A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning | Code | 0 |
| Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs | Code | 1 |
| Text2Scenario: Text-Driven Scenario Generation for Autonomous Driving Test | | 0 |
| Enhancing Monocular 3D Scene Completion with Diffusion Model | Code | 1 |
| Assessing Large Language Models in Agentic Multilingual National Bias | | 0 |
| Software implemented fault diagnosis of natural gas pumping unit based on feedforward neural network | | 0 |
| Dataset Featurization: Uncovering Natural Language Features through Unsupervised Data Reconstruction | | 0 |
| CLIP-SENet: CLIP-based Semantic Enhancement Network for Vehicle Re-identification | | 0 |
| Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics | Code | 0 |
| ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models | Code | 0 |
| ChordFormer: A Conformer-Based Architecture for Large-Vocabulary Audio Chord Recognition | | 0 |
| FE-LWS: Refined Image-Text Representations via Decoder Stacking and Fused Encodings for Remote Sensing Image Captioning | | 0 |
| PathFinder: A Multi-Modal Multi-Agent System for Medical Diagnostic Decision-Making Applied to Histopathology | | 0 |
| ViLa-MIL: Dual-scale Vision-Language Multiple Instance Learning for Whole Slide Image Classification | Code | 2 |
| A Multimodal PDE Foundation Model for Prediction and Scientific Text Descriptions | Code | 0 |
| Augmented Conditioning Is Enough For Effective Training Image Generation | | 0 |
| Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs | Code | 1 |
| Combining physics-based and data-driven models: advancing the frontiers of research with Scientific Machine Learning | | 0 |
| Towards Recommender Systems LLMs Playground (RecSysLLMsP): Exploring Polarization and Engagement in Simulated Social Networks | | 0 |
| Audio Large Language Models Can Be Descriptive Speech Quality Evaluators | Code | 0 |
| Generating customized prompts for Zero-Shot Rare Event Medical Image Classification using LLM | Code | 0 |
| Addressing Out-of-Label Hazard Detection in Dashcam Videos: Insights from the COOOL Challenge | Code | 0 |
| Query-based versus resource-based cache strategies in tag-based browsing systems | | 0 |
| Measuring and Mitigating Hallucinations in Vision-Language Dataset Generation for Remote Sensing | | 0 |
| Attribute-based Visual Reprogramming for Image Classification with CLIP | Code | 0 |
| Optimizing Portfolios with Pakistan-Exposed ETFs: Risk and Performance Insight | | 0 |
| A Cognitive Paradigm Approach to Probe the Perception-Reasoning Interface in VLMs | | 0 |
| Investigation of the Privacy Concerns in AI Systems for Young Digital Citizens: A Comparative Stakeholder Analysis | | 0 |
| A Resource-Efficient Training Framework for Remote Sensing Text--Image Retrieval | Code | 0 |
| Augmenting a Large Language Model with a Combination of Text and Visual Data for Conversational Visualization of Global Geospatial Data | | 0 |
| Electronic Health Records: Towards Digital Twins in Healthcare | | 0 |
| Empirical Study on the Factors Influencing Stock Market Volatility in China | | 0 |
| A Comparative Analysis of DNN-based White-Box Explainable AI Methods in Network Security | Code | 0 |
| A Preliminary Survey of Semantic Descriptive Model for Images | | 0 |
| FinerWeb-10BT: Refining Web Data with LLM-Based Line-Level Filtering | Code | 0 |
| Initial Findings on Sensor based Open Vocabulary Activity Recognition via Text Embedding Inversion | | 0 |
| Effect of Information Technology on Job Creation to Support Economic: Case Studies of Graduates in Universities (2023-2024) of the KRG of Iraq | | 0 |
| Visual question answering: from early developments to recent advances -- a survey | | 0 |
| SMIR: Efficient Synthetic Data Pipeline To Improve Multi-Image Reasoning | Code | 0 |
| Exploring a Datasets Statistical Effect Size Impact on Model Performance, and Data Sample-Size Sufficiency | | 0 |
| Time Series Language Model for Descriptive Caption Generation | | 0 |
| MDSF: Context-Aware Multi-Dimensional Data Storytelling Framework based on Large language Model | | 0 |
| Quality of life and perceived care of patients in advanced chronic kidney disease consultations: a cross-sectional descriptive study | | 0 |
| A redescription mining framework for post-hoc explaining and relating deep learning models | | 0 |
| Semantic and Expressive Variations in Image Captions Across Languages | | 0 |
| FlashSloth : Lightning Multimodal Large Language Models via Embedded Visual Compression | Code | 2 |
| DViN: Dynamic Visual Routing Network for Weakly Supervised Referring Expression Comprehension | | 0 |
| HSI-GPT: A General-Purpose Large Scene-Motion-Language Model for Human Scene Interaction | | 0 |
| TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification | | 0 |

No leaderboard results yet.