SOTAVerified

Descriptive

Papers

Showing 351-400 of 1477 papers

| Title | Status | Hype |
|---|---|---|
| TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification | | 0 |
| AltGen: AI-Driven Alt Text Generation for Enhancing EPUB Accessibility | | 0 |
| Is Your Text-to-Image Model Robust to Caption Noise? | | 0 |
| Multi-Agent Norm Perception and Induction in Distributed Healthcare | | 0 |
| Underutilization of Syntactic Processing by Chinese Learners of English in Comprehending English Sentences, Evidenced from Adapted Garden-Path Ambiguity Experiment | | 0 |
| TalkWithMachines: Enhancing Human-Robot Interaction for Interpretable Industrial Robotics Through Large/Vision Language Models | | 0 |
| Descriptive Caption Enhancement with Visual Specialists for Multimodal Perception | Code | 0 |
| Real Classification by Description: Extending CLIP's Limits of Part Attributes Recognition | Code | 0 |
| SEKE: Specialised Experts for Keyword Extraction | Code | 0 |
| JoVALE: Detecting Human Actions in Video Using Audiovisual and Language Contexts | Code | 0 |
| Organizational culture and the usage of Industry 4.0 technologies: evidence from Swiss businesses | | 0 |
| Digital Transformation in Switzerland: The Current State and Expectations | | 0 |
| Is it the end of (generative) linguistics as we know it? | | 0 |
| Implicit Location-Caption Alignment via Complementary Masking for Weakly-Supervised Dense Video Captioning | Code | 0 |
| Multilingual and Explainable Text Detoxification with Parallel Corpora | Code | 0 |
| Semi-automated analysis of audio-recorded lessons: The case of teachers' engaging messages | | 0 |
| CoinMath: Harnessing the Power of Coding Instruction for Math LLMs | Code | 0 |
| Bridging Vision and Language: Modeling Causality and Temporality in Video Narratives | | 0 |
| Automated Image Captioning with CNNs and Transformers | Code | 0 |
| Interpreting Graphic Notation with MusicLDM: An AI Improvisation of Cornelius Cardew's Treatise | | 0 |
| MOPI-HFRS: A Multi-objective Personalized Health-aware Food Recommendation System with LLM-enhanced Interpretation | Code | 0 |
| Hallucination Elimination and Semantic Enhancement Framework for Vision-Language Models in Traffic Scenarios | Code | 0 |
| Language-Guided Image Tokenization for Generation | | 0 |
| Cardiometabolic Risk Factors in South Asians: An Epidemiological and Anthropological Study in an Urban Populace of Eastern India | | 0 |
| ProtDAT: A Unified Framework for Protein Sequence Design from Any Protein Text Description | | 0 |
| Analyzing the Impact of AI Tools on Student Study Habits and Academic Performance | | 0 |
| SelfPrompt: Autonomously Evaluating LLM Robustness via Domain-Constrained Knowledge Guidelines and Refined Adversarial Prompts | | 0 |
| EventGPT: Event Stream Understanding with Multimodal Large Language Models | | 0 |
| Enhancing Sketch Animation: Text-to-Video Diffusion Models with Temporal Consistency and Rigidity Constraints | | 0 |
| What's in the Image? A Deep-Dive into the Vision of Vision Language Models | | 0 |
| TechCoach: Towards Technical-Point-Aware Descriptive Action Coaching | | 0 |
| SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis | | 0 |
| Utilization and Profitability of Tractor Services for Maize Farming in Ejura-Sekyedumase Municipality, Ghana | | 0 |
| From MTEB to MTOB: Retrieval-Augmented Classification for Descriptive Grammars | Code | 0 |
| Omni-IML: Towards Unified Image Manipulation Localization | | 0 |
| MolReFlect: Towards In-Context Fine-grained Alignments between Molecules and Texts | | 0 |
| Proportional infinite-width infinite-depth limit for deep linear neural networks | | 0 |
| MolReFlect: Towards Fine-grained In-Context Alignment between Molecules and Texts | | 0 |
| The Explabox: Model-Agnostic Machine Learning Transparency & Analysis | | 0 |
| Uterine Ultrasound Image Captioning Using Deep Learning Techniques | | 0 |
| A Multimodal Approach Combining Structural and Cross-domain Textual Guidance for Weakly Supervised OCT Segmentation | Code | 0 |
| MMBind: Unleashing the Potential of Distributed and Heterogeneous Data for Multimodal Learning in IoT | | 0 |
| Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning | | 0 |
| Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | | 0 |
| Bridging the Visual Gap: Fine-Tuning Multimodal Models with Knowledge-Adapted Captions | Code | 0 |
| BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions | | 0 |
| Collaborative and Federated Black-box Optimization: A Bayesian Optimization Perspective | | 0 |
| An Empirical Implementation of the Shadow Riskless Rate | | 0 |
| UnDIVE: Generalized Underwater Video Enhancement Using Generative Priors | Code | 0 |
| Knowledge Distillation Neural Network for Predicting Car-following Behaviour of Human-driven and Autonomous Vehicles | | 0 |
