SOTAVerified

Image to text

Papers

Showing 76–100 of 246 papers

Title | Status | Hype
A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning | Code | 0
SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects | Code | 0
Towards a text-based quantitative and explainable histopathology image analysis | Code | 0
Aligning Multilingual Word Embeddings for Cross-Modal Retrieval Task | Code | 0
Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs) | Code | 0
RoCOCO: Robustness Benchmark of MS-COCO to Stress-test Image-Text Matching Models | Code | 0
Self-Supervised Image-to-Text and Text-to-Image Synthesis | Code | 0
PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval | Code | 0
Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models | Code | 0
Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search | Code | 0
Probing Multimodal Large Language Models for Global and Local Semantic Representations | Code | 0
Real-world validation of a multimodal LLM-powered pipeline for High-Accuracy Clinical Trial Patient Matching leveraging EHR data | Code | 0
Adaptively Clustering Neighbor Elements for Image-Text Generation | Code | 0
Benchmarking Vision-Language Contrastive Methods for Medical Representation Learning | Code | 0
Pragmatic Radiology Report Generation | Code | 0
PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval | Code | 0
GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models | Code | 0
CLIP-FSAC++: Few-Shot Anomaly Classification with Anomaly Descriptor Based on CLIP | Code | 0
Face2Text: Collecting an Annotated Image Description Corpus for the Generation of Rich Face Descriptions | Code | 0
CLIP-based Synergistic Knowledge Transfer for Text-based Person Retrieval | Code | 0
Characterizing and Understanding the Behavior of Quantized Models for Reliable Deployment | Code | 0
Multi-LLM Collaborative Caption Generation in Scientific Documents | Code | 0
Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation | Code | 0
MirrorGAN: Learning Text-to-image Generation by Redescription | Code | 0
Effective Use of Word Order for Text Categorization with Convolutional Neural Networks | Code | 0

No leaderboard results yet.