Image to text

Papers


Showing 151–175 of 246 papers

Title	Date	Tasks	Status
RefineNet: Enhancing Text-to-Image Conversion with High-Resolution and Detail Accuracy through Hierarchical Transformers and Progressive Refinement	Dec 27, 2023	Computational Efficiency, Image Generation	Unverified
DiffuVST: Narrating Fictional Scenes with Global-History-Guided Denoising Models	Dec 12, 2023	Denoising, Diversity	Unverified
Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval	Dec 4, 2023	Attribute, Cross-Modal Person Re-Identification	Unverified
Learning Pseudo-Labeler beyond Noun Concepts for Open-Vocabulary Object Detection	Dec 4, 2023	Image to text, object-detection	Unverified
Pragmatic Radiology Report Generation	Nov 28, 2023	Image to text	Code Available
Beyond Images: An Integrative Multi-modal Approach to Chest X-Ray Report Generation	Nov 18, 2023	Image to text, Semantic Similarity	Unverified
AI Recommendation System for Enhanced Customer Experience: A Novel Image-to-Text Method	Nov 16, 2023	Image to text, Object	Unverified
Efficient End-to-End Visual Document Understanding with Rationale Distillation	Nov 16, 2023	document understanding, Image to text	Unverified
Semantically Grounded QFormer for Efficient Vision Language Understanding	Nov 13, 2023	Diversity, Image to text	Unverified
GPT-4V(ision) as a Generalist Evaluator for Vision-Language Tasks	Nov 2, 2023	Image Generation, Image to text	Unverified
Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning	Oct 12, 2023	Image Captioning, Image-text Retrieval	Unverified
SingleInsert: Inserting New Concepts from a Single Image into Text-to-Image Models for Flexible Editing	Oct 12, 2023	Image Generation, Image to text	Unverified
Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API	Oct 7, 2023	Decoder, document understanding	Unverified
Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency	Oct 5, 2023	Image Generation, Image to text	Unverified
Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search	Sep 28, 2023	cross-modal alignment, Cross-Modal Retrieval	Code Available
SurrogatePrompt: Bypassing the Safety Filter of Text-to-Image Models via Substitution	Sep 25, 2023	Image to text	Unverified
CLIP-based Synergistic Knowledge Transfer for Text-based Person Retrieval	Sep 18, 2023	Image to text, Person Retrieval	Code Available
Offline Detection of Misspelled Handwritten Words by Convolving Recognition Model Features with Text Labels	Sep 18, 2023	Generative Adversarial Network, Handwriting Recognition	Unverified
BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification	Sep 9, 2023	Image to text, Language Modeling	Unverified
Sequential Semantic Generative Communication for Progressive Text-to-Image Generation	Sep 8, 2023	Image Generation, Image to text	Unverified
GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training	Aug 22, 2023	image-classification, Image Classification	Unverified
Multimodal Neurons in Pretrained Text-Only Transformers	Aug 3, 2023	Image Captioning, Image to text	Unverified
Revisiting DETR Pre-training for Object Detection	Aug 2, 2023	Image to text, Object	Unverified
Towards a Visual-Language Foundation Model for Computational Pathology	Jul 24, 2023	Contrastive Learning, image-classification	Unverified
PiTL: Cross-modal Retrieval with Weakly-supervised Vision-language Pre-training via Prompting	Jul 14, 2023	Cross-Modal Retrieval, Image to text	Unverified

