SOTAVerified

Image to text

Papers

Showing 51–60 of 246 papers

Title | Status | Hype
TrojVLM: Backdoor Attack Against Vision Language Models | - | 0
Robotic Environmental State Recognition with Pre-Trained Vision-Language Models and Black-Box Optimization | - | 0
Evaluating authenticity and quality of image captions via sentiment and semantic analyses | - | 0
See or Guess: Counterfactually Regularized Image Captioning | Code | 1
UniFashion: A Unified Vision-Language Model for Multimodal Fashion Retrieval and Generation | Code | 1
Ask, Attend, Attack: A Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models | - | 0
In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | Code | 2
Instruction Tuning-free Visual Token Complement for Multimodal LLMs | - | 0
GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models | Code | 0
Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities | Code | 2

No leaderboard results yet.