TextVQA

Papers

Showing 1–10 of 47 papers

Title	Date	Tasks	Status	Hype
Mitigating Object Hallucinations via Sentence-Level Early Intervention	Jul 16, 2025	Hallucination, MM-Vet	Code Available	1
TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance	May 29, 2025	Image Super-Resolution, Optical Character Recognition	Unverified	0
EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models	May 28, 2025	Mixture-of-Experts, MME	Unverified	0
Analysing the Robustness of Vision-Language-Models to Common Corruptions	Apr 18, 2025	TextVQA	Unverified	0
Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models	Mar 24, 2025	MME, TextVQA	Code Available	0
Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding	Jan 14, 2025	Image Classification	Code Available	2
What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph	Jan 4, 2025	TextVQA	Code Available	2
InstructOCR: Instruction Boosting Scene Text Spotting	Dec 20, 2024	Optical Character Recognition (OCR), Text Spotting	Code Available	0
Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues	Dec 17, 2024	Language Modeling	Code Available	0
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition	Dec 12, 2024	EgoSchema	Code Available	3


No leaderboard results yet.