SOTAVerified

TextVQA

Papers

Showing 1-10 of 47 papers

Title | Status | Hype
Mitigating Object Hallucinations via Sentence-Level Early Intervention | Code | 1
TextSR: Diffusion Super-Resolution with Multilingual OCR Guidance | | 0
EvoMoE: Expert Evolution in Mixture of Experts for Multimodal Large Language Models | | 0
Analysing the Robustness of Vision-Language-Models to Common Corruptions | | 0
Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models | Code | 0
Parameter-Inverted Image Pyramid Networks for Visual Perception and Multimodal Understanding | Code | 2
What Kind of Visual Tokens Do We Need? Training-free Visual Token Pruning for Multi-modal Large Language Models from the Perspective of Graph | Code | 2
InstructOCR: Instruction Boosting Scene Text Spotting | Code | 0
Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues | Code | 0
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition | Code | 3
Page 1 of 5

No leaderboard results yet.