SOTAVerified

TextVQA

Papers

Showing 21-30 of 47 papers

| Title | Status | Hype |
| --- | --- | --- |
| Analysing the Robustness of Vision-Language-Models to Common Corruptions | | 0 |
| Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models | Code | 0 |
| InstructOCR: Instruction Boosting Scene Text Spotting | Code | 0 |
| Track the Answer: Extending TextVQA from Image to Video with Spatio-Temporal Clues | Code | 0 |
| HyViLM: Enhancing Fine-Grained Recognition with a Hybrid Encoder for Vision-Language Models | | 0 |
| Enhancing Instruction-Following Capability of Visual-Language Models by Reducing Image Redundancy | | 0 |
| EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model | | 0 |
| FlexAttention for Efficient High-Resolution Vision-Language Models | | 0 |
| DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs | | 0 |
| OmniFusion Technical Report | Code | 0 |

No leaderboard results yet.