| Lost in OCR Translation? Vision-Based Approaches to Robust Document Retrieval | May 8, 2025 | Computational Efficiency, Optical Character Recognition | Unverified | 0 |
| ChemRxivQuest: A Curated Chemistry Question-Answer Database Extracted from ChemRxiv Preprints | May 8, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| DOTA: Deformable Optimized Transformer Architecture for End-to-End Text Recognition with Retrieval-Augmented Generation | May 7, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Automated Parsing of Engineering Drawings for Structured Information Extraction Using a Fine-tuned Document Understanding Transformer | May 2, 2025 | document understanding, Hallucination | Unverified | 0 |
| Evaluating Menu OCR and Translation: A Benchmark for Aligning Human and Automated Evaluations in Large Vision-Language Models | Apr 16, 2025 | document understanding, Layout Design | Code Available | 0 |
| Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR | Apr 15, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Relation-Rich Visual Document Generator for Visual Information Extraction | Apr 14, 2025 | Diversity, document understanding | Code Available | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | Benchmarking, Document AI | Unverified | 0 |
| Towards Calibration Enhanced Network by Inverse Adversarial Attack | Apr 8, 2025 | Adversarial Attack, Optical Character Recognition | Unverified | 0 |
| Playing Non-Embedded Card-Based Games with Reinforcement Learning | Apr 7, 2025 | Board Games, Decision Making | Code Available | 3 |
| Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents | Apr 1, 2025 | named-entity-recognition, Named Entity Recognition | Code Available | 1 |
| Context-Independent OCR with Multimodal LLMs: Effects of Image Resolution and Visual Complexity | Mar 31, 2025 | Image Captioning, Optical Character Recognition | Unverified | 0 |
| TFIC: End-to-End Text-Focused Image Compression for Coding for Machines | Mar 25, 2025 | Image Compression, Optical Character Recognition | Unverified | 0 |
| AI-Driven Multi-Stage Computer Vision System for Defect Detection in Laser-Engraved Industrial Nameplates | Mar 5, 2025 | Anomaly Detection, Defect Detection | Unverified | 0 |
| Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription | Feb 27, 2025 | Handwritten Text Recognition, HTR | Code Available | 0 |
| MultiOCR-QA: Dataset for Evaluating Robustness of LLMs in Question Answering on Multilingual OCR Texts | Feb 24, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |
| KITAB-Bench: A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding | Feb 20, 2025 | document understanding, Optical Character Recognition | Unverified | 0 |
| Reading the unreadable: Creating a dataset of 19th century English newspapers using image-to-text language models | Feb 18, 2025 | Image to text, Optical Character Recognition | Code Available | 0 |
| Visual Graph Question Answering with ASP and LLMs for Language Parsing | Feb 13, 2025 | Graph Question Answering, Optical Character Recognition | Unverified | 0 |
| Benchmarking Vision-Language Models on Optical Character Recognition in Dynamic Video Environments | Feb 10, 2025 | Benchmarking, Optical Character Recognition | Code Available | 1 |
| Éclair -- Extracting Content and Layout with Integrated Reading Order for Documents | Feb 6, 2025 | Image Captioning, Optical Character Recognition | Unverified | 0 |
| LoCoML: A Framework for Real-World ML Inference Pipelines | Jan 24, 2025 | Automatic Speech Recognition, Machine Translation | Unverified | 0 |
| Exploring AI-based System Design for Pixel-level Protected Health Information Detection in Medical Images | Jan 16, 2025 | De-identification, Optical Character Recognition | Unverified | 0 |
| Comparative analysis of optical character recognition methods for Sámi texts from the National Library of Norway | Jan 13, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |
| Efficient License Plate Recognition in Videos Using Visual Rhythm and Accumulative Line Analysis | Jan 8, 2025 | License Plate Detection, License Plate Recognition | Code Available | 0 |