| Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR | Apr 15, 2025 | Optical Character Recognition (OCR) | Unverified | 0 |
| Relation-Rich Visual Document Generator for Visual Information Extraction | Apr 14, 2025 | Diversity, document understanding | Code Available | 0 |
| NoTeS-Bank: Benchmarking Neural Transcription and Search for Scientific Notes Understanding | Apr 12, 2025 | Benchmarking, Document AI | Unverified | 0 |
| Towards Calibration Enhanced Network by Inverse Adversarial Attack | Apr 8, 2025 | Adversarial Attack, Optical Character Recognition | Unverified | 0 |
| Playing Non-Embedded Card-Based Games with Reinforcement Learning | Apr 7, 2025 | Board Games, Decision Making | Code Available | 3 |
| Multimodal LLMs for OCR, OCR Post-Correction, and Named Entity Recognition in Historical Documents | Apr 1, 2025 | Named Entity Recognition | Code Available | 1 |
| Context-Independent OCR with Multimodal LLMs: Effects of Image Resolution and Visual Complexity | Mar 31, 2025 | Image Captioning, Optical Character Recognition | Unverified | 0 |
| TFIC: End-to-End Text-Focused Image Compression for Coding for Machines | Mar 25, 2025 | Image Compression, Optical Character Recognition | Unverified | 0 |
| AI-Driven Multi-Stage Computer Vision System for Defect Detection in Laser-Engraved Industrial Nameplates | Mar 5, 2025 | Anomaly Detection, Defect Detection | Unverified | 0 |
| Judge a Book by its Cover: Investigating Multi-Modal LLMs for Multi-Page Handwritten Document Transcription | Feb 27, 2025 | Handwritten Text Recognition, HTR | Code Available | 0 |