| Mind the Gap: Analyzing Lacunae with Transformer-Based Transcription | Jun 28, 2024 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst | Jun 14, 2024 | Image Captioning, Language Modeling | Unverified | 0 |
| M3T: A New Benchmark Dataset for Multi-Modal Document-Level Machine Translation | Jun 12, 2024 | Document Level Machine Translation, Document Translation | Code Available | 0 |
| VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text | Jun 10, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Scaling Automatic Extraction of Pseudocode | Jun 7, 2024 | Code Generation, Optical Character Recognition | Unverified | 0 |
| CORU: Comprehensive Post-OCR Parsing and Receipt Understanding Dataset | Jun 6, 2024 | object-detection, Object Detection | Code Available | 1 |
| Generalized Jersey Number Recognition Using Multi-task Learning With Orientation-guided Weight Refinement | Jun 3, 2024 | Jersey Number Recognition, Multi-Task Learning | Unverified | 0 |
| Vision Language Models for Spreadsheet Understanding: Challenges and Opportunities | May 25, 2024 | Boundary Detection, Optical Character Recognition | Unverified | 0 |
| Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition | May 23, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Transfer Learning Approach for Railway Technical Map (RTM) Component Identification | May 21, 2024 | Management, object-detection | Unverified | 0 |