| GeoContrastNet: Contrastive Key-Value Edge Learning for Language-Agnostic Document Understanding | May 6, 2024 | Contrastive Learning, document understanding | Code Available | 0 |
| DELINE8K: A Synthetic Data Pipeline for the Semantic Segmentation of Historical Documents | Apr 30, 2024 | 8k, Diversity | Code Available | 0 |
| Multi-Page Document Visual Question Answering using Self-Attention Scoring Mechanism | Apr 29, 2024 | document understanding, GPU | Code Available | 0 |
| ViOCRVQA: Novel Benchmark Dataset and Vision Reader for Visual Question Answering by Understanding Vietnamese Text in Images | Apr 29, 2024 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 1 |
| Mixed Text Recognition with Efficient Parameter Fine-Tuning and Transformer | Apr 19, 2024 | Decoder, Optical Character Recognition | Unverified | 0 |
| Resilience of Large Language Models for Noisy Instructions | Apr 15, 2024 | Automatic Speech Recognition, Optical Character Recognition | Unverified | 0 |
| TEXT2TASTE: A Versatile Egocentric Vision System for Intelligent Reading Assistance Using Large Language Model | Apr 14, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Making Old Kurdish Publications Processable by Augmenting Available Optical Character Recognition Engines | Apr 9, 2024 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement | Apr 8, 2024 | Binarization, Document Enhancement | Code Available | 2 |
| PEaCE: A Chemistry-Oriented Dataset for Optical Character Recognition on Scientific Documents | Mar 23, 2024 | Articles, Optical Character Recognition | Code Available | 1 |