| MultiQG-TI: Towards Question Generation from Multi-modal Sources | Jul 7, 2023 | Image to text, Optical Character Recognition | Code Available | 0 |
| T-MARS: Improving Visual Representations by Circumventing Text Feature Learning | Jul 6, 2023 | Optical Character Recognition | Code Available | 1 |
| Resume Information Extraction via Post-OCR Text Processing | Jun 23, 2023 | Object Recognition, Optical Character Recognition | Unverified | 0 |
| A Survey on Multimodal Large Language Models | Jun 23, 2023 | Hallucination, In-Context Learning | Unverified | 0 |
| Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents | Jun 5, 2023 | Denoising, Document Classification | Unverified | 0 |
| TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain | Jun 3, 2023 | Benchmarking, Decoder | Code Available | 1 |
| DuoSearch: A Novel Search Engine for Bulgarian Historical Documents | May 30, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |
| Exploring Better Text Image Translation with Multimodal Codebook | May 27, 2023 | Machine Translation, Optical Character Recognition | Code Available | 1 |
| Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers | May 27, 2023 | Image Super-Resolution, License Plate Recognition | Code Available | 1 |
| Measuring Intersectional Biases in Historical Documents | May 21, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |
| OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models | May 13, 2023 | Key Information Extraction, Nutrition | Code Available | 2 |
| E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation | May 9, 2023 | Decoder, Machine Translation | Code Available | 0 |
| Evaluating BERT-based Scientific Relation Classifiers for Scholarly Knowledge Graph Construction on Digital Library Collections | May 3, 2023 | graph construction, Optical Character Recognition | Unverified | 0 |
| DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents | Apr 24, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 1 |
| Multimodal Short Video Rumor Detection System Based on Contrastive Learning | Apr 17, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| TransDocs: Optical Character Recognition with word to word translation | Apr 15, 2023 | Deep Learning, Document Translation | Code Available | 0 |
| Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts | Apr 7, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Linking Representations with Multimodal Contrastive Learning | Apr 7, 2023 | Contrastive Learning, Optical Character Recognition | Unverified | 0 |
| Efficient OCR for Building a Diverse Digital History | Apr 5, 2023 | Diversity, Image Retrieval | Code Available | 1 |
| A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision | Mar 30, 2023 | Decoder, Multi-Task Learning | Unverified | 0 |
| Optical Character Recognition and Transcription of Berber Signs from Images in a Low-Resource Language Amazigh | Mar 21, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset | Mar 9, 2023 | Benchmarking, Deep Learning | Code Available | 0 |
| User-Centric Evaluation of OCR Systems for Kwak'wala | Feb 26, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification | Feb 16, 2023 | Few-Shot Image Classification, Few-Shot Learning | Code Available | 1 |
| Noisy Parallel Data Alignment | Jan 23, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 0 |