| Knowledge Discovery in Optical Music Recognition: Enhancing Information Retrieval with Instance Segmentation | Aug 27, 2024 | Information Retrieval, Instance Segmentation | Unverified | 0 |
| A Permuted Autoregressive Approach to Word-Level Recognition for Urdu Digital Text | Aug 27, 2024 | Data Augmentation, Optical Character Recognition | Unverified | 0 |
| Ancient but Digitized: Developing Handwritten Optical Character Recognition for East Syriac Script Through Creating KHAMIS Dataset | Aug 24, 2024 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese | Aug 22, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Large Language Models for Page Stream Segmentation | Aug 21, 2024 | Decoder, Optical Character Recognition | Unverified | 0 |
| ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area | Aug 14, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning | Aug 10, 2024 | Hallucination, Optical Character Recognition | Code Available | 11 |
| Revisiting Multi-Modal LLM Evaluation | Aug 9, 2024 | Chart Understanding, Optical Character Recognition | Unverified | 0 |
| Handwritten Code Recognition for Pen-and-Paper CS Education | Aug 7, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| Focus, Distinguish, and Prompt: Unleashing CLIP for Efficient and Flexible Scene Text Retrieval | Aug 1, 2024 | Attribute, Optical Character Recognition | Code Available | 1 |