MathWriting: A Dataset For Handwritten Mathematical Expression Recognition | Apr 16, 2024 | Form; Optical Character Recognition (OCR) | Unverified | 0
TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content | Apr 16, 2024 | Information Retrieval; Knowledge Graphs | Unverified | 0
Resilience of Large Language Models for Noisy Instructions | Apr 15, 2024 | Automatic Speech Recognition; Optical Character Recognition | Unverified | 0
Convolution-based Probability Gradient Loss for Semantic Segmentation | Apr 10, 2024 | Optical Character Recognition (OCR); Semantic Segmentation | Code Available | 0
Making Old Kurdish Publications Processable by Augmenting Available Optical Character Recognition Engines | Apr 9, 2024 | Optical Character Recognition; Optical Character Recognition (OCR) | Unverified | 0
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? | Apr 9, 2024 | Optical Character Recognition (OCR) | Code Available | 2
NAF-DPM: A Nonlinear Activation-Free Diffusion Probabilistic Model for Document Enhancement | Apr 8, 2024 | Binarization; Document Enhancement | Code Available | 2
HAMMR: HierArchical MultiModal React agents for generic VQA | Apr 8, 2024 | Optical Character Recognition (OCR); Question Answering | Unverified | 0
Design and Development of a Framework For Stroke-Based Handwritten Gujarati Font Generation | Apr 4, 2024 | Font Generation; Optical Character Recognition (OCR) | Unverified | 0
CMULAB: An Open-Source Framework for Training and Deployment of Natural Language Processing Models | Apr 3, 2024 | Optical Character Recognition (OCR); speech-recognition | Code Available | 1
Optical Text Recognition in Nepali and Bengali: A Transformer-based Approach | Apr 3, 2024 | Decoder; Machine Translation | Unverified | 0
RealKIE: Five Novel Datasets for Enterprise Key Information Extraction | Mar 29, 2024 | Key Information Extraction; Optical Character Recognition (OCR) | Unverified | 0
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want | Mar 29, 2024 | Instruction Following; Language Modelling | Code Available | 2
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages | Mar 26, 2024 | Machine Reading Comprehension; Optical Character Recognition (OCR) | Code Available | 1
SciCapenter: Supporting Caption Composition for Scientific Figures with Machine-Generated Captions and Ratings | Mar 26, 2024 | Optical Character Recognition (OCR) | Unverified | 0
The Solution for the ICCV 2023 1st Scientific Figure Captioning Challenge | Mar 26, 2024 | Caption Generation; Image Captioning | Unverified | 0
Grammatical vs Spelling Error Correction: An Investigation into the Responsiveness of Transformer-based Language Models using BART and MarianMT | Mar 25, 2024 | Optical Character Recognition (OCR); speech-recognition | Unverified | 0
Visually Guided Generative Text-Layout Pre-training for Document Intelligence | Mar 25, 2024 | Document Classification; document understanding | Code Available | 2
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation | Mar 25, 2024 | Image Generation; Optical Character Recognition (OCR) | Unverified | 0
PEaCE: A Chemistry-Oriented Dataset for Optical Character Recognition on Scientific Documents | Mar 23, 2024 | Articles; Optical Character Recognition | Code Available | 1
Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs | Mar 19, 2024 | Chart Question Answering; Optical Character Recognition (OCR) | Unverified | 0
mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding | Mar 19, 2024 | document understanding; Optical Character Recognition (OCR) | Unverified | 0
Financial Table Extraction in Image Documents | Mar 18, 2024 | Image Segmentation; Optical Character Recognition (OCR) | Unverified | 0
Advancing Multilingual Handwritten Numeral Recognition with Attention-driven Transfer Learning | Mar 18, 2024 | Handwritten Digit Recognition; Optical Character Recognition | Code Available | 0
OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System | Mar 18, 2024 | All; Decision Making | Unverified | 0
Advanced Knowledge Extraction of Physical Design Drawings, Translation and conversion to CAD formats using Deep Learning | Mar 17, 2024 | Edge Detection; Line Detection | Unverified | 0
TextBlockV2: Towards Precise-Detection-Free Scene Text Spotting with Pre-trained Language Model | Mar 15, 2024 | Language Modeling; Language Modelling | Unverified | 0
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation | Mar 14, 2024 | Image to text; Optical Character Recognition (OCR) | Unverified | 0
Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering | Mar 14, 2024 | Optical Character Recognition; Optical Character Recognition (OCR) | Code Available | 0
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking | Mar 13, 2024 | Chinese Spell Checking; In-Context Learning | Unverified | 0
The future of document indexing: GPT and Donut revolutionize table of content processing | Mar 12, 2024 | Language Modeling; Language Modelling | Unverified | 0
Open-Vocabulary Scene Text Recognition via Pseudo-Image Labeling and Margin Loss | Mar 12, 2024 | Image Inpainting; Optical Character Recognition (OCR) | Unverified | 0
DeepSeek-VL: Towards Real-World Vision-Language Understanding | Mar 8, 2024 | Chatbot; Language Modelling | Code Available | 7
TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document | Mar 7, 2024 | document understanding; Key Information Extraction | Code Available | 5
Multimodal Transformer for Comics Text-Cloze | Mar 6, 2024 | Language Modeling; Language Modelling | Unverified | 0
LOCR: Location-Guided Transformer for Optical Character Recognition | Mar 4, 2024 | Marketing; Optical Character Recognition | Unverified | 0
Large Language Models for Simultaneous Named Entity Extraction and Spelling Correction | Mar 1, 2024 | Decoder; Optical Character Recognition | Unverified | 0
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | Mar 1, 2024 | Optical Character Recognition; Optical Character Recognition (OCR) | Code Available | 1
Advancing Generative Model Evaluation: A Novel Algorithm for Realistic Image Synthesis and Comparison in OCR System | Feb 27, 2024 | Image Generation; Optical Character Recognition (OCR) | Unverified | 0
Representing Online Handwriting for Recognition in Large Vision-Language Models | Feb 23, 2024 | Handwriting Recognition; Optical Character Recognition | Unverified | 0
Syntactic Language Change in English and German: Metrics, Parsers, and Convergences | Feb 18, 2024 | Optical Character Recognition (OCR); Sentence | Code Available | 0
TEXTRON: Weakly Supervised Multilingual Text Detection through Data Programming | Feb 15, 2024 | Optical Character Recognition (OCR); Text Detection | Code Available | 1
Beyond the Mud: Datasets and Benchmarks for Computer Vision in Off-Road Racing | Feb 12, 2024 | Optical Character Recognition; Optical Character Recognition (OCR) | Unverified | 0
ClusterTabNet: Supervised clustering method for table detection and table structure recognition | Feb 12, 2024 | Clustering; Optical Character Recognition (OCR) | Code Available | 1
Segmentation-free Connectionist Temporal Classification loss based OCR Model for Text Captcha Classification | Feb 8, 2024 | CAPTCHA Detection; Classification | Unverified | 0
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Feb 8, 2024 | Benchmarking; Diversity | Code Available | 7
Enhancement of Bengali OCR by Specialized Models and Advanced Techniques for Diverse Document Types | Feb 7, 2024 | Optical Character Recognition (OCR); Table Recognition | Unverified | 0
ExTTNet: A Deep Learning Algorithm for Extracting Table Texts from Invoice Images | Feb 3, 2024 | Optical Character Recognition; Optical Character Recognition (OCR) | Unverified | 0
From Training-Free to Adaptive: Empirical Insights into MLLMs' Understanding of Detection Information | Jan 31, 2024 | Hallucination; object-detection | Unverified | 0
MouSi: Poly-Visual-Expert Vision-Language Models | Jan 30, 2024 | Image Segmentation; Image-text matching | Code Available | 2