MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts | Oct 3, 2023 | Chatbot, Image Captioning | Code Available | 2
Constructing Image-Text Pair Dataset from Books | Oct 3, 2023 | Image-text Retrieval, Optical Character Recognition (OCR) | Unverified | 0
Comprehensive Overview of Named Entity Recognition: Models, Domain-Specific Applications and Challenges | Sep 25, 2023 | named-entity-recognition, Named Entity Recognition | Unverified | 0
Order-preserving Consistency Regularization for Domain Adaptation and Generalization | Sep 23, 2023 | Data Augmentation, Domain Adaptation | Code Available | 0
STEP -- Towards Structured Scene-Text Spotting | Sep 5, 2023 | Optical Character Recognition (OCR), Scene Text Detection | Code Available | 0
Bengali Document Layout Analysis -- A YOLOV8 Based Ensembling Approach | Sep 2, 2023 | Data Augmentation, Document Layout Analysis | Unverified | 0
Separate and Locate: Rethink the Text in Text-based Visual Question Answering | Aug 31, 2023 | Optical Character Recognition (OCR), Position | Code Available | 0
DTrOCR: Decoder-only Transformer for Optical Character Recognition | Aug 30, 2023 | Decoder, Handwritten Text Recognition | Code Available | 2
Enhancing OCR Performance through Post-OCR Models: Adopting Glyph Embedding for Improved Correction | Aug 29, 2023 | Optical Character Recognition (OCR) | Unverified | 0
Vision Grid Transformer for Document Layout Analysis | Aug 29, 2023 | Document AI, Document Layout Analysis | Unverified | 0
Optimal Projections for Discriminative Dictionary Learning using the JL-lemma | Aug 27, 2023 | Dictionary Learning, Dimensionality Reduction | Code Available | 0
Bengali Document Layout Analysis with Detectron2 | Aug 26, 2023 | Data Augmentation, Document Layout Analysis | Unverified | 0
DISGO: Automatic End-to-End Evaluation for Scene Text OCR | Aug 25, 2023 | Machine Translation, Optical Character Recognition | Unverified | 0
Nougat: Neural Optical Understanding for Academic Documents | Aug 25, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 5
American Stories: A Large-Scale Structured Text Dataset of Historical U.S. Newspapers | Aug 24, 2023 | Articles, Language Modeling | Unverified | 0
CNN based Cuneiform Sign Detection Learned from Annotated 3D Renderings and Mapped Photographs with Illumination Augmentation | Aug 22, 2023 | Optical Character Recognition (OCR) | Unverified | 0
bbOCR: An Open-source Multi-domain OCR Pipeline for Bengali Documents | Aug 21, 2023 | distortion correction, Optical Character Recognition | Code Available | 1
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions | Aug 19, 2023 | MME, Optical Character Recognition (OCR) | Code Available | 2
OCR Language Models with Custom Vocabularies | Aug 18, 2023 | Decoder, Language Modeling | Unverified | 0
FashionLOGO: Prompting Multimodal Large Language Models for Fashion Logo Embeddings | Aug 17, 2023 | Image Retrieval, Logo Recognition | Code Available | 0
OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation | Aug 8, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1
Training BERT Models to Carry Over a Coding System Developed on One Corpus to Another | Aug 7, 2023 | Domain Adaptation, Optical Character Recognition (OCR) | Unverified | 0
Universal Defensive Underpainting Patch: Making Your Text Invisible to Optical Character Recognition | Aug 4, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Code Available | 1
Toward Zero-shot Character Recognition: A Gold Standard Dataset with Radical-level Annotations | Aug 1, 2023 | Denoising, Image Denoising | Unverified | 0
Making the V in Text-VQA Matter | Aug 1, 2023 | Optical Character Recognition (OCR), TextVQA | Unverified | 0
Optimizing the Neural Network Training for OCR Error Correction of Historical Hebrew Texts | Jul 30, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0
Toward a Period-Specific Optimized Neural Network for OCR Error Correction of Historical Hebrew Texts | Jul 30, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0
Augmented Math: Authoring AR-Based Explorable Explanations by Augmenting Static Math Textbooks | Jul 30, 2023 | Math, Optical Character Recognition | Code Available | 0
Multi-Granularity Prediction with Learnable Fusion for Scene Text Recognition | Jul 25, 2023 | Language Modelling, Optical Character Recognition (OCR) | Unverified | 0
MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary | Jul 24, 2023 | document understanding, Optical Character Recognition (OCR) | Unverified | 0
A comparative analysis of SRGAN models | Jul 18, 2023 | Generative Adversarial Network, Image Super-Resolution | Unverified | 0
Modular Multimodal Machine Learning for Extraction of Theorems and Proofs in Long Scientific Documents (Extended Version) | Jul 18, 2023 | Articles, Document AI | Code Available | 1
Handwritten and Printed Text Segmentation: A Signature Case Study | Jul 15, 2023 | Binary Classification, Optical Character Recognition | Unverified | 0
Handwritten Text Recognition Using Convolutional Neural Network | Jul 11, 2023 | Handwritten Text Recognition, Optical Character Recognition | Unverified | 0
A Novel Pipeline for Improving Optical Character Recognition through Post-processing Using Natural Language Processing | Jul 9, 2023 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0
Artificial Eye for the Blind | Jul 7, 2023 | Object, object-detection | Unverified | 0
mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding | Jul 4, 2023 | document understanding, Language Modeling | Unverified | 0
Estimating Post-OCR Denoising Complexity on Numerical Texts | Jul 3, 2023 | Denoising, Optical Character Recognition (OCR) | Unverified | 0
Fraunhofer SIT at CheckThat! 2023: Mixing Single-Modal Classifiers to Estimate the Check-Worthiness of Multi-Modal Tweets | Jul 2, 2023 | Fact Checking, Optical Character Recognition (OCR) | Unverified | 0
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding | Jun 29, 2023 | 16k, Image Captioning | Code Available | 2
UTRNet: High-Resolution Urdu Text Recognition In Printed Documents | Jun 27, 2023 | Line Detection, Optical Character Recognition (OCR) | Code Available | 1
Resume Information Extraction via Post-OCR Text Processing | Jun 23, 2023 | Object Recognition, Optical Character Recognition | Unverified | 0
A Survey on Multimodal Large Language Models | Jun 23, 2023 | Hallucination, In-Context Learning | Unverified | 0
Document Image Cleaning using Budget-Aware Black-Box Approximation | Jun 22, 2023 | Optical Character Recognition (OCR) | Code Available | 0
GenPlot: Increasing the Scale and Diversity of Chart Derendering Data | Jun 20, 2023 | Derendering, Diversity | Code Available | 1
Weakly supervised information extraction from inscrutable handwritten document images | Jun 12, 2023 | Language Modeling, Language Modelling | Unverified | 0
When Vision Fails: Text Attacks Against ViT and OCR | Jun 12, 2023 | Optical Character Recognition (OCR) | Code Available | 0
SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning | Jun 6, 2023 | Caption Generation, Image Captioning | Code Available | 0
Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents | Jun 5, 2023 | Denoising, Document Classification | Unverified | 0
TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain | Jun 3, 2023 | Benchmarking, Decoder | Code Available | 1