DocFormerv2: Local Features for Document Understanding | Jun 2, 2023 | Decoder document understanding | Code Available 1
Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering | Jun 1, 2023 | Optical Character Recognition (OCR) Question Answering | Code Available 1
Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model | May 31, 2023 | Denoising Optical Character Recognition (OCR) | Unverified 0
A template-independent approach for information extraction in real estate documents | May 30, 2023 | Information Retrieval Natural Language Understanding | Code Available 0
DuoSearch: A Novel Search Engine for Bulgarian Historical Documents | May 30, 2023 | Optical Character Recognition Optical Character Recognition (OCR) | Code Available 0
GlyphControl: Glyph Conditional Control for Visual Text Generation | May 29, 2023 | Optical Character Recognition (OCR) Text Generation | Code Available 2
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions | May 28, 2023 | Attribute Image Captioning | Code Available 1
Exploring Better Text Image Translation with Multimodal Codebook | May 27, 2023 | Machine Translation Optical Character Recognition | Code Available 1
Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers | May 27, 2023 | Image Super-Resolution License Plate Recognition | Code Available 1
People and Places of Historical Europe: Bootstrapping Annotation Pipeline and a New Corpus of Named Entities in Late Medieval Texts | May 26, 2023 | Information Retrieval named-entity-recognition | Unverified 0
MRN: Multiplexed Routing Network for Incremental Multilingual Text Recognition | May 24, 2023 | Continual Learning Incremental Learning | Code Available 1
Quantifying Character Similarity with Vision Transformers | May 24, 2023 | Optical Character Recognition (OCR) | Code Available 0
DUBLIN -- Document Understanding By Language-Image Network | May 23, 2023 | Document Classification document understanding | Unverified 0
Measuring Intersectional Biases in Historical Documents | May 21, 2023 | Optical Character Recognition Optical Character Recognition (OCR) | Code Available 0
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | May 19, 2023 | In-Context Learning Multilingual NLP | Code Available 1
TextDiffuser: Diffusion Models as Text Painters | May 18, 2023 | Optical Character Recognition (OCR) | Unverified 0
Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding | May 16, 2023 | Decoder document understanding | Unverified 0
Mobile User Interface Element Detection Via Adaptively Prompt Tuning | May 16, 2023 | object-detection Object Detection | Code Available 0
OCRBench: On the Hidden Mystery of OCR in Large Multimodal Models | May 13, 2023 | Key Information Extraction Nutrition | Code Available 2
Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution | May 12, 2023 | Contrastive Learning Optical Character Recognition (OCR) | Code Available 1
Combining OCR Models for Reading Early Modern Printed Books | May 11, 2023 | Font Recognition Optical Character Recognition (OCR) | Code Available 0
TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | May 9, 2023 | Optical Character Recognition (OCR) Scene Text Recognition | Code Available 1
E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation | May 9, 2023 | Decoder Machine Translation | Code Available 0
Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation | May 4, 2023 | Optical Character Recognition (OCR) | Unverified 0
Evaluating BERT-based Scientific Relation Classifiers for Scholarly Knowledge Graph Construction on Digital Library Collections | May 3, 2023 | graph construction Optical Character Recognition | Unverified 0
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Apr 28, 2023 | Instruction Following model | Code Available 5
DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents | Apr 24, 2023 | Optical Character Recognition Optical Character Recognition (OCR) | Code Available 1
ICDAR 2023 Competition on Reading the Seal Title | Apr 24, 2023 | Optical Character Recognition (OCR) Task 2 | Unverified 0
Multimodal Short Video Rumor Detection System Based on Contrastive Learning | Apr 17, 2023 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified 0
TransDocs: Optical Character Recognition with word to word translation | Apr 15, 2023 | Deep Learning Document Translation | Code Available 0
Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts | Apr 7, 2023 | Optical Character Recognition Optical Character Recognition (OCR) | Unverified 0
Linking Representations with Multimodal Contrastive Learning | Apr 7, 2023 | Contrastive Learning Optical Character Recognition | Unverified 0
TagGPT: Large Language Models are Zero-shot Multimodal Taggers | Apr 6, 2023 | Optical Character Recognition (OCR) Prompt Engineering | Code Available 1
ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules | Apr 5, 2023 | Chart Understanding Derendering | Code Available 1
Efficient OCR for Building a Diverse Digital History | Apr 5, 2023 | Diversity Image Retrieval | Code Available 1
GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation | Mar 31, 2023 | Image Generation Optical Character Recognition (OCR) | Code Available 2
A semi-automatic method for document classification in the shipping industry | Mar 29, 2023 | Classification Document Classification | Unverified 0
OVeNet: Offset Vector Network for Semantic Segmentation | Mar 25, 2023 | Optical Character Recognition (OCR) Scene Understanding | Code Available 0
Optical Character Recognition and Transcription of Berber Signs from Images in a Low-Resource Language Amazigh | Mar 21, 2023 | Optical Character Recognition Optical Character Recognition (OCR) | Unverified 0
CLIP-ReIdent: Contrastive Training for Player Re-Identification | Mar 21, 2023 | Optical Character Recognition (OCR) Sports Analytics | Unverified 0
The System Description of dun_oscar team for The ICPR MSR Challenge | Mar 13, 2023 | Optical Character Recognition (OCR) | Unverified 0
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset | Mar 9, 2023 | Benchmarking Deep Learning | Code Available 0
Meme Sentiment Analysis Enhanced with Multimodal Spatial Encoding and Facial Embedding | Mar 3, 2023 | Optical Character Recognition (OCR) Position | Unverified 0
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training | Mar 1, 2023 | Document Image Classification image-classification | Code Available 0
Language Is Not All You Need: Aligning Perception with Language Models | Feb 27, 2023 | All Image Captioning | Unverified 0
User-Centric Evaluation of OCR Systems for Kwak'wala | Feb 26, 2023 | Optical Character Recognition Optical Character Recognition (OCR) | Unverified 0
Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification | Feb 16, 2023 | Few-Shot Image Classification Few-Shot Learning | Code Available 1
An Investigation into Pre-Training Object-Centric Representations for Reinforcement Learning | Feb 9, 2023 | Object Optical Character Recognition (OCR) | Unverified 0
SPARLING: Learning Latent Representations with Extremely Sparse Activations | Feb 3, 2023 | Optical Character Recognition (OCR) | Unverified 0
DEVICE: DEpth and VIsual ConcEpts Aware Transformer for TextCaps | Feb 3, 2023 | Image Captioning Optical Character Recognition (OCR) | Unverified 0