| EAFormer: Scene Text Segmentation with Edge-Aware Transformers | Jul 24, 2024 | Decoder, Segmentation | Code Available | 3 |
| Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Jan 31, 2024 | Hierarchical Text Segmentation, parameter-efficient fine-tuning | Code Available | 3 |
| Meta-Chunking: Learning Text Segmentation and Semantic Completion via Logical Perception | Oct 16, 2024 | Binary Classification, Chunking | Code Available | 3 |
| Self-supervised Implicit Glyph Attention for Text Recognition | Mar 7, 2022 | Scene Text Recognition, Text Segmentation | Code Available | 1 |
| Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach | Nov 27, 2020 | Segmentation, Style Transfer | Code Available | 1 |
| PSSTRNet: Progressive Segmentation-guided Scene Text Removal Network | Jun 13, 2023 | Decoder, Segmentation | Code Available | 1 |
| DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding | Nov 28, 2022 | object-detection, Object Detection | Code Available | 1 |
| Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation | Jan 3, 2020 | Cross-Lingual Word Embeddings, Multi-Task Learning | Code Available | 1 |
| Chapter Captor: Text Segmentation in Novels | Nov 9, 2020 | Segmentation, Text Segmentation | Code Available | 1 |
| WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition | Oct 7, 2021 | Label Error Detection, Optical Character Recognition | Code Available | 1 |
| Khmer Word Segmentation Using Conditional Random Fields | Oct 15, 2015 | Segmentation, Text Segmentation | Code Available | 1 |
| ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations | Feb 16, 2025 | Text Segmentation | Code Available | 1 |
| CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases | Oct 27, 2016 | Joint Entity and Relation Extraction, Relation | Code Available | 1 |
| Self-supervised Character-to-Character Distillation for Text Recognition | Nov 1, 2022 | Data Augmentation, Representation Learning | Code Available | 1 |
| WAS: Dataset and Methods for Artistic Text Segmentation | Jul 31, 2024 | Decoder, Diversity | Code Available | 1 |
| Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model | Jan 6, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks | Nov 29, 2022 | Avg, Binarization | Code Available | 1 |
| Toward Unifying Text Segmentation and Long Document Summarization | Oct 28, 2022 | Articles, Document Summarization | Code Available | 1 |
| CCDWT-GAN: Generative Adversarial Networks Based on Color Channel Using Discrete Wavelet Transform for Document Image Binarization | May 27, 2023 | Binarization, Image Enhancement | Code Available | 1 |
| Filtered Semi-Markov CRF | Nov 29, 2023 | named-entity-recognition, Named Entity Recognition | Code Available | 1 |
| Text Segmentation by Cross Segment Attention | Apr 30, 2020 | Discourse Segmentation, Information Retrieval | Code Available | 1 |
| Structural Text Segmentation of Legal Documents | Dec 7, 2020 | Change Detection, Information Retrieval | Code Available | 1 |
| A Comparative Study of Sentence Embedding Models for Assessing Semantic Variation | Aug 8, 2023 | Document Summarization, Semantic Similarity | Unverified | 0 |
| Color and Gradient Features for Text Segmentation from Video Frames | Aug 22, 2017 | Clustering, Text Segmentation | Unverified | 0 |
| Inforex -- a web-based tool for text corpus management and semantic annotation | May 1, 2012 | Management, Named Entity Recognition (NER) | Unverified | 0 |
| COCO_TS Dataset: Pixel-level Annotations Based on Weak Supervision for Scene Text Segmentation | Apr 1, 2019 | Segmentation, Semantic Segmentation | Unverified | 0 |
| A Reliable Knowledge Processing Framework for Combustion Science using Foundation Models | Dec 31, 2023 | Articles, Prompt Engineering | Unverified | 0 |
| A Fast Hierarchical Method for Multi-script and Arbitrary Oriented Scene Text Extraction | Jul 28, 2014 | Clustering, Text Detection | Unverified | 0 |
| Curved Diffusion: A Generative Model With Optical Geometry Control | Nov 29, 2023 | Text Segmentation | Unverified | 0 |
| Automatic News Source Detection in Twitter Based on Text Segmentation | Dec 1, 2014 | Text Segmentation | Unverified | 0 |
| German-Arabic Speech-to-Speech Translation for Psychiatric Diagnosis | Dec 1, 2020 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| How Text Segmentation Algorithms Gain from Topic Models | Jun 1, 2012 | Text Segmentation, Topic Models | Unverified | 0 |
| Identifying the Most Dominant Event in a News Article by Mining Event Coreference Relations | Jun 1, 2018 | Text Segmentation, Text Summarization | Unverified | 0 |
| Information Bottleneck Inspired Method For Chat Text Segmentation | Nov 1, 2017 | Representation Learning, Text Generation | Unverified | 0 |
| Fuzzy Segmentations of a String | Jan 31, 2022 | Clustering, Segmentation | Unverified | 0 |
| BTS: A Bi-Lingual Benchmark for Text Segmentation in the Wild | Jan 1, 2022 | Segmentation, Style Transfer | Unverified | 0 |
| Towards Deployable OCR models for Indic languages | May 13, 2022 | Optical Character Recognition (OCR), Segmentation | Unverified | 0 |
| BR-TaxQA-R: A Dataset for Question Answering with References for Brazilian Personal Income Tax Law, including case law | May 21, 2025 | Answer Generation, Question Answering | Unverified | 0 |
| Acoustic Data-Driven Subword Modeling for End-to-End Speech Recognition | Apr 19, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Evaluating Text Segmentation using Boundary Edit Distance | Aug 1, 2013 | Information Retrieval, Question Answering | Unverified | 0 |
| Enhancing Question Answering Precision with Optimized Vector Retrieval and Instructions | Nov 1, 2024 | Document Embedding, Information Retrieval | Unverified | 0 |
| BP-Seg: A graphical model approach to unsupervised and non-contiguous text segmentation using belief propagation | May 22, 2025 | Segmentation, Text Segmentation | Unverified | 0 |
| From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions | Feb 27, 2024 | Headline Generation, Segmentation | Unverified | 0 |
| Fused Text Segmentation Networks for Multi-oriented Scene Text Detection | Sep 11, 2017 | Multi-Oriented Scene Text Detection, object-detection | Unverified | 0 |
| A Cascade Model for Proposition Extraction in Argumentation | Aug 1, 2019 | Argument Mining, Segmentation | Unverified | 0 |
| Generating abbreviations using Google Books library | Oct 4, 2014 | Text Segmentation | Unverified | 0 |
| Highly Fast Text Segmentation With Pairwise Markov Chains | Feb 17, 2021 | Chunking, named-entity-recognition | Unverified | 0 |
| Handwritten and Printed Text Segmentation: A Signature Case Study | Jul 15, 2023 | Binary Classification, Optical Character Recognition | Unverified | 0 |
| Char-SAM: Turning Segment Anything Model into Scene Text Segmentation Annotator with Character-level Visual Prompts | Dec 27, 2024 | Segmentation, Text Detection | Unverified | 0 |
| EEG-Language Modeling for Pathology Detection | Sep 2, 2024 | Contrastive Learning, EEG | Unverified | 0 |