| Meta-Chunking: Learning Text Segmentation and Semantic Completion via Logical Perception | Oct 16, 2024 | Binary Classification, Chunking | Code Available | 3 |
| EAFormer: Scene Text Segmentation with Edge-Aware Transformers | Jul 24, 2024 | Decoder, Segmentation | Code Available | 3 |
| Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Jan 31, 2024 | Hierarchical Text Segmentation, parameter-efficient fine-tuning | Code Available | 3 |
| ControlText: Unlocking Controllable Fonts in Multilingual Text Rendering without Font Annotations | Feb 16, 2025 | Text Segmentation | Code Available | 1 |
| Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model | Jan 6, 2025 | Language Modeling | Code Available | 1 |
| WAS: Dataset and Methods for Artistic Text Segmentation | Jul 31, 2024 | Decoder, Diversity | Code Available | 1 |
| Filtered Semi-Markov CRF | Nov 29, 2023 | Named Entity Recognition | Code Available | 1 |
| PSSTRNet: Progressive Segmentation-guided Scene Text Removal Network | Jun 13, 2023 | Decoder, Segmentation | Code Available | 1 |
| CCDWT-GAN: Generative Adversarial Networks Based on Color Channel Using Discrete Wavelet Transform for Document Image Binarization | May 27, 2023 | Binarization, Image Enhancement | Code Available | 1 |
| Three-stage binarization of color document images based on discrete wavelet transform and generative adversarial networks | Nov 29, 2022 | Avg, Binarization | Code Available | 1 |
| DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding | Nov 28, 2022 | Object Detection | Code Available | 1 |
| Self-supervised Character-to-Character Distillation for Text Recognition | Nov 1, 2022 | Data Augmentation, Representation Learning | Code Available | 1 |
| Toward Unifying Text Segmentation and Long Document Summarization | Oct 28, 2022 | Articles, Document Summarization | Code Available | 1 |
| Self-supervised Implicit Glyph Attention for Text Recognition | Mar 7, 2022 | Scene Text Recognition, Text Segmentation | Code Available | 1 |
| WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition | Oct 7, 2021 | Label Error Detection, Optical Character Recognition | Code Available | 1 |
| Structural Text Segmentation of Legal Documents | Dec 7, 2020 | Change Detection, Information Retrieval | Code Available | 1 |
| Rethinking Text Segmentation: A Novel Dataset and A Text-Specific Refinement Approach | Nov 27, 2020 | Segmentation, Style Transfer | Code Available | 1 |
| Chapter Captor: Text Segmentation in Novels | Nov 9, 2020 | Segmentation, Text Segmentation | Code Available | 1 |
| Text Segmentation by Cross Segment Attention | Apr 30, 2020 | Discourse Segmentation, Information Retrieval | Code Available | 1 |
| Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation | Jan 3, 2020 | Cross-Lingual Word Embeddings, Multi-Task Learning | Code Available | 1 |
| CoType: Joint Extraction of Typed Entities and Relations with Knowledge Bases | Oct 27, 2016 | Joint Entity and Relation Extraction, Relation | Code Available | 1 |
| Khmer Word Segmentation Using Conditional Random Fields | Oct 15, 2015 | Segmentation, Text Segmentation | Code Available | 1 |
| The impact of fine tuning in LLaMA on hallucinations for named entity extraction in legal documentation | Jun 10, 2025 | Text Segmentation | Unverified | 0 |
| BP-Seg: A graphical model approach to unsupervised and non-contiguous text segmentation using belief propagation | May 22, 2025 | Segmentation, Text Segmentation | Unverified | 0 |
| BR-TaxQA-R: A Dataset for Question Answering with References for Brazilian Personal Income Tax Law, including case law | May 21, 2025 | Answer Generation, Question Answering | Unverified | 0 |
| TSAL: Few-shot Text Segmentation Based on Attribute Learning | Apr 15, 2025 | Attribute, Few-Shot Learning | Unverified | 0 |
| POSTA: A Go-to Framework for Customized Artistic Poster Generation | Mar 19, 2025 | Text Segmentation | Unverified | 0 |
| A Context-Driven Training-Free Network for Lightweight Scene Text Segmentation and Recognition | Mar 19, 2025 | Scene Text Recognition, Text Detection | Unverified | 0 |
| DocSAM: Unified Document Image Segmentation via Query Decomposition and Heterogeneous Mixed Learning | Jan 1, 2025 | Document Layout Analysis, Image Segmentation | Unverified | 0 |
| Char-SAM: Turning Segment Anything Model into Scene Text Segmentation Annotator with Character-level Visual Prompts | Dec 27, 2024 | Segmentation, Text Detection | Unverified | 0 |
| Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models | Dec 15, 2024 | Contrastive Learning, Decoder | Unverified | 0 |
| ROSE: Revolutionizing Open-Set Dense Segmentation with Patch-Wise Perceptual Large Multimodal Model | Nov 29, 2024 | Prediction, Segmentation | Unverified | 0 |
| Recent Trends in Linear Text Segmentation: a Survey | Nov 25, 2024 | Segmentation, Survey | Unverified | 0 |
| Enhancing Question Answering Precision with Optimized Vector Retrieval and Instructions | Nov 1, 2024 | Document Embedding, Information Retrieval | Unverified | 0 |
| EEG-Language Modeling for Pathology Detection | Sep 2, 2024 | Contrastive Learning, EEG | Unverified | 0 |
| TocBERT: Medical Document Structure Extraction Using Bidirectional Transformers | Jun 27, 2024 | Document Summarization, Hierarchical Text Segmentation | Unverified | 0 |
| Automating Easy Read Text Segmentation | Jun 17, 2024 | Segmentation, Text Segmentation | Unverified | 0 |
| Proofread: Fixes All Errors with One Tap | Jun 6, 2024 | All, Quantization | Unverified | 0 |
| Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering | Apr 26, 2024 | Knowledge Graphs, Question Answering | Unverified | 0 |
| Detecting AI-Generated Sentences in Human-AI Collaborative Hybrid Texts: Challenges, Strategies, and Insights | Mar 6, 2024 | Boundary Detection, Sentence | Code Available | 0 |
| From Text Segmentation to Smart Chaptering: A Novel Benchmark for Structuring Video Transcriptions | Feb 27, 2024 | Headline Generation, Segmentation | Unverified | 0 |
| A Reliable Knowledge Processing Framework for Combustion Science using Foundation Models | Dec 31, 2023 | Articles, Prompt Engineering | Unverified | 0 |
| Segmenting Messy Text: Detecting Boundaries in Text Derived from Historical Newspaper Images | Dec 20, 2023 | Optical Character Recognition, Segmentation | Unverified | 0 |
| UPOCR: Towards Unified Pixel-Level OCR Interface | Dec 5, 2023 | Decoder, Optical Character Recognition | Unverified | 0 |
| Curved Diffusion: A Generative Model With Optical Geometry Control | Nov 29, 2023 | Text Segmentation | Unverified | 0 |
| Self-supervised Scene Text Segmentation with Object-centric Layered Representations Augmented by Text Regions | Aug 25, 2023 | Segmentation, Style Transfer | Unverified | 0 |
| A Comparative Study of Sentence Embedding Models for Assessing Semantic Variation | Aug 8, 2023 | Document Summarization, Semantic Similarity | Unverified | 0 |
| Handwritten and Printed Text Segmentation: A Signature Case Study | Jul 15, 2023 | Binary Classification, Optical Character Recognition | Unverified | 0 |
| Expanding Scope: Adapting English Adversarial Attacks to Chinese | Jun 8, 2023 | Adversarial Attack, Adversarial Robustness | Code Available | 0 |
| Weakly-Supervised Text Instance Segmentation | Mar 20, 2023 | Contrastive Learning, Instance Segmentation | Unverified | 0 |