| Self-supervised Character-to-Character Distillation for Text Recognition | Nov 1, 2022 | Data Augmentation, Representation Learning | Code Available | 1 |
| Toward Unifying Text Segmentation and Long Document Summarization | Oct 28, 2022 | Articles, Document Summarization | Code Available | 1 |
| Structured Summarization: Unified Text Segmentation and Segment Labeling as a Generation Task | Sep 28, 2022 | Decoder, Segmentation | Unverified | 0 |
| OCR for TIFF Compressed Document Images Directly in Compressed Domain Using Text segmentation and Hidden Markov Model | Sep 13, 2022 | Optical Character Recognition (OCR), Text Segmentation | Unverified | 0 |
| DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon | Jun 22, 2022 | Language Modeling | Unverified | 0 |
| Unsupervised Tokenization Learning | May 23, 2022 | Text Segmentation | Unverified | 0 |
| Towards Deployable OCR models for Indic languages | May 13, 2022 | Optical Character Recognition (OCR), Segmentation | Unverified | 0 |
| TopWORDS-Seg: Simultaneous Text Segmentation and Word Discovery for Open-Domain Chinese Texts via Bayesian Inference | May 1, 2022 | Bayesian Inference, Segmentation | Unverified | 0 |
| Self-supervised Implicit Glyph Attention for Text Recognition | Mar 7, 2022 | Scene Text Recognition, Text Segmentation | Code Available | 1 |
| Fuzzy Segmentations of a String | Jan 31, 2022 | Clustering, Segmentation | Unverified | 0 |