| XCompress: LLM assisted Python-based text compression toolkit | Aug 12, 2024 | Benchmarking, Language Modeling | Code Available | 0 |
| Variational Bayesian Methods for a Tree-Structured Stick-Breaking Process Mixture of Gaussians by Application of the Bayes Codes for Context Tree Models | May 1, 2024 | Computational Efficiency, Text Compression | Unverified | 0 |
| Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance | Mar 10, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Semantic Text Compression for Classification | Sep 19, 2023 | Classification, Decoder | Unverified | 0 |
| EntropyRank: Unsupervised Keyphrase Extraction via Side-Information Optimization for Language Model-based Text Compression | Aug 25, 2023 | Keyphrase Extraction, Language Modeling | Unverified | 0 |
| Approximating Human-Like Few-shot Learning with GPT-based Compression | Aug 14, 2023 | Data Compression, Few-Shot Learning | Unverified | 0 |
| Gzip versus bag-of-words for text classification | Jul 27, 2023 | Classification, text-classification | Code Available | 0 |
| Optimal alphabet for single text compression | Jan 13, 2022 | Text Compression | Unverified | 0 |
| Contextualized Semantic Distance between Highly Overlapped Texts | Oct 4, 2021 | Domain Adaptation, Language Modeling | Code Available | 0 |
| Text Compression-aided Transformer Encoding | Feb 11, 2021 | Text Compression | Unverified | 0 |