| Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent | Feb 22, 2017 | Data Compression, Open-Ended Question Answering | Unverified | 0 |
| Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance | Mar 10, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Variational Bayesian Methods for a Tree-Structured Stick-Breaking Process Mixture of Gaussians by Application of the Bayes Codes for Context Tree Models | May 1, 2024 | Computational Efficiency, Text Compression | Unverified | 0 |
| Language and Dialect Discrimination Using Compression-Inspired Language Models | Dec 1, 2016 | Authorship Attribution, Dialect Identification | Unverified | 0 |
| Lexis: An Optimization Framework for Discovering the Hierarchical Structure of Sequential Data | Feb 17, 2016 | Text Compression | Unverified | 0 |
| Long-Short Range Context Neural Networks for Language Modeling | Aug 22, 2017 | Language Modeling, Language Modelling | Unverified | 0 |
| Lossless Compression of Large Language Model-Generated Text via Next-Token Prediction | May 7, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Machine Translation with Unsupervised Length-Constraints | Apr 7, 2020 | Decoder, Machine Translation | Unverified | 0 |
| IntellectSeeker: A Personalized Literature Management System with the Probabilistic Model and Large Language Model | Dec 10, 2024 | Articles, Few-Shot Learning | Code Available | 0 |
| AlphaZip: Neural Network-Enhanced Lossless Text Compression | Sep 23, 2024 | Benchmarking, Data Compression | Code Available | 0 |
| Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance | Dec 23, 2024 | Computational Efficiency, Text Compression | Code Available | 0 |
| Gzip versus bag-of-words for text classification | Jul 27, 2023 | Classification, text-classification | Code Available | 0 |
| Data-efficient Neural Text Compression with Interactive Learning | Jun 1, 2019 | Active Learning, Headline Generation | Code Available | 0 |
| Authorship Verification based on Compression-Models | Jun 1, 2017 | Authorship Verification, Text Classification | Code Available | 0 |
| Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches | Feb 10, 2025 | Document Summarization, Multi-Document Summarization | Code Available | 0 |
| XCompress: LLM assisted Python-based text compression toolkit | Aug 12, 2024 | Benchmarking, Language Modeling | Code Available | 0 |
| Contextualized Semantic Distance between Highly Overlapped Texts | Oct 4, 2021 | Domain Adaptation, Language Modeling | Code Available | 0 |
| Syntactically Informed Text Compression with Recurrent Neural Networks | Aug 8, 2016 | Text Compression | Code Available | 0 |