| TokAlign: Efficient Vocabulary Adaptation via Token Alignment | Jun 4, 2025 | Sentence, Text Compression | Code Available | 1 |
| Beyond Text Compression: Evaluating Tokenizers Across Scales | Jun 3, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Measuring Information Distortion in Hierarchical Ultra long Novel Generation: The Optimal Expansion Ratio | May 18, 2025 | Text Compression | Unverified | 0 |
| Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method | May 12, 2025 | Semantic Compression, Semantic Similarity | Unverified | 0 |
| Lossless Compression of Large Language Model-Generated Text via Next-Token Prediction | May 7, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Text Compression for Efficient Language Generation | Mar 14, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches | Feb 10, 2025 | Document Summarization, Multi-Document Summarization | Code Available | 0 |
| Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance | Dec 23, 2024 | Computational Efficiency, Text Compression | Code Available | 0 |
| L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression | Dec 21, 2024 | Data Compression, Text Compression | Code Available | 1 |
| An Enhanced Text Compression Approach Using Transformer-based Language Models | Dec 15, 2024 | de-en, Text Compression | Unverified | 0 |