| LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Mar 19, 2024 | GSM8K, Language Modelling | Code Available | 9 | 5 |
| Recurrent Context Compression: Efficiently Expanding the Context Window of LLM | Jun 10, 2024 | Long-Context Understanding, Question Answering | Code Available | 2 | 5 |
| TokAlign: Efficient Vocabulary Adaptation via Token Alignment | Jun 4, 2025 | Sentence, Text Compression | Code Available | 1 | 5 |
| L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression | Dec 21, 2024 | Data Compression, Text Compression | Code Available | 1 | 5 |
| LLMZip: Lossless Text Compression using Large Language Models | Jun 6, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| A Batch Noise Contrastive Estimation Approach for Training Large Vocabulary Language Models | Aug 20, 2017 | GPU, Text Compression | Code Available | 1 | 5 |
| FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression | Sep 25, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training | Sep 6, 2024 | Text Compression | Code Available | 1 | 5 |
| Neural Retrievers are Biased Towards LLM-Generated Content | Oct 31, 2023 | Information Retrieval, Retrieval | Code Available | 1 | 5 |
| Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance | Dec 23, 2024 | Computational Efficiency, Text Compression | Code Available | 0 | 5 |