Text Compression

Papers

Showing 1–25 of 43 papers

Title	Date	Tasks	Status	Hype	Score
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression	Mar 19, 2024	GSM8K, Language Modelling	Code Available	9	5
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM	Jun 10, 2024	Long-Context Understanding, Question Answering	Code Available	2	5
BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training	Sep 6, 2024	Text Compression	Code Available	1	5
A Batch Noise Contrastive Estimation Approach for Training Large Vocabulary Language Models	Aug 20, 2017	GPU, Text Compression	Code Available	1	5
L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression	Dec 21, 2024	Data Compression, Text Compression	Code Available	1	5
Neural Retrievers are Biased Towards LLM-Generated Content	Oct 31, 2023	Information Retrieval, Retrieval	Code Available	1	5
LLMZip: Lossless Text Compression using Large Language Models	Jun 6, 2023	Language Modeling, Language Modelling	Code Available	1	5
TokAlign: Efficient Vocabulary Adaptation via Token Alignment	Jun 4, 2025	Sentence, Text Compression	Code Available	1	5
FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression	Sep 25, 2024	Language Modeling, Language Modelling	Code Available	1	5
Data-efficient Neural Text Compression with Interactive Learning	Jun 1, 2019	Active Learning, Headline Generation	Code Available	0	5
XCompress: LLM assisted Python-based text compression toolkit	Aug 12, 2024	Benchmarking, Language Modeling	Code Available	0	5
Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance	Dec 23, 2024	Computational Efficiency, Text Compression	Code Available	0	5
Contextualized Semantic Distance between Highly Overlapped Texts	Oct 4, 2021	Domain Adaptation, Language Modeling	Code Available	0	5
Gzip versus bag-of-words for text classification	Jul 27, 2023	Classification, text-classification	Code Available	0	5
Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches	Feb 10, 2025	Document Summarization, Multi-Document Summarization	Code Available	0	5
Authorship Verification based on Compression-Models	Jun 1, 2017	Authorship Verification, Text Classification	Code Available	0	5
IntellectSeeker: A Personalized Literature Management System with the Probabilistic Model and Large Language Model	Dec 10, 2024	Articles, Few-Shot Learning	Code Available	0	5
Syntactically Informed Text Compression with Recurrent Neural Networks	Aug 8, 2016	Text Compression	Code Available	0	5
AlphaZip: Neural Network-Enhanced Lossless Text Compression	Sep 23, 2024	Benchmarking, Data Compression	Code Available	0	5
Text Compression for Efficient Language Generation	Mar 14, 2025	Language Modeling, Language Modelling	Unverified	0	0
Text Compression for Sentiment Analysis via Evolutionary Algorithms	Sep 20, 2017	Data Compression, Evolutionary Algorithms	Unverified	0	0
Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent	Feb 22, 2017	Data Compression, Open-Ended Question Answering	Unverified	0	0
Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance	Mar 10, 2024	Language Modeling, Language Modelling	Unverified	0	0
Variational Bayesian Methods for a Tree-Structured Stick-Breaking Process Mixture of Gaussians by Application of the Bayes Codes for Context Tree Models	May 1, 2024	Computational Efficiency, Text Compression	Unverified	0	0
Theoretical Analysis of Byte-Pair Encoding	Nov 13, 2024	Language Modeling, Language Modelling	Unverified	0	0

No leaderboard results yet.