SOTAVerified

Text Compression

Papers

Showing 1-25 of 43 papers

| Title | Status | Hype |
|---|---|---|
| TokAlign: Efficient Vocabulary Adaptation via Token Alignment | Code | 1 |
| Beyond Text Compression: Evaluating Tokenizers Across Scales | | 0 |
| Measuring Information Distortion in Hierarchical Ultra long Novel Generation: The Optimal Expansion Ratio | | 0 |
| Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method | | 0 |
| Lossless Compression of Large Language Model-Generated Text via Next-Token Prediction | | 0 |
| Text Compression for Efficient Language Generation | | 0 |
| Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches | Code | 0 |
| Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance | Code | 0 |
| L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression | Code | 1 |
| An Enhanced Text Compression Approach Using Transformer-based Language Models | | 0 |
| IntellectSeeker: A Personalized Literature Management System with the Probabilistic Model and Large Language Model | Code | 0 |
| Theoretical Analysis of Byte-Pair Encoding | | 0 |
| FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression | Code | 1 |
| AlphaZip: Neural Network-Enhanced Lossless Text Compression | Code | 0 |
| BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training | Code | 1 |
| XCompress: LLM assisted Python-based text compression toolkit | Code | 0 |
| Recurrent Context Compression: Efficiently Expanding the Context Window of LLM | Code | 2 |
| Variational Bayesian Methods for a Tree-Structured Stick-Breaking Process Mixture of Gaussians by Application of the Bayes Codes for Context Tree Models | | 0 |
| LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Code | 9 |
| Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance | | 0 |
| Neural Retrievers are Biased Towards LLM-Generated Content | Code | 1 |
| Semantic Text Compression for Classification | | 0 |
| EntropyRank: Unsupervised Keyphrase Extraction via Side-Information Optimization for Language Model-based Text Compression | | 0 |
| Approximating Human-Like Few-shot Learning with GPT-based Compression | | 0 |
| Gzip versus bag-of-words for text classification | Code | 0 |

No leaderboard results yet.