SOTAVerified

Text Compression

Papers

Showing 1-43 of 43 papers

Title | Status | Hype
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Code | 9
Recurrent Context Compression: Efficiently Expanding the Context Window of LLM | Code | 2
BPE Gets Picky: Efficient Vocabulary Refinement During Tokenizer Training | Code | 1
A Batch Noise Contrastive Estimation Approach for Training Large Vocabulary Language Models | Code | 1
L3TC: Leveraging RWKV for Learned Lossless Low-Complexity Text Compression | Code | 1
Neural Retrievers are Biased Towards LLM-Generated Content | Code | 1
LLMZip: Lossless Text Compression using Large Language Models | Code | 1
TokAlign: Efficient Vocabulary Adaptation via Token Alignment | Code | 1
FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression | Code | 1
Data-efficient Neural Text Compression with Interactive Learning | Code | 0
XCompress: LLM assisted Python-based text compression toolkit | Code | 0
Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance | Code | 0
Contextualized Semantic Distance between Highly Overlapped Texts | Code | 0
Gzip versus bag-of-words for text classification | Code | 0
Scaling Multi-Document Event Summarization: Evaluating Compression vs. Full-Text Approaches | Code | 0
Authorship Verification based on Compression-Models | Code | 0
IntellectSeeker: A Personalized Literature Management System with the Probabilistic Model and Large Language Model | Code | 0
Syntactically Informed Text Compression with Recurrent Neural Networks | Code | 0
AlphaZip: Neural Network-Enhanced Lossless Text Compression | Code | 0
Text Compression for Efficient Language Generation | | 0
Text Compression for Sentiment Analysis via Evolutionary Algorithms | | 0
Tuple-oriented Compression for Large-scale Mini-batch Stochastic Gradient Descent | | 0
Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance | | 0
Variational Bayesian Methods for a Tree-Structured Stick-Breaking Process Mixture of Gaussians by Application of the Bayes Codes for Context Tree Models | | 0
Theoretical Analysis of Byte-Pair Encoding | | 0
An Enhanced Text Compression Approach Using Transformer-based Language Models | | 0
A Neural Network Approach for Mixing Language Models | | 0
Approximating Human-Like Few-shot Learning with GPT-based Compression | | 0
A study for Image compression using Re-Pair algorithm | | 0
Beyond Text Compression: Evaluating Tokenizers Across Scales | | 0
Deleter: Leveraging BERT to Perform Unsupervised Successive Text Compression | | 0
EntropyRank: Unsupervised Keyphrase Extraction via Side-Information Optimization for Language Model-based Text Compression | | 0
Hypernym Mercury: Token Optimization Through Semantic Field Constriction And Reconstruction From Hypernyms. A New Text Compression Method | | 0
Language and Dialect Discrimination Using Compression-Inspired Language Models | | 0
Lexis: An Optimization Framework for Discovering the Hierarchical Structure of Sequential Data | | 0
Long-Short Range Context Neural Networks for Language Modeling | | 0
Lossless Compression of Large Language Model-Generated Text via Next-Token Prediction | | 0
Machine Translation with Unsupervised Length-Constraints | | 0
Measuring Information Distortion in Hierarchical Ultra long Novel Generation: The Optimal Expansion Ratio | | 0
Optimal alphabet for single text compression | | 0
Semantic Text Compression for Classification | | 0
Sequential Recurrent Neural Networks for Language Modeling | | 0
Text Compression-aided Transformer Encoding | | 0
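Several entries above (e.g. "Gzip versus bag-of-words for text classification", "Authorship Verification based on Compression-Models", "Assessing Human Editing Effort on LLM-Generated Texts via Compression-Based Edit Distance") build on compression-based similarity. As background only, and not a reproduction of any listed paper's method, here is a minimal sketch of the classic normalized compression distance (NCD) using gzip:

```python
import gzip


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: ~0 for very similar inputs,
    closer to 1 for unrelated inputs. Uses gzip as the compressor."""
    cx = len(gzip.compress(x))
    cy = len(gzip.compress(y))
    cxy = len(gzip.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)


# Illustrative texts (made up for this sketch): similar pairs should
# score lower than dissimilar pairs, enabling nearest-neighbor
# classification without any feature engineering.
a = b"the cat sat on the mat " * 30
b = b"the cat sat on the mat, purring " * 30
c = b"stochastic gradient descent minimizes the loss " * 30

print(ncd(a, b) < ncd(a, c))  # similar texts are closer
```

Compression-based classifiers typically pair this distance with k-nearest-neighbor voting over labeled reference texts; gzip can be swapped for any compressor that exploits shared structure in the concatenation.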

No leaderboard results yet.