SOTAVerified

Long-Context Understanding

Papers

Showing 1-10 of 81 papers

| Title | Status | Hype |
|---|---|---|
| Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models | Code | 0 |
| Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs? | Code | 1 |
| PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding | | 0 |
| DAM: Dynamic Attention Mask for Long-Context Large Language Model Inference Acceleration | Code | 1 |
| MesaNet: Sequence Modeling by Locally Optimal Test-Time Training | Code | 0 |
| ATLAS: Learning to Optimally Memorize the Context at Test Time | | 0 |
| SpecExtend: A Drop-in Enhancement for Speculative Decoding of Long Sequences | Code | 0 |
| Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression | Code | 1 |
| MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models | Code | 1 |
| Beyond Needle(s) in the Embodied Haystack: Environment, Architecture, and Training Considerations for Long Context Reasoning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4o | 1 Image, 4*4 Stitching, Exact Accuracy | 83 | | Unverified |
| 2 | GPT-4V | 1 Image, 4*4 Stitching, Exact Accuracy | 54.72 | | Unverified |
| 3 | Gemini Pro 1.5 | 1 Image, 4*4 Stitching, Exact Accuracy | 39.85 | | Unverified |
| 4 | Gemini Pro 1.0 | 1 Image, 4*4 Stitching, Exact Accuracy | 24.78 | | Unverified |
| 5 | LLaVA-Llama-3 | 1 Image, 4*4 Stitching, Exact Accuracy | 17.5 | | Unverified |
| 6 | Claude 3 Opus | 1 Image, 4*4 Stitching, Exact Accuracy | 12.3 | | Unverified |
| 7 | IDEFICS2-8B | 1 Image, 4*4 Stitching, Exact Accuracy | 7.8 | | Unverified |
| 8 | InstructBLIP-Flan-T5-XXL | 1 Image, 4*4 Stitching, Exact Accuracy | 6.2 | | Unverified |
| 9 | CogVLM2-Llama-3 | 1 Image, 4*4 Stitching, Exact Accuracy | 0.9 | | Unverified |
| 10 | mPLUG-Owl-v2 | 1 Image, 4*4 Stitching, Exact Accuracy | 0.3 | | Unverified |