SOTAVerified

16k

Papers

Showing 11–20 of 146 papers

Title | Status | Hype
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding | Code | 3
Investigating Efficiently Extending Transformers for Long Input Summarization | Code | 3
FlashDMoE: Fast Distributed MoE in a Single Kernel | Code | 3
SnapKV: LLM Knows What You are Looking for Before Generation | Code | 3
LinFusion: 1 GPU, 1 Minute, 16K Image | Code | 3
M+: Extending MemoryLLM with Scalable Long-Term Memory | Code | 3
Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key | Code | 2
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding | Code | 2
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K | Code | 2
Giraffe: Adventures in Expanding Context Lengths in LLMs | Code | 2
Page 2 of 15

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1Suprime21'"1Unverified