SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 3241-3250 of 177340 papers

Title | Status | Hype
Evaluating Language Model Agency through Negotiations | Code | 3
DiffusionEdge: Diffusion Probabilistic Model for Crisp Edge Detection | Code | 3
Pheme: Efficient and Conversational Speech Generation | Code | 3
Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models | Code | 3
VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks | Code | 3
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Code | 3
SliceGPT: Compress Large Language Models by Deleting Rows and Columns | Code | 3
Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation | Code | 3
LongAlign: A Recipe for Long Context Alignment of Large Language Models | Code | 3
Noise Contrastive Alignment of Language Models with Explicit Rewards | Code | 3