SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2741-2750 of 474,278 papers

Title | Status | Hype
CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation | Code | 3
Data Generation for Hardware-Friendly Post-Training Quantization | Code | 3
LLMmap: Fingerprinting For Large Language Models | Code | 3
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition | Code | 3
PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking | Code | 3
ExCoT: Optimizing Reasoning for Text-to-SQL with Execution Feedback | Code | 3
MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3
MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs | Code | 3
Qihoo-T2X: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Any-Task | Code | 3
What Language Model to Train if You Have One Million GPU Hours? | Code | 3