SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 901-910 of 661,570 papers

Title | Status | Hype
Fake News Detection: It's All in the Data! | Code | 5
LiveBench: A Challenging, Contamination-Limited LLM Benchmark | Code | 5
OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding | Code | 5
Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model | Code | 5
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation | Code | 5
MedCare: Advancing Medical LLMs through Decoupling Clinical Alignment and Knowledge Aggregation | Code | 5
MixTex: Unambiguous Recognition Should Not Rely Solely on Real Data | Code | 5
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs | Code | 5
LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training | Code | 5
ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models | Code | 5