SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers, 248,326 code links, 4,818 tasks

Papers

Showing 3251–3260 of 474,278 papers

Title | Status | Hype
ECG-FM: An Open Electrocardiogram Foundation Model | Code | 3
ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities | Code | 3
1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data | Code | 3
Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields | Code | 3
Data Poisoning in LLMs: Jailbreak-Tuning and Scaling Laws | Code | 3
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine | Code | 3
Lighthouse: A User-Friendly Library for Reproducible Video Moment Retrieval and Highlight Detection | Code | 3
Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2 | Code | 3
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework | Code | 3
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models | Code | 3