SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 981-990 of 661,570 papers

Title | Status | Hype
Uncertainty Quantification for Language Models: A Suite of Black-Box, White-Box, LLM Judge, and Ensemble Scorers | Code | 5
WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit | Code | 5
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models | Code | 5
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering | Code | 5
Free Process Rewards without Process Labels | Code | 5
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs | Code | 5
Executable Code Actions Elicit Better LLM Agents | Code | 5
InspireMusic: Integrating Super Resolution and Large Language Model for High-Fidelity Long-Form Music Generation | Code | 5
PatchRefiner: Leveraging Synthetic Data for Real-Domain High-Resolution Monocular Metric Depth Estimation | Code | 5
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth | Code | 5