SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2181 to 2190 of 177340 papers

Title | Status | Hype
Scaling Granite Code Models to 128K Context | Code | 4
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society | Code | 4
Expressive Whole-Body 3D Gaussian Avatar | Code | 4
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition | Code | 4
SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Code | 4
RewardBench 2: Advancing Reward Model Evaluation | Code | 4
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Code | 4
HumanoidBench: Simulated Humanoid Benchmark for Whole-Body Locomotion and Manipulation | Code | 4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Code | 4
SAT: Dynamic Spatial Aptitude Training for Multimodal Language Models | Code | 4
Page 219 of 17734