SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 9426-9450 of 474278 papers

Title | Status | Hype
Simultaneous Multi-objective Alignment Across Verifiable and Non-verifiable Rewards | Code | 0
IMAGEdit: Let Any Subject Transform | Code | 0
EvoStruggle: A Dataset Capturing the Evolution of Struggle across Activities and Skill Levels | Code | 0
CodeGenLink: A Tool to Find the Likely Origin and License of Automatically Generated Code | Code | 0
ProtoMask: Segmentation-Guided Prototype Learning | Code | 0
SEE: See Everything Every Time -- Adaptive Brightness Adjustment for Broad Light Range Images via Events | Code | 0
RoVerFly: Robust and Versatile Implicit Hybrid Control of Quadrotor-Payload Systems | Code | 0
Graph2Region: Efficient Graph Similarity Learning with Structure and Scale Restoration | Code | 0
Learning a Zeroth-Order Optimizer for Fine-Tuning LLMs | Code | 0
Automated Structured Radiology Report Generation with Rich Clinical Context | Code | 0
MCM-DPO: Multifaceted Cross-Modal Direct Preference Optimization for Alt-text Generation | Code | 0
Align Your Tangent: Training Better Consistency Models via Manifold-Aligned Tangents | Code | 0
Solar PV Installation Potential Assessment on Building Facades Based on Vision and Language Foundation Models | Code | 0
NSARM: Next-Scale Autoregressive Modeling for Robust Real-World Image Super-Resolution | Code | 0
The Social Laboratory: A Psychometric Framework for Multi-Agent LLM Evaluation | Code | 0
Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regression | Code | 0
Resolving UnderEdit & OverEdit with Iterative & Neighbor-Assisted Model Editing | Code | 0
PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation | Code | 0
SeMoBridge: Semantic Modality Bridge for Efficient Few-Shot Adaptation of CLIP | Code | 0
Fair CCA for Fair Representation Learning: An ADNI Study | Code | 0
CWM: An Open-Weights LLM for Research on Code Generation with World Models | | 0
Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation | | 0
MR^2-Bench: Going Beyond Matching to Reasoning in Multimodal Retrieval | Code | 0
TGPO: Temporal Grounded Policy Optimization for Signal Temporal Logic Tasks | Code | 0
DGM4+: Dataset Extension for Global Scene Inconsistency | Code | 0
Page 378 of 18972