SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8651-8675 of 474,278 papers

Title | Status | Hype
ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation |  | 0
AlignFlow: Improving Flow-based Generative Models with Semi-Discrete Optimal Transport | Code | 0
Directional Reasoning Injection for Fine-Tuning MLLMs |  | 0
DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning |  | 0
Predicting the Unpredictable: Reproducible BiLSTM Forecasting of Incident Counts in the Global Terrorism Database (GTD) |  | 0
XModBench: Benchmarking Cross-Modal Capabilities and Consistency in Omni-Language Models |  | 0
Constantly Improving Image Models Need Constantly Improving Benchmarks |  | 0
Train a Unified Multimodal Data Quality Classifier with Synthetic Data |  | 0
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks |  | 0
IAD-GPT: Advancing Visual Knowledge in Multimodal Large Language Model for Industrial Anomaly Detection | Code | 0
Global-focal Adaptation with Information Separation for Noise-robust Transfer Fault Diagnosis | Code | 0
Budget-aware Test-time Scaling via Discriminative Verification | Code | 0
Predicting Task Performance with Context-aware Scaling Laws | Code | 0
Multi-identity Human Image Animation with Structural Video Diffusion | Code | 0
Structure-R1: Dynamically Leveraging Structural Knowledge in LLM Reasoning through Reinforcement Learning | Code | 0
Measuring the Effect of Disfluency in Multilingual Knowledge Probing Benchmarks | Code | 0
Scaling Artificial Intelligence for Multi-Tumor Early Detection with More Reports, Fewer Masks | Code | 0
MERLIN: A Testbed for Multilingual Multimodal Entity Recognition and Linking | Code | 0
SteeringSafety: A Systematic Safety Evaluation Framework of Representation Steering in LLMs |  | 0
WoW: Towards a World omniscient World model Through Embodied Interaction |  | 0
UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation |  | 0
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth |  | 0
Reasoning in Space via Grounding in the World |  | 0
MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems |  | 0
Agentic Entropy-Balanced Policy Optimization |  | 0
Page 347 of 18972