SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5201-5225 of 661570 papers

Title | Status | Hype
PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations | Code | 2
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering | Code | 2
Optimal Weighted Convolution for Classification and Denosing | Code | 2
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation | Code | 2
GeoVision Labeler: Zero-Shot Geospatial Classification with Vision and Language Models | Code | 2
Optimal Density Functions for Weighted Convolution in Learning Models | Code | 2
Logits-Based Finetuning | Code | 2
Tackling View-Dependent Semantics in 3D Language Gaussian Splatting | Code | 2
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents | Code | 2
When Large Multimodal Models Confront Evolving Knowledge: Challenges and Pathways | Code | 2
ReasonGen-R1: CoT for Autoregressive Image generation models through SFT and RL | Code | 2
ViStoryBench: Comprehensive Benchmark Suite for Story Visualization | Code | 2
TC-GS: A Faster Gaussian Splatting Module Utilizing Tensor Cores | Code | 2
TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models | Code | 2
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents | Code | 2
UniTEX: Universal High Fidelity Generative Texturing for 3D Shapes | Code | 2
Vision Language Models are Biased | Code | 2
Diffusion Guidance Is a Controllable Policy Improvement Operator | Code | 2
ML-Agent: Reinforcing LLM Agents for Autonomous Machine Learning Engineering | Code | 2
SWE-bench Goes Live! | Code | 2
VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models | Code | 2
OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation | Code | 2
D-AR: Diffusion via Autoregressive Models | Code | 2
MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming | Code | 2
Securing AI Agents with Information-Flow Control | Code | 2
Page 209 of 26463