SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers, 248,104 code links, 4,818 tasks

Papers

Showing 951-975 of 659,983 papers

Title | Status | Hype
Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction | Code | 5
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts | Code | 5
RLHF Workflow: From Reward Modeling to Online RLHF | Code | 5
Single-seed generation of Brownian paths and integrals for adaptive and high order SDE solvers | Code | 5
Evaluating Real-World Robot Manipulation Policies in Simulation | Code | 5
Granite Code Models: A Family of Open Foundation Models for Code Intelligence | Code | 5
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding | Code | 5
When LLMs Meet Cybersecurity: A Systematic Literature Review | Code | 5
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation | Code | 5
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models | Code | 5
XFeat: Accelerated Features for Lightweight Image Matching | Code | 5
Make Your LLM Fully Utilize the Context | Code | 5
ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving | Code | 5
NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results | Code | 5
MARIO Eval: Evaluate Your Math LLM with your Math LLM--A mathematical dataset evaluation toolkit | Code | 5
Do "English" Named Entity Recognizers Work Well on Global Englishes? | Code | 5
Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs | Code | 5
Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean | Code | 5
Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes | Code | 5
SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks | Code | 5
Magic Clothing: Controllable Garment-Driven Image Synthesis | Code | 5
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization | Code | 5
MING-MOE: Enhancing Medical Multi-Task Learning in Large Language Models with Sparse Mixture of Low-Rank Adapter Experts | Code | 5
The Path To Autonomous Cyber Defense | Code | 5
LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models | Code | 5