SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6201-6225 of 474278 papers

Title | Status | Hype
Sparse Autoencoders for Hypothesis Generation | Code | 2
Seeing World Dynamics in a Nutshell | Code | 2
The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models via Visual Information Steering | Code | 2
On-device Sora: Enabling Training-Free Diffusion-based Text-to-Video Generation for Mobile Devices | Code | 2
Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation | Code | 2
Reviving The Classics: Active Reward Modeling in Large Language Model Alignment | Code | 2
STAIR: Improving Safety Alignment with Introspective Reasoning | Code | 2
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search | Code | 2
On the Guidance of Flow Matching | Code | 2
CodeSteer: Symbolic-Augmented Language Models via Code/Text Guidance | Code | 2
Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs | Code | 2
Honegumi: An Interface for Accelerating the Adoption of Bayesian Optimization in the Experimental Sciences | Code | 2
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles | Code | 2
Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding | Code | 2
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization | Code | 2
Efficient Diffusion Models: A Survey | Code | 2
Compressed Image Generation with Denoising Diffusion Codebook Models | Code | 2
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer | Code | 2
Towards Robust and Generalizable Lensless Imaging with Modular Learned Reconstruction | Code | 2
Preference Leakage: A Contamination Problem in LLM-as-a-judge | Code | 2
When Do LLMs Help With Node Classification? A Comprehensive Analysis | Code | 2
LEAD: Large Foundation Model for EEG-Based Alzheimer's Disease Detection | Code | 2
FlexCloud: Direct, Modular Georeferencing and Drift-Correction of Point Cloud Maps | Code | 2
Segment Anything for Histopathology | Code | 2
PyMOLfold: Interactive Protein and Ligand Structure Prediction in PyMOL | Code | 2