SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6076–6100 of 474278 papers

Title | Status | Hype
ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation | Code | 2
GiGL: Large-Scale Graph Neural Networks at Snapchat | Code | 2
Helix-mRNA: A Hybrid Foundation Model For Full Sequence mRNA Therapeutics | Code | 2
MoM: Linear Sequence Modeling with Mixture-of-Memories | Code | 2
Smaller But Better: Unifying Layout Generation with Smaller Large Language Models | Code | 2
DataSciBench: An LLM Agent Benchmark for Data Science | Code | 2
Calibration and Option Pricing with Stochastic Volatility and Double Exponential Jumps | Code | 2
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models | Code | 2
Repo2Run: Automated Building Executable Environment for Code Repository at Scale | Code | 2
Medical Image Classification with KAN-Integrated Transformers and Dilated Neighborhood Attention | Code | 2
Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework | Code | 2
JL1-CD: A New Benchmark for Remote Sensing Change Detection and a Robust Multi-Teacher Knowledge Distillation Framework | Code | 2
Event-Based Video Frame Interpolation With Cross-Modal Asymmetric Bidirectional Motion Fields | Code | 2
SIFT: Grounding LLM Reasoning in Contexts via Stickers | Code | 2
TESS 2: A Large-Scale Generalist Diffusion Language Model | Code | 2
Refining Sentence Embedding Model through Ranking Sentences Generation with Large Language Models | Code | 2
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation | Code | 2
DAMamba: Vision State Space Model with Dynamic Adaptive Scan | Code | 2
Training Turn-by-Turn Verifiers for Dialogue Tutoring Agents: The Curious Case of LLMs as Your Coding Tutors | Code | 2
Rethinking Diverse Human Preference Learning through Principal Component Analysis | Code | 2
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning | Code | 2
Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation | Code | 2
WMT24++: Expanding the Language Coverage of WMT24 to 55 Languages & Dialects | Code | 2
UXAgent: An LLM Agent-Based Usability Testing Framework for Web Design | Code | 2
Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization | Code | 2
Page 244 of 18972