SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 1276-1300 of 659,983 papers

| Title | Status | Hype |
| --- | --- | --- |
| UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation | Code | 4 |
| Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | Code | 4 |
| ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding | Code | 4 |
| RewardBench 2: Advancing Reward Model Evaluation | Code | 4 |
| GigaAM: Efficient Self-Supervised Learner for Speech Recognition | Code | 4 |
| AutoSchemaKG: Autonomous Knowledge Graph Construction through Dynamic Schema Induction from Web-Scale Corpora | Code | 4 |
| RenderFormer: Transformer-based Neural Rendering of Triangle Meshes with Global Illumination | Code | 4 |
| Skywork Open Reasoner 1 Technical Report | Code | 4 |
| OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation | Code | 4 |
| On Path to Multimodal Historical Reasoning: HistBench and HistAgent | Code | 4 |
| ImgEdit: A Unified Image Editing Dataset and Benchmark | Code | 4 |
| GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation | Code | 4 |
| Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution | Code | 4 |
| DeepInverse: A Python package for solving imaging inverse problems with deep learning | Code | 4 |
| A Survey of LLM × DATA | Code | 4 |
| Partition Generative Modeling: Masked Modeling Without Masks | Code | 4 |
| LORE: Lagrangian-Optimized Robust Embeddings for Visual Encoders | Code | 4 |
| Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models | Code | 4 |
| Qiskit Machine Learning: an open-source library for quantum machine learning tasks at scale on quantum hardware and classical simulators | Code | 4 |
| QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning | Code | 4 |
| Scaling Up Biomedical Vision-Language Models: Fine-Tuning, Instruction Tuning, and Multi-Modal Learning | Code | 4 |
| SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis | Code | 4 |
| Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO | Code | 4 |
| R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | Code | 4 |
| lmgame-Bench: How Good are LLMs at Playing Games? | Code | 4 |