SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2026-2050 of 661,570 papers

Title | Status | Hype
Knowledge Fusion of Chat LLMs: A Preliminary Technical Report | Code | 4
R1-Searcher++: Incentivizing the Dynamic Knowledge Acquisition of LLMs via Reinforcement Learning | Code | 4
The case for 4-bit precision: k-bit Inference Scaling Laws | Code | 4
ActiveAnno3D -- An Active Learning Framework for Multi-Modal 3D Object Detection | Code | 4
DepGraph: Towards Any Structural Pruning | Code | 4
Improving Training Stability for Multitask Ranking Models in Recommender Systems | Code | 4
High-Resolution Image Synthesis with Latent Diffusion Models | Code | 4
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources | Code | 4
Decoder Tuning: Efficient Language Understanding as Decoding | Code | 4
Programming Is Hard -- Or at Least It Used to Be: Educational Opportunities And Challenges of AI Code Generation | Code | 4
The CLRS-Text Algorithmic Reasoning Language Benchmark | Code | 4
PointMamba: A Simple State Space Model for Point Cloud Analysis | Code | 4
Reducing Activation Recomputation in Large Transformer Models | Code | 4
Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation | Code | 4
ReChorus2.0: A Modular and Task-Flexible Recommendation Library | Code | 4
FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Code | 4
ChangeMamba: Remote Sensing Change Detection With Spatiotemporal State Space Model | Code | 4
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Code | 4
SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Code | 4
SNAC: Multi-Scale Neural Audio Codec | Code | 4
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions | Code | 4
Boximator: Generating Rich and Controllable Motions for Video Synthesis | Code | 4
Phoenix: Democratizing ChatGPT across Languages | Code | 4
Blendify -- Python rendering framework for Blender | Code | 4
Benchmarking Retrieval-Augmented Generation for Medicine | Code | 4
Page 82 of 26,463