SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8701-8725 of 474,278 papers

| Title | Status | Hype |
| --- | --- | --- |
| Mamba-R: Vision Mamba ALSO Needs Registers | Code | 2 |
| Large language models can be zero-shot anomaly detectors for time series? | Code | 2 |
| EditWorld: Simulating World Dynamics for Instruction-Following Image Editing | Code | 2 |
| EMR-Merging: Tuning-Free High-Performance Model Merging | Code | 2 |
| RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance | Code | 2 |
| TopoLogic: An Interpretable Pipeline for Lane Topology Reasoning on Driving Scenes | Code | 2 |
| Extracting Prompts by Inverting LLM Outputs | Code | 2 |
| Metric Flow Matching for Smooth Interpolations on the Data Manifold | Code | 2 |
| Flatten Anything: Unsupervised Neural Surface Parameterization | Code | 2 |
| S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models | Code | 2 |
| AnalogCoder: Analog Circuit Design via Training-Free Code Generation | Code | 2 |
| DreamText: High Fidelity Scene Text Synthesis | Code | 2 |
| Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models | Code | 2 |
| SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models | Code | 2 |
| EHRMamba: Towards Generalizable and Scalable Foundation Models for Electronic Health Records | Code | 2 |
| Vikhr: Constructing a State-of-the-art Bilingual Open-Source Instruction-Following Large Language Model for Russian | Code | 2 |
| CViT: Continuous Vision Transformer for Operator Learning | Code | 2 |
| Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling | Code | 2 |
| BrainMorph: A Foundational Keypoint Model for Robust and Flexible Brain MRI Registration | Code | 2 |
| Learning Diffusion Priors from Observations by Expectation Maximization | Code | 2 |
| Dense Connector for MLLMs | Code | 2 |
| Context and Geometry Aware Voxel Transformer for Semantic Scene Completion | Code | 2 |
| A General Framework for Jersey Number Recognition in Sports Video | Code | 2 |
| Model Editing as a Robust and Denoised variant of DPO: A Case Study on Toxicity | Code | 2 |
| FedCache 2.0: Federated Edge Learning with Knowledge Caching and Dataset Distillation | Code | 2 |
Page 349 of 18,972