SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6051-6075 of 474,278 papers

Title | Status | Hype
voc2vec: A Foundation Model for Non-Verbal Vocalization | Code | 2
SalM2: An Extremely Lightweight Saliency Mamba Model for Real-Time Cognitive Awareness of Driver Attention | Code | 2
Mantis: Lightweight Calibrated Foundation Model for User-Friendly Time Series Classification | Code | 2
A Training-free LLM-based Approach to General Chinese Character Error Correction | Code | 2
OccProphet: Pushing Efficiency Frontier of Camera-Only 4D Occupancy Forecasting with Observer-Forecaster-Refiner Framework | Code | 2
Protein Large Language Models: A Comprehensive Survey | Code | 2
AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind | Code | 2
Boosting Global-Local Feature Matching via Anomaly Synthesis for Multi-Class Point Cloud Anomaly Detection | Code | 2
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling | Code | 2
PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning | Code | 2
KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation | Code | 2
AttentionEngine: A Versatile Framework for Efficient Attention Mechanisms on Diverse Hardware Platforms | Code | 2
HiddenDetect: Detecting Jailbreak Attacks against Large Vision-Language Models via Monitoring Hidden States | Code | 2
ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation | Code | 2
GiGL: Large-Scale Graph Neural Networks at Snapchat | Code | 2
Optimizing Model Selection for Compound AI Systems | Code | 2
dtaianomaly: A Python library for time series anomaly detection | Code | 2
Risk-mediated dynamic regulation of effective contacts de-synchronizes outbreaks in metapopulation epidemic models | Code | 2
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO | Code | 2
A Survey on Data Contamination for Large Language Models | Code | 2
MAGO-SP: Detection and Correction of Water-Fat Swaps in Magnitude-Only VIBE MRI | Code | 2
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators | Code | 2
Fast and Accurate Blind Flexible Docking | Code | 2
MedVAE: Efficient Automated Interpretation of Medical Images with Large-Scale Generalizable Autoencoders | Code | 2
Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models | Code | 2
Page 243 of 18972