SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7951–7975 of 474,278 papers

Title | Status | Hype
GeneFlow: Translation of Single-cell Gene Expression to Histopathological Images via Rectified Flow | Code | 0
A Dual Large Language Models Architecture with Herald Guided Prompts for Parallel Fine Grained Traffic Signal Control | Code | 0
BlurGuard: A Simple Approach for Robustifying Image Protection Against AI-Powered Editing | Code | 0
RaanA: A Fast, Flexible, and Data-Efficient Post-Training Quantization Algorithm | Code | 0
MedM2T: A MultiModal Framework for Time-Aware Modeling with Electronic Health Record and Electrocardiogram Data | Code | 0
MeisenMeister: A Simple Two Stage Pipeline for Breast Cancer Classification on MRI | Code | 0
Understanding the Implicit User Intention via Reasoning with Large Language Model for Image Editing | Code | 0
Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation | Code | 0
VCORE: Variance-Controlled Optimization-based Reweighting for Chain-of-Thought Supervision | Code | 0
HiRA: A Hierarchical Reasoning Framework for Decoupled Planning and Execution in Deep Search | Code | 0
Uncertainty-Based Smooth Policy Regularisation for Reinforcement Learning with Few Demonstrations | Code | 0
Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives | Code | 0
Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler | Code | 0
MedCalc-Eval and MedCalc-Env: Advancing Medical Calculation Capabilities of Large Language Models | Code | 0
NAUTILUS: A Large Multimodal Model for Underwater Scene Understanding | Code | 0
Mechanics of Learned Reasoning 1: TempoBench, A Benchmark for Interpretable Deconstruction of Reasoning System Performance | Code | 0
Sketch-to-Layout: Sketch-Guided Multimodal Layout Generation | Code | 0
Gaussian Combined Distance: A Generic Metric for Object Detection | Code | 0
Continuous Autoregressive Language Models | Code | 0
Soft Task-Aware Routing of Experts for Equivariant Representation Learning | Code | 0
Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs | Code | 0
Higher-order Linear Attention | Code | 0
T3: Test-Time Model Merging in VLMs for Zero-Shot Medical Imaging Analysis | Code | 0
RL-Exec: Impact-Aware Reinforcement Learning for Opportunistic Optimal Liquidation, Outperforms TWAP and a Book-Liquidity VWAP on BTC-USD Replays | Code | 0
Urban-MAS: Human-Centered Urban Prediction with LLM-Based Multi-Agent System | Code | 0
Page 319 of 18,972