SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7651 to 7700 of 661,570 papers

Title | Status | Hype
AI Psychometrics: Evaluating the Psychological Reasoning of Large Language Models with Psychometric Validities | | 0
Beyond Linearity in Attention Projections: The Case for Nonlinear Queries | | 0
Colony Grounded SAM2: Zero-shot detection and segmentation of bacterial colonies using foundation models | | 0
COT-FM: Cluster-wise Optimal Transport Flow Matching | | 0
DINOv3 with Test-Time Calibration for Automated Carotid Intima-Media Thickness Measurement on CUBS v1 | | 0
Taming Vision Priors for Data Efficient mmWave Channel Modeling | | 0
VisualLeakBench: Auditing the Fragility of Large Vision-Language Models against PII Leakage and Social Engineering | | 0
Cylindrical Mechanical Projector for Omnidirectional Fringe Projection Profilometry | | 0
High-Fidelity Text-to-Image Generation from Pre-Trained Vision-Language Models via Distribution-Conditioned Diffusion Decoding | | 0
SERUM: Simple, Efficient, Robust, and Unifying Marking for Diffusion-based Image Generation | | 0
MAD: Microenvironment-Aware Distillation -- A Pretraining Strategy for Virtual Spatial Omics from Microscopy | | 0
Hybrid Intent-Aware Personalization with Machine Learning and RAG-Enabled Large Language Models for Financial Services Marketing | | 0
Citation-Enforced RAG for Fiscal Document Intelligence: Cited, Explainable Knowledge Retrieval in Tax Compliance | | 0
FlowAD: Ego-Scene Interactive Modeling for Autonomous Driving | | 0
Combining Microscopy Data and Metadata for Reconstruction of Cellular Traction Forces Using a Hybrid Vision Transformer-U-Net | | 0
WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics | | 0
Language-Guided Token Compression with Reinforcement Learning in Large Vision-Language Models | Code | 0
VulnAgent-X: A Layered Agentic Framework for Repository-Level Vulnerability Detection | Code | 0
VeloEdit: Training-Free Consistent and Continuous Instruction-Based Image Editing via Velocity Field Decomposition | Code | 0
Average Calibration Losses for Reliable Uncertainty in Medical Image Segmentation | Code | 0
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence | | 3
Layout-Guided Controllable Pathology Image Generation with In-Context Diffusion Transformers | | 0
Client-Conditional Federated Learning via Local Training Data Statistics | | 0
Teleodynamic Learning a new Paradigm For Interpretable AI | | 0
The Artificial Self: Characterising the landscape of AI identity | | 0
Efficient Compositional Multi-tasking for On-device Large Language Models | | 0
Streamline pathology foundation model by cross-magnification distillation | | 0
UniFField: A Generalizable Unified Neural Feature Field for Visual, Semantic, and Spatial Uncertainties in Any Scene | | 0
DeepSport: A Multimodal Large Language Model for Comprehensive Sports Video Reasoning via Agentic Reinforcement Learning | | 0
Knowledge Distillation with Structured Chain-of-Thought for Text-to-SQL | | 0
Consistency of Large Reasoning Models Under Multi-Turn Attacks | | 0
Kernel-based optimization of measurement operators for quantum reservoir computers | | 0
Resolving Java Code Repository Issues with iSWE Agent | | 0
From Classical to Quantum: Extending Prometheus for Unsupervised Discovery of Phase Transitions in Three Dimensions and Quantum Systems | | 0
Unsupervised Discovery of Intermediate Phase Order in the Frustrated J_1-J_2 Heisenberg Model via Prometheus Framework | | 0
FlashOptim: Optimizers for Memory-Efficient Training | | 0
Computational Pathology in the Era of Emerging Foundation and Agentic AI -- International Expert Perspectives on Clinical Integration and Translational Readiness | | 0
Evaluating LLM-Based Grant Proposal Review via Structured Perturbations | | 0
AgentOS: From Application Silos to a Natural Language-Driven Data Ecosystem | | 0
Duration Aware Scheduling for ASR Serving Under Workload Drift | | 0
Ghost Framing Theory: Exploring the role of generative AI in new venture rhetorical legitimation | | 0
Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context | | 0
"I followed what felt right, not what I was told": Autonomy, Coaching, and Recognizing Bias Through AI-Mediated Dialogue | | 0
RIE-Greedy: Regularization-Induced Exploration for Contextual Bandits | | 0
Counterweights and Complementarities: The Convergence of AI and Blockchain Powering a Decentralized Future | | 0
Worst-case low-rank approximations | | 0
Hindsight-Anchored Policy Optimization: Turning Failure into Feedback in Sparse Reward Settings | | 0
Heavy-Tailed Principle Component Analysis | | 0
MRI2Qmap: multi-parametric quantitative mapping with MRI-driven denoising priors | | 0
UniCompress: Token Compression for Unified Vision-Language Understanding and Generation | | 0