SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 5726-5750 of 661,570 papers

Title | Status | Hype
Protecting De-identified Documents from Search-based Linkage Attacks | | 0
LeAD-M3D: Leveraging Asymmetric Distillation for Real-Time Monocular 3D Detection | | 0
World Models for Learning Dexterous Hand-Object Interactions from Human Videos | | 0
SolarGPT-QA: A Domain-Adaptive Large Language Model for Educational Question Answering in Space Weather and Heliophysics | | 0
Prompt Sensitivity and Answer Consistency of Small Open-Source Language Models for Clinical Question Answering in Low-Resource Healthcare | | 0
Match4Annotate: Propagating Sparse Video Annotations via Implicit Neural Feature Matching | | 0
Defining AI Models and AI Systems: A Framework to Resolve the Boundary Problem | | 0
Knowledge Graph Extraction from Biomedical Literature for Alkaptonuria Rare Disease | | 0
CUBE: A Standard for Unifying Agent Benchmarks | | 0
GLANCE: Gaze-Led Attention Network for Compressed Edge-inference | | 0
Context-Length Robustness in Question Answering Models: A Comparative Empirical Study | | 0
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification | | 0
Time-Aware Prior Fitted Networks for Zero-Shot Forecasting with Exogenous Variables | | 0
Don't Trust Stubborn Neighbors: A Security Framework for Agentic Networks | | 0
Longitudinal Risk Prediction in Mammography with Privileged History Distillation | | 0
Conflict-Aware Multimodal Fusion for Ambivalence and Hesitancy Recognition | | 0
Persona-Conditioned Risk Behavior in Large Language Models: A Simulated Gambling Study with GPT-4.1 | | 0
Informationally Compressive Anonymization: Non-Degrading Sensitive Input Protection for Privacy-Preserving Supervised Machine Learning | | 0
Regularized Latent Dynamics Prediction is a Strong Baseline For Behavioral Foundation Models | | 0
The Internet of Physical AI Agents: Interoperability, Longevity, and the Cost of Getting It Wrong | | 0
ExpertGen: Scalable Sim-to-Real Expert Policy Learning from Imperfect Behavior Priors | | 0
Optimizing Hospital Capacity During Pandemics: A Dual-Component Framework for Strategic Patient Relocation | | 0
MoLoRA: Composable Specialization via Per-Token Adapter Routing | | 0
NLP Occupational Emergence Analysis: How Occupations Form and Evolve in Real Time -- A Zero-Assumption Method Demonstrated on AI in the US Technology Workforce, 2022-2026 | | 0
Selective Memory for Artificial Intelligence: Write-Time Gating with Hierarchical Archiving | | 0
Page 230 of 26463