SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 1976–2000 of 661,570 papers

Title | Status | Hype
APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs | | 0
Wafer-Level Etch Spatial Profiling for Process Monitoring from Time-Series with Time-LLM | | 0
AI Generalisation Gap In Comorbid Sleep Disorder Staging | | 0
ZeroFold: Protein-RNA Binding Affinity Predictions from Pre-Structural Embeddings | | 0
A Theory of LLM Information Susceptibility | | 0
Ukrainian Visual Word Sense Disambiguation Benchmark | | 0
Steering Code LLMs with Activation Directions for Language and Library Control | | 0
Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments | | 0
Flying Pigs, FaR and Beyond: Evaluating LLM Reasoning in Counterfactual Worlds | | 0
PRISM: Video Dataset Condensation with Progressive Refinement and Insertion for Sparse Motion | | 0
Decorrelation, Diversity, and Emergent Intelligence: The Isomorphism Between Social Insect Colonies and Ensemble Machine Learning | | 0
Inverting Neural Networks: New Methods to Generate Neural Network Inputs from Prescribed Outputs | | 0
When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning | | 0
Test-Time Adaptation via Cache Personalization for Facial Expression Recognition in Videos | | 0
TimeTox: An LLM-Based Pipeline for Automated Extraction of Time Toxicity from Clinical Trial Protocols | | 0
A transformer architecture alteration to incentivise externalised reasoning | | 0
Bounding Box Anomaly Scoring for simple and efficient Out-of-Distribution detection | | 0
Improving LLM Predictions via Inter-Layer Structural Encoders | | 0
Vision-based Deep Learning Analysis of Unordered Biomedical Tabular Datasets via Optimal Spatial Cartography | | 0
MuQ-Eval: An Open-Source Per-Sample Quality Metric for AI Music Generation Evaluation | | 0
Voice Privacy from an Attribute-based Perspective | | 0
PopResume: Causal Fairness Evaluation of LLM/VLM Resume Screeners with Population-Representative Dataset | | 0
SOUPLE: Enhancing Audio-Visual Localization and Segmentation with Learnable Prompt Contexts | | 0
Exposure-Normalized Bed and Chair Fall Rates via Continuous AI Monitoring | | 0
Conditionally Identifiable Latent Representation for Multivariate Time Series with Structural Dynamics | | 0
Page 80 of 26463