SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6351-6400 of 661,570 papers

Title | Status | Hype
Generative Inverse Design of Cold Metals for Low-Power Electronics | | 0
SmoothVLA: Aligning Vision-Language-Action Models with Physical Constraints via Intrinsic Smoothness Optimization | | 0
True 4-Bit Quantized Convolutional Neural Network Training on CPU: Achieving Full-Precision Parity | | 0
OmniCompliance-100K: A Multi-Domain, Rule-Grounded, Real-World Safety Compliance Dataset | | 0
DCP-CLIP: A Coarse-to-Fine Framework for Open-Vocabulary Semantic Segmentation with Dual Interaction | | 0
Traffic and weather driven hybrid digital twin for bridge monitoring | | 0
EviAgent: Evidence-Driven Agent for Radiology Report Generation | | 0
Human-like Object Grouping in Self-supervised Vision Transformers | | 0
IMS3: Breaking Distributional Aggregation in Diffusion-Based Dataset Distillation | | 0
vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models | | 0
Leveraging a Statistical Shape Model for Efficient Generation of Annotated Training Data: A Case Study on Liver Landmarks Segmentation | | 0
Shapes are not enough: CONSERVAttack and its use for finding vulnerabilities and uncertainties in machine learning applications | | 0
When Visual Privacy Protection Meets Multimodal Large Language Models | | 0
Location Aware Embedding for Geotargeting in Sponsored Search Advertising | | 0
A Systematic Evaluation Protocol of Graph-Derived Signals for Tabular Machine Learning | | 0
PhyGaP: Physically-Grounded Gaussians with Polarization Cues | | 0
The Taxonomies, Training, and Applications of Event Stream Modelling for Electronic Health Records | | 0
Towards Generalizable Deepfake Detection via Real Distribution Bias Correction | | 0
Beyond Explicit Edges: Robust Reasoning over Noisy and Sparse Knowledge Graphs | | 0
Formal Abductive Explanations for Navigating Mental Health Help-Seeking and Diversity in Tech Workplaces | | 0
SemEval-2026 Task 6: CLARITY -- Unmasking Political Question Evasions | | 0
Schrödinger Bridge Over A Compact Connected Lie Group | | 0
Intrinsic Tolerance in C-Arm Imaging: How Extrinsic Re-optimization Preserves 3D Reconstruction Accuracy | | 0
Probing neural audio codecs for distinctions among English nuclear tunes | | 0
A Theory of Appropriateness That Accounts for Norms of Rationality | | 0
NepTam: A Nepali-Tamang Parallel Corpus and Baseline Machine Translation Experiments | | 0
Demand-Driven Context: A Methodology for Building Enterprise Knowledge Bases Through Agent Failure | | 0
TMPDiff: Temporal Mixed-Precision for Diffusion Models | | 0
Soft Mean Expected Calibration Error (SMECE): A Calibration Metric for Probabilistic Labels | | 0
OasisSimp: An Open-source Asian-English Sentence Simplification Dataset | | 0
Revisiting the Perception-Distortion Trade-off with Spatial-Semantic Guided Super-Resolution | | 0
Is the reconstruction loss culprit? An attempt to outperform JEPA | | 0
Improving Visual Reasoning with Iterative Evidence Refinement | | 0
Towards Agentic Honeynet Configuration | | 0
Low-Field Magnetic Resonance Image Enhancement using Undersampled k-Space | | 0
The GELATO Dataset for Legislative NER | | 0
Multifidelity Surrogate Modeling of Depressurized Loss of Forced Cooling in High-temperature Gas Reactors | | 0
Align Forward, Adapt Backward: Closing the Discretization Gap in Logic Gate Networks | | 0
Clinician input steers frontier AI models toward both accurate and harmful decisions | | 0
PA-Net: Precipitation-Adaptive Mixture-of-Experts for Long-Tail Rainfall Nowcasting | | 0
Evaluation of Visual Place Recognition Methods for Image Pair Retrieval in 3D Vision and Robotics | | 0
EyeWorld: A Generative World Model of Ocular State and Dynamics | | 0
The Geometry of Multi-Task Grokking: Transverse Instability, Superposition, and Weight Decay Phase Structure | | 0
What's the Price of Monotonicity? A Multi-Dataset Benchmark of Monotone-Constrained Gradient Boosting for Credit PD | | 0
Distributionally Robust Geometric Joint Chance-Constrained Optimization: Neurodynamic Approaches | | 0
Two-Step Data Augmentation for Masked Face Detection and Recognition: Turning Fake Masks to Real | | 0
Empowering Future Cybersecurity Leaders: Advancing Students through FINDS Education for Digital Forensic Excellence | | 0
ALTIS: Automated Loss Triage and Impact Scoring from Sentinel-1 SAR for Property-Level Flood Damage Assessment | | 0
ArrayTac: A tactile display for simultaneous rendering of shape, stiffness and friction | | 0
Maximin Robust Bayesian Experimental Design | | 0
Page 128 of 13,232