SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 701-750 of 659,983 papers

Title | Status | Hype
When Visuals Aren't the Problem: Evaluating Vision-Language Models on Misleading Data Visualizations | | 0
SynLeaF: A Dual-Stage Multimodal Fusion Framework for Synthetic Lethality Prediction Across Pan- and Single-Cancer Contexts | | 0
Causal Evidence that Language Models use Confidence to Drive Behavior | | 0
Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement | | 0
SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection | | 0
Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models | | 0
Gumbel Distillation for Parallel Text Generation | | 0
Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting | | 0
Dyadic: A Scalable Platform for Human-Human and Human-AI Conversation Research | | 0
SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation | | 0
TiCo: Time-Controllable Training for Spoken Dialogue Models | | 0
The Dual Mechanisms of Spatial Reasoning in Vision-Language Models | | 0
3D-Layout-R1: Structured Reasoning for Language-Instructed Spatial Editing | | 0
WorldCache: Content-Aware Caching for Accelerated Video World Models | | 0
Generating and Evaluating Sustainable Procurement Criteria for the Swiss Public Sector using In-Context Prompting with Large Language Models | | 0
Generalized multi-object classification and tracking with sparse feature resonator networks | | 0
Maximum Entropy Relaxation of Multi-Way Cardinality Constraints for Synthetic Population Generation | | 0
A vision-language model and platform for temporally mapping surgery from video | | 0
A Foundation Model for Instruction-Conditioned In-Context Time Series Tasks | | 0
flexvec: SQL Vector Retrieval with Programmatic Embedding Modulation | | 0
Precision-Varying Prediction (PVP): Robustifying ASR systems against adversarial attacks | | 0
TrajLoom: Dense Future Trajectory Generation from Video | | 0
Dress-ED: Instruction-Guided Editing for Virtual Try-On and Try-Off | | 0
Understanding LLM Performance Degradation in Multi-Instance Processing: The Roles of Instance Count and Context Length | | 0
Do Consumers Accept AIs as Moral Compliance Agents? | | 0
Bridging the Know-Act Gap via Task-Level Autoregressive Reasoning | | 0
Causal Discovery in Action: Learning Chain-Reaction Mechanisms from Interventions | | 0
Transfer learning via interpolating structures | | 0
A Vision Language Model for Generating Procedural Plant Architecture Representations from Simulated Images | | 0
To Agree or To Be Right? The Grounding-Sycophancy Tradeoff in Medical Vision-Language Models | | 0
Toward Faithful Segmentation Attribution via Benchmarking and Dual-Evidence Fusion | | 0
PIVM: Diffusion-Based Prior-Integrated Variation Modeling for Anatomically Precise Abdominal CT Synthesis | | 0
Learning to Trust: How Humans Mentally Recalibrate AI Confidence Signals | | 0
FAAR: Format-Aware Adaptive Rounding for NVFP4 | | 0
Rethinking Multimodal Fusion for Time Series: Auxiliary Modalities Need Constrained Fusion | | 0
Three Creates All: You Only Sample 3 Steps | | 0
AI Co-Scientist for Ranking: Discovering Novel Search Ranking Models alongside LLM-based AI Agents with Cloud Computing Access | | 0
Latent Style-based Quantum Wasserstein GAN for Drug Design | | 0
Probabilistic modeling over permutations using quantum computers | | 0
Computational Arbitrage in AI Model Markets | | 0
Spatially-Aware Evaluation Framework for Aerial LiDAR Point Cloud Semantic Segmentation: Distance-Based Metrics on Challenging Regions | | 0
OsteoFlow: Lyapunov-Guided Flow Distillation for Predicting Bone Remodeling after Mandibular Reconstruction | | 0
Stability-Preserving Online Adaptation of Neural Closed-loop Maps | | 0
Do Large Language Models Reduce Research Novelty? Evidence from Information Systems Journals | | 0
Hebbian Attractor Networks for Robot Locomotion | | 0
Efficient Universal Perception Encoder | | 0
Static Scene Reconstruction from Dynamic Egocentric Videos | | 0
Towards Automated Community Notes Generation with Large Vision Language Models for Combating Contextual Deception | | 0
Enhancing Document-Level Machine Translation via Filtered Synthetic Corpora and Two-Stage LLM Adaptation | | 0
MAGPI: Multifidelity-Augmented Gaussian Process Inputs for Surrogate Modeling from Scarce Data | | 0