SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9151-9200 of 661,570 papers

Title | Status | Hype
APPLV: Adaptive Planner Parameter Learning from Vision-Language-Action Model | | 0
Why Channel-Centric Models are not Enough to Predict End-to-End Performance in Private 5G: A Measurement Campaign and Case Study | | 0
Towards Visual Query Segmentation in the Wild | | 0
Multi-Kernel Gated Decoder Adapters for Robust Multi-Task Thyroid Ultrasound under Cross-Center Shift | | 0
Cross-Domain Uncertainty Quantification for Selective Prediction: A Comprehensive Bound Ablation with Transfer-Informed Betting | | 0
FedLECC: Cluster- and Loss-Guided Client Selection for Federated Learning under Non-IID Data | | 0
Quantifying Memorization and Privacy Risks in Genomic Language Models | | 0
Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates | | 0
Vision-Language Models Encode Clinical Guidelines for Concept-Based Medical Reasoning | | 0
Quantifying Uncertainty in AI Visibility: A Statistical Framework for Generative Search Measurement | | 0
MEGC2026: Micro-Expression Grand Challenge on Visual Question Answering | | 0
Using Vision Language Foundation Models to Generate Plant Simulation Configurations via In-Context Learning | | 0
Optimizing Reinforcement Learning Training over Digital Twin Enabled Multi-fidelity Networks | | 0
Interpretable Markov-Based Spatiotemporal Risk Surfaces for Missing-Child Search Planning with Reinforcement Learning and LLM-Based Quality Assurance | | 0
Kernel Debiased Plug-in Estimation based on the Universal Least Favorable Submodel | | 0
The qs Inequality: Quantifying the Double Penalty of Mixture-of-Experts at Inference | | 0
Semantic Level of Detail: Multi-Scale Knowledge Representation via Heat Kernel Diffusion on Hyperbolic Manifolds | | 0
Can You Hear, Localize, and Segment Continually? An Exemplar-Free Continual Learning Benchmark for Audio-Visual Segmentation | | 0
MAcPNN: Mutual Assisted Learning on Data Streams with Temporal Dependence | | 0
Data-driven robust Markov decision processes on Borel spaces: performance guarantees via an axiomatic approach | | 0
SVG-EAR: Parameter-Free Linear Compensation for Sparse Video Generation via Error-aware Routing | | 0
SurgCalib: Gaussian Splatting-Based Hand-Eye Calibration for Robot-Assisted Minimally Invasive Surgery | | 0
MAPLE: Elevating Medical Reasoning from Statistical Consensus to Process-Led Alignment | | 0
Diffusion-Based Authentication of Copy Detection Patterns: A Multimodal Framework with Printer Signature Conditioning | | 0
Security Considerations for Multi-agent Systems | | 0
Gender Fairness in Audio Deepfake Detection: Performance and Disparity Analysis | | 0
Statistical Inference via Generative Models: Flow Matching and Causal Inference | | 0
Improving through Interaction: Searching Behavioral Representation Spaces with CMA-ES-IG | Code | 0
An accurate flatness measure to estimate the generalization performance of CNN models | | 0
Automating Detection and Root-Cause Analysis of Flaky Tests in Quantum Software | | 0
AI Phenomenology for Understanding Human-AI Experiences Across Eras | | 0
The Missing Memory Hierarchy: Demand Paging for LLM Context Windows | | 0
When to Retrain after Drift: A Data-Only Test of Post-Drift Data Size Sufficiency | | 0
CORE-Acu: Structured Reasoning Traces and Knowledge Graph Safety Verification for Acupuncture Clinical Decision Support | | 0
Small Language Models for Efficient Agentic Tool Calling: Outperforming Large Models with Targeted Fine-tuning | | 0
MAS-Orchestra: Understanding and Improving Multi-Agent Reasoning Through Holistic Orchestration and Controlled Benchmarks | | 0
Visual Self-Fulfilling Alignment: Shaping Safety-Oriented Personas via Threat-Related Images | | 0
Understand Then Memory: A Cognitive Gist-Driven RAG Framework with Global Semantic Diffusion | | 0
Visualizing Coalition Formation: From Hedonic Games to Image Segmentation | | 0
A Dataset for Probing Translationese Preferences in English-to-Swedish Translation | | 0
Divide and Predict: An Architecture for Input Space Partitioning and Enhanced Accuracy | | 0
Convergence Rate for the Last Iterate of Stochastic Gradient Descent Schemes | | 0
Aero-Promptness: Drag-Aware Aerodynamic Manipulability for Propeller-driven Vehicles | | 0
Examining the Role of YouTube Production and Consumption Dynamics on the Formation of Extreme Ideologies | | 0
SRNeRV: A Scale-wise Recursive Framework for Neural Video Representation | | 0
Disentangling Reasoning in Large Audio-Language Models for Ambiguous Emotion Prediction | | 0
A Recipe for Stable Offline Multi-agent Reinforcement Learning | | 0
Grow, Assess, Compress: Adaptive Backbone Scaling for Memory-Efficient Class Incremental Learning | | 0
Benchmarking Language Modeling for Lossless Compression of Full-Fidelity Audio | | 0
Discovering Symbolic Differential Equations with Symmetry Invariants | | 0
Page 184 of 13,232