SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6101-6150 of 661570 papers

| Title | Status | Hype |
| --- | --- | --- |
| Zoom to Essence: Trainless GUI Grounding by Inferring upon Interface Elements | | 0 |
| Right for the Wrong Reasons: Epistemic Regret Minimization for Causal Rung Collapse in LLMs | | 0 |
| PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling | | 1 |
| Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning | | 0 |
| Autonomous Agents Coordinating Distributed Discovery Through Emergent Artifact Exchange | | 2 |
| Rigorous Asymptotics for First-Order Algorithms Through the Dynamical Cavity Method | | 0 |
| Emotional Cost Functions for AI Safety: Teaching Agents to Feel the Weight of Irreversible Consequences | | 0 |
| Stop Before You Fail: Operational Capability Boundaries for Mitigating Unproductive Reasoning in Large Reasoning Models | | 0 |
| Delightful Policy Gradient | | 0 |
| Precedence-Constrained Decision Trees and Coverings | | 0 |
| SPARQ: Spiking Early-Exit Neural Networks for Energy-Efficient Edge AI | | 0 |
| The Active Discoverer Framework: Towards Autonomous Physics Reasoning through Neuro-Symbolic LaTeX Synthesis | | 0 |
| LLM-Augmented Release Intelligence: Automated Change Summarization and Impact Analysis in Cloud-Native CI/CD Pipelines | | 0 |
| Fine-tuning MLLMs Without Forgetting Is Easier Than You Think | | 0 |
| D-MEM: Dopamine-Gated Agentic Memory via Reward Prediction Error Routing | | 0 |
| Automatic Inter-document Multi-hop Scientific QA Generation | | 0 |
| Why Inference in Large Models Becomes Decomposable After Training | | 0 |
| Learning Unmasking Policies for Diffusion Language Models | | 0 |
| MistExit: Learning to Exit for Early Mistake Detection in Procedural Videos | | 0 |
| Personalized Cell Segmentation: Benchmark and Framework for Reference-Guided Cell Type Segmentation | | 0 |
| Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models | | 0 |
| Central Dogma Transformer II: An AI Microscope for Understanding Cellular Regulatory Mechanisms | | 0 |
| ZOTTA: Test-Time Adaptation with Gradient-Free Zeroth-Order Optimization | | 0 |
| Bringing Model Editing to Generative Recommendation in Cold-Start Scenarios | | 0 |
| Multilingual TinyStories: A Synthetic Combinatorial Corpus of Indic Children's Stories for Training Small Language Models | | 0 |
| CausalEvolve: Towards Open-Ended Discovery with Causal Scratchpad | | 0 |
| GroundSet: A Cadastral-Grounded Dataset for Spatial Understanding with Vector Data | | 0 |
| Unveiling the Basin-Like Loss Landscape in Large Language Models | | 0 |
| Towards Operational Automated Greenhouse Gas Plume Detection and Delineation | | 0 |
| Efficient Neural Combinatorial Optimization Solver for the Min-max Heterogeneous Capacitated Vehicle Routing Problem | | 0 |
| Eva-VLA: Evaluating Vision-Language-Action Models' Robustness Under Real-World Physical Variations | | 0 |
| ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning | | 0 |
| Protecting Deep Neural Network Intellectual Property with Chaos-Based White-Box Watermarking | | 0 |
| HGAN-SDEs: Learning Neural Stochastic Differential Equations with Hermite-Guided Adversarial Training | | 0 |
| PolyFrame at MWE-2026 AdMIRe 2: When Words Are Not Enough: Multimodal Idiom Disambiguation | | 0 |
| Implementation of Quantum Implicit Neural Representation in Deterministic and Probabilistic Autoencoders for Image Reconstruction/Generation Tasks | | 0 |
| QAQ: Bidirectional Semantic Coherence for Selecting High-Quality Synthetic Code Instructions | | 0 |
| Separable neural architectures as a primitive for unified predictive and generative intelligence | | 0 |
| Walking Further: Semantic-aware Multimodal Gait Recognition Under Long-Range Conditions | | 0 |
| Safety-Potential Pruning for Enhancing Safety Prompts Against VLM Jailbreaking Without Retraining | | 0 |
| S2GS: Streaming Semantic Gaussian Splatting for Online Scene Understanding and Reconstruction | | 0 |
| Windowed Fourier Propagator: A Frequency-Local Neural Operator for Wave Equations in Inhomogeneous Media | | 0 |
| Toward Clinically Ready Foundation Models in Medical Image Analysis: Adaptation Mechanisms and Deployment Trade-offs | | 0 |
| Multi-Period Texture Contrast Enhancement for Low-Contrast Wafer Defect Detection and Segmentation | | 0 |
| MorphSNN: Adaptive Graph Diffusion and Structural Plasticity for Spiking Neural Networks | | 0 |
| Label Noise Cleaning for Supervised Classification via Bernoulli Random Sampling | | 0 |
| Generation of Human Comprehensible Access Control Policies from Audit Logs | | 0 |
| OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism | | 0 |
| OCRA: Object-Centric Learning with 3D and Tactile Priors for Human-to-Robot Action Transfer | | 0 |
| Graph-Based Deep Learning for Intelligent Detection of Energy Losses, Theft, and Operational Inefficiencies in Oil & Gas Production Networks | | 0 |

Page 123 of 13232