SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5451 to 5500 of 661,570 papers

Title | Status | Hype
PMAx: An Agentic Framework for AI-Driven Process Mining | | 0
Conditional Rectified Flow-based End-to-End Rapid Seismic Inversion Method | | 0
Controlled Langevin Dynamics for Sampling of Feedforward Neural Networks Trained with Minibatches | | 0
Trajectory-Diversity-Driven Robust Vision-and-Language Navigation | | 0
SFCoT: Safer Chain-of-Thought via Active Safety Evaluation and Calibration | | 0
Brain-Inspired Graph Multi-Agent Systems for LLM Reasoning | | 0
SKILLS: Structured Knowledge Injection for LLM-Driven Telecommunications Operations | | 0
Spectral Rectification for Parameter-Efficient Adaptation of Foundation Models in Colonoscopy Depth Estimation | | 0
Why AI systems don't learn and what to do about it: Lessons on autonomous learning from cognitive science | | 0
Efficient Morphology-Control Co-Design via Stackelberg Proximal Policy Optimization | | 0
Beyond the Covariance Trap: Unlocking Generalization in Same-Subject Knowledge Editing for Large Language Models | | 0
TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems | | 0
SEA-Vision: A Multilingual Benchmark for Comprehensive Document and Scene Text Understanding in Southeast Asia | | 0
Local Urysohn Width: A Topological Complexity Measure for Classification | | 0
RESQ: A Unified Framework for REliability- and Security Enhancement of Quantized Deep Neural Networks | | 0
AnyCrowd: Instance-Isolated Identity-Pose Binding for Arbitrary Multi-Character Animation | | 0
Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities | | 0
MA-VLCM: A Vision Language Critic Model for Value Estimation of Policies in Multi-Agent Team Settings | | 0
CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents | | 0
Physics-informed fine-tuning of foundation models for partial differential equations | | 0
Real-Time Human Frontal View Synthesis from a Single Image | | 0
Music Genre Classification: A Comparative Analysis of Classical Machine Learning and Deep Learning Approaches | | 0
Evaluating Time Awareness and Cross-modal Active Perception of Large Models via 4D Escape Room Task | | 0
Anchor then Polish for Low-light Enhancement | | 0
TabKD: Tabular Knowledge Distillation through Interaction Diversity of Learned Feature Bins | | 0
Talk, Evaluate, Diagnose: User-aware Agent Evaluation with Automated Error Analysis | | 0
Grokking as a Variance-Limited Phase Transition: Spectral Gating and the Epsilon-Stability Threshold | | 0
Seeking SOTA: Time-Series Forecasting Must Adopt Taxonomy-Specific Evaluation to Dispel Illusory Gains | | 0
Not All Invariants Are Equal: Curating Training Data to Accelerate Program Verification with SLMs | | 0
FreeTalk: Emotional Topology-Free 3D Talking Heads | | 0
Building Trust in PINNs: Error Estimation through Finite Difference Methods | | 0
Vib2ECG: A Paired Chest-Lead SCG-ECG Dataset and Benchmark for ECG Reconstruction | | 0
DOT: Dynamic Knob Selection and Online Sampling for Automated Database Tuning | | 0
Bridging Local and Global Knowledge: Cascaded Mixture-of-Experts Learning for Near-Shortest Path Routing | | 0
Kimodo: Scaling Controllable Human Motion Generation | | 0
Severe Domain Shift in Skeleton-Based Action Recognition: A Study of Uncertainty Failure in Real-World Gym Environments | | 0
Computational Concept of the Psyche | | 0
Robust and Computationally Efficient Linear Contextual Bandits under Adversarial Corruption and Heavy-Tailed Noise | | 0
Position-Blind Ptychography: Viability of image reconstruction via data-driven variational inference | | 0
Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion | | 0
Mechanistic Origin of Moral Indifference in Language Models | | 0
Can large language models assist choice modelling? Insights into prompting strategies and current models capabilities | | 0
HorizonMath: Measuring AI Progress Toward Mathematical Discovery with Automatic Verification | | 1
No More Blind Spots: Learning Vision-Based Omnidirectional Bipedal Locomotion for Challenging Terrain | | 0
NanoFlux: Adversarial Dual-LLM Evaluation and Distillation For Multi-Domain Reasoning | | 0
A Dynamic Time Warping-Transfer Learning Approach to Transferring Knowledge in Stress-strain Behaviors from Polymers to Metals: An Affordable and Generalizable Additive Manufacturing Part Qualification Framework | | 0
Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space | | 0
Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain | | 0
Tail Distribution of Regret in Optimistic Reinforcement Learning | | 0
MorphSeek: Fine-grained Latent Representation-Level Policy Optimization for Deformable Image Registration | | 0
Page 110 of 13,232