SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 5601-5625 of 661,570 papers

| Title | Status | Hype |
|---|---|---|
| Data-Local Autonomous LLM-Guided Neural Architecture Search for Multiclass Multimodal Time-Series Classification | | 0 |
| Are Dilemmas and Conflicts in LLM Alignment Solvable? A View from Priority Graph | | 0 |
| TurkicNLP: An NLP Toolkit for Turkic Languages | Code | 0 |
| Context-Aware Sensor Modeling for Asynchronous Multi-Sensor Tracking in Stone Soup | | 0 |
| How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition | | 1 |
| Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI | Code | 0 |
| Breaking the SFT Plateau: Multimodal Structured Reinforcement Learning for Chart-to-Code Generation | | 0 |
| ECHO: Ego-Centric modeling of Human-Object interactions | | 0 |
| Limitations of Public Chest Radiography Datasets for Artificial Intelligence: Label Quality, Domain Shift, Bias and Evaluation Challenges | | 0 |
| Track-On2: Enhancing Online Point Tracking with Memory | | 0 |
| GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning | | 0 |
| EvoX: Meta-Evolution for Automated Discovery | | 0 |
| Generative Visual Chain-of-Thought for Image Editing | | 0 |
| Emotion is Not Just a Label: Latent Emotional Factors in LLM Processing | | 0 |
| Planning as Goal Recognition: Deriving Heuristics from Intention Models - Extended Version | | 0 |
| Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks | | 0 |
| Ablate and Rescue: A Causal Analysis of Residual Stream Hyper-Connections | | 0 |
| FAR-Drive: Frame-AutoRegressive Video Generation in Closed-Loop Autonomous Driving | | 0 |
| RS-WorldModel: a Unified Model for Remote Sensing Understanding and Future Sense Forecasting | | 0 |
| FairMed-XGB: A Bayesian-Optimised Multi-Metric Framework with Explainability for Demographic Equity in Critical Healthcare Data | | 0 |
| Bridging Scene Generation and Planning: Driving with World Model via Unifying Vision and Motion Representation | | 0 |
| Interpretable Classification of Time Series Using Euler Characteristic Surfaces | | 0 |
| Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty | | 0 |
| Data Augmentation via Causal-Residual Bootstrapping | | 0 |
| From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation | | 0 |