SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 5601-5650 of 661,570 papers

Title | Status | Hype
Data-Local Autonomous LLM-Guided Neural Architecture Search for Multiclass Multimodal Time-Series Classification | | 0
Are Dilemmas and Conflicts in LLM Alignment Solvable? A View from Priority Graph | | 0
TurkicNLP: An NLP Toolkit for Turkic Languages | Code | 0
Context-Aware Sensor Modeling for Asynchronous Multi-Sensor Tracking in Stone Soup | | 0
How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition | | 1
Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI | Code | 0
Breaking the SFT Plateau: Multimodal Structured Reinforcement Learning for Chart-to-Code Generation | | 0
ECHO: Ego-Centric modeling of Human-Object interactions | | 0
Limitations of Public Chest Radiography Datasets for Artificial Intelligence: Label Quality, Domain Shift, Bias and Evaluation Challenges | | 0
Track-On2: Enhancing Online Point Tracking with Memory | | 0
GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning | | 0
EvoX: Meta-Evolution for Automated Discovery | | 0
Generative Visual Chain-of-Thought for Image Editing | | 0
Emotion is Not Just a Label: Latent Emotional Factors in LLM Processing | | 0
Planning as Goal Recognition: Deriving Heuristics from Intention Models - Extended Version | | 0
Dataset Distillation Efficiently Encodes Low-Dimensional Representations from Gradient-Based Learning of Non-Linear Tasks | | 0
Ablate and Rescue: A Causal Analysis of Residual Stream Hyper-Connections | | 0
FAR-Drive: Frame-AutoRegressive Video Generation in Closed-Loop Autonomous Driving | | 0
RS-WorldModel: a Unified Model for Remote Sensing Understanding and Future Sense Forecasting | | 0
FairMed-XGB: A Bayesian-Optimised Multi-Metric Framework with Explainability for Demographic Equity in Critical Healthcare Data | | 0
Bridging Scene Generation and Planning: Driving with World Model via Unifying Vision and Motion Representation | | 0
Interpretable Classification of Time Series Using Euler Characteristic Surfaces | | 0
Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty | | 0
Data Augmentation via Causal-Residual Bootstrapping | | 0
From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation | | 0
Fast SAM 3D Body: Accelerating SAM 3D Body for Real-Time Full-Body Human Mesh Recovery | | 0
SEMAG: Self-Evolutionary Multi-Agent Code Generation | | 0
FEEL (Force-Enhanced Egocentric Learning): A Dataset for Physical Action Understanding | | 0
VIBEPASS: Can Vibe Coders Really Pass the Vibe Check? | | 0
Deriving Hyperparameter Scaling Laws via Modern Optimization Theory | | 0
E2EGS: Event-to-Edge Gaussian Splatting for Pose-Free 3D Reconstruction | | 0
Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents | Code | 0
ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory | Code | 0
Fractal Autoregressive Depth Estimation with Continuous Token Diffusion | | 0
Video-CoE: Reinforcing Video Event Prediction via Chain of Events | | 0
SimCert: Probabilistic Certification for Behavioral Similarity in Deep Neural Network Compression | | 0
EventGPT: Capturing Player Impact from Team Action Sequences Using GPT-Based Framework | | 0
Chain-of-Trajectories: Unlocking the Intrinsic Generative Optimality of Diffusion Models via Graph-Theoretic Planning | Code | 0
Efficient Construction of Model Family through Progressive Training Using Model Expansion | | 0
MARVL: Multi-Stage Guidance for Robotic Manipulation via Vision-Language Models | | 0
Experimental evidence of progressive ChatGPT models self-convergence | | 0
3DTCR: A Physics-Based Generative Framework for Vortex-Following 3D Reconstruction to Improve Tropical Cyclone Intensity Forecasting | | 0
Architecture-Agnostic Feature Synergy for Universal Defense Against Heterogeneous Generative Threats | | 0
Riemannian Motion Generation: A Unified Framework for Human Motion Representation and Generation via Riemannian Flow Matching | | 0
FuXiWeather2: Learning accurate atmospheric state estimation for operational global weather forecasting | | 0
Benchmarking Machine Learning Approaches for Polarization Mapping in Ferroelectrics Using 4D-STEM | | 0
Machine Translation in the Wild: User Reaction to Xiaohongshu's Built-In Translation Feature | | 0
Lost in Aggregation: On a Fundamental Expressivity Limit of Message-Passing Graph Neural Networks | | 0
FlatLands: Generative Floormap Completion From a Single Egocentric View | | 0
Prose2Policy (P2P): A Practical LLM Pipeline for Translating Natural-Language Access Policies into Executable Rego | | 0
Page 113 of 13232