SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 601-650 of 659,983 papers

Title | Status | Hype
PolarAPP: Beyond Polarization Demosaicking for Polarimetric Applications | | 0
Generalization Bounds for Physics-Informed Neural Networks for the Incompressible Navier-Stokes Equations | | 0
Can an LLM Detect Instances of Microservice Infrastructure Patterns? | | 0
MsFormer: Enabling Robust Predictive Maintenance Services for Industrial Devices | | 0
A Bayesian Learning Approach for Drone Coverage Network: A Case Study on Cardiac Arrest in Scotland | | 0
HGNet: Scalable Foundation Model for Automated Knowledge Graph Generation from Scientific Literature | | 0
Why AI-Generated Text Detection Fails: Evidence from Explainable AI Beyond Benchmark Accuracy | | 0
Describe-Then-Act: Proactive Agent Steering via Distilled Language-Action World Models | | 0
VoDaSuRe: A Large-Scale Dataset Revealing Domain Shift in Volumetric Super-Resolution | | 0
SAiW: Source-Attributable Invisible Watermarking for Proactive Deepfake Defense | | 0
GSwap: Realistic Head Swapping with Dynamic Neural Gaussian Field | | 0
Robust Safety Monitoring of Language Models via Activation Watermarking | | 0
From Synthetic to Native: Benchmarking Multilingual Intent Classification in Logistics Customer Service | | 0
A Schrödinger Eigenfunction Method for Long-Horizon Stochastic Optimal Control | | 0
FDIF: Formula-Driven supervised Learning with Implicit Functions for 3D Medical Image Segmentation | | 0
Gaze-Regularized Vision-Language-Action Models for Robotic Manipulation | | 0
Between Resolution Collapse and Variance Inflation: Weighted Conformal Anomaly Detection in Low-Data Regimes | | 0
A Latency Coding Framework for Deep Spiking Neural Networks with Ultra-Low Latency | | 0
A One-Inclusion Graph Approach to Multi-Group Learning | | 0
A Learning Method with Gap-Aware Generation for Heterogeneous DAG Scheduling | | 0
Is AI Catching Up to Human Expression? Exploring Emotion, Personality, Authorship, and Linguistic Style in English and Arabic with Six Large Language Models | | 0
AI Lifecycle-Aware Feasibility Framework for Split-RIC Orchestration in NTN O-RAN | | 0
Permutation-Symmetrized Diffusion for Unconditional Molecular Generation | | 0
Revisiting Real-Time Digging-In Effects: No Evidence from NP/Z Garden-Paths | | 0
Harnessing Lightweight Transformer with Contextual Synergic Enhancement for Efficient 3D Medical Image Segmentation | | 0
Kinetic Langevin Splitting Schemes for Constrained Sampling | | 0
Graph Energy Matching: Transport-Aligned Energy-Based Modeling for Graph Generation | | 0
Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning | | 0
Evaluating a Multi-Agent Voice-Enabled Smart Speaker for Care Homes: A Safety-Focused Framework | | 0
SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling | | 0
Biased Error Attribution in Multi-Agent Human-AI Systems Under Delayed Feedback | | 0
Bilevel Autoresearch: Meta-Autoresearching Itself | | 0
Mecha-nudges for Machines | | 0
Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning | | 0
Targeted Adversarial Traffic Generation : Black-box Approach to Evade Intrusion Detection Systems in IoT Networks | | 0
SIGMA: A Physics-Based Benchmark for Gas Chimney Understanding in Seismic Images | | 0
Evaluating LLM-Based Test Generation Under Software Evolution | | 0
VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions | | 0
Estimating Flow Velocity and Vehicle Angle-of-Attack from Non-invasive Piezoelectric Structural Measurements Using Deep Learning | | 0
WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG | | 0
DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models | | 0
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation | | 0
MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage | | 0
OccAny: Generalized Unconstrained Urban 3D Occupancy | | 0
Agentic Automation of BT-RADS Scoring: End-to-End Multi-Agent System for Standardized Brain Tumor Follow-up Assessment | | 0
The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations | | 0
Residual Attention Physics-Informed Neural Networks for Robust Multiphysics Simulation of Steady-State Electrothermal Energy Systems | | 0
MetaKube: An Experience-Aware LLM Framework for Kubernetes Failure Diagnosis | | 0
The Mass Agreement Score: A Point-centric Measure of Cluster Size Consistency | | 0
Estimating Individual Tree Height and Species from UAV Imagery | | 0
Page 13 of 13,200