SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 2151–2175 of 661,570 papers

| Title | Status | Hype |
| --- | --- | --- |
| OccAny: Generalized Unconstrained Urban 3D Occupancy | | 0 |
| Agentic Automation of BT-RADS Scoring: End-to-End Multi-Agent System for Standardized Brain Tumor Follow-up Assessment | | 0 |
| The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations | | 0 |
| Residual Attention Physics-Informed Neural Networks for Robust Multiphysics Simulation of Steady-State Electrothermal Energy Systems | | 0 |
| MetaKube: An Experience-Aware LLM Framework for Kubernetes Failure Diagnosis | | 0 |
| The Mass Agreement Score: A Point-centric Measure of Cluster Size Consistency | | 0 |
| Estimating Individual Tree Height and Species from UAV Imagery | | 0 |
| LineMVGNN: Anti-Money Laundering with Line-Graph-Assisted Multi-View Graph Neural Networks | | 0 |
| LongTail Driving Scenarios with Reasoning Traces: The KITScenes LongTail Dataset | | 0 |
| LLMORPH: Automated Metamorphic Testing of Large Language Models | | 0 |
| LLMLOOP: Improving LLM-Generated Code and Tests through Automated Iterative Feedback Loops | | 0 |
| M3T: Discrete Multi-Modal Motion Tokens for Sign Language Production | | 0 |
| Swiss-Bench SBP-002: A Frontier Model Comparison on Swiss Legal and Regulatory Tasks | | 0 |
| λSplit: Self-Supervised Content-Aware Spectral Unmixing for Fluorescence Microscopy | | 0 |
| Foundation Model Embeddings Meet Blended Emotions: A Multimodal Fusion Approach for the BLEMORE Challenge | | 0 |
| Ethio-ASR: Joint Multilingual Speech Recognition and Language Identification for Ethiopian Languages | | 0 |
| Boost Like a (Var)Pro: Trust-Region Gradient Boosting via Variable Projection | | 0 |
| Probing Ethical Framework Representations in Large Language Models: Structure, Entanglement, and Methodological Challenges | | 0 |
| GTO Wizard Benchmark | | 0 |
| Echoes: A semantically-aligned music deepfake detection dataset | | 0 |
| Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models | | 0 |
| Grounding Vision and Language to 3D Masks for Long-Horizon Box Rearrangement | | 0 |
| Prototype Fusion: A Training-Free Multi-Layer Approach to OOD Detection | | 0 |
| PLACID: Privacy-preserving Large language models for Acronym Clinical Inference and Disambiguation | | 0 |
| Learning What Can Be Picked: Active Reachability Estimation for Efficient Robotic Fruit Harvesting | | 0 |