SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 9701-9750 of 661,570 papers

Title | Status | Hype
SALVE: Sparse Autoencoder-Latent Vector Editing for Mechanistic Control of Neural Networks | | 0
Concurrent training methods for Kolmogorov-Arnold networks: Disjoint datasets and FPGA implementation | | 0
Cost Trade-offs of Reasoning and Non-Reasoning Large Language Models in Text-to-SQL | | 0
Sparse Offline Reinforcement Learning with Corruption Robustness | | 0
A Two-Stage Multitask Vision-Language Framework for Explainable Crop Disease Visual Question Answering | | 0
The Algorithmic Gaze of Image Quality Assessment: An Audit and Trace Ethnography of the LAION-Aesthetics Predictor | | 0
Replayable Financial Agents: A Determinism-Faithfulness Assurance Harness for Tool-Using LLM Agents | | 0
Continuous-Flow Data-Rate-Aware CNN Inference on FPGA | | 0
Semantic Search over 9 Million Mathematical Theorems | | 0
Diffusion-Guided Pretraining for Brain Graph Foundation Models | | 0
Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference | | 0
Move What Matters: Parameter-Efficient Domain Adaptation via Optimal Transport Flow for Collaborative Perception | | 0
Accelerating Robotic Reinforcement Learning with Agent Guidance | | 0
A Geometric Taxonomy of Hallucinations in LLMs | | 0
TrasMuon: Trust-Region Adaptive Scaling for Orthogonalized Momentum Optimizers | | 0
A Hybrid LTR-based System via Social Context Embedding for Recommending Solutions of Software Bugs in Developer Communities | | 0
Accelerated Predictive Coding Networks via Direct Kolen-Pollack Feedback Alignment | | 0
On the Power of Source Screening for Learning Shared Feature Extractors | | 0
MSP-ReID: Hairstyle-Robust Cloth-Changing Person Re-Identification | | 0
Symmetry-Driven Generation of Crystal Structures from Composition | | 0
Condition-Gated Reasoning for Context-Dependent Biomedical Question Answering | | 0
Benchmarking GNN Models on Molecular Regression Tasks with CKA-Based Representation Analysis | | 0
Position: Evaluation of Visual Processing Should Be Human-Centered, Not Metric-Centered | | 0
Vibe Researching as Wolf Coming: Can AI Agents with Skills Replace or Augment Social Scientists? | | 0
ACES: Accent Subspaces for Coupling, Explanations, and Stress-Testing in Automatic Speech Recognition | | 0
Personalized Multi-Agent Average Reward TD-Learning via Joint Linear Approximation | | 0
DeAR: Fine-Grained VLM Adaptation by Decomposing Attention Head Roles | | 0
LLM-assisted Semantic Option Discovery for Facilitating Adaptive Deep Reinforcement Learning | | 0
Gated Differential Linear Attention: A Linear-Time Decoder for High-Fidelity Medical Segmentation | | 0
LDP-Slicing: Local Differential Privacy for Images via Randomized Bit-Plane Slicing | | 0
Learning Quadruped Walking from Seconds of Demonstration | | 0
A SISA-based Machine Unlearning Framework for Power Transformer Inter-Turn Short-Circuit Fault Localization | | 0
Not All Neighbors Matter: Understanding the Impact of Graph Sparsification on GNN Pipelines | | 0
Virtual Intraoperative CT (viCT): Sequential Anatomic Updates for Modeling Tissue Resection Throughout Endoscopic Sinus Surgery | | 0
Post-Training with Policy Gradients: Optimality and the Base Model Barrier | | 0
Chart-RL: Generalized Chart Comprehension via Reinforcement Learning with Verifiable Rewards | | 0
SurgCUT3R: Surgical Scene-Aware Continuous Understanding of Temporal 3D Representation | | 0
T2SGrid: Temporal-to-Spatial Gridification for Video Temporal Grounding | | 0
Elenchus: Generating Knowledge Bases from Prover-Skeptic Dialogues | | 0
NePPO: Near-Potential Policy Optimization for General-Sum Multi-Agent Reinforcement Learning | | 0
Diffusion Controller: Framework, Algorithms and Parameterization | | 0
Optimizing Multi-Modal Models for Image-Based Shape Retrieval: The Role of Pre-Alignment and Hard Contrastive Learning | | 0
Perception-Aware Multimodal Spatial Reasoning from Monocular Images | | 0
ADAS-TO: A Large-Scale Multimodal Naturalistic Dataset and Empirical Characterization of Human Takeovers during ADAS Engagement | | 0
Large Language Model-Driven Full-Component Evolution of Adaptive Large Neighborhood Search | | 0
Combinatorial Allocation Bandits with Nonlinear Arm Utility | | 0
SuperSkillsStack: Agency, Domain Knowledge, Imagination, and Taste in Human-AI Design Education | | 0
Can Safety Emerge from Weak Supervision? A Systematic Analysis of Small Language Models | | 0
TEA-Time: Transporting Effects Across Time | | 0
RESCHED: Rethinking Flexible Job Shop Scheduling from a Transformer-based Architecture with Simplified States | | 0
Page 195 of 13232