SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1301-1325 of 659,983 papers

Title | Status | Hype
Characterizing the onset and offset of motor imagery during passive arm movements induced by an upper-body exoskeleton | | 0
Scene Graph-guided SegCaptioning Transformer with Fine-grained Alignment for Controllable Video Segmentation and Captioning | | 0
Auto-differentiable data assimilation: Co-learning of states, dynamics, and filtering algorithms | | 0
LLM Router: Prefill is All You Need | | 0
Beyond the Birkhoff Polytope: Spectral-Sphere-Constrained Hyper-Connections | | 0
The data heat island effect: quantifying the impact of AI data centers in a warming world | | 0
Natural Gradient Descent for Online Continual Learning | | 0
Mitigating Shortcut Reasoning in Language Models: A Gradient-Aware Training Approach | | 0
The Hidden Puppet Master: A Theoretical and Real-World Account of Emotional Manipulation in LLMs | | 0
Bayesian Scattering: A Principled Baseline for Uncertainty on Image Data | | 0
LLM-ODE: Data-driven Discovery of Dynamical Systems with Large Language Models | | 0
Do LLM-Driven Agents Exhibit Engagement Mechanisms? Controlled Tests of Information Load, Descriptive Norms, and Popularity Cues | | 0
Enhancing LIME using Neural Decision Trees | | 0
Democratizing AI: A Comparative Study in Deep Learning Efficiency and Future Trends in Computational Processing | | 0
Discriminative Representation Learning for Clinical Prediction | | 0
Profit is the Red Team: Stress-Testing Agents in Strategic Economic Interactions | | 0
MOELIGA: a multi-objective evolutionary approach for feature selection with local improvement | | 0
User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction | | 0
gUFO: A Gentle Foundational Ontology for Semantic Web Knowledge Graphs | | 0
Understanding Contextual Recall in Transformers: How Finetuning Enables In-Context Reasoning over Pretraining Knowledge | | 0
GraPHFormer: A Multimodal Graph Persistent Homology Transformer for the Analysis of Neuroscience Morphologies | | 0
DiscoUQ: Structured Disagreement Analysis for Uncertainty Quantification in LLM Agent Ensembles | | 0
Detection of adversarial intent in Human-AI teams using LLMs | | 0
MERIT: Multi-domain Efficient RAW Image Translation | | 0
Dodgersort: Uncertainty-Aware VLM-Guided Human-in-the-Loop Pairwise Ranking | | 0