SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4876–4900 of 661570 papers

Title | Status | Hype
Enhancing Moral Diagnosis and Correction in Large Language Models |  | 0
REFINE-DP: Diffusion Policy Fine-tuning for Humanoid Loco-manipulation via Reinforcement Learning |  | 0
PhysQuantAgent: An Inference Pipeline of Mass Estimation for Vision-Language Models |  | 0
MSRAMIE: Multimodal Structured Reasoning Agent for Multi-instruction Image Editing |  | 0
Hybrid Classical-Quantum Transfer Learning with Noisy Quantum Circuits |  | 0
Implementation of tangent linear and adjoint models for neural networks based on a compiler library tool |  | 0
Early Quantization Shrinks Codebook: A Simple Fix for Diversity-Preserving Tokenization |  | 0
SYMDIREC: A Neuro-Symbolic Divide-Retrieve-Conquer Framework for Enhanced RTL Synthesis and Summarization |  | 0
Generative AI-assisted Participatory Modeling in Socio-Environmental Planning under Deep Uncertainty |  | 0
SMAL-pets: SMAL Based Avatars of Pets from Single Image |  | 0
Contextual Preference Distribution Learning |  | 0
How Clued up are LLMs? Evaluating Multi-Step Deductive Reasoning in a Text-Based Game Environment |  | 0
Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems |  | 0
Conditional Distributional Treatment Effects: Doubly Robust Estimation and Testing |  | 0
Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning |  | 0
DeepStage: Learning Autonomous Defense Policies Against Multi-Stage APT Campaigns |  | 0
Age Predictors Through the Lens of Generalization, Bias Mitigation, and Interpretability: Reflections on Causal Implications |  | 0
Scalable Sample-Level Causal Discovery in Event Sequences via Autoregressive Density Estimation |  | 0
VALD: Multi-Stage Vision Attack Detection for Efficient LVLM Defense |  | 0
From the Inside Out: Progressive Distribution Refinement for Confidence Calibration |  | 0
AgriChrono: A Multi-modal Dataset Capturing Crop Growth and Lighting Variability with a Field Robot | Code | 0
Diverging Transformer Predictions for Human Sentence Processing: A Comprehensive Analysis of Agreement Attraction Effects |  | 0
HistoAtlas: A Pan-Cancer Morphology Atlas Linking Histomics to Molecular Programs and Clinical Outcomes |  | 0
Segmentation-Based Attention Entropy: Detecting and Mitigating Object Hallucinations in Large Vision-Language Models |  | 0
DanceHA: A Multi-Agent Framework for Document-Level Aspect-Based Sentiment Analysis |  | 0
Page 196 of 26463