SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,984 papers · 248,105 code links · 4,818 tasks

Papers

Showing 3301-3350 of 659,984 papers

Title | Status | Hype
ExpressMind: A Multimodal Pretrained Large Language Model for Expressway Operation | | 0
CompDiff: Hierarchical Compositional Diffusion for Fair and Zero-Shot Intersectional Medical Image Generation | | 0
EmoLLM: Appraisal-Grounded Cognitive-Emotional Co-Reasoning in Large Language Models | | 0
Characterizing Delusional Spirals through Human-LLM Chat Logs | | 0
V-DyKnow: A Dynamic Benchmark for Time-Sensitive Knowledge in Vision Language Models | | 0
What if Pinocchio Were a Reinforcement Learning Agent: A Normative End-to-End Pipeline | | 0
x^2-Fusion: Cross-Modality and Cross-Dimension Flow Estimation in Event Edge Space | | 0
CritiSense: Critical Digital Literacy and Resilience Against Misinformation | | 0
Novelty-Driven Target-Space Discovery in Automated Electron and Scanning Probe Microscopy | | 0
Federated Learning with Multi-Partner OneFlorida+ Consortium Data for Predicting Major Postoperative Complications | | 0
V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising | | 0
Adaptive Moments are Surprisingly Effective for Plug-and-Play Diffusion Sampling | | 0
Is Conformal Factuality for RAG-based LLMs Robust? Novel Metrics and Systematic Insights | | 0
Deep Reinforcement Learning-driven Edge Offloading for Latency-constrained XR pipelines | | 0
An assessment of data-centric methods for label noise identification in remote sensing data sets | | 0
Mediocrity is the key for LLM as a Judge Anchor Selection | | 0
Unifying Optimization and Dynamics to Parallelize Sequential Computation: A Guide to Parallel Newton Methods for Breaking Sequential Bottlenecks | | 0
SOMA: Unifying Parametric Human Body Models | | 0
Chronos: Temporal-Aware Conversational Agents with Structured Event Retrieval for Long-Term Memory | | 0
MessyKitchens: Contact-rich object-level 3D scene reconstruction | | 0
Enhancing Moral Diagnosis and Correction in Large Language Models | | 0
REFINE-DP: Diffusion Policy Fine-tuning for Humanoid Loco-manipulation via Reinforcement Learning | | 0
PhysQuantAgent: An Inference Pipeline of Mass Estimation for Vision-Language Models | | 0
MSRAMIE: Multimodal Structured Reasoning Agent for Multi-instruction Image Editing | | 0
Hybrid Classical-Quantum Transfer Learning with Noisy Quantum Circuits | | 0
Implementation of tangent linear and adjoint models for neural networks based on a compiler library tool | | 0
Early Quantization Shrinks Codebook: A Simple Fix for Diversity-Preserving Tokenization | | 0
SYMDIREC: A Neuro-Symbolic Divide-Retrieve-Conquer Framework for Enhanced RTL Synthesis and Summarization | | 0
Generative AI-assisted Participatory Modeling in Socio-Environmental Planning under Deep Uncertainty | | 0
SMAL-pets: SMAL Based Avatars of Pets from Single Image | | 0
Contextual Preference Distribution Learning | | 0
How Clued up are LLMs? Evaluating Multi-Step Deductive Reasoning in a Text-Based Game Environment | | 0
Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems | | 0
Conditional Distributional Treatment Effects: Doubly Robust Estimation and Testing | | 0
Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning | | 0
DeepStage: Learning Autonomous Defense Policies Against Multi-Stage APT Campaigns | | 0
Age Predictors Through the Lens of Generalization, Bias Mitigation, and Interpretability: Reflections on Causal Implications | | 0
Scalable Sample-Level Causal Discovery in Event Sequences via Autoregressive Density Estimation | | 0
VALD: Multi-Stage Vision Attack Detection for Efficient LVLM Defense | | 0
From the Inside Out: Progressive Distribution Refinement for Confidence Calibration | | 0
AgriChrono: A Multi-modal Dataset Capturing Crop Growth and Lighting Variability with a Field Robot | Code | 0
Diverging Transformer Predictions for Human Sentence Processing: A Comprehensive Analysis of Agreement Attraction Effects | | 0
HistoAtlas: A Pan-Cancer Morphology Atlas Linking Histomics to Molecular Programs and Clinical Outcomes | | 0
Segmentation-Based Attention Entropy: Detecting and Mitigating Object Hallucinations in Large Vision-Language Models | | 0
DanceHA: A Multi-Agent Framework for Document-Level Aspect-Based Sentiment Analysis | | 0
Order Matters: 3D Shape Generation from Sequential VR Sketches | | 0
PEPPER: Perception-Guided Perturbation for Robust Backdoor Defense in Text-to-Image Diffusion Models | | 0
Multilingual Reference Need Assessment System for Wikipedia | | 0
BEV-SLD: Self-Supervised Scene Landmark Detection for Global Localization with LiDAR Bird's-Eye View Images | | 0
Who's important? -- SUnSET: Synergistic Understanding of Stakeholder, Events and Time for Timeline Generation | | 0
Page 67 of 13,200