SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6701-6750 of 661,570 papers

Title | Status | Hype
Causal Cellular Context Transfer Learning (C3TL): An Efficient Architecture for Prediction of Unseen Perturbation Effects | | 0
Reference-Free Image Quality Assessment for Virtual Try-On via Human Feedback | | 0
GeoChemAD: Benchmarking Unsupervised Geochemical Anomaly Detection for Mineral Exploration | | 0
Rooftop Wind Field Reconstruction Using Sparse Sensors: From Deterministic to Generative Learning Methods | | 0
Human-in-the-Loop LLM Grading for Handwritten Mathematics Assessments | | 0
Reasoning over Video: Evaluating How MLLMs Extract, Integrate, and Reconstruct Spatiotemporal Evidence | | 0
ZO-SAM: Zero-Order Sharpness-Aware Minimization for Efficient Sparse Training | | 0
XQC: Well-conditioned Optimization Accelerates Deep Reinforcement Learning | | 0
Explainable Visual Anomaly Detection via Concept Bottleneck Models | | 0
Entropy Collapse: A Universal Failure Mode of Intelligent Systems | | 0
LLM Novice Uplift on Dual-Use, In Silico Biology Tasks | | 0
Scaling Reward Modeling without Human Supervision | | 0
Dynamic Sparse Attention: Access Patterns and Architecture | | 0
Spatially Grounded Long-Horizon Task Planning in the Wild | | 0
Vision-Language Based Expert Reporting for Painting Authentication and Defect Detection | | 0
Filtered Spectral Projection for Quantum Principal Component Analysis | | 0
MESD: Detecting and Mitigating Procedural Bias in Intersectional Groups | | 0
Probabilistic Gaussian Homotopy: A Probability-Space Continuation Framework for Nonconvex Optimization | | 0
Performance evaluation of deep learning models for image analysis: considerations for visual control and statistical metrics | | 0
Analytical Logit Scaling for High-Resolution Sea Ice Topology Retrieval from Weakly Labeled SAR Imagery | | 0
A Causal Framework for Mitigating Data Shifts in Healthcare | | 0
StatePlane: A Cognitive State Plane for Long-Horizon AI Systems Under Bounded Context | | 0
Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials | | 0
Design and evaluation of an agentic workflow for crisis-related synthetic tweet datasets | | 0
SldprtNet: A Large-Scale Multimodal Dataset for CAD Generation in Language-Driven 3D Design | | 0
Improving Channel Estimation via Multimodal Diffusion Models with Flow Matching | | 0
Active Sampling Sample-based Quantum Diagonalization from Finite-Shot Measurements | | 0
The AI Fiction Paradox | | 0
Ghosts of Softmax: Complex Singularities That Limit Safe Step Sizes in Cross-Entropy | | 0
VoXtream2: Full-stream TTS with dynamic speaking rate control | | 0
Learnability and Privacy Vulnerability are Entangled in a Few Critical Weights | | 0
LADR: Locality-Aware Dynamic Rescue for Efficient Text-to-Image Generation with Diffusion Large Language Models | | 0
A Grid-Based Framework for E-Scooter Demand Representation and Temporal Input Design for Deep Learning: Evidence from Austin, Texas | | 0
Topo-R1: Detecting Topological Anomalies via Vision-Language Models | | 0
LLM Routing as Reasoning: A MaxSAT View | | 0
PLUME: Building a Network-Native Foundation Model for Wireless Traces via Protocol-Aware Tokenization | | 0
Learning to Repair Lean Proofs from Compiler Feedback | | 0
InterEdit: Navigating Text-Guided Multi-Human 3D Motion Editing | Code | 0
Panoramic Multimodal Semantic Occupancy Prediction for Quadruped Robots | Code | 0
Representation Learning for Spatiotemporal Physical Systems | Code | 0
LOSC: LiDAR Open-voc Segmentation Consolidator | Code | 0
Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs | Code | 0
CHIMERA-Bench: A Benchmark Dataset for Epitope-Specific Antibody Design | Code | 0
Reconciling In-Context and In-Weight Learning via Dual Representation Space Encoding | Code | 0
DiveUp: Learning Feature Upsampling from Diverse Vision Foundation Models | Code | 0
OpenACMv2: An Accuracy-Constrained Co-Optimization Framework for Approximate DCiM | Code | 0
NOIR: Neural Operator mapping for Implicit Representations | Code | 0
EvoLMM: Self-Evolving Large Multimodal Models with Continuous Rewards | Code | 0
Steer Away From Mode Collisions: Improving Composition In Diffusion Models | Code | 0
Resolving Interference (RI): Disentangling Models for Improved Model Merging | Code | 0