SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7401–7450 of 661,570 papers

Title | Status | Hype
Portfolio of Solving Strategies in CEGAR-based Object Packing and Scheduling for Sequential 3D Printing | | 0
Incremental Neural Network Verification via Learned Conflicts | | 0
STAMP: Selective Task-Aware Mechanism for Text Privacy | | 0
BiGain: Unified Token Compression for Joint Generation and Classification | | 0
Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training | | 0
SciMDR: Benchmarking and Advancing Scientific Multimodal Document Reasoning | | 0
Attend Before Attention: Efficient and Scalable Video Understanding via Autoregressive Gazing | | 0
DVD: Deterministic Video Depth Estimation with Generative Priors | | 3
DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning | | 0
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation | | 1
Do LLMs have a Gender (Entropy) Bias? | | 0
From Video to EEG: Adapting Joint Embedding Predictive Architecture to Uncover Spatiotemporal Dynamics in Brain Signal Analysis | | 0
SATGround: A Spatially-Aware Approach for Visual Grounding in Remote Sensing | | 0
Aligning Large Language Model Agents with Rational and Moral Preferences: A Supervised Fine-Tuning Approach | | 0
Intrinsic training dynamics of deep neural networks | | 0
On the (In)Security of Loading Machine Learning Models | | 0
LowDiff: Efficient Diffusion Sampling with Low-Resolution Condition | | 0
ASTGI: Adaptive Spatio-Temporal Graph Interactions for Irregular Multivariate Time Series Forecasting | | 0
Scaling Generalist Data-Analytic Agents | | 0
HoneyBee: Data Recipes for Vision-Language Reasoners | | 1
RLM: A Vision-Language Model Approach for Radar Scene Understanding | | 0
Epistemic diversity across language models mitigates knowledge collapse | | 0
Information-Consistent Language Model Recommendations through Group Relative Policy Optimization | | 0
From XAI to Stories: A Factorial Study of LLM-Generated Explanation Quality | | 0
AIMC-Spec: A Benchmark Dataset for Automatic Intrapulse Modulation Classification under Variable Noise Conditions | | 0
A Longitudinal, Multinational, and Multilingual Corpus of News Coverage of the Russo-Ukrainian War | | 0
OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation | | 0
Hierarchical Concept Embedding & Pursuit for Interpretable Image Classification | | 0
Asynchronous Verified Semantic Caching for Tiered LLM Architectures | | 0
OpenSage: Self-programming Agent Generation Engine | | 0
TASTE-Streaming: Towards Streamable Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling | | 0
Probabilistic Joint and Individual Variation Explained (ProJIVE) for Data Integration | | 0
Spatial PDE-aware Selective State-space with Nested Memory for Mobile Traffic Grid Forecasting | | 0
Sinkhorn-Drifting Generative Models | | 0
Multi-Step Semantic Reasoning in Generative Retrieval | | 0
The Privacy-Utility Trade-Off of Location Tracking in Ad Personalization | | 0
NeuroLoRA: Context-Aware Neuromodulation for Parameter-Efficient Multi-Task Adaptation | | 0
SPARROW: Learning Spatial Precision and Temporal Referential Consistency in Pixel-Grounded Video MLLMs | | 0
Test-Time Strategies for More Efficient and Accurate Agentic RAG | | 0
Not Just the Destination, But the Journey: Reasoning Traces Causally Shape Generalization Behaviors | | 0
Generation of maximal snake polyominoes using a deep neural network | | 0
Beyond Motion Imitation: Is Human Motion Data Alone Sufficient to Explain Gait Control and Biomechanics? | | 0
A Neuro-Symbolic Framework Combining Inductive and Deductive Reasoning for Autonomous Driving Planning | | 0
Interpreting Negation in GPT-2: Layer- and Head-Level Causal Analysis | | 0
Surg-R1: A Hierarchical Reasoning Foundation Model for Scalable and Interpretable Surgical Decision Support with Multi-Center Clinical Validation | | 0
KernelFoundry: Hardware-aware evolutionary GPU kernel optimization | | 0
Unmasking Biases and Reliability Concerns in Convolutional Neural Networks Analysis of Cancer Pathology Images | | 0
FloeNet: A mass-conserving global sea ice emulator that generalizes across climates | | 0
CSE-UOI at SemEval-2026 Task 6: A Two-Stage Heterogeneous Ensemble with Deliberative Complexity Gating for Political Evasion Detection | | 0
Shattering the Shortcut: A Topology-Regularized Benchmark for Multi-hop Medical Reasoning in LLMs | | 0
Page 149 of 13232