SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1051-1100 of 659,983 papers

Title | Status | Hype
WIST: Web-Grounded Iterative Self-Play Tree for Domain-Targeted Reasoning Improvement | | 0
Demystifying Low-Rank Knowledge Distillation in Large Language Models: Convergence, Generalization, and Information-Theoretic Guarantees | | 0
Bridging neuroscience and AI: adaptive, culturally sensitive technologies transforming aphasia rehabilitation | | 0
STEM Agent: A Self-Adapting, Tool-Enabled, Extensible Architecture for Multi-Protocol AI Agent Systems | | 0
ECI: Effective Contrastive Information to Evaluate Hard-Negatives | | 0
Structural Sensitivity in Compressed Transformers: Error Propagation, Lyapunov Stability, and Formally Verified Bounds | | 0
Long-Term Outlier Prediction Through Outlier Score Modeling | | 0
The Intelligent Disobedience Game: Formulating Disobedience in Stackelberg Games and Markov Decision Processes | | 0
When Does Content-Based Routing Work? Representation Requirements for Selective Attention in Hybrid Sequence Models | | 0
CLT-Forge: A Scalable Library for Cross-Layer Transcoders and Attribution Graphs | | 0
Mitigating Selection Bias in Large Language Models via Permutation-Aware GRPO | | 0
SpatialFly: Geometry-Guided Representation Alignment for UAV Vision-and-Language Navigation in Urban Environments | | 0
When Minor Edits Matter: LLM-Driven Prompt Attack for Medical VLM Robustness in Ultrasound | | 0
NoOVD: Novel Category Discovery and Embedding for Open-Vocabulary Object Detection | | 0
CTFS : Collaborative Teacher Framework for Forward-Looking Sonar Image Semantic Segmentation with Extremely Limited Labels | | 0
SqueezeComposer: Temporal Speed-up is A Simple Trick for Long-form Music Composing | | 0
CoVFT: Context-aware Visual Fine-tuning for Multimodal Large Language Models | | 0
Assessing the Ability of Neural TTS Systems to Model Consonant-Induced F0 Perturbation | | 0
Hierarchical Text-Guided Brain Tumor Segmentation via Sub-Region-Aware Prompts | | 0
ViCLSR: A Supervised Contrastive Learning Framework with Natural Language Inference for Natural Language Understanding Tasks | | 0
Interpreting the Synchronization Gap: The Hidden Mechanism Inside Diffusion Transformers | | 0
Can we automatize scientific discovery in the cognitive sciences? | | 0
Behavioural feasible set: Value alignment constraints on AI decision support | | 0
Text-Image Conditioned 3D Generation | | 0
Direct Interval Propagation Methods using Neural-Network Surrogates for Uncertainty Quantification in Physical Systems Surrogate Model | | 0
FluidWorld: Reaction-Diffusion Dynamics as a Predictive Substrate for World Models | | 0
HELIX: Scaling Raw Audio Understanding with Hybrid Mamba-Attention Beyond the Quadratic Limit | | 0
Stream separation improves Bregman conditioning in transformers | | 0
KHMP: Frequency-Domain Kalman Refinement for High-Fidelity Human Motion Prediction | | 0
COINBench: Moving Beyond Individual Perspectives to Collective Intent Understanding | | 0
FinRL-X: An AI-Native Modular Infrastructure for Quantitative Trading | | 0
Taming Sampling Perturbations with Variance Expansion Loss for Latent Diffusion Models | | 0
CVT-Bench: Counterfactual Viewpoint Transformations Reveal Unstable Spatial Representations in Multimodal LLMs | | 0
MS-CustomNet: Controllable Multi-Subject Customization with Hierarchical Relational Semantics | | 0
Incentivizing Generative Zero-Shot Learning via Outcome-Reward Reinforcement Learning with Visual Cues | | 0
Ontology-driven personalized information retrieval for XML documents | | 0
ORACLE: Optimizing Reasoning Abilities of Large Language Models via Constraint-Led Synthetic Data Elicitation | | 0
Time-adaptive functional Gaussian Process regression | | 0
NeSy-Edge: Neuro-Symbolic Trustworthy Self-Healing in the Computing Continuum | | 0
Learning from Label Proportions with Dual-proportion Constraints | | 0
Training-Free Instance-Aware 3D Scene Reconstruction and Diffusion-Based View Synthesis from Sparse Images | | 0
Model Evolution Under Zeroth-Order Optimization: A Neural Tangent Kernel Perspective | | 0
Pruned Adaptation Modules: A Simple yet Strong Baseline for Continual Foundation Models | | 0
Entropy Alone is Insufficient for Safe Selective Prediction in LLMs | | 0
Rethinking Plasticity in Deep Reinforcement Learning | | 0
Explainable Semantic Textual Similarity via Dissimilar Span Detection | | 0
Reward Sharpness-Aware Fine-Tuning for Diffusion Models | | 0
On the Role of Batch Size in Stochastic Conditional Gradient Methods | | 0
DSCSNet: A Dynamic Sparse Compression Sensing Network for Closely-Spaced Infrared Small Target Unmixing | | 0
Positional Segmentor-Guided Counterfactual Fine-Tuning for Spatially Localized Image Synthesis | | 0
Page 22 of 13,200