SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers · 248,326 code links · 4,818 tasks

Papers

Showing 15401-15450 of 474278 papers

Title | Status | Hype
Projecting U.S. coastal storm surge risks and impacts with deep learning | | 0
VL-GenRM: Enhancing Vision-Language Verification via Vision Experts and Iterative Training | | 0
StaQ it! Growing neural networks for Policy Mirror Descent | | 0
Are manual annotations necessary for statutory interpretations retrieval? | | 0
Adaptive Guidance Accelerates Reinforcement Learning of Reasoning Models | | 0
Prefix-Tuning+: Modernizing Prefix-Tuning by Decoupling the Prefix from Attention | | 0
ProfiLLM: An LLM-Based Framework for Implicit Profiling of Chatbot Users | | 0
Discovering Temporal Structure: An Overview of Hierarchical Reinforcement Learning | | 0
Evolvable Conditional Diffusion | | 0
Robustness of Reinforcement Learning-Based Traffic Signal Control under Incidents: A Comparative Study | | 0
Toward Explainable Offline RL: Analyzing Representations in Intrinsically Motivated Decision Transformers | | 0
Bures-Wasserstein Flow Matching for Graph Generation | | 0
Scientifically-Interpretable Reasoning Network (ScIReN): Uncovering the Black-Box of Nature | | 0
Meta Optimality for Demographic Parity Constrained Regression via Post-Processing | | 0
No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need! | | 0
Graph-Convolutional-Beta-VAE for Synthetic Abdominal Aorta Aneurysm Generation | | 0
Load Balancing Mixture of Experts with Similarity Preserving Routers | | 0
Integrating Knowledge Graphs and Bayesian Networks: A Hybrid Approach for Explainable Disease Risk Prediction | | 0
Self-Supervised Enhancement for Depth from a Lightweight ToF Sensor with Monocular Images | Code | 1
Evolutionary chemical learning in dimerization networks | Code | 0
Sustainable Machine Learning Retraining: Optimizing Energy Efficiency Without Compromising Accuracy | Code | 0
Safe Domains of Attraction for Discrete-Time Nonlinear Systems: Characterization and Verifiable Neural Network Estimation | Code | 0
SatHealth: A Multimodal Public Health Dataset with Satellite-based Environmental Factors | Code | 0
Density-aware Walks for Coordinated Campaign Detection | Code | 0
Evaluating Generalization and Representation Stability in Small LMs via Prompting, Fine-Tuning and Out-of-Distribution Prompts | | 0
Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction | Code | 0
Unlearning Isn't Invisible: Detecting Unlearning Traces in LLMs from Model Outputs | Code | 1
ROSAQ: Rotation-based Saliency-Aware Weight Quantization for Efficiently Compressing Large Language Models | | 0
Effective Stimulus Propagation in Neural Circuits: Driver Node Selection | | 0
LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning | Code | 0
Quantum-Informed Contrastive Learning with Dynamic Mixup Augmentation for Class-Imbalanced Expert Systems | | 0
Investigating the interaction of linguistic and mathematical reasoning in language models using multilingual number puzzles | | 0
GITO: Graph-Informed Transformer Operator for Learning Complex Partial Differential Equations | | 0
A Silent Speech Decoding System from EEG and EMG with Heterogenous Electrode Configurations | | 0
Automatic Extraction of Clausal Embedding Based on Large-Scale English Text Data | Code | 0
How Does LLM Reasoning Work for Code? A Survey and a Call to Action | | 0
A Regret Perspective on Online Selective Generation | | 0
Estimation of Treatment Effects in Extreme and Unobserved Data | | 0
Experimental Design for Semiparametric Bandits | | 0
Constant Stepsize Local GD for Logistic Regression: Acceleration by Instability | | 0
Connecting phases of matter to the flatness of the loss landscape in analog variational quantum algorithms | | 0
Enhancing interpretability of rule-based classifiers through feature graphs | Code | 0
SeqPE: Transformer with Sequential Position Encoding | Code | 1
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation | | 0
EmoNews: A Spoken Dialogue System for Expressive News Conversations | Code | 0
Causal Mediation Analysis with Multiple Mediators: A Simulation Approach | Code | 0
A Production Scheduling Framework for Reinforcement Learning Under Real-World Constraints | Code | 1
HierVL: Semi-Supervised Segmentation leveraging Hierarchical Vision-Language Synergy with Dynamic Text-Spatial Query Alignment | | 0
DETRPose: Real-time end-to-end transformer model for multi-person pose estimation | Code | 2
Sketched Sum-Product Networks for Joins | Code | 0
Page 309 of 9486