valid

Papers

Showing 226–250 of 3589 papers

Title	Date	Tasks	Status	Hype
Multi-Objective Planning with Contextual Lexicographic Reward Preferences	Feb 13, 2025	valid	Unverified	0
Trust Me, I Know the Way: Predictive Uncertainty in the Presence of Shortcut Learning	Feb 13, 2025	valid	Unverified	0
Generalizability through Explainability: Countering Overfitting with Counterfactual Examples	Feb 13, 2025	counterfactual, Data Augmentation	Unverified	0
CRANE: Reasoning with constrained LLM generation	Feb 13, 2025	Code Generation, Math	Unverified	0
High-Throughput SAT Sampling	Feb 12, 2025	GPU, valid	Code Available	0
Inference in dynamic models for panel data using the moving block bootstrap	Feb 12, 2025	valid	Unverified	0
On Training-Conditional Conformal Prediction and Binomial Proportion Confidence Intervals	Feb 11, 2025	Conformal Prediction, Uncertainty Quantification	Unverified	0
Beyond Confidence: Adaptive Abstention in Dual-Threshold Conformal Prediction for Autonomous System Perception	Feb 11, 2025	Conformal Prediction, Uncertainty Quantification	Code Available	0
Experiments in the Linear Convex Order	Feb 10, 2025	valid	Unverified	0
Krum Federated Chain (KFC): Using blockchain to defend against adversarial attacks in Federated Learning	Feb 10, 2025	Federated Learning, image-classification	Code Available	0
Dual Conic Proxy for Semidefinite Relaxation of AC Optimal Power Flow	Feb 10, 2025	Self-Supervised Learning, valid	Unverified	0
Tokenization Standards for Linguistic Integrity: Turkish as a Benchmark	Feb 10, 2025	MMLU, Morphological Analysis	Unverified	0
On the Impact of the Utility in Semivalue-based Data Valuation	Feb 10, 2025	Data Valuation, valid	Unverified	0
Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association	Feb 9, 2025	Epidemiology, Uncertainty Quantification	Code Available	0
Forbidden Science: Dual-Use AI Challenge Benchmark and Scientific Refusal Tests	Feb 8, 2025	valid	Unverified	0
Generative-enhanced optimization for knapsack problems: an industry-relevant study	Feb 7, 2025	Tensor Networks, valid	Unverified	0
t-Testing the Waters: Empirically Validating Assumptions for Reliable A/B-Testing	Feb 7, 2025	Experimental Design, valid	Unverified	0
Automating a Complete Software Test Process Using LLMs: An Automotive Case Study	Feb 6, 2025	valid	Unverified	0
Combining Clusters for the Approximate Randomization Test	Feb 6, 2025	valid	Unverified	0
First-ish Order Methods: Hessian-aware Scalings of Gradient Descent	Feb 6, 2025	valid	Unverified	0
Efficient Randomized Experiments Using Foundation Models	Feb 6, 2025	valid	Code Available	0
Change Point Detection in the Frequency Domain with Statistical Reliability	Feb 5, 2025	Change Point Detection, valid	Unverified	0
SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models	Feb 5, 2025	valid	Code Available	1
FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference	Feb 4, 2025	Prediction, valid	Unverified	0
Variance-Adjusted Cosine Distance as Similarity Metric	Feb 4, 2025	valid	Unverified	0


No leaderboard results yet.