SOTAVerified

Decision Making

Papers

Showing 661-670 of 12311 papers

Title | Status | Hype
Benchmarking LLMs for Political Science: A United Nations Perspective | Code | 1
Benchmarking saliency methods for chest X-ray interpretation | Code | 1
An Introduction to Deep Reinforcement Learning | Code | 1
BetaZero: Belief-State Planning for Long-Horizon POMDPs using Learned Approximations | Code | 1
Digital Transformation in the Water Distribution System based on the Digital Twins Concept | Code | 1
DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management | Code | 1
DiffSTG: Probabilistic Spatio-Temporal Graph Forecasting with Denoising Diffusion Models | Code | 1
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling | Code | 1
Fraud-R1 : A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements | Code | 1
Diffusion-Based Electrocardiography Noise Quantification via Anomaly Detection | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified