SOTAVerified

Decision Making

Papers

Showing 531-540 of 12311 papers

Title | Status | Hype
ADaPT: As-Needed Decomposition and Planning with Language Models | Code | 1
CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq | Code | 1
CFGPT: Chinese Financial Assistant with Large Language Model | Code | 1
AdaPlanner: Adaptive Planning from Feedback with Language Models | Code | 1
Certified Reinforcement Learning with Logic Guidance | Code | 1
CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer | Code | 1
Causal Discovery with Language Models as Imperfect Experts | Code | 1
CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models | Code | 1
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs | Code | 1
Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified