SOTAVerified

Decision Making

Papers

Showing 171-180 of 12311 papers

Title | Status | Hype
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Code | 2
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Code | 2
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving | Code | 2
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models | Code | 2
GPT-Driver: Learning to Drive with GPT | Code | 2
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training | Code | 2
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models | Code | 2
Cross-Prediction-Powered Inference | Code | 2
ExpeL: LLM Agents Are Experiential Learners | Code | 2
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified