SOTAVerified

Decision Making

Papers

Showing 41–50 of 12,311 papers

Title | Status | Hype
Will LLMs be Professional at Fund Investment? DeepFund: A Live Arena Perspective | Code | 3
A Survey on the Optimization of Large Language Model-based Agents | Code | 3
Parallelized Planning-Acting for Efficient LLM-based Multi-Agent Systems | Code | 3
Automated Hypothesis Validation with Agentic Sequential Falsifications | Code | 3
Rethinking Early Stopping: Refine, Then Calibrate | Code | 3
MineStudio: A Streamlined Package for Minecraft AI Agent Development | Code | 3
Embodied CoT Distillation From LLM To Off-the-shelf Agents | Code | 3
AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games | Code | 3
Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models | Code | 3
Game-theoretic LLM: Agent Workflow for Negotiation Games | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified