SOTAVerified

Decision Making

Papers

Showing 21–30 of 12311 papers

Title | Status | Hype
Fin-R1: A Large Language Model for Financial Reasoning through Reinforcement Learning | Code | 4
Relationships are Complicated! An Analysis of Relationships Between Datasets on the Web | Code | 4
Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents | Code | 4
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond | Code | 4
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning | Code | 4
AutoWebGLM: A Large Language Model-based Web Navigating Agent | Code | 4
A Survey on Large Language Model-Based Game Agents | Code | 4
Eureka: Human-Level Reward Design via Coding Large Language Models | Code | 4
Cognitive Architectures for Language Agents | Code | 4
AgentBench: Evaluating LLMs as Agents | Code | 4

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified