SOTAVerified

Decision Making

Papers

Showing 151–175 of 12311 papers

Title | Status | Hype
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Code | 2
PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain | Code | 2
RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model | Code | 2
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent | Code | 2
Fairness Evaluation for Uplift Modeling in the Absence of Ground Truth | Code | 2
AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies | Code | 2
Position: What Can Large Language Models Tell Us about Time Series Analysis | Code | 2
True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning | Code | 2
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents | Code | 2
Graph-of-Thought: Utilizing Large Language Models to Solve Complex and Dynamic Business Problems | Code | 2
ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning | Code | 2
LLMLight: Large Language Models as Traffic Signal Control Agents | Code | 2
LingoQA: Visual Question Answering for Autonomous Driving | Code | 2
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design | Code | 2
Tactics2D: A Highly Modular and Extensible Simulator for Driving Decision-making | Code | 2
On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving | Code | 2
ProAgent: From Robotic Process Automation to Agentic Process Automation | Code | 2
Vision Language Models in Autonomous Driving: A Survey and Outlook | Code | 2
Octopus: Embodied Vision-Language Programmer from Environmental Feedback | Code | 2
Distributional Soft Actor-Critic with Three Refinements | Code | 2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Code | 2
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation | Code | 2
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models | Code | 2
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving | Code | 2
GPT-Driver: Learning to Drive with GPT | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified