SOTAVerified

Decision Making

Papers

Showing 176-200 of 12311 papers

Title | Status | Hype
Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching | Code | 2
iVideoGPT: Interactive VideoGPTs are Scalable World Models | Code | 2
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent | Code | 2
Digital Player: Evaluating Large Language Models based Human-like Agent in Games | Code | 2
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances | Code | 2
A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models | Code | 2
LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models | Code | 2
Cross-Prediction-Powered Inference | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents | Code | 2
LLM-PySC2: Starcraft II learning environment for Large Language Models | Code | 2
Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving | Code | 2
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | Code | 2
Agentic Knowledgeable Self-awareness | Code | 2
LVBench: An Extreme Long Video Understanding Benchmark | Code | 2
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | Code | 2
Concept Bottleneck Language Models For protein design | Code | 2
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models | Code | 2
CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games | Code | 2
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis | Code | 2
MACRec: a Multi-Agent Collaboration Framework for Recommendation | Code | 2
Multi-Agent Reinforcement Learning for Resources Allocation Optimization: A Survey | Code | 2
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents | Code | 2
Natural Language Reinforcement Learning | Code | 2
A Comprehensive Guide to Explainable AI: From Classical Models to LLMs | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified