SOTAVerified

Decision Making

Papers

Showing 101-150 of 12311 papers

Title | Status | Hype
HierarchicalForecast: A Reference Framework for Hierarchical Forecasting in Python | Code | 2
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities | Code | 2
A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model | Code | 2
GaussianAD: Gaussian-Centric End-to-End Autonomous Driving | Code | 2
Adversarial attacks and defenses in explainable artificial intelligence: A survey | Code | 2
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design | Code | 2
Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX | Code | 2
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models | Code | 2
A Survey of Financial AI: Architectures, Advances and Open Challenges | Code | 2
Astock: A New Dataset and Automated Stock Trading based on Stock-specific News Analyzing Model | Code | 2
Global birdsong embeddings enable superior transfer learning for bioacoustic classification | Code | 2
FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making | Code | 2
Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning | Code | 2
ExpeL: LLM Agents Are Experiential Learners | Code | 2
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning | Code | 2
LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks | Code | 2
Machine Learning in Asset Management—Part 1: Portfolio Construction—Trading Strategies | Code | 2
A Review of Safe Reinforcement Learning: Methods, Theory and Applications | Code | 2
Agentic Knowledgeable Self-awareness | Code | 2
AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making | Code | 2
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents | Code | 2
Mechanistic understanding and validation of large AI models with SemanticLens | Code | 2
Fairness Evaluation for Uplift Modeling in the Absence of Ground Truth | Code | 2
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models | Code | 2
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI | Code | 2
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | Code | 2
AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies | Code | 2
ActiveSplat: High-Fidelity Scene Reconstruction through Active Gaussian Splatting | Code | 2
ADAPT: Action-aware Driving Caption Transformer | Code | 2
Dungeons and Data: A Large-Scale NetHack Dataset | Code | 2
Embodied LLM Agents Learn to Cooperate in Organized Teams | Code | 2
DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation | Code | 2
Doe-1: Closed-Loop Autonomous Driving with Large World Model | Code | 2
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving | Code | 2
Disentangling Memory and Reasoning Ability in Large Language Models | Code | 2
Diffusion Actor-Critic with Entropy Regulator | Code | 2
Distribution-Free, Risk-Controlling Prediction Sets | Code | 2
Distributional Soft Actor-Critic with Three Refinements | Code | 2
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models | Code | 2
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | Code | 2
A Comprehensive Guide to Explainable AI: From Classical Models to LLMs | Code | 2
Cross-Prediction-Powered Inference | Code | 2
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
Digital Player: Evaluating Large Language Models based Human-like Agent in Games | Code | 2
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models | Code | 2
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning | Code | 2
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances | Code | 2
A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models | Code | 2
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Code | 2
Page 3 of 247

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified