SOTAVerified

Decision Making

Papers

Showing 326–350 of 12311 papers

Title | Status | Hype
A semantic embedding space based on large language models for modelling human beliefs | Code | 1
Unleashing Artificial Cognition: Integrating Multiple AI Systems | Code | 1
Adaptive Two-Stage Cloud Resource Scaling via Hierarchical Multi-Indicator Forecasting and Bayesian Decision-Making | Code | 1
PateGail: A Privacy-Preserving Mobility Trajectory Generator with Imitation Learning | Code | 1
Reinforcement Learning Pair Trading: A Dynamic Scaling approach | Code | 1
Hyp2Nav: Hyperbolic Planning and Curiosity for Crowd Navigation | Code | 1
ActionSwitch: Class-agnostic Detection of Simultaneous Actions in Streaming Videos | Code | 1
InvAgent: A Large Language Model based Multi-Agent System for Inventory Management in Supply Chains | Code | 1
Can Learned Optimization Make Reinforcement Learning Less Difficult? | Code | 1
Integrating Clinical Knowledge into Concept Bottleneck Models | Code | 1
A Mamba-based Siamese Network for Remote Sensing Change Detection | Code | 1
Language Model Alignment in Multilingual Trolley Problems | Code | 1
PUZZLES: A Benchmark for Neural Algorithmic Reasoning | Code | 1
CELLO: Causal Evaluation of Large Vision-Language Models | Code | 1
Evidential Concept Embedding Models: Towards Reliable Concept Explanations for Skin Disease Diagnosis | Code | 1
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making | Code | 1
What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation | Code | 1
ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images | Code | 1
Ask-before-Plan: Proactive Language Agents for Real-World Planning | Code | 1
Statistical Uncertainty in Word Embeddings: GloVe-V | Code | 1
LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data | Code | 1
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling | Code | 1
Open Grounded Planning: Challenges and Benchmark Construction | Code | 1
RATT: A Thought Structure for Coherent and Correct LLM Reasoning | Code | 1
Towards Rationality in Language and Multimodal Agents: A Survey | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified