SOTAVerified

Decision Making

Papers

Showing 341–350 of 12311 papers

Title | Status | Hype
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making | Code | 1
What Do VLMs NOTICE? A Mechanistic Interpretability Pipeline for Gaussian-Noise-free Text-Image Corruption and Evaluation | Code | 1
ImageFlowNet: Forecasting Multiscale Image-Level Trajectories of Disease Progression with Irregularly-Sampled Longitudinal Medical Images | Code | 1
Ask-before-Plan: Proactive Language Agents for Real-World Planning | Code | 1
Statistical Uncertainty in Word Embeddings: GloVe-V | Code | 1
LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data | Code | 1
Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling | Code | 1
Open Grounded Planning: Challenges and Benchmark Construction | Code | 1
RATT: A Thought Structure for Coherent and Correct LLM Reasoning | Code | 1
Towards Rationality in Language and Multimodal Agents: A Survey | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 |  | Unverified