SOTAVerified

Decision Making

Papers

Showing 451-500 of 12311 papers

Title | Status | Hype
Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading | Code | 1
Extended Tree Search for Robot Task and Motion Planning | Code | 1
Extracting Reward Functions from Diffusion Models | Code | 1
Failure Detection in Medical Image Classification: A Reality Check and Benchmarking Testbed | Code | 1
Fairness Constraints: Mechanisms for Fair Classification | Code | 1
CLASS: A Design Framework for building Intelligent Tutoring Systems based on Learning Science principles | Code | 1
Fairness in Ranking under Uncertainty | Code | 1
Fairness Through Robustness: Investigating Robustness Disparity in Deep Learning | Code | 1
Fast Interpretable Greedy-Tree Sums | Code | 1
Fast model inference and training on-board of Satellites | Code | 1
Fault-Tolerant Federated Reinforcement Learning with Theoretical Guarantee | Code | 1
CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer | Code | 1
ChessGPT: Bridging Policy Learning and Language Modeling | Code | 1
Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data | Code | 1
Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification | Code | 1
CFGPT: Chinese Financial Assistant with Large Language Model | Code | 1
Certified Reinforcement Learning with Logic Guidance | Code | 1
fMRI from EEG is only Deep Learning away: the use of interpretable DL to unravel EEG-fMRI relationships | Code | 1
Forecasting Future World Events with Neural Networks | Code | 1
CELLO: Causal Evaluation of Large Vision-Language Models | Code | 1
From Parity to Preference-based Notions of Fairness in Classification | Code | 1
From point forecasts to multivariate probabilistic forecasts: The Schaake shuffle for day-ahead electricity price forecasting | Code | 1
AvalonBench: Evaluating LLMs Playing the Game of Avalon | Code | 1
CityLearn: Diverse Real-World Environments for Sample-Efficient Navigation Policy Learning | Code | 1
CertRL: Formalizing Convergence Proofs for Value and Policy Iteration in Coq | Code | 1
ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models | Code | 1
Causal Discovery with Language Models as Imperfect Experts | Code | 1
GATSBI: Generative Agent-centric Spatio-temporal Object Interaction | Code | 1
Causal-aware Large Language Models: Enhancing Decision-Making Through Learning, Adapting and Acting | Code | 1
Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs | Code | 1
Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection | Code | 1
Generating Synthetic Mixed-type Longitudinal Electronic Health Records for Artificial Intelligent Applications | Code | 1
Adaptive Conformal Predictions for Time Series | Code | 1
GLocalX -- From Local to Global Explanations of Black Box AI Models | Code | 1
GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning | Code | 1
GNFactor: Multi-Task Real Robot Learning with Generalizable Neural Feature Fields | Code | 1
CARL-GT: Evaluating Causal Reasoning Capabilities of Large Language Models | Code | 1
Causal thinking for decision making on Electronic Health Records: why and how | Code | 1
ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback | Code | 1
Accuracy and Fairness Trade-offs in Machine Learning: A Stochastic Multi-Objective Approach | Code | 1
Group-Aware Coordination Graph for Multi-Agent Reinforcement Learning | Code | 1
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning | Code | 1
Adversarial Robustness of Representation Learning for Knowledge Graphs | Code | 1
"Guinea Pig Trials" Utilizing GPT: A Novel Smart Agent-Based Modeling Approach for Studying Firm Competition and Collusion | Code | 1
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | Code | 1
HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments | Code | 1
AdViCE: Aggregated Visual Counterfactual Explanations for Machine Learning Model Validation | Code | 1
Hematoxylin and eosin stained oral squamous cell carcinoma histological images dataset | Code | 1
Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis | Code | 1
CAMul: Calibrated and Accurate Multi-view Time-Series Forecasting | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified