SOTAVerified

Decision Making

Papers

Showing 2851–2900 of 12311 papers

Title | Status | Hype
β-calibration of Language Model Confidence Scores for Generative QA | | 0
Optimizing Estimators of Squared Calibration Errors in Classification | | 0
Crafting desirable climate trajectories with RL explored socio-environmental simulations | Code | 0
Detecting Bias and Enhancing Diagnostic Accuracy in Large Language Models for Healthcare | | 0
Reproducing and Extending Experiments in Behavioral Strategy with Large Language Models | | 0
DisasterQA: A Benchmark for Assessing the performance of LLMs in Disaster Response | | 0
Modeling chaotic Lorenz ODE System using Scientific Machine Learning | | 0
The Moral Turing Test: Evaluating Human-LLM Alignment in Moral Decision-Making | | 0
Generating Origin-Destination Matrices in Neural Spatial Interaction Models | Code | 0
Cooperative and Asynchronous Transformer-based Mission Planning for Heterogeneous Teams of Mobile Robots | Code | 0
HumVI: A Multilingual Dataset for Detecting Violent Incidents Impacting Humanitarian Aid | Code | 0
Navigating Inflation in Ghana: How Can Machine Learning Enhance Economic Stability and Growth Strategies | | 0
Context-Aware Command Understanding for Tabletop Scenarios | | 0
On the Modeling Capabilities of Large Language Models for Sequential Decision Making | | 0
Tree-Based Leakage Inspection and Control in Concept Bottleneck Models | Code | 0
Towards an Operational Responsible AI Framework for Learning Analytics in Higher Education | | 0
Biased AI can Influence Political Decision-Making | | 0
Intuitions of Compromise: Utilitarianism vs. Contractualism | Code | 0
Better than Your Teacher: LLM Agents that learn from Privileged AI Feedback | | 0
Functional Clustering of Discount Functions for Behavioral Investor Profiling | | 0
Driving with Regulation: Interpretable Decision-Making for Autonomous Vehicles with Retrieval-Augmented Reasoning via LLM | | 0
Assouad, Fano, and Le Cam with Interaction: A Unifying Lower Bound Framework and Characterization for Bandit Learnability | | 0
ReasoningRank: Teaching Student Models to Rank through Reasoning-Based Knowledge Distillation | | 0
Deep learning-based Visual Measurement Extraction within an Adaptive Digital Twin Framework from Limited Data Using Transfer Learning | | 0
ResTNet: Defense against Adversarial Policies via Transformer in Computer Go | | 0
Comparing Zealous and Restrained AI Recommendations in a Real-World Human-AI Collaboration Task | | 0
Bisimulation metric for Model Predictive Control | Code | 0
ECon: On the Detection and Resolution of Evidence Conflicts | Code | 0
Beyond Forecasting: Compositional Time Series Reasoning for End-to-End Task Execution | | 0
Understanding the Effect of Algorithm Transparency of Model Explanations in Text-to-SQL Semantic Parsing | | 0
Preference Optimization as Probabilistic Inference | | 0
Riemann Sum Optimization for Accurate Integrated Gradients Computation | Code | 0
Contrastive Explanations That Anticipate Human Misconceptions Can Improve Human Decision-Making Skills | | 0
Applying Hybrid Graph Neural Networks to Strengthen Credit Risk Analysis | | 0
Harnessing Generative AI for Economic Insights | | 0
ORAssistant: A Custom RAG-based Conversational Assistant for OpenROAD | | 0
Open-World Reinforcement Learning over Long Short-Term Imagination | Code | 0
Mesh-Informed Reduced Order Models for Aneurysm Rupture Risk Prediction | Code | 0
Minimax-optimal trust-aware multi-armed bandits | | 0
Unsupervised Prior Learning: Discovering Categorical Pose Priors from Videos | | 0
Spatial-aware decision-making with ring attractors in reinforcement learning systems | | 0
Leveraging Fundamental Analysis for Stock Trend Prediction for Profit | | 0
Strategic Insights from Simulation Gaming of AI Race Dynamics | | 0
SELU: Self-Learning Embodied MLLMs in Unknown Environments | | 0
Towards Cost Sensitive Decision Making | | 0
Harm Ratio: A Novel and Versatile Fairness Criterion | | 0
Joint Channel Selection using FedDRL in V2X | | 0
Self-eXplainable AI for Medical Image Analysis: A Survey and New Outlooks | | 0
Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language | Code | 0
A Survey on Point-of-Interest Recommendation: Models, Architectures, and Security | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified