SOTAVerified

Decision Making

Papers

Showing 4351–4400 of 12311 papers

Title | Status | Hype
Can Twitter Predict Royal Baby's Name ? |  | 0
Can Turing machine be curious about its Turing test results? Three informal lectures on physics of intelligence |  | 0
Anticipating Gaming to Incentivize Improvement: Guiding Agents in (Fair) Strategic Classification |  | 0
A General Taylor Framework for Unifying and Revisiting Attribution Methods |  | 0
Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification |  | 0
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM |  | 0
Can time series forecasting be automated? A benchmark and analysis |  | 0
Antibiotic Resistance Microbiology Dataset (ARMD): A De-identified Resource for Studying Antimicrobial Resistance Using Electronic Health Records |  | 0
Emotional Contagion-Aware Deep Reinforcement Learning for Antagonistic Crowd Simulation |  | 0
Can rationality be measured? |  | 0
A general recurrent state space framework for modeling neural dynamics during decision-making |  | 0
Active Measure Reinforcement Learning for Observation Cost Minimization |  | 0
Can Q-learning solve Multi Armed Bantids? |  | 0
Can Physician Judgment Enhance Model Trustworthiness? A Case Study on Predicting Pathological Lymph Nodes in Rectal Cancer |  | 0
Answer Set Programming for Non-Stationary Markov Decision Processes |  | 0
Canonical Cortical Circuits and the Duality of Bayesian Inference and Optimal Control |  | 0
Can Machines Think Like Humans? A Behavioral Evaluation of LLM-Agents in Dictator Games |  | 0
Answering the "why" in Answer Set Programming - A Survey of Explanation Approaches |  | 0
A Generalized Representer Theorem for Hilbert Space - Valued Functions |  | 0
Evaluating the Reproducibility of Research in Obstetrics and Gynecology |  | 0
Evaluating the Similarity Estimator component of the TWIN Personality-based Recommender System |  | 0
Reinforcement Learning for Freight Booking Control Problems |  | 0
Can Machine Learning Catch Economic Recessions Using Economic and Market Sentiments? |  | 0
A Generalized Probability Framework to Model Economic Agents' Decisions Under Uncertainty |  | 0
Can LVLMs Obtain a Driver's License? A Benchmark Towards Reliable AGI for Autonomous Driving |  | 0
Can LLMs Help Improve Analogical Reasoning For Strategic Decisions? Experimental Evidence from Humans and GPT-4 |  | 0
An Overview of Large Language Models for Statisticians |  | 0
Active Learning with Safety Constraints |  | 0
Can LLMs Grade Short-Answer Reading Comprehension Questions : An Empirical Study with a Novel Dataset |  | 0
An Overview of Healthcare Data Analytics With Applications to the COVID-19 Pandemic |  | 0
An Overview of Artificial Intelligence-based Soft Upper Limb Exoskeleton for Rehabilitation: A Descriptive Review |  | 0
Can LLMs be Good Financial Advisors?: An Initial Study in Personal Decision Making for Optimized Outcomes |  | 0
A generalized forecasting solution to enable future insights of COVID-19 at sub-national level resolutions |  | 0
Evaluating the Explainable AI Method Grad-CAM for Breath Classification on Newborn Time Series Data |  | 0
Evaluating the impact of quarantine measures on COVID-19 spread |  | 0
Can LLMs Assist Expert Elicitation for Probabilistic Causal Modeling? |  | 0
An Overview and Discussion of the Suitability of Existing Speech Datasets to Train Machine Learning Models for Collective Problem Solving |  | 0
Can Large Language Models Play Games? A Case Study of A Self-Play Approach |  | 0
Can large language models explore in-context? |  | 0
A Novel Unsupervised Post-Processing Calibration Method for DNNS with Robustness to Domain Shift |  | 0
Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation |  | 0
Can Large Language Models Beat Wall Street? Unveiling the Potential of AI in Stock Selection |  | 0
Can Language Representation Models Think in Bets? |  | 0
A Novel Twitter Sentiment Analysis Model with Baseline Correlation for Financial Market Prediction with Improved Efficiency |  | 0
Can Language Models Serve as Text-Based World Simulators? |  | 0
A Novel Task-Driven Method with Evolvable Interactive Agents Using Event Trees for Enhanced Emergency Decision Support |  | 0
Active Learning For Contextual Linear Optimization: A Margin-Based Approach |  | 0
Understanding Software Engineering Agents Through the Lens of Traceability: An Empirical Study |  | 0
Evaluating the Performance of Large Language Models in Scientific Claim Detection and Classification |  | 0
Evaluating the Stability of Deep Learning Latent Feature Spaces |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 |  | Unverified