SOTAVerified

Decision Making

Papers

Showing 8001–8025 of 12311 papers

Title | Status | Hype
Evaluating AI-Driven Automated Map Digitization in QGIS | | 0
Evaluating and Aligning Human Economic Risk Preferences in LLMs | | 0
Evaluating and Boosting Uncertainty Quantification in Classification | | 0
Evaluating and Improving Value Judgments in AI: A Scenario-Based Study on Large Language Models' Depiction of Social Conventions | | 0
Evaluating Bayesian Model Visualisations | | 0
Bias Evaluation and Mitigation in Retrieval-Augmented Medical Question-Answering Systems | | 0
Evaluating Brain-Inspired Modular Training in Automated Circuit Discovery for Mechanistic Interpretability | | 0
Evaluating Conversational Recommender Systems: A Landscape of Research | | 0
Decictor: Towards Evaluating the Robustness of Decision-Making in Autonomous Driving Systems | | 0
Evaluating Deep Human-in-the-Loop Optimization for Retinal Implants Using Sighted Participants | | 0
Evaluating Dynamic Conditional Quantile Treatment Effects with Applications in Ridesharing | | 0
Evaluating Explanation Methods for Vision-and-Language Navigation | | 0
Evaluating Fair Feature Selection in Machine Learning for Healthcare | | 0
Public Perceptions of Fairness Metrics Across Borders | | 0
Evaluating Fairness Metrics in the Presence of Dataset Bias | | 0
Evaluating Gender Bias of LLMs in Making Morality Judgements | | 0
Partially Observable Markov Decision Process Modelling for Assessing Hierarchies | | 0
Evaluating Human-AI Collaboration: A Review and Methodological Framework | | 0
Evaluating Human Alignment and Model Faithfulness of LLM Rationale | | 0
Evaluating Human-like Explanations for Robot Actions in Reinforcement Learning Scenarios | | 0
Evaluating Interventional Reasoning Capabilities of Large Language Models | | 0
Evaluating Large Language Models in Ophthalmology | | 0
Evaluating Large Language Models through Gender and Racial Stereotypes | | 0
Evaluating LeNet Algorithms in Classification Lung Cancer from Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases | | 0
Evaluating LLMs for Text-to-SQL Generation With Complex SQL Workload | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified