SOTAVerified

Multiple-choice

Papers

Showing 276–300 of 1107 papers

Title | Status | Hype
Assessing AI-Generated Questions' Alignment with Cognitive Frameworks in Educational Assessment | | 0
An AI-based Solution for Enhancing Delivery of Digital Learning for Future Teachers | | 0
Evalita-LLM: Benchmarking Large Language Models on Italian | | 0
Evaluating LLM-Generated Multimodal Diagnosis from Medical Images and Symptom Analysis | | 0
Collaboration among Multiple Large Language Models for Medical Question Answering | | 0
Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments | | 0
An Add-On for Empowering Google Forms to be an Automatic Question Generator in Online Assessments | | 0
COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain | | 0
CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models | | 0
A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs | | 0
A Data-Driven Study of Commonsense Knowledge using the ConceptNet Knowledge Base | | 0
CoddLLM: Empowering Large Language Models for Data Analytics | | 0
A Semantic Parsing Algorithm to Solve Linear Ordering Problems | | 0
A Semantic Feature-Wise Transformation Relation Network for Automatic Short Answer Grading | | 0
From Human Days to Machine Seconds: Automatically Answering and Generating Machine Learning Final Exams | | 0
EQUATOR: A Deterministic Framework for Evaluating LLM Reasoning with Open-Ended Questions. # v1.0.0-beta | | 0
Establishing Task Scaling Laws via Compute-Efficient Model Ladders | | 0
Aryl: An Elastic Cluster Scheduler for Deep Learning | | 0
Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension | | 0
Clozer: Adaptable Data Augmentation for Cloze-style Reading Comprehension | | 0
Amobee at SemEval-2019 Tasks 5 and 6: Multiple Choice CNN Over Contextual Embedding | | 0
Enhancing Multiple-Choice Question Answering with Causal Knowledge | | 0
CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering | | 0
A Method for Building a Commonsense Inference Dataset based on Basic Events | | 0
ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases | | 0

No leaderboard results yet.