SOTAVerified

Multiple-choice

Papers

Showing 526-550 of 1107 papers

| Title | Status | Hype |
| --- | --- | --- |
| Assessing AI-Generated Questions' Alignment with Cognitive Frameworks in Educational Assessment | | 0 |
| An AI-based Solution for Enhancing Delivery of Digital Learning for Future Teachers | | 0 |
| Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models | | 0 |
| HANS, are you clever? Clever Hans Effect Analysis of Neural Systems | | 0 |
| Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation | | 0 |
| Collaboration among Multiple Large Language Models for Medical Question Answering | | 0 |
| Is There No Such Thing as a Bad Question? H4R: HalluciBot For Ratiocination, Rewriting, Ranking, and Routing | | 0 |
| Cognitive Biases in Large Language Models: A Survey and Mitigation Experiments | | 0 |
| Graph-Structured Representations for Visual Question Answering | | 0 |
| GraphITE: Estimating Individual Effects of Graph-structured Treatments | | 0 |
| COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain | | 0 |
| GRAF: Graph Retrieval Augmented by Facts for Romanian Legal Multi-Choice Question Answering | | 0 |
| CodeReviewQA: The Code Review Comprehension Assessment for Large Language Models | | 0 |
| A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs | | 0 |
| GPT-4 to GPT-3.5: 'Hold My Scalpel' -- A Look at the Competency of OpenAI's GPT on the Plastic Surgery In-Service Training Exam | | 0 |
| GPT-4o System Card | | 0 |
| CoddLLM: Empowering Large Language Models for Data Analytics | | 0 |
| A Semantic Parsing Algorithm to Solve Linear Ordering Problems | | 0 |
| Evaluating Clinical Competencies of Large Language Models with a General Practice Benchmark | | 0 |
| Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning | | 0 |
| GeoCode-GPT: A Large Language Model for Geospatial Code Generation Tasks | | 0 |
| A Semantic Feature-Wise Transformation Relation Network for Automatic Short Answer Grading | | 0 |
| An Add-On for Empowering Google Forms to be an Automatic Question Generator in Online Assessments | | 0 |
| Genome-Bench: A Scientific Reasoning Benchmark from Real-World Expert Discussions | | 0 |
| GenNet : Reading Comprehension with Multiple Choice Questions using Generation and Selection model | | 0 |

No leaderboard results yet.