Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 626–650 of 1107 papers

Title	Date	Tasks	Status	Hype
A Comparative Study of AI-Generated (GPT-4) and Human-crafted MCQs in Programming Education	Dec 5, 2023	Multiple-choice	—Unverified	0
Unleashing the Potential of Large Language Model: Zero-shot VQA for Flood Disaster Scenario	Dec 4, 2023	Language ModelingLanguage Modelling	—Unverified	0
Explanatory Argument Extraction of Correct Answers in Resident Medical Exams	Dec 1, 2023	Multiple-choice	CodeCode Available	0
Evaluating the Rationale Understanding of Critical Reasoning in Logical Reading Comprehension	Nov 30, 2023	Multiple-choiceReading Comprehension	—Unverified	0
Biomedical knowledge graph-optimized prompt generation for large language models	Nov 29, 2023	BenchmarkingKnowledge Graphs	CodeCode Available	2
CLOMO: Counterfactual Logical Modification with Large Language Models	Nov 29, 2023	counterfactualCounterfactual Reasoning	CodeCode Available	0
SEED-Bench-2: Benchmarking Multimodal Large Language Models	Nov 28, 2023	BenchmarkingImage Generation	CodeCode Available	2
MVBench: A Comprehensive Multi-modal Video Understanding Benchmark	Nov 28, 2023	3D Question Answering (3D-QA)Diagnostic	CodeCode Available	2
GPQA: A Graduate-Level Google-Proof Q&A Benchmark	Nov 20, 2023	Multiple-choice	CodeCode Available	2
Downstream Trade-offs of a Family of Text Watermarks	Nov 16, 2023	FormLanguage Modelling	CodeCode Available	0
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection	Nov 16, 2023	Language ModelingLanguage Modelling	CodeCode Available	4
ConceptPsy:A Benchmark Suite with Conceptual Comprehensiveness in Psychology	Nov 16, 2023	MMLUMultiple-choice	—Unverified	0
Investigating Data Contamination in Modern Benchmarks for Large Language Models	Nov 16, 2023	Common Sense ReasoningMMLU	—Unverified	0
Evaluating LLMs on Document-Based QA: Exact Answer Selection and Numerical Extraction using Cogtale dataset	Nov 14, 2023	Answer SelectionInformation Retrieval	—Unverified	0
It's Not Easy Being Wrong: Large Language Models Struggle with Process of Elimination Reasoning	Nov 13, 2023	Multiple-choice	CodeCode Available	0
Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models	Nov 10, 2023	GSM8KMemorization	CodeCode Available	1
Fake Alignment: Are LLMs Really Aligned Well?	Nov 10, 2023	Multiple-choice	CodeCode Available	1
Characterizing Large Language Models as Rationalizers of Knowledge-intensive Tasks	Nov 9, 2023	Multiple-choiceWorld Knowledge	—Unverified	0
Assessing Distractors in Multiple-Choice Tests	Nov 8, 2023	DiversityMultiple-choice	—Unverified	0
Evaluating multiple large language models in pediatric ophthalmology	Nov 7, 2023	Multiple-choice	—Unverified	0
Evaluating the Potential of Leading Large Language Models in Reasoning Biology Questions	Nov 5, 2023	Logical ReasoningMultiple-choice	—Unverified	0
More Robots are Coming: Large Multimodal Models (ChatGPT) can Solve Visually Diverse Images of Parsons Problems	Nov 3, 2023	Multiple-choice	—Unverified	0
CASE: Commonsense-Augmented Score with an Expanded Answer Space	Nov 3, 2023	Multiple-choice	CodeCode Available	0
Resilient Multiple Choice Learning: A learned scoring scheme with application to audio scene analysis	Nov 2, 2023	Density EstimationDiversity	CodeCode Available	1
An Open Source Data Contamination Report for Large Language Models	Oct 26, 2023	HellaSwagLanguage Modeling	CodeCode Available	1

Show:10 25 50

← PrevPage 26 of 45Next →

No leaderboard results yet.