Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 701–725 of 1107 papers

Title	Date	Tasks	Status	Hype
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework	Jul 24, 2023	Contrastive LearningMultimodal Reasoning	CodeCode Available	1
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models	Jul 20, 2023	BenchmarkingLanguage Modeling	CodeCode Available	1
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla	Jul 18, 2023	Multiple-choiceQuestion Answering	—Unverified	0
Assessing the Quality of Multiple-Choice Questions Using GPT-4 and Rule-Based Methods	Jul 16, 2023	Multiple-choice	CodeCode Available	0
MMBench: Is Your Multi-modal Model an All-around Player?	Jul 12, 2023	AllInstruction Following	CodeCode Available	5
Analyzing Multiple-Choice Reading and Listening Comprehension Tests	Jul 3, 2023	Multiple-choiceReading Comprehension	—Unverified	0
Structured Dialogue Discourse Parsing	Jun 26, 2023	Discourse ParsingMultiple-choice	CodeCode Available	0
Chance-Constrained Multiple-Choice Knapsack Problem: Model, Algorithms, and Applications	Jun 26, 2023	Combinatorial OptimizationMultiple-choice	CodeCode Available	0
Analysis of the Cambridge Multiple-Choice Questions Reading Dataset with a Focus on Candidate Response Distribution	Jun 22, 2023	Multiple-choice	—Unverified	0
Solving and Generating NPR Sunday Puzzles with Large Language Models	Jun 21, 2023	Multiple-choicePrompt Engineering	CodeCode Available	0
RECAP-KG: Mining Knowledge Graphs from Raw GP Notes for Remote COVID-19 Assessment in Primary Care	Jun 17, 2023	Decision Makinggraph construction	—Unverified	0
Thrilled by Your Progress! Large Language Models (GPT-4) No Longer Struggle to Pass Assessments in Higher Education Programming Courses	Jun 15, 2023	Multiple-choice	—Unverified	0
Can ChatGPT pass the Vietnamese National High School Graduation Examination?	Jun 15, 2023	Language ModelingLanguage Modelling	—Unverified	0
Questioning the Survey Responses of Large Language Models	Jun 13, 2023	Multiple-choiceSurvey	CodeCode Available	0
Investigating the Effectiveness of ChatGPT in Mathematical Reasoning and Problem Solving: Evidence from the Vietnamese National High School Graduation Examination	Jun 10, 2023	MathMathematical Reasoning	—Unverified	0
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation	Jun 9, 2023	JurisprudenceManagement	CodeCode Available	1
Network-based Representations and Dynamic Discrete Choice Models for Multiple Discrete Choice Analysis	Jun 7, 2023	Discrete Choice ModelsMultiple-choice	—Unverified	0
Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset	Jun 5, 2023	BenchmarkingMultiple-choice	CodeCode Available	1
Conformal Prediction with Large Language Models for Multi-Choice Question Answering	May 28, 2023	Conformal PredictionMultiple-choice	CodeCode Available	1
Fine-Tuning Language Models with Just Forward Passes	May 27, 2023	GPUIn-Context Learning	CodeCode Available	3
BUCA: A Binary Classification Approach to Unsupervised Commonsense Question Answering	May 25, 2023	Binary ClassificationKnowledge Graphs	CodeCode Available	0
ToMChallenges: A Principle-Guided Dataset and Diverse Evaluation Tasks for Exploring Theory of Mind	May 24, 2023	Multiple-choiceQuestion Answering	CodeCode Available	0
Have Large Language Models Developed a Personality?: Applicability of Self-Assessment Tests in Measuring Personality in LLMs	May 24, 2023	Multiple-choice	—Unverified	0
This Land is Your, My Land: Evaluating Geopolitical Biases in Language Models	May 24, 2023	Language ModellingLarge Language Model	CodeCode Available	0
Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy	May 24, 2023	In-Context LearningMultiple-choice	CodeCode Available	0

Show:10 25 50

← PrevPage 29 of 45Next →

No leaderboard results yet.