Multiple-choice

Papers

Showing 476–500 of 1107 papers

Title	Date	Tasks	Status	Hype
Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions	May 30, 2024	Language Modelling, Large Language Model	Code Available	0
DGRC: An Effective Fine-tuning Framework for Distractor Generation in Chinese Multi-choice Reading Comprehension	May 29, 2024	Distractor Generation, Multiple-choice	Unverified	0
Edinburgh Clinical NLP at MEDIQA-CORR 2024: Guiding Large Language Models with Hints	May 28, 2024	Multiple-choice, Sentence	Unverified	0
Can We Trust LLMs? Mitigate Overconfidence Bias in LLMs through Knowledge Transfer	May 27, 2024	Multiple-choice, Sentiment Analysis	Unverified	0
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers	May 25, 2024	Common Sense Reasoning, Multiple-choice	Code Available	0
Eliciting Informative Text Evaluations with Large Language Models	May 23, 2024	Multiple-choice, Prediction	Code Available	0
Imagery as Inquiry: Exploring A Multimodal Dataset for Conversational Recommendation	May 23, 2024	Conversational Recommendation, Multiple-choice	Unverified	0
Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation	May 22, 2024	Informativeness, Language Modeling	Code Available	2
Embedding Trajectory for Out-of-Distribution Detection in Mathematical Reasoning	May 22, 2024	Mathematical Reasoning, Multiple-choice	Code Available	1
Robust portfolio optimization model for electronic coupon allocation	May 21, 2024	Multiple-choice, Portfolio Optimization	Unverified	0
Multiple-Choice Questions are Efficient and Robust LLM Evaluators	May 20, 2024	GSM8K, HumanEval	Code Available	1
Exploring the Capabilities of Prompted Large Language Models in Educational and Assessment Applications	May 19, 2024	Multiple-choice	Unverified	0
From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT	May 17, 2024	Benchmarking, Multiple-choice	Unverified	0
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset	May 17, 2024	16k, Benchmarking	Code Available	3
COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain	May 17, 2024	Language Modeling, Language Modelling	Unverified	0
AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning	May 16, 2024	Multiple-choice, Question Answering	Unverified	0
CinePile: A Long Video Question Answering Dataset and Benchmark	May 14, 2024	Form, Human-Object Interaction Detection	Unverified	0
SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation	May 14, 2024	Benchmarking, Multiple-choice	Code Available	1
MCS-SQL: Leveraging Multiple Prompts and Multiple-Choice Selection For Text-to-SQL Generation	May 13, 2024	In-Context Learning, Multiple-choice	Unverified	0
Limited Ability of LLMs to Simulate Human Psychological Behaviours: a Psychometric Analysis	May 12, 2024	Multiple-choice, Question Answering	Code Available	0
THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models	May 8, 2024	Attribute, Data Augmentation	Code Available	1
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning	May 6, 2024	Multiple-choice, Video Understanding	Unverified	0
Anchored Answers: Unravelling Positional Bias in GPT-2's Multiple-Choice Questions	May 6, 2024	Decision Making, Multiple-choice	Code Available	0
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance	May 5, 2024	Multiple-choice	Code Available	2
Math Multiple Choice Question Generation via Human-Large Language Model Collaboration	May 1, 2024	Language Modeling, Language Modelling	Unverified	0
