Multiple-choice

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 126–150 of 1107 papers

Title	Date	Tasks	Status	Hype
MindGames: Targeting Theory of Mind in Large Language Models with Dynamic Epistemic Modal Logic	May 5, 2023	Epistemic ReasoningLanguage Modeling	CodeCode Available	1
Fine-tuning Multimodal Large Language Models for Product Bundling	Jul 16, 2024	In-Context LearningMultiple-choice	CodeCode Available	1
Fake Alignment: Are LLMs Really Aligned Well?	Nov 10, 2023	Multiple-choice	CodeCode Available	1
FaceXBench: Evaluating Multimodal LLMs on Face Understanding	Jan 17, 2025	FairnessMultiple-choice	CodeCode Available	1
FarsTail: A Persian Natural Language Inference Dataset	Sep 18, 2020	Multiple-choiceNatural Language Inference	CodeCode Available	1
Explicit Planning Helps Language Models in Logical Reasoning	Mar 28, 2023	Logical ReasoningMultiple-choice	CodeCode Available	1
Ranked Voting based Self-Consistency of Large Language Models	May 16, 2025	Multiple-choiceOpen-Ended Question Answering	CodeCode Available	1
FETA: A Benchmark for Few-Sample Task Transfer in Open-Domain Dialogue	May 12, 2022	Dialogue UnderstandingDomain Adaptation	CodeCode Available	1
An Open Source Data Contamination Report for Large Language Models	Oct 26, 2023	HellaSwagLanguage Modeling	CodeCode Available	1
Annealed Winner-Takes-All for Motion Forecasting	Sep 17, 2024	AllAutonomous Driving	CodeCode Available	1
ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning	Apr 15, 2021	Graph GenerationMultiple-choice	CodeCode Available	1
Annealed Multiple Choice Learning: Overcoming limitations of Winner-takes-all with annealing	Jul 22, 2024	AllDiversity	CodeCode Available	1
Evaluating the Knowledge Dependency of Questions	Nov 21, 2022	Multiple-choice	CodeCode Available	1
Explaining NLP Models via Minimal Contrastive Editing (MiCE)	Dec 27, 2020	counterfactualMultiple-choice	CodeCode Available	1
Filter-then-Generate: Large Language Models with Structure-Text Adapter for Knowledge Graph Completion	Dec 12, 2024	HallucinationKnowledge Graph Completion	CodeCode Available	1
HCQA @ Ego4D EgoSchema Challenge 2024	Jun 22, 2024	Caption Generation	CodeCode Available	1
An MRC Framework for Semantic Role Labeling	Sep 14, 2021	Computational EfficiencyMachine Reading Comprehension	CodeCode Available	1
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification	Jun 20, 2024	BenchmarkingClassification	CodeCode Available	1
Estimating Contamination via Perplexity: Quantifying Memorisation in Language Model Evaluation	Sep 19, 2023	Language Model EvaluationLanguage Modeling	CodeCode Available	1
An In-depth Look at Gemini's Language Abilities	Dec 18, 2023	Instruction FollowingMath	CodeCode Available	1
Enhancing Human-like Multi-Modal Reasoning: A New Challenging Dataset and Comprehensive Framework	Jul 24, 2023	Contrastive LearningMultimodal Reasoning	CodeCode Available	1
Enhancing Knowledge Tracing with Concept Map and Response Disentanglement	Aug 23, 2024	DisentanglementKnowledge Tracing	CodeCode Available	1
Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams	Mar 29, 2023	Multiple-choice	CodeCode Available	1
EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain	Oct 12, 2022	Distractor GenerationMultiple-choice	CodeCode Available	1
Do Large Language Models Understand Conversational Implicature -- A case study with a chinese sitcom	Apr 30, 2024	ImplicaturesMultiple-choice	CodeCode Available	1

Show:10 25 50

← PrevPage 6 of 45Next →

No leaderboard results yet.