Multiple-choice

Papers


Showing 601–625 of 1107 papers

Title	Date	Tasks	Status	Hype
Instruction Fine-Tuning: Does Prompt Loss Matter?	Jan 24, 2024	Multiple-choice, token-classification	Unverified	0
A Study on Large Language Models' Limitations in Multiple-Choice Question Answering	Jan 15, 2024	Multiple-choice, Question Answering	Code Available	0
Towards Efficient Methods in Medical Question Answering using Knowledge Graph Embeddings	Jan 15, 2024	Knowledge Graph Embeddings, Knowledge Graphs	Code Available	0
Assessing Large Language Models in Mechanical Engineering Education: A Study on Mechanics-Focused Conceptual Understanding	Jan 13, 2024	Multiple-choice, Prompt Engineering	Unverified	0
Automated Answer Validation using Text Similarity	Jan 13, 2024	Information Retrieval, Multiple-choice	Unverified	0
PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities	Jan 13, 2024	Instruction Following, Multiple-choice	Unverified	0
A Novel Multi-Stage Prompting Approach for Language Agnostic MCQ Generation using GPT	Jan 13, 2024	Distractor Generation, Multiple-choice	Code Available	0
The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models	Jan 11, 2024	Math, Multiple-choice	Code Available	1
A Joint-Reasoning based Disease Q&A System	Jan 6, 2024	Knowledge Graphs, Misinformation	Unverified	0
SEED-Bench: Benchmarking Multimodal Large Language Models	Jan 1, 2024	Benchmarking, Image Generation	Code Available	3
The Earth is Flat? Unveiling Factual Errors in Large Language Models	Jan 1, 2024	In-Context Learning, Multiple-choice	Unverified	0
FusionMind -- Improving question and answering with external context fusion	Dec 31, 2023	Knowledge Graphs, Multiple-choice	Unverified	0
SecQA: A Concise Question-Answering Dataset for Evaluating Large Language Models in Computer Security	Dec 26, 2023	Computer Security, Multiple-choice	Code Available	0
RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models	Dec 26, 2023	Memorization, Multiple-choice	Code Available	1
HyKGE: A Hypothesis Knowledge Graph Enhanced Framework for Accurate and Reliable Medical LLMs Responses	Dec 26, 2023	Diversity, Knowledge Graphs	Code Available	1
Towards a Unified Multimodal Reasoning Framework	Dec 22, 2023	Multimodal Reasoning, Multiple-choice	Code Available	0
Perception Test 2023: A Summary of the First Challenge And Outcome	Dec 20, 2023	Benchmarking, Grounded Video Question Answering	Unverified	0
BloomVQA: Assessing Hierarchical Multi-modal Comprehension	Dec 20, 2023	Data Augmentation, Memorization	Unverified	0
Multiple Hypothesis Dropout: Estimating the Parameters of Multi-Modal Output Distributions	Dec 18, 2023	Multiple-choice, Pedestrian Trajectory Prediction	Code Available	0
An In-depth Look at Gemini's Language Abilities	Dec 18, 2023	Instruction Following, Math	Code Available	1
Marathon: A Race Through the Realm of Long Context with Large Language Models	Dec 15, 2023	Long-Context Understanding, Multiple-choice	Code Available	1
Self-Evaluation Improves Selective Generation in Large Language Models	Dec 14, 2023	Multiple-choice, TruthfulQA	Unverified	0
A Foundational Multimodal Vision Language AI Assistant for Human Pathology	Dec 13, 2023	Decision Making, Diagnostic	Unverified	0
Steering Llama 2 via Contrastive Activation Addition	Dec 9, 2023	Multiple-choice	Code Available	2
Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers	Dec 7, 2023	Math, Multiple-choice	Code Available	1

