SOTAVerified

Multiple-choice

Papers

Showing 726–750 of 1107 papers

Title | Status | Hype
Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions | | 0
Large Language Models Still Exhibit Bias in Long Text | | 0
A Comparative Study of Open-Source Large Language Models, GPT-4 and Claude 2: Multiple-Choice Test Taking in Nephrology | | 0
Understanding Prior Bias and Choice Paralysis in Transformer-based Language Representation Models through Four Experimental Probes | | 0
Learning a Word-Level Language Model with Sentence-Level Noise Contrastive Estimation for Contextual Sentence Probability Estimation | | 0
Learning Language-Visual Embedding for Movie Understanding with Natural-Language | | 0
Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering | | 0
Learning to Specialize with Knowledge Distillation for Visual Question Answering | | 0
An AI-based Solution for Enhancing Delivery of Digital Learning for Future Teachers | | 0
LegalBench.PT: A Benchmark for Portuguese Law | | 0
Teaching Pretrained Models with Commonsense Reasoning: A Preliminary KB-Based Approach | | 0
WIQA: A dataset for "What if..." reasoning over procedural text | | 0
LEXam: Benchmarking Legal Reasoning on 340 Law Exams | | 0
LHMKE: A Large-scale Holistic Multi-subject Knowledge Evaluation Benchmark for Chinese Large Language Models | | 0
WirelessMathBench: A Mathematical Modeling Benchmark for LLMs in Wireless Communications | | 0
Linguistic Legal Concept Extraction in Portuguese | | 0
Listening to the Wise Few: Select-and-Copy Attention Heads for Multiple-Choice QA | | 0
LLaMa-SciQ: An Educational Chatbot for Answering Science MCQ | | 0
LLM-as-a-Judge & Reward Model: What They Can and Cannot Do | | 0
LLM-based Text Simplification and its Effect on User Comprehension and Cognitive Load | | 0
LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering | | 0
Unlearning vs. Obfuscation: Are We Truly Removing Knowledge? | | 0
LLM Evaluation Based on Aerospace Manufacturing Expertise: Automated Generation and Multi-Model Question Answering | | 0
Unleashing the Potential of Large Language Model: Zero-shot VQA for Flood Disaster Scenario | | 0
LLMs to Support a Domain Specific Knowledge Assistant | | 0

No leaderboard results yet.