SOTAVerified

Multiple-choice

Papers

Showing 451-475 of 1107 papers

Title | Status | Hype
An Automatic Question Usability Evaluation Toolkit | Code | 0
KGQuiz: Evaluating the Generalization of Encoded Knowledge in Large Language Models | Code | 0
A Profit-Maximizing Strategy for Advertising on the e-Commerce Platforms | Code | 0
Automated Generation and Tagging of Knowledge Components from Multiple-Choice Questions | Code | 0
IPEval: A Bilingual Intellectual Property Agency Consultation Evaluation Benchmark for Large Language Models | Code | 0
Chance-Constrained Multiple-Choice Knapsack Problem: Model, Algorithms, and Applications | Code | 0
iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers | Code | 0
TRACE: Transformer-based Risk Assessment for Clinical Evaluation | Code | 0
Introducing Flexible Monotone Multiple Choice Item Response Theory Models and Bit Scales | Code | 0
DAHL: Domain-specific Automated Hallucination Evaluation of Long-Form Text through a Benchmark Dataset in Biomedicine | Code | 0
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning | Code | 0
Improving Question Answering with External Knowledge | Code | 0
CSEPrompts: A Benchmark of Introductory Computer Science Prompts | Code | 0
INCEPTNET: Precise And Early Disease Detection Application For Medical Images Analyses | Code | 0
Improving Machine Reading Comprehension with General Reading Strategies | Code | 0
AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval | Code | 0
QMOS: Enhancing LLMs for Telecommunication with Question Masked loss and Option Shuffling | Code | 0
A multimodal dataset for understanding the impact of mobile phones on remote online virtual education | Code | 0
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models | Code | 0
How Can We Diagnose and Treat Bias in Large Language Models for Clinical Decision-Making? | Code | 0
How much do LLMs learn from negative examples? | Code | 0
A Benchmark for Long-Form Medical Question Answering | Code | 0
Increasing Probability Mass on Answer Choices Does Not Always Improve Accuracy | Code | 0
Harnessing Structured Knowledge: A Concept Map-Based Approach for High-Quality Multiple Choice Question Generation with Effective Distractors | Code | 0
Grounding Synthetic Data Evaluations of Language Models in Unsupervised Document Corpora | Code | 0

No leaderboard results yet.