SOTAVerified

Multiple-choice

Papers

Showing 901-950 of 1107 papers

Title | Status | Hype
Detect, Describe, Discriminate: Moving Beyond VQA for MLLM Evaluation | | 0
Developing A Framework to Support Human Evaluation of Bias in Generated Free Response Text | | 0
Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes | | 0
DFIR-Metric: A Benchmark Dataset for Evaluating Large Language Models in Digital Forensics and Incident Response | | 0
D-GEN: Automatic Distractor Generation and Evaluation for Reliable Assessment of Generative Model | | 0
DGRC: An Effective Fine-tuning Framework for Distractor Generation in Chinese Multi-choice Reading Comprehension | | 0
Instructions and Guide for Diagnostic Questions: The NeurIPS 2020 Education Challenge | | 0
Dialogue-Based Simulation For Cultural Awareness Training | | 0
Dienstplanerstellung in Krankenhaeusern mittels genetischer Algorithmen | | 0
Differentiable Open-Ended Commonsense Reasoning | | 0
Plug-in, Trainable Gate for Streamlining Arbitrary Neural Networks | | 0
Different Questions, Different Models: Fine-Grained Evaluation of Uncertainty and Calibration in Clinical QA with LLMs | | 0
Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities | | 0
Disaggregating Hops: Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at each Hop? | | 0
DISTO: Evaluating Textual Distractors for Multi-Choice Questions using Negative Sampling based Approach | | 0
Distractor Analysis and Selection for Multiple-Choice Cloze Questions for Second-Language Learners | | 0
Distractor Generation in Multiple-Choice Tasks: A Survey of Methods, Datasets, and Evaluation | | 0
Distributional semantics beyond words: Supervised learning of analogy and paraphrase | | 0
DiverseNet: When One Right Answer is not Enough | | 0
DMind Benchmark: Toward a Holistic Assessment of LLM Capabilities across the Web3 Domain | | 0
Document-level Event Factuality Identification via Machine Reading Comprehension Frameworks with Transfer Learning | | 0
Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla | | 0
Do Fine-tuned Commonsense Language Models Really Generalize? | | 0
Do Large Language Models Know Folktales? A Case Study of Yokai in Japanese Folktales | | 0
Do LLMs Act as Repositories of Causal Knowledge? | | 0
Do LLMs Know When to NOT Answer? Investigating Abstention Abilities of Large Language Models | | 0
Do LLMs Make Mistakes Like Students? Exploring Natural Alignment between Language Models and Human Error Patterns | | 0
Do LLMs Recognize me, When I is not me: Assessment of LLMs Understanding of Turkish Indexical Pronouns in Indexical Shift Contexts | | 0
DP-SSL: Towards Robust Semi-supervised Learning with A Few Labeled Samples | | 0
DREAM: A Challenge Data Set and Models for Dialogue-Based Reading Comprehension | | 0
DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests | | 0
DsMCL: Dual-Level Stochastic Multiple Choice Learning for Multi-Modal Trajectory Prediction | | 0
Dual Co-Matching Network for Multi-choice Reading Comprehension | | 0
E-cheating Prevention Measures: Detection of Cheating at Online Examinations Using Deep Learning Approach -- A Case Study | | 0
E-Commerce Promotions Personalization via Online Multiple-Choice Knapsack with Uplift Modeling | | 0
Edinburgh Clinical NLP at MEDIQA-CORR 2024: Guiding Large Language Models with Hints | | 0
Towards a Personal Health Large Language Model | | 0
Efficient Knowledge Distillation: Empowering Small Language Models with Teacher Model Insights | | 0
Eigen Values Features for the Classification of Brain Signals corresponding to 2D and 3D Educational Contents | | 0
Eliciting Categorical Data for Optimal Aggregation | | 0
ELiRF-UPV at SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge | | 0
Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework | | 0
End-to-end Concept Word Detection for Video Captioning, Retrieval, and Question Answering | | 0
Enhancing Distractor Generation for Multiple-Choice Questions with Retrieval Augmented Pretraining and Knowledge Graph Integration | | 0
Enhancing Event Causality Identification with Rationale and Structure-Aware Causal Question Answering | | 0
Towards Collective Superintelligence: Amplifying Group IQ using Conversational Swarms | | 0
Towards combinatorial clustering: preliminary research survey | | 0
Enhancing LLM Evaluations: The Garbling Trick | | 0
Enhancing LLMs' Reasoning-Intensive Multimedia Search Capabilities through Fine-Tuning and Reinforcement Learning | | 0
Enhancing Multiple-choice Machine Reading Comprehension by Punishing Illogical Interpretations | | 0
Page 19 of 23

No leaderboard results yet.