CP-Router: An Uncertainty-Aware Router Between LLM and LRM | May 26, 2025 | Conformal Prediction Logical Reasoning | Unverified | 0
Improving LLM First-Token Predictions in Multiple-Choice Question Answering via Prefilling Attack | May 21, 2025 | Multiple-choice Multiple Choice Question Answering (MCQA) | Unverified | 0
Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information | May 9, 2025 | Benchmarking Form | Unverified | 0
Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models | Mar 30, 2025 | Knowledge Graphs Multiple-choice | Code Available | 0
Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework | Mar 7, 2025 | Conformal Prediction Medical Question Answering | Unverified | 0
Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning | Feb 27, 2025 | Math Medical Question Answering | Unverified | 0
Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores | Feb 22, 2025 | Distractor Generation Information Retrieval | Code Available | 0
Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above | Feb 19, 2025 | All Multiple-choice | Unverified | 0
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning | Feb 8, 2025 | Legal Reasoning Multiple-choice | Code Available | 0
First Token Probability Guided RAG for Telecom Question Answering | Jan 11, 2025 | Multiple-choice Multiple Choice Question Answering (MCQA) | Unverified | 0
MedG-KRP: Medical Graph Knowledge Representation Probing | Dec 14, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA) | Code Available | 0
LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering | Dec 13, 2024 | Few-Shot Learning Knowledge Distillation | Unverified | 0
KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting | Dec 1, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA) | Code Available | 0
SandboxAQ's submission to MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval | Oct 28, 2024 | Information Retrieval Multilingual Named Entity Recognition | Unverified | 0
Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models | Oct 18, 2024 | Fairness Multiple-choice | Unverified | 0
Differentiating Choices via Commonality for Multiple-Choice Question Answering | Aug 21, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA) | Code Available | 0
Answer, Assemble, Ace: Understanding How Transformers Answer Multiple Choice Questions | Jul 21, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA) | Unverified | 0
Long Story Short: Story-level Video Understanding from 20K Short Films | Jun 14, 2024 | Multiple Choice Question Answering (MCQA) Open-Ended Question Answering | Unverified | 0
EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | May 13, 2024 | Articles | Code Available | 0
AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts | May 1, 2024 | Multiple Choice Question Answering (MCQA) | Code Available | 1
From Multiple-Choice to Extractive QA: A Case Study for English and Arabic | Apr 26, 2024 | Belebele Extractive Question-Answering | Code Available | 0
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension | Mar 12, 2024 | Language Model Evaluation Language Modeling | Unverified | 0
KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations | Mar 3, 2024 | MedQA MMLU | Unverified | 0
Unsupervised multiple choices question answering via universal corpus | Feb 27, 2024 | Form Knowledge Graphs | Unverified | 0
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? | Feb 19, 2024 | Decision Making Memorization | Code Available | 0
LLMs May Perform MCQA by Selecting the Least Incorrect Option | Feb 2, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA) | Unverified | 0
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Nov 27, 2023 | Articles Conditional Text Generation | Code Available | 4
Evaluating the Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education | Oct 18, 2023 | Multiple-choice Multiple Choice Question Answering (MCQA) | Unverified | 0
Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations | Oct 2, 2023 | In-Context Learning Instruction Following | Code Available | 1
BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine | Aug 18, 2023 | Few-Shot Learning Language Modeling | Code Available | 0
Llama 2: Open Foundation and Fine-Tuned Chat Models | Jul 18, 2023 | Arithmetic Reasoning | Code Available | 8
M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models | May 17, 2023 | Instruction Following Multiple-choice | Code Available | 1
Towards Expert-Level Medical Question Answering with Large Language Models | May 16, 2023 | Medical Question Answering MedQA | Code Available | 1
FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain | Apr 9, 2023 | Multiple-choice Multiple Choice Question Answering (MCQA) | Code Available | 0
BloombergGPT: A Large Language Model for Finance | Mar 30, 2023 | Causal Judgment Common Sense Reasoning | Code Available | 0
Generating multiple-choice questions for medical question answering with distractors and cue-masking | Mar 13, 2023 | Language Modeling Language Modelling | Unverified | 0
Large Language Models Encode Clinical Knowledge | Dec 26, 2022 | Clinical Knowledge MedQA | Code Available | 1
Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms Bias Detection | Code Available | 4
Leveraging Large Language Models for Multiple Choice Question Answering | Oct 22, 2022 | Answer Selection Multiple-choice | Code Available | 1
Variational Open-Domain Question Answering | Sep 23, 2022 | Language Modelling MedQA | Code Available | 1
Can large language models reason about medical questions? | Jul 17, 2022 | MedQA Multiple-choice | Code Available | 1
HRCA+: Advanced Multiple-choice Machine Reading Comprehension Method | Jun 1, 2022 | Machine Reading Comprehension Multiple-choice | Unverified | 0
Clues Before Answers: Generation-Enhanced Multiple-Choice QA | Apr 30, 2022 | Decoder Multiple-choice | Code Available | 1
PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto Debugging Code Generation | Code Available | 2
Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms Analogical Similarity | Code Available | 6
MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | Mar 27, 2022 | Diversity Multiple-choice | Code Available | 2
Does Transliteration Help Multilingual Language Modeling? | Jan 29, 2022 | Diversity Language Modeling | Code Available | 0
Disaggregating Hops: Can We Guide a Multi-Hop Reasoning Language Model to Incrementally Learn at each Hop? | Jan 16, 2022 | Language Modeling Language Modelling | Unverified | 0
Context-guided Triple Matching for Multiple Choice Question Answering | Jan 16, 2022 | Benchmarking Multiple-choice | Unverified | 0
QuALITY: Question Answering with Long Input Texts, Yes! | Dec 16, 2021 | Multiple-choice Multiple Choice Question Answering (MCQA) | Code Available | 1