Llama 2: Open Foundation and Fine-Tuned Chat Models | Jul 18, 2023 | Arithmetic Reasoning
Code | Code Available | 85 | Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms Analogical Similarity
Code | Code Available | 65 | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Nov 27, 2023 | Articles Conditional Text Generation
Code | Code Available | 45 | Galactica: A Large Language Model for Science | Nov 16, 2022 | Anachronisms Bias Detection
Code | Code Available | 45 | PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto Debugging Code Generation
Code | Code Available | 25 | Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Dec 8, 2021 | Abstract Algebra Anachronisms
Code | Code Available | 25 | MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering | Mar 27, 2022 | Diversity Multiple-choice
Code | Code Available | 25 | AdaMoLE: Fine-Tuning Large Language Models with Adaptive Mixture of Low-Rank Adaptation Experts | May 1, 2024 | Multiple Choice Question Answering (MCQA)
Code | Code Available | 15 | Can large language models reason about medical questions? | Jul 17, 2022 | MedQA Multiple-choice
Code | Code Available | 15 | Clues Before Answers: Generation-Enhanced Multiple-Choice QA | Apr 30, 2022 | Decoder Multiple-choice
Code | Code Available | 15 | Counterfactual Variable Control for Robust and Interpretable Question Answering | Oct 12, 2020 | Causal Inference counterfactual
Code | Code Available | 15 | Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations | Oct 2, 2023 | In-Context Learning Instruction Following
Code | Code Available | 15 | IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages | Nov 8, 2020 | Genre classification Multiple-choice
Code | Code Available | 15 | Large Language Models Encode Clinical Knowledge | Dec 26, 2022 | Clinical Knowledge MedQA
Code | Code Available | 15 | Leveraging Large Language Models for Multiple Choice Question Answering | Oct 22, 2022 | Answer Selection Multiple-choice
Code | Code Available | 15 | LexGLUE: A Benchmark Dataset for Legal Language Understanding in English | Oct 3, 2021 | Multi-class Classification Multi-Label Classification
Code | Code Available | 15 | M3KE: A Massive Multi-Level Multi-Subject Knowledge Evaluation Benchmark for Chinese Large Language Models | May 17, 2023 | Instruction Following Multiple-choice
Code | Code Available | 15 | QuALITY: Question Answering with Long Input Texts, Yes! | Dec 16, 2021 | Multiple-choice Multiple Choice Question Answering (MCQA)
Code | Code Available | 15 | Towards Expert-Level Medical Question Answering with Large Language Models | May 16, 2023 | Medical Question Answering MedQA
Code | Code Available | 15 | Variational Open-Domain Question Answering | Sep 23, 2022 | Language Modelling MedQA
Code | Code Available | 15 | KnowledgePrompts: Exploring the Abilities of Large Language Models to Solve Proportional Analogies via Knowledge-Enhanced Prompting | Dec 1, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA)
Code | Code Available | 05 | FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain | Apr 9, 2023 | Multiple-choice Multiple Choice Question Answering (MCQA)
Code | Code Available | 05 | From Recognition to Cognition: Visual Commonsense Reasoning | Nov 27, 2018 | Multiple-choice Multiple Choice Question Answering (MCQA)
Code | Code Available | 05 | From Multiple-Choice to Extractive QA: A Case Study for English and Arabic | Apr 26, 2024 | Belebele Extractive Question-Answering
Code | Code Available | 05 | Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning | Feb 8, 2025 | Legal Reasoning Multiple-choice
Code | Code Available | 05 | Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering | Aug 28, 2018 | AI2 Reasoning Challenge ARC
Code | Code Available | 05 | Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores | Feb 22, 2025 | Distractor Generation Information Retrieval
Code | Code Available | 05 | BloombergGPT: A Large Language Model for Finance | Mar 30, 2023 | Causal Judgment Common Sense Reasoning
Code | Code Available | 05 | MedG-KRP: Medical Graph Knowledge Representation Probing | Dec 14, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA)
Code | Code Available | 05 | BioMedGPT: Open Multimodal Generative Pre-trained Transformer for BioMedicine | Aug 18, 2023 | Few-Shot Learning Language Modeling
Code | Code Available | 05 | MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension | Oct 1, 2019 | Logical Reasoning Machine Reading Comprehension
Code | Code Available | 05 | Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? | Feb 19, 2024 | Decision Making Memorization
Code | Code Available | 05 | Question-Aware Knowledge Graph Prompting for Enhancing Large Language Models | Mar 30, 2025 | Knowledge Graphs Multiple-choice
Code | Code Available | 05 | Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages | Sep 22, 2021 | Multiple Choice Question Answering (MCQA) Natural Language Inference
Code | Code Available | 05 | Differentiating Choices via Commonality for Multiple-Choice Question Answering | Aug 21, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA)
Code | Code Available | 05 | Does Transliteration Help Multilingual Language Modeling? | Jan 29, 2022 | Diversity Language Modeling
Code | Code Available | 05 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | May 13, 2024 | Articles
Code | Code Available | 05 | Long Story Short: Story-level Video Understanding from 20K Short Films | Jun 14, 2024 | Multiple Choice Question Answering (MCQA) Open-Ended Question Answering
— | Unverified | 00 | Context-guided Triple Matching for Multiple Choice Question Answering | Sep 27, 2021 | Benchmarking Multiple-choice
— | Unverified | 00 | LLM Distillation for Efficient Few-Shot Multiple Choice Question Answering | Dec 13, 2024 | Few-Shot Learning Knowledge Distillation
— | Unverified | 00 | Visual7W: Grounded Question Answering in Images | Nov 11, 2015 | Multiple-choice Multiple Choice Question Answering (MCQA)
— | Unverified | 00 | Context-guided Triple Matching for Multiple Choice Question Answering | Jan 16, 2022 | Benchmarking Multiple-choice
— | Unverified | 00 | Context Modeling with Evidence Filter for Multiple Choice Question Answering | Oct 6, 2020 | Machine Reading Comprehension Multiple-choice
— | Unverified | 00 | LLMs May Perform MCQA by Selecting the Least Incorrect Option | Feb 2, 2024 | Multiple-choice Multiple Choice Question Answering (MCQA)
— | Unverified | 00 | Med-RLVR: Emerging Medical Reasoning from a 3B base model via reinforcement Learning | Feb 27, 2025 | Math Medical Question Answering
— | Unverified | 00 | Correctness Coverage Evaluation for Medical Multiple-Choice Question Answering Based on the Enhanced Conformal Prediction Framework | Mar 7, 2025 | Conformal Prediction Medical Question Answering
— | Unverified | 00 | Multi-source Meta Transfer for Low Resource Multiple-Choice Question Answering | Jul 1, 2020 | Domain Adaptation Logical Reasoning
— | Unverified | 00 | CP-Router: An Uncertainty-Aware Router Between LLM and LRM | May 26, 2025 | Conformal Prediction Logical Reasoning
— | Unverified | 00 | What do we expect from Multiple-choice QA Systems? | Nov 20, 2020 | Multiple-choice Multiple Choice Question Answering (MCQA)
— | Unverified | 00 | Fine-tuning BERT with Focus Words for Explanation Regeneration | Dec 1, 2020 | Explanation Generation Multiple-choice
— | Unverified | 00