MalAlgoQA: Pedagogical Evaluation of Counterfactual Reasoning in Large Language Models and Implications for AI in Education | Jul 1, 2024 | counterfactual, Counterfactual Reasoning | Code Available
Evaluation of Instruction-Following Ability for Large Language Models on Story-Ending Generation | Jun 24, 2024 | Instruction Following, Machine Reading Comprehension | Unverified
It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension | Jun 24, 2024 | Reading Comprehension | Unverified
Comparison of Open-Source and Proprietary LLMs for Machine Reading Comprehension: A Practical Analysis for Industrial Applications | Jun 19, 2024 | Benchmarking, Machine Reading Comprehension | Unverified
Exploring the Robustness of Language Models for Tabular Question Answering via Attention Analysis | Jun 18, 2024 | In-Context Learning, Question Answering | Unverified
InternalInspector I^2: Robust Confidence Estimation in LLMs through Internal States | Jun 17, 2024 | Benchmarking, Contrastive Learning | Unverified
VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models | Jun 14, 2024 | Reading Comprehension | Unverified
Unused information in token probability distribution of generative LLM: improving LLM reading comprehension through calculation of expected values | Jun 11, 2024 | Reading Comprehension | Code Available
2DP-2MRC: 2-Dimensional Pointer-based Machine Reading Comprehension Method for Multimodal Moment Retrieval | Jun 10, 2024 | Boundary Detection, Machine Reading Comprehension | Unverified
Coherent Zero-Shot Visual Instruction Generation | Jun 6, 2024 | Image Generation, Reading Comprehension | Unverified
M-QALM: A Benchmark to Assess Clinical Reading Comprehension and Knowledge Recall in Large Language Models via Question Answering | Jun 6, 2024 | abstractive question answering, Clinical Knowledge | Code Available
FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages | Jun 6, 2024 | Answer Generation, Question Answering | Code Available
Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding | Jun 4, 2024 | Articles, Long-Context Understanding | Code Available
Unsupervised Distractor Generation via Large Language Model Distilling and Counterfactual Contrastive Decoding | Jun 3, 2024 | counterfactual, Distractor Generation | Unverified
Brainstorming Brings Power to Large Language Models of Knowledge Reasoning | Jun 2, 2024 | Logical Reasoning, Reading Comprehension | Unverified
Learning to Clarify: Multi-turn Conversations with Action-Based Contrastive Self-Training | May 31, 2024 | Machine Reading Comprehension, Question Answering | Unverified
Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation | May 31, 2024 | Question Answering, Question Generation | Unverified
Automated Focused Feedback Generation for Scientific Writing Assistance | May 30, 2024 | Reading Comprehension, Specificity | Code Available
DGRC: An Effective Fine-tuning Framework for Distractor Generation in Chinese Multi-choice Reading Comprehension | May 29, 2024 | Distractor Generation, Multiple-choice | Unverified
Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension | May 29, 2024 | Machine Reading Comprehension, RAG | Unverified
Time Matters: Enhancing Pre-trained News Recommendation Models with Robust User Dwell Time Injection | May 21, 2024 | News Recommendation, Reading Comprehension | Unverified
Auto FAQ Generation | May 13, 2024 | Philosophy, Question Generation | Unverified
Compressing Long Context for Enhancing RAG with AMR-based Concept Distillation | May 6, 2024 | Abstract Meaning Representation, Open-Domain Question Answering | Unverified
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition | May 6, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified
Can Large Language Models Make the Grade? An Empirical Study Evaluating LLMs Ability to Mark Short Answer Questions in K-12 Education | May 5, 2024 | Prompt Engineering, Reading Comprehension | Unverified
UQA: Corpus for Urdu Question Answering | May 2, 2024 | Multilingual NLP, Question Answering | Code Available
QLSC: A Query Latent Semantic Calibrator for Robust Extractive Question Answering | Apr 30, 2024 | Extractive Question-Answering, Machine Reading Comprehension | Unverified
Efficient LLM Inference with Kcache | Apr 28, 2024 | Reading Comprehension | Unverified
Enhancing Pre-Trained Generative Language Models with Question Attended Span Extraction on Machine Reading Comprehension | Apr 27, 2024 | Machine Reading Comprehension, Reading Comprehension | Unverified
Transfer Learning Enhanced Single-choice Decision for Multi-choice Question Answering | Apr 27, 2024 | Binary Classification, Language Modeling | Unverified
From Multiple-Choice to Extractive QA: A Case Study for English and Arabic | Apr 26, 2024 | Belebele, Extractive Question-Answering | Code Available
PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering | Apr 19, 2024 | Articles, Information Retrieval | Unverified
emrQA-msquad: A Medical Dataset Structured with the SQuAD V2.0 Framework, Enriched with emrQA Medical Information | Apr 18, 2024 | Decision Making, Machine Reading Comprehension | Unverified
Question Difficulty Ranking for Multiple-Choice Reading Comprehension | Apr 16, 2024 | Multiple-choice, Reading Comprehension | Unverified
ViTextVQA: A Large-Scale Visual Question Answering Dataset for Evaluating Vietnamese Text Comprehension in Images | Apr 16, 2024 | Multimodal Deep Learning, Optical Character Recognition (OCR) | Code Available
Fewer Truncations Improve Language Modeling | Apr 16, 2024 | Combinatorial Optimization, Hallucination | Unverified
Automatic Generation and Evaluation of Reading Comprehension Test Items with Large Language Models | Apr 11, 2024 | Multiple-choice, Reading Comprehension | Code Available
NoticIA: A Clickbait Article Summarization Dataset in Spanish | Apr 11, 2024 | Articles, Reading Comprehension | Code Available
LLMs' Reading Comprehension Is Affected by Parametric Knowledge and Struggles with Hypothetical Statements | Apr 9, 2024 | Natural Language Understanding, Question Answering | Unverified
CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs | Apr 9, 2024 | counterfactual, Counterfactual Reasoning | Unverified
The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models | Apr 8, 2024 | Question Answering, Reading Comprehension | Unverified
Interpreting Themes from Educational Stories | Apr 8, 2024 | Machine Reading Comprehension, Reading Comprehension | Code Available
XL^2Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies | Apr 8, 2024 | Long-Context Understanding, Reading Comprehension | Unverified
LLM-aided explanations of EDA synthesis errors | Apr 7, 2024 | Question Answering, Reading Comprehension | Unverified
KazQAD: Kazakh Open-Domain Question Answering Dataset | Apr 6, 2024 | Information Retrieval, Machine Translation | Code Available
Exploring Autonomous Agents through the Lens of Large Language Models: A Review | Apr 5, 2024 | In-Context Learning, Reading Comprehension | Unverified
The Death of Feature Engineering? BERT with Linguistic Features on SQuAD 2.0 | Apr 4, 2024 | Feature Engineering, Machine Reading Comprehension | Unverified
Exploring the Nexus of Large Language Models and Legal Systems: A Short Survey | Apr 1, 2024 | Reading Comprehension, Retrieval | Unverified
Text Understanding in GPT-4 vs Humans | Mar 25, 2024 | Reading Comprehension | Unverified
WangchanLion and WangchanX MRC Eval | Mar 24, 2024 | Instruction Following, Machine Reading Comprehension | Code Available