SOTAVerified

Reading Comprehension

Most current question answering datasets frame the task as reading comprehension, where the question is about a paragraph or document and the answer is often a span in the document.
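As a minimal sketch of the span-based framing, an answer can be stored as character offsets into the passage rather than as free text. The `SpanExample` class, its field names, and the sample passage below are hypothetical, chosen only for illustration; they do not follow any particular dataset's schema.

```python
from dataclasses import dataclass


@dataclass
class SpanExample:
    """Hypothetical container for one span-style reading comprehension item."""
    passage: str
    question: str
    answer_start: int  # inclusive character offset into the passage
    answer_end: int    # exclusive character offset into the passage

    def answer_text(self) -> str:
        # The answer is recovered by slicing the passage, so it is always
        # a contiguous span of the document.
        return self.passage[self.answer_start:self.answer_end]


# Illustrative passage and question (not taken from any benchmark).
passage = "The Nile flows north through Egypt into the Mediterranean Sea."
answer = "the Mediterranean Sea"
start = passage.index(answer)

example = SpanExample(
    passage=passage,
    question="Where does the Nile empty?",
    answer_start=start,
    answer_end=start + len(answer),
)
print(example.answer_text())
```

Storing offsets instead of the answer string keeps the answer tied to one specific occurrence in the passage, which matters when the same phrase appears more than once.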

Specific reading comprehension tasks include multi-modal machine reading comprehension and textual machine reading comprehension, among others. In the literature, machine reading comprehension can be divided into four categories: cloze style, multiple choice, span prediction, and free-form answer.
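The four categories differ mainly in what counts as an admissible answer. The sketch below is one hedged way to make that concrete; the `MRCFormat` enum, the `answer_is_well_formed` helper, and its toy rules are assumptions made for illustration, not definitions taken from the survey or from any benchmark.

```python
from enum import Enum


class MRCFormat(Enum):
    """The four task categories named in the text."""
    CLOZE = "cloze style"
    MULTIPLE_CHOICE = "multiple choice"
    SPAN_PREDICTION = "span prediction"
    FREE_FORM = "free-form answer"


def answer_is_well_formed(fmt, answer, passage="", choices=()):
    """Toy check of whether `answer` has the shape a format allows.

    These rules are simplifying assumptions for illustration only.
    """
    if not answer:
        return False
    if fmt is MRCFormat.MULTIPLE_CHOICE:
        # The answer must be one of the provided options.
        return answer in choices
    if fmt is MRCFormat.SPAN_PREDICTION:
        # The answer must appear verbatim in the passage.
        return answer in passage
    # Cloze (fills a blank in the question) and free-form answers are
    # only required to be non-empty in this simplified sketch.
    return True


passage = "The Nile flows north."
print(answer_is_well_formed(MRCFormat.SPAN_PREDICTION, "north", passage=passage))
print(answer_is_well_formed(MRCFormat.MULTIPLE_CHOICE, "C", choices=("A", "B")))
```

Only the span and multiple-choice formats constrain the answer to a closed set here; real cloze datasets often add candidate lists, which this sketch deliberately omits.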

Benchmark datasets used for testing a model's reading comprehension abilities include MovieQA, ReCoRD, and RACE, among others.

The Machine Reading group at UCL also provides an overview of reading comprehension tasks.

Figure source: A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets

Papers

Showing 951-1000 of 1760 papers

| Title | Status | Hype |
| --- | --- | --- |
| ISAAQ - Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | | 0 |
| Scene Restoring for Narrative Machine Reading Comprehension | | 0 |
| Event Extraction as Multi-turn Question Answering | | 0 |
| Event Extraction as Machine Reading Comprehension | | 0 |
| Towards Medical Machine Reading Comprehension with Structural Knowledge and Plain Text | | 0 |
| BiTeM at WNUT 2020 Shared Task-1: Named Entity Recognition over Wet Lab Protocols using an Ensemble of Contextual Language Models | | 0 |
| Q. Can Knowledge Graphs be used to Answer Boolean Questions? A. It’s complicated! | | 0 |
| Understanding Procedural Text using Interactive Entity Networks | | 0 |
| "You are grounded!": Latent Name Artifacts in Pre-trained Language Models | | 0 |
| How You Ask Matters: The Effect of Paraphrastic Questions to BERT Performance on a Clinical SQuAD Dataset | | 0 |
| Structured Prediction for Joint Class Cardinality and Entity Property Inference in Model-Complete Text Comprehension | | 0 |
| Logic-guided Semantic Representation Learning for Zero-Shot Relation Classification | | 0 |
| Leveraging Extracted Model Adversaries for Improved Black Box Attacks | | 0 |
| Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation | | 0 |
| QBSUM: a Large-Scale Query-Based Document Summarization Dataset from Real-world Applications | | 0 |
| Commonsense knowledge adversarial dataset that challenges ELECTRA | | 0 |
| Improved Synthetic Training for Reading Comprehension | | 0 |
| Challenges in Information-Seeking QA: Unanswerable Questions and Paragraph Retrieval | | 0 |
| Towards Zero-Shot Multilingual Synthetic Question and Answer Generation for Cross-Lingual Reading Comprehension | | 0 |
| Knowledge Distillation for Improved Accuracy in Spoken Question Answering | | 0 |
| Probing and Fine-tuning Reading Comprehension Models for Few-shot Event Extraction | | 0 |
| Bi-directional Cognitive Thinking Network for Machine Reading Comprehension | | 0 |
| Deriving Commonsense Inference Tasks from Interactive Fictions | | 0 |
| Technical Question Answering across Tasks and Domains | Code | 0 |
| Towards Interpreting BERT for Reading Comprehension Based QA | Code | 0 |
| A Wrong Answer or a Wrong Question? An Intricate Relationship between Question Reformulation and Answer Selection in Conversational Question Answering | Code | 0 |
| Interpreting Attention Models with Human Visual Attention in Machine Reading Comprehension | | 0 |
| Multi-Stage Pre-training for Low-Resource Domain Adaptation | | 0 |
| Counterfactually-Augmented SNLI Training Data Does Not Yield Better Generalization Than Unaugmented Data | Code | 0 |
| Context Modeling with Evidence Filter for Multiple Choice Question Answering | | 0 |
| Meta Sequence Learning for Generating Adequate Question-Answer Pairs | | 0 |
| Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space | Code | 0 |
| Reading Comprehension as Natural Language Inference: A Semantic Analysis | | 0 |
| 基于阅读理解框架的中文事件论元抽取 (Chinese Event Argument Extraction using Reading Comprehension Framework) | | 0 |
| ISAAQ -- Mastering Textbook Questions with Pre-trained Transformers and Bottom-Up and Top-Down Attention | | 0 |
| Evaluating NLP Models via Contrast Sets | | 0 |
| 多模块联合的阅读理解候选句抽取 (Evidence sentence extraction for reading comprehension based on multi-module) | | 0 |
| A Survey on Explainability in Machine Reading Comprehension | | 0 |
| 面向垂直领域的阅读理解数据增强方法 (Method for reading comprehension data enhancement in vertical field) | | 0 |
| ARES: A Reading Comprehension Ensembling Service | | 0 |
| 基于多任务学习的生成式阅读理解 (Generative Reading Comprehension via Multi-task Learning) | | 0 |
| A Vietnamese Dataset for Evaluating Machine Reading Comprehension | | 0 |
| Bridging Information-Seeking Human Gaze and Machine Reading Comprehension | | 0 |
| MaP: A Matrix-based Prediction Approach to Improve Span Extraction in Machine Reading Comprehension | | 0 |
| No Answer is Better Than Wrong Answer: A Reflection Model for Document Level Machine Reading Comprehension | | 0 |
| Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering | | 0 |
| Question Directed Graph Attention Network for Numerical Reasoning over Text | Code | 0 |
| Multi-span Style Extraction for Generative Reading Comprehension | | 0 |
| Composing Answer from Multi-spans for Reading Comprehension | | 0 |
| Improving Machine Reading Comprehension with Contextualized Commonsense Knowledge | | 0 |

Benchmark Results

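| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Rational Reasoner / IDOL | Test | 80.6 | | Unverified |
| 2 | AMR-LE-Ensemble | Test | 80 | | Unverified |
| 3 | MERIt(MERIt-deberta-v2-xxlarge ) | Test | 79.3 | | Unverified |
| 4 | MERIt-deberta-v2-xxlarge deberta.v2.xxlarge.path.override_True.norm_1.1.0.w2.A100.cp200.s42 | Test | 79.3 | | Unverified |
| 5 | Knowledge model | Test | 79.2 | | Unverified |
| 6 | DeBERTa-v2-xxlarge-AMR-LE-Contraposition | Test | 77.2 | | Unverified |
| 7 | LReasoner ensemble | Test | 76.1 | | Unverified |
| 8 | ELECTRA and ALBERT | Test | 71 | | Unverified |
| 9 | WWZ | Test | 69.7 | | Unverified |
| 10 | xlnet-large-uncased [extended data] | Test | 69.3 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ALBERT (Ensemble) | Accuracy | 91.4 | | Unverified |
| 2 | Megatron-BERT (ensemble) | Accuracy | 90.9 | | Unverified |
| 3 | ALBERTxxlarge+DUMA(ensemble) | Accuracy | 89.8 | | Unverified |
| 4 | Megatron-BERT | Accuracy | 89.5 | | Unverified |
| 5 | XLNet | Accuracy (Middle) | 88.6 | | Unverified |
| 6 | DeBERTalarge | Accuracy | 86.8 | | Unverified |
| 7 | B10-10-10 | Accuracy | 85.7 | | Unverified |
| 8 | RoBERTa | Accuracy | 83.2 | | Unverified |
| 9 | Orca 2-13B | Accuracy | 82.87 | | Unverified |
| 10 | Orca 2-7B | Accuracy | 80.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Golden Transformer | Average F1 | 0.94 | | Unverified |
| 2 | MT5 Large | Average F1 | 0.84 | | Unverified |
| 3 | ruRoberta-large finetune | Average F1 | 0.83 | | Unverified |
| 4 | ruT5-large-finetune | Average F1 | 0.82 | | Unverified |
| 5 | Human Benchmark | Average F1 | 0.81 | | Unverified |
| 6 | ruT5-base-finetune | Average F1 | 0.77 | | Unverified |
| 7 | ruBert-large finetune | Average F1 | 0.76 | | Unverified |
| 8 | ruBert-base finetune | Average F1 | 0.74 | | Unverified |
| 9 | RuGPT3XL few-shot | Average F1 | 0.74 | | Unverified |
| 10 | RuGPT3Large | Average F1 | 0.73 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | RoBERTa-Large | Overall: F1 | 64.4 | | Unverified |
| 2 | BERT-Large | Overall: F1 | 62.7 | | Unverified |
| 3 | BiDAF | Overall: F1 | 28.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | BERT | MSE | 0.05 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | BERT pretrained on MIMIC-III | Answer F1 | 63.55 | | Unverified |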