SOTAVerified

Question Answering

Question answering can be divided into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
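As a rough illustration of the two metrics named above, here is a minimal sketch of EM and token-level F1 for a single prediction. It assumes the SQuAD-style convention of normalizing answers first (lowercasing, stripping punctuation and English articles). The function names are illustrative and not taken from any benchmark's official script, and other datasets may normalize differently.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the SQuAD-style normalization convention.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower!"))      # 1.0
    print(f1_score("Eiffel Tower in Paris", "the Eiffel Tower"))  # about 0.667
```

When a question has several reference answers, a common choice is to score the prediction against each one and keep the maximum, then average over all questions to get the dataset-level figure.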


Papers

Showing 6751-6800 of 10817 papers

| Title | Status | Hype |
| --- | --- | --- |
| Hidden Backdoors in Human-Centric Language Models | Code | 1 |
| Chop Chop BERT: Visual Question Answering by Chopping VisualBERT's Heads |  | 0 |
| Entailment as Few-Shot Learner | Code | 1 |
| Bridge to Answer: Structure-aware Graph Interaction Network for Video Question Answering |  | 0 |
| Document Collection Visual Question Answering |  | 0 |
| Question-Aware Memory Network for Multi-hop Question Answering in Human-Robot Interaction |  | 0 |
| Document Structure aware Relational Graph Convolutional Networks for Ontology Population |  | 0 |
| PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation | Code | 1 |
| MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding | Code | 1 |
| DADgraph: A Discourse-aware Dialogue Graph Neural Network for Multiparty Dialogue Machine Reading Comprehension |  | 0 |
| InfographicVQA |  | 0 |
| Towards Knowledge Graphs Validation through Weighted Knowledge Sources |  | 0 |
| GermanQuAD and GermanDPR: Improving Non-English Question Answering and Passage Retrieval |  | 0 |
| Ask & Explore: Grounded Question Answering for Curiosity-Driven Exploration |  | 0 |
| RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition | Code | 1 |
| Playing Lottery Tickets with Vision and Language |  | 0 |
| BERT-CoQAC: BERT-based Conversational Question Answering in Context |  | 0 |
| Sattiy at SemEval-2021 Task 9: An Ensemble Solution for Statement Verification and Evidence Finding with Tables |  | 0 |
| Efficient Retrieval Optimized Multi-task Learning |  | 0 |
| X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering | Code | 1 |
| Towards Solving Multimodal Comprehension |  | 0 |
| GraghVQA: Language-Guided Graph Neural Networks for Graph-based Visual Question Answering | Code | 1 |
| Natural Language Generation Using Link Grammar for General Conversational Intelligence | Code | 0 |
| ELECTRAMed: a new pre-trained language representation model for biomedical NLP | Code | 1 |
| MT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs | Code | 1 |
| When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Dataset | Code | 1 |
| Contextualized Query Embeddings for Conversational Search |  | 0 |
| Generative Context Pair Selection for Multi-hop Question Answering |  | 0 |
| FedNLP: Benchmarking Federated Learning Methods for Natural Language Processing Tasks | Code | 0 |
| Case-based Reasoning for Natural Language Queries over Knowledge Bases |  | 0 |
| Can NLI Models Verify QA Systems' Predictions? | Code | 1 |
| Cross-Task Generalization via Natural Language Crowdsourcing Instructions | Code | 2 |
| Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation |  | 0 |
| GooAQ: Open Question Answering with Diverse Answer Types | Code | 1 |
| ASBERT: Siamese and Triplet network embedding for open question answering |  | 0 |
| Multi-Perspective Abstractive Answer Summarization |  | 0 |
| A Graph-guided Multi-round Retrieval Method for Conversational Open-domain Question Answering |  | 0 |
| Explaining Answers with Entailment Trees | Code | 1 |
| Mobile App Tasks with Iterative Feedback (MoTIF): Addressing Task Feasibility in Interactive Visual Environments | Code | 1 |
| BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models | Code | 2 |
| Joint Passage Ranking for Diverse Multi-Answer Retrieval |  | 0 |
| ESTER: A Machine Reading Comprehension Dataset for Event Semantic Relation Reasoning | Code | 1 |
| Q^2: Evaluating Factual Consistency in Knowledge-Grounded Dialogues via Question Generation and Question Answering | Code | 1 |
| Capturing Row and Column Semantics in Transformer Based Question Answering over Tables | Code | 1 |
| Multivalent Entailment Graphs for Question Answering |  | 0 |
| What to Pre-Train on? Efficient Intermediate Task Selection | Code | 1 |
| Cross-Modal Retrieval Augmentation for Multi-Modal Classification |  | 0 |
| IndoNLG: Benchmark and Resources for Evaluating Indonesian Natural Language Generation | Code | 1 |
| VGNMN: Video-grounded Neural Module Network to Video-Grounded Language Tasks |  | 0 |
| Editing Factual Knowledge in Language Models | Code | 1 |
Page 136 of 217

Benchmark Results

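| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | IE-Net (ensemble) | EM | 90.94 |  | Unverified |
| 2 | FPNet (ensemble) | EM | 90.87 |  | Unverified |
| 3 | IE-NetV2 (ensemble) | EM | 90.86 |  | Unverified |
| 4 | SA-Net on Albert (ensemble) | EM | 90.72 |  | Unverified |
| 5 | SA-Net-V2 (ensemble) | EM | 90.68 |  | Unverified |
| 6 | FPNet (ensemble) | EM | 90.6 |  | Unverified |
| 7 | Retro-Reader (ensemble) | EM | 90.58 |  | Unverified |
| 8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 |  | Unverified |
| 9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 |  | Unverified |
| 10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 |  | Unverified |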