SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks like community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics like exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
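As a rough illustration of the two metrics, the sketch below computes EM and token-level F1 for a single prediction against a single reference answer. It assumes a SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace); the function names `normalize`, `exact_match`, and `f1_score` are illustrative only, and individual benchmarks may normalize answers or aggregate over multiple references differently.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used for SQuAD-style
    evaluation; other benchmarks may define it differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower."))
    print(f1_score("in Paris, France", "Paris"))
```

EM rewards only answers that match the reference after normalization, while F1 gives partial credit for overlapping tokens, which is why the two are usually reported together.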


Papers

Showing 7276-7300 of 10817 papers

Title | Status | Hype
Guess Me if You Can: Acronym Disambiguation for Enterprises | | 0
Conditional Generation with a Question-Answering Blueprint | | 0
ParaLaw Nets -- Cross-lingual Sentence-level Pretraining for Legal Text Processing | | 0
Concise Thoughts: Impact of Output Length on LLM Reasoning and Cost | | 0
GTR-LSTM: A Triple Encoder for Sentence Generation from RDF Data | | 0
Parallelizing Word2Vec in Shared and Distributed Memory | | 0
Parallel Key-Value Cache Fusion for Position Invariant RAG | | 0
PARAMANU-AYN: Pretrain from scratch or Continual Pretraining of LLMs for Legal Domain Adaptation? | | 0
Parameter-Efficient Abstractive Question Answering over Tables and over Text | | 0
Exploiting User Search Sessions for the Semantic Categorization of Question-like Informational Search Queries | | 0
CFO: A Framework for Building Production NLP Systems | | 0
A Study of the Effect of Resolving Negation and Sentiment Analysis in Recognizing Text Entailment for Arabic | | 0
GTR: Graph-Table-RAG for Cross-Table Question Answering | | 0
Parameter-Efficient Neural Question Answering Models via Graph-Enriched Document Representations | | 0
gTBLS: Generating Tables from Text by Conditional Question Answering | | 0
Parameter-free Video Segmentation for Vision and Language Understanding | | 0
Paraphrase-Driven Learning for Open Question Answering | | 0
Paraphrase for Open Question Answering: New Dataset and Methods | | 0
Paraphrase Generation from Latent-Variable PCFGs for Semantic Parsing | | 0
GSQA: An End-to-End Model for Generative Spoken Question Answering | | 0
Paraphrasing in Affirmative Terms Improves Negation Understanding | | 0
G-SAP: Graph-based Structure-Aware Prompt Learning over Heterogeneous Knowledge for Commonsense Reasoning | | 0
Paraphrasing with Large Language Models | | 0
Paraphrastic Variance between European and Brazilian Portuguese | | 0
AMR Beyond the Sentence: the Multi-sentence AMR corpus | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified