Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotpotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
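As a rough illustration of the EM and F1 metrics mentioned above, the sketch below scores a single predicted answer against a single reference answer. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) and token-level F1; the function names `normalize`, `exact_match`, and `f1_score` are chosen here for illustration and are not taken from any particular benchmark's official scorer.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used for SQuAD-style
    scoring; other benchmarks may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower", "eiffel tower"))
    print(f1_score("in Paris, France", "Paris"))
```

When a question has several reference answers, a common convention is to take the maximum score over the references and then average over all questions; that aggregation step is left out of this sketch.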


Papers


Showing 5751–5775 of 10817 papers

Title	Date	Tasks	Status
Choice Fusion as Knowledge for Zero-Shot Dialogue State Tracking	Feb 25, 2023	Decoder, Dialogue State Tracking	Code Available
Medical visual question answering using joint self-supervised learning	Feb 25, 2023	Decoder, Diversity	Unverified
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback	Feb 24, 2023	Informativeness, Open-Domain Question Answering	Unverified
Time-aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering	Feb 24, 2023	Graph Question Answering, Knowledge Graphs	Unverified
Testing AI on language comprehension tasks reveals insensitivity to underlying meaning	Feb 23, 2023	Question Answering	Unverified
Extracting Victim Counts from Text	Feb 23, 2023	Dependency Parsing, Humanitarian	Code Available
Dr ChatGPT, tell me what I want to hear: How prompt knowledge impacts health answer correctness	Feb 23, 2023	Question Answering	Unverified
EVJVQA Challenge: Multilingual Visual Question Answering	Feb 23, 2023	Language Modeling, Language Modelling	Unverified
MFBE: Leveraging Multi-Field Information of FAQs for Efficient Dense Retrieval	Feb 23, 2023	Question Answering, Retrieval	Code Available
VinVL+L: Enriching Visual Representation with Location Context in VQA	Feb 22, 2023	Question Answering, TAG	Code Available
Construction of Knowledge Graphs: State and Challenges	Feb 22, 2023	Knowledge Graphs, Management	Unverified
Real-World Deployment and Evaluation of Kwame for Science, An AI Teaching Assistant for Science Education in West Africa	Feb 21, 2023	Question Answering	Unverified
Reusable Slotwise Mechanisms	Feb 21, 2023	Future prediction, Object	Unverified
STOA-VLP: Spatial-Temporal Modeling of Object and Action for Video-Language Pre-training	Feb 20, 2023	Language Modelling, Object	Unverified
Few-shot Multimodal Multitask Multilingual Learning	Feb 19, 2023	Few-Shot Learning, In-Context Learning	Unverified
Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning	Feb 19, 2023	Graph Learning, Medical Visual Question Answering	Unverified
Bridge Damage Cause Estimation Using Multiple Images Based on Visual Question Answering	Feb 18, 2023	Question Answering, Visual Question Answering	Unverified
Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE	Feb 18, 2023	Contrastive Learning, Denoising	Unverified
Complex QA and language models hybrid architectures, Survey	Feb 17, 2023	Domain Adaptation, Fairness	Unverified
Foundation Models for Natural Language Processing -- Pre-trained Language Models Integrating Media	Feb 16, 2023	Question Answering, Story Generation	Unverified
Product Question Answering in E-Commerce: A Survey	Feb 16, 2023	Question Answering, Survey	Unverified
Bridge the Gap between Language models and Tabular Understanding	Feb 16, 2023	Contrastive Learning, Language Modeling	Unverified
Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?	Feb 16, 2023	Few-Shot Learning, Language Modeling	Unverified
Effects of Locality and Rule Language on Explanations for Knowledge Graph Embeddings	Feb 14, 2023	Knowledge Graph Embeddings, Knowledge Graphs	Unverified
Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents	Feb 14, 2023	Information Retrieval, Knowledge Graphs	Unverified


Datasets: SQuAD2.0, SQuAD1.1, HotpotQA, PIQA, BoolQ, COPA, TriviaQA, SQuAD1.1 dev, Natural Questions, OpenBookQA, TruthfulQA, MultiRC

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	IE-Net (ensemble)	EM	90.94	—	Unverified
2	FPNet (ensemble)	EM	90.87	—	Unverified
3	IE-NetV2 (ensemble)	EM	90.86	—	Unverified
4	SA-Net on Albert (ensemble)	EM	90.72	—	Unverified
5	SA-Net-V2 (ensemble)	EM	90.68	—	Unverified
6	FPNet (ensemble)	EM	90.6	—	Unverified
7	Retro-Reader (ensemble)	EM	90.58	—	Unverified
8	EntitySpanFocusV2 (ensemble)	EM	90.52	—	Unverified
9	TransNets + SFVerifier + SFEnsembler (ensemble)	EM	90.49	—	Unverified
10	EntitySpanFocus+AT (ensemble)	EM	90.45	—	Unverified