SOTAVerified

Question Answering

Question answering can be segmented into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
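To make the EM and F1 metrics mentioned above concrete, here is a minimal sketch in Python. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and the English articles "a", "an", "the") and token-level overlap for F1; individual benchmarks may define these details differently. The function names `normalize_answer`, `exact_match`, and `f1_score` are illustrative and not taken from any particular evaluation script.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the common SQuAD-style normalization.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several acceptable reference answers, a common convention is to take the maximum score over the references and then average over all questions; that aggregation step is left out of this sketch.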


Papers

Showing 10651–10700 of 10817 papers

Title | Status | Hype
Towards Robust Numerical Question Answering: Diagnosing Numerical Capabilities of NLP Systems | | 0
Towards Semantic Search for Community Question Answering for Mortgage Officers | | 0
Towards Solving Multimodal Comprehension | | 0
Towards Spoken Mathematical Reasoning: Benchmarking Speech-based Models over Multi-faceted Math Problems | | 0
Towards Task-Agnostic Privacy- and Utility-Preserving Models | | 0
Towards Teachable Reasoning Systems: Using a Dynamic Memory of User Feedback for Continual System Improvement | | 0
Towards the Application of Calibrated Transformers to the Unsupervised Estimation of Question Difficulty from Text | | 0
Towards the automatic classification of complex-type nominals | | 0
Towards the Exploitation of LLM-based Chatbot for Providing Legal Support to Palestinian Cooperatives | | 0
Towards the Unsupervised Acquisition of Implicit Semantic Roles | | 0
Towards Time-Aware Knowledge Graph Completion | | 0
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering | | 0
Towards Topic-to-Question Generation | | 0
Towards Transparent AI Systems: Interpreting Visual Question Answering Models | | 0
Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction | | 0
Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers | | 0
Towards Two-step Multi-document Summarisation for Evidence Based Medicine: A Quantitative Analysis | | 0
Towards Understanding Camera Motions in Any Video | | 0
Towards Understanding Multi-Round Large Language Model Reasoning: Approximability, Learnability and Generalizability | | 0
Towards Universal Dense Retrieval for Open-domain Question Answering | | 0
Towards Unsupervised Learning of Temporal Relations between Events | | 0
Towards Unsupervised Question Answering System with Multi-level Summarization for Legal Text | | 0
Towards Unsupervised Visual Reasoning: Do Off-The-Shelf Features Know How to Reason? | | 0
Towards Verifiable Text Generation with Symbolic References | | 0
Towards Visual Dialog for Radiology | | 0
Towards Visual-Prompt Temporal Answering Grounding in Medical Instructional Video | | 0
Towards Visual Text Grounding of Multimodal Large Language Model | | 0
Towards Zero-Shot and Few-Shot Table Question Answering using GPT-3 | | 0
Toward the automatic extraction of knowledge of usable goods | | 0
Toward Unsupervised Realistic Visual Question Answering | | 0
TPE: Towards Better Compositional Reasoning over Conceptual Tools with Multi-persona Collaboration | | 0
Tracking the Copyright of Large Vision-Language Models through Parameter Learning Adversarial Images | | 0
Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering | | 0
Traffic-Domain Video Question Answering with Automatic Captioning | | 0
T-RAG: Lessons from the LLM Trenches | | 0
Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models | | 0
Training a Korean SRL System with Rich Morphological Features | | 0
Training a Ranking Function for Open-Domain Question Answering | | 0
Training Data is More Valuable than You Think: A Simple and Effective Method by Retrieving from Training Data | | 0
Training Generative Question-Answering on Synthetic Data Obtained from an Instruct-tuned Model | | 0
Training IBM Watson using Automatically Generated Question-Answer Pairs | | 0
Training Question Answering Models From Synthetic Data | | 0
Training Recurrent Answering Units with Joint Loss Minimization for VQA | | 0
Training Table Question Answering via SQL Query Decomposition | | 0
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding | | 0
Transcending Scaling Laws with 0.1% Extra Compute | | 0
TranscRater: a Tool for Automatic Speech Recognition Quality Estimation | | 0
Transducing Sentences to Syntactic Feature Vectors: an Alternative Way to "Parse"? | | 0
Transferable Adversarial Attacks on Black-Box Vision-Language Models | | 0
Transferable speech-to-text large language model alignment module | | 0
Page 214 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified