SOTAVerified

Question Answering

Question answering can be divided into domain-specific tasks such as community question answering and knowledge-base question answering. Popular benchmark datasets for evaluating question answering systems include SQuAD, HotPotQA, bAbI, TriviaQA, WikiQA, and many others. Models for question answering are typically evaluated on metrics such as exact match (EM) and F1. Some recent top-performing models are T5 and XLNet.
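
As a rough illustration of the two metrics named above, the following is a minimal sketch of EM and token-level F1 for a single prediction against a single reference answer. The normalization steps (lowercasing, dropping punctuation and English articles) follow a convention common in SQuAD-style evaluation, but they are an assumption here, and the function names `normalize`, `exact_match`, and `f1_score` are hypothetical rather than taken from any particular benchmark's scoring script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumed normalization; individual benchmarks may define their own.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction that contains the reference plus extra words scores 0 on EM but receives partial credit on F1, which is why the two numbers are usually reported together. Dataset-level scores are typically averages over all questions, often taking the best score across multiple reference answers.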


Papers

Showing 10701–10750 of 10817 papers

Title | Status | Hype
Transfer and Multi-Task Learning for Noun--Noun Compound Interpretation | | 0
Transfer in Deep Reinforcement Learning using Knowledge Graphs | | 0
Transfer Learning and Masked Generation for Answer Verbalization | | 0
Transfer Learning Based Cross-lingual Knowledge Extraction for Wikipedia | | 0
Transfer Learning Enhanced Single-choice Decision for Multi-choice Question Answering | | 0
Transfer Learning in Visual and Relational Reasoning | | 0
Transferring Domain-Agnostic Knowledge in Video Question Answering | | 0
Transformer-Based Models for Question Answering on COVID19 | | 0
Transformer based Natural Language Generation for Question-Answering | | 0
Transformer-based Subject Entity Detection in Wikipedia Listings | | 0
Transformers Can Compose Skills To Solve Novel Problems Without Finetuning | | 0
Transformer Semantic Parsing | | 0
Transformers in Vision: A Survey | | 0
Transforming Wearable Data into Health Insights using Large Language Model Agents | | 0
Transforming Wikipedia into a Large-Scale Fine-Grained Entity Type Corpus | | 0
Transform-Retrieve-Generate: Natural Language-Centric Outside-Knowledge Visual Question Answering | | 0
TransG : A Generative Model for Knowledge Graph Embedding | | 0
Transition-based Dependency DAG Parsing Using Dynamic Oracles | | 0
Translating Natural Language to SQL using Pointer-Generator Networks and How Decoding Order Matters | | 0
Translating Questions into Answers using DBPedia n-triples | | 0
Translating Web Search Queries into Natural Language Questions | | 0
Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering | | 0
Transliteration Better than Translation? Answering Code-mixed Questions over a Knowledge Base | | 0
TransMamba: Fast Universal Architecture Adaption from Transformers to Mamba | | 0
Leveraging Expert Input for Robust and Explainable AI-Assisted Lung Cancer Detection in Chest X-rays | | 0
TransWiC at SemEval-2021 Task 2: Transformer-based Multilingual and Cross-lingual Word-in-Context Disambiguation | | 0
TRAVELER: A Benchmark for Evaluating Temporal Reasoning across Vague, Implicit and Explicit References | | 0
TraveLLaMA: Facilitating Multi-modal Large Language Models to Understand Urban Scenes and Provide Travel Assistance | | 0
Treat us like the sequences we are: Prepositional Paraphrasing of Noun Compounds using LSTM | | 0
Treebanking by Sentence and Tree Transformation: Building a Treebank to support Question Answering in Portuguese | | 0
Tree Memory Networks for Modelling Long-term Temporal Dependencies | | 0
Tree of Reviews: A Tree-based Dynamic Iterative Retrieval Framework for Multi-hop Question Answering | | 0
Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests | | 0
Trick Me If You Can: Adversarial Writing of Trivia Challenge Questions | | 0
Triggering Multi-Hop Reasoning for Question Answering in Language Models using Soft Prompts and Random Walks | | 0
Triplet-Aware Scene Graph Embeddings | | 0
Tri-VQA: Triangular Reasoning Medical Visual Question Answering for Multi-Attribute Analysis | | 0
TrojVLM: Backdoor Attack Against Vision Language Models | | 0
TRRNet: Tiered Relation Reasoning for Compositional Visual Question Answering | | 0
Trust, Accountability, and Autonomy in Knowledge Graph-based AI for Self-determination | | 0
Trusting Language Models in Education | | 0
Trustworthy Graph Neural Networks: Aspects, Methods and Trends | | 0
TruthLens: A Training-Free Paradigm for DeepFake Detection | | 0
TruthTeller: Annotating Predicate Truth | | 0
Trying Bilinear Pooling in Video-QA | | 0
TTQA-RS- A break-down prompting approach for Multi-hop Table-Text Question Answering with Reasoning and Summarization | | 0
TueFact at SemEval 2019 Task 8: Fact checking in community question answering forums: context matters | | 0
TunBERT: Pretrained Contextualized Text Representation for Tunisian Dialect | | 0
Tuning HeidelTime for identifying time expressions in clinical texts in English and French | | 0
Turk Bootstrap Word Sense Inventory 2.0: A Large-Scale Resource for Lexical Substitution | | 0
Page 215 of 217

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IE-Net (ensemble) | EM | 90.94 | | Unverified
2 | FPNet (ensemble) | EM | 90.87 | | Unverified
3 | IE-NetV2 (ensemble) | EM | 90.86 | | Unverified
4 | SA-Net on Albert (ensemble) | EM | 90.72 | | Unverified
5 | SA-Net-V2 (ensemble) | EM | 90.68 | | Unverified
6 | FPNet (ensemble) | EM | 90.6 | | Unverified
7 | Retro-Reader (ensemble) | EM | 90.58 | | Unverified
8 | EntitySpanFocusV2 (ensemble) | EM | 90.52 | | Unverified
9 | TransNets + SFVerifier + SFEnsembler (ensemble) | EM | 90.49 | | Unverified
10 | EntitySpanFocus+AT (ensemble) | EM | 90.45 | | Unverified