Open-Domain Question Answering
Open-domain question answering is the task of answering questions against a large, unrestricted knowledge source such as Wikipedia, rather than against a single passage supplied with the question.
Datasets: KILT: ELI5, KILT: Natural Questions, KILT: TriviaQA, KILT: HotpotQA, SearchQA, ELI5, QUASAR, Natural Questions, SQuAD1.1 dev, WebQuestions, SQuAD1.1, DuReader
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | somebody | KILT-RL | 2.62 | — | Unverified |
| 2 | Wikipedia | KILT-RL | 2.46 | — | Unverified |
| 3 | arxiv.org/abs/2103.06332 | KILT-RL | 2.36 | — | Unverified |
| 4 | BART + DPR | KILT-RL | 1.9 | — | Unverified |
| 5 | RAG | KILT-RL | 1.69 | — | Unverified |
| 6 | Training Set Retrieval (top 1) | KILT-RL | 0 | — | Unverified |
| 7 | T5-base | KILT-RL | 0 | — | Unverified |
| 8 | Input Copying | KILT-RL | 0 | — | Unverified |
| 9 | Sphere | KILT-RL | 0 | — | Unverified |
| 10 | Random Training Set Answer | KILT-RL | 0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Re2G | KILT-EM | 43.56 | — | Unverified |
| 2 | intersect | KILT-EM | 38.78 | — | Unverified |
| 3 | KGI_0 | KILT-EM | 36.36 | — | Unverified |
| 4 | Wikipedia | KILT-EM | 35.32 | — | Unverified |
| 5 | RAG | KILT-EM | 32.69 | — | Unverified |
| 6 | BERT + DPR | KILT-EM | 31.99 | — | Unverified |
| 7 | BART + DPR | KILT-EM | 30.06 | — | Unverified |
| 8 | Multitask DPR + BART | KILT-EM | 29.09 | — | Unverified |
| 9 | Multi-task DPR | KILT-EM | 0 | — | Unverified |
| 10 | Sphere | KILT-EM | 0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Re2G | KILT-EM | 57.91 | — | Unverified |
| 2 | intersect | KILT-EM | 50.56 | — | Unverified |
| 3 | Wikipedia | KILT-EM | 45.55 | — | Unverified |
| 4 | KGI_0 | KILT-EM | 42.85 | — | Unverified |
| 5 | Multitask DPR + BART | KILT-EM | 42.36 | — | Unverified |
| 6 | RAG | KILT-EM | 38.13 | — | Unverified |
| 7 | BERT + DPR | KILT-EM | 34.48 | — | Unverified |
| 8 | BART + DPR | KILT-EM | 31.4 | — | Unverified |
| 9 | TABi | KILT-EM | 0 | — | Unverified |
| 10 | T5-base | KILT-EM | 0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | intersect | KILT-EM | 18.06 | — | Unverified |
| 2 | Wikipedia | KILT-EM | 11.71 | — | Unverified |
| 3 | Multitask DPR + BART | KILT-EM | 9.53 | — | Unverified |
| 4 | RAG | KILT-EM | 3.21 | — | Unverified |
| 5 | BART + DPR | KILT-EM | 1.96 | — | Unverified |
| 6 | BERT + DPR | KILT-EM | 0.74 | — | Unverified |
| 7 | Sphere | KILT-EM | 0 | — | Unverified |
| 8 | Multi-task DPR | KILT-EM | 0 | — | Unverified |
| 9 | GENRE | KILT-EM | 0 | — | Unverified |
| 10 | chriskuei | KILT-EM | 0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SpanBERT | F1 | 84.8 | — | Unverified |
| 2 | Cluster-Former (#C=512) | EM | 68 | — | Unverified |
| 3 | Locality-Sensitive Hashing | EM | 66 | — | Unverified |
| 4 | Multi-passage BERT | EM | 65.1 | — | Unverified |
| 5 | Sparse Attention | EM | 64.7 | — | Unverified |
| 6 | DecaProp | EM | 62.2 | — | Unverified |
| 7 | Bi-Attention + DCU-LSTM | N-gram F1 | 59.5 | — | Unverified |
| 8 | Denoising QA | EM | 58.8 | — | Unverified |
| 9 | DecaProp | EM | 56.8 | — | Unverified |
| 10 | AMANDA | N-gram F1 | 56.6 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Fourier Transformer | Rouge-L | 26.9 | — | Unverified |
| 2 | QG | Rouge-L | 26.4 | — | Unverified |
| 3 | BART | Rouge-L | 24.3 | — | Unverified |
| 4 | E-MCA | Rouge-L | 24 | — | Unverified |
| 5 | Transformer Multitask + LayerDrop | Rouge-L | 23.4 | — | Unverified |
| 6 | Multi-Interleave | Rouge-L | 14.63 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Evidence Aggregation via R^3 Re-Ranking | EM (Quasar-T) | 42.3 | — | Unverified |
| 2 | Denoising QA | EM (Quasar-T) | 42.2 | — | Unverified |
| 3 | DecaProp | EM (Quasar-T) | 38.6 | — | Unverified |
| 4 | R^3 | EM (Quasar-T) | 35.3 | — | Unverified |
| 5 | GA | EM (Quasar-T) | 26.4 | — | Unverified |
| 6 | BiDAF | EM (Quasar-T) | 25.9 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | FiE | Exact Match | 58.4 | — | Unverified |
| 2 | R2-D2 HN-DPR | Exact Match | 55.9 | — | Unverified |
| 3 | UniK-QA | Exact Match | 54.9 | — | Unverified |
| 4 | UnitedQA (Hybrid) | Exact Match | 54.7 | — | Unverified |
| 5 | BPR (linear scan; l=1000) | Exact Match | 41.6 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SPARTA | EM | 59.3 | — | Unverified |
| 2 | Blended RAG | EM | 57.63 | — | Unverified |
| 3 | BERTserini | EM | 50.2 | — | Unverified |
| 4 | BERTserini | EM | 38.6 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ERNIE 2.0 Large | EM | 64.2 | — | Unverified |
| 2 | ERNIE 2.0 Base | EM | 61.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UniK-QA | Exact Match | 65.5 | — | Unverified |
| 2 | BPR (linear scan; l=1000) | Exact Match | 56.8 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EMDR2 | Exact Match | 52.5 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UnitedQA (Hybrid) | Exact Match | 70.5 | — | Unverified |
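
Most of the tables above report Exact Match (EM) or F1. As a rough illustration of what those numbers measure, here is a minimal sketch of SQuAD-style answer scoring: lowercase, strip punctuation and English articles, collapse whitespace, then compare. The function names (`normalize_answer`, `exact_match`, `token_f1`, `corpus_exact_match`) are chosen for this example, and the normalization is an assumption; individual benchmarks listed here (KILT in particular) may apply their own evaluation rules, so this should not be read as the scoring code behind any specific row.

```python
import re
import string
from collections import Counter
from typing import Sequence


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Sequence[str]) -> float:
    """Return 1.0 if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return float(any(pred == normalize_answer(gold) for gold in gold_answers))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def corpus_exact_match(
    predictions: Sequence[str], references: Sequence[Sequence[str]]
) -> float:
    """Average EM over a dataset, as a percentage (the scale used in the tables)."""
    scores = [exact_match(p, refs) for p, refs in zip(predictions, references)]
    return 100.0 * sum(scores) / len(scores)
```

For example, `exact_match("The Eiffel Tower.", ["eiffel tower"])` counts as a match under this normalization, while `token_f1("Eiffel Tower in Paris", "the Eiffel Tower")` gives partial credit because only two of the four predicted tokens appear in the reference.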