Semantic Parsing
Semantic Parsing is the task of transducing natural language utterances into formal meaning representations. The target meaning representations can be defined according to a wide variety of formalisms. These include linguistically motivated semantic representations designed to capture the meaning of any sentence, such as λ-calculus or Abstract Meaning Representation (AMR). Alternatively, in more task-driven approaches to Semantic Parsing, meaning representations are commonly executable programs such as SQL queries, robotic commands, smartphone instructions, and even general-purpose programming languages like Python and Java.
Source: TranX: A Transition-based Neural Abstract Syntax Parser for Semantic Parsing and Code Generation
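To make the transduction concrete, here is a minimal sketch: a toy pattern-based parser that maps questions onto executable SQL. The `SCHEMA`, `PATTERNS`, and `parse` names and the two templates are illustrative assumptions, not part of TranX or any system listed below; neural semantic parsers learn this mapping from data rather than hard-coding it.

```python
import re
import sqlite3

# Toy schema that the hand-written patterns below assume (illustrative only).
SCHEMA = "CREATE TABLE city (name TEXT, country TEXT, population INTEGER)"

# Each pattern pairs a natural-language template with an SQL template.
# A learned semantic parser replaces this lookup with a trained model.
PATTERNS = [
    (re.compile(r"how many cities are in (\w+)", re.I),
     "SELECT COUNT(*) FROM city WHERE country = '{0}'"),
    (re.compile(r"what is the population of (\w+)", re.I),
     "SELECT population FROM city WHERE name = '{0}'"),
]

def parse(utterance: str) -> str:
    """Transduce an utterance into a formal meaning representation (SQL)."""
    for pattern, sql_template in PATTERNS:
        match = pattern.search(utterance)
        if match:
            return sql_template.format(*match.groups())
    raise ValueError(f"no parse for: {utterance!r}")

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO city VALUES (?, ?, ?)",
                     [("Paris", "France", 2161000),
                      ("Lyon", "France", 513000)])
    query = parse("How many cities are in France?")
    print(query)                           # SELECT COUNT(*) FROM city WHERE country = 'France'
    print(conn.execute(query).fetchall())  # [(2,)]
```

Because the output is a program, correctness can be checked either on the program itself or on its execution result, which is exactly the distinction the benchmark metrics below draw.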
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ARTEMIS-DA | Accuracy (Test) | 80.8 | — | Unverified |
| 2 | SynTQA (Oracle) | Accuracy (Test) | 77.5 | — | Unverified |
| 3 | TabLaP | Accuracy (Test) | 76.6 | — | Unverified |
| 4 | SynTQA (GPT) | Accuracy (Test) | 74.4 | — | Unverified |
| 5 | Mix SC | Accuracy (Test) | 73.6 | — | Unverified |
| 6 | SynTQA (RF) | Accuracy (Test) | 71.6 | — | Unverified |
| 7 | CABINET | Accuracy (Test) | 69.1 | — | Unverified |
| 8 | NormTab+TabSQLify | Accuracy (Test) | 68.63 | — | Unverified |
| 9 | Chain-of-Table | Accuracy (Test) | 67.31 | — | Unverified |
| 10 | Tab-PoT | Accuracy (Test) | 66.78 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | RESDSQL-3B + NatSQL | Accuracy | 84.1 | — | Unverified |
| 2 | code-davinci-002 175B (LEVER) | Accuracy | 81.9 | — | Unverified |
| 3 | RASAT+PICARD | Accuracy | 75.5 | — | Unverified |
| 4 | Graphix-3B + PICARD | Accuracy | 74.0 | — | Unverified |
| 5 | T5-3B + PICARD | Accuracy | 71.9 | — | Unverified |
| 6 | SADGA + GAP | Accuracy | 70.1 | — | Unverified |
| 7 | RATSQL + GAP | Accuracy | 69.7 | — | Unverified |
| 8 | RATSQL + Grammar-Augmented Pre-Training | Accuracy | 69.6 | — | Unverified |
| 9 | RATSQL + BERT | Accuracy | 65.6 | — | Unverified |
| 10 | Exact Set Matching | Accuracy | 19.7 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Dynamic Least-to-Most Prompting | Exact Match | 95.0 | — | Unverified |
| 2 | LeAR | Exact Match | 90.9 | — | Unverified |
| 3 | T5-3B w/ Intermediate Representations | Exact Match | 83.8 | — | Unverified |
| 4 | Hierarchical Poset Decoding | Exact Match | 69.0 | — | Unverified |
| 5 | Universal Transformer | Exact Match | 18.9 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ReaRev | Accuracy | 76.4 | — | Unverified |
| 2 | NSM+h | Accuracy | 74.3 | — | Unverified |
| 3 | CBR-KBQA | Accuracy | 70.0 | — | Unverified |
| 4 | STAGG (Yih et al., 2016) | Accuracy | 63.9 | — | Unverified |
| 5 | T5-11B (Raffel et al., 2020) | Accuracy | 56.5 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | CABINET | Denotation accuracy (test) | 89.5 | — | Unverified |
| 2 | TAPEX-Large (weak supervision) | Denotation accuracy (test) | 89.5 | — | Unverified |
| 3 | ReasTAP-Large (weak supervision) | Denotation accuracy (test) | 89.2 | — | Unverified |
| 4 | NL2SQL-BERT | Accuracy | 89.0 | — | Unverified |
| 5 | TAPAS-Large (weak supervision) | Denotation accuracy (test) | 83.6 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PhraseTransformer | Accuracy | 90.4 | — | Unverified |
| 2 | Tranx | Accuracy | 86.2 | — | Unverified |
| 3 | ASN (Rabinovich et al., 2017) | Accuracy | 85.3 | — | Unverified |
| 4 | ZH15 (Zhao and Huang, 2015) | Accuracy | 84.2 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | coarse2fine | Accuracy | 88.2 | — | Unverified |
| 2 | PhraseTransformer | Accuracy | 87.9 | — | Unverified |
| 3 | Tranx | Accuracy | 87.7 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | PERIN + RobeCzech | F1 | 92.36 | — | Unverified |
| 2 | PERIN | F1 | 92.24 | — | Unverified |
| 3 | HUJI-KU | F1 | 58 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TAPEX-Large | Denotation Accuracy | 74.5 | — | Unverified |
| 2 | TAPAS-Large | Accuracy | 67.2 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | HSP | Exact Match | 66.18 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ReasonBERT-R | F1 Score | 41.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MeMCE | Exact Match | 40.3 | — | Unverified |
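The leaderboards above report two families of metrics. Exact match compares the predicted program to the reference as strings (usually after light normalization), while denotation or execution accuracy compares the results of running both programs, so differently written but semantically equivalent queries still count as correct. The sketch below illustrates the distinction under the assumption that predictions and references are SQL strings runnable against the same SQLite database; the function names are hypothetical, not from any benchmark's official scorer.

```python
import sqlite3

def exact_match(pred: str, gold: str) -> bool:
    """Exact match: the predicted program equals the reference as a string
    (here after trivial whitespace and case normalization)."""
    norm = lambda s: " ".join(s.split()).lower()
    return norm(pred) == norm(gold)

def denotation_match(pred: str, gold: str, conn: sqlite3.Connection) -> bool:
    """Denotation (execution) accuracy: the programs may differ as strings
    as long as they return the same result set when executed."""
    try:
        return set(conn.execute(pred).fetchall()) == set(conn.execute(gold).fetchall())
    except sqlite3.Error:
        return False  # an unexecutable prediction counts as wrong

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT, population INTEGER)")
conn.execute("INSERT INTO city VALUES ('Paris', 2161000)")

pred = "SELECT name FROM city WHERE population > 1000000"
gold = "SELECT name FROM city WHERE population >= 2000000"
print(exact_match(pred, gold))             # False: the strings differ
print(denotation_match(pred, gold, conn))  # True: same answer set
```

Denotation accuracy is more permissive than exact match on the same predictions, which is one reason scores are not comparable across tables that use different metrics.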