Text-To-SQL

Text-to-SQL is a task in natural language processing (NLP) where the goal is to automatically generate SQL queries from natural language text. The task involves converting the text input into a structured representation and then using this representation to generate a semantically correct SQL query that can be executed on a database.
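
As a minimal illustration of the task's input and output, the sketch below pairs a natural-language question with a SQL query that answers it, then runs that query on an in-memory SQLite database. The `singer` table, its rows, the question, and the hand-written query are all hypothetical; in a real system the query would be generated by a model from the question and the schema.

```python
import sqlite3

# Hypothetical schema and data, made up for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (
        id INTEGER PRIMARY KEY,
        name TEXT,
        country TEXT,
        age INTEGER
    );
    INSERT INTO singer (name, country, age) VALUES
        ('Ana', 'Brazil', 31),
        ('Bo', 'Sweden', 45),
        ('Chen', 'Brazil', 27);
""")

# Input to a Text-to-SQL system: a question in natural language.
question = "How many singers are from Brazil?"

# Desired output: a SQL query that answers the question.
# It is written by hand here; no model is involved in this sketch.
predicted_sql = "SELECT COUNT(*) FROM singer WHERE country = 'Brazil'"

# The query is executable on the database and returns the answer.
answer = conn.execute(predicted_sql).fetchone()[0]
print(question, "->", answer)
```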


Papers


Showing 51–100 of 424 papers

Title	Date	Tasks	Status	Hype
SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration	Jun 19, 2024	SQL Parsing, Text to SQL	Code Available	1
MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL	Jun 18, 2024	Language Modeling, Language Modelling	Code Available	1
QDA-SQL: Questions Enhanced Dialogue Augmentation for Multi-Turn Text-to-SQL	Jun 15, 2024	Data Augmentation, Text to SQL	Code Available	1
BookSQL: A Large Scale Text-to-SQL Dataset for Accounting Domain	Jun 12, 2024	Natural Language Queries, Text to SQL	Code Available	1
EHR-SeqSQL: A Sequential Text-to-SQL Dataset For Interactively Exploring Electronic Health Records	May 23, 2024	SQL Parsing, Text to SQL	Code Available	1
CoE-SQL: In-Context Learning for Multi-Turn Text-to-SQL with Chain-of-Editions	May 4, 2024	In-Context Learning, Text to SQL	Code Available	1
Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL	Apr 19, 2024	RAG, Retrieval	Code Available	1
TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition	Apr 15, 2024	Natural Language Understanding, Question Answering	Code Available	1
Retrieval augmented text-to-SQL generation for epidemiological question answering using electronic health records	Mar 14, 2024	Question Answering, RAG	Code Available	1
Understanding the Effects of Noise in Text-to-SQL: An Examination of the BIRD-Bench Benchmark	Feb 19, 2024	Text to SQL, Text-To-SQL	Code Available	1
Decomposition for Enhancing Attention: Improving LLM-based Text-to-SQL through Workflow Paradigm	Feb 16, 2024	Active Learning, In-Context Learning	Code Available	1
Analyzing the Effectiveness of Large Language Models on Text-to-SQL Synthesis	Jan 22, 2024	16k, Program Synthesis	Code Available	1
DBCopilot: Natural Language Querying over Massive Databases via Schema Routing	Dec 6, 2023	Navigate, Question Generation	Code Available	1
CRUSH4SQL: Collective Retrieval Using Schema Hallucination For Text2SQL	Nov 2, 2023	Hallucination, Retrieval	Code Available	1
ACT-SQL: In-Context Learning for Text-to-SQL with Automatically-Generated Chain-of-Thought	Oct 26, 2023	In-Context Learning, Text to SQL	Code Available	1
Can LLMs Effectively Leverage Graph Structural Information through Prompts, and Why?	Sep 28, 2023	Graph Learning, Node Classification	Code Available	1
C3: Zero-shot Text-to-SQL with ChatGPT	Jul 14, 2023	Text to SQL, Text-To-SQL	Code Available	1
Improving Generalization in Language Model-Based Text-to-SQL Semantic Parsing: Two Simple Semantic Boundary-Based Techniques	May 27, 2023	Domain Generalization, Language Modeling	Code Available	1
UNITE: A Unified Benchmark for Text-to-SQL Evaluation	May 25, 2023	Text to SQL, Text-To-SQL	Code Available	1
Text-to-SQL Error Correction with Language Models of Code	May 22, 2023	SQL Parsing, Text to SQL	Code Available	1
How to Prompt LLMs for Text-to-SQL: A Study in Zero-shot, Single-domain, and Cross-domain Settings	May 19, 2023	In-Context Learning, Retrieval	Code Available	1
Learning to Simulate Natural Language Feedback for Interactive Semantic Parsing	May 14, 2023	Semantic Parsing, Text to SQL	Code Available	1
Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations	May 12, 2023	Text to SQL, Text-To-SQL	Code Available	1
Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs	May 4, 2023	Semantic Parsing, SQL Parsing	Code Available	1
A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability	Mar 12, 2023	Code Generation, Language Modeling	Code Available	1
LEVER: Learning to Verify Language-to-Code Generation with Execution	Feb 16, 2023	Arithmetic Reasoning, Code Generation	Code Available	1
Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness	Jan 21, 2023	Diagnostic, Natural Questions	Code Available	1
EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records	Jan 16, 2023	Retrieval, Text to SQL	Code Available	1
Know What I don't Know: Handling Ambiguous and Unanswerable Questions for Text-to-SQL	Dec 17, 2022	counterfactual, Text to SQL	Code Available	1
Augmenting Multi-Turn Text-to-SQL Datasets with Self-Play	Oct 21, 2022	Domain Generalization, SQL-to-Text	Code Available	1
SpCQL: A Semantic Parsing Dataset for Converting Natural Language into Cypher	Oct 17, 2022	Natural Language Queries, Semantic Parsing	Code Available	1
Recent Advances in Text-to-SQL: A Survey of What We Have and What We Expect	Aug 22, 2022	Survey, Text to SQL	Code Available	1
Semantic Enhanced Text-to-SQL Parsing via Iteratively Learning Schema Linking Graph	Aug 8, 2022	Graph Learning, SQL Parsing	Code Available	1
RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL	May 14, 2022	Dialogue State Tracking, Semantic Parsing	Code Available	1
Measuring and Improving Compositional Generalization in Text-to-SQL via Component Alignment	May 4, 2022	Sentence, Text to SQL	Code Available	1
DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries	May 3, 2022	Question Answering, Text to SQL	Code Available	1
In-Context Learning for Few-Shot Dialogue State Tracking	Mar 16, 2022	Dialogue State Tracking, Few-Shot Learning	Code Available	1
Weakly Supervised Text-to-SQL Parsing through Question Decomposition	Dec 12, 2021	SQL Parsing, Text to SQL	Code Available	1
SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL	Nov 1, 2021	Semantic Parsing, Text to SQL	Code Available	1
mRAT-SQL+GAP: A Portuguese Text-to-SQL Transformer	Oct 7, 2021	Text to SQL, Text-To-SQL	Code Available	1
SPARQLing Database Queries from Intermediate Question Decompositions	Sep 13, 2021	Knowledge Graphs, Text to SQL	Code Available	1
Leveraging Table Content for Zero-shot Text-to-SQL with Meta-Learning	Sep 12, 2021	Meta-Learning, Text to SQL	Code Available	1
Natural SQL: Making SQL Easier to Infer from Natural Language Specifications	Sep 11, 2021	Text to SQL, Text-To-SQL	Code Available	1
PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models	Sep 10, 2021	Dialogue State Tracking, Semantic Parsing	Code Available	1
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data	Jun 9, 2021	Natural Language Understanding, Semantic Parsing	Code Available	1
Towards Robustness of Text-to-SQL Models against Synonym Substitution	Jun 2, 2021	Text to SQL, Text-To-SQL	Code Available	1
LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations	Jun 2, 2021	Text to SQL, Text-To-SQL	Code Available	1
Unlocking Compositional Generalization in Pre-trained Models Using Intermediate Representations	Apr 15, 2021	Semantic Parsing, Text to SQL	Code Available	1
Learning to Synthesize Data for Semantic Parsing	Apr 12, 2021	Domain Generalization, Semantic Parsing	Code Available	1
An Investigation Between Schema Linking and Text-to-SQL Performance	Feb 3, 2021	Text to SQL, Text-To-SQL	Code Available	1



Datasets: BIRD (BIg Bench for LaRge-scale Database Grounded Text-to-SQL Evaluation), Spider, Spider 2.0, SParC, KaggleDBQA, SEDE, SQL-Eval, Text-To-SQL

Benchmark Results
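
Many of the results here are reported as execution accuracy. One common reading of that metric is the percentage of questions for which the predicted query returns the same rows as the reference (gold) query when both are run on the database. The sketch below follows that reading under simplifying assumptions: rows are compared as an unordered multiset, and a query that fails to execute counts as wrong. The function names, the `singer` table, and the query pairs are hypothetical, and each benchmark's official evaluation script may apply different comparison rules.

```python
import sqlite3
from collections import Counter

def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries yield the same multiset of rows."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # assumption: a query that fails to run is wrong
    gold_rows = conn.execute(gold_sql).fetchall()
    return Counter(predicted_rows) == Counter(gold_rows)

def execution_accuracy(conn, pairs):
    """Percentage of (predicted, gold) pairs whose results match."""
    if not pairs:
        return 0.0
    hits = sum(execution_match(conn, pred, gold) for pred, gold in pairs)
    return 100.0 * hits / len(pairs)

# Hypothetical database and query pairs, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT,
                         country TEXT, age INTEGER);
    INSERT INTO singer (name, country, age) VALUES
        ('Ana', 'Brazil', 31), ('Bo', 'Sweden', 45), ('Chen', 'Brazil', 27);
""")

pairs = [
    # Different SQL text, same result rows: counted as correct.
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    # Missing filter, so extra rows are returned: counted as wrong.
    ("SELECT name FROM singer",
     "SELECT name FROM singer WHERE country = 'Brazil'"),
    # Syntax error in the prediction: counted as wrong.
    ("SELEC name FROM singer",
     "SELECT name FROM singer"),
    # Equivalent aggregates: counted as correct.
    ("SELECT COUNT(*) FROM singer WHERE country = 'Brazil'",
     "SELECT COUNT(id) FROM singer WHERE country = 'Brazil'"),
]

score = execution_accuracy(conn, pairs)
print(f"Execution accuracy: {score:.1f}%")
```

Comparing result rows rather than query text gives credit to predictions that are phrased differently from the reference but return the same answer, which is where this metric differs from exact-match scoring.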

BIRD

#	Model	Metric	Claimed	Verified	Status
1	Human Performance	Execution Accuracy (Human)	92.96	—	Unverified
2	XiYan-SQL	Execution Accuracy % (Test)	75.63	—	Unverified
3	DSAIR + GPT-4o	Execution Accuracy % (Test)	74.12	—	Unverified
4	CHASE-SQL + Gemini	Execution Accuracy % (Test)	74.06	—	Unverified
5	ExSL + granite-34b-code	Execution Accuracy % (Test)	73.17	—	Unverified
6	OpenSearch-SQL+ v2 + GPT-4o	Execution Accuracy % (Test)	72.28	—	Unverified
7	Distillery + GPT-4o	Execution Accuracy % (Test)	71.83	—	Unverified
8	Insights AI	Execution Accuracy % (Test)	70.26	—	Unverified
9	PURPLE + RED + GPT-4o	Execution Accuracy % (Test)	70.21	—	Unverified
10	MCTS-SQL	Execution Accuracy % (Test)	69.4	—	Unverified

Spider

#	Model	Metric	Claimed	Verified	Status
1	XiYan-SQL	Execution Accuracy (Test)	89.65	—	Unverified
2	PET-SQL	Execution Accuracy (Test)	87.6	—	Unverified
3	datagpt-sql-7B + InvalidSQL-Feedback	Execution Accuracy (Dev)	87.2	—	Unverified
4	DAIL-SQL + GPT-4 + Self-Consistency	Execution Accuracy (Test)	86.6	—	Unverified
5	DIN-SQL + GPT-4	Execution Accuracy (Test)	85.3	—	Unverified
6	datagpt-sql-7B	Execution Accuracy (Dev)	84.8	—	Unverified
7	MSc-SQL	Execution Accuracy (Test)	84.7	—	Unverified
8	MARLO + Claude 2.1	Execution Accuracy (Test)	84	—	Unverified
9	C3 + ChatGPT + Zero-Shot	Execution Accuracy (Test)	82.3	—	Unverified
10	code-davinci-002 175B (LEVER)	Execution Accuracy (Dev)	81.9	—	Unverified

Spider 2.0

#	Model	Metric	Claimed	Verified	Status
1	Spider-Agent + o1-preview	Success Rate	17.03	—	Unverified
2	Spider-Agent + GPT-4o	Success Rate	10.13	—	Unverified
3	Spider-Agent + Claude-3.5-Sonnet	Success Rate	9.02	—	Unverified
4	Spider-Agent + GPT-4	Success Rate	8.86	—	Unverified
5	Spider-Agent + Qwen2.5-72B	Success Rate	6.17	—	Unverified
6	Spider-Agent + DeepSeek-V2.5	Success Rate	5.22	—	Unverified
7	Spider-Agent + Gemini-Pro-1.5	Success Rate	2.53	—	Unverified
8	Spider-Agent + Llama-3.1-405B	Success Rate	2.21	—	Unverified

SParC

#	Model	Metric	Claimed	Verified	Status
1	RASAT+PICARD	interaction match accuracy	45.2	—	Unverified
2	RAT-SQL-TC + GAP	interaction match accuracy	43.2	—	Unverified
3	HIE-SQL + GraPPa	interaction match accuracy	42.9	—	Unverified
4	RAT-SQL + SCoRe	interaction match accuracy	38.1	—	Unverified
5	EditSQL + BERT	interaction match accuracy	25.3	—	Unverified
6	GAZP + BERT	interaction match accuracy	23.5	—	Unverified
7	SyntaxSQL-con	interaction match accuracy	5.2	—	Unverified

KaggleDBQA

#	Model	Metric	Claimed	Verified	Status
1	RAT-SQL	Exact Match (EM)	26.77	—	Unverified
2	Edit-SQL	Exact Match (EM)	11.73	—	Unverified

SEDE

#	Model	Metric	Claimed	Verified	Status
1	T5-Large	PCM-F1 (dev)	48.2	—	Unverified

SQL-Eval

#	Model	Metric	Claimed	Verified	Status
1	XiYan-SQL	Execution Accuracy	69.86	—	Unverified

Text-To-SQL

#	Model	Metric	Claimed	Verified	Status
1	Orange-mini	0-shot MRR	74.17	—	Unverified