SheetCopilot: Bringing Software Productivity to the Next Level through Large Language Models

2023-05-30 · NeurIPS 2023 · Code Available

Code

github.com/bravegroup/sheetcopilot
★ 163

Abstract

Computer end users have spent billions of hours completing daily tasks such as tabular data processing and project timeline scheduling. Most of these tasks are repetitive and error-prone, yet most end users lack the skills to automate them. With the advent of large language models (LLMs), directing software with natural language user requests has become a reachable goal. In this work, we propose SheetCopilot, an agent that takes a natural language task and controls a spreadsheet to fulfill the requirements. We propose a set of atomic actions as an abstraction of spreadsheet software functionalities. We further design a state machine-based task planning framework that lets LLMs interact robustly with spreadsheets. We curate a representative dataset containing 221 spreadsheet control tasks and establish a fully automated evaluation pipeline for rigorously benchmarking the ability of LLMs in software control tasks. SheetCopilot correctly completes 44.3% of tasks in a single generation, outperforming the strong code generation baseline by a wide margin. Project page: https://sheetcopilot.github.io/
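The abstract's two design ideas, atomic actions and a state machine-based planner, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the SheetCopilot code: the action names (`Write`, `Sum`), the states (`PROPOSE`, `VALIDATE`, `EXECUTE`, `DONE`), the dictionary used as a spreadsheet, and the scripted planner that stands in for an LLM.

```python
from enum import Enum, auto

# Hypothetical atomic actions over a toy in-memory sheet (dict: cell -> value).
def write_cell(sheet, cell, value):
    sheet[cell] = value

def sum_range(sheet, cells, target):
    sheet[target] = sum(sheet[c] for c in cells)

ACTIONS = {"Write": write_cell, "Sum": sum_range}

class State(Enum):
    PROPOSE = auto()   # ask the planner for the next step
    VALIDATE = auto()  # check the step against the known actions
    EXECUTE = auto()   # apply the step to the sheet
    DONE = auto()

def run_plan(sheet, propose_step, max_steps=20):
    """Drive a propose -> validate -> execute loop until the planner stops."""
    state, step, feedback, log = State.PROPOSE, None, None, []
    for _ in range(max_steps):
        if state is State.PROPOSE:
            step = propose_step(sheet, feedback)
            state = State.DONE if step is None else State.VALIDATE
        elif state is State.VALIDATE:
            name = step[0]
            if name in ACTIONS:
                feedback, state = None, State.EXECUTE
            else:
                # Reject the step and hand the error back to the planner.
                feedback, state = f"unknown action: {name}", State.PROPOSE
                log.append(feedback)
        elif state is State.EXECUTE:
            name, *args = step
            ACTIONS[name](sheet, *args)
            log.append(name)
            state = State.PROPOSE
        else:
            break
    return log

def scripted_planner(steps):
    """Stand-in for an LLM: replays a fixed list of steps, ignoring feedback."""
    queue = list(steps)
    def propose(sheet, feedback):
        return queue.pop(0) if queue else None
    return propose

sheet = {}
log = run_plan(sheet, scripted_planner([
    ("Write", "A1", 2),
    ("Write", "A2", 3),
    ("Pivot", "A1:A2"),           # not in ACTIONS, rejected at validation
    ("Sum", ["A1", "A2"], "A3"),
]))
```

In this sketch, validating each step before it runs is what keeps a malformed step from touching the sheet. A real planner would use the feedback string to revise its next proposal.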

Tasks

Benchmarking, Code Generation, Robot Task Planning, Scheduling, Task Planning

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
SheetCopilot	SheetCopilot (NIPS2023)	Pass@1	44.3	—	Unverified
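
As a rough sanity check on the Pass@1 figure, the short sketch below relates the claimed 44.3% to the 221 tasks mentioned in the abstract. The solved-task count it derives is an inference from those two numbers, not a value reported on this page, and the helper name `pass_at_1` is hypothetical.

```python
def pass_at_1(passed, total):
    """Percentage of tasks solved by a single generation per task."""
    return 100.0 * passed / total

TOTAL_TASKS = 221      # dataset size stated in the abstract
CLAIMED_PASS_AT_1 = 44.3  # percentage from the results table

# Inferred (not reported): nearest whole number of solved tasks.
solved = round(CLAIMED_PASS_AT_1 / 100 * TOTAL_TASKS)

# Whole-task counts whose rate rounds to the claimed figure.
consistent = [k for k in range(TOTAL_TASKS + 1)
              if round(pass_at_1(k, TOTAL_TASKS), 1) == CLAIMED_PASS_AT_1]
```

Under these assumptions only one whole-task count is consistent with the claimed percentage, so the two figures on this page agree with each other.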
