FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models

2024-01-01

Shu Liu, Shangqing Zhao, Chenghao Jia, Xinlin Zhuang, Zhaoguang Long, Jie Zhou, Aimin Zhou, Man Lan, Qingquan Wu, Chong Yang


Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of tasks. However, their proficiency and reliability in the specialized domain of financial data analysis, particularly with respect to data-driven thinking, remain uncertain. To bridge this gap, we introduce FinDABench, a comprehensive benchmark designed to evaluate the financial data analysis capabilities of LLMs. FinDABench assesses LLMs across three dimensions: 1) Foundational Ability, evaluating the models' ability to perform financial numerical calculation and corporate sentiment risk assessment; 2) Reasoning Ability, evaluating the models' ability to quickly comprehend textual information and analyze abnormal financial reports; and 3) Technical Skill, examining the models' use of technical knowledge to address real-world data analysis challenges involving analysis generation and chart visualization from multiple perspectives. We will release FinDABench and the evaluation scripts at https://github.com/cubenlp/BIBench. FinDABench aims to provide a measure for in-depth analysis of LLM abilities and to foster the advancement of LLMs in the field of financial data analysis.
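The benchmark's three-dimensional structure suggests a natural way to aggregate per-task scores into per-dimension scores. The sketch below is purely illustrative: the task names and score values are hypothetical placeholders, not actual FinDABench data or its real evaluation-script API.

```python
# Hypothetical sketch of aggregating per-task scores into the three
# FinDABench evaluation dimensions. Task names and scores are
# illustrative placeholders, not actual benchmark data.
from collections import defaultdict

# Each record: (dimension, task, score in [0, 1]) — hypothetical values.
results = [
    ("Foundational Ability", "numerical_calculation", 0.72),
    ("Foundational Ability", "sentiment_risk_assessment", 0.65),
    ("Reasoning Ability", "abnormal_report_analysis", 0.58),
    ("Technical Skill", "analysis_generation", 0.61),
    ("Technical Skill", "chart_visualization", 0.54),
]

def dimension_scores(records):
    """Average task-level scores within each evaluation dimension."""
    sums, counts = defaultdict(float), defaultdict(int)
    for dim, _task, score in records:
        sums[dim] += score
        counts[dim] += 1
    return {dim: sums[dim] / counts[dim] for dim in sums}

print(dimension_scores(results))
```

Averaging within a dimension before comparing models keeps a dimension with many tasks from dominating an overall score, which matches the benchmark's goal of reporting ability along each axis separately.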
