TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation

2024-12-19

Jiatong Li, Junxian Li, Yunqing Liu, Dongzhan Zhou, Qing Li

Abstract

In this paper, we propose the Text-based Open Molecule Generation Benchmark (TOMG-Bench), the first benchmark to evaluate the open-domain molecule generation capability of LLMs. TOMG-Bench encompasses a dataset of three major tasks: molecule editing (MolEdit), molecule optimization (MolOpt), and customized molecule generation (MolCustom). Each major task contains three subtasks, and each subtask comprises 5,000 test samples. Given the inherent complexity of evaluating open molecule generation, we also developed an automated evaluation system that measures both the quality and the accuracy of the generated molecules. Our comprehensive benchmarking of 25 LLMs reveals the current limitations of, and potential areas for improvement in, text-guided molecule discovery. Furthermore, we propose OpenMolIns, a specialized instruction-tuning dataset built to address the challenges raised by TOMG-Bench. Fine-tuned on OpenMolIns, Llama3.1-8B can outperform all open-source general-purpose LLMs, even surpassing GPT-3.5-turbo by 46.5% on TOMG-Bench. Our code and datasets are available at https://github.com/phenixace/TOMG-Bench.
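
The benchmark size implied by the abstract can be checked with a short calculation. The sketch below uses only the counts stated above; the subtask names are not given in the abstract, so only their number per task is modeled, and the variable names are illustrative.

```python
# Task names come from the abstract. Subtask names are not listed there,
# so only the number of subtasks per task is modeled (an assumption of this sketch).
TASKS = ("MolEdit", "MolOpt", "MolCustom")
SUBTASKS_PER_TASK = 3
SAMPLES_PER_SUBTASK = 5_000

# Test samples in one major task, then across the whole benchmark.
samples_per_task = SUBTASKS_PER_TASK * SAMPLES_PER_SUBTASK
total_samples = len(TASKS) * samples_per_task

print(samples_per_task)  # 15000
print(total_samples)     # 45000
```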

Benchmark Results

Dataset     Model                               Metric  Claimed  Verified  Status
TOMG-Bench  Claude-3.5                          wAcc    35.92    -         Unverified
TOMG-Bench  Gemini-1.5-pro                      wAcc    34.8     -         Unverified
TOMG-Bench  GPT-4-turbo                         wAcc    34.23    -         Unverified
TOMG-Bench  GPT-4o                              wAcc    32.29    -         Unverified
TOMG-Bench  Claude-3                            wAcc    30.47    -         Unverified
TOMG-Bench  Llama-3.1-8B (OpenMolIns-large)     wAcc    27.22    -         Unverified
TOMG-Bench  Galactica-125M (OpenMolIns-xlarge)  wAcc    25.73    -         Unverified
TOMG-Bench  Llama3-70B-Instruct (INT4)          wAcc    23.93    -         Unverified
TOMG-Bench  Galactica-125M (OpenMolIns-large)   wAcc    23.42    -         Unverified
TOMG-Bench  Galactica-125M (OpenMolIns-medium)  wAcc    19.89    -         Unverified
TOMG-Bench  GPT-3.5-turbo                       wAcc    18.58    -         Unverified
TOMG-Bench  Galactica-125M (OpenMolIns-small)   wAcc    15.18    -         Unverified
TOMG-Bench  Llama3.1-8B-Instruct                wAcc    14.09    -         Unverified
TOMG-Bench  Llama3-8B-Instruct                  wAcc    13.75    -         Unverified
TOMG-Bench  chatglm-9B                          wAcc    13.14    -         Unverified
TOMG-Bench  Galactica-125M (OpenMolIns-light)   wAcc    13.14    -         Unverified
TOMG-Bench  Llama3.2-1B (OpenMolIns-large)      wAcc    8.1      -         Unverified
TOMG-Bench  yi-1.5-9B                           wAcc    7.32     -         Unverified
TOMG-Bench  Mistral-7B-Instruct-v0.2            wAcc    4.81     -         Unverified
TOMG-Bench  BioT5-base                          wAcc    4.21     -         Unverified
TOMG-Bench  MolT5-large                         wAcc    2.89     -         Unverified
TOMG-Bench  Llama-3.1-1B-Instruct               wAcc    1.99     -         Unverified
TOMG-Bench  MolT5-base                          wAcc    1.3      -         Unverified
TOMG-Bench  MolT5-small                         wAcc    1.3      -         Unverified
TOMG-Bench  Qwen2-7B-Instruct                   wAcc    0.15     -         Unverified
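
The 46.5% figure quoted in the abstract appears to be consistent with a relative gain in wAcc between two rows of this table. The sketch below assumes, without confirmation from the source, that the comparison is between the Llama-3.1-8B (OpenMolIns-large) row and the GPT-3.5-turbo row, and that the gain is measured relative to the GPT-3.5-turbo score. The function name `relative_gain` is illustrative.

```python
# Claimed (unverified) wAcc scores copied from the results table.
claimed_wacc = {
    "Llama-3.1-8B (OpenMolIns-large)": 27.22,
    "GPT-3.5-turbo": 18.58,
}


def relative_gain(new: float, baseline: float) -> float:
    """Return the improvement of `new` over `baseline` as a percentage of `baseline`."""
    return (new - baseline) / baseline * 100.0


# Assumption: these two rows are the ones compared in the abstract.
gain = relative_gain(
    claimed_wacc["Llama-3.1-8B (OpenMolIns-large)"],
    claimed_wacc["GPT-3.5-turbo"],
)
print(f"{gain:.1f}%")  # 46.5%
```

Under that assumption the two numbers agree: the absolute gap is 8.64 wAcc points, which is about 46.5% of the GPT-3.5-turbo score.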