Genome-Bench: A Scientific Reasoning Benchmark from Real-World Expert Discussions

2025-05-26

Ming Yin, Yuanhao Qu, Dyllan Liu, Ling Yang, Le Cong, Mengdi Wang

Abstract

In this short report, we present an automated pipeline tailored to the genomics domain and introduce Genome-Bench, a new benchmark constructed from over a decade of scientific forum discussions on genome engineering. Our pipeline transforms raw interactions into a reinforcement-learning-friendly multiple-choice question format, yielding 3000+ high-quality question-answer pairs spanning foundational biology, experimental troubleshooting, tool usage, and beyond. To our knowledge, this is the first end-to-end pipeline for teaching LLMs to reason from scientific discussions, with promising potential for generalization to scientific domains beyond biology.
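To make the idea of a "reinforcement-learning-friendly" multiple-choice format concrete, here is a minimal sketch of how such an item could be represented and scored. It is not the authors' pipeline or data schema: the record fields (`question`, `options`, `answer`), the `Answer: <letter>` response convention, and the function names `extract_choice` and `reward` are all assumptions made for illustration. The point is only that a multiple-choice item admits an automatically checkable reward.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Mapping, Optional


@dataclass(frozen=True)
class MultipleChoiceItem:
    """Hypothetical record layout; field names are assumptions, not the paper's schema."""
    question: str
    options: Mapping[str, str]  # option letter -> option text
    answer: str                 # letter of the correct option


def extract_choice(response: str, valid_letters: Iterable[str]) -> Optional[str]:
    """Return the letter from the last 'Answer: X' statement, or None if absent.

    Assumes the model is prompted to end with 'Answer: <letter>'; this
    convention is an illustrative choice, not something the source specifies.
    """
    letters = "".join(re.escape(letter) for letter in sorted(valid_letters))
    pattern = r"answer\s*:\s*\(?([" + letters + r"])\b"
    matches = re.findall(pattern, response, flags=re.IGNORECASE)
    return matches[-1].upper() if matches else None


def reward(item: MultipleChoiceItem, response: str) -> float:
    """Binary reward: 1.0 if the extracted letter matches the key, else 0.0."""
    choice = extract_choice(response, item.options.keys())
    return 1.0 if choice == item.answer else 0.0


# Placeholder item for demonstration only; not taken from the benchmark.
example = MultipleChoiceItem(
    question="Placeholder question text",
    options={"A": "first option", "B": "second option"},
    answer="B",
)
```

Under these assumptions, `reward(example, "Some reasoning. Answer: B")` scores a correct response and any response without a parseable final answer scores zero, which is the kind of unambiguous signal that policy-optimization methods can consume.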

Tasks

Multiple-choice
