IGDA: Interactive Graph Discovery through Large Language Model Agents
Alex Havrilla, David Alvarez-Melis, Nicolo Fusi
Abstract
Large language models (LLMs) have emerged as a powerful method for discovery. Instead of utilizing numerical data, LLMs use the semantic metadata associated with variables to predict variable relationships. Simultaneously, LLMs demonstrate impressive abilities to act as black-box optimizers when given an objective f and a sequence of trials. We study LLMs at the intersection of these two capabilities by applying them to the task of interactive graph discovery: given a ground truth graph G^* capturing variable relationships and a budget of I edge experiments over R rounds, minimize the distance between the predicted graph G_R and G^* at the end of the R-th round. To solve this task we propose IGDA, an LLM-based pipeline incorporating two key components: 1) an LLM uncertainty-driven method for edge experiment selection, and 2) a local graph update strategy that uses binary feedback from experiments to improve predictions for unselected neighboring edges. Experiments on eight different real-world graphs show our approach often outperforms all baselines, including a state-of-the-art numerical method for interactive graph discovery. Further, we conduct a rigorous series of ablations dissecting the impact of each pipeline component. Finally, to assess the impact of memorization, we apply our interactive graph discovery strategy to a complex, new (as of July 2024) causal graph on protein transcription factors, finding strong performance in a setting where memorization is impossible. Overall, our results show IGDA to be a powerful method for graph discovery complementary to existing numerically driven approaches.
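The interactive loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the LLM's edge predictions are available as confidence scores in [0, 1], that "uncertainty" means closeness to 0.5, that the distance between graphs is the number of differing directed edges, and that the budget is split evenly across rounds. All function names, the optional `repredict` callback (a stand-in for an LLM re-predicting neighboring edges), and the toy confidence values are hypothetical.

```python
def hamming_distance(pred_edges, true_edges):
    """Number of directed edges on which two graphs disagree (assumed metric)."""
    return len(set(pred_edges) ^ set(true_edges))


def select_uncertain(conf, tested, k):
    """Pick up to k untested edges whose confidence is closest to 0.5."""
    candidates = [e for e in conf if e not in tested]
    candidates.sort(key=lambda e: (abs(conf[e] - 0.5), e))
    return candidates[:k]


def neighbors(edge, conf, tested):
    """Untested edges sharing at least one endpoint with `edge`."""
    u, v = edge
    return [e for e in conf
            if e != edge and e not in tested and (u in e or v in e)]


def run_discovery(true_edges, conf, rounds, per_round, repredict=None):
    """Run `rounds` rounds of edge experiments and return the predicted edge set.

    `repredict(neighbor, tested_edge, outcome, old_conf)` is a hypothetical hook
    standing in for an LLM call; when it is None, neighbors are left unchanged.
    """
    conf = dict(conf)  # do not mutate the caller's scores
    tested = set()
    for _ in range(rounds):
        for edge in select_uncertain(conf, tested, per_round):
            outcome = edge in true_edges        # binary experiment feedback
            conf[edge] = 1.0 if outcome else 0.0
            tested.add(edge)
            if repredict is not None:           # local update of neighboring edges
                for nb in neighbors(edge, conf, tested):
                    conf[nb] = repredict(nb, edge, outcome, conf[nb])
    return {e for e, c in conf.items() if c >= 0.5}


# Toy example with made-up confidence scores on a three-node graph.
true_edges = {("A", "B"), ("B", "C")}
initial_conf = {
    ("A", "B"): 0.9, ("B", "A"): 0.4, ("A", "C"): 0.55,
    ("C", "A"): 0.1, ("B", "C"): 0.45, ("C", "B"): 0.2,
}
initial_pred = {e for e, c in initial_conf.items() if c >= 0.5}
final_pred = run_discovery(true_edges, initial_conf, rounds=1, per_round=2)
print(hamming_distance(initial_pred, true_edges),
      hamming_distance(final_pred, true_edges))
```

In this toy run the two least certain edges are the ones the initial prediction gets wrong, so spending the budget on them removes both errors. On real graphs the benefit would depend on how well the confidence scores track actual errors and on how the neighboring edges are re-predicted.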