Boolean matrix logic programming for active learning of gene functions in genome-scale metabolic network models
Lun Ai, Stephen H. Muggleton, Shi-Shun Liang, Geoff S. Baldwin
Code (official): github.com/lai1997/bmlp_active_public
Abstract
Reasoning about hypotheses and updating knowledge through empirical observations are central to scientific discovery. In this work, we applied logic-based machine learning methods to drive biological discovery by guiding experimentation. Genome-scale metabolic network models (GEMs), which are comprehensive representations of metabolic genes and reactions, are widely used to evaluate genetic engineering of biological systems. However, GEMs often fail to accurately predict the behaviour of genetically engineered cells, primarily due to incomplete annotations of gene interactions. Learning the intricate genetic interactions within GEMs presents both computational and empirical challenges. To make predictions with a GEM efficiently, we describe a novel approach called Boolean Matrix Logic Programming (BMLP), which leverages Boolean matrices to evaluate large logic programs. We developed a new system, BMLP_active, which guides cost-effective experimentation and uses interpretable logic programs to encode a state-of-the-art GEM of a model bacterial organism. Notably, BMLP_active successfully learned the interaction between a gene pair with fewer training examples than random experimentation, overcoming the increase in the experimental design space. BMLP_active enables rapid optimisation of metabolic models to reliably engineer biological systems for producing useful compounds. It offers a realistic approach to creating a self-driving lab for biological discovery, which would in turn facilitate microbial engineering for practical applications.
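To illustrate the general idea of evaluating a logic program with Boolean matrices, the following is a minimal sketch and not the authors' implementation. It assumes a toy recursive program, `path(X,Y) :- edge(X,Y).` and `path(X,Y) :- edge(X,Z), path(Z,Y).`, over four hypothetical nodes `a` to `d`, and computes its least model as the transitive closure of the `edge` matrix by repeated Boolean squaring. The function names and the example network are assumptions made for illustration only.

```python
def bool_matmul(a, b):
    """Boolean matrix product: c[i][j] = OR over k of (a[i][k] AND b[k][j])."""
    inner, cols = len(b), len(b[0])
    return [
        [any(row[k] and b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def bool_or(a, b):
    """Element-wise OR of two Boolean matrices of the same shape."""
    return [[x or y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transitive_closure(edge):
    """Iterate closure := closure OR (closure x closure) until a fixpoint.

    The fixpoint holds exactly the path(X, Y) facts derivable from the
    two-clause toy program described above.
    """
    closure = [row[:] for row in edge]
    while True:
        updated = bool_or(closure, bool_matmul(closure, closure))
        if updated == closure:
            return closure
        closure = updated


# Hypothetical toy network: a -> b -> c -> d (row = source, column = target).
nodes = ["a", "b", "c", "d"]
edge = [
    [False, True, False, False],
    [False, False, True, False],
    [False, False, False, True],
    [False, False, False, False],
]
path = transitive_closure(edge)

# List every node reachable from "a".
reachable_from_a = [n for n, hit in zip(nodes, path[0]) if hit]
print(reachable_from_a)
```

Representing the relation as a matrix turns each round of rule application into one matrix product, so the whole set of derivable facts is obtained in a logarithmic number of squarings rather than by resolving one query at a time.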