BOAssembler: a Bayesian Optimization Framework to Improve RNA-Seq Assembly Performance
Shunfu Mao, Yihan Jiang, Edwin Basil Mathew, Sreeram Kannan
Abstract
High-throughput sequencing of RNA (RNA-Seq) yields millions of short fragments of RNA transcripts from a sample. Recovering the original RNA transcripts from those fragments (RNA-Seq assembly) remains a difficult task. For example, RNA-Seq assembly tools typically require hyper-parameter tuning to achieve good performance on a particular dataset. This tuning is usually unintuitive and time-consuming. Consequently, users often resort to default parameters, which do not guarantee consistently good performance across datasets. Here we propose BOAssembler (https://github.com/olivomao/boassembler), a framework that enables end-to-end automatic tuning of RNA-Seq assemblers based on Bayesian Optimization principles. Experiments show that this data-driven approach is effective at improving overall assembly performance. The approach would be helpful for downstream (e.g. gene, protein, cell) analysis and, more broadly, for future bioinformatics benchmark studies.
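To make the idea of Bayesian Optimization concrete, the following is a minimal, generic sketch of such a tuning loop in plain Python. It is not BOAssembler's implementation: the abstract does not specify the surrogate model, the acquisition function, or the assembly metric, so the Gaussian-process surrogate, the upper-confidence-bound rule, the single parameter scaled to [0, 1], and every name here (`toy_quality`, `bayes_opt`, `kappa`, and so on) are assumptions chosen for illustration. In a real setting, `toy_quality` would be replaced by a function that runs an assembler with the given parameter and scores the result.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar parameter values."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(xs, ys, noise=1e-4):
    """Factor the kernel matrix once and precompute K^{-1} y."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    return L, alpha

def predict(xs, L, alpha, x_new):
    """Posterior mean and standard deviation of the zero-mean GP at x_new."""
    k_star = [rbf(a, x_new) for a in xs]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(1.0 - sum(t * t for t in v), 0.0)  # rbf(x, x) == 1
    return mean, math.sqrt(var)

def bayes_opt(objective, n_init=3, n_iter=12, kappa=2.0, grid_size=101):
    """Maximize objective over [0, 1] with a GP surrogate and a UCB rule."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    xs = [i / (n_init - 1) for i in range(n_init)]  # evenly spaced start points
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        offset = sum(ys) / len(ys)  # center scores for the zero-mean GP
        L, alpha = fit_gp(xs, [y - offset for y in ys])

        def ucb(g):
            mean, std = predict(xs, L, alpha, g)
            return mean + kappa * std  # favor high predictions and high uncertainty

        x_next = max((g for g in grid if g not in xs), key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))  # the expensive step in a real pipeline
    best = max(range(len(xs)), key=ys.__getitem__)
    return xs[best], ys[best], xs, ys

def toy_quality(x):
    """Hypothetical stand-in for an assembly quality score; peaks at x = 0.3."""
    return -(x - 0.3) ** 2
```

Calling `bayes_opt(toy_quality)` returns the best parameter value found, its score, and the full evaluation history. The point of the design is that each new evaluation is chosen using everything observed so far, which matters when a single evaluation (an assembly run) is expensive; how many evaluations are needed in practice depends on the assembler and dataset.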