Protein-to-genome alignment with miniprot

2022-10-14

Heng Li

Abstract

Motivation: Protein-to-genome alignment is critical to annotating genes in non-model organisms. While there are a few tools for this purpose, all of them were developed over ten years ago and did not incorporate the latest advances in alignment algorithms. They are inefficient and cannot keep up with the rapid production of new genomes and quickly growing protein databases.

Results: Here we describe miniprot, a new aligner for mapping protein sequences to a complete genome. Miniprot integrates recent techniques such as k-mer sketch and SIMD-based dynamic programming. It is tens of times faster than existing tools while achieving comparable accuracy on real data.

Availability and implementation: https://github.com/lh3/miniprot