Accurate detection of single nucleotide polymorphisms using nanopore sequencing

Nanopore sequencing is a powerful single-molecule DNA sequencing technology that provides high throughput and long reads. Nevertheless, its relatively high native error rate limits the direct detection of point mutations in individual reads of amplicon libraries, because these mutations are difficult to distinguish from sequencing noise. We propose a computational method to reduce noise in nanopore detection of point variations. Our approach exploits the fact that all reads are expected to be very similar to a wild-type sequence, for which we experimentally characterize the position-specific systematic sequencing error pattern. We then use this information to reweight, in individual reads from the variant library, the confidence given to read nucleotides that do not match the wild type. We tested this method on two sets of known variants of Klen Taq, where the true mutation rate was 3.3 mutations per kb, well below the sequencing noise. We observed that the actual mutations became more distinguishable from sequencing noise after correction. This approach can be used, for example, to help cluster variants or to decrease the number of reads necessary to call a consensus.

The computational method is simple to implement and requires only a few thousand reads of the wild-type sequence of interest, which can easily be obtained by multiplexing in a single MinION run. The approach does not require any modification of the experimental sequencing protocol and can simply be applied downstream of standard base calling.
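
The reweighting idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`error_profile`, `mismatch_confidence`), the pseudocount, and the Bayesian-style scoring formula are all assumptions made for illustration. The sketch also assumes that reads have already been aligned to the wild-type reference and padded to the same length, with `-` marking deletions, and it ignores insertions. The default prior is derived from the 3.3 mutations per kb quoted above, split evenly across the three possible substitutions, which is likewise an assumption.

```python
from collections import Counter

ALPHABET = "ACGT-"  # '-' marks a deletion in a pre-aligned read (assumption)


def error_profile(wt_reads, reference, pseudocount=1.0):
    """Estimate, per position, how often each non-reference symbol
    appears in reads of the wild-type sequence (systematic noise)."""
    total = len(wt_reads) + pseudocount * len(ALPHABET)
    profile = []
    for pos, ref_base in enumerate(reference):
        counts = Counter(read[pos] for read in wt_reads)
        profile.append({
            base: (counts[base] + pseudocount) / total
            for base in ALPHABET
            if base != ref_base
        })
    return profile


def mismatch_confidence(read, reference, profile, prior=3.3e-3 / 3):
    """Score each mismatch in a variant read.

    Hypothetical scoring rule: prior / (prior + noise), where `noise` is the
    frequency of the same mismatch at the same position in wild-type reads.
    Mismatches that are common in wild-type reads get a low score.
    """
    scores = {}
    for pos, (observed, ref_base) in enumerate(zip(read, reference)):
        if observed == ref_base or observed not in profile[pos]:
            continue
        noise = profile[pos][observed]
        scores[pos] = prior / (prior + noise)
    return scores


if __name__ == "__main__":
    reference = "ACGT"
    # Toy wild-type reads with a recurrent G->T error at position 2.
    wt_reads = ["ACGT"] * 90 + ["ACTT"] * 10
    profile = error_profile(wt_reads, reference)
    # Variant read with mismatches at positions 1 and 2.
    print(mismatch_confidence("AGTT", reference, profile))
```

In this toy example the mismatch at position 2 coincides with a recurrent wild-type error and therefore receives a lower score than the mismatch at position 1, which is rarely seen in wild-type reads.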

Authors: Rocio Espada, Nikola Zarevski, Adele Drame-Maigne, Yannick Rondelez