Structural variants (SVs) are of high importance in both normal and aberrant phenotypes; however, their detection using traditional technologies is limited by their size, complexity, and position in the genome. Long nanopore reads can span SVs end-to-end, with no need for PCR, enabling unprecedented resolution of even highly complex variants — in any genomic context.
Sequencing structural variants with short and long reads
Structural variants (SVs) are of high significance across a broad range of fields, from clinical research into their roles in diseases such as cancer, through to identifying SVs encoding desirable crop traits in agricultural science. However, because SVs can reach the megabase scale, many cannot be spanned by short reads; instead, they must be sequenced in short sections and reassembled. This can result in incomplete or incorrect assemblies (Figure 1). In addition, because PCR is required, SVs in regions that cannot be amplified may not be represented at all.
With Oxford Nanopore, there is no limit to read length: single reads frequently reach hundreds of kilobases in length, with a current record of over 4 Mb. This means that even large SVs can often be sequenced end-to-end in single reads, making for simple, accurate characterisation (see below) and often removing any need for assembly. Amplification is not required, avoiding PCR bias and allowing SVs to be identified across the genome, including in repetitive or GC-rich regions, such as repeat expansions, which are inaccessible to other methods. This also enables the sequencing of intact modified bases, so that SVs and their epigenetic effects can be revealed in a single experiment.
Long nanopore reads enable calling of SVs across the genome with high precision and recall
To assess the performance of SV calling with nanopore sequencing, the human reference sample GM24385 was sequenced on the PromethION device (Table 1). Structural variants were called and evaluated against the Genome in a Bottle (GIAB) v0.6 high-confidence truth set.
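As a rough illustration of how precision and recall are derived when a call set is compared against a truth set, the sketch below computes both metrics, plus the F1 score, from counts of true positive, false positive, and false negative calls. The function name and the example counts are hypothetical and are not taken from Table 1; in practice a dedicated benchmarking tool would produce the counts by matching calls to truth-set variants.

```python
def sv_calling_metrics(true_positives: int, false_positives: int, false_negatives: int) -> dict:
    """Summarise SV calling performance against a truth set.

    precision = TP / (TP + FP): fraction of called SVs that are in the truth set
    recall    = TP / (TP + FN): fraction of truth-set SVs that were called
    f1        = harmonic mean of precision and recall
    """
    called = true_positives + false_positives
    in_truth = true_positives + false_negatives

    precision = true_positives / called if called else 0.0
    recall = true_positives / in_truth if in_truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    metrics = sv_calling_metrics(true_positives=90, false_positives=10, false_negatives=30)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

High precision means few spurious calls; high recall means few missed variants. Reporting both guards against a caller that achieves one at the expense of the other.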
Nanopore technology is highly scalable. Large plant and animal genome SV surveys can be performed to high depth of coverage on the powerful PromethION device, or SVs in smaller genomes can be thoroughly assessed on the portable Flongle and MinION, whilst the GridION offers the flexibility to scale up or down to match your experimental goals. Targeted sequencing can also be used to enrich for SVs of interest: large SVs in any region of the genome can be targeted without PCR using either adaptive sampling (a unique on-device enrichment methodology) or Cas9 targeted sequencing. Library preparation is simple and can take as little as ten minutes; combined with real-time sequencing and analysis, including a dedicated SV analysis pipeline, this makes nanopore sequencing a powerful tool for the study of SVs.
Population-scale analysis of human structural variants
‘We show that SVs can be accurately characterized at population scale using long read sequence data in a genomewide non-targeted fashion and how these variants impact disease’.
Researchers at deCODE genetics, Iceland, performed whole-genome native DNA sequencing of 1,817 Icelanders using high-throughput GridION and PromethION devices. With a read length N50 of 14.7 kb, the sequencing data revealed a median of 23,111 autosomal SVs per individual, spanning a median total length of 9.9 Mb. The authors noted that, in previous large-scale studies using short-read technology, only ~2,000-8,000 SVs had been identified per genome. It was further demonstrated that rare SVs are larger in size than common SVs and are more likely to impact protein function. Association of the SVs with phenotypic data found that carriers of a 14,154 bp deletion overlapping the first exon of PCSK9, a target of cholesterol-lowering drugs, had lower levels of LDL cholesterol, highlighting the functional significance of such studies.
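For readers unfamiliar with the read length N50 statistic quoted above, the following minimal sketch shows one common way to compute it: sort the reads from longest to shortest and report the length of the read at which the running total first reaches half of all sequenced bases. The function name and the toy read lengths are hypothetical and are unrelated to the study data.

```python
def read_length_n50(read_lengths: list[int]) -> int:
    """Return the N50 of a set of read lengths.

    The N50 is the length L such that reads of length >= L together
    contain at least half of all sequenced bases.
    """
    if not read_lengths:
        raise ValueError("read_lengths must not be empty")

    total_bases = sum(read_lengths)
    running_total = 0
    # Walk from the longest read downwards, accumulating bases.
    for length in sorted(read_lengths, reverse=True):
        running_total += length
        if running_total * 2 >= total_bases:
            return length

    # Unreachable for non-empty input of positive lengths.
    raise RuntimeError("N50 could not be determined")


if __name__ == "__main__":
    # Toy read lengths in bases, for illustration only.
    toy_reads = [2_000, 3_000, 4_000, 5_000, 6_000, 7_000, 8_000, 9_000, 10_000]
    print(read_length_n50(toy_reads))
```

Unlike the mean or median read length, the N50 is weighted by bases, so it reflects the read lengths that carry most of the data.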
How can I best call structural variation with Oxford Nanopore?
A whole human genome SV survey can be achieved by sequencing on a single PromethION Flow Cell, producing ~30x depth of coverage. We recommend preparing your library using the Ligation Sequencing Kit, for high throughput and long reads without PCR. Our SV pipeline will then take you from raw data to SV calling and visualisation; view our step-by-step tutorial for instructions, or use our fully-automated EPI2ME SV calling workflow to avoid the command line.
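The ~30x figure can be related to sequencing output with simple arithmetic: mean depth of coverage is the total number of sequenced bases divided by the genome size. The sketch below assumes a human genome size of roughly 3.1 Gb; that value, the function names, and the example yield are illustrative assumptions rather than specifications of any flow cell.

```python
HUMAN_GENOME_SIZE_BP = 3_100_000_000  # assumed approximate human genome size


def mean_depth(total_bases: int, genome_size_bp: int = HUMAN_GENOME_SIZE_BP) -> float:
    """Mean depth of coverage = total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_bases / genome_size_bp


def bases_required(target_depth: float, genome_size_bp: int = HUMAN_GENOME_SIZE_BP) -> float:
    """Total sequenced bases needed to reach a target mean depth."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return target_depth * genome_size_bp


if __name__ == "__main__":
    # Hypothetical yield of 93 Gb, for illustration only.
    print(f"Mean depth: {mean_depth(93_000_000_000):.1f}x")
    print(f"Bases needed for 30x: {bases_required(30) / 1e9:.0f} Gb")
```

This is a mean over the whole genome; depth at any individual locus will vary around it.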
Discover more about the advantages of long nanopore sequencing reads for analysing structural variation.