Genome assembly
High-quality genome assemblies are essential as reliable reference sequences. However, the short reads produced by traditional sequencing technologies lead to highly fragmented, incomplete assemblies. Short reads cannot span important genomic regions such as repeats and structural variants, so these regions are often assembled incorrectly. In contrast, nanopore technology can deliver long and ultra-long sequencing reads (current record >4 Mb) that span complex genomic regions, enabling the generation of highly contiguous genome assemblies.
Introduction
Generate more contiguous genome assemblies using long sequencing reads
Large structural variants, repeat sequences, and GC-rich regions are challenging to accurately characterise with short-read sequencing technology, and the resulting genome assemblies tend to be fragmented due to the lack of read overlap. Nanopore technology routinely generates sequencing reads that are tens of kilobases in length, and is also capable of sequencing ultra-long libraries (i.e. read N50 of >100 kb; Figure 1). The greater overlap between ultra-long reads enables easier de novo genome assembly. The longest DNA fragment sequenced to date using nanopore technology is 4.2 Mb, which was achieved using the Ultra-Long DNA Sequencing Kit. The long-read capability of nanopore sequencing not only enables accurate delineation of complex genomic regions such as repeats and structural variants, but also the sequencing of smaller microbial genomes in single reads — negating the need for assembly entirely (see poster).

Figure 1: Nanopore sequencing delivers long and ultra-long read lengths that can span complex genomic regions, enabling the generation of highly complete and contiguous genome assemblies.
Table 1: Comparison of banana genome assemblies generated using short-read technologies and nanopore sequencing. Long nanopore sequencing reads enabled the assembly of a highly complete genome with approximately 155-fold fewer contigs. Over 177x coverage of the Musa acuminata genome was delivered using a single PromethION Flow Cell, and of the 11 chromosomes, 5 were entirely reconstructed, telomere-to-telomere, in single contigs. Data from Belser et al. Commun Biol. 4(1):1047 (2021).
Comprehensive genomic analysis, including direct detection of modified bases
A common metric for assessing genome assembly quality is contig N50: the contig length at which contigs of that length or longer together contain at least half of the nucleotides in the assembly. Long nanopore sequencing reads deliver significantly higher N50 values than short-read sequencing technologies, enabling the generation of more complete and more contiguous genome assemblies (Table 1). In addition, using Pore-C, a complete, end-to-end workflow for nanopore sequencing-based chromosome conformation capture, large genome assemblies can be further scaffolded and corrected. Long sequencing reads also simplify haplotyping, enabling the resolution of compound heterozygosity and parental origin. Furthermore, nanopore sequencing does not require amplification, allowing the direct detection of base modifications (e.g. methylation) alongside the nucleotide sequence for even more comprehensive genomic analyses.
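To make the contig N50 definition above concrete, here is a minimal Python sketch of how the metric could be computed from a list of contig lengths. The function name `n50` and the toy contig lengths are hypothetical illustrations; they are not taken from any dataset or Oxford Nanopore tool.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or read) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases as we go.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first contig that brings the running total to at least
        # half of all assembled bases defines the N50.
        if 2 * running >= total:
            return length


# Hypothetical assemblies of the same total size (100 units each).
fragmented = [10] * 10             # many short contigs
contiguous = [50, 30, 10, 5, 5]    # a few long contigs

print(n50(fragmented))   # 10
print(n50(contiguous))   # 50
```

Both toy assemblies contain the same number of bases, but the one built from fewer, longer contigs has the higher N50, which is why the metric is used as a measure of contiguity.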
Case study
Delivering improved crop reference genomes
‘using a plant-trained basecalling model, nanopore-only reference crop genomes can be obtained with outstanding contiguity and accuracy, reducing the requirements for multiple technologies to generate reference-quality genomes’
Alexander Wittenberg, KeyGene, Netherlands
Scientists at KeyGene in the Netherlands are at the forefront of technology innovation for crop improvement. A significant focus is crop improvement through breeding for traits such as pathogen resistance, extended shelf life, and improved taste and colour. The insights obtained using a high-quality reference genome enable better and faster selection of important breeding traits, allowing new plant varieties to be brought to market faster. Using the PromethION 24 device and a plant-trained basecalling model, the KeyGene team generated the most contiguous lettuce genome ever assembled. Using nanopore sequencing alone, the genome was captured in just 159 contigs. This contrasts with 153,952 contigs for the 2017 short-read-based reference genome, and 1,541 contigs for a genome assembled using an alternative long-read capable sequencing technology. Using their STL assembler, the nanopore-only genome was assembled within 30 hours, and consensus accuracies were shown to be on par with those obtained using alternative technologies.
'Nanopore data produces highly contiguous genome sequence assemblies.'
Jessica Allen, Eastern Washington University, USA
Case study
Using nanopore sequencing to investigate genome evolution in fungal symbioses
Lichenised fungi represent around 20% of described fungal species; they occur in every terrestrial ecosystem, performing essential functions such as nutrient cycling. Lichenised fungi also act as essential components of food webs, and can be utilised as biomarkers for air quality and pollution. They are an excellent model system, as they form stable obligate symbioses with green algae and/or cyanobacteria and function as ‘miniature ecosystems’, and are rich in bacterial communities. They also feature unique secondary metabolites, such as organic acids, with potential use in industrial and pharmaceutical applications.
Jessica Allen (Eastern Washington University, USA) and her team used the MinION to sequence lichenised fungi and generated 19 fungal genomes, which included telomere-to-telomere assembly of most chromosomes. With average contig N50s of over 1 Mb, the fungal genome assemblies generated by Jessica's team were more contiguous than the publicly available assemblies generated via short-read technologies, and nearly doubled the total number of lichenised fungal genome assemblies available. Her team also sequenced the genome of the critically endangered Sulcaria isidiifera, in collaboration with the ORG.one programme, and characterised a 71.4 Mb genome containing 46.79% total interspersed repeats.
Sequencing workflow
How do I assemble genomes using nanopore sequencing?
Oxford Nanopore provides a range of sequencing devices suitable for any sized genome assembly project, from small individual microbial genomes to high-throughput, population-scale sequencing of large genomes.
For best practice advice on genome assembly, view our whole-genome sequencing Getting Started guides for small or large genomes. These guides provide a step-by-step overview of the entire sequencing workflow — from selecting the right nanopore sequencing device through to sample preparation, sequencing, and data analysis. Our best practice workflows for human and microbial genome assembly provide structured, recommended workflows for assembling genomes using nanopore sequencing technology.
Read our simple, end-to-end workflow for microbial genome assembly from an isolate.
High-throughput assembly of large genomes
For high-throughput sequencing and assembly of large and complex genomes, such as those of humans, animals, and plants, we recommend the following: