Characterising genomic and epigenomic variation between tumour-normal research samples using long nanopore sequencing reads


Overview

Genomic instability is characteristic of most cancers. Paired tumour-normal whole-genome sequencing enables a deeper understanding of genomic and epigenomic variability in cancer, which will prompt the discovery of new cancer biomarkers and new insights into the genetics of treatment-resistant tumours.

With traditional short-read sequencing technology, DNA fragments must undergo PCR amplification, which erases all epigenetic information, introduces bias, and limits variant detection to regions amenable to amplification. Furthermore, complex structural variants (SVs), which can reach megabase scale, cannot be spanned by short reads and may be missed.

Capturing a wide range of tumour-specific variation within a single sequencing assay has the potential to enhance the classification accuracy of tumour types and the identification of driver mutations involved in cancer progression. With Oxford Nanopore sequencing, native DNA reads of unrestricted length capture single nucleotide variants (SNVs), SVs, copy number variants (CNVs), short tandem repeats (STRs), and epigenetic modifications, including both 5mC and 5hmC, in a single dataset. Because the reads can span complex and repetitive regions, phasing is simplified: variants can be confidently assigned to the maternal or paternal chromosome, providing deeper resolution for the molecular characterisation of cancer research samples.

Here we present an end-to-end workflow to detect somatic variation between tumour-normal paired research samples using the PromethION sequencing device range.


Extraction: obtaining high-molecular-weight DNA

The most suitable extraction method for obtaining high-molecular-weight (HMW) DNA depends largely on your sample type. A range of protocols is available in the Community covering human clinical research samples, including brain tissue and blood. Before proceeding to library preparation, we highly recommend assessing DNA yield using a Qubit instrument, sample quality using a NanoDrop instrument, and DNA fragment length distribution using a fragment analyser.

Library preparation: selecting the right sequencing kit for your samples

When starting with HMW DNA, shearing and size selection can improve read length N50. We recommend the Oxford Nanopore Short Fragment Eliminator Kit to size select for longer fragments and the Diagenode Megaruptor 3 for light shearing. However, if it is not possible to isolate HMW DNA, or sample input amounts are limited, you can instead proceed straight to library preparation without size selection or fragmentation.
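
Read length N50 is the length L at which reads of length L or longer contain at least half of all sequenced bases. The following is a minimal Python sketch of that calculation. The read lengths and the 10 kb cutoff are hypothetical values chosen for illustration only; they do not represent the behaviour or specification of any size-selection kit.

```python
def read_length_n50(lengths):
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all bases."""
    if not lengths:
        raise ValueError("no read lengths supplied")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical read lengths in bases: many short fragments plus a few long reads.
before = [5_000] * 20 + [20_000, 30_000, 40_000]

# Illustrative size selection: drop everything below an assumed 10 kb cutoff.
after = [n for n in before if n >= 10_000]

print("N50 before size selection:", read_length_n50(before))
print("N50 after size selection:", read_length_n50(after))
```

In this toy dataset, removing the short fragments raises the N50 because the short reads no longer contribute to the total base count.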

To prepare HMW DNA libraries for sequencing, we recommend the Ligation Sequencing Kit. Offering the greatest control over read length and output, this PCR-free library preparation method also preserves base modifications in the native DNA.

Figure: schematic of the Ligation Sequencing Kit (LSK) library preparation workflow

Sequencing: utilising high-output PromethION Flow Cells

For high-output sequencing, we recommend the powerful PromethION device range. This features the benchtop PromethION 24 — configured for sequencing on up to 24 independent PromethION Flow Cells — and the compact PromethION 2 devices, which enable sequencing on up to two flow cells, for lower sample throughput requirements.

To characterise SVs, SNVs, and methylation, we recommend sequencing normal tissue research samples to 30x depth of coverage and paired tumour research samples to 60x. We recommend basecalling using super accuracy (SUP) mode.
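
As a rough planning aid, the coverage targets above can be converted into the sequencing yield required per sample. The Python sketch below assumes a human genome size of approximately 3.1 Gb, a figure not stated in this document, and ignores losses from unaligned or filtered reads, so treat the results as approximate lower bounds.

```python
# Assumption: approximate haploid human genome size in gigabases.
HUMAN_GENOME_GB = 3.1


def required_yield_gb(target_depth, genome_size_gb=HUMAN_GENOME_GB):
    """Gigabases of sequence needed to reach the target mean depth."""
    return target_depth * genome_size_gb


def depth_from_yield(yield_gb, genome_size_gb=HUMAN_GENOME_GB):
    """Mean depth of coverage obtained from a given yield in gigabases."""
    return yield_gb / genome_size_gb


normal_gb = required_yield_gb(30)  # normal sample at 30x
tumour_gb = required_yield_gb(60)  # tumour sample at 60x

print(f"Normal (30x): about {normal_gb:.0f} Gb")
print(f"Tumour (60x): about {tumour_gb:.0f} Gb")
```

Under these assumptions, the tumour sample needs about twice the yield of its paired normal sample.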

Output can be maximised by washing the flow cell using the Flow Cell Wash Kit and loading with fresh library every 24 hours.

Analysis: detecting somatic variants between tumour-normal samples

To identify somatic variants in the cancer genome via tumour-normal sequencing, we recommend the wf-somatic-variation analysis workflow, an EPI2ME solution that can be run through a simple point-and-click interface or from the command line. The workflow takes the BAM files produced by onboard basecalling, aligns the reads to a provided reference genome, then calls SNVs and small indels between the paired samples using ClairS [1] and SVs (>50 bp) using nanomonsv [2]. Methylation analysis of 5mC and 5hmC is performed using modkit [3]. Somatic variant data analysis takes between 6.5 and 10 hours in total.
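
For command-line use, the Python sketch below assembles a Nextflow invocation of the workflow. The option names (`--bam_tumor`, `--bam_normal`, `--ref`, `--sample_name`, `--snv`, `--sv`, `--mod`) are assumptions based on the workflow's public README and may differ between releases, so check them against the workflow's `--help` output before use. All file names and the sample name are hypothetical placeholders.

```python
import shlex


def build_somatic_command(tumor_bam, normal_bam, reference, sample_name):
    """Build an argument list for running wf-somatic-variation with Nextflow.

    Option names are assumed from the workflow README; verify them
    against the installed workflow version before running.
    """
    return [
        "nextflow", "run", "epi2me-labs/wf-somatic-variation",
        "--bam_tumor", tumor_bam,
        "--bam_normal", normal_bam,
        "--ref", reference,
        "--sample_name", sample_name,
        "--snv",  # SNV and small indel calling
        "--sv",   # structural variant calling
        "--mod",  # 5mC/5hmC methylation analysis
        "-profile", "standard",
    ]


# Hypothetical inputs for illustration only.
cmd = build_somatic_command("tumour.bam", "normal.bam", "reference.fna", "SAMPLE01")

# Print a shell-safe command line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps paths containing spaces intact if it is later passed to `subprocess.run`.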

  1. Luo, R. et al. Nat. Mach. Intell. 2:220–227 (2020). DOI: https://doi.org/10.1038/s42256-020-0167-4
  2. Shiraishi, Y. et al. Nucleic Acids Res. gkad526 (2023). DOI: https://doi.org/10.1093/nar/gkad526
  3. GitHub. modkit. Available at: https://github.com/nanoporetech/modkit [Accessed: 14 July 2025]