What’s missing matters
The genome reference assembly produced by the Telomere-to-Telomere (T2T) consortium includes regions that have never been represented in prior references. These regions, which make up 8% of the genome, encompass genomic elements such as segmental duplications, repeat expansions, and pseudogenes that will be important not just for understanding biology, but also for future clinical use.
‘What we have learned from the T2T assembly—and from many efforts in the past 20 years to resolve genomic regions that proved impenetrable to conventional sequencing technologies—is that these missing elements matter’, Sissel wrote.
The new capabilities needed to access this 8% of the genome largely came from long-read sequencing technologies, including long nanopore reads. With extremely long sequence reads, it is possible to capture genomic regions that could not be accurately sequenced and assembled using short reads.
Rare disease examples
By applying long nanopore reads to rare disease research, scientists have uncovered previously inaccessible regions of the genome that appear to be relevant for understanding these diseases.
‘Scientists have made stunning progress in identifying the genetic causes of rare diseases with long-read data, increasing the diagnostic yield compared with earlier approaches such as whole exome sequencing or even whole genome sequencing based solely on short-read data’, Sissel noted.
The article includes several examples of these research studies, focusing on two recently published cases. One came from a team at Stanford University, where nanopore sequencing was used to support an ultra-rapid pipeline. For one sample, this approach identified the candidate variant responsible for the phenotype in less than eight hours. The other study was published by scientists at the University of Washington, who used targeted nanopore sequencing to analyse samples from dozens of people with rare diseases. These data not only recapitulated the variants previously identified in those samples, but also revealed candidate variants for several samples that previously had no answers.
A new mandate
In this article, Sissel points out that these studies raise the bar for all scientists performing rare disease research. ‘Now that we know just how important these tough-to-access genomic regions can be, it should no longer be considered acceptable to rely solely on short-read data for any study related to rare disease’, she noted. ‘Supplementing short-read data with rich long-read or long-range information, either from the same or different sequencing platforms, will be essential going forward’.
The rule of thumb for these projects is simple, Sissel added. Any approach to characterising genomes for rare disease analysis should make it possible to produce comprehensive data for every genomic region of interest, even the ones that have proven challenging in the past.