Kimberley Billingsley and Pilar Alvarez Jerez from the long-read sequencing team at the Center for Alzheimer’s and Related Dementias (CARD), National Institutes of Health, recently presented their work on using nanopore sequencing for Alzheimer’s disease research in the webinar ‘Scalable nanopore sequencing for Alzheimer’s research’.
We caught up with them to discuss their current research interests, what led them to focus on neurodegenerative disorders, and how nanopore sequencing is helping our understanding of structural variation in Alzheimer’s and related dementias.
What are your current research interests?
We are interested in identifying associations between genetic variation and disease. It is clear that complex structural variation can affect neurodegenerative diseases, especially in the rare causal variant space, but little is known about common complex structural variation in the genome in relation to these diseases. Our main research interests are generating large-scale genomic resources to identify both rare variants that cause disease and common variants that contribute to it, and, of course, making all this data available to the research community in an easy-to-access manner.
What first ignited your interest in genomics and what led you to focus on neurodegenerative disorders such as Alzheimer’s and Parkinson’s disease?
Genomics is cool! Personally, I [Kimberley] became interested in the genetics of neurological disease during my master’s degree when I listened to a talk about transposable elements, which are these little bits of DNA that can copy and paste themselves into the genome. After that I was hooked and I’ve been working on characterising difficult parts of the genome ever since.
How is nanopore sequencing changing our understanding of structural variation in Alzheimer’s and dementia? How has it benefited your work?
Over the last few years, we’ve spent a lot of time and resources generating and analysing short-read sequencing data. Unfortunately, when we’ve looked for disease-associated structural variants in these datasets, the nature of the technology means we have found a high number of false positives. Oxford Nanopore’s technology is changing the field because we can now sequence very long stretches of DNA, and with these longer reads we can better detect disease-associated variants. Now that we have developed a scalable wet lab and computational pipeline that allows us to generate highly accurate variant calls, methylation calls, and de novo assemblies across thousands of samples, we are really looking forward to applying this framework to disease cohorts to start making new genetic discoveries.
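The team’s exact pipeline isn’t detailed here, but as an illustration, a common open-source route from nanopore reads to structural variant calls combines minimap2 alignment with a long-read structural variant caller such as Sniffles2. The minimal sketch below strings those tools together from Python; the reference, read file, output names, and thread count are all placeholders, not the CARD pipeline itself.

```python
# Minimal sketch of one common long-read SV workflow:
# minimap2 alignment -> samtools sort/index -> Sniffles2 SV calling.
# All paths and parameters are placeholders for illustration only.
import subprocess

REF = "GRCh38.fa"             # reference genome (placeholder path)
READS = "sample_reads.fastq"  # nanopore reads (placeholder path)
BAM = "sample.sorted.bam"
VCF = "sample.sv.vcf"

# Align nanopore reads to the reference and sort the alignments.
align = subprocess.Popen(
    ["minimap2", "-a", "-x", "map-ont", "-t", "16", REF, READS],
    stdout=subprocess.PIPE,
)
subprocess.run(["samtools", "sort", "-o", BAM, "-"], stdin=align.stdout, check=True)
align.stdout.close()
align.wait()
subprocess.run(["samtools", "index", BAM], check=True)

# Call structural variants from the long-read alignments with Sniffles2.
subprocess.run(["sniffles", "--input", BAM, "--vcf", VCF], check=True)
```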
What impact is the ability to get epigenetic information from your nanopore data having on your research, and what insights could it offer?
Getting epigenetic information from our nanopore whole-genome data allows us to identify structural variants and investigate their role in methylation differences simultaneously, therefore speeding up the process by which we can characterise areas of the genome harbouring genetic events. This is further aided by the high correlation between nanopore methylation calls and those made by standard whole-genome bisulfite sequencing (WGBS). However, unlike WGBS, nanopore sequencing allows us to investigate epigenetic information at haplotype-level resolution, which allows for a more detailed exploration of methylation in regions around heterozygous structural variants.
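As an illustration of what haplotype-level methylation analysis can look like (not the team’s specific workflow), the sketch below uses pysam to tally 5mC calls per haplotype from a phased BAM carrying the standard HP haplotype tag and MM/ML modified-base tags. The file name, region, and probability threshold are assumptions.

```python
# Minimal sketch: per-haplotype CpG methylation from a phased,
# modified-base-tagged BAM. File, region, and threshold are illustrative.
from collections import defaultdict
import pysam

counts = defaultdict(lambda: [0, 0])  # haplotype -> [methylated, total] calls

with pysam.AlignmentFile("sample.phased.bam") as bam:
    for read in bam.fetch("chr21", 5_000_000, 5_100_000):
        if not read.has_tag("HP"):
            continue                      # skip reads that could not be phased
        hap = read.get_tag("HP")          # haplotype 1 or 2
        # modified_bases maps (base, strand, mod code) -> [(read_pos, quality), ...]
        for (_base, _strand, mod), calls in (read.modified_bases or {}).items():
            if mod != "m":                # keep 5mC calls only
                continue
            for _pos, qual in calls:
                counts[hap][1] += 1
                if qual >= 128:           # ML scores are 0-255; >=128 ~ P(modified) >= 0.5
                    counts[hap][0] += 1

for hap, (meth, total) in sorted(counts.items()):
    frac = meth / total if total else 0.0
    print(f"haplotype {hap}: {meth}/{total} CpG calls methylated ({frac:.1%})")
```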
What impact does the ability to generate phased variant calls and assemblies from long-read sequences have for researchers?
The ability to generate phased variant calls and assemblies from our nanopore long reads facilitates the characterisation of complex regions that contain multiple small and structural variants on both haplotypes. Knowing with high certainty how variants are distributed across the two haplotypes can lead to a better understanding of variant linkage and of effects on expression.
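To make the idea concrete, the hedged sketch below reads phased genotypes from a VCF (for example, one produced by a long-read phasing tool such as WhatsHap) and reports which haplotype carries each alternate allele. The file path, region, and sample handling are illustrative rather than the authors’ actual workflow.

```python
# Minimal sketch: inspect phased genotypes in a VCF and report
# which haplotype each variant allele sits on. Paths and region are placeholders.
import pysam

with pysam.VariantFile("sample.phased.vcf.gz") as vcf:
    sample = list(vcf.header.samples)[0]
    for rec in vcf.fetch("chr21", 5_000_000, 5_100_000):
        call = rec.samples[sample]
        gt = call["GT"]                   # e.g. (0, 1)
        if call.phased and None not in gt:
            # For phased calls, gt[0] is haplotype 1 and gt[1] is haplotype 2,
            # so we know which allele each variant sits on.
            print(rec.contig, rec.pos, rec.alleles, f"hap1|hap2 = {gt[0]}|{gt[1]}")
```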
What have been the main challenges in your work and how have you approached them, especially working on high-throughput projects with large cohorts?
Scaling up is always complex. Prepping and sequencing a hundred brain samples is doable with a few skilled lab hands, but if we scale this up to hundreds, and potentially thousands, of samples, the process needs to be automated with robotics. Stability and harmonisation are key here: as with developing any new technology (nanopore sequencing) and robotics systems, there were challenges and a lot of development was needed. But after a long optimisation period, we think we have a good setup now and a clear path forward for scaling up.
What’s next for your research?
We’ve just finished sequencing our first brain cohort of around 250 samples, which is our ‘in-house’ cohort that mainly consists of control samples of European ancestry. We started with this cohort mostly because we had a lot of tissue readily available in the lab (which sped up the process of developing the protocols and making them scalable), and another nice bonus was that we had already generated lots of omics datasets for this cohort. Now that we have established scalable nanopore sequencing at CARD, we are looking forward to sequencing samples from diverse populations: we are currently sequencing samples of African American ancestry, and then we will begin sequencing brain samples from a Colombian brain bank.