Interview: Improving structural variant interpretation for hereditary cancer susceptibility

Accurate molecular diagnosis of cancer-causing germline variants enables increased screening, early detection, prevention and, if cancer does arise, optimal treatment for predisposed patients and their family members. However, structural variants (SVs) can be difficult to characterise using short-read sequencing. In their recent Nature paper, Thibodeau, O’Neill, Dixon et al. used Oxford Nanopore sequencing to resolve multiple germline SVs whose impact on gene expression or function could not be fully determined through short-read sequencing.

We caught up with the three co-first authors of the paper, Katherine Dixon, My Linh Thibodeau and Kieran O’Neill, to discuss their collaboration, how they became interested in genomics, and the impact long-read sequencing is having on their research.

Dr Katherine Dixon, Dr Kieran O’Neill and Dr My Linh Thibodeau will be discussing their work further in a webinar called ‘Improving structural variant interpretation for hereditary cancer susceptibility’ with Nature Research on Thursday 8th October (8am PDT, 11am EDT, 4pm BST, 5pm CEST).

How did you find working together on this project?

Kieran: When we were doing this project, I was a Staff Scientist at the Genome Sciences Center doing bioinformatics stuff, Katie was co-supervised in the lab, and My Linh was doing her Master’s degree. It was a great collaboration.

My Linh: The three of us have such complementary expertise. I really loved working at the Genome Sciences Center and I will certainly remain part of the team despite changing institutions, as I am a co-investigator on a grant funded for the characterisation of structural variants through sequencing, which I expect Katie, and potentially Kieran, will be working on.

Can you tell us about your current research interests?

Kieran: I work as a Bioinformatics Process Development Coordinator - we try out all the new tools and technologies to figure out their strengths and weaknesses to decide which to run as a service. Because we do a lot of sequencing for research and the clinic, we work on the validation that you need to get accreditation.

My Linh: I recently finished my Medical Genetics and Genomics residency training and my Master’s degree in bioinformatics in Vancouver. I have now taken a position as a medical geneticist in the Division of Clinical and Metabolic Genetics at the Hospital for Sick Children (SickKids) in Toronto. I have numerous research interests, including hereditary cancer predisposition and germline structural variation. At SickKids, I will be working with clinicians, researchers, patients and families, and I am hoping to investigate the use of long-read sequencing in the pediatric setting.

Katherine: I’ve just completed my PhD and will now be transitioning into a postdoc position. I’ll continue to be involved with long-read sequencing, including assay development and how we can use long-read sequencing for molecular diagnosis in a hereditary cancer setting, looking in particular at targeted approaches.

What ignited your interest in genomics and cancer research?

My Linh: I find research fascinating - I love investigating mysteries and misunderstood processes. I also believe Medical Genetics is a medical specialty progressing incredibly fast. I wanted to be at the forefront of genomic technologies and computational tools helping to diagnose and assess patients with genetic conditions. Having a combined training allows me to link genotype and phenotype together and see the usefulness of these technologies in real time, to anticipate what additional help we can provide to our patients and their families.

Katherine: There are a few people in my life who unfortunately developed disorders we do not fully understand. Knowing that this research is about a person, we can look for ways to find answers for them. Even if it doesn’t lead to a treatment or cure for their disease, improving quality of life or answering some questions about why this is happening is important.

Kieran, you’re the bioinformatician in the team – how did you get into the analysis side of things?

Kieran: In my last year of high school I decided I wanted to do a degree in molecular biology and aimed to major in biochemistry. I took all of the first-year computer science modules and really enjoyed them – I then discovered that bioinformatics was a thing around halfway through the year, but this was in the early 2000s and there weren’t any bioinformatics degrees available at my university, so I built my own. Learning new things and having interesting problems to solve is great, as well as being based in a cancer research center solving problems that actually might improve people’s treatment and outcomes.

And how did you first come to use Oxford Nanopore sequencing in your work?

Kieran: Well, to begin with it was because we got told to by our boss! But it is very exciting and has very obvious benefits, as we have shown with SV calling, whilst getting methylation and phasing on top of that. We started with a few MinIONs around two years ago but found that we weren’t getting enough coverage for the projects we were doing, and so last August we got a PromethION 24. The cost is much lower due to the scale of sequencing we are doing, and it has really taken off - our director of ‘everything’ is very excited about it. We are aiming to do 120 flow cells a year, and in practice this could be more. We are a medium-sized sequencing center and have currently sequenced around two petabases.

How would you say long-read sequencing technology has benefitted or influenced your work?

My Linh: The main advantage of long-read sequencing is facilitating structural variant interpretation. Short-read NGS results in a probabilistic interpretation, and we make a deduction or inference of the most likely structural variant based on our analysis – but in many cases, uncertainty about genome structure remains. Long-read sequencing, by virtue of the longer reads and alignments, is much more likely to encompass two or more breakpoints and provide that certainty. From a research perspective, I am very interested in structural variation because I feel strongly that some patients we see in the clinic have Mendelian disorders but no molecular cause identified despite comprehensive genetic testing. There is a blind spot in our current clinical tests, and I think a lot of research will be necessary to investigate these patients with new technologies and, eventually, translate those technologies and discoveries into the clinic.

Katherine: My background is molecular and cellular biology, so I am a little biased in terms of the application of long-read sequencing, not only for genome sequencing but also for epigenetic variation and RNA modifications, as these are not typically looked at. When we look at the reference genome, one of the biggest limitations is that there are a lot of populations that are underrepresented, particularly people of colour. There is a huge potential application of nanopore sequencing to try to understand what ‘normal’ variation looks like in different people. This could have clinical implications, especially when looking for variants that can be potentially pathogenic or disease causing.

My Linh, as a clinician, what impact do you think being able to identify germline variants could have on your patients?

My Linh: In our study, we used long-read sequencing to explore selected SVs that were detected via short-read sequencing and that we were relatively confident were true positives. However, this approach overlooks SVs that short-read sequencing fails to detect, so I believe we are likely underestimating the genetic contribution of SVs to Mendelian disorders. Molecular diagnosis is necessary to apply the optimal management guidelines for a given condition, and also for counselling about occurrence risk in future pregnancies and in the rest of the family. Additional technologies will help in increasing diagnostic yield.

Sequencing technology has been changing and developing rapidly over the last few years - how do you see your field changing in the future?

Kieran: I think long-read sequencing will become a lot more routine, and we are going to see a lot more of it. It will be interesting to see how short-read balances out with long-read sequencing - Illumina really have had a monopoly on short-read sequencing, and now MGI is on the scene - we have an MGI sequencer. It will be interesting to see what we look like in five years’ time, but it seems likely that nanopore will be a big part of that.

Katherine: I agree with Kieran, nanopore sequencing has so many potential applications. It will be fascinating to see how it is used in a complementary way, and more efficiently, in both research and the clinic. It would be great if we could adapt some of these approaches for understanding how diseases are caused, as well as diagnosing patients.

What have been the main challenges in your research, and how have you approached them?

Kieran: One challenge that is more for the lab to solve has been getting better consistency from the instruments and across a range of libraries. If we have fresh DNA, we are fairly confident we can reliably get a good PromethION run. But if we have archival DNA, yield is very variable and N50 is all over the place. That is challenging going down the line, because if you don’t know whether you’re going to get 150 Gb or 30 Gb out of a run, then you need to figure out what the analysis is going to be and how to mitigate that. But there is a great community of bioinformatics tools available for nanopore sequencing - there are also opportunities for new tool development, and we have trainees and research staff working on that, so by and large that is all readily available.

Katherine: A challenge arises when we look at the genome sequence and try to interpret the impact of structural variants, especially in the non-coding regions of the genome. From a clinical molecular perspective, that is an ongoing challenge, and one reason why investigating the functional impacts of genetic variants, such as effects on transcription or the epigenome, is so important.

My Linh: Indeed, reiterating what Katie mentioned about the difficulty of interpreting SVs, another related caveat is that, most of the time, we do complementary testing, such as long-read sequencing, retrospectively. The challenge will be to offer this testing as a prospective assay for families we believe have a high likelihood of a Mendelian diagnosis.

And finally, what is your advice for someone just starting out in research?

Kieran: A really important skill to learn is when to say no. There are so many exciting things to work on but there are only so many hours in the day.

My Linh: First, know when to let go of a project and step away - accept it, grieve, and then move on. The second is around communication - I have seen people working two metres from each other, working on similar things, having similar problems, but not communicating. Don’t be afraid to ask questions and ask for help - collaboration and communication will improve your work and lots of other people’s work.

Katherine: Stay open-minded and think outside of the box, especially when things are going wrong! Be open to learning from different people, as understanding the experience of others will make you more well-rounded.

Thanks to My Linh, Kieran and Katherine for chatting to us. Make sure you catch their webinar on Thursday!