Interview: Investigating genetic and epigenetic landscapes with long-read sequencing

Dr. Matthew Naish is a Postdoctoral Research Associate in Ian Henderson’s lab at Cambridge, where his research focuses on the assembly of Arabidopsis thaliana and the full resolution of all five centromeres. We caught up with Matthew to discuss his current research interests, how he came to work with Arabidopsis, and how long-read sequencing is allowing the genetic and epigenetic landscape of Arabidopsis thaliana to be investigated for the first time.

In addition to Matthew’s interview, you can also find out more about his research by watching a webinar he recently presented on ‘Investigating genetic and epigenetic landscapes with long-read sequencing’.


What are your current research interests?
My research interests are in understanding the structure and function of genomes and how these are shaped by the chromatin landscape found within a cell. Currently, I am working to assemble the missing centromere sequences and use these to investigate different aspects of centromeric function, its epigenetic determination and its wider evolution, in relation to the process of meiotic recombination that shapes genetic variation in plant and animal genomes.

What first ignited your interest in plant genomics?
My Ph.D. was centred on cellular regeneration in plants as I found it fascinating that differentiated cells retained the ability to re-differentiate themselves to new cell types. This process involves the activation of suppressed gene networks and sweeping changes to the chromatin landscape. From my work in this system, I realised how genomics is foundational to understanding how these gene networks interact and fit together to address biological questions.

How is nanopore sequencing changing the field of plant biology? How has it benefitted your work?
In plants, many species have highly repetitive genomes or complex structures composed of repetitive sequence blocks that cannot be resolved by short reads alone. Nanopore sequencing is allowing the more routine generation of accurate, chromosome-level assemblies from numerous species, so the investigation of these species can benefit from advances in genomics that had previously been restricted to a few model species. The read lengths possible with nanopore sequencing allow me to bridge these repetitive regions in the Arabidopsis centromeres, and the ability to profile DNA modifications directly from sequencing has enabled me to profile the epigenetic landscape in these repetitive regions.

How will resolving all five centromeres of the Arabidopsis thaliana genome influence the study of this organism? What impact could it have on our understanding of centromere evolution?
Every eukaryotic chromosome has a centromere, allowing the faithful segregation of the genetic material from one generation to the next. However, the DNA and protein components of centromere formation are not well conserved and have been reported to evolve rapidly, a contradiction known as the centromere paradox. In Arabidopsis, we have known the DNA repeat monomer associated with centromere formation for almost 40 years and have had a genome assembly since 2000, but the centromeric organization has remained a ‘black box’, even though it accounts for ~8-12% of the chromosome sequence. I hope that by resolving these regions and exposing the diversity within these repeat sequences, we can understand the relationship between the genetic and epigenetic factors leading to successful centromere formation. I hope that we can then use this base and the wide variety of genomic tools available in the Arabidopsis model system to study how centromeric regions evolve and to resolve the centromere paradox.

What have been the main challenges in your work and how have you approached them?
The main challenge with this technology is the requirement to develop new methodological approaches to purify high-molecular-weight DNA suitable for generating the very long nucleic acid molecules needed to span these repetitive regions without compromising the throughput of the flow cell. I have mainly focused on developing these approaches to deliver larger molecules to the surface of the flow cell, so this information can be captured.

What’s next for your research?
In the future, my aim is to assemble the centromeric regions of other Arabidopsis accessions to help understand the wider diversity in centromere structure. I can then use this diversity to investigate signatures of meiotic recombination to model how these regions may evolve. In addition, I will use the advancing methods for direct detection of DNA modifications with nanopore sequencing, together with genomic and technical approaches, to profile the specialised chromatin states within these centromeric arrays.