NCM 2021: The genetic and epigenetic landscape of the Arabidopsis centromeres

Only since ‘the advent of long-read sequencing’ has it been possible to study centromeres in detail; previously, these regions were too repetitive to resolve with short-read sequencing. The centromere is the site where the kinetochore complex assembles on the chromosome; the kinetochore binds to the spindle microtubules, allowing chromosomes to be pulled apart during cell division. The centromere has a conserved function but is one of the fastest-evolving and most diverse parts of the genome. Ian Henderson and his group investigated Arabidopsis thaliana centromeres and found a ‘striking suppression of recombination around the centromere, as well as enrichment of epigenetic marks, like DNA methylation’. The Arabidopsis genome was sequenced in 2000, but the centromeres remained unassembled because they are so highly repetitive. Long-read sequencing using nanopore technology revealed that the centromeres consist of approximately 2 Mb of repeats, and genome assembly continues to reveal new information about their architecture. A dotplot of the five Arabidopsis centromeres shows that each has its own private library of satellite repeats. Another interesting feature is the presence of higher-order satellite repeats, which could help in the study of recombination pathways. The assemblies have also revealed that the centromeres are invaded by a particular type of retrotransposon. Ian ends his talk by highlighting other features of nanopore sequencing that have been ‘really powerful into our studies of the centromeres’ and describes ‘one fun experiment’ that was performed, which was ‘to take ChIP data and put it straight through a nanopore … and this highlights the sort of novel approaches that are possible with nanopore technology’.

Authors: Ian Henderson