Revealing hidden biology with isoform-level single-cell transcriptomics
Overcoming the challenges of single-cell sequencing
To enable specialised functions, cells within an organism utilise the genome in different ways. Sequencing the transcriptome of each individual cell is an important step in revealing how this is achieved and is ‘indispensable for understanding the underlying mechanisms of splicing and gene regulation’1.
Short-read sequencing has previously been used to characterise specific transcriptomic differences in single cells; however, the technology is limited in its ability to quantify RNA transcript isoforms because the transcripts must be fragmented, with just the 5’ or 3’ end being sequenced. In contrast, long nanopore sequencing reads can span complete transcripts, revealing full isoform diversity and enabling comprehensive isoform-level expression analysis in single cells.
Wang et al. from Baylor College of Medicine, USA, used nanopore sequencing to compile the first comprehensive characterisation of full-length transcript isoforms in individual mouse retinal cells1. The mouse retina is composed of over 130 unique cell types, each with its own distinctive transcriptomic profile, produced through alternative splicing of pre-mRNA. A comprehensive understanding of the RNA isoforms, splicing events, and differential expression patterns at the single-cell level could be crucial in predicting the effect of genetic variants in retinal disorders.
‘One of the key advantages of long read [sequencing] is the improved ability of detecting transcript isoforms’1
Generating novel insights through full-length reads
Approximately 30,000 mouse retinal cells were profiled, drawn from two wild-type retina samples and two samples enriched in amacrine cells (AC) and bipolar cells (BC). The team reported high concordance (>98%) between short-read and nanopore datasets when comparing cell class assignments, although nanopore data ‘identified an additional [bipolar cell] that was missed in the short-read data’.
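As a rough illustration of what a concordance figure for cell class assignments measures, the sketch below computes the fraction of shared cells that receive the same label in two datasets. The function name, cell barcodes, and labels are hypothetical toy values, not data or methods from the study, and the authors' actual comparison may differ.

```python
def label_concordance(short_read_labels, long_read_labels):
    """Fraction of shared cell barcodes assigned the same class in both datasets.

    Both arguments map a cell barcode to a cell class label.
    This is an illustrative definition, assumed for this sketch.
    """
    shared = short_read_labels.keys() & long_read_labels.keys()
    if not shared:
        raise ValueError("no shared cell barcodes to compare")
    matches = sum(short_read_labels[b] == long_read_labels[b] for b in shared)
    return matches / len(shared)


# Hypothetical toy assignments (AC = amacrine cell, BC = bipolar cell).
short_read = {"cell1": "BC", "cell2": "AC", "cell3": "BC", "cell4": "Rod"}
nanopore = {"cell1": "BC", "cell2": "AC", "cell3": "AC", "cell4": "Rod"}

# Three of the four shared cells agree on their class label.
concordance = label_concordance(short_read, nanopore)
print(f"Concordance: {concordance:.0%}")
```

Only barcodes present in both datasets are compared, so cells detected by one technology alone do not affect the figure.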
Using a PromethION device, 1.4 billion long nanopore reads were analysed, alongside 1.54 billion short reads. When sequenced at similar depths, the short-read and nanopore datasets exhibited ‘comparable sensitivity and high concordance in cell identification, clustering, and annotation’. The team further observed that nanopore sequencing ‘excelled in the precise identification of transcript isoforms’, with the median read length of approximately 1,000 nucleotides corresponding to the average size of full-length transcripts. The team identified 44,325 transcript isoforms, approximately 40% of which were novel and tended to be expressed at lower levels. The authors suggested that this lower expression may explain why these isoforms went undetected in previous studies, and emphasised that single-cell nanopore sequencing ‘greatly increased the number of isoforms detected’.
Many genes were reported to display ‘varying patterns of isoform usage among different cell classes and subclasses’. Of the 44,325 transcript isoforms, 7,383 were pinpointed as cell-class specific, many of which were novel. Using long nanopore reads, the team discovered a ‘common pattern’ where the major retinal cell classes expressed a ‘combination of diverse isoforms rather than a single canonical isoform’, while ‘intricate splicing variations’ between the two most abundant isoforms of a gene were frequently observed. However, like Aguzzoli Heberle et al., who also used long nanopore sequencing reads to identify diverse isoforms2, Wang and colleagues found that cell-class-specific genes cannot be identified from gene expression levels alone, because the retinal cells expressed different isoforms even when their overall gene expression levels were not significantly different.
‘While transcript isoforms are often shared across various cell types, their relative abundance shows considerable cell-type-specific variation’1
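To make the distinction between gene-level expression and isoform usage concrete, the sketch below computes each isoform's share of a gene's reads within a cell class. The gene name, isoform names, and read counts are hypothetical toy values chosen for illustration, not results from the study, and the function is an assumed minimal definition, not the authors' pipeline.

```python
from collections import defaultdict


def isoform_usage(counts):
    """Compute per-class isoform usage fractions and gene-level totals.

    counts maps (cell_class, gene, isoform) to a read count.
    Returns (usage, totals), where usage maps (cell_class, gene) to
    {isoform: fraction of that gene's reads} and totals maps
    (cell_class, gene) to the summed read count.
    """
    totals = defaultdict(int)
    for (cell_class, gene, _isoform), n in counts.items():
        totals[(cell_class, gene)] += n

    usage = defaultdict(dict)
    for (cell_class, gene, isoform), n in counts.items():
        total = totals[(cell_class, gene)]
        if total > 0:  # skip genes with no reads in this class
            usage[(cell_class, gene)][isoform] = n / total
    return dict(usage), dict(totals)


# Hypothetical counts: the same gene-level total in both classes,
# but opposite preferences for the two isoforms.
toy_counts = {
    ("AC", "GeneX", "iso1"): 80,
    ("AC", "GeneX", "iso2"): 20,
    ("BC", "GeneX", "iso1"): 25,
    ("BC", "GeneX", "iso2"): 75,
}

usage, totals = isoform_usage(toy_counts)
for key in sorted(usage):
    print(key, "total reads:", totals[key], "usage:", usage[key])
```

In this toy case a gene-level comparison would see no difference between the two classes, while the usage fractions show that each class favours a different isoform.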
Uncovering hidden variation missed by legacy technologies
Highlighting how the limitations of short-read sequencing have caused important biology to be missed, the team explained that with short-read technology it is ‘necessary to rely on alterations in specific exons or splice junctions’, leading to ‘suboptimal isoform reconstruction’1. Single-cell nanopore sequencing comprehensively resolved the full-length transcriptome of the mouse retina, and in doing so uncovered some surprisingly complex gene fusion events. Within the mouse retinal cells, 1,055 intrachromosomal gene fusion transcripts were detected. Interestingly, although all fusion partners were on the same chromosome, they were not necessarily immediately adjacent to each other. Furthermore, some fusions exhibited alternative splicing. The researchers suggested that these findings highlight the complexity and flexibility of gene fusion events in the context of single-cell RNA sequencing data, and may present novel, clinically relevant insights.
‘the long-read sequencing approach demonstrated its reliability in detecting single-cell transcriptomes … making it plausible to use ... exclusively for single cell RNA-seq in the future’1
The authors concluded that ‘The integration of long-read sequencing with single-cell sequencing techniques holds the promise of filling the existing gaps in isoform information’. They anticipate that their ‘comprehensive atlas of full-length transcript isoforms’ mapped to individual mouse retinal cells will provide an ‘invaluable resource for the community’1 and could help towards elucidating mechanisms of retinal disease.
This case study was taken from the RNA sequencing white paper.
1. Wang, M. et al. Integrating short-read and long-read single-cell RNA sequencing for comprehensive transcriptome profiling in mouse retina. bioRxiv 581234 (2024). DOI: https://doi.org/10.1101/2024.02.20.581234
2. Aguzzoli Heberle, B. et al. Mapping medically relevant RNA isoform diversity in the aged human frontal cortex with deep long-read RNA-seq. Nat. Biotechnol. (2024). DOI: https://doi.org/10.1038/s41587-024-02245-9