Full-length transcript characterisation of SF3B1 mutation in chronic lymphocytic leukemia - Alison Tang

Chronic lymphocytic leukaemia (CLL) is the most common form of blood cancer in adults, with an estimated incidence of up to 5.5 per 100,000 people worldwide. Closing the Transcriptome breakout session, Alison Tang, from the University of California, Santa Cruz, described how the splicing factor gene SF3B1 is frequently mutated in CLL, and how these mutations are associated with poor patient prognosis.

Traditional short-read sequencing technologies have allowed the identification of aberrant splicing across the transcriptome associated with SF3B1 mutations. However, short reads rarely span a whole transcript, which restricts their application to the study of individual splice junctions rather than isoform-level analysis and limits our understanding of the functional consequences of these aberrant splicing changes.

To overcome these challenges, Alison applied long-read nanopore sequencing to enable unambiguous identification and analysis of full-length transcripts. Using the high-yield, high-throughput PromethION platform, the team at the University of California, Santa Cruz, sequenced the full transcriptome of CLL samples with and without the SF3B1 mutation (3 normal B-cell samples; 3 CLL SF3B1 wild-type (WT); 3 CLL SF3B1 K700E). In total, 149 million cDNA reads passed QC metrics. Alison described how the team used FLAIR (Full-Length Alternative Isoform analysis of RNA), a novel computational workflow, to identify high-confidence full-length transcripts.

Briefly, the raw sequencing reads are aligned to a human reference genome prior to correction using annotated splice junctions. The corrected reads are then grouped by their splice junction chain before each group is collapsed to form a consensus sequence for each individual transcript. Finally, the raw sequencing reads are reassigned to the collapsed isoforms. Isoforms that pass a given coverage threshold are retained and used as a high-confidence transcript reference.
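
The grouping and filtering steps described above can be illustrated with a minimal Python sketch. This is not FLAIR's own code: the function names, the input layout (each read as a list of aligned exon blocks) and the min_support threshold are all assumptions made for illustration, and taking the outermost start and end of each group is only a stand-in for building a true consensus sequence.

```python
from collections import defaultdict


def junction_chain(exons):
    """Return the ordered introns (exon_end, next_exon_start) implied by exon blocks."""
    exons = sorted(exons)
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))


def collapse_reads(reads, min_support=3):
    """Group reads by (chrom, strand, junction chain) and keep well-supported groups.

    reads: iterable of (read_id, chrom, strand, exons), where exons is a list
    of (start, end) blocks. min_support is a hypothetical coverage threshold.
    """
    groups = defaultdict(list)
    for read_id, chrom, strand, exons in reads:
        chain = junction_chain(exons)
        if not chain:
            # Single-exon reads have no junction chain; this sketch skips them.
            continue
        groups[(chrom, strand, chain)].append((read_id, sorted(exons)))

    isoforms = {}
    for key, members in groups.items():
        if len(members) < min_support:
            continue
        # Stand-in for consensus building: outermost start and end in the group.
        isoforms[key] = {
            "start": min(exons[0][0] for _, exons in members),
            "end": max(exons[-1][1] for _, exons in members),
            "support": len(members),
            "reads": [read_id for read_id, _ in members],
        }
    return isoforms
```

Reads that share every splice junction but differ slightly at their ends fall into the same group, which is why correcting junctions against the annotation before grouping matters: an uncorrected junction that is off by a few bases would otherwise create a spurious group.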

The team developed a new splicing caller, FLAIR-diffSplice, that calls four main types of alternative splicing event: alternative 3’ and 5’ splice sites, intron retention, and exon skipping. This method identified the same alternative 3’ splice site choice associated with SF3B1 mutation as a separate short-read sequencing-based analysis, confirming the validity of the approach. Focusing on intron retention, Alison commented that, even though the nanopore sequencing reads in their study were relatively short (approximately 1 kb), they make intron retention ‘much more obvious’.
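
To show why full-length reads make intron retention easy to spot, here is a small Python sketch of one way to flag it: an intron counts as retained when a single exon block of another isoform covers it completely. This is an illustrative check only, not how FLAIR-diffSplice is implemented; the function names and the (start, end) coordinate layout are assumptions.

```python
def introns_of(exons):
    """Return the introns (exon_end, next_exon_start) between sorted exon blocks."""
    exons = sorted(exons)
    return [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]


def retained_introns(isoform_exons, reference_introns):
    """Return the reference introns that lie entirely inside one exon of the isoform."""
    return [
        (start, end)
        for start, end in reference_introns
        if any(ex_start <= start and end <= ex_end
               for ex_start, ex_end in isoform_exons)
    ]


# Hypothetical isoforms on the same strand of one chromosome.
spliced = [(100, 200), (300, 400), (500, 600)]
retaining = [(100, 400), (500, 600)]
print(retained_introns(retaining, introns_of(spliced)))
```

With short reads the same call depends on coverage spread across the intron body, whereas a single long read spanning both flanking exons shows the retained intron directly.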

Because nanopore technology can sequence full-length transcripts, Alison commented that ‘isoform productivity can be more confidently assessed’. The team defined an unproductive isoform as one with a premature termination codon that is 55 nucleotides or more upstream of the 3’-most splice junction. Demonstrating the validity of nanopore sequencing-based productivity assessment, Alison shared data showing complete concordance with previous studies of the well-characterised isoforms of SRSF1.
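
The 55-nucleotide rule above can be expressed as a short Python sketch. It is written under several assumptions that do not come from the talk: the transcript is given as a spliced sequence with its exon lengths in 5’ to 3’ order, the start of the coding sequence is already known, the distance is measured from the last base of the stop codon to the final splice junction, and isoforms with no in-frame stop codon or no splice junction are simply not flagged. All function and parameter names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_end(seq, cds_start):
    """Return the transcript position just past the first in-frame stop codon, or None."""
    for i in range(cds_start, len(seq) - 2, 3):
        if seq[i:i + 3].upper() in STOP_CODONS:
            return i + 3
    return None


def is_unproductive(seq, exon_lengths, cds_start, min_distance=55):
    """Flag an isoform whose stop codon ends min_distance or more nucleotides
    upstream of the 3'-most splice junction (transcript coordinates)."""
    if len(exon_lengths) < 2:
        return False  # no splice junction, so the rule cannot apply
    stop_end = first_stop_end(seq, cds_start)
    if stop_end is None:
        return False  # assumption: isoforms without an in-frame stop are not flagged
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_end >= min_distance


# Toy transcript: start codon, five alanine codons, a stop codon, then 100 A's.
toy = "ATG" + "GCT" * 5 + "TAA" + "A" * 100
print(is_unproductive(toy, [80, 41], cds_start=0))  # junction 59 nt past the stop
print(is_unproductive(toy, [70, 51], cds_start=0))  # junction 49 nt past the stop
```

Applying this rule requires knowing the complete exon structure downstream of the stop codon, which is the information a full-length read provides and a set of isolated short-read junctions does not.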

Examining the productivity of transcripts from the mutant SF3B1 cell line, the team found that the expression of unproductive isoforms was decreased in comparison to the other cell types tested. Taking this a step further, they showed that the genes whose unproductive intron-retaining isoforms are down-regulated are associated with kinase signalling pathways, which, Alison suggested, may support tumour proliferation.

Summarising her work, Alison commented that nanopore sequencing, combined with the FLAIR analysis workflow, enabled the study of ‘differential isoform usage, coordinated splicing events, and isoform productivity prediction’.