
AI Tool Could Predict Single Cell RNA Sequencing and Expression

A new AI model - scGPT - improves single cell RNA sequencing by predicting gene expression in individual cells, although it may be some time before it enters general use.

Integrating AI into single cell RNA sequencing (scRNA-seq) could improve the prediction of RNA transcripts from scRNA-seq data.

While scRNA-seq enables the visualisation of differential gene expression at a higher resolution, the approach can generate data with both variability and background noise.

Now, a new approach is being introduced that pioneers the integration of machine learning tools to overcome these technical challenges.

Bo Wang, a computational biologist at the University of Toronto, constructed a new artificial intelligence (AI) model with a team of computer scientists and cell biologists.

The AI model - a single-cell generative pretrained transformer (scGPT) - can be fine-tuned to run a diverse range of tasks and prompts using scRNA-seq data. 

Wang explained in The Scientist that the AI's core model could be built upon and tweaked into distinct versions that carry out a range of downstream tasks, such as predicting the expression levels of genes in a cell.

The approach involved constructing a single-cell foundation model through generative pre-training on over 10 million cells.

By employing a single base model to perform many downstream tasks, scGPT avoids the misalignment issues that can arise when multiple computational models are used to carry out different tasks.

Augmenting Single Cell RNA Sequencing with Machine Learning

In recent years, scRNA-seq technology has become the state-of-the-art approach for interpreting the heterogeneity and complexity of RNA transcripts within individual cells.

The approach reveals the composition of different cell types, placing the emphasis on measuring the transcriptomes of individual cells within a given population.

By measuring the RNA molecules within each cell of a given sample, scRNA-seq provides a snapshot of the transcriptome at the moment the cells were harvested.
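To make the idea of a per-cell snapshot concrete, the short Python sketch below treats scRNA-seq output as a table of raw transcript counts, one row per cell and one column per gene. It then applies a common preprocessing convention: scaling each cell to the same total and taking log(1 + x). The gene names, cell names, counts and the `normalize_cell` helper are all hypothetical, and the sketch is not taken from scGPT or from Wang's team; it only illustrates why cells sequenced at different depths need to be put on a comparable scale.

```python
import math

# Hypothetical toy data: each row is a cell, each column a gene, and each
# value a raw transcript count. All names and numbers are made up.
genes = ["GENE_A", "GENE_B", "GENE_C"]
counts = {
    "cell_1": [10, 0, 90],   # deeply sequenced cell
    "cell_2": [1, 0, 9],     # same proportions, ten times fewer reads
    "cell_3": [0, 50, 50],   # different expression profile
}

def normalize_cell(raw, target_sum=10_000):
    """Scale one cell's counts to a fixed total, then apply log(1 + x).

    target_sum is an assumed convention (counts per 10,000), not a value
    reported in the article.
    """
    total = sum(raw)
    if total == 0:
        # A cell with no detected transcripts stays all zeros.
        return [0.0 for _ in raw]
    return [math.log1p(c * target_sum / total) for c in raw]

normalized = {cell: normalize_cell(raw) for cell, raw in counts.items()}

for cell, values in normalized.items():
    print(cell, dict(zip(genes, (round(v, 2) for v in values))))
```

In this toy example, cell_1 and cell_2 end up with matching profiles after normalisation, even though one was sequenced ten times more deeply. That difference in depth is one source of the technical variability described earlier.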


Although the findings from the University of Toronto have huge potential, there are considerations that should be taken into account before the approach is widely introduced.

For one, the models contain millions of parameters and require a huge volume of data to train: this means they consume a lot of energy, leaving a significant carbon footprint. 

Researchers involved in using the technology will also require a degree of familiarity with machine learning, leaving some ambiguity as to how widely scGPT could be adopted among cell biologists. 

Since its development, scRNA-seq has found a large number of applications. 

Although the final product of a gene's expression is its protein, detecting its messenger RNA indicates the gene is turned on and, therefore, has the potential to be subsequently translated and expressed. 
