Small RNA sequencing


Small RNA sequencing (small RNA-Seq) is a type of RNA sequencing based on next-generation sequencing (NGS) technologies. It makes it possible to isolate noncoding RNA molecules and obtain information about them, in order to evaluate and discover new forms of small RNA and to predict their possible functions. With this technique, small RNAs can be discriminated from the larger RNA family to better understand their functions in the cell and in gene expression. Small RNA-Seq can analyze thousands of small RNA molecules with high throughput and specificity. The greatest advantage of RNA-seq is the possibility of generating libraries of RNA fragments starting from the whole RNA content of a cell.

Introduction

Small RNAs are noncoding RNA molecules between 20 and 200 nucleotides in length, which belong to the larger class of ncRNAs. In particular, the term “small” is used for bacteria, while for eukaryotic cells the more general term “ncRNA” is used. These molecules are characterized by a high density of stop codons and very short open reading frames (ORFs); for this reason they do not code for proteins but act as regulators of gene expression in the cell.
Small RNAs include several classes of noncoding RNAs that differ in size and function: snRNA, snoRNA, scRNA, piRNA, miRNA and siRNA. Their functions range from RNA interference (RNAi), RNA processing and modification, and gene silencing to epigenetic modifications and protein stability and transport.
Small RNA “is unable to induce RNAi alone, and to accomplish the task it must form the core of the RNA–protein complex termed the RNA-induced silencing complex, specifically with Argonaute protein”.

Small RNA sequencing

Purification

This step is critical for any molecular-based technique, since it ensures that the small RNA fragments in the samples to be analyzed have a good level of purity and quality. Different purification methods can be used, depending on the purpose of the experiment.
Once small RNAs have been isolated, it is important to quantify them and to evaluate the quality of the purification. Two different methods can be used to do this.
Many NGS protocols rely on the production of a library containing thousands of fragments of the target nucleic acids, which are then sequenced with the appropriate technology. Depending on the sequencing method to be used, libraries can be created in different ways: generally, universal adapters A and B are ligated to the 5' and 3' ends of the RNA fragments by truncated T4 RNA ligase 2. After the adapters have been ligated to both ends of the small RNAs, reverse transcription produces complementary DNA (cDNA) molecules. These are eventually amplified, with a technique that depends on the sequencing protocol being followed, to obtain up to billions of amplicons to be sequenced. In addition to the regular PCR mix, masking oligonucleotides targeting 5.8S rRNA are added to increase sensitivity to small RNA targets and to improve the amplification results. Care is needed because RNA samples are prone to degradation, and further improvement of this technique should be oriented towards the elimination of adapter dimers.
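To illustrate why adapter dimers matter once the library has been sequenced, the following minimal Python sketch trims a 3' adapter from a read and flags reads in which the adapter appears with no insert in front of it. The adapter sequence, the function names and the thresholds are illustrative assumptions rather than part of any specific kit or protocol, and only exact adapter matches are considered.

```python
from typing import Optional

# Placeholder 3' adapter used only for illustration (assumption);
# real library preparation kits define their own adapter sequences.
ADAPTER_3P = "TGGAATTCTCGGGTGCCAAGG"


def trim_3p_adapter(read: str, adapter: str = ADAPTER_3P,
                    min_overlap: int = 8) -> Optional[str]:
    """Return the insert that precedes the adapter, or None if no adapter is found."""
    # Case 1: the complete adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends inside the adapter, so only a prefix of it is present.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    return None


def classify_read(read: str, min_insert: int = 15) -> str:
    """Label a read according to what remains after adapter trimming."""
    insert = trim_3p_adapter(read)
    if insert is None:
        return "no_adapter"
    if len(insert) == 0:
        return "adapter_dimer"  # the adapter starts at the first base of the read
    if len(insert) < min_insert:
        return "too_short"
    return "small_rna"
```

For example, `classify_read("TGAGGTAGTAGGTTGTATAGTT" + ADAPTER_3P)` labels the read as a small RNA, while a read consisting of the adapter alone is labelled as an adapter dimer. Production pipelines typically rely on dedicated trimming tools that also tolerate sequencing errors in the adapter.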

Sequencing

Depending on the purpose of the analysis, RNA-seq can be performed using different approaches.
The final step is data analysis and storage. After the sequencing reads are obtained, unique molecular identifier (UMI) and index sequences are automatically removed from the reads, and read quality is assessed using Phred scores. Reads can then be mapped or aligned to a reference genome to extract information about their similarity. Reads having the same length, sequence and UMI are considered identical, and the duplicates are removed from the hit list. Indeed, the number of different UMIs for a given small RNA sequence reflects its copy number.
The small RNAs are finally quantified by assigning molecules to transcript annotations from different databases.
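The quality filtering and UMI-based counting described above can be sketched in a few lines of Python. A Phred score Q corresponds to a base-calling error probability of 10^(-Q/10), so Q = 20 means a 1% chance of error. The sketch below assumes that each read has already been split into its insert sequence, its UMI and a Phred+33 encoded quality string. The function names, the tuple layout and the quality threshold are illustrative assumptions, not a description of any particular pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def count_molecules(reads: Iterable[Tuple[str, str, str]],
                    min_mean_q: float = 20.0) -> Dict[str, int]:
    """Count distinct UMIs per small RNA sequence.

    `reads` yields (sequence, umi, quality) tuples. Reads that share both
    sequence and UMI are treated as duplicates of one original molecule.
    """
    umis_per_sequence: Dict[str, Set[str]] = defaultdict(set)
    for sequence, umi, quality in reads:
        # Skip reads without quality values or with low mean quality.
        if not quality or mean_phred(quality) < min_mean_q:
            continue
        # A set stores each UMI once, which collapses duplicate reads.
        umis_per_sequence[sequence].add(umi)
    # The number of distinct UMIs estimates the copy number of each sequence.
    return {seq: len(umis) for seq, umis in umis_per_sequence.items()}
```

The resulting per-sequence counts are the values that would then be assigned to transcript annotations. This sketch ignores sequencing errors within the UMI itself, which real tools usually correct for.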

Applications

Small RNA sequencing can be useful for: