Paired-end tag

Paired-end tags (PETs) are the short sequences at the 5' and 3' ends of a DNA fragment that are unique enough to occur together only once in a genome, so the sequence of the DNA between them can be recovered by a search or by further sequencing. In PET libraries the intervening DNA is absent: a PET "represents" a larger fragment of genomic DNA or cDNA and consists of a short 5' linker sequence, a short 5' sequence tag, a short 3' sequence tag, and a short 3' linker sequence. It was shown conceptually that 13 base pairs are sufficient to map tags uniquely. However, longer sequences are more practical for mapping reads uniquely. The endonucleases used to produce PETs give longer tags, but sequences of 50–100 base pairs would be optimal for both mapping and cost efficiency. After the PETs are extracted from many DNA fragments, they are linked together for efficient sequencing. On average, 20–30 tags could be sequenced with the Sanger method, which has a longer read length. Since the tag sequences are short, individual PETs are well suited for next-generation sequencing, which has short read lengths and higher throughput. The main advantages of PET sequencing are reduced cost from sequencing only short fragments, detection of structural variants in the genome, and increased specificity when aligning back to the genome compared with single tags, which involve only one end of the DNA fragment.
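The four-part layout of a PET described above can be illustrated with a minimal Python sketch. The function name `make_pet`, the default tag length of 20, and both linker sequences are hypothetical placeholders chosen for illustration; they are not taken from any published protocol.

```python
def make_pet(fragment: str, tag_len: int = 20,
             linker5: str = "ACGTAC", linker3: str = "GTACGT") -> str:
    """Build an in-silico PET: 5' linker + 5' tag + 3' tag + 3' linker.

    The linker sequences and tag length are illustrative assumptions.
    """
    if len(fragment) < 2 * tag_len:
        raise ValueError("fragment is shorter than two tags")
    tag5 = fragment[:tag_len]    # first tag_len bases of the fragment
    tag3 = fragment[-tag_len:]   # last tag_len bases of the fragment
    # The DNA between the two tags is dropped, as in a PET library.
    return linker5 + tag5 + tag3 + linker3


if __name__ == "__main__":
    fragment = "A" * 20 + "C" * 100 + "G" * 20
    print(make_pet(fragment))
```

The 140-base fragment in the usage example collapses to a 52-base PET, which shows why sequencing PETs costs less than sequencing whole fragments.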

Constructing the PET library

PET libraries are typically prepared by one of two general methods: cloning-based and cloning-free.

Cloning based

Fragmented genomic DNA or complementary DNA of interest is cloned into plasmid vectors. The cloning sites are flanked by adaptor sequences that contain restriction sites for endonucleases. Inserts are ligated to the plasmid vectors, and individual vectors are then transformed into E. coli, making the PET library. PET sequences are obtained by purifying the plasmids and digesting them with a specific endonuclease, which leaves two short sequences on the ends of the vectors. Under intramolecular conditions, the vectors are re-circularized and ligated, leaving only the ditags in the vector. The sequences unique to the clone are now paired together. Depending on the next-generation sequencing technique, PET sequences can be left singular, dimerized, or concatenated into long chains.

Cloning-free based

Instead of cloning, adaptors containing the endonuclease recognition sequence are ligated to the ends of fragmented genomic DNA or cDNA. The molecules are then self-circularized and digested with the endonuclease, releasing the PET. Before sequencing, these PETs are ligated to adaptors to which PCR primers anneal for amplification.
The advantage of cloning-based library construction is that it keeps the fragments or cDNA intact for future use. However, the construction process takes much longer than the cloning-free method. Next-generation sequencing companies have produced variations on library construction to suit their respective technologies.

Endonucleases

Unlike other endonucleases, the MmeI and EcoP15I restriction endonucleases cut downstream of their target binding sites. MmeI cuts 18/20 base pairs downstream and EcoP15I cuts 25/27 base pairs downstream. As these restriction enzymes bind at their target sequences located in the adaptors, they cut and release vectors that contain short sequences of the fragment or cDNA ligated to them, producing PETs.
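How a downstream-cutting enzyme yields a fixed-length tag can be sketched in Python. This is a simplified single-strand model, not a full digestion simulator. The recognition sequences in the usage example (TCCRAC for MmeI, CAGCAG for EcoP15I) are assumptions not stated in the text above. The text also does not say which of the two cut distances applies to which strand, so the `top_cut` distance is left to the caller. The function names are hypothetical.

```python
import re

# IUPAC nucleotide codes used to turn a recognition site into a regex.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def site_regex(site: str) -> "re.Pattern[str]":
    """Compile a recognition site written with IUPAC codes."""
    return re.compile("".join(IUPAC[base] for base in site))


def downstream_tag(seq: str, site: str, top_cut: int):
    """Return the bases between the first recognition site and the cut.

    Only the given strand is scanned; the staggered cut on the opposite
    strand and sites on the reverse strand are ignored in this sketch.
    Returns None if there is no site or the cut would fall past the end.
    """
    match = site_regex(site).search(seq)
    if match is None:
        return None
    start = match.end()
    if start + top_cut > len(seq):
        return None
    return seq[start:start + top_cut]


if __name__ == "__main__":
    # Adaptor ending in an assumed MmeI-like site, followed by insert DNA.
    seq = "GG" + "TCCGAC" + "ACGTACGTACGTACGTAC" + "TTTT"
    print(downstream_tag(seq, "TCCRAC", 18))
```

Because the recognition site sits in the adaptor and the cut falls a fixed distance into the insert, every released tag has the same length regardless of the insert sequence.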

PET applications

DNA-PET: Because PETs represent connectivity between the tags, the use of PETs in genome re-sequencing has advantages over the use of single reads. This application is called pairwise end sequencing, known colloquially as double-barrel shotgun sequencing. Anchoring one half of the pair uniquely to a single location in the genome allows mapping of the other half when it is ambiguous. Ambiguous reads are those that map to more than a single location. This increased efficiency reduces the cost of sequencing, as these ambiguous sequences, or reads, would normally be discarded. The connectivity of PET sequences also allows detection of structural variations: insertions, deletions, duplications, inversions, and translocations. During the construction of the PET library, the fragments can be selected to all be of a certain size. After mapping, the two tags of a PET are thus expected to lie a consistent distance from each other. A discrepancy from this distance indicates a structural variation between the PET sequences. For example, a deletion in the sequenced genome will produce reads that map further apart than expected in the reference genome, because the reference genome contains a segment of DNA that is not present in the sequenced genome.
ChIP-PET: The combined use of chromatin immunoprecipitation and PET is used to detect regions of DNA bound by a protein of interest. ChIP-PET has the advantage over single-read sequencing of reducing the ambiguity of the reads generated. Its advantage over chip hybridization is that hybridization tiling arrays do not have the statistical sensitivity that sequence reads have. However, ChIP-PET, ChIP-Seq and ChIP-chip have all been highly successful.
ChIA-PET: The application of PET sequencing to chromatin interaction analysis. It is a genome-wide strategy for finding de novo long-range interactions between DNA elements bound by protein factors. The first ChIA-PET was developed by Fullwood et al. to generate a map of the interactions between chromatin bound by oestrogen receptor α in oestrogen-treated human breast adenocarcinoma cells. ChIA-PET is an unbiased way to analyze interactions and higher-order chromatin structures because it can detect interactions between unknown DNA elements. In contrast, 3C and 4C methods are used to detect interactions involving a specific target region in the genome. ChIA-PET is similar to finding fusion genes through RNA-PET in that the paired tags map to different regions in the genome. However, ChIA-PET involves artificial ligations between different DNA fragments located at different genomic regions, rather than naturally occurring fusion between two genomic regions as in RNA-PET.
RNA-PET: This application is used for studying the transcriptome: transcripts, gene structures, and gene expression. The PET library is generated using full-length cDNAs, so the ditags represent the 5' cap and the 3' polyA tail signatures of individual transcripts. Therefore, RNA-PET is especially useful for demarcating the boundaries of transcription units. This helps identify alternative transcription start sites and polyadenylation sites of genes. RNA-PET could also be used to detect fusion genes and trans-splicing, but further experiments are needed to distinguish between them. Other methods of finding the boundaries of transcripts include the single-tag strategies CAGE, SAGE, and the most recent SuperSAGE, with CAGE and 5' SAGE defining the transcription start sites and 3' SAGE defining the polyadenylation sites. The advantages of PET sequencing over these methods are that PETs identify both ends of the transcripts and, at the same time, provide more specificity when mapping back to the genome. Sequencing the cDNAs can reveal the structures of transcripts in great detail, but this approach is much more expensive than RNA-PET sequencing, especially for characterizing the whole transcriptome. The major limitation of RNA-PET is the lack of information regarding the organization of the internal exons of transcripts. Therefore, RNA-PET is not suitable for detecting alternative splicing. In addition, if the cloning procedure is used to construct the cDNA library before generating the PETs, cDNAs that are difficult to clone would have lower coverage. Similarly, transcripts with low expression levels would likely be under-represented as well.
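
The distance-discrepancy reasoning in the DNA-PET entry above can be sketched as a small Python classifier. The `MappedPET` record, the `classify` function, the tolerance parameter, and the rule that a concordant pair maps to one chromosome and one strand are all illustrative assumptions. Real pipelines use aligner-specific orientation rules and statistical thresholds, and they require several supporting PETs before calling a variant.

```python
from dataclasses import dataclass


@dataclass
class MappedPET:
    """Mapped positions of the two tags of one PET (hypothetical record)."""
    chrom5: str
    pos5: int
    strand5: str
    chrom3: str
    pos3: int
    strand3: str


def classify(pet: MappedPET, expected: int, tolerance: int) -> str:
    """Label a PET by comparing its mapped span with the library insert size."""
    if pet.chrom5 != pet.chrom3:
        return "possible translocation"
    if pet.strand5 != pet.strand3:
        # Assumes concordant tags are reported on the same strand.
        return "possible inversion"
    span = abs(pet.pos3 - pet.pos5)
    if span > expected + tolerance:
        # Tags lie further apart in the reference than in the sample,
        # so the sample lacks DNA that the reference has.
        return "possible deletion"
    if span < expected - tolerance:
        # Tags lie closer in the reference, so the sample carries extra DNA.
        return "possible insertion"
    return "concordant"


if __name__ == "__main__":
    pet = MappedPET("chr1", 1000, "+", "chr1", 4000, "+")
    print(classify(pet, expected=1000, tolerance=200))
```

With an expected insert size of 1000 and a tolerance of 200, the example pair spans 3000 bases in the reference and is therefore labeled a possible deletion, matching the example given in the DNA-PET entry.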
