ChIA-PET
Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) is a technique that incorporates chromatin immunoprecipitation (ChIP)-based enrichment, chromatin proximity ligation, paired-end tags (PETs), and high-throughput sequencing to determine de novo long-range chromatin interactions genome-wide.
Genes can be regulated by regions far from the promoter such as regulatory elements, insulators and boundary elements, and transcription-factor binding sites. Uncovering the interplay between regulatory regions and gene coding regions is essential for understanding the mechanisms governing gene regulation in health and disease. ChIA-PET can be used to identify unique, functional chromatin interactions between distal and proximal regulatory transcription-factor binding sites and the promoters of the genes they interact with.
ChIA-PET can also be used to unravel the mechanisms of genome control during processes such as cell differentiation, proliferation, and development. By creating ChIA-PET interactome maps for DNA-binding regulatory proteins and promoter regions, we can better identify unique targets for therapeutic intervention.
Methodology
The ChIA-PET method combines ChIP-based methods and chromosome conformation capture (3C) to extend the capabilities of both approaches. ChIP-sequencing (ChIP-Seq) is a popular method used to identify transcription-factor binding sites (TFBS), while 3C has been used to identify long-range chromatin interactions. However, both suffer from limitations when used independently to identify de novo long-range interactions genome-wide. While ChIP-Seq is typically used for genome-wide identification of TFBS, it provides only linear information on protein binding sites along the chromosomes and can suffer from high genomic background noise.
While 3C is capable of analyzing long-range chromatin interactions, it cannot be used genome-wide and, like ChIP-Seq, also suffers from high levels of background noise. Since the noise increases in relation to the distance between interacting regions, laborious and tedious controls are required for accurate characterization of chromatin interactions.
The ChIA-PET method successfully resolves the issues of non-specific interaction noise found in ChIP-Seq by sonicating the ChIP fragments in order to separate random attachments from specific interaction complexes. The next step, which is referred to as enrichment, reduces complexity for genome-wide analysis and adds specificity to chromatin interactions bound by pre-determined TFs.
The ability of 3C approaches to identify long-range interactions is based on the theory of proximity ligation. With regard to DNA inter-ligation, fragments tethered by a common protein complex have a greater kinetic advantage under dilute conditions than those freely diffusing in solution or anchored in different complexes. ChIA-PET takes advantage of this concept by incorporating linker sequences onto the free ends of the DNA fragments tethered to the protein complexes. To build connectivity between the fragments tethered by regulatory complexes, the linker sequences are ligated during nuclear proximity ligation.
The products of linker-connected ligation can then be analyzed by ultra-high-throughput PET sequencing and mapped to the reference genome. Since ChIA-PET does not depend on specific sites for detection, as 3C and 4C do, it allows unbiased, genome-wide, de novo detection of chromatin interactions.
Workflow
Wet-lab portion of the workflow:
- Formaldehyde is used to cross-link the DNA-protein complexes. Sonication is used to break up the chromatin and also to reduce non-specific interactions.
- A specific antibody of choice is used to enrich chromatin fragments bound by the protein of interest. The ChIP material bound by the antibody is used to construct the ChIA-PET library.
- Figure 1. Biotinylated oligonucleotide half-linkers containing flanking MmeI sites are used to connect proximity ligated DNA fragments. Two different linkers are designed with specific nucleotide barcodes for each of the two linker sequences.
- Figure 2. The linkers are ligated to the tethered DNA fragments.
- Figure 3. The linker fragments are ligated on the ChIP beads under dilute conditions. The purified DNA is then digested by MmeI, which cuts at a distance from its recognition site to release the tag-linker-tag structure.
- Figure 4. The biotinylated PETs are then immobilized on streptavidin-conjugated magnetic beads.
- Figure 5. PET sequences with AA and BB linker barcode composition are considered to be possible intra-complex ligation products, while PET sequences with AB linker composition are considered to be derived from chimeric ligation products between DNA fragments bound in different chromatin complexes.
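The barcode rule in the last step can be sketched in a few lines of Python. This is a minimal illustration, not the published pipeline: the function names, the single-letter barcode labels "A" and "B", and the idea of reporting a chimeric fraction as a rough noise indicator are all assumptions made for the example.

```python
from collections import Counter

def classify_linker_pair(barcode_1: str, barcode_2: str) -> str:
    """Label a PET by the barcodes of its two half-linkers (hypothetical helper)."""
    # AA or BB: both half-linkers came from the same ligation aliquot,
    # so the PET is a possible intra-complex ligation product.
    if barcode_1 == barcode_2:
        return "possible_intra_complex"
    # AB or BA: the fragments came from different chromatin complexes.
    return "chimeric"

def chimeric_fraction(pairs) -> float:
    """Fraction of PETs whose linker composition is heterodimeric (AB)."""
    counts = Counter(classify_linker_pair(a, b) for a, b in pairs)
    total = sum(counts.values())
    return counts["chimeric"] / total if total else 0.0

# Toy input: four PETs, one of which carries mixed barcodes.
example_pairs = [("A", "A"), ("B", "B"), ("A", "B"), ("A", "A")]
example_fraction = chimeric_fraction(example_pairs)
```

On real data the barcodes would first have to be read out of the linker portion of each sequenced PET; that parsing step is omitted here.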
PET extraction, mapping, and statistical analyses
The PETs are extracted and mapped in silico to the human reference genome.
Identification of ChIP enriched peaks
Self-ligated PETs are used for identifying ChIP-enriched sites because they provide the most reliable mapping to the reference genome.
ChIP enrichment peak-finding algorithm
A called peak is considered a binding site if there are multiple overlapping self-ligated PETs.
The false discovery rate is determined using statistical simulations that estimate the random background of PET-derived virtual DNA overlaps, that is, the expected background noise.
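The two ideas above, calling a peak where several self-ligated PETs overlap and estimating the false discovery rate by simulation, can be sketched as follows. This is a simplified illustration under stated assumptions: each self-ligated PET is treated as a half-open interval on a single chromosome, the null model places the same fragments uniformly at random, and the function names and parameters (`overlap_regions`, `estimate_fdr`, `min_overlap`, `n_sim`) are hypothetical rather than taken from any published tool.

```python
import random

def overlap_regions(intervals, min_overlap):
    """Return (start, end, max_depth) for regions covered by >= min_overlap intervals."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # Sorting puts -1 before +1 at equal positions, i.e. intervals are half-open.
    events.sort()
    regions, depth, region_start, peak_depth = [], 0, None, 0
    for pos, delta in events:
        depth += delta
        if depth >= min_overlap and region_start is None:
            region_start, peak_depth = pos, depth
        elif region_start is not None:
            if depth >= min_overlap:
                peak_depth = max(peak_depth, depth)
            else:
                regions.append((region_start, pos, peak_depth))
                region_start = None
    return regions

def estimate_fdr(intervals, genome_length, min_overlap, n_sim=200, seed=0):
    """Monte Carlo estimate: mean number of random-background peaks / observed peaks."""
    rng = random.Random(seed)
    observed = len(overlap_regions(intervals, min_overlap))
    if observed == 0:
        return 0.0
    lengths = [end - start for start, end in intervals]
    simulated_total = 0
    for _ in range(n_sim):
        # Assumed null model: same fragment lengths, uniform random positions.
        shuffled = []
        for length in lengths:
            start = rng.randrange(0, genome_length - length + 1)
            shuffled.append((start, start + length))
        simulated_total += len(overlap_regions(shuffled, min_overlap))
    return min(1.0, simulated_total / n_sim / observed)

# Toy data: three overlapping self-ligated PETs and one isolated PET.
self_pets = [(100, 200), (150, 250), (180, 300), (1000, 1100)]
peaks = overlap_regions(self_pets, min_overlap=2)
fdr = estimate_fdr(self_pets, genome_length=1_000_000, min_overlap=2)
```

A uniform null ignores mappability and local chromatin biases, so a real analysis would likely need a more careful background model.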
Filtering of repetitive DNA
Satellite regions and binding sites present in regions with severe structural variations are removed.
ChIP enrichment count
The numbers of self-ligation and inter-ligation PETs are reported at each site. The total number of self-ligated and inter-ligated PETs at a specific site is called the ChIP enrichment count.
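As a rough illustration of the count defined above, the sketch below tallies self-ligation and inter-ligation PETs at one site. The data layout is an assumption made for the example: a site and a self-ligation PET are `(chrom, start, end)` tuples, and an inter-ligation PET is a pair of such tuples, one per tag. The function names are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def chip_enrichment_count(site, self_pets, inter_pets):
    """Count PETs at a site; 'total' is the ChIP enrichment count."""
    chrom, start, end = site
    n_self = sum(
        1 for c, s, e in self_pets
        if c == chrom and overlaps(s, e, start, end)
    )
    # An inter-ligation PET counts once if either of its tags falls in the site.
    n_inter = sum(
        1 for tags in inter_pets
        if any(c == chrom and overlaps(s, e, start, end) for c, s, e in tags)
    )
    return {"self": n_self, "inter": n_inter, "total": n_self + n_inter}

site = ("chr1", 1000, 1500)
self_pets = [("chr1", 900, 1100), ("chr1", 1400, 1600), ("chr2", 1000, 1200)]
inter_pets = [
    (("chr1", 1200, 1220), ("chr1", 50000, 50020)),
    (("chr3", 10, 30), ("chr1", 1490, 1510)),
    (("chr2", 1, 20), ("chr2", 500, 520)),
]
counts = chip_enrichment_count(site, self_pets, inter_pets)
```

For many sites and millions of PETs, an interval index would be a better choice than this linear scan.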
Figure 6. PET Classification: Uniquely aligned PET sequences can be classified by whether they are derived from one DNA fragment or two DNA fragments.
- Self-ligation PETs
- Inter-ligation PETs
- Intrachromosomal inter-ligation PETs
- Interchromosomal inter-ligation PETs
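The classification listed above can be sketched as a small decision function. The text does not say how one-fragment and two-fragment PETs are told apart, so this example assumes a simple rule: two tags on the same chromosome within a maximum span are taken to come from one fragment. The cutoff value, the `Tag` type, and the function name are illustrative assumptions; a real pipeline may also consider strand orientation.

```python
from typing import NamedTuple

class Tag(NamedTuple):
    chrom: str
    pos: int  # mapped coordinate of the tag

# Illustrative cutoff only; the text does not specify one.
SELF_LIGATION_MAX_SPAN = 3000

def classify_pet(head: Tag, tail: Tag, max_span: int = SELF_LIGATION_MAX_SPAN) -> str:
    """Assign a uniquely aligned PET to one of the three classes."""
    if head.chrom != tail.chrom:
        return "interchromosomal inter-ligation"
    if abs(head.pos - tail.pos) <= max_span:
        # Both tags plausibly come from the two ends of one DNA fragment.
        return "self-ligation"
    return "intrachromosomal inter-ligation"

examples = [
    classify_pet(Tag("chr1", 1000), Tag("chr1", 1400)),
    classify_pet(Tag("chr1", 1000), Tag("chr1", 250000)),
    classify_pet(Tag("chr1", 1000), Tag("chr7", 1400)),
]
```

The span threshold would normally be chosen from the fragment-size distribution of the library rather than fixed in advance.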
Figure 7. Proposed mechanism showing how distal regulatory elements can initiate long-range chromatin interactions involving promoter regions of target genes.
The interactions form DNA loop structures with multiple TFBS at the anchoring center. Small loops might package genes near the anchoring center in a tight sub-compartment, which could increase the local concentration of regulatory proteins for enhanced transcriptional activation. This mechanism might also enhance transcription efficiency, allowing RNA pol II to cycle the tight circular gene templates. The large interaction loops are more likely to link together distant genes at either end of the loop residing near anchor sites for coordinated regulation, or could separate genes in long loops to prevent their activation. Adapted from Fullwood et al.
Strengths and weaknesses
Advantages of the ChIA-PET method
- ChIA-PET has the potential to be an unbiased, whole-genome, de novo approach for long-range chromatin interaction analysis.
- A ChIA-PET experiment is capable of providing two global datasets: the protein factor binding sites, and the interactions between the binding sites.
- ChIA-PET involves ChIP to reduce the complexity for genome-wide analysis and adds specificity to chromatin interactions bound by specific factors of interest.
- ChIA-PET is compatible with tag-based next-generation sequencing approaches such as Roche 454 pyrosequencing, Illumina GA, ABI SOLiD, and Helicos.
- ChIA-PET is applicable to many different protein factors involved in transcriptional regulation or chromatin structural conformation.
- ChIA-PET analysis can be applied to chromatin interactions involved in a particular nuclear process. By using general TFs such as RNA Polymerase II, it may be possible to identify all chromatin interactions involved in transcription regulation. Further, the use of protein factors involved in DNA replication or chromatin structure would allow identification of all interactions due to DNA replication and chromatin structural modification.
Weaknesses of the ChIA-PET method
- It is well established that cis- and trans-regulatory complexes contain unique combinations of proteins based on cell- and tissue-specific conditions. While identification of single, functional TFBS is a significant advancement, using ChIA-PET to identify the individual proteins in a complex would require guesswork and multiple experiments to identify each interacting protein. This would be a costly and time-consuming endeavour.
- ChIA-PET is limited by the quality, purity, and specificity of the antibodies used.
- ChIA-PET is dependent on identification of sequences that can be mapped to the reference sequence.
- ChIA-PET requires the use of peak-calling computer algorithms to organize and map PET reads to the reference genome. Because of variations between software platforms, results can vary depending on which program is used.
- Although repetitive DNA regions can be associated with gene regulation, they need to be removed as they can affect the data.
Analysis and software
Software typically used in a ChIA-PET experiment
ELAND
Maps ChIP enriched DNA fragments to the reference human genome.
RepeatMasker
In-silico masking of repetitive elements.
Monte Carlo simulation
Used to estimate the false discovery rates.
PET-Tool
A software suite for processing and managing of Paired-End di-Tag sequence data.
ChIA-PET Tool
A software suite for processing ChIA-PET data.
Alternatives
Chromatin immunoprecipitation:
ChIP-Seq, ChIP-PET, ChIP-SAGE, ChIP-chip.
Chromosome conformation capture:
2-C, 3-C, 4-C, 5-C.
Paired-end tags:
PET.