Family-based QTL mapping


Quantitative trait locus (QTL) mapping is the process of identifying genomic regions that potentially contain genes responsible for traits of economic, health or environmental importance. Plant breeders and geneticists routinely map QTLs to associate potential causal genes with phenotypes of interest. Family-based QTL mapping is a variant of QTL mapping in which multiple families are used.

Pedigree in humans and wheat

Pedigree information includes information about ancestry. Keeping pedigree records is a centuries-old tradition. Pedigrees can also be verified using genetic marker data.

In plants

The method has been discussed in the context of plant breeding populations. Pedigree records are kept by plant breeders, and pedigree-based selection is popular in several plant species. Plant pedigrees differ from human pedigrees, particularly because plants are often hermaphroditic: an individual can act as the male or the female parent, and matings can be made in arbitrary combinations, including inbreeding loops. Plant pedigrees may also contain "selfs", i.e. offspring resulting from self-pollination of a plant.

Pedigree denotation

Simple cross
  Symbol   Meaning               Example
  /        first-order cross     SON 64/KLRE
  //       second-order cross    IR 64/KLRE // CIAN0
  /3/      third-order cross     TOBS /3/ SON 64/KLRE // CIAN0
  /4/      fourth-order cross    TOBS /3/ SON 64/KLRE // CIAN0 /4/ SEE
  /n/      nth-order cross

Back cross
  Symbol   Meaning
  *n       the back cross parent was used n times
  When *n is on the left side of the simple cross symbol, the back cross parent is the female; when it is on the right side, the back cross parent is the male.
  Example: SEE/3*ANE, TOBS*6/CIAN0
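The notation above can be read mechanically: split the string at its highest-order cross symbol and repeat on each side. The following is a minimal sketch of that idea, not a standard tool. The function names (parse_pedigree, parse_parent), the dictionary keys (order, left, right, name, times) and the handling of whitespace are assumptions made for illustration.

```python
import re

# Cross symbols: "/n/" (n-th order), "//" (second order), "/" (first order).
DELIM = re.compile(r"/(\d+)/|//|/")
# Back cross markers: "NAME*n" (marker left of the cross symbol)
# or "n*NAME" (marker right of the cross symbol).
BACKCROSS_AFTER_NAME = re.compile(r"^(?P<name>.+?)\*(?P<n>\d+)$")
BACKCROSS_BEFORE_NAME = re.compile(r"^(?P<n>\d+)\*(?P<name>.+)$")


def cross_order(symbol):
    """Return the order of a cross symbol such as '/', '//' or '/3/'."""
    if symbol == "/":
        return 1
    if symbol == "//":
        return 2
    return int(symbol.strip("/"))


def parse_parent(text):
    """Parse a single parent name, with an optional back cross count."""
    for pattern in (BACKCROSS_AFTER_NAME, BACKCROSS_BEFORE_NAME):
        match = pattern.match(text)
        if match:
            return {"name": match.group("name").strip(),
                    "times": int(match.group("n"))}
    return {"name": text, "times": 1}


def parse_pedigree(text):
    """Split at the highest-order cross symbol and recurse on both sides."""
    text = text.strip()
    delims = [(m.start(), m.end(), cross_order(m.group()))
              for m in DELIM.finditer(text)]
    if not delims:
        return parse_parent(text)
    start, end, order = max(delims, key=lambda d: d[2])
    return {"order": order,
            "left": parse_pedigree(text[:start]),
            "right": parse_pedigree(text[end:])}


tree = parse_pedigree("TOBS /3/ SON 64/KLRE // CIAN0")
```

Here tree describes a third-order cross whose left side is TOBS and whose right side is the second-order cross of SON 64/KLRE with CIAN0. Parent names that themselves contain "/" or "*" would need extra handling that this sketch does not attempt.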
The idea of family-based QTL mapping comes from the inheritance of marker alleles and their association with the trait of interest. Earlier work has demonstrated how to use family-based association in plant breeding families.

Limitation of conventional methods

Traditional mapping populations consist of a single family derived from a cross between two or three parents, which are often distantly related. Traditional mapping methods have some important limitations, including limited polymorphism rates and no indication of marker effectiveness in multiple genetic backgrounds. Often, by the time a QTL mapping population has been developed and mapped, breeders have already introgressed the new QTL using traditional breeding and selection methods. This can reduce the usefulness of marker-assisted selection (MAS) within breeding programs at the time when MAS could be most useful. Family-based QTL mapping removes this limitation by using existing plant breeding families.

Common study population mapping

Broadly, there are three classes of study designs: those in which large sets of relatives from extended or nuclear families are sampled, those in which pairs of relatives are sampled, and those in which unrelated individuals are sampled.

Unrelated individuals

A natural collection of individuals with unknown pedigree constitutes the mapping population. Population-based association mapping techniques are based on this type of population. In plants, such populations are hard to find because most individuals are related in some way. Another disadvantage is that, even when such a population can be found, the allele of interest is often not at a high frequency. To balance allele frequencies, case-control designs are usually used.

Sibpairs

Such designs include a pair of sibs from each of multiple independent families. The members of each sib pair are not chosen at random: often both siblings are chosen from one tail of the distribution of the quantitative trait (QT), or one sibling is chosen from the upper tail and the other from the lower tail. Another sampling design includes a pair of siblings in which one is chosen from the upper or lower tail of the distribution and the other is chosen randomly from among the remaining siblings.
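The tail-based sampling described above can be sketched as a simple filter over families. This is only an illustration: the function name classify_sib_pairs, the input layout (one pair of trait values per family), and the cut-off values are assumptions, and in practice the thresholds would come from the observed trait distribution.

```python
def classify_sib_pairs(families, lower, upper):
    """Keep sib pairs in which both members fall in a tail of the trait.

    families: dict mapping a family id to (trait of sib 1, trait of sib 2).
    lower, upper: hypothetical cut-offs defining the lower and upper tails.
    Returns a dict mapping family id to "concordant" (same tail) or
    "discordant" (opposite tails); other pairs are dropped.
    """
    selected = {}
    for family_id, sib_values in families.items():
        tails = []
        for value in sib_values:
            if value >= upper:
                tails.append("high")
            elif value <= lower:
                tails.append("low")
            else:
                tails.append(None)  # sib is not in either tail
        if None in tails:
            continue
        selected[family_id] = (
            "concordant" if tails[0] == tails[1] else "discordant"
        )
    return selected


# Made-up trait values for four families, for illustration only.
example = {"F1": (9.1, 8.7), "F2": (9.4, 2.0), "F3": (5.0, 9.0), "F4": (1.5, 2.2)}
pairs = classify_sib_pairs(example, lower=2.5, upper=8.5)
```

With these made-up values, F1 and F4 are concordant pairs, F2 is a discordant pair, and F3 is dropped because one sib lies between the cut-offs.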

Trios

Trios include both parents and one offspring. Trios are more commonly used in association studies. Association mapping assumes that the trios are unrelated to each other, although the members within each trio are related.

Nuclear family

A nuclear family consists of a simple two-generation family pedigree.

Extended pedigrees

Extended pedigrees include multiple generations. They can be as deep or as wide as the available pedigree information allows. Extended pedigrees are attractive for linkage-based analysis.

Linkage vs association analysis

Linkage and association analysis are primary tools for gene discovery, localization and functional analysis. While the conceptual underpinnings of these approaches have long been known, advances in recent decades in molecular genetics, efficient algorithms and computing power have enabled the large-scale application of these methods. Linkage studies seek to identify loci that cosegregate with the trait within families, whereas association studies seek to identify particular variants that are associated with the phenotype at the population level. These are complementary methods that, together, provide a means to probe the genome and describe the etiology of complex traits. In linkage studies, we ask whether the trait cosegregates with a specific genomic region, tagged by polymorphic markers, within families. In association studies, by contrast, we seek a correlation between a specific genetic variant and trait variation in a sample of individuals, which would implicate a causal role for the variant.

Family-based linkage analysis

Genetic linkage is the phenomenon whereby alleles at different loci cosegregate in families. The strength of cosegregation is measured by the recombination fraction θ, the probability of an odd number of recombination events between the two loci. More complex pedigrees provide higher power. Estimation of the identity-by-descent (IBD) matrix is a central component in mapping quantitative trait loci using variance component models. Alleles have identity by type when they have the same phenotypic effect. Alleles that are identical by type fall into two groups: those that are identical by descent because they arose from the same allele in an earlier generation, and those that are not identical by descent, or identical by state, because they arose from separate mutations. Parent-offspring pairs share 50% of their genes IBD, and monozygotic twins share 100% IBD. What is relevant in linkage analysis is the inheritance of alleles at adjacent loci; it is therefore of critical importance to determine whether the alleles are identical by descent or only identical by state. There are three categories of family-based linkage analysis: strongly model-based, weakly model-based, and model-free. Variance component methods may be viewed as hybrids.
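The IBD sharing figures quoted above can be related to the pedigree through the kinship coefficient, the probability that an allele drawn at random from each of two individuals is identical by descent. The sketch below uses the standard recursive definition of kinship and is an illustration under stated assumptions, not the estimation procedure of any particular package: the function name kinship, the tuple layout (id, parent 1, parent 2), and the requirement that parents are listed before their offspring are all assumptions. It gives the sharing expected from the pedigree alone; marker-based IBD estimation at a specific locus is a separate step.

```python
from functools import lru_cache


def kinship(pedigree):
    """Return a dict of pairwise kinship coefficients.

    pedigree: list of (id, parent1, parent2) tuples, parents listed before
    their offspring, with None for unknown (founder) parents. A selfed
    plant is written with the same id for both parents.
    """
    position = {ind: k for k, (ind, _, _) in enumerate(pedigree)}
    parents = {ind: (p1, p2) for ind, p1, p2 in pedigree}

    @lru_cache(maxsize=None)
    def phi(a, b):
        if a is None or b is None:
            return 0.0  # unknown parents are treated as unrelated
        # Always expand the individual that appears later in the pedigree.
        if position[a] < position[b]:
            a, b = b, a
        p1, p2 = parents[a]
        if a == b:
            return 0.5 * (1.0 + phi(p1, p2))
        return 0.5 * (phi(p1, b) + phi(p2, b))

    ids = [ind for ind, _, _ in pedigree]
    return {(a, b): phi(a, b) for a in ids for b in ids}


# Hypothetical pedigree: two founders, two full sibs, and one self of F1.
ped = [("P1", None, None), ("P2", None, None),
       ("F1", "P1", "P2"), ("F1b", "P1", "P2"),
       ("S1", "F1", "F1")]
k = kinship(ped)
```

For non-inbred relatives, twice the kinship coefficient is the expected proportion of alleles shared IBD, so the parent-offspring value k[("P1", "F1")] of 0.25 corresponds to the 50% figure in the text. The same recursion also handles the selfed individual S1, whose kinship with its single parent F1 is 0.5.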

Family-based association analysis

Association mapping is receiving considerable attention in the plant genetics community for its potential to use existing genetic resource collections to fine-map quantitative trait loci, validate candidate genes, and identify alleles of interest. Three elements of particular importance for conducting association mapping or interpreting its results are:
  1. the analysis of population structure into subgroups,
  2. its use to control for spurious associations and consequences in the specific case of differential selection among subgroups, and
  3. the analysis of the local structure of linkage disequilibrium (LD) into haplotypes and its consequences for the resolution and the application of LD mapping.
Alongside population-based association, family-based association tests are becoming more popular.
The family-based transmission disequilibrium test (TDT) has gained wide popularity in recent years. This method also focuses on alleles transmitted to affected offspring, but it is formulated to take account of both the linkage and the disequilibrium that underlie the association. The test requires genotype information on trios, namely an affected child and both biological parents, and at least one parent must be heterozygous for the test to be informative. The proposed test statistic is McNemar's chi-square statistic. It tests the null hypothesis that the putative disease-associated allele is transmitted 50% of the time from heterozygous parents against the alternative hypothesis that the trait-associated allele is transmitted more often. The TDT is not affected by population stratification and admixture.
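As a worked illustration of the McNemar form of the statistic: if heterozygous parents transmit the candidate allele b times and fail to transmit it c times, the statistic is (b - c)^2 / (b + c), compared with a chi-square distribution with one degree of freedom. The sketch below assumes the transmission counts have already been tallied from the trios; the function name tdt and the example counts are made up for illustration. The p-value it returns is the usual two-sided chi-square tail probability.

```python
import math


def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic from heterozygous-parent transmissions.

    transmitted:   number of times the candidate allele was passed on (b)
    untransmitted: number of times it was not passed on (c)
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Made-up counts: 60 transmissions and 40 non-transmissions.
chi2, p_value = tdt(60, 40)
```

With these counts the statistic is (60 - 40)^2 / 100 = 4.0, which corresponds to a p-value of about 0.046.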
The concept of family-based test of association has been extended to quantitative traits.

Quantitative Transmission Disequilibrium Test (QTDT)

The TDT has been extended to quantitative traits and to nuclear or extended pedigree families. The generalized test allows any type of family to be used in testing. The QTDT has also been extended to haplotype-based association mapping. Haplotypes are combinations of marker alleles that are located close together on the same chromosome and tend to be inherited together. With the availability of high-density SNP markers, haplotypes play an important role in association studies. First, haplotypes are critical to understanding the LD pattern across the genome, which is essential for association studies. There is no better way to understand the LD pattern than to know the haplotypes themselves. Haplotypes show how alleles are organized along the chromosome and reflect the pattern of inheritance over evolutionary time. Second, haplotype-based methods can be more powerful than single-marker methods in association studies that map genes for complex traits.

Drawing family pedigrees

Several pedigree drawing programs are available for human genetics, such as COPE, CYRILLIC, FTM, FTREE, KINDRED, PED, PEDHUNTER, PEDIGRAPH, PEDIGREE/DRAW, PEDIGREE-VISUALIZER, PEDPLOT, PEDRAW/WPEDRAW and PROGENY. However, pedigree drawing in plants requires some additional features, such as inbreeding, selfing, mutation and polyploidy, which are supported in Pedimap. Pedimap can be used to visualize pedigrees together with phenotypic, genotypic and IBD probability data for every type of plant pedigree, in both diploids and tetraploids.