Haplogroup E-M2
Haplogroup E-M2 is a human Y-chromosome DNA haplogroup. It is primarily distributed in Sub-Saharan Africa. E-M2 is the predominant subclade in Western Africa, Central Africa, Southern Africa and the African Great Lakes, and occurs at moderate frequencies in North Africa and the Middle East. E-M2 has several subclades, many of which are included in either E-L485 or E-U175. E-M2 is especially common among native Africans speaking Niger-Congo languages and was spread to Southern and Eastern Africa through the Bantu expansion.
Origins
The discovery of two SNPs by Trombetta et al. significantly redefined the E-V38 phylogenetic tree. This led the authors to suggest that E-V38 may have originated in East Africa. E-V38 joins the West African-affiliated E-M2 and the northern East African-affiliated E-M329 with an earlier common ancestor who, like E-P2, may have also originated in East Africa. The downstream SNP E-M180 possibly originated on the moist south-central Saharan savannah/grassland of northern Africa between 14,000 and 10,000 years BP. According to Wood et al. and Rosa et al., subsequent population movements changed the pre-existing Y-chromosomal diversity in Central, Southern and southern East Africa, replacing the previous haplogroup frequencies in these areas with the now dominant E1b1a1 lineages. Traces of earlier inhabitants, however, can be observed today in these regions via the presence of the Y-DNA haplogroups A1a, A1b, A2, A3, and B-M60, which are common in certain populations, such as the Mbuti and Khoisan.
Distribution
This haplogroup's frequency and diversity are highest in West Africa. Within Africa, E-M2 displays a west-to-east as well as a south-to-north clinal distribution. In other words, the frequency of the haplogroup decreases as one moves from western and southern Africa toward the eastern and northern parts of the continent.
Population group | frequency | References |
Bamileke | 96%–100% | |
Ewe | 97% | |
Ga | 97% | |
Yoruba | 93.1% | |
Tutsi | 85% | |
Fante | 84% | |
Mandinka | 79%–87% | |
Ovambo | 82% | |
Senegalese | 81% | |
Ganda | 77% | |
Bijagós | 76% | |
Balanta | 73% | |
Fula | 73% | |
Herero | 71% | |
Nalú | 71% |
Populations in Northwest Africa, central East Africa and Madagascar have tested at more moderate frequencies.
Population group | frequency | References |
Tuareg from Tânout, Niger | 44.4% | |
Comorian Shirazi | 41% | |
Tuareg from Gorom-Gorom, Burkina Faso | 16.6% | |
Tuareg from Gossi, Mali | 9.1% | |
Cape Verdeans | 15.9% | |
Maasai | 15.4% | |
Luo | 66% | |
Iraqw | 11.11% | |
Comoros | 23.46% | |
Merina people | 44% | |
Antandroy | 69.6% | |
Antanosy | 48.9% | |
Antaisaka | 37.5% |
E-M2 is found at low to moderate frequencies in North Africa and northern East Africa. Some of the lineages found in these areas are possibly due to the Bantu expansion or other migrations. The E-M2 marker that appeared in North African samples stems from indigenous Moors. However, the discovery in 2011 of the E-V38 marker, which predates E-M2, has led Trombetta et al. to suggest that E-V38 may have originated in East Africa. In Eritrea and most of Ethiopia, E-V38 is usually found only in the form of E-M329, which is autochthonous, while E-M2 generally indicates Bantu migratory origins.
Population group | frequency | References |
Tuareg from Al Awaynat and Tahala, Libya | 46.5% | |
Oran, Algeria | 8.6% | |
Berbers, southern and north-central Morocco | 9.5%, 5.8% | |
Moroccan Arabs | 6.8%, 1.9% | |
Saharawis | 3.5% | |
Egyptians | 1.4%, 0%, 8.33% | |
Tunisians | 1.4% | |
Sudanese | 0.9% | |
Somalia nationals | 1.5% | |
Somalis | 0% | |
Djiboutians | 0% | |
Eritreans | 0% | |
Ethiopians | 0% |
Outside of Africa, E-M2 has been found at low frequencies, mainly in West Asia. A few isolated occurrences of E-M2 have also been observed among populations in Southern Europe, such as in Croatia, Malta, Spain and Portugal.
Population group | frequency | References |
Saudi Arabians | 6.6% | |
Omanis | 6.6% | |
Emiratis | 5.5% | |
Yemenis | 4.8% | |
Cypriots | 3.2% | |
Southern Iranians | 1.7% | |
Jordanians | 1.4% | |
Sri Lanka | 1.4% | |
Aeolian Islands, Italy | 1.2% |
The Trans-Atlantic slave trade brought people from Africa to North America, Central America and South America, including the Caribbean. Consequently, the haplogroup is often observed in the United States among men who self-identify as African Americans. It has also been observed in a number of populations in Mexico, the Caribbean, Central America, and South America among people of African descent.
Population group | frequency | References |
Americans | 7.7–7.9% | |
Cubans | 9.8% | |
Dominicans | 5.69% | |
Puerto Ricans | 19.23% | |
Nicaraguans | 5.5% | |
Several populations of Colombians | 6.18% | |
Alagoas, Brazil | 4.45% | |
Bahia, Brazil | 19% | |
Bahamians | 58.63% |
Subclades
E1b1a1
E1b1a1 is defined by markers DYS271/M2/SY81, M291, P1/PN1, P189, P293, V43, and V95. Whilst E1b1a reaches its highest frequency of 81% in Senegal, only 1 of the 139 Senegalese who were tested showed M191/P86. In other words, the farther one moves into West Africa from western Central Africa, the less often subclade E1b1a1f is found. "A possible explanation might be that haplotype 24 chromosomes were already present across the Sudanese belt when the M191 mutation, which defines haplotype 22, arose in central western Africa. Only then would a later demic expansion have brought haplotype 22 chromosomes from central western to western Africa, giving rise to the opposite clinal distributions of haplotypes 22 and 24."
E1b1a1a1
E1b1a1a1 is commonly defined by M180/P88. The basal subclade is quite regularly observed in M2+ samples.
E1b1a1a1a
E1b1a1a1a is defined by marker M58. 5% of the sample from the town of Singa-Rimaïbé, Burkina Faso, tested positive for E-M58. 15% of Hutus in Rwanda tested positive for M58. Three South Africans tested positive for this marker. One Carioca from Rio de Janeiro, Brazil, tested positive for the M58 SNP. The place of origin and age are unreported.
E1b1a1a1b
E1b1a1a1b is defined by M116.2, a private marker. A single carrier was found in Mali.
E1b1a1a1c
E1b1a1a1c is defined by the private marker M149. This marker was found in a single South African.
E1b1a1a1d
E1b1a1a1d is defined by the private marker M155. It is known from a single carrier in Mali.
E1b1a1a1e
E1b1a1a1e is defined by markers M10, M66, M156 and M195. Among Wairak people in Tanzania, 4.6% tested positive for E-M10. E-M10 was also found in a single person of the Lissongo group in the Central African Republic and in two members of a "Mixed" population from the Adamawa region.
E1b1a1a1f
E1b1a1a1f is defined by L485. The basal node E-L485* appears to be somewhat uncommon but has not been sufficiently tested in large populations. The ancestral L485 SNP was very recently discovered. Some of these SNPs have little or no published population data and/or have yet to receive nomenclature recognition by the YCC.
- E1b1a1a1f1 is defined by marker L514. This SNP is currently without population study data outside of the 1000 Genomes Project.
- E1b1a1a1f1a is defined by marker M191/P86. Filippo et al. studied a number of African populations that were E-M2 positive and found the basal E-M191/P86 in a population of Gur speakers in Burkina Faso. Montano et al. found similar sparse distribution of E-M191* in Nigeria, Gabon, Cameroon and Congo. M191/P86 positive samples occurred in tested populations of Annang, Ibibio, Efik, and Igbo living in Nigeria, West Africa. E-M191/P86 appears in varying frequencies in Central and Southern Africa but almost all are also positive for P252/U174. Bantu-speaking South Africans tested 25.9% positive and Khoe-San speaking South Africans tested 7.7% positive for this SNP. It also appears commonly in Africans living in the Americas. A population in Rio de Janeiro, Brazil tested 9.2% positive. 34.9% of American Haplogroup E men tested positive for M191.
- E1b1a1a1f1a1 is defined by P252/U174. It appears to be the most common subclade of E-L485. It is believed to have originated near western Central Africa. It is rarely found in the most western portions of West Africa. Montano et al. found this subclade very prevalent in Nigeria and Gabon. Filippo et al. estimated a tMRCA of ~4.2 kya from a sample of the Yoruba population positive for the SNP.
- E1b1a1a1f1a1b is defined by P115. This subclade has only been observed amongst Fang people of Central Africa.
- E1b1a1a1f1a1c is defined by P116. Montano et al. observed this SNP only in Gabon and a Bassa population from Cameroon.
- E1b1a1a1f1a1d is defined by Z1704. This subclade has been observed across Africa. The 1000 Genomes Project Consortium found this SNP among Yoruba from Nigeria, in three Kenyan Luhyas and in one Puerto Rican of African descent.
- E1b1a1a1f1b is defined by markers L515, L516, L517, and M263.2. This subclade was found by researchers of the Y-Chromosome Genome Comparison Project using data from the commercial bioinformatics company 23andMe.
E1b1a1a1g
The supposed "Bantu haplotype" found in E-U175 carriers is "present at appreciable frequencies in other Niger–Congo languages speaking peoples as far west as Guinea-Bissau". This is the modal haplotype of STR markers that is common in carriers of E-U175.
E1b1a1a1g has several subclades.
- E1b1a1a1g1 is defined by U209. It is the most prominent subclade of U175. This subclade reaches very high frequencies of over fifty percent in the Cameroonian populations of Bassa and Bakaka, possibly indicating its place of origin. However, E-U209 is widely found at lower frequencies in West and Central African countries surrounding Cameroon and Gabon. Brucato et al. found the SNP in populations of Ahizi (38.8%), Yacouba (27.5%) and Beninese (6.5%).
- E1b1a1a1g1a is defined by U290. The Montano et al. study of U290 showed a lower frequency in Nigeria and western Central Africa than the basal node U209. The highest population frequency in that study was 57.7% among the Ewondo in Cameroon. 32.5% of American Haplogroup E men tested by Sims et al. were positive for this SNP.
- E1b1a1a1g1a2 is defined by Z1725. This marker has been observed by The 1000 Genomes Project Consortium in Yoruba Nigerians and Luhya Kenyans.
- E1b1a1a1g1c is defined by M154. A Bamileke population tested 31.3% positive for the marker. Bakaka speakers from Cameroon tested 8% positive. In an Ovimbundu test population, this SNP was found at 14%. Members of this subclade have also been found in South Africa.
- E1b1a1a1g1d is defined by V39. Trombetta et al. first published this SNP in 2011 but gave little population data about it. It is only known to have been found in an African population.
E1b1a1a1h
Phylogenetics
Phylogenetic history
Prior to 2002, the academic literature contained at least seven naming systems for the Y-chromosome phylogenetic tree. This led to considerable confusion. In 2002, the major research groups came together and formed the Y-Chromosome Consortium. They published a joint paper that created a single new tree that all agreed to use. Later, a group of citizen scientists with an interest in population genetics and genetic genealogy formed a working group to create an amateur tree aiming at being above all timely. The table below brings together all of these works at the point of the landmark 2002 YCC Tree. This allows a researcher reviewing older published literature to quickly move between nomenclatures.
YCC 2002/2008 | ' | ' | ' | ' | ' | ' | ' | YCC 2002 | YCC 2005 | YCC 2008 | YCC 2010r | ISOGG 2006 | ISOGG 2007 | ISOGG 2008 | ISOGG 2009 | ISOGG 2010 | ISOGG 2011 | ISOGG 2012 |
E-P29 | 21 | III | 3A | 13 | Eu3 | H2 | B | E* | E | E | E | E | E | E | E | E | E | E |
E-M33 | 21 | III | 3A | 13 | Eu3 | H2 | B | E1* | E1 | E1a | E1a | E1 | E1 | E1a | E1a | E1a | E1a | E1a |
E-M44 | 21 | III | 3A | 13 | Eu3 | H2 | B | E1a | E1a | E1a1 | E1a1 | E1a | E1a | E1a1 | E1a1 | E1a1 | E1a1 | E1a1 |
E-M75 | 21 | III | 3A | 13 | Eu3 | H2 | B | E2a | E2 | E2 | E2 | E2 | E2 | E2 | E2 | E2 | E2 | E2 |
E-M54 | 21 | III | 3A | 13 | Eu3 | H2 | B | E2b | E2b | E2b | E2b1 | - | - | - | - | - | - | - |
E-P2 | 25 | III | 4 | 14 | Eu3 | H2 | B | E3* | E3 | E1b | E1b1 | E3 | E3 | E1b1 | E1b1 | E1b1 | E1b1 | E1b1 |
E-M2 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a* | E3a | E1b1 | E1b1a | E3a | E3a | E1b1a | E1b1a | E1b1a | E1b1a1 | E1b1a1 |
E-M58 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a1 | E3a1 | E1b1a1 | E1b1a1 | E3a1 | E3a1 | E1b1a1 | E1b1a1 | E1b1a1 | E1b1a1a1a | E1b1a1a1a |
E-M116.2 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a2 | E3a2 | E1b1a2 | E1b1a2 | E3a2 | E3a2 | E1b1a2 | E1b1a2 | E1b1a2 | removed | removed |
E-M149 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a3 | E3a3 | E1b1a3 | E1b1a3 | E3a3 | E3a3 | E1b1a3 | E1b1a3 | E1b1a3 | E1b1a1a1c | E1b1a1a1c |
E-M154 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a4 | E3a4 | E1b1a4 | E1b1a4 | E3a4 | E3a4 | E1b1a4 | E1b1a4 | E1b1a4 | E1b1a1a1g1c | E1b1a1a1g1c |
E-M155 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a5 | E3a5 | E1b1a5 | E1b1a5 | E3a5 | E3a5 | E1b1a5 | E1b1a5 | E1b1a5 | E1b1a1a1d | E1b1a1a1d |
E-M10 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a6 | E3a6 | E1b1a6 | E1b1a6 | E3a6 | E3a6 | E1b1a6 | E1b1a6 | E1b1a6 | E1b1a1a1e | E1b1a1a1e |
E-M35 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b* | E3b | E1b1b1 | E1b1b1 | E3b1 | E3b1 | E1b1b1 | E1b1b1 | E1b1b1 | removed | removed |
E-M78 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b1* | E3b1 | E1b1b1a | E1b1b1a1 | E3b1a | E3b1a | E1b1b1a | E1b1b1a | E1b1b1a | E1b1b1a1 | E1b1b1a1 |
E-M148 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b1a | E3b1a | E1b1b1a3a | E1b1b1a1c1 | E3b1a3a | E3b1a3a | E1b1b1a3a | E1b1b1a3a | E1b1b1a3a | E1b1b1a1c1 | E1b1b1a1c1 |
E-M81 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b2* | E3b2 | E1b1b1b | E1b1b1b1 | E3b1b | E3b1b | E1b1b1b | E1b1b1b | E1b1b1b | E1b1b1b1 | E1b1b1b1a |
E-M107 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b2a | E3b2a | E1b1b1b1 | E1b1b1b1a | E3b1b1 | E3b1b1 | E1b1b1b1 | E1b1b1b1 | E1b1b1b1 | E1b1b1b1a | E1b1b1b1a1 |
E-M165 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b2b | E3b2b | E1b1b1b2 | E1b1b1b1b1 | E3b1b2 | E3b1b2 | E1b1b1b2a | E1b1b1b2a | E1b1b1b2a | E1b1b1b2a | E1b1b1b1a2a |
E-M123 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b3* | E3b3 | E1b1b1c | E1b1b1c | E3b1c | E3b1c | E1b1b1c | E1b1b1c | E1b1b1c | E1b1b1c | E1b1b1b2a |
E-M34 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b3a* | E3b3a | E1b1b1c1 | E1b1b1c1 | E3b1c1 | E3b1c1 | E1b1b1c1 | E1b1b1c1 | E1b1b1c1 | E1b1b1c1 | E1b1b1b2a1 |
E-M136 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b3a1 | E3b3a1 | E1b1b1c1a | E1b1b1c1a1 | E3b1c1a | E3b1c1a | E1b1b1c1a1 | E1b1b1c1a1 | E1b1b1c1a1 | E1b1b1c1a1 | E1b1b1b2a1a1 |
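As a minimal sketch of how the table can be used to move between nomenclatures, the lookup below copies four rows from it (longhand YCC and ISOGG columns only) into a Python dictionary. The names `COLUMNS`, `TABLE` and `translate` are hypothetical and introduced here purely for illustration.

```python
# Longhand column labels, in the order they appear in the table above.
COLUMNS = [
    "YCC 2002", "YCC 2005", "YCC 2008", "YCC 2010r",
    "ISOGG 2006", "ISOGG 2007", "ISOGG 2008", "ISOGG 2009",
    "ISOGG 2010", "ISOGG 2011", "ISOGG 2012",
]

# Four rows copied from the table (the pre-2002 legacy columns are omitted).
TABLE = {
    "E-M58": ["E3a1", "E3a1", "E1b1a1", "E1b1a1", "E3a1", "E3a1",
              "E1b1a1", "E1b1a1", "E1b1a1", "E1b1a1a1a", "E1b1a1a1a"],
    "E-M149": ["E3a3", "E3a3", "E1b1a3", "E1b1a3", "E3a3", "E3a3",
               "E1b1a3", "E1b1a3", "E1b1a3", "E1b1a1a1c", "E1b1a1a1c"],
    "E-M154": ["E3a4", "E3a4", "E1b1a4", "E1b1a4", "E3a4", "E3a4",
               "E1b1a4", "E1b1a4", "E1b1a4", "E1b1a1a1g1c", "E1b1a1a1g1c"],
    "E-M10": ["E3a6", "E3a6", "E1b1a6", "E1b1a6", "E3a6", "E3a6",
              "E1b1a6", "E1b1a6", "E1b1a6", "E1b1a1a1e", "E1b1a1a1e"],
}


def translate(name, source, target):
    """Map a longhand name used in `source` to the name used in `target`.

    Returns {marker: target_name} for every copied row whose name in the
    `source` column equals `name`; the result is empty if nothing matches.
    """
    src = COLUMNS.index(source)
    dst = COLUMNS.index(target)
    return {
        marker: names[dst]
        for marker, names in TABLE.items()
        if names[src] == name
    }


print(translate("E3a4", "YCC 2005", "ISOGG 2012"))
# {'E-M154': 'E1b1a1a1g1c'}
```

The lookup is keyed by the source column as well as the name because the same longhand name can denote different clades in different years: in the table, E1b1a1 refers to E-M58 under YCC 2008 but to E-M2 under ISOGG 2011. The mutation-based shorthand (E-M58, E-M2) stays stable across all columns.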
Research publications
The following research teams per their publications were represented in the creation of the YCC tree.
Tree
This phylogenetic tree of haplogroup subclades is based on the Y-Chromosome Consortium 2008 Tree, the ISOGG Y-DNA Haplogroup E Tree, and subsequent published research.
- E1b1a1
  - E1b1a1a
    - E1b1a1a1
      - E1b1a1a1a
      - E1b1a1a1b
      - E1b1a1a1c
      - E1b1a1a1d
      - E1b1a1a1e
      - E1b1a1a1f
        - E1b1a1a1f1
          - E1b1a1a1f1a
            - E1b1a1a1f1a1
              - E1b1a1a1f1a1a
              - E1b1a1a1f1a1b
              - E1b1a1a1f1a1c
                - E1b1a1a1f1a1c1
              - E1b1a1a1f1a1d
          - E1b1a1a1f1b
            - E1b1a1a1f1b1
      - E1b1a1a1g
        - E1b1a1a1g1
          - E1b1a1a1g1a
            - E1b1a1a1g1a1
              - E1b1a1a1g1a1a
            - E1b1a1a1g1a2
          - E1b1a1a1g1b
          - E1b1a1a1g1c
          - E1b1a1a1g1d
      - E1b1a1a1h
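In the tree above, every subclade name begins with the name of its parent, so the hierarchy can be recovered from the names alone. The following is a minimal sketch under that assumption; it lists only part of the tree, and the names `NAMES`, `parent_of` and `lineage` are hypothetical helpers introduced for illustration.

```python
# A subset of the subclade names listed in the tree above.
NAMES = [
    "E1b1a1", "E1b1a1a", "E1b1a1a1",
    "E1b1a1a1f", "E1b1a1a1f1", "E1b1a1a1f1a", "E1b1a1a1f1a1",
    "E1b1a1a1g", "E1b1a1a1g1", "E1b1a1a1g1a", "E1b1a1a1g1c",
]


def parent_of(name, names=NAMES):
    """Return the longest listed name that is a proper prefix of `name`.

    Returns None when no listed name is a proper prefix, i.e. `name` is
    the root of the listed subset.
    """
    candidates = [n for n in names if n != name and name.startswith(n)]
    return max(candidates, key=len) if candidates else None


def lineage(name, names=NAMES):
    """Return the path from the root of the listed subset down to `name`."""
    path = [name]
    parent = parent_of(name, names)
    while parent is not None:
        path.append(parent)
        parent = parent_of(parent, names)
    return path[::-1]


print(" > ".join(lineage("E1b1a1a1g1c")))
# E1b1a1 > E1b1a1a > E1b1a1a1 > E1b1a1a1g > E1b1a1a1g1 > E1b1a1a1g1c
```

The prefix rule applies only within a single version of the nomenclature. As the nomenclature table shows, longhand names for the same marker change between years, so names from different versions should not be mixed in one list.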
Genetics