Haplogroup E-M2


Haplogroup E-M2 is a human Y-chromosome DNA haplogroup. It is primarily distributed in Sub-Saharan Africa. E-M2 is the predominant subclade in Western Africa, Central Africa, Southern Africa and the African Great Lakes, and occurs at moderate frequencies in North Africa and the Middle East. E-M2 has several subclades, but many of these subhaplogroups are included in either E-L485 or E-U175. E-M2 is especially common among native Africans speaking Niger-Congo languages and was spread to Southern and Eastern Africa through the Bantu expansion.

Origins

The discovery of two SNPs by Trombetta et al. significantly redefined the E-V38 phylogenetic tree. This led the authors to suggest that E-V38 may have originated in East Africa. E-V38 joins the West African-affiliated E-M2 and the northern East African-affiliated E-M329 with an earlier common ancestor who, like E-P2, may have also originated in East Africa.
The downstream SNP E-M180 possibly originated on the moist south-central Saharan savannah/grassland of northern Africa between 14,000 and 10,000 years BP. According to Wood et al. and Rosa et al., the population movements that carried these lineages changed the pre-existing Y-chromosomal diversity in Central, Southern and southern East Africa, replacing the previous haplogroup frequencies in these areas with the now dominant E1b1a1 lineages. Traces of earlier inhabitants, however, can be observed today in these regions via the presence of the Y-DNA haplogroups A1a, A1b, A2, A3, and B-M60, which are common in certain populations, such as the Mbuti and Khoisan.

Distribution

This haplogroup's frequency and diversity are highest in the West Africa region. Within Africa, E-M2 displays a west-to-east as well as a south-to-north clinal distribution. In other words, the frequency of the haplogroup decreases as one moves from western and southern Africa toward the eastern and northern parts of the continent.
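| Population group | Frequency |
|---|---|
| Bamileke | 96%–100% |
| Ewe | 97% |
| Ga | 97% |
| Yoruba | 93.1% |
| Tutsi | 85% |
| Fante | 84% |
| Mandinka | 79%–87% |
| Ovambo | 82% |
| Senegalese | 81% |
| Ganda | 77% |
| Bijagós | 76% |
| Balanta | 73% |
| Fula | 73% |
| Herero | 71% |
| Nalú | 71% |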

Populations in Northwest Africa, central Eastern Africa and Madagascar have tested at more moderate frequencies.
| Population group | Frequency |
|---|---|
| Tuareg from Tânout, Niger | 44.4% |
| Comorian Shirazi | 41% |
| Tuareg from Gorom-Gorom, Burkina Faso | 16.6% |
| Tuareg from Gossi, Mali | 9.1% |
| Cape Verdeans | 15.9% |
| Maasai | 15.4% |
| Luo | 66% |
| Iraqw | 11.11% |
| Comoros | 23.46% |
| Merina people | 44% |
| Antandroy | 69.6% |
| Antanosy | 48.9% |
| Antaisaka | 37.5% |

E-M2 is found at low to moderate frequencies in North Africa and northern East Africa. Some of the lineages found in these areas are possibly due to the Bantu expansion or other migrations. The E-M2 marker that appears in North African samples stems from indigenous Moors. However, the discovery in 2011 of the E-V38 marker, which predates E-M2, led Trombetta et al. to suggest that E-V38 may have originated in East Africa. In Eritrea and most of Ethiopia, E-V38 is usually found only in the form of E-M329, which is autochthonous, while E-M2 generally indicates Bantu migratory origins.
| Population group | Frequency |
|---|---|
| Tuareg from Al Awaynat and Tahala, Libya | 46.5% |
| Oran, Algeria | 8.6% |
| Berbers, southern and north-central Morocco | 9.5%, 5.8% |
| Moroccan Arabs | 6.8%, 1.9% |
| Saharawis | 3.5% |
| Egyptians | 1.4%, 0%, 8.33% |
| Tunisians | 1.4% |
| Sudanese | 0.9% |
| Somalia nationals | 1.5% |
| Somalis | 0% |
| Djiboutians | 0% |
| Eritreans | 0% |
| Ethiopians | 0% |

Outside of Africa, E-M2 has been found at low frequencies, mainly in West Asia. A few isolated occurrences of E-M2 have also been observed among populations in Southern Europe, such as those of Croatia, Malta, Spain and Portugal.
| Population group | Frequency |
|---|---|
| Saudi Arabians | 6.6% |
| Omanis | 6.6% |
| Emiratis | 5.5% |
| Yemenis | 4.8% |
| Cypriots | 3.2% |
| Southern Iranians | 1.7% |
| Jordanians | 1.4% |
| Sri Lanka | 1.4% |
| Aeolian Islands, Italy | 1.2% |

The Trans-Atlantic slave trade brought people to North America, Central America and South America, including the Caribbean. Consequently, the haplogroup is often observed in United States populations among men who self-identify as African Americans. It has also been observed in a number of populations in Mexico, the Caribbean, Central America, and South America among people of African descent.
| Population group | Frequency |
|---|---|
| Americans | 7.7%–7.9% |
| Cubans | 9.8% |
| Dominicans | 5.69% |
| Puerto Ricans | 19.23% |
| Nicaraguans | 5.5% |
| Several populations of Colombians | 6.18% |
| Alagoas, Brazil | 4.45% |
| Bahia, Brazil | 19% |
| Bahamians | 58.63% |

Subclades

E1b1a1

E1b1a1 is defined by markers DYS271/M2/SY81, M291, P1/PN1, P189, P293, V43, and V95. Whilst E1b1a reaches its highest frequency of 81% in Senegal, only 1 of the 139 Senegalese who were tested showed M191/P86. In other words, subclade E1b1a1f becomes less common as one moves from western Central Africa toward West Africa. "A possible explanation might be that haplotype 24 chromosomes were already present across the Sudanese belt when the M191 mutation, which defines haplotype 22, arose in central western Africa. Only then would a later demic expansion have brought haplotype 22 chromosomes from central western to western Africa, giving rise to the opposite clinal distributions of haplotypes 22 and 24."

E1b1a1a1

E1b1a1a1 is commonly defined by M180/P88. The basal subclade is quite regularly observed in M2+ samples.

E1b1a1a1a

E1b1a1a1a is defined by marker M58. In the town of Singa-Rimaïbé, Burkina Faso, 5% of those tested were positive for E-M58. Among Hutus in Rwanda, 15% tested positive for M58. Three South Africans tested positive for this marker, as did one Carioca from Rio de Janeiro, Brazil. The place of origin and age are unreported.

E1b1a1a1b

E1b1a1a1b is defined by M116.2, a private marker. A single carrier was found in Mali.

E1b1a1a1c

E1b1a1a1c is defined by private marker M149. This marker was found in a single South African.

E1b1a1a1d

E1b1a1a1d is defined by a private marker M155. It is known from a single carrier in Mali.

E1b1a1a1e

E1b1a1a1e is defined by markers M10, M66, M156 and M195. Among Wairak people tested in Tanzania, 4.6% were positive for E-M10. E-M10 was also found in a single person of the Lissongo group in the Central African Republic and in two members of a "Mixed" population from the Adamawa region.

E1b1a1a1f

E1b1a1a1f is defined by L485. The basal node E-L485* appears to be somewhat uncommon but has not been sufficiently tested in large populations. The ancestral L485 SNP was very recently discovered. Some of these SNPs have little or no published population data and/or have yet to receive nomenclature recognition by the YCC.

E1b1a1a1g

E1b1a1a1g is defined by marker U175. The basal E-U175* is extremely rare. Montano et al. found only one out of 505 tested African subjects who was U175 positive but negative for U209. Brucato et al. found similarly low frequencies of basal E-U175* in subjects in the Ivory Coast and Benin. Veeramah et al. found U175 in tested Annang, Ibibio, Efik, and Igbo but did not test for U209.

The supposed "Bantu haplotype" found in E-U175 carriers is "present at appreciable frequencies in other Niger–Congo languages speaking peoples as far west as Guinea-Bissau". This is the modal haplotype of STR markers that is common in carriers of E-U175.

E1b1a1a1g has several subclades.

E1b1a1a1h

E1b1a1a1h is defined by markers P268 and P269. It was first reported in a person from the Gambia.
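
The way the subclades above nest by defining marker can be illustrated with a small sketch. It is a simplification for illustration only: each clade is represented by a single marker from the text, intermediate nodes that the text does not describe are omitted, and the names TREE, depth and classify are hypothetical rather than part of any published tool.

```python
# Defining marker -> (parent marker, clade label), taken from the subclade
# descriptions above. Assumption: one representative marker per clade.
TREE = {
    "M2":     (None,   "E1b1a1"),
    "M180":   ("M2",   "E1b1a1a1"),
    "M58":    ("M180", "E1b1a1a1a"),
    "M116.2": ("M180", "E1b1a1a1b"),
    "M149":   ("M180", "E1b1a1a1c"),
    "M155":   ("M180", "E1b1a1a1d"),
    "M10":    ("M180", "E1b1a1a1e"),
    "L485":   ("M180", "E1b1a1a1f"),
    "U175":   ("M180", "E1b1a1a1g"),
    "P268":   ("M180", "E1b1a1a1h"),
}


def depth(marker):
    """Number of nodes from this marker up to the root of TREE."""
    d = 0
    while marker is not None:
        marker = TREE[marker][0]
        d += 1
    return d


def classify(results):
    """Return the deepest clade whose marker and all ancestors tested derived.

    `results` maps marker name -> True (derived) or False (ancestral);
    untested markers are simply absent and are treated as not confirmed.
    """
    best = None
    for marker in TREE:
        m = marker
        chain_ok = True
        while m is not None:
            if not results.get(m, False):
                chain_ok = False
                break
            m = TREE[m][0]
        if chain_ok and (best is None or depth(marker) > depth(best)):
            best = marker
    return TREE[best][1] if best is not None else None


sample = {"M2": True, "M180": True, "U175": True, "M58": False}
print(classify(sample))
```

A sample is assigned only as deep as its confirmed chain of derived markers allows, so a result that is positive for M58 but untested for M2 and M180 is left unclassified in this sketch.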

Phylogenetics

Phylogenetic history

Prior to 2002, the academic literature contained at least seven naming systems for the Y-chromosome phylogenetic tree, which led to considerable confusion. In 2002, the major research groups came together and formed the Y-Chromosome Consortium (YCC). They published a joint paper that created a single new tree that all agreed to use. Later, a group of citizen scientists with an interest in population genetics and genetic genealogy formed a working group to create an amateur tree, with the aim of being above all timely. The table below brings together all of these works at the point of the landmark 2002 YCC tree, allowing a researcher reviewing older published literature to move quickly between nomenclatures.
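| YCC 2002/2008 | (α) | (β) | (γ) | (δ) | (ε) | (ζ) | (η) | YCC 2002 | YCC 2005 | YCC 2008 | YCC 2010r | ISOGG 2006 | ISOGG 2007 | ISOGG 2008 | ISOGG 2009 | ISOGG 2010 | ISOGG 2011 | ISOGG 2012 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E-P29 | 21 | III | 3A | 13 | Eu3 | H2 | B | E* | E | E | E | E | E | E | E | E | E | E |
| E-M33 | 21 | III | 3A | 13 | Eu3 | H2 | B | E1* | E1 | E1a | E1a | E1 | E1 | E1a | E1a | E1a | E1a | E1a |
| E-M44 | 21 | III | 3A | 13 | Eu3 | H2 | B | E1a | E1a | E1a1 | E1a1 | E1a | E1a | E1a1 | E1a1 | E1a1 | E1a1 | E1a1 |
| E-M75 | 21 | III | 3A | 13 | Eu3 | H2 | B | E2a | E2 | E2 | E2 | E2 | E2 | E2 | E2 | E2 | E2 | E2 |
| E-M54 | 21 | III | 3A | 13 | Eu3 | H2 | B | E2b | E2b | E2b | E2b1 | - | - | - | - | - | - | - |
| E-P2 | 25 | III | 4 | 14 | Eu3 | H2 | B | E3* | E3 | E1b | E1b1 | E3 | E3 | E1b1 | E1b1 | E1b1 | E1b1 | E1b1 |
| E-M2 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a* | E3a | E1b1 | E1b1a | E3a | E3a | E1b1a | E1b1a | E1b1a | E1b1a1 | E1b1a1 |
| E-M58 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a1 | E3a1 | E1b1a1 | E1b1a1 | E3a1 | E3a1 | E1b1a1 | E1b1a1 | E1b1a1 | E1b1a1a1a | E1b1a1a1a |
| E-M116.2 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a2 | E3a2 | E1b1a2 | E1b1a2 | E3a2 | E3a2 | E1b1a2 | E1b1a2 | E1ba12 | removed | removed |
| E-M149 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a3 | E3a3 | E1b1a3 | E1b1a3 | E3a3 | E3a3 | E1b1a3 | E1b1a3 | E1b1a3 | E1b1a1a1c | E1b1a1a1c |
| E-M154 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a4 | E3a4 | E1b1a4 | E1b1a4 | E3a4 | E3a4 | E1b1a4 | E1b1a4 | E1b1a4 | E1b1a1a1g1c | E1b1a1a1g1c |
| E-M155 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a5 | E3a5 | E1b1a5 | E1b1a5 | E3a5 | E3a5 | E1b1a5 | E1b1a5 | E1b1a5 | E1b1a1a1d | E1b1a1a1d |
| E-M10 | 8 | III | 5 | 15 | Eu2 | H2 | B | E3a6 | E3a6 | E1b1a6 | E1b1a6 | E3a6 | E3a6 | E1b1a6 | E1b1a6 | E1b1a6 | E1b1a1a1e | E1b1a1a1e |
| E-M35 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b* | E3b | E1b1b1 | E1b1b1 | E3b1 | E3b1 | E1b1b1 | E1b1b1 | E1b1b1 | removed | removed |
| E-M78 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b1* | E3b1 | E1b1b1a | E1b1b1a1 | E3b1a | E3b1a | E1b1b1a | E1b1b1a | E1b1b1a | E1b1b1a1 | E1b1b1a1 |
| E-M148 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b1a | E3b1a | E1b1b1a3a | E1b1b1a1c1 | E3b1a3a | E3b1a3a | E1b1b1a3a | E1b1b1a3a | E1b1b1a3a | E1b1b1a1c1 | E1b1b1a1c1 |
| E-M81 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b2* | E3b2 | E1b1b1b | E1b1b1b1 | E3b1b | E3b1b | E1b1b1b | E1b1b1b | E1b1b1b | E1b1b1b1 | E1b1b1b1a |
| E-M107 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b2a | E3b2a | E1b1b1b1 | E1b1b1b1a | E3b1b1 | E3b1b1 | E1b1b1b1 | E1b1b1b1 | E1b1b1b1 | E1b1b1b1a | E1b1b1b1a1 |
| E-M165 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b2b | E3b2b | E1b1b1b2 | E1b1b1b1b1 | E3b1b2 | E3b1b2 | E1b1b1b2a | E1b1b1b2a | E1b1b1b2a | E1b1b1b2a | E1b1b1b1a2a |
| E-M123 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b3* | E3b3 | E1b1b1c | E1b1b1c | E3b1c | E3b1c | E1b1b1c | E1b1b1c | E1b1b1c | E1b1b1c | E1b1b1b2a |
| E-M34 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3b3a* | E3b3a | E1b1b1c1 | E1b1b1c1 | E3b1c1 | E3b1c1 | E1b1b1c1 | E1b1b1c1 | E1b1b1c1 | E1b1b1c1 | E1b1b1b2a1 |
| E-M136 | 25 | III | 4 | 14 | Eu4 | H2 | B | E3ba1 | E3b3a1 | E1b1b1c1a | E1b1b1c1a1 | E3b1c1a | E3b1c1a | E1b1b1c1a1 | E1b1b1c1a1 | E1b1b1c1a1 | E1b1b1c1a1 | E1b1b1b2a1a1 |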
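
Moving between nomenclatures with the table above amounts to a lookup keyed on the mutation-based shorthand. The following is a minimal sketch under stated assumptions: only four rows and four schemes from the table are copied in, and the names NAMES, to_marker and translate are hypothetical helpers rather than an existing library.

```python
# Longhand names per scheme, copied from a few rows of the table above.
NAMES = {
    "E-M2":  {"YCC 2002": "E3a*", "YCC 2005": "E3a",
              "ISOGG 2008": "E1b1a", "ISOGG 2012": "E1b1a1"},
    "E-M58": {"YCC 2002": "E3a1", "YCC 2005": "E3a1",
              "ISOGG 2008": "E1b1a1", "ISOGG 2012": "E1b1a1a1a"},
    "E-M10": {"YCC 2002": "E3a6", "YCC 2005": "E3a6",
              "ISOGG 2008": "E1b1a6", "ISOGG 2012": "E1b1a1a1e"},
    "E-M35": {"YCC 2002": "E3b*", "YCC 2005": "E3b",
              "ISOGG 2008": "E1b1b1", "ISOGG 2012": "removed"},
}


def to_marker(name, scheme):
    """Find the shorthand (marker-based) name for a longhand name in a scheme."""
    for marker, by_scheme in NAMES.items():
        if by_scheme.get(scheme) == name:
            return marker
    return None


def translate(name, from_scheme, to_scheme):
    """Translate a longhand name from one scheme to another via the marker."""
    marker = to_marker(name, from_scheme)
    return NAMES[marker][to_scheme] if marker is not None else None


# The same longhand label can denote different clades in different years:
print(to_marker("E1b1a1", "ISOGG 2008"))  # E-M58
print(to_marker("E1b1a1", "ISOGG 2012"))  # E-M2
```

Routing every translation through the marker name reflects why the shorthand is the stable identifier: the longhand label E1b1a1 refers to E-M58 in ISOGG 2008 but to E-M2 in ISOGG 2012.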

Research publications

The following research teams per their publications were represented in the creation of the YCC tree.

Tree

This phylogenetic tree of haplogroup subclades is based on the Y-Chromosome Consortium 2008 Tree, the ISOGG Y-DNA Haplogroup E Tree, and subsequent published research.
