Haplogroup I (mtDNA)
Haplogroup I is a human mitochondrial DNA haplogroup. It is believed to have originated about 21,000 years ago, during the Last Glacial Maximum period in West Asia. The haplogroup is unusual in that it is now widely distributed geographically, but is common in only a few small areas of East Africa, West Asia and Europe. It is especially common among the El Molo and Rendille peoples of Kenya, various regions of Iran, the Lemko people of Slovakia, Poland and Ukraine, the island of Krk in Croatia, the department of Finistère in France and some parts of Scotland.
Origin
Haplogroup I is a descendant of haplogroup N1a1b and sibling of haplogroup N1a1b1. It is believed to have arisen somewhere in West Asia between 17,263 and 24,451 years before present, with a coalescence age of 20.1 thousand years ago. It has been suggested that its origin may be in Iran or, more generally, the Near East. It has diverged into at least seven distinct clades, i.e. branches I1-I7, dated to between 16 and 6.8 thousand years ago. The hypothesis of a Near Eastern origin is based on the fact that all haplogroup I clades, especially those from the Late Glacial period, include mitogenomes from the Near East. The age estimates and dispersal of some subclades are similar to those of major subclades of the mtDNA haplogroups J and T, indicating a possible dispersal of haplogroup I into Europe during the Late Glacial and postglacial periods, several millennia before the European Neolithic period. Some subclades show signs of the Neolithic diffusion of agriculture and pastoralism within Europe. A similar view puts more emphasis on the Persian Gulf region of the Near East.
Distribution
Haplogroup I is found at moderate to low frequencies in East Africa, Europe, West Asia and South Asia. In addition to the seven confirmed clades, the rare basal/paraphyletic clade I* has been observed in three individuals: two from Somalia and one from Iran.
Africa
The highest frequencies of mitochondrial haplogroup I observed so far appear in the Cushitic-speaking El Molo and Rendille in northern Kenya. The clade is also found at comparable frequencies among the Soqotri.
Population | Location | Language Family | N | Frequency | Source |
Amhara | Ethiopia | Afro-Asiatic > Semitic | 1/120 | 0.83% | |
Egyptians | Egypt | Afro-Asiatic > Semitic | 2/34 | 5.9% | |
Beta Israel | Ethiopia | Afro-Asiatic > Cushitic | 0/29 | 0.00% | |
Dawro Konta | Ethiopia | Afro-Asiatic > Omotic | 0/137 | 0.00% | |
Ethiopia | Ethiopia | Undetermined | 0/77 | 0.00% | |
Ethiopian Jews | Ethiopia | Afro-Asiatic > Cushitic | 0/41 | 0.00% | |
Gurage | Ethiopia | Afro-Asiatic > Semitic | 1/21 | 4.76% | |
Hamer | Ethiopia | Afro-Asiatic > Omotic | 0/11 | 0.00% | |
Ongota | Ethiopia | Afro-Asiatic > Cushitic | 0/19 | 0.00% | |
Oromo | Ethiopia | Afro-Asiatic > Cushitic | 0/33 | 0.00% | |
Tigrai | Ethiopia | Afro-Asiatic > Semitic | 0/44 | 0.00% | |
Daasanach | Kenya | Afro-Asiatic > Cushitic | 0/49 | 0.00% | |
Elmolo | Kenya | Afro-Asiatic > Cushitic | 12/52 | 23.08% | |
Luo | Kenya | Nilo-Saharan | 0/49 | 0.00% | |
Maasai | Kenya | Nilo-Saharan | 0/81 | 0.00% | |
Nairobi | Kenya | Niger-Congo | 0/100 | 0.00% | |
Nyangatom | Kenya | Nilo-Saharan | 1/112 | 0.89% | |
Rendille | Kenya | Afro-Asiatic > Cushitic | 3/17 | 17.65% | |
Samburu | Kenya | Nilo-Saharan | 3/35 | 8.57% | |
Turkana | Kenya | Nilo-Saharan | 0/51 | 0.00% | |
Hutu | Rwanda | Niger-Congo | 0/42 | 0.00% | |
Dinka | Sudan | Nilo-Saharan | 0/46 | 0.00% | |
Sudan | Sudan | Undetermined | 0/102 | 0.00% | |
Burunge | Tanzania | Afro-Asiatic > Cushitic | 1/38 | 2.63% | |
Datoga | Tanzania | Nilo-Saharan | 0/57 | 0.00% | |
Iraqw | Tanzania | Afro-Asiatic > Cushitic | 0/12 | 0.00% | |
Sukuma | Tanzania | Niger-Congo | 0/32 | 0.00% | |
Turu | Tanzania | Niger-Congo | 0/29 | 0.00% | |
Yemeni | Yemen | Afro-Asiatic > Semitic | 0/114 | 0.00% |
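The Frequency column in these tables is simply the N column (carriers/sample size) expressed as a percentage. The following minimal Python sketch reproduces that arithmetic for three rows of the table above. The Wilson score interval is an added illustration of how uncertain a percentage from a small sample can be; it is an assumption of this sketch, not a method attributed to the cited studies, and the function names are hypothetical.

```python
from math import sqrt

def frequency(count):
    """Percentage for a count written as 'carriers/sample size', e.g. '12/52'."""
    carriers, total = (int(part) for part in count.split("/"))
    return 100 * carriers / total

def wilson_interval(count, z=1.96):
    """Approximate 95% Wilson score interval, in percent.

    Added for illustration only; the source tables report no intervals.
    """
    carriers, total = (int(part) for part in count.split("/"))
    p = carriers / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return 100 * (centre - half), 100 * (centre + half)

# Counts copied from the N column of the table above.
rows = {"Elmolo": "12/52", "Rendille": "3/17", "Samburu": "3/35"}

for name, count in rows.items():
    low, high = wilson_interval(count)
    print(f"{name}: {frequency(count):.2f}% "
          f"(approx. 95% interval {low:.1f}-{high:.1f}%)")
```

With only 17 individuals sampled, the interval for the Rendille is wide, which is worth keeping in mind when comparing rows with small sample sizes.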
Asia
Haplogroup I is present across West Asia and Central Asia, and is also found at trace frequencies in South Asia. Its highest-frequency area is perhaps in northern Iran. Terreros 2011 notes that it also has high diversity there and reiterates past studies that have suggested that this may be its place of origin. Basal I* has also been found at 4.2% in the Svan population of Georgia (Alfonso-Sánchez MA, Martínez-Bouzas C, Castro A, Peña JA, Fernández-Fernández I, Herrera RJ, de Pancorbo MM, "Sequence polymorphisms of the mtDNA control region in a human isolate: the Georgians from Swanetia"). The table below shows some of the populations where it has been detected.
Population | Language Family | N | Frequency | Source |
Baluch | Indo-European | 0/39 | 0.00% | |
Brahui | Dravidian | 0/38 | 0.00% | |
Caucasus * | Kartvelian | 1/58 | 1.80% | |
Druze | - | 11/311 | 3.54% | |
Gilaki | Indo-European | 0/37 | 0.00% | |
Gujarati | Indo-European | 0/34 | 0.00% | |
Hazara | Indo-European | 0/23 | 0.00% | |
Hunza Burusho | Isolate | 2/44 | 4.50% | |
India | - | 8/2544 | 0.30% | |
Iran | - | 3/31 | 9.70% | |
Iran | - | 2/117 | 1.70% | |
Kalash | Indo-European | 0/44 | 0.00% | |
Kurdish | Indo-European | 1/20 | 5.00% | |
Kurdish | Indo-European | 1/32 | 3.10% | |
Kurdish | Indo-European | 66/200 | 33.0% | |
Lur | Indo-European | 0/17 | 0.00% | |
Makrani | Indo-European | 0/33 | 0.00% | |
Mazandarani | Indo-European | 1/21 | 4.80% | |
Pakistani | Indo-European | 0/100 | 0.00% | |
Pakistan | - | 1/145 | 0.69% | |
Parsi | Indo-European | 0/44 | 0.00% | |
Pathan | Indo-European | 1/44 | 2.30% | |
Persian | Indo-European | 1/42 | 2.40% | |
Shugnan | Indo-European | 1/44 | 2.30% | |
Sindhi | Indo-European | 1/23 | 8.70% | |
Turkish | Turkic | 2/40 | 5.00% | |
Turkish * | Turkic | 1/50 | 2.00% | |
Turkmen | Turkic | 0/41 | 0.00% | |
Uzbek | Turkic | 0/42 | 0.00% |
Europe
Western Europe
In Western Europe, haplogroup I is most common in Northwestern Europe. The frequency in these areas is between 2 and 5 percent. Its highest frequency is in Brittany, France, where it is found in over 9 percent of the population in Finistère. It is uncommon and sometimes absent in other parts of Western Europe.
Population | Language | N | Frequency | Source |
Austria/Switzerland | - | 4/187 | 2.14% | |
Basque | Basque/Labourdin côtier-haut navarrais | 0/56 | 0.00% | |
Basque | Basque/Occidental | 0/55 | 0.00% | |
Basque | Basque/Biscayen | 1/59 | 1.69% | |
Basque | Basque/Haut-navarrais méridional | 2/63 | 3.17% | |
Basque | Basque/Gipuzkoan | 0/57 | 0.00% | |
Basque | Basque/Bas-navarrais | 0/68 | 0.00% | |
Basque | Basque/Haut-navarrais septentrional | 0/51 | 0.00% | |
Basque | Basque/Roncalais-salazarais | 0/55 | 0.00% | |
Basque | Basque/Souletin | 0/62 | 0.00% | |
Basque | Basque/Biscayen | 0/64 | 0.00% | |
Béarn | French | 0/51 | 0.00% | |
Bigorre | French | 0/44 | 0.00% | |
Burgos | Spanish | 0/25 | 0.00% | |
Cantabria | Spanish | 0/18 | 0.00% | |
Chalosse | French | 0/58 | 0.00% | |
Denmark | - | 6/105 | 5.71% | |
England/Wales | - | 12/429 | 3.03% | |
Finland | - | 1/49 | 2.04% | |
Finland/Estonia | - | 5/202 | 2.48% | |
France | - | 2/22 | 9.10% | |
France | - | 0/40 | 0.00% | |
France | - | 0/39 | 0.00% | |
France | - | 2/72 | 2.80% | |
France | - | 2/37 | 5.40% | |
France/Italy | - | 2/248 | 0.81% | |
Germany | - | 12/527 | 2.28% | |
Iceland | - | 21/467 | 4.71% | |
Ireland | - | 3/128 | 2.34% | |
Italy | - | 2/48 | 4.20% | |
La Rioja | Spanish | 1/51 | 1.96% | |
North Aragon | Spanish | 0/26 | 0.00% | |
Orkney | - | 5/152 | 3.29% | |
Saami | - | 0/176 | 0.00% | |
Scandinavia | - | 12/645 | 1.86% | |
Scotland | - | 39/891 | 4.38% | |
Spain/Portugal | - | 2/352 | 0.57% | |
Sweden | - | 0/37 | 0.00% | |
Western Bizkaia | Spanish | 0/18 | 0.00% | |
Western Isles/Isle of Skye | - | 15/246 | 6.50% |
Eastern Europe
In Eastern Europe, the frequency of haplogroup I is generally lower than in Western Europe, but its frequency is more consistent between populations, with fewer extreme highs or lows. There are two notable exceptions. Nikitin 2009 found that Lemkos in the Carpathian mountains have the "highest frequency of haplogroup I in Europe, identical to that of the population of Krk Island in the Adriatic Sea".
Population | N | Frequency | Source |
Boyko | 0/20 | 0.00% | |
Hutsul | 0/38 | 0.00% | |
Lemko | 6/53 | 11.32% | |
Belorussians | 2/92 | 2.17% | |
Russia | 3/215 | 1.40% | |
Romanians | 59 | 0.00% | |
Romanians | 46 | 2.17% | |
Russia | 1/50 | 2.0% | |
Ukraine | 0/18 | 0.00% | |
Croatia | 4/277 | 1.44% | |
Croatia | 15/133 | 11.28% | |
Croatia | 1/105 | 0.95% | |
Croatia | 2/108 | 1.9% | |
Croatia | 1/98 | 1% | |
Herzegovinians | 1/130 | 0.8% | |
Bosnians | 6/247 | 2.4% | |
Serbians | 4/117 | 3.4% | |
Macedonians | 2/146 | 1.4% | |
Macedonian Romani | 7/153 | 4.6% | |
Slovenians | 2/104 | 1.92% | |
Bosnians | 4/144 | 2.78% | |
Poles | 8/436 | 1.83% | |
Caucasus * | 1/58 | 1.80% | |
Russians | 5/201 | 2.49% | |
Bulgaria/Turkey | 2/102 | 1.96% |
Historic and Pre-Historic Samples
Until recently, haplogroup I was absent from ancient European samples found in Paleolithic and Mesolithic grave sites. In 2017, a sample carrying subclade I3 and dated to 9124-7851 BC was reported from a site on the Italian island of Sardinia. In the Near East, a sample from the Levant with a not-yet-defined subclade was dated to 8,850-8,750 BC, and a younger sample from Iran carrying subclade I1c was dated to 3972-3800 BC. A sample with a not-yet-defined subclade was also found in Neolithic Spain. Haplogroup I displays a strong connection with the Indo-European migrations, especially its I1, I1a1 and I3a subclades, which have been found in the Poltavka and Srubnaya cultures in Russia, among ancient Scythians, and in Corded Ware and Unetice culture burials in Saxony. Haplogroup I has also been noted at significant frequencies in more recent historic grave sites.
In 2013, Nature announced the publication of the first genetic study utilizing next-generation sequencing to ascertain the ancestral lineage of an Ancient Egyptian individual. The research was led by Carsten Pusch of the University of Tübingen in Germany and Rabab Khairat, who released their findings in the Journal of Applied Genetics. DNA was extracted from the heads of five Egyptian mummies that were housed at the institution. All the specimens were dated to between 806 BC and 124 AD, a time frame corresponding with the Late Dynastic and Ptolemaic periods. The researchers observed that one of the mummified individuals likely belonged to the I2 subclade. Haplogroup I has also been found among ancient Egyptian mummies excavated at the Abusir el-Meleq archaeological site in Middle Egypt, which date from the Pre-Ptolemaic/late New Kingdom, Ptolemaic, and Roman periods.
Haplogroup I5 has also been observed among specimens at the mainland cemetery in Kulubnarti, Sudan, which date from the Early Christian period.
Samples with determined subclades
Samples with unknown subclades
The frequency of haplogroup I may have undergone a reduction in Europe following the Middle Ages. An overall frequency of 13% was found in ancient samples from Denmark and Scandinavia dating from the Iron Age to the Medieval Age, compared to only 2.5% in modern samples. Because haplogroup I was not observed in any ancient Italian, Spanish, British or central European populations, early central European farmers or Neolithic samples, the authors suggested that "Haplogroup I could, therefore, have been an ancient Southern Scandinavian type "diluted" by later immigration events".
Subclades
Tree
This phylogenetic tree of haplogroup I subclades with time estimates is based on published research. Ages are given in thousands of years.
Hg | Age estimate (thousand years) | 95% confidence interval (thousand years) |
N1a1b | 28.6 | 23.5 - 33.9 |
I | 20.1 | 18.4 - 21.9 |
I1 | 16.3 | 14.6 - 18.0 |
I1a | 11.6 | 9.9 - 13.3 |
I1a1 | 4.9 | 4.2 - 5.6 |
I1a1a | 3.8 | 3.3 - 4.4 |
I1a1b | 1.4 | 0.5 - 2.2 |
I1a1c | 2.5 | 1.3 - 3.7 |
I1a1d | 1.8 | 1.0 - 2.6 |
I1b | 13.4 | 11.3 - 15.5 |
I1c | 10.3 | 8.4 - 12.2 |
I1c1 | 7.2 | 5.4 - 9.0 |
I1c1a | 4.0 | 2.5 - 5.4 |
I2'3 | 12.6 | 10.4 - 14.7 |
I2 | 6.8 | 6.0 - 7.6 |
I2a | 4.7 | 3.8 - 5.7 |
I2a1 | 3.2 | 2.1 - 4.4 |
I2b | 1.7 | 0.5 - 2.9 |
I2c | 4.7 | 3.6 - 5.8 |
I2d | 3.0 | 1.1 - 4.8 |
I2e | 3.1 | 1.4 - 4.8 |
I3 | 10.6 | 8.8 - 12.4 |
I3a | 7.4 | 6.1 - 8.7 |
I3a1 | 6.1 | 4.7 - 7.5 |
I3b | 2.6 | 1.1 - 4.2 |
I3c | 9.4 | 7.6 - 11.2 |
I4 | 15.1 | 12.3 - 18.0 |
I4a | 6.4 | 5.4 - 7.4 |
I4a1 | 5.7 | 4.5 - 6.7 |
I4b | 8.4 | 5.8 - 10.9 |
I5 | 18.4 | 16.4 - 20.3 |
I5a | 16.0 | 14.0 - 17.9 |
I5a1 | 9.2 | 7.1 - 11.3 |
I5a2 | 12.3 | 10.2 - 14.4 |
I5a2a | 1.6 | 1.0 - 2.1 |
I5a3 | 4.8 | 2.8 - 6.8 |
I5a4 | 5.6 | 3.5 - 7.8 |
I5b | 8.8 | 6.3 - 11.2 |
I6 | 18.4 | 16.2 - 20.6 |
I6a | 5.3 | 3.5 - 7.0 |
I6b | 13.1 | 10.4 - 15.8 |
I7 | 9.1 | 6.3 - 11.9 |
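Because the table above lists the tree as flat rows, the parent of each subclade has to be read off its name. The following Python sketch makes that reading explicit and checks that no subclade is estimated to be older than its parent. It assumes that each extra character in a name marks one level of descent (I1a1 under I1a under I1 under I), which holds for the names in this table but is not a general rule of mtDNA nomenclature. The placement of I under N1a1b and of I2 and I3 under I2'3 follows the text of this article; the function names are hypothetical, and only a subset of the rows is included.

```python
# Ages in thousands of years, copied from a subset of rows in the table above.
AGES = {
    "N1a1b": 28.6, "I": 20.1, "I1": 16.3, "I1a": 11.6, "I1a1": 4.9,
    "I2'3": 12.6, "I2": 6.8, "I3": 10.6, "I3a": 7.4, "I3a1": 6.1,
    "I5": 18.4, "I5a": 16.0,
}

def parent_of(name):
    """Return the parent clade implied by the naming scheme, or None at the top."""
    if name == "N1a1b":
        return None       # top row of the table; its own ancestry is not listed
    if name == "I":
        return "N1a1b"    # stated in the Origin section
    if name == "I2'3":
        return "I"
    if name in ("I2", "I3"):
        return "I2'3"     # I2'3 is described as the common root of I2 and I3
    return name[:-1]      # assumption: one trailing character per level

def lineage(name):
    """List a clade followed by its ancestors, up to the top of the table."""
    path = []
    while name is not None:
        path.append(name)
        name = parent_of(name)
    return path

def ages_consistent(ages):
    """True if no listed clade has a larger age estimate than its listed parent."""
    for clade, age in ages.items():
        parent = parent_of(clade)
        if parent in ages and ages[parent] < age:
            return False
    return True

print(" > ".join(lineage("I3a1")))  # I3a1 > I3a > I3 > I2'3 > I > N1a1b
print(ages_consistent(AGES))        # True
```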
Distribution
I1
It formed during the Last Glacial pre-warming period. It is found mainly in Europe and the Near East, and occasionally in North Africa and the Caucasus. It is the most frequent clade of the haplogroup.
GenBank ID | Population | Source |
JQ702472 | ||
JQ702567 | Germany | |
JQ704077 | Germany | |
JQ705190 | ||
JQ705840 |
I1a
The subclade frequency peaks are mostly located in North-Eastern Europe.
GenBank ID | Population | Source |
- | FamilyTreeDNA | |
Turkey | FamilyTreeDNA | |
Chuvash |
I1a1
GenBank ID | Population | Source |
Portugal | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
Tunisia | ||
- | ||
Czech | ||
Czech | ||
Turkey | ||
Morocco |
I1a1a
GenBank ID | Population | Source |
Finland | ||
Finland | ||
Finland | ||
Finland | ||
Finland | ||
Finland | ||
Finland | ||
Finland | ||
- | ||
- | ||
- | ||
- | ||
- |
I1a1b
GenBank ID | Population | Source |
- | ||
- | ||
- |
I1a1c
GenBank ID | Population | Source |
- | ||
- | ||
Mishar Tatars |
I1a1d
GenBank ID | Population | Source |
- | ||
- |
I1b
GenBank ID | Population | Source |
Caucasian | ||
India | ||
Jewish Diaspora | ||
Armenian | FamilyTreeDNA | |
- | FamilyTreeDNA | |
- | ||
- | ||
Swedish | FamilyTreeDNA |
I1c
GenBank ID | Population | Source |
- | FamilyTreeDNA | |
- | ||
- | ||
- |
I2'3
It is the common root clade for subclades I2 and I3. A sample from Tanzania shares with I2'3 a variant at position 152 relative to the root node of haplogroup I, and this "node 152" could be upstream of the I2'3 clade. Both I2 and I3 might have formed during the Holocene period, and most of their subclades are from Europe, with only a few from the Near East. Examples of this ancestral branch have not been documented.
I2
GenBank ID | Population | Source |
- | FamilyTreeDNA | |
Volga Tatars | ||
- | FamilyTreeDNA | |
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | FamilyTreeDNA | |
Chechnya | ||
Czech | ||
Turkey |
I2a
GenBank ID | Population | Source |
- | FamilyTreeDNA | |
Scotland | FamilyTreeDNA | |
- | ||
- | ||
- | ||
- | FamilyTreeDNA |
I2a1
GenBank ID | Population | Source |
Finland | ||
Ireland | FamilyTreeDNA | |
Ireland | FamilyTreeDNA |
I2b
GenBank ID | Population | Source |
Finland | ||
Finland | ||
Finland | ||
Finland |
I2c
GenBank ID | Population | Source |
- | ||
- | ||
- | ||
- | ||
- |
I2d
GenBank ID | Population | Source |
- | ||
- |
I2e
GenBank ID | Population | Source |
- | ||
- |
I3
GenBank ID | Population | Source |
- | ||
- | ||
- | ||
- | ||
Greece |
I3a
GenBank ID | Population | Source |
France | FamilyTreeDNA | |
- | FamilyTreeDNA | |
- | ||
- | ||
- | ||
- |
I3a1
GenBank ID | Population | Source |
Italy | Bandelt | |
France | FamilyTreeDNA | |
- |
I3b
GenBank ID | Population | Source |
Ireland | FamilyTreeDNA | |
- |
I4
The clade splits into subclades I4a and the newly defined I4b, with samples found in Europe, the Near East and the Caucasus.
GenBank ID | Population | Source |
- | ||
Italy |
I4a
GenBank ID | Population | Source |
Siberia | ||
- | FamilyTreeDNA | |
- | FamilyTreeDNA | |
Armenian | FamilyTreeDNA | |
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- | ||
- |
I5
It is the second most frequent clade of the haplogroup. Its subclades are found in Europe, e.g. I5a1, and the Near East, e.g. I5a2a and I5b.
GenBank ID | Population | Source |
German | FamilyTreeDNA | |
North Ossetia |
I5a
GenBank ID | Population | Source |
Hutterite | ||
- | ||
- | ||
Dubai | ||
Turkey | ||
Yemen | ||
Yemen | ||
Yemen | ||
Yemen | ||
Yemen | ||
Yemen | ||
Yemen |
I5a1
GenBank ID | Population | Source |
Leon | ||
Bedouin | ||
- | ||
- | ||
Italy | ||
Bulgaria |
I6
The subclade is very rare; as of July 2013 it had been found in only four samples from the Near East.
GenBank ID | Population | Source |
Turkey |
I6a
GenBank ID | Population | Source |
- | ||
- |
I7
It is the rarest defined subclade; as of July 2013 it had been found in only two samples, from the Near East and the Caucasus.
GenBank ID | Population | Source |
Armenian | FamilyTreeDNA | |
Kuwait |