Genetic history of North Africa


The genetic history of North Africa has been heavily influenced by geography. The Sahara desert to the south and the Mediterranean Sea to the north were important barriers to gene flow in prehistoric times. However, Africa is connected to Western Asia via the Isthmus of Suez, while at the Straits of Gibraltar North Africa and Europe are separated by only a narrow stretch of sea.
Although North Africa has experienced gene flows from the surrounding regions, it has also experienced long periods of genetic isolation, allowing a distinctive genetic "Berber marker" to evolve in the native Berber people. Today, this "Berber marker" is consistently found in the regions and populations that still predominantly speak the Berber languages, as well as in the Canary Islands, which were inhabited by native Berbers and are home to their descendants to this day. A recent genetic study showed that modern North Africans are genetically similar to Paleolithic North Africans.
Current scientific debate is concerned with determining the relative contributions of different periods of gene flow to the current gene pool of North Africans. Anatomically modern humans are known to have been present in North Africa during the Middle Paleolithic, as attested by Jebel Irhoud 1. Without morphological discontinuity, the Aterian was succeeded by the Iberomaurusian industry, whose lithic assemblages bore relations with the Cro-Magnon cultures. The Iberomaurusian industry was succeeded by the Capsian industry in the eastern part of North Africa.
In the 7th century A.D., part of the Berber countries was invaded by Muslim Umayyad Arabs. During the relatively brief Arab-Umayyad occupation, and with the later arrival of some Bedouin Arabs and Syriacs from the Near East and of some Jews and Muslims fleeing the Spanish Catholic Reconquista, a partial population mixing may have taken place, which may have contributed some genetic diversity among some North Africans. However, this partial fusion of Berbers and foreigners is mostly limited geographically to some of the main Berber urban areas and some coastal plains of North Africa, because migrants and refugees have tended since ancient times to gravitate towards major cities and to avoid the heartland. The Berber ethnic and genetic character of North Africa is still dominant, either prominently or subtly.

Y-chromosome

Haplogroup E is the most common paternal haplogroup among Berbers. It represents up to 100 percent of Y-chromosomes among some Berber populations. Haplogroup E is thought to have emerged in prehistoric North Africa or East Africa, and would have later dispersed into West Asia. The major subclades of haplogroup E found amongst Berbers belong to E-Z827, which is believed to have emerged in North Africa. Common subclades include E1b1b1a, E1b1b1b and E1b1b1*. E1b1b1b is distributed along a west-to-east cline, with frequencies that can reach as high as 100 percent in Northwest Africa. E1b1b1a has been observed at low to moderate frequencies among Berber populations, with significantly higher frequencies observed in Northeast Africa than in Northwest Africa.
West Eurasian haplogroups, such as Haplogroup J and Haplogroup R1, have also been observed at moderate frequencies. A thorough study by Arredi et al., which analyzed populations from Algeria, concludes that the North African pattern of Y-chromosomal variation is largely of Neolithic origin, which suggests that the Neolithic transition in this part of the world was accompanied by demic diffusion of Berber-speaking pastoralists from the Middle East. However, Loosdrecht et al. 2018 demonstrated that E1b1b is most likely indigenous to North Africa and migrated from North Africa to the Near East during the Paleolithic.

E1b1b1b (E-M81); formerly E3b1b, E3b2

E1b1b1b (E-M81) is the most common Y-chromosome haplogroup in North Africa, dominated by its sub-clade E-M183. It is thought to have originated in North Africa 5,600 years ago. The parent clade, E1b1b, originated in East Africa. Colloquially referred to as the Berber marker or Maghrebi marker for its prevalence among Mozabite, Middle Atlas, and other Berber-speaking groups, E-M81 is also quite common among North African groups. It reaches frequencies of up to 90 percent in some parts of the Maghreb. This includes the Saharawi, among whose men approximately 76 percent are reported to be M81+.
This haplogroup is also found at high levels in the Canary Islands and in Sicily, parts of Andalusia and Southern Portugal, as well as at much lower levels in Southern Italy and Provence. In Andalusia, unlike the rest of Europe, it is generally more common than E1b1b1a. As a result, E-M81 is also found in parts of Latin America and among men of Hispanic and Italian origin in the USA. As an exceptional case in Europe, this sub-clade of E1b1b1 has also been observed at 40 percent among the Pasiegos from Cantabria.
In smaller numbers, E-M81 men can be found in Sudan, Cyprus and among Sephardic Jews.
There are two recognized sub-clades, although one is much more common than the other.

Mitochondrial DNA

Individuals receive mtDNA only from their mothers. According to Macaulay et al. 1999, "one-third of Mozabite Berber mtDNAs have a Near Eastern ancestry, probably having arrived in North Africa less than 50,000 years ago, and one-eighth have an origin in sub-Saharan Africa. Europe appears to be the source of many of the remaining sequences, with the rest having arisen either in Europe or in the Near East". Maca-Meyer et al. 2003 analyze the "autochthonous North African lineage U6" in mtDNA.
A genetic study by Fadhlaoui-Zid et al. 2004 argues, concerning certain exclusively North African haplotypes, that "expansion of this group of lineages took place around 10,500 years ago in North Africa, and spread to neighbouring population". It also reports that a specific Northwestern African haplogroup, U6, which probably originated in the Near East 30,000 years ago, accounts for 28 percent of lineages in Mozabites and 18 percent in Kabyles, but only 6-8 percent in the southern Moroccan Berbers. Rando et al. 1998 "detected female-mediated gene flow from sub-Saharan Africa to NW Africa" amounting to as much as 21.5 percent of the mtDNA sequences in a sample of NW African populations; the amount varied from 82 percent in Tuaregs to less than 3 percent in Riffians in the north of Morocco. This north-south gradient in the sub-Saharan contribution to the gene pool is supported by Esteban et al.
Nevertheless, individual Berber communities display considerable mtDNA heterogeneity among themselves. The Berbers of Jerba Island, located in southeastern Tunisia, display an 87 percent West Eurasian contribution with no U6 haplotypes, while the Kesra of Tunisia, for example, display a much higher proportion of typical sub-Saharan mtDNA haplotypes than the Zriba. According to the article, "The North African patchy mtDNA landscape has no parallel in other regions of the world and increasing the number of sampled populations has not been accompanied by any substantial increase in our understanding of its phylogeography. Available data up to now rely on sampling small, scattered populations, although they are carefully characterized in terms of their ethnic, linguistic, and historical backgrounds. It is therefore doubtful that this picture truly represents the complex historical demography of the region rather than being just the result of the type of samplings performed so far."
A 2005 study discovered a close mitochondrial link between Berbers and the Uralic-speaking Saami of northern Scandinavia. It argues that Southwestern Europe and North Africa were the source of late-glacial expansions of hunter-gatherers that repopulated Northern Europe after a retreat south during the Last Glacial Maximum, and it reveals a direct maternal link between those European hunter-gatherer populations and the Berbers. With regard to Mozabite Berbers, "one-third of Mozabite Berber mtDNAs have a Near Eastern ancestry, probably having arrived in North Africa ∼50,000 years ago, and one-eighth have an origin in sub-Saharan Africa. Europe appears to be the source of many of the remaining sequences, with the rest having arisen either in Europe or in the Near East."
According to the most recent and thorough study on Berber mtDNA, by Coudray et al. 2008, which analysed 614 individuals from 10 different regions (including Algeria, Tunisia and Egypt), the results may be summarized as follows:
The Berber mitochondrial pool is characterized by an overall high frequency of Western Eurasian haplogroups, a markedly lower frequency of sub-Saharan L lineages, and a significant presence of North African haplogroups U6 and M1.
There is a degree of dispute about when and how the minority sub-Saharan L haplogroups entered the North African gene pool. Some papers suggest that the distribution of the main L haplogroups in North Africa was mainly due to the trans-Saharan slave trade, as espoused by Harich et al. in a study conducted in 2010. However, in September 2010, a study of Berber mtDNA by Frigi et al. concluded that many of the L haplogroups were much older and had been introduced by an ancient African gene flow around 20,000 years ago.

Autosomal DNA

On 13 January 2012, an exhaustive genetic study of North Africa's human populations was published in PLoS Genetics. It was undertaken jointly by researchers at the Evolutionary Biology Institute and Stanford University, among other institutions.
The study reveals that the genetic composition of North Africa's human populations is extremely complex, and the result of a local component dating back thirteen thousand years to approximately 11,000 BCE and the varied genetic influence of neighbouring populations on North African groups during successive migrations. According to David Comas, coordinator of the study and researcher at the Institute for Evolutionary Biology, "some of the questions we wanted to answer were whether today's inhabitants are direct descendants of the populations with the oldest archaeological remains in the region, dating back fifty thousand years, or whether they are descendants of the Neolithic populations in the Middle East, which introduced agriculture to the region around eight thousand years ago. We also wondered if there had been any genetic exchange between the North African populations and the neighbouring regions and if so, when these took place".
To answer these questions, the researchers analyzed around 800,000 genetic markers, distributed throughout the entire genome in 125 North African individuals belonging to seven representative populations in the whole region, and the information obtained was compared with the information from the neighbouring populations.
The results of this study show that there is a native genetic component that defines North Africans. In-depth study of these markers shows that the people inhabiting North Africa today are not descendants of the earliest occupants of this region fifty thousand years ago; rather, their ancestors were a group of populations that settled in the region around thirteen thousand years ago. Furthermore, this local North African genetic component is very different from the one found in the populations of Sub-Saharan Africa, which shows that the ancestors of today's North Africans were members of a subgroup of humanity who left Africa to conquer the rest of the world and who subsequently returned to the north of the continent to settle in the region.
As well as this local component, North African populations were also observed to share genetic markers with all the neighbouring regions, as a result of more recent migrations, although these appear in different proportions.
There is an influence from the Middle East, which becomes less marked as the distance from the Levant and Arabian Peninsula increases. There are similar proportions of European influence in all North African populations, and some populations even include individuals whose genomes show a large proportion of influence from south of the Sahara.
A 2015 study by Dobon et al. identified an ancestral autosomal component of West Eurasian origin that is common to many modern Afro-Asiatic-speaking populations in Northeast Africa. Known as the Coptic component, it peaks among Egyptian Copts, including those who settled in Sudan over the past two centuries. The Coptic component evolved out of a main North African and Middle Eastern ancestral component that is shared by other Egyptians and also found at high frequencies among other Afro-Asiatic populations in Northeast Africa. The scientists suggest that this points to a common origin for the general population of Egypt. They also associate the Coptic component with Ancient Egyptian ancestry, without the later Medieval Era Arabian influence that is present among other Egyptians.
According to a paper published in 2017, most genetic studies of North African populations agree that there is a limited correlation between genetics and geography, and show high population heterogeneity in the region. North African populations have been described as a mosaic of North African, Middle Eastern, European and sub-Saharan ancestries. This explains the current genetic structure in North Africa, characterized by diverse and heterogeneous populations, and why nearby populations inhabiting the same location might be genetically more distant from each other than from geographically distant populations.
A recent genetic study published in the European Journal of Human Genetics showed that Northern Africans are closely related to Europeans and West Asians as well as to Southwest Asians. Northern Africans can clearly be distinguished from West Africans and other African populations dwelling south of the Sahara.
According to a genetic study published in 2019, North African populations are the result of admixture following extensive gene flow from four different geographical and temporal sources.
Population        Indigenous North African   European   Middle Eastern   Sub-Saharan
Saharawi          37%                        34%        18%              11%
Moroccan          30%                        38%        19%              14%
Berber-Moroccan   28%                        47%        17%              8%
Berber-Mozabite   26%                        43%        18%              13%
Algerian          22%                        46%        17%              15%
Berber-Zenata     22%                        27%        12%              39%
Libyan            22%                        34%        35%              9%
Berber-Tunisian   21%                        43%        26%              10%
Tunisian          18%                        44%        25%              13%
Egyptian          11%                        41%        38%              10%
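
As a reading aid for the table above, the following minimal Python sketch transcribes the reported percentages and derives a few simple summaries. It assumes the column order given in the table header (indigenous North African, European, Middle Eastern, sub-Saharan) and uses the spelling "Libyan". The function names are illustrative and are not taken from the 2019 study.

```python
# Ancestry proportions (percent) transcribed from the table above.
# Assumed column order: indigenous North African, European,
# Middle Eastern, sub-Saharan.
COMPONENTS = ("North African", "European", "Middle Eastern", "Sub-Saharan")

ADMIXTURE = {
    "Saharawi": (37, 34, 18, 11),
    "Moroccan": (30, 38, 19, 14),
    "Berber-Moroccan": (28, 47, 17, 8),
    "Berber-Mozabite": (26, 43, 18, 13),
    "Algerian": (22, 46, 17, 15),
    "Berber-Zenata": (22, 27, 12, 39),
    "Libyan": (22, 34, 35, 9),
    "Berber-Tunisian": (21, 43, 26, 10),
    "Tunisian": (18, 44, 25, 13),
    "Egyptian": (11, 41, 38, 10),
}


def row_totals(table):
    """Sum the four components for each population (a transcription check)."""
    return {pop: sum(values) for pop, values in table.items()}


def largest_component(table):
    """Name the component with the highest percentage in each population."""
    return {
        pop: COMPONENTS[values.index(max(values))]
        for pop, values in table.items()
    }


def rank_by(table, component):
    """List populations from highest to lowest share of one component."""
    idx = COMPONENTS.index(component)
    return sorted(table, key=lambda pop: table[pop][idx], reverse=True)


if __name__ == "__main__":
    for pop, total in row_totals(ADMIXTURE).items():
        print(f"{pop}: total {total}%, largest = {largest_component(ADMIXTURE)[pop]}")
    print("By Middle Eastern share:", rank_by(ADMIXTURE, "Middle Eastern"))
```

Every row sums to 100 percent except the Moroccan row, which sums to 101, presumably because of rounding in the reported figures. Ranking by the Middle Eastern column places the Egyptian and Libyan samples first, which is consistent with the west-to-east pattern described earlier in this section.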

Genetic influence

Y-chromosome DNA

The general parent Y-chromosome Haplogroup E1b1b, which might have originated in the Horn of Africa or the Near East, is by far the most common clade in North and Northeast Africa and is found in select populations in Europe, particularly in the Mediterranean and Southeastern Europe. In Europe, E1b1b reaches Greece and the Balkan region, but its frequency there is not as high as among African populations.
A study by Semino showed that Y-chromosome haplotype E1b1b1b is specific to North African populations and almost absent in Europe except in Iberia and Sicily. Another 2004 study showed that E1b1b1b is present, albeit at low levels, throughout Southern Europe.
The findings of this latter study contradict a more thorough Y-chromosome analysis of the Iberian Peninsula, according to which haplogroup E1b1b1b surpasses frequencies of 10 percent in Southern Spain. The study points only to a very limited influence from Northern Africa and the Middle East in Iberia, both in historic and prehistoric times. The absence of microsatellite variation suggests a very recent arrival from North Africa, consistent with historical exchanges across the Mediterranean during the period of Islamic expansion, namely of Berber populations. However, a study restricted to Portugal, concerning Y-chromosome lineages, revealed that "The mtDNA and Y-DNA data indicate that the presence of Berbers in that region dates clearly prior to the Moorish expansion in 711 AD, so it's not recent there at all.... Our data indicate that male Berbers, unlike sub-Saharan immigrants, constituted long-lasting and continuous community in the country".
A wide-ranging study using 6,501 unrelated Y-chromosome samples from 81 populations found that: "Considering both these E-M78 sub-haplogroups and the E-M81 haplogroup, the contribution of northern African lineages to the entire male gene pool of Iberia, continental Italy and Sicily can be estimated as 5.6 percent, 3.6 percent and 6.6 percent, respectively." It has also been argued that the European distribution of E-M78 and its sub-clades is compatible with the Neolithic demic diffusion of agriculture, but may also date in part to at least the Mesolithic. For example, it has been estimated that E-M78 has been in Europe for longer than 10,000 years. In support of this theory, human remains excavated in a Spanish funeral cave dating from approximately 7,000 years ago were shown to be in this haplogroup. More recently, two E-M78 individuals have been found in the Neolithic Sopot and Lengyel cultures from the same period.
High-resolution analysis of human Y-chromosome variation shows a sharp discontinuity and limited gene flow between northwestern Africa and the Iberian Peninsula, a finding that seems to be supported by the most recent studies.
A 2008 study of Sicily by Gaetano et al. found that "The Hg E3b1b-M81, widely diffused in northwestern African populations, is estimated to contribute to the Sicilian gene pool at a rate of 6 percent."
According to the most recent and thorough study of Iberia, by Adams et al. 2008, which analysed 1,140 unrelated Y-chromosome samples in Iberia, the contribution of northern African lineages to the entire male gene pool of Iberia is limited: "mean North African admixture is just 10.6 percent, with wide geographical variation, ranging from zero in Gascony to 21.7 percent in Northwest Castile".
More recent and extensive studies, such as "The Geography of Recent Genetic Ancestry across Europe", determined that Italians and Iberians share very few common ancestors with other populations over at least the last 2,500 years, unlike all the other European populations in the study. The North African contribution to both peninsulas is therefore very likely limited, often consists of ancient haplogroups, and in many cases has a geographic distribution that is not compatible with the Moorish invasion.

Mitochondrial DNA

Genetic studies on Iberian populations also show that North African mitochondrial DNA sequences and sub-Saharan sequences, although present at only low levels, are still at higher levels than those generally observed elsewhere in Europe. However, most of the L mtDNA found in minor amounts in Iberia is very likely pre-Neolithic in origin, as demonstrated by María Cerezo et al. The same applies to U6, which also has a very old presence in Iberia: Iberia has a great diversity of lineages from this haplogroup, it has been found in some local hunter-gatherer remains, and its local geographic distribution is in many cases not compatible with the area of Moorish occupation. Haplogroup U6 has also been detected in Sicily and Southern Italy at much lower frequencies. It is also a characteristic genetic marker of the Saami populations of Northern Scandinavia.
It is difficult to ascertain that U6's presence is the consequence of Islam's expansion into Europe during the Middle Ages, particularly because it is more frequent in the west of the Iberian Peninsula than in the east. In smaller numbers it is also attested in the British Isles, again on their northern and western borders. It may be a trace of a prehistoric Neolithic/Megalithic/Mesolithic or even Upper Paleolithic expansion along the Atlantic coasts from North Africa or the Iberian Peninsula, perhaps in conjunction with seaborne trade, although an alternative but less likely explanation would attribute this distribution in Northern Britain to the Roman period. One subclade of U6 is particularly common among Canarian Spaniards as a result of native Guanche ancestry.

Ancient DNA

In 2013, Nature announced the publication of the first genetic study utilizing next-generation sequencing to ascertain the ancestral lineage of an Ancient Egyptian individual. The research was led by Carsten Pusch of the University of Tübingen in Germany and Rabab Khairat, who released their findings in the Journal of Applied Genetics. DNA was extracted from the heads of five Egyptian mummies that were housed at the institution. All the specimens were dated to between 806 BC and 124 AD, a timeframe corresponding with the Late Dynastic and Ptolemaic periods. The researchers observed that one of the mummified individuals likely belonged to the mtDNA haplogroup I2, a maternal clade that is believed to have originated in Western Asia.
In 2013, Iberomaurusian skeletons from the prehistoric sites of Taforalt and Afalou in the Maghreb were analyzed for ancient DNA. All of the specimens belonged to maternal clades associated with either North Africa or the northern and southern Mediterranean littoral, indicating gene flow between these areas since the Epipaleolithic. The ancient Taforalt individuals carried the mtDNA haplogroups U6, H, JT and V, which points to population continuity in the region dating from the Iberomaurusian period.
The E1b1b-M81, R-M269, and E-M132/E1a paternal haplogroups have been found in ancient Guanche fossils excavated in Punta Azul, El Hierro, Canary Islands, which are dated to the 10th century. Maternally, the specimens all belong to the H1 clade. These locally born individuals carried the H1-16260 haplotype, which is exclusive to the Canary Islands and Algeria. Analysis of their autosomal STRs indicates that the Guanches of the Canary Islands were most closely related to Moroccan Berbers.
In 2018, DNA analysis of Later Stone Age individuals from the site of Taforalt and Early Neolithic Moroccans from the site of Ifri N' Ammar revealed that they were related to modern North Africans and carried Y-DNA E-M35, E-M215*, E-L19*, and E-M78*. These studies confirmed a long-term genetic continuity in the region, showing that Mesolithic Moroccans are similar to Later Stone Age individuals from the same region and possess an endemic component retained in present-day Maghrebi populations.