We report genome-wide SNP data for nine individuals from the ST site Rostovka (ROT), new data for two BOO individuals, and shotgun genome data for five already published BOO individuals (Table 1). We performed 1240k SNP22,23 and mitochondrial genome captures on the nine ROT individuals and the two new BOO individuals, as well as Y-chromosomal capture24 on the males only. Lastly, we generated shotgun sequence data for five published BOO individuals, including one individual sequenced to 40× coverage (Fig. 1a, Table 1, Supplementary Data 1). Of the newly analyzed individuals, eight ROT individuals were genetically male and one was female, while both new BOO individuals were female. Biological relatedness among the newly reported individuals was estimated using READ25, Pairwise Mismatch Rate (PMR), KIN26, and lcMLkin27 (Supplementary Data 2–5). Based on consistent results across these analyses, we identified a pair of second-degree relatives (ROT011 and ROT015), both of whom are males carrying Y-haplogroup C2a. They could represent either a grandson/grandparent pair, a nephew/uncle pair, or paternal half-siblings, consistent with the overlapping radiocarbon dates for both individuals (Table 1). A second-degree related pair was also found among the BOO individuals (BOO004-BOO005).
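The pairwise mismatch rate mentioned above can be illustrated with a minimal sketch. It assumes pseudo-haploid calls (one allele per SNP, or `None` where an individual has no coverage); the function names are hypothetical and are not taken from READ or any other tool. The helper `expected_pmr` encodes the usual expectation that the mismatch rate of a related pair is the baseline rate of unrelated pairs scaled by one minus the kinship coefficient (1/8 for second-degree relatives), which is an assumption of this sketch rather than a description of the exact procedure used here.

```python
def pairwise_mismatch_rate(calls_a, calls_b):
    """Return (PMR, number of overlapping SNPs) for two pseudo-haploid call lists.

    Each list holds one allele per SNP (e.g. 'A', 'C', 'G', 'T'),
    or None where the individual has no coverage at that SNP.
    """
    overlap = 0
    mismatches = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # only SNPs covered in both individuals are compared
        overlap += 1
        if a != b:
            mismatches += 1
    if overlap == 0:
        raise ValueError("no overlapping SNPs between the two individuals")
    return mismatches / overlap, overlap


def expected_pmr(baseline, kinship):
    """Expected PMR of a related pair (assumption: baseline * (1 - kinship)).

    baseline: PMR of unrelated pairs from the same population.
    kinship:  kinship coefficient (0.5 identical, 0.25 first degree,
              0.125 second degree, 0 unrelated).
    """
    return baseline * (1.0 - kinship)
```

In practice, the observed PMR of a pair is compared with the baseline estimated from presumably unrelated pairs of the same group, and the number of overlapping SNPs indicates how much weight the estimate can bear.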
We generated a radiocarbon date for individual BOO004, whose genome was shotgun sequenced to 40× coverage (Table 1). The radiocarbon date (MAMS-57646) was determined to be 3351 ± 25 BP, or 1735–1538 calBC (2σ) after calibration with OxCal 4.428, and 1504–1220 calBC (2σ) when correcting for a potential freshwater reservoir effect using the Marine 20 curve28. The corrected date is an approximation because we do not know the extent of fish consumption in the BOO individuals.
We used smartPCA29 to perform a principal component analysis (PCA) of modern-day reference populations from Eurasia and the Americas, onto which the ROT and the BOO individuals were projected (Fig. 2a, b). When assessing the genetic structure of Eurasian populations, plotting PC1 vs PC2 (Fig. 2b) allows us to separate west and east Eurasian populations from the Native American groups, while plotting PC1 vs PC3 (Fig. 2a) distinguishes the major Eurasian ecological zones30,31. When plotting PC1 vs PC2, the ANE ancestry cline becomes apparent, including individuals from Afontova Gora, Malta1, Botai, West Siberian hunter-gatherers (WSHG), and others. ROT individuals vary along the ANE ancestry cline, while the BOO individuals form a tight cluster within the variation seen at ROT. On the Eurasian PCA (PC1 vs PC3), ROT and BOO individuals fall mainly along a genetic cline of present-day populations that occupy the ecological forest-tundra zone (after Jeong et al.31; Fig. 2a), which coincides with the distribution of modern-day Uralic-speaking groups and represents the Siberian ancestry variation. BOO individuals form a tighter and more homogeneous cluster in the middle of the cline between Eastern_Siberia_LNBA and the EEHG, which can be seen in both the PCA and the ADMIXTURE analyses, in line with what has been previously reported17. By contrast, the ROT individuals are genetically more heterogeneous and are spread over a triangle (Fig. 2b) between the Western Steppe Middle to Late Bronze Age cluster (e.g., Sintashta_MLBA32), Eastern_Siberia_LNBA, and WSHG individuals, which is also visible in the results from unsupervised ADMIXTURE (k = 10) (Fig. 2c, Supplementary Fig. 4).
a Principal component analysis plot with newly typed (colored symbols with black outline) and published (no outline) ancient individuals projected onto modern variation calculated using modern Eurasian and North American populations from AADR v44.371. Modern populations are shown as gray circles and modern Uralic-speaking groups as open circles. Ancient reference individuals are listed under "Published ancient data", and the new individuals are listed under "This study". PC1 vs PC3 are plotted, which reveals three genetic clines (labeled in italics) between Western and Eastern Eurasian populations; b PCA results for PC1 vs PC2; c Unsupervised ADMIXTURE results (k = 10) of a representative subset of the relevant populations and sample names shown in the PCA plot. WSHG West-Siberian Hunter-Gatherers, EEHG Eastern European Hunter-Gatherers, WHG Western Hunter-Gatherers, LNBA Late Neolithic/Bronze Age, MLBA Middle/Late Bronze Age.
We performed Y-haplogroup (Y-hg) typing of the ROT males using the YMCA method24 (Table 1) and identified two individuals who carry Y-hg R1a (ROT003: R1a-M417 and ROT016: R1a-Z645), one of the most widely distributed Y-hgs in Eurasia33. However, both individuals could be R1a-Z645, since ROT003 does not have coverage on either ancestral or derived ISOGG list SNPs downstream of R1a-M417. Generally, due to their geographic distribution, these R1a sub-lineages are thought to represent the eastward movement of Corded Ware- and Fatyanovo-associated groups. ROT002, the individual with the highest proportion of north Siberian ancestry, was assigned to Y-hg N1a (N-L392). This Y-hg has also been found in two BOO individuals17. Lineage N-L392 is one of the most common in present-day Uralic populations, which highlights the potential importance of Y-hg N-L392 in the dissemination of proto-Uralic. One of the male individuals (ROT004) was assigned to haplogroup Q1b (Q-M346), which is found throughout Asia, including in several Turkic-speaking populations, e.g., Tuvinians, Todjins, Altaians, Sojots, and the Mongolian-speaking Kalmyk population34. ROT017 carries Y-hg Q1b (Q-L53), which is also common among present-day Turkic speakers across Eurasia. The branch Q-YP4004 includes Central Asian Q-L53(xL54) lineages and one ancient Native American individual from Lovelock Cave in Nevada35, while the oldest Q-L53 individual is irk040 (Baikal Neolithic, 4846 BP)36. The lineage C2a-L1373, carried by ROT011, is found at high frequency in Central Asian populations, North Asia, and the Americas. Lastly, ROT006 carries Y-hg R1b (R1b-M73), a sister clade of R1b-M269, which is common in the Caucasus, Siberia, Mongolia, and Central Asia today34. Overall, the Y-hg lineage diversity of male ROT individuals is consistent with the heterogeneous nature of the ST37.
We also identified a large diversity in the mitochondrial haplogroups (mt-hg) among ROT (Table 1), including mt-hgs that are found commonly in east Eurasia (A10, C1, C4, G2a1)38,39,40,41 and in west Eurasia (H1, H101, U5a, R1b, R1a)42,43. Consistently, the individual ROT002, with the highest affinity to Siberia_LNBA and carrying the Y-hg N-L392, also carries mt-hg G2a1, commonly found in Eastern Eurasia. Analogously, individual ROT003, who carries Sintashta_MLBA-like ancestry and the Corded Ware-derived Y-hg R1a1a1, is also a carrier of the R1a1a mt-hg commonly found in west Eurasia.
We used F-statistics44 to formally assess the relationship of the ROT and BOO individuals with each other, and with different modern and ancient reference individuals and populations. First, we performed outgroup f3-statistics of the form f3(Mbuti; test, modern) to test for the affinity of each ROT and BOO individual with modern world-wide populations (Supplementary Fig. 5, Supplementary Data 6). The f3-statistics results mirror the distribution of the samples in the PCA and ADMIXTURE analyses, wherein the individuals with higher proportions of Eastern_Siberia_LNBA ancestry (e.g., ROT002) show a greater affinity to modern-day Siberian and Uralic-speaking populations, such as Nganasan, Evenk, Negidal, Nanai, and Ulchi (Supplementary Fig. 5A), whereas the individuals with more Sintashta-like Western_Steppe_MLBA ancestry (e.g., ROT003) are closer to modern-day (North) Europeans, including Norwegian, Belarusian, Lithuanian, Scottish, and Icelandic individuals (Supplementary Fig. 5B). Comparisons with ancient groups/individuals using f3(Mbuti; test, ancient) showed a similar trend (Supplementary Fig. 5). ROT002, on the eastern end of the Eurasian cline, shares more genetic drift with Eastern_Siberia_LNBA, Russia Ust Belaya Neolithic, and Mongolia Early Iron Age individuals (Supplementary Fig. 5A). By contrast, ROT003, the westernmost individual in the Eurasian PCA space, has the highest affinity to Lithuania early Middle Neolithic Narva, Russia Sintashta, Kazakhstan Georgievsky Middle Bronze Age, Russia Poltavka, and Serbia Mesolithic individuals (Supplementary Fig. 5B). Similar trends can be observed for BOO, wherein the modern Uralic-speaking populations, such as Nganasan and Selkup, are among the tests with the highest f3-statistics. The ancient individuals most closely related to BOO are EEHG, WSHG, Botai, and Tarim Early/Middle Bronze Age (EMBA) individuals carrying high levels of ANE ancestry (Supplementary Fig. 5J–R).
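For readers unfamiliar with the statistic, an outgroup f3 of the form f3(Mbuti; test, modern) can be sketched as the average, over SNPs, of the product of allele-frequency differences between the outgroup and each of the two other populations. The function below is a hypothetical illustration under simplifying assumptions: it takes plain per-SNP allele frequencies, skips SNPs with missing data, and omits the finite-sample corrections and standard-error estimation that dedicated software typically applies.

```python
def outgroup_f3(freq_outgroup, freq_a, freq_b):
    """Naive outgroup f3(O; A, B): mean of (o - a) * (o - b) over SNPs.

    Each argument is a list of per-SNP allele frequencies in [0, 1],
    with None marking SNPs that have no data for that population.
    Larger values indicate more drift shared by A and B since their
    split from the outgroup O.
    """
    total = 0.0
    used = 0
    for o, a, b in zip(freq_outgroup, freq_a, freq_b):
        if o is None or a is None or b is None:
            continue  # use only SNPs observed in all three populations
        total += (o - a) * (o - b)
        used += 1
    if used == 0:
        raise ValueError("no SNPs with data in all three populations")
    return total / used
```

Ranking reference populations by this value for a fixed test individual gives the kind of affinity ordering described above.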
Based on the geographic location of the sites, we tested whether ROT and BOO individuals retained more local ANE ancestry compared to contemporaneous groups from a similar general geographic area, time period, and archeological affiliation, using f4-statistics of the form f4(X, test; WSHG, Mbuti), where X stands for ROT and BOO individuals, and test populations include Okunevo, Tarim_EMBA_1, Sintashta_MLBA, and Eastern_Siberia_LNBA (Fig. 3). This test allowed us to identify groups that form a clade with ROT and BOO, and cases where ROT and BOO may have additional affinity to ANE ancestry, represented here by WSHG from Russia as the best spatial and temporal proxy. We find that ROT and BOO individuals carry excess affinity to ANE when compared to Eastern_Siberia_LNBA (Fig. 3a) and Russia MLBA Sintashta (Fig. 3c), except for ROT002 and ROT003. All BOO individuals are symmetrically related to the Okunevo Bronze Age group, indicating no additional affinity to ANE (Fig. 3b). However, we see more heterogeneity in ROT, with some individuals having significantly more, and others significantly less, genetic affinity to WSHG compared to Okunevo (Fig. 3b). All but one individual (ROT013) have significantly less ANE ancestry compared to Tarim EMBA (Fig. 3d). The general observations from f4-statistics formally confirm the PCA results (Fig. 2), where ROT individuals vary in their location with regard to WSHG, i.e., ANE ancestry affinity, while the BOO individuals are more homogeneous.
f4-statistics testing for excess WSHG ancestry in ROT and BOO individuals with respect to a Yakutia Lena 4780-2490 (Siberia_LNBA), b Okunevo, c Russia MLBA Sintashta, and d Tarim EMBA1. Significantly non-zero f4-statistics (|Z| > 3) are shown in color, and non-significant f4-statistics are shown in gray. All error bars indicate 3 standard errors. X denotes the individuals given on the y-axis.
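The logic of an f4-statistic of the form f4(X, test; WSHG, Mbuti) and its |Z| > 3 significance threshold can be sketched as follows. This is a simplified, hypothetical illustration: it works on per-SNP allele frequencies and uses a delete-one jackknife over equal-sized blocks of consecutive SNPs, whereas software such as ADMIXTOOLS uses a weighted block jackknife over contiguous genomic blocks. The function name and the `n_blocks` parameter are assumptions of this sketch.

```python
import math


def f4_with_z(freq_x, freq_test, freq_wshg, freq_mbuti, n_blocks):
    """Return (f4, standard error, Z) for f4(X, test; WSHG, Mbuti).

    f4 is the mean over SNPs of (x - test) * (wshg - mbuti). The standard
    error comes from a delete-one jackknife over n_blocks equal-sized
    blocks of consecutive SNPs (trailing SNPs that do not fill a block
    are dropped). Z is None when the standard error is zero.
    """
    products = [
        (x - t) * (w - m)
        for x, t, w, m in zip(freq_x, freq_test, freq_wshg, freq_mbuti)
    ]
    block_size = len(products) // n_blocks if n_blocks > 0 else 0
    if n_blocks < 2 or block_size == 0:
        raise ValueError("need at least two non-empty blocks")

    used = products[: block_size * n_blocks]
    total = sum(used)
    estimate = total / len(used)

    # Leave-one-block-out estimates of f4.
    loo = []
    for i in range(n_blocks):
        block_sum = sum(used[i * block_size:(i + 1) * block_size])
        loo.append((total - block_sum) / (len(used) - block_size))

    mean_loo = sum(loo) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((v - mean_loo) ** 2 for v in loo)
    std_err = math.sqrt(variance)
    z = estimate / std_err if std_err > 0 else None
    return estimate, std_err, z
```

Under this convention, a significantly positive value indicates that X shares more alleles with WSHG than the test population does, a significantly negative value indicates the opposite, and a value not significantly different from zero is consistent with X and the test population being symmetrically related to WSHG.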
The genetic profile of BOO individuals is intriguing when compared to present-day individuals of the same geographic area of Scandinavia and western Russia (Fig. 2). However, the cultural affiliation of the BOO individuals remains poorly understood. Based on pairwise outgroup-f3-statistics with different ancient populations from Scandinavia, Anatolia_N, and Sintashta_MLBA, the BOO and ROT individuals separate from the rest of the ancient populations (Supplementary Fig. 6). The f3- and f4-statistics together show a non-local genetic origin for the BOO individuals, with no substantial levels of early European farmer ancestry, which thus excludes contact with contemporary Scandinavian groups and a genetic contribution to subsequent ones.
Lastly, we performed qpAdm analysis to formally test for and quantify the admixture proportions of potential source populations for ROT and BOO individuals (Fig. 4, Supplementary Data 7). Here, we successfully modeled the ROT individuals as a mix of three sources (Eastern_Siberia_LNBA, Sintashta_MLBA, and WSHG), with two exceptions: ROT002, which we modeled instead as a two-source mixture of mainly Eastern_Siberia_LNBA ancestry and a smaller proportion of EEHG-like ancestry that could be represented by either Sintashta_MLBA, WSHG, or EEHG, and ROT003, which we modeled with Sintashta_MLBA as a single source (Fig. 4b). We also tested whether ROT individuals could be modeled as a two-way mixture of the Eastern_Siberia_LNBA ancestry and either Sintashta_MLBA or WSHG as sources; however, this combination of ancestries did not result in consistently plausible model fits compared to the combination of all three ancestries (Fig. 4a–c). By contrast, BOO individuals could not be modeled using either the combination of all three ancestry sources (Eastern_Siberia_LNBA, Sintashta_MLBA, and WSHG) or just a two-way mixture (Fig. 4a, c, Supplementary Data 7). However, replacing WSHG with EEHG as the putative local hunter-gatherer ancestry substrate and using Eastern_Siberia_LNBA as a second source provided good model fits (Fig. 4d, Supplementary Data 8). Importantly, all BOO individuals, except for BOO001, could also be modeled as a mixture of ROT002 and EEHG (Fig. 4e, f, Supplementary Data 8), suggesting, together with the results from the outgroup f3-statistics (Supplementary Fig. 6), that BOO individuals may represent a subset of the diversity present in ROT.
a qpAdm models using Eastern Siberia LNBA, Russia MLBA Sintashta, and WSHG as sources; b Models with Eastern Siberia LNBA and Sintashta as sources; c Models with Eastern Siberia LNBA and WSHG as sources; d Models with Eastern Siberia LNBA and EEHG as sources; e Models with Eastern Siberia LNBA and EEHG; f Models with ROT002 and EEHG. Corresponding p-values for each analysis are shown to the right of each row. Models with p-values < 0.05 are grayed out, and the models with negative ancestry proportions are indicated as "Not feasible".
To investigate distant biological relatedness among the BOO individuals, we first imputed the genomes using GLIMPSE45 with the 1000G dataset46 as a reference panel (ROT individuals are below the required coverage threshold for imputation). Based on the identification of haplotype blocks of certain lengths that are shared between individuals, i.e., identical by descent47, we confirmed the second-degree related pair identified above (BOO004-BOO005), and also found two third-degree related pairs (BOO003-BOO004 and BOO003-BOO005), as well as multiple pairs potentially related in the fourth to fifth degree (Supplementary Data 9). The observation that the BOO individuals are distantly related to each other explains the relative homogeneity seen in the sample compared to ROT. According to the archeological context, two pairs of biologically related individuals were buried in the same grave: the third-degree related pair BOO003 (burial 16, sepulture 1, female) and BOO004 (burial 16, sepulture 3, male), and one fourth/fifth-degree related pair, BOO005 (burial 17, sepulture 3, female) and BOO009 (burial 17, sepulture 4, female)18.
We also tested for IBD sharing between BOO and published individuals who are broadly contemporaneous and geographically close, including Tarim_EMBA48, Okunevo42, Sintashta_MLBA32, EEHG49, Botai42, Yamnaya42, Eastern_Siberia_LNBA36, and others (Fig. 5a, Supplementary Data 9). We found three shared IBD fragments (14–22 cM) between BOO individuals and Sintashta_MLBA individuals (Supplementary Data 9), potentially suggesting shared ancestors as recent as approximately 500–750 years, and most likely reflecting the shared EEHG ancestry that is present in both groups.
a IBD sharing between BOO and published data. Shared IBD chunks between 12 and 30 cM in length are shown. The total IBD length shared is indicated by the color of the square, and population designation is shown on the y-axis. b HapROH output for BOO, ROT, and relevant contemporaneous populations. Runs of homozygosity (ROH) are plotted by population for individuals with more than 400k SNPs on the 1240k panel. ROH segments are colored according to their binned lengths.
To investigate the underlying population structure, general parental background relatedness, and effective population sizes, we used HapROH50 to analyze runs of homozygosity (ROH) in the genomes of the BOO individuals together with a set of published individuals with more than 400k SNPs on the 1240k panel. We compared BOO to geographically and genetically close individuals from the Eurasian forest-tundra-steppe area, associated with Okunevo, Sintashta_MLBA, EEHG (UOO), Eastern_Siberia_LNBA, Tarim EMBA, and Fatyanovo cultures (Fig. 5b). We also included two ROT individuals with more than 200k SNPs, but these results should be interpreted with caution. The ROH results of BOO individuals suggest that this early Metal Age group had a relatively small effective population size of ~2N = 800, and one of the individuals (BOO006) appears to be an offspring of second cousins. Tarim EMBA, Okunevo, and Eastern_Siberia_LNBA groups also seem to have had relatively small effective population sizes, while Fatyanovo and Sintashta-associated groups potentially had larger effective population sizes (Fig. 5b). In comparison, ROT individuals show ROH profiles similar to those of the populations they are closely related to based on the PCA and F-statistics, i.e., ROT002 resembles the Eastern Siberian LNBA, and ROT017 the BOO individuals (Fig. 5b).
High-coverage shotgun data from BOO004 allowed us to perform demographic modeling to investigate North Eurasian genetic ancestry and the nature of the admixture of the Eastern and Western Eurasian sources found in BOO individuals using a site-frequency spectrum (SFS) modeling-based method called momi251. We included published data from representative North Eurasian populations, both preceding and contemporaneous to BOO. We also used DATES v.75352 to estimate the date of the admixture event in BOO individuals between the EEHG and Eastern_Siberia_LNBA sources to be 17.98 ± 1.06 generations ago, or around 500 calendar years prior to the mean radiocarbon date of BOO, assuming a generation time of 29 years53 (Supplementary Fig. 7). This results in an approximate date of admixture of ~4086 years ago, or ~3800 years ago when the marine reservoir correction is taken into account.
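The conversion from the DATES estimate in generations to calendar years is simple arithmetic and can be sketched as follows. The function name is hypothetical; the only inputs are the values stated above (17.98 ± 1.06 generations and a generation time of 29 years), and the standard error is scaled by the same factor on the assumption that the generation time itself is treated as fixed.

```python
def generations_to_years(generations, std_error, years_per_generation=29.0):
    """Convert an admixture time in generations (with its standard error)
    to years, treating the generation time as a fixed constant."""
    return generations * years_per_generation, std_error * years_per_generation


years, years_se = generations_to_years(17.98, 1.06)
print(round(years), round(years_se))  # prints: 521 31
```

The result, roughly 521 ± 31 years before the lifetime of the sampled individuals, corresponds to the "around 500 calendar years" quoted in the text.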
After an incremental build-up of our momi2 model (Supplementary Note 4, Supplementary Data 10–12, Supplementary Tables 1–6, Supplementary Figs. 8–12) and including three admixture events, our final model estimated the split times between Africans (Yoruba, YRI) and Eurasians (Loschbour) at 87,790 years ago (95% CI 85,250–91,040), and between Western Eurasians (Loschbour) and Eastern Eurasians (CHB) at 53,010 years ago (95% CI 49,200–55,540). The divergence between the lineage leading to the Eastern Siberia LNBA and CHB was found to be 21,580 years ago (95% CI 18,600–24,810). We then modeled gene flow from the lineage leading to CHB to the EEHG at 9.4% (95% CI 4.4%–14.7%). The effective population size Ne for Eastern Siberia LNBA was found to be 1690 (95% CI 1380–2020), and that for EEHG was 2470 (95% CI 1930–3790). The gene flow event from EEHG to East Siberian LNBA was modeled at 12.5% (95% CI 7.77%–15.7%). These gene flow events are in line with the shared ANE ancestry history in both lineages. We estimated a recent admixture for BOO individuals (95% CI 3778–4357 years ago), with substantial gene flow (39.8%; 95% CI 34.9–44.4%) from Eastern Eurasians (represented here by Eastern Siberia LNBA). Importantly, the mixture proportions are consistent with the results from qpAdm, and the date estimates overlap with those from DATES. The population size estimated for BOO (Ne = 235, 95% CI 118–441) from momi2 (Fig. 6, Supplementary Data 10) is at the smaller end of the estimate obtained from hapROH (2N between 400 and 800 individuals, Fig. 6), which is likely an effect of momi2 not taking into account inbreeding via the analysis of the runs of homozygosity.
Momi2 demographic model for BOO004 using shotgun sequencing data from published ancient and modern individuals. Point estimates of the final model are shown in blue; results for 100 nonparametric bootstraps are shown in gray. The sampling times of populations are indicated by circles and population size estimates by the thickness of branches. The y-axis is linear below 10,000 years ago, and logarithmic above it. See Supplementary Data 10 for specific parameter values. YRI Yoruban, CHB Han Chinese.