Principal Component Analyses (PCA)-based findings in population genetic studies are highly biased and must be reevaluated | Scientific Reports -…
Posted: August 30, 2022 at 3:01 am
The near-perfect case of dimensionality reduction
Applying principal component analysis (PCA) to a dataset of four evenly sampled populations, the three primary colors (Red, Green, and Blue) and Black, illustrates a near-ideal dimension reduction example. PCA condensed the dataset of these four samples from a 3D Euclidean space (Fig. 1B) into three principal components (PCs), the first two of which explained 88% of the variation and can be visualized in a 2D scatterplot (Fig. 1C). Here, and in all other color-based analyses, the colors represent the true 3D structure, whereas their positions on the 2D plots are the outcome of PCA. Although PCA correctly positioned the primary colors at even distances from each other and Black, it distorted the distances between the primary colors and Black (from 1 in 3D space to 0.82 in 2D space). Thus, even in this limited and near-perfect demonstration of data reduction, the observed distances do not reflect the actual distances between the samples (which are impossible to recreate in a 2D dataset). In other words, distances between samples in a reduced dimensionality plot do not and cannot be expected to represent actual genetic distances. Evenly increasing all the sample sizes yields identical results irrespective of the sample size (Fig. 1D,E).
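The color example above can be reproduced approximately with a few lines of NumPy. This is a minimal sketch, not the authors' code: it assumes noise-free, exactly replicated color vectors (the paper randomizes individual samples), and the names `COLORS` and `color_pca` are hypothetical.

```python
import numpy as np

# Hypothetical toy "populations": each color is a point in 3D RGB space.
COLORS = {
    "Red": (1.0, 0.0, 0.0),
    "Green": (0.0, 1.0, 0.0),
    "Blue": (0.0, 0.0, 1.0),
    "Black": (0.0, 0.0, 0.0),
}


def color_pca(n_per_color=1):
    """Run PCA (via SVD of the centered data) on evenly sized color populations.

    Returns the proportion of variance explained by each PC and the
    position of each color on the top two PCs.
    """
    names = list(COLORS)
    base = np.array([COLORS[c] for c in names])
    # Assumption: samples are exact copies of the color vector (no noise).
    X = np.repeat(base, n_per_color, axis=0)
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = Xc @ vt[:2].T
    centroids = {
        c: scores[i * n_per_color:(i + 1) * n_per_color].mean(axis=0)
        for i, c in enumerate(names)
    }
    return explained, centroids


explained, pos = color_pca(n_per_color=1)
print("variance explained by PC1+PC2:", explained[:2].sum())
print("Red-Black distance in 2D:", np.linalg.norm(pos["Red"] - pos["Black"]))
print("Red-Green distance in 2D:", np.linalg.norm(pos["Red"] - pos["Green"]))
```

Under these assumptions, the Red-Black distance shrinks from 1 in 3D to about 0.82 on the top two PCs while the Red-Green distance is preserved, and calling `color_pca` with a larger `n_per_color` gives the same distances.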
When analyzing human populations, which harbor most of the genomic variation between continental populations (12%) with only 1% of the genetic variation distributed within continental populations39, PCA tends to position Africans, Europeans, and East Asians at the corners of an imaginary triangle, which closely resembles our color-population model and illustration. Analyzing continental populations, we obtained similar results for two even-sized sample datasets (Fig. 2A,C) and their quadrupled counterparts (Fig. 2B,D). As before, the distances between the populations remain similar (Fig. 2A–D), demonstrating that for same-sized populations, sample size does not contribute to the distortion of the results if the increase in size is proportional.
Testing the effect of even-sample sizes using two population sets. The top plots show nine populations with n=50 (A) and n=188 (B). The bottom plots show a different set of nine populations with n=50 (C) and n=192 (D). In both cases, increasing the sample size did not alter the PCs (the y-axis flip between (C) and (D) is a known phenomenon).
The extent to which different-sized populations produce results with conflicting interpretations is illustrated through a typical study case in Box 1.
Note that unlike in Figs. 1C and 3A, where Black is in the middle, in other figures, the overrepresentation of certain alleles (e.g., Fig. 4B) shifts Black away from (0,0). Intuitively, this can be thought of as the most common allele (Green in Fig. 4B) repelling Black, which has three null or alternative alleles.
PCA is commonly reported as yielding a stable differentiation of continental populations (e.g., Africans vs. non-Africans, Europeans vs. Asians, and Asians vs. Native Americans or Oceanians, on the primary PCs40,41,42,43). This prompted prehistorical inferences of migrations and admixture, viewing the PCA results that position Africans, East Asians, and Europeans in three corners of an imaginary triangle as representing the post Out of Africa event followed by multiple migrations, differentiation, and admixture events. Inferences for Amerindians or Aboriginals typically follow this reconstruction. For instance, Silva-Zolezzi et al.42 argued that the Zapotecos did not experience a recent admixture due to their location on the Amerindian PCA cluster at the Asian end of the European-Asian cline.
Here we show that the appearance of continental populations at the corners of a triangle is an artifact of the sampling scheme since variable sample sizes can easily create alternative results as well as alternative clines. We first replicated the triangular depiction of continental populations (Fig. 3A,B) before altering it (Fig. 3C–F). Now, East Asians appear as a three-way admixed group of Africans, Europeans, and Melanesians (Fig. 3C), whereas Europeans appear on an African-East Asian cline (Fig. 3D). Europeans can also be made to appear in the middle of the plot as an admixed group of African-Asian-Oceanian origins (Fig. 3E), and Oceanians can cluster with (Fig. 3F) or without East Asians (Fig. 3E). The latter depiction maximizes the proportion of explained variance, which common wisdom would consider the correct explanation. According to some of these results, only Europeans and Oceanians (Fig. 3C) or East Asians and Oceanians (Fig. 3D) experienced the Out of Africa event. By contrast, East Asians (Fig. 3C) and Europeans (Fig. 3D) may have remained in Africa. Contrary to Silva-Zolezzi et al.'s42 claim, the same Mexican-American cohort can appear closer to Europeans (Fig. 3A) or as a European-Asian admixed group (Fig. 3B). It is easy to see that none of those scenarios stand out as more or less correct than the other ones.
PCA of uneven-sized African (Af), European (Eu), Asian (As), and Mexican-Americans (Ma) or Oceanian (Oc) populations. Fixing the sample size of Mexican-Americans and altering the sample sizes of other populations: (A) nAf=198; nEu=20; nAs=483; nMa=64 and (B) nAf=20; nEu=343; nMa=20; nAm=64 changes the results. An even more dramatic change can be seen when repeating this analysis on Oceanians: (C) nAf=5; nEu=25; nAs=10; nOce=20 and (D) nAfr=5; nEu=10; nAs=15; nOc=20 and when altering their sample sizes: (E) nAf=98; nEu=25; nAs=150; nOc=24 and (F) nAf=98; nEu=83; nAs=30; nOc=15.
Reich et al.44 presented further PCA-based evidence for the Out of Africa scenario. Applying PCA to Africans and non-Africans, they reported that non-Africans cluster together at the center of African populations when PC1 was plotted against PC4 and that this "rough cluster[ing] of non-Africans is about what would be expected if all non-African populations were founded by a single dispersal out of Africa." However, observing PC1 and PC4 for Supplementary Fig. S3, we found no rough cluster of non-Africans at the center of Africans, contrary to Reich et al.'s44 claim. Remarkably, we found a rough cluster of Africans at the center of non-Africans (Supplementary Fig. S3C), suggesting that Africans were founded by a single dispersal into Africa by non-Africans. We could also infer, based on PCA, either that Europeans never left Africa (Supplementary Fig. S3D), that Europeans left Africa through Oceania (Supplementary Fig. S3B), that Asians and Oceanians never left Europe (or the other way around) (Supplementary Fig. S3F), or, since all are valid PCA results, all of the above. Unlike Reich et al.44, we do not believe that their example highlights how PCA methods can provide evidence of important migration events. Instead, our examples (Fig. 3, Supplementary Fig. S3) show how PCA can be used to generate conflicting and absurd scenarios, all mathematically correct but, obviously, biologically incorrect, and to cherry-pick the most favorable solution. This is an example of how vital a priori knowledge is to PCA. It is thereby misleading to present one or a handful of PC plots without acknowledging the existence of many other solutions, let alone while not disclosing the proportion of explained variance.
Three research groups sought to study the origin of Black. A previous study that employed even sample-sized color populations alluded that Black is a mixture of all colors (Fig. 1B–D). A follow-up study with a larger sample size (nRed=nGreen=nBlue=10) and enriched in Black samples (nBlack=200) (Fig. 4A) reached the same conclusion. However, the Black-is-Blue group suspected that the Blue population was mixed. After QC procedures, the Blue sample size was reduced, which decreased the distance between Black and Blue and supported their speculation that Black has a Blue origin (Fig. 4B). The Black-is-Red group hypothesized that the underrepresentation of Green, compared to its actual population size, masks the Red origin of Black. They comprehensively sampled the Green population and showed that Black is very close to Red (Fig. 4C). Another Black-is-Red group contributed to the debate by genotyping more Red samples. To reduce the bias from other color populations, they kept the Blue and Green sample sizes even. Their results replicated the previous finding that Black is closer to Red and thereby shares a common origin with it (Fig. 4D). A new Black-is-Green group challenged those results, arguing that the small sample size and omission of Green samples biased the results. They increased the sample sizes of the populations of the previous study and demonstrated that Black is closer to Green (Fig. 4E). The Black-is-Blue group challenged these findings on the grounds of the relatively small sample sizes that may have skewed the results and dramatically increased all the sample sizes. However, believing that they are of Purple descent, Blue refused to participate in further studies. Their relatively small cohort was explained by their isolation and small effective population size. The results of the new sampling scheme confirmed that Black is closer to Blue (Fig. 4F), and the group was praised for the large sample sizes that, no doubt, captured the actual variation in nature better than the former studies.
PCA of uneven-sized samples of four color populations. (A) nRed=nGreen=nBlue=10; nBlack=200, (B) nRed=nGreen=10; nBlue=5; nBlack=200, (C) nRed=10; nGreen=200; nBlue=50; nBlack=200 (D) nRed=25; nGreen=nBlue=50; nBlack=200, (E) nRed=300; nGreen=200; nBlue=nBlack=300, and (F) nRed=1000; nGreen=2000; nBlue=300; nBlack=2000. Scatter plots show the top two PCs. The numbers on the grey bars reflect the Euclidean distances between the color populations over all PCs. Colors include Red [1,0,0], Green [0,1,0], Blue [0,0,1], and Black [0,0,0].
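The effect of uneven sample sizes described in Box 1 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: samples are noise-free copies of each color vector, distances are measured on the top two PCs only, and `RGB` and `black_distances_2d` are hypothetical names.

```python
import numpy as np

# Color vectors as given in the figure legend.
RGB = {"Red": (1, 0, 0), "Green": (0, 1, 0), "Blue": (0, 0, 1), "Black": (0, 0, 0)}


def black_distances_2d(sizes):
    """Distance from Black to each other color on the top two PCs.

    `sizes` maps a color name to its sample size. Assumption: every sample
    is an exact copy of its color vector (no randomization).
    """
    names = list(sizes)
    X = np.vstack(
        [np.tile(np.array(RGB[c], dtype=float), (sizes[c], 1)) for c in names]
    )
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    pcs = vt[:2].T  # loadings of PC1 and PC2
    proj = {c: (np.array(RGB[c], dtype=float) - mean) @ pcs for c in names}
    return {
        c: float(np.linalg.norm(proj[c] - proj["Black"]))
        for c in names
        if c != "Black"
    }


even = black_distances_2d({"Red": 10, "Green": 10, "Blue": 10, "Black": 10})
uneven = black_distances_2d({"Red": 10, "Green": 10, "Blue": 5, "Black": 200})
print("even sizes:      ", even)
print("Fig. 4B sizes:   ", uneven)
```

With even sizes, Black is equidistant from the three primary colors. With the Fig. 4B sizes (nRed=nGreen=10; nBlue=5; nBlack=200), the discarded third PC lies almost along the Blue axis in this noise-free setting, so Black collapses toward Blue on the 2D plot even though the underlying 3D distances have not changed.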
The question of who the ancestors of admixed populations are and the extent of their contribution to other groups is at the heart of population genetics. It may not be surprising that authors hold conflicting views on interpreting these admixtures from PCA. Here, we explore how an admixed group appears in PCA, whether its ancestral groups are identifiable, and how its presence affects the findings for unmixed groups through a typical study case (Box 2).
To understand the impact of parameter choices on the interpretation of PCA, we revisited the first large-scale study of Indian population history carried out by Reich et al.45. The authors applied PCA to a cohort of Indians, Europeans, Asians, and Africans using various sample sizes that ranged from 2 (Srivastava) (out of 132 Indians) to 203 (Yoruban) samples. After applying PCA to Indians and the three continental populations to exclude outliers that supposedly had more African or Asian ancestries than other samples, PCA was applied again in various settings.
At this point, the authors engaged in circular logic as, on the one hand, they removed samples that appeared via PCA to have experienced gene flow from Africa (their Note 2, iii) and, on the other hand, employed an a priori claim (unsupported by historical documents) that African history has little to do with Indian history (which must stand in sharp contrast to the rich history of gene flow from Utah (US) residents to Indians, which was equally unsupported). Reich et al. provided no justification for the exact protocol used or any discussion about the impact of using different parameter values on resulting clusters. They then generated a plethora of conflicting PCA figures, never disclosing the proportion of explained variance along with the first four PCs examined. They then inferred based on PCA that Gujarati Americans exhibit no unusual relatedness to West Africans (YRI) or East Asians (CHB or JPT) (Supplementary Fig. S4)45. Their concluding analysis of Indians, Asians, and Europeans (Fig. 4)45 showed Indians at the apex of a triangle with Europeans and Asians at the opposite corners. This plot was interpreted as evidence of an ancestry that is unique to India and an Indian cline. Indian groups were explained to have inherited different proportions of ancestry from Ancestral North Indians (ANI), related to western Eurasians, and Ancestral South Indians (ASI), who split from Onge. The authors then followed up with additional analyses using Africans as an outgroup, supposedly confirming the results of their selected PCA plot. Indians have since been described using the terms ANI and ASI.
In evaluating the claims of Reich et al.45 that rest on PCA, we first replicated the finding of the alleged Indian cline (Fig. 5A). We next garnered support for an alternative cline using Indians, Africans, and Europeans (Fig. 5B). We then demonstrated that PCA results support Indians to be European (Fig. 5C), East Asians (Fig. 5D), and Africans (Fig. 5E), as well as a genuinely European-Asian, admixed population (Fig. 5F). Whereas the first two PCs of Reich et al.'s primary figure explain less than 8% of the variation (according to our Fig. 5A; Reich et al.'s Fig. 4 does not report this information), four out of five of our alternative depictions explain 8–14% of the variation. Our results also expose the arbitrariness of the scheme used by Reich et al. and show how radically different clustering can be obtained merely by manipulating the non-Indian populations used in the analyses. Our results also question the authors' choice in using an analysis that explained such a small proportion of the variation (let alone not reporting it), yielded no support for a unique ancestry to India, and cast doubt on the reliability and usefulness of the ANI-ASI model to describe Indians given their exclusive reliance on a priori knowledge in interpreting the PCA patterns. Although supported by downstream analyses, the plurality of PCA results could not be used to support the authors' findings because, using PCA, it is impossible to answer a priori whether Africa is in India or the other way around (Fig. 5E). We speculate that the motivation for Reich et al.'s strategy was to declare Africans an outgroup, an essential component of D-statistics. Clearly, PCA-based a posteriori inferences can lead to errors of Colombian magnitude.
Studying the origin of Indians using PCA. (A) Replicating Reich et al.'s45 results using nEu=99; nAs=146; nInd=321. Generating alternative PCA scenarios using: (B) nAf=178; nEu=99; nInd=321, (C) nAf=400; nEu=40; nAs=100; nInd=321, (D) nAf=477; nEu=253; nAs=23; nInd=321, (E) nAf=25; nEu=220; nAs=490; nInd=320, and (F) nAf=30; nEu=200; nAs=50; nInd=320.
To evaluate the extent of deviation of PCA results from genetic distances, we adopted a simple genetic distance scheme where we measured the Euclidean distance between allelic counts (0,1,2) in the same data used for PCA calculations. We are aware of the diversity of existing genetic distance measures. However, to the best of our knowledge, no study has ever shown that PCA outcomes numerically correlate with any genetic distance measure, except in very simple scenarios and for ADMIXTURE-like tools, which, like PCA, exhibit high design flexibility. Plotting the genetic distances against those obtained from the top two PCs shows the deviation between these two measures for each dataset. We found that all the PC projections (Fig. 6) distorted the genetic distances in unexpected ways that differ between the datasets. PCA correctly represented the genetic distances for a minority of the populations, and, just like the most poorly represented populations, none were distinguishable from other populations. Moreover, populations that clustered under PCA exhibited mixed results, questioning the accuracy of PCA clusters. Although it remains unclear which sampling scheme to adopt, neither scheme is genetically accurate. These results further question the genetic validity of the ANI-ASI model.
Comparing the genetic distances with PCA-based distances for the corresponding datasets of Fig. 5. Genetic and PCA (PC1+PC2) distances between population pairs (symbol pairs) and 2000 random individual pairs (grey dots) were calculated using Euclidean distances and normalized to range from 0 to 1. Population and individual pairs whose PC distances reflect their genetic distances are shown along the x=y dotted line. Note that the position of heterogeneous populations on the plot may deviate from that of their samples and that some populations are very small.
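The comparison described above, Euclidean distances on allelic counts versus Euclidean distances on PC1 and PC2, each scaled to the 0 to 1 range, can be sketched as below. The genotype matrix is synthetic toy data generated purely for illustration, dividing by the maximum is one assumed way to normalize, and the function names are hypothetical.

```python
import numpy as np


def pairwise_euclidean(M):
    """All-pairs Euclidean distance matrix for the rows of M."""
    diff = M[:, None, :] - M[None, :, :]
    return np.sqrt((diff**2).sum(axis=-1))


def genetic_vs_pca_distances(genotypes, n_pcs=2):
    """Compare distances on allelic counts with distances on the top PCs.

    `genotypes` is an (individuals x SNPs) matrix of allelic counts (0, 1, 2).
    Returns raw and max-normalized distances for every pair of individuals.
    """
    G = np.asarray(genotypes, dtype=float)
    Gc = G - G.mean(axis=0)
    _, _, vt = np.linalg.svd(Gc, full_matrices=False)
    scores = Gc @ vt[:n_pcs].T
    upper = np.triu_indices(len(G), k=1)  # each pair once
    d_gen = pairwise_euclidean(G)[upper]
    d_pca = pairwise_euclidean(scores)[upper]
    # Assumption: scale each set of distances by its own maximum.
    return d_gen, d_pca, d_gen / d_gen.max(), d_pca / d_pca.max()


# Synthetic toy data: 30 individuals, 200 SNPs, counts drawn uniformly.
rng = np.random.default_rng(0)
toy = rng.integers(0, 3, size=(30, 200))
d_gen, d_pca, gen_norm, pca_norm = genetic_vs_pca_distances(toy)
print("largest normalized deviation from x=y:", np.abs(gen_norm - pca_norm).max())
```

Because the PC scores are an orthogonal projection of the centered data, a raw PC distance can never exceed the corresponding full-space distance; how much each pair shrinks depends on the dataset, which is why the scatter around the x=y line differs from one sampling scheme to the next.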
We are aware that PCA disciples may reject our reductio ad absurdum argument and attempt to read into these results, as ridiculous as they may be, a valid description of Indian ancestry. For those readers, demonstrating the ability of the experimenter to generate near-endless contradictory historical scenarios using PCA may be more convincing or at least exhausting. For brevity, we present six more such scenarios that show PCA support for Indians as a heterogeneous group with European admixture and Mexican-Americans as an Indian-European mixed population (Supplementary Fig. S4A), Mexican-Americans as an admixed African-European group with Indians as a heterogeneous group with European admixture (Supplementary Fig. S4B), Indians and Mexican-Americans as European-Japanese admixed groups with common origins and high genetic relatedness (Supplementary Fig. S4C), Indians and Mexican-Americans as European-Japanese admixed groups with no common origins and genetic relatedness (Supplementary Fig. S4D), Europeans as an Indian and Mexican-American admixed group with Japanese fully clustering with the latter (Supplementary Fig. S4E), and Japanese and Europeans clustering as admixed Indian and Mexican-American groups (Supplementary Fig. S4F). Readers are encouraged to use our code to produce novel alternative histories. We suspect that almost any topology could be obtained by finding the right set of input parameters. In this sense, any PCA output can reasonably be considered meaningless.
Contrary to Reich et al.'s claims, a more common interpretation of PCA is that the populations at the corners of the triangle are ancestral or are related to the mixed groups within the triangle, which are the outcome of admixture events, typically referred to as gradients or clines45. However, some authors held different opinions. Studying the African component of Ethiopian genomes, Pagani et al.46 produced a PC plot showing Europeans (CEU), Yoruba (western African), and Ethiopians (Eastern Africans) at the corners of a triangle (Supplementary Fig. S4)46. Rather than suggesting that the populations within the triangle (e.g., Egyptians, Spaniards, Saudi) are mixtures of these supposedly ancestral populations, the authors argued that Ethiopians have western and eastern African origins, unlike the central populations with different patterns of admixture. Obviously, neither interpretation is correct. Reich et al.'s interpretation does not explain why CEUs are not an Indian-African admix nor why Africans are not a European-Indian admix and is analogous to arguing that Red has Green and Blue origins (Fig. 1). Pagani et al.'s interpretation is a tautology, ignores the contribution of non-Africans, and is analogous to arguing that Red has Red and Green origins. We carried out forward simulations of populations with various numbers of ancestral populations and found that admixture cannot be inferred from the positions of samples in a PCA plot (Supplementary Text 1).
In a separate effort to study the origins of AJs, Need et al.47 applied PCA to 55 Ashkenazic Jews (AJs) and 507 non-Jewish Caucasians. Their PCA plot showed that AJs (marked as Jews) formed a distinct cluster from Europeans (marked as non-Jews). Based on these results, the authors suggested that PCA can be used to detect linkage to Jewishness. A follow-up PCA where Middle Eastern (Bedouin, Palestinians, and Druze) and Caucasus (Adygei) populations were included showed that AJs formed a distinct cluster that nested between the Adygei (and the European cluster) and Druze (and the Middle Eastern cluster). The authors then concluded that AJs might have mixed Middle Eastern and European ancestries. The proximity to the Adygei cluster was noted as interesting but dismissed based on the small sample size of the Adygei (n=17). The authors concluded that AJ genomes carry an unambiguous signature of their Jewish heritage, and this seems more likely to be due to their specific Middle Eastern ancestry than to inbreeding. A similar strategy was employed by Bray et al.48 to claim that PCA confirmed that the AJ individuals cluster distinctly from Europeans, aligning closest to Southern European populations along with the first principal component, suggesting a more southern origin, and aligning with Central Europeans along the second, consistent with migration to this region. Other authors49,50 made similar claims.
It is easy to show why PCA cannot be used to reach such conclusions. We first replicated Need et al.'s47 primary results (Fig. 7A), showing that AJs cluster separately from Europeans. However, such an outcome is typical when comparing Europeans and non-European populations like Turks (Fig. 7B). It is not unique to AJs, nor does it prove that they are genetically detectable. A slightly modified design shows that most AJs overlap with Turks in support of the Turkic (or Near Eastern) origin of AJs (Fig. 7C). We can easily refute our conclusion by including continental populations and showing that most AJs cluster with Iberians rather than Turks (Fig. 7D). This last design explains more of the variance than all the previous analyses together, although, as should be evident by now, it is not indicative of accuracy. This analysis questions the use of PCA as a discriminatory genetic utility and as a means to infer genetic ancestry.
Studying the origin of 55 AJs using PCA. (A) Replicating Need et al.'s results using nEu=507; Generating alternative PCA scenarios using: (B) nEu=223; nTurks=56; (C) nEu=400; nTurks+Caucasus=56, and (D) nAf=100, nAs=100 (Africans and Asians are not shown), nEu=100; and nTurks=50. Need et al.'s faulty terminology was adopted in A and B.
There are several more oddities with the report of Need et al.47. First, they did not report the variance explained by their sampling scheme (it is, likely, ~1%, as in Fig. 7A). Second, they misrepresented the actual populations analyzed. AJs are not the only Jews, and Europeans are not the only non-Jews (Figs. 1, 7A)47. Finally, their dual interpretations of AJs as a mixed population of Middle Eastern origin are based solely on a priori belief: first, because most of the populations in their PCA are nested between and within other populations, yet the authors did not suggest that they are all admixed, and second, because AJs nested between Adygei and Druze51,52, both formed in the Near East. The conclusions of Need et al.47 were thereby obtained based on particular PCA schemes and what may be preconceived ideas of AJs' origins that are no more real than the Iberian origin of AJs (Fig. 7D). This is yet another demonstration (discussed in Elhaik36) of how PCA can be misused to promote ethnocentric claims due to its design flexibility.
Following criticism on the sampling scheme used to study the origin of Black (Box 1), the redoubtable Black-is-Red group genotyped Cyan. Using even sample sizes, they demonstrated that Black is closer to Red (DBlack-Red=0.46) (Fig. 8A), where D is the Euclidean distance between the samples over all three PCs (short distances indicate high similarity). The Black-is-Green school criticized their findings on the grounds that their Cyan samples were biased and their results do not apply to the broad Black cohort. They also reckoned that the even sampling scheme favored Red because Blue is related to Cyan through shared language and customs. The Black-is-Red group responded by enriching their cohort in Cyan and Black (nCyan, nBlack=1000) and provided even more robust evidence that Black is Red (DBlack-Red=0.12) (Fig. 8B). However, the Black-is-Green camp dismissed these findings. Conscious of the effects of admixture, they retained only the most homogeneous Green and Cyan (nGreen, nCyan=33), genotyped new Blue and Black (nBlue, nBlack=400), and analyzed them with the published Red cohort (nRed=100). The Black-is-Green results supported their hypothesis that Black is Green (DBlack-Green=0.27) and that Cyan shared a common origin with Blue (DBlue-Green=0.27) (Fig. 8C) and should thereby be considered an admixed Blue population. Unsurprisingly, the Black-is-Red group claimed that these results were due to the under-representation of Black since when they oversampled Black, PCA supported their findings (Fig. 8A). In response, the Black-is-Green school maintained even sample sizes for Cyan, Blue, and Green (nBlue, nGreen, nCyan=33) and enriched Black and Red (nRed, nBlack=100). Not only did their results (DBlack-Green=0.63…
PCA with the primary and mixed color populations. (A) nall=100; nBlack=200, (B) nRed=nGreen=nBlue=100; nBlack=nCyan=500, (C) nRed=100; nGreen=nCyan=33; nBlue=nBlack=400; and (D) nRed=nBlack=100; nGreen=nBlue=nCyan=33; Scatter plots show the top two PCs. The numbers on the grey bars reflect the Euclidean distances between the color populations over all PCs. Colors include Red [1,0,0], Green [0,1,0], Blue [0,0,1], Cyan [0,1,1], and Black [0,0,0].
The question of how analyzing admixed groups with multiple ancestral populations affects the findings for unmixed groups is illustrated through a typical study case in Box 3.
To understand how PCA can be misused to study multiple mixed populations, we will investigate other PCA applications to study AJs. Such analyses have a thematic interpretation, where the clustering of AJ samples is evidence of a shared Levantine origin, e.g., Refs. 12,13, that short distances between AJs and Levantines indicate close genetic relationships in support of a shared Levantine past, e.g., Ref. 12, whereas the short distances between AJs and Europeans are evidence of admixture13. Finally, as a rule, the much shorter distances between AJs and the Caucasus or Turkish populations, observed by all recent studies, were ignored12,13,47,48. Bray et al.48 concluded that not only do AJs have a more southern origin but that their alignment with Central Europeans is consistent with migration to this region. In these studies, "short" and "between" received a multitude of interpretations. For example, Gladstein and Hammer's53 PCA plot that showed AJs in the extreme edge of the plot with Bedouins and French in the other edges was interpreted as AJs clustering tightly between European and Middle Eastern populations. The authors interpreted the lack of outliers among AJs (which were never defined) as evidence of common AJ ancestry.
Following the rationale of these studies, it is easy to show how PCA can be orchestrated to yield a multitude of origins for AJs. We replicated the observation that AJs are a population isolate, i.e., AJs form a distinct group, separated from all other populations (Fig. 9A), and are thereby genetically distinguishable47.
We also replicated the most common yet often-ignored observation, that AJs cluster tightly with Caucasus populations (Fig. 9B). We next produced novel results where AJs cluster tightly with Amerindians due to the north Eurasian or Amerindian origins of both groups (Fig. 9C). We can also show that AJs cluster much closer to South Europeans than Levantines (Fig. 9D), and overlap Finns entirely, in solid evidence of AJs' ancient Finnish origin (Fig. 9E). Last, we wish to refute our previous finding and show that only half of the AJs are of Finnish origin. The remaining analysis supports the lucrative Levantine origin (Fig. 9F), a discovery touted by all the previous reports though never actually shown. Excitingly enough, the primary PCs of this last Eurasian Finnish-Levantine mixed origin depiction explained the highest amount of variance. An intuitive interpretation of those results is a recent migration of the Finnish AJs to the Levant, where they experienced high admixture with the local Levantine populations that altered their genetic background. These examples demonstrate that PCA plots generate nonsensical results for the same populations and no a posteriori knowledge.
An in-depth study of the origin of AJs using PCA in relation to Africans (Af), Europeans (Eu), East Asians (Ea), Amerindians (Am), Levantines (Le), and South Asians (Sa). (A) nEu=159; nAJ=60; nLe=82, (B) nAf=30; nEu=159; nEa=50; nAJ=60; nLe=60, (C) nAf=30; nEa=583; nAJ=60; nAm=255; (D) nAf=200; nEu=115; nEa=200; nAJ=60; nLe=235; nSa=88, (E) nAf=200; nEu=30; nAJ=400, nLe=80 (F) nAf=200; nEu=30; nAJ=50; nLe=160. Large squares indicate insets.
The value of using mixed color populations to study origins prompted new analyses using even (Fig. 10A) and variable sample sizes (Fig. 10B–D). Using this novel sampling scheme, the Black-is-Green school reaffirmed that Black is the closest to Green (Fig. 10A, 10C, and 10D) in a series of analyses, but using a different cohort yielded a novel finding that Black is closest to Pink (Fig. 10B).
PCA with the primary and multiple mixed color populations. (A) nall=50, (B) nall=50 or 10, (C,D) nAll=[50, 5, 100, or 25]. Scatter plots show the top two PCs. Color codes are shown. (E) The difference between the true distances calculated over a 3D plane between every color population pair (shown side by side) from (D) and their Euclidean distances calculated from the top two PCs. Pairs whose PC distances from each other reflect their true 3D distances are shown along the x=y dotted line. One of the largest PCA distortions is the distance between the Red and Green populations (inset). The true Red-Green distance is 1.41 (x-axis), but the PCA distance is 0.5 (y-axis).
The extent to which PCA distances obtained by the top two PCs reflect the true distances among color population pairs is shown in Fig. 10E. PCA distorted the distances between most color populations, but the distortion was uneven among the pairs, and while a minority of the pairs are correctly projected via PCA, most are not. Identifying which pairs are correctly projected is impossible without a priori information. For example, some shades of blue and purple were less biased than similar shades. We thereby show that PCA-inferred distances are biased in an unpredictable manner and thereby uninformative for clustering.
Unlike stochastic models that possess inherent randomness, PCA is a deterministic process, a property that contributes to its perceived robustness. To explore the behavior of PCA, we tested whether the same computer code can produce similar or different results when the only variable that changes is the standard randomization technique used throughout the paper to generate the individual samples of the color populations (to avoid clutter). We evaluated two color sets.
In the first set, Black was the closest to Yellow (Fig. 11A), Purple (Fig. 11C), and Cyan (Fig. 11D,E). When adding White, in the second set, Black behaved as an outgroup as the distances between the secondary colors largely deviated from the expectation and produced false results (Fig. 11D–F). These results illustrate the sensitivity of PCA to tiny changes in the dataset, unrelated to the populations or the sample sizes.
Figure 11. Studying the effects of minor sample variation on PCA results using color populations (nAll=50). (A–C) Analyzing secondary colors and Black. (D–E) Analyzing secondary colors, White, and Black. Scatter plots show the top two PCs. Colors include Cyan [0,1,1], Purple [1,0,1], Yellow [1,1,0], White [1,1,1], and Black [0,0,0].
To explore this effect on human populations, we curated a cohort of 16 populations. We carried out PCA on ten random individuals from 15 random populations. We show that these analyses result in spurious and conflicting results (Fig. 12). Puerto Ricans, for instance, clustered close to Europeans (A), between Africans and Europeans (B), close to Adygei (C), and close to Europe and Adygei (D). Indians clustered with Mexicans (A, B, and D) or apart from them (C). Mexicans themselves cluster with (A and D) or without (B and C) Africans. Papuans and Russians cluster close to (B) or far from (C) East Asian populations. More robust clustering was observed for East Asians, Caucasians, and Europeans, as well as Africans. However, these were not only indistinguishable from the less robust clustering but also failed to replicate over multiple runs (results not shown). These examples show that PCA results are unpredictable and irreproducible even when 94% of the populations are the same. Note that the proportion of explained variance was similar in all the analyses, demonstrating that it is not an indication of accuracy or robustness.
Figure 12. Studying the effect of sampling on PCA results. A cohort of 16 worldwide populations (see legend) was selected.
In each analysis, a random population was excluded. Populations were represented by random samples (n=10). The clusters highlight the most notable differences.
We found that although a deterministic process, PCA behaves unexpectedly, and minor variations can lead to an ensemble of different outputs that appear stochastic. This effect is more substantial when continental populations are excluded from the analysis.
Samples of unknown ancestry or self-reported ancestry are typically identified by applying PCA to a cohort of test samples combined with reference populations of known ancestry (e.g., 1000 Genomes), e.g., Refs. 22, 54–56. To test whether using PCA to identify the ancestry of an unknown cohort with known samples is feasible, we simulated a large and heterogeneous Cyan population (Fig. 13A, circles) of self-reported Blue ancestry. Following a typical GWAS scheme, we carried out PCA for these individuals and seven known and distinct color populations. PCA grouped the Cyan individuals with Blue and Black individuals (Fig. 13B), although none of the Cyan individuals were Blue or Black (Fig. 13A), as a different PCA scheme confirmed (Fig. 13C). A case–control assignment of this cohort to Blue or Black based on the PCA result (Fig. 13B) produced poor matches that reduced the power of the analysis. When repeating the analysis with different reference populations (Fig. 13D), the simulated individuals exhibited minimal overlap with Blue, no overlap with Black, and overlapped mostly with the Cyan reference population present this time. We thereby showed that the clustering with Blue and Black is an artifact due to the choice of reference populations. In other words, the introduction of reference populations with mismatched ancestries respective to the unknown samples biases the ancestry inference of the latter.
Figure 13. Evaluating the accuracy of PCA clustering for a heterogeneous test population in a simulation of a GWAS setting.
(A) The true distribution of the test Cyan population (n=1000). (B) PCA of the test population with eight even-sized (n=250) samples from reference populations. (C) PCA of the test population with Blue from the previous analysis shows a minimal overlap between the cohorts. (D) PCA of the test population with five even-sized (n=250) samples from reference populations, including Cyan (marked by an arrow). Colors (B) from top to bottom and left to right include: Yellow [1,1,0], light Red [1,0,0.5], Purple [1,0,1], Dark Purple [0.5,0,0.5], Black [0,0,0], dark Green [0,0.5,0], Green [0,1,0], and Blue [0,0,1].
We next asked whether PCA results can group Europeans into homogeneous clusters. Analyzing four European populations yielded 43% homogeneous clusters (Fig. 14A). Adding Africans and Asians and then South Asian populations decreased the European cluster homogeneity to 14% and 10%, respectively (Fig. 14B,C). Including the 1000 Genomes populations, as customarily done, yielded 14% homogeneous clusters (Fig. 14D). Although the Europeans remained the same, the addition of other continental populations resulted in a three to four times decrease in the homogeneity of their clusters.
Figure 14. Evaluating the cluster homogeneity of European samples. PCA was applied to the four European populations (Tuscan Italians [TSI], Northern and Western Europeans from Utah [CEU], British [GBR], and Spanish [IBS]) alone (A), together with an African and Asian population (B), as well as a South Asian population (C), and finally with all the 1000 Genomes populations (D). (E) Evaluating the usefulness of PCA-based clustering. The bottom two plots show the sizes of non-homogeneous and homogeneous clusters, and the top three plots show the proportion of individuals in homogeneous clusters. Each plot shows the results for 10 or 20 random African, European, or Asian populations for the same PCs (x-axis).
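The reference-panel effect described for Fig. 13 can be illustrated with a small simulation. This is a minimal sketch and not the authors' code: the panel composition, the sample sizes, the noise levels, and the nearest-centroid assignment rule (`assign`) are all assumptions chosen for illustration. Colors are treated as three-marker RGB vectors with Gaussian noise, PCA is run jointly on test and reference samples, and each test sample is given the label of the closest reference centroid on the top two PCs.

```python
from collections import Counter

import numpy as np

# Hypothetical RGB "allele frequency" vectors for the color populations.
COLORS = {"Blue": [0, 0, 1], "Green": [0, 1, 0], "Black": [0, 0, 0], "Cyan": [0, 1, 1]}


def sample(color, n, sd, rng):
    """Draw n noisy individuals around the RGB vector of one color population."""
    return rng.normal(COLORS[color], sd, size=(n, 3))


def top_two_pcs(X):
    """Center X and return its scores on the first two principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T


def assign(test, refs):
    """Joint PCA of test + reference samples, then nearest reference centroid."""
    names = list(refs)
    pcs = top_two_pcs(np.vstack([test] + [refs[k] for k in names]))
    test_pcs, rest = pcs[: len(test)], pcs[len(test):]
    centroids, start = [], 0
    for k in names:
        centroids.append(rest[start:start + len(refs[k])].mean(axis=0))
        start += len(refs[k])
    dist = np.linalg.norm(test_pcs[:, None, :] - np.array(centroids)[None], axis=2)
    return [names[i] for i in dist.argmin(axis=1)]


rng = np.random.default_rng(0)
test = sample("Cyan", 200, 0.10, rng)  # heterogeneous test cohort, truly Cyan

# Panel 1 (assumed): no Cyan reference. Panel 2 (assumed): same panel plus Cyan.
panel_without = {c: sample(c, 50, 0.05, rng) for c in ("Blue", "Green", "Black")}
panel_with = {**panel_without, "Cyan": sample("Cyan", 50, 0.05, rng)}

labels_without = assign(test, panel_without)
labels_with = assign(test, panel_with)
print("without Cyan reference:", Counter(labels_without))
print("with Cyan reference:   ", Counter(labels_with))
```

In this toy setting, the test individuals can only ever be matched to whatever the panel contains, so the inferred ancestry changes with the panel rather than with the samples themselves.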
The number of PCs analyzed in the literature ranges from 2 to at least 280 (Ref. 35), which raises the question of whether using more PCs increases cluster homogeneity or is another cherry-picking strategy. We calculated the cluster homogeneity for different PCs for either 10 or 20 African (n10=337, n20=912), Asian (n10=331, n20=785), and European (n10=440, n20=935) populations of similar sample sizes (Fig. 14E). Even in this favorable setting that included only continental populations, on average, the homogeneous clusters identified using PCA were significantly smaller than the non-homogeneous clusters (Homogeneous=12.5 samples; Non-homogeneous=42.6 samples; Kruskal–Wallis test [nHomogeneous=nNon-homogeneous=238 samples, p=1.95×10^-75, Chi-square=338]) and included a minority of the individuals when 20 populations were analyzed. Analyzing higher PCs decreased the size of the homogeneous clusters and increased the size of the non-homogeneous ones. The maximum number of individuals in the homogeneous clusters fluctuated for different populations and sample sizes. Mixing other continental populations with each cohort decreased the homogeneity of the clusters and their sizes (results not shown). Overall, these examples show that PCA is a poor clustering tool, particularly as sample size increases, in agreement with Elhaik and Ryan (Ref. 57), who reported that PCA clusters are neither genetically nor geographically homogeneous and that PCA does not handle admixed individuals well. Note that the cluster homogeneity in this limited setting should not be confused with the amount of variance explained by additional PCs.
To further assess whether PCA clustering represents shared ancestry or biogeography, two of the most common applications of PCA, e.g., Ref. 22, we applied PCA to 20 Puerto Ricans (Fig. 15) and 300 Europeans.
The Puerto Ricans clustered indistinguishably with Europeans (by contrast to Fig. 12) using the first two and higher PCs (Fig. 15). The Puerto Ricans represented over 6% of the cohort, sufficient to generate a stratification bias in an association study. We tested that by randomly assigning case–control labels to the European samples with all the Puerto Ricans as controls. We then generated causal alleles for the evenly sized cohorts and computed the association before and after PCA adjustment. We repeated the analysis with randomly assigned labels to all the samples. In all our 12 case–control analyses, the outcomes of the PCA adjustment for 2 and 10 PCs were worse than the unadjusted results, i.e., PCA-adjusted results had more false positives, fewer true positives, and weaker p-values than the unadjusted results (Supplementary Text 3).
Figure 15. PCA of 20 Puerto Ricans and 300 random Europeans from the 1000 Genomes. The results are shown for various PCs.
We next assessed whether the distance between individuals and populations is a meaningful biological or demographic quantity by studying the relationships between Chinese and Japanese, a question of major interest in the literature (Refs. 58, 59). We already applied PCA to Chinese and Japanese, using Europeans as an outgroup (Supplementary Fig. S2.4). The only element that varied in the following analyses was the number of Mexicans as the second outgroup (5, 25, and 50). We found that the proportion of homogeneous Japanese and Chinese clusters dropped from 100% (Fig. 16A) to 93.33% (Fig. 16B) and 40% (Fig. 16C), demonstrating that the genetic distances between Chinese and Japanese depend entirely on the number of Mexicans in the cohort rather than the actual genetic relationships between these populations, as one may expect.
Figure 16. The effect of varying the number of Mexican-Americans on the inference of genetic distances between Chinese and Japanese using various PCs.
We analyzed a fixed number of 135 Han Chinese (CHB), 133 Japanese (JPT), 115 Italians (TSI), and a variable number of Mexicans (MXL), including 5 (left column), 25 (middle column), and 50 (right column) individuals over the top four PCs. We found that the overlap between Chinese and Japanese in PC scatterplots, typically used to infer genomic distances, was unexpectedly conditional on the number of Mexicans in the cohort. We noted the meaning of the axes of variation whenever apparent (red). The right column had the same axes of variation as the middle one.
Some authors consider higher PCs informative and advise considering these PCs alongside the first two. In our case, however, these PCs were not only susceptible to bias due to the addition of Mexicans but also exhibited the exact opposite pattern observed by the primary PCs (e.g., Fig. 16G–I). It has also been suggested that in datasets with ancestry differences between samples, axes of variation often have a geographic interpretation (Ref. 10). Accordingly, the addition of Mexicans altered the order of axes of variation between the cases, making the analysis of additional PCs valuable. We demonstrate that this is not always the case. Except for PC1, over 60% of the axes had no geographical interpretation or an incorrect one. A priori knowledge of the current distribution of the population was essential to differentiate these cases. The addition of the first 20 Mexicans replaced the second axis of variation (initially undefined) with a third axis (Eurasia-America) in the middle and right columns and resulted in a minor decline of ~5% of the homogeneous clusters. Adding 25 Mexicans to the second cohort did not affect the axes, but the proportion of homogeneous clusters declined by 66%. The axes changes were unexpected and altered the interpretation of PCA results. Such changes were not detectable without a priori knowledge.
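The dependence of an inferred distance on the size of an unrelated population can be reproduced in a toy setting. The sketch below is an illustration under stated assumptions, not a reanalysis of the 1000 Genomes data: the four populations, their three-marker coordinates in `POPS`, and the helper `ab_distance_on_top_two_pcs` are hypothetical and were picked so that the effect is easy to see. Populations "A" and "B" differ only on the third marker (true distance 0.4), and only the size of the "variable" population changes between runs.

```python
import numpy as np

# Hypothetical, noise-free populations on three markers (chosen for illustration).
POPS = {
    "A": (1.0, 0.0, 0.4),
    "B": (1.0, 0.0, 0.0),
    "outgroup": (0.0, 0.0, 0.2),
    "variable": (0.0, 1.0, 0.2),
}


def ab_distance_on_top_two_pcs(n_variable, n_fixed=100):
    """Distance between the A and B centroids on the top two PCs of the cohort."""
    sizes = {"A": n_fixed, "B": n_fixed, "outgroup": n_fixed, "variable": n_variable}
    X = np.vstack([np.tile(POPS[name], (n, 1)) for name, n in sizes.items()])
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ vt[:2].T  # keep only PC1 and PC2, as in a 2D scatter plot
    a = scores[:n_fixed].mean(axis=0)
    b = scores[n_fixed:2 * n_fixed].mean(axis=0)
    return float(np.linalg.norm(a - b))


true_distance = float(np.linalg.norm(np.subtract(POPS["A"], POPS["B"])))
for n in (5, 25, 50):
    print(n, round(ab_distance_on_top_two_pcs(n), 6), "true:", true_distance)
```

With these made-up coordinates, the axis that separates A from B is the second PC when the variable population has 5 individuals and drops to the third PC when it has 25 or 50. A and B therefore look distinct in the first plot and identical in the other two, although neither population changed.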
These results demonstrate that (1) the observable distances (and thereby clusters) between populations inferred from PCA plots (Figs. 14, 15, 16) are artifacts of the cohort and do not provide meaningful biological or historical information, (2) that distances between samples can be easily manipulated by the experimenter in a way that produces unpredictable results, (3) that considering higher PCs produces conflicting patterns, which are difficult to reconcile and interpret, and (4) that our extensive exploration of PCA solutions to Chinese and Japanese relationships using 18 scatterplots and four PCs produced no insight. It is easy to see that the multitude of conflicting results allows the experimenter to select the favorable solution that reflects their a priori knowledge.
Incorporating precalculated PCA is done by projecting the PCA results calculated for the first dataset onto the second one, e.g., Ref. 17. Here, we tested the accuracy of this approach by projecting one or more color populations onto precalculated color populations that may or may not match the projected ones. The accuracy of the results was dependent on the identity of the populations of the two cohorts. When the same populations were analyzed, they overlapped (Fig. 17A), but when unique populations were found in the two datasets, PCA created misleading matches (Figs. 17B–D). In the latter case, and when the sample sizes were uneven (Fig. 17C), the projected samples formed clusters with the wrong populations, and their positioning in the plot was incorrect. Overall, we found that PCA projections are unreliable and misleading, with correct outcomes indistinguishable from incorrect ones.
Figure 17. Examining the accuracy of PCA projections. The PCA results of one dataset (circles) were projected onto another (squares).
(A) Testing the case of varying sample sizes between the first (nRed=200, nGreen=10, nBlue=200, nPurple=10) and second (nRed=200, nGreen=200, nBlue=10, nPurple=10) datasets, where in the second dataset colors varied a little (e.g., [1,0,0] → [1,0.1,0.1]). (B–D) The sample size varied (10 ≤ n ≤ 300) for both datasets. Colors include Red [1,0,0], Green [0,1,0], light Green [1,0.2,1], Cyan [0,1,1], Blue [0,0,1], Purple [1,0,1], Yellow [1,1,0], Grey [0.5,0.5,0.5], White [1,1,1], and Black [0,0,0].
To evaluate the reliability of projections for human populations, we tested whether the projected populations cluster with their closest groups and to what extent these results can be manipulated. We found that populations can be shown to correctly align with continental populations when the base (or test) populations and the projected populations are very similar (Fig. 18A), which gives us confidence in the accuracy of PCA projections. However, even in the simplest scenario of using three continental populations, it is unclear how to interpret the overlap between the base and projected populations since the Spanish would not be considered genetically closer to Finns than Italians, as suggested by PCA. In another simple scenario, where Europeans are projected onto other Europeans, distinct populations like AJs, Iberians, French, CEU, and British overlap entirely (Fig. 18B), whereas Finns and Italians were separate. Not only do the results share no apparent resemblance to the geographical distribution, but they also produce conflicting information as to the genetic distances between these populations, two properties that PCA enthusiasts claim it represents. Adding more populations, even if only to the projected populations, contributes to further distortions, with previously distinct populations (Fig. 18B) now clustering (Fig. 18C). In a different dataset, projecting Japanese onto a base dataset of Africans and Europeans places them as an admixed African-European population.
The projected Finns cluster with other Europeans (Fig. 18D), at odds with the previous results (Fig. 18B) that singled them out.
Figure 18. PCA projections of populations (italic and black star inside the shape) onto base populations with even-sized samples (n=50, unless noted otherwise) (regular font). In (A) nprojected=100, (B) nprojected=50, (C) nprojected=20, (D) nprojected=100, (E) nprojected=80 and nprojected=100, and (F) 80 ≤ nprojected ≤ 100 and 12 ≤ nprojected ≤ 478.
To test the behavior of PCA when projecting populations different from the base populations, we projected Chinese, Finns, Indians, and AJs onto Levantine and two European populations (Fig. 18E). The results imply that the Chinese and AJs are of an Indian origin originating from a European-Levantine mix. Replacing Levantines with Africans does not stabilize the projected results (Fig. 18F). Now the projected Chinese and Japanese overlap, and AJs cluster with Iranians. Overall, our results show that it is unfeasible to rely on PCA projections, particularly in studies involving different populations, as is commonly done. Even when the projected populations are identical to the base ones, the base and projected populations may or may not overlap.
PCA is the primary tool in paleogenomics, where ancient samples are initially identified based on their clustering with modern or other ancient samples. Here, a wide variety of strategies is employed. In some studies, ancient and modern samples are combined (Ref. 60). In other studies, PCA is performed separately for each ancient individual and particular reference samples, and the PC loadings are combined (Ref. 61). Some authors projected present-day human populations onto the top two principal components defined by ancient hominins (and non-humans) (Ref. 62). The most common strategy is to project ancient DNA onto the top two principal components defined by modern-day populations (Ref. 14). Here, we will investigate the accuracy of this strategy.
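The projection strategy itself is short enough to write down. The sketch below is a generic illustration, assuming noise-free color populations on three RGB markers; the helper names `fit_pcs` and `project` are hypothetical and this is not the pipeline used in the studies cited above. The PCs are computed from the base populations only, and new samples are then centered with the base mean and multiplied by the base components.

```python
import numpy as np


def fit_pcs(base, k=2):
    """Return the mean of the base dataset and its top-k principal axes."""
    mean = base.mean(axis=0)
    _, _, vt = np.linalg.svd(base - mean, full_matrices=False)
    return mean, vt[:k]


def project(samples, mean, components):
    """Project new samples onto PCs that were defined by the base dataset."""
    return (np.asarray(samples, dtype=float) - mean) @ components.T


# Base populations (assumed): Red, Green, and Black, 50 identical samples each.
base_colors = {"Red": [1, 0, 0], "Green": [0, 1, 0], "Black": [0, 0, 0]}
base = np.vstack([np.tile(c, (50, 1)) for c in base_colors.values()]).astype(float)
mean, comps = fit_pcs(base)

black = project([[0, 0, 0]], mean, comps)[0]
green = project([[0, 1, 0]], mean, comps)[0]
blue = project([[0, 0, 1]], mean, comps)[0]   # not represented in the base
cyan = project([[0, 1, 1]], mean, comps)[0]   # not represented in the base

print("Blue vs Black:", np.linalg.norm(blue - black))
print("Cyan vs Green:", np.linalg.norm(cyan - green))
```

Because none of the base populations varies on the third marker, that marker carries no weight in the base PCs. Projected Blue therefore lands on Black and projected Cyan on Green, which is the kind of misleading match that arises when the projected populations are absent from the base dataset.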
Since ancient populations show more genetic diversity than modern ones (Ref. 14), we defined ancient colors (a) as brighter colors whose allele frequency is 0.95 with an SD of 0.05 and modern colors (m) as darker colors whose allele frequency is 0.6 with an SD of 0.02. Two approaches were used in analyzing the two datasets: calculating PCA separately for the two datasets and presenting the results jointly (Fig. 19A,B), and projecting the PCA results of the ancient populations onto the modern ones (Fig. 19C,D). In both cases, meaningful results would show the ancient colors clustering close to their modern counterparts in distances corresponding to their true distances.
Figure 19. Merging PCA of ancient (circles) and modern (squares) color populations using two approaches. First, PCA is calculated separately on the two datasets, and the results are plotted together (A,B). Second, PCA results of ancient populations are projected onto the PCs of the modern ones (C,D). In (A), even-sized samples from ancient (n=25) and modern (n=75) color populations are used. In (B), different-sized samples from ancient (10 ≤ n ≤ 25) and modern (10 ≤ n ≤ 75) populations are used. In (C) and (D), different-sized samples from ancient populations (10 ≤ n ≤ 75) are used alongside even-sized samples from modern populations: (C) n=15 and (D) n=25. Colors include Red [1,0,0], dark Red [0.6,0,0], Green [0,1,0], dark Green [0,0.6,0], Blue [0,0,1], dark Blue [0,0,0.6], light Cyan [0,0.6,0.6], light Yellow [0.6,0.6,0], light Purple [0.6,0,0.6], and Black [0,0,0].
These are indeed the results of PCA when even-sized modern and ancient samples from color populations are analyzed and the color palette is balanced (Fig. 19A). In the more realistic scenario where the color palette is imbalanced and sample sizes differ, PCA produced incorrect results where ancient Green (aGreen) clustered with modern Yellow (mYellow) away from its closest mGreen that clustered close to aRed. mPurple appeared as a four-way mix of aRed, aBlue, mCyan, and mDark Blue.
Instead of being at the center (Fig. 19A), Black became an outgroup and its distances to the other colors were distorted (Fig. 19B). Projecting ancient colors onto modern ones also highly misrepresented the relationships among the ancient samples as aRed overlapped with aBlue or aGreen, mYellow appeared closer to mCyan or aRed, and the outgroups continuously changed (Fig. 19C,D). Note that the first two PCs of the last results explained most of the variance (89%) of all analyses.
Recently, Lazaridis et al. (Ref. 14) projected ancient Eurasians onto modern-day Eurasians and reported that ancient samples from Israel clustered at one end of the Near Eastern cline and ancient Iranians at the other, close to modern-day Jews. Insights from the positions of the ancient populations were then used in their admixture modeling that supposedly confirmed the PCA results. To test whether the authors' inferences were correct and to what extent those PCA results are unique, we used similar modern and ancient populations to replicate the results of Lazaridis et al. (Fig. 20A). By adding the modern-day populations that Lazaridis et al. omitted, we found that the ancient Levantines cluster with Turks (Fig. 20B), Caucasians (Fig. 20C), Iranians (Fig. 20D), Russians (Fig. 20E), and Pakistani (Fig. 20F) populations. The overlap between the ancient Levantines and other populations also varied widely: they clustered with ancient Iranians and Anatolians, with Caucasians, or alone, as a population isolate. Moreover, the remaining ancient populations exhibited conflicting results inconsistent with our understanding of their origins. Mesolithic and Neolithic Swedes, for instance, clustered with modern Eastern Europeans (Fig. 20A–C) or remotely from them (Fig. 20D–F). These examples show the wide variety of results and interpretations possible to generate with ancient populations projected onto modern ones. Lazaridis et al.'s results are neither the only possible ones nor do they explain the most variation.
It is difficult to justify Lazaridis et al.'s (Ref. 14) preference for the first outcome, where the first two components explained only 1.35% of the variation (in our replication analysis; Lazaridis et al. omitted the proportion of explained variation) (Fig. 20A), compared to all the alternative outcomes that explained a much larger portion of the variation (1.92–6.06%).
Figure 20. PCA of 65 ancient Palaeolithic, Mesolithic, Chalcolithic, and Neolithic samples from Iran (12), Israel (16), the Caucasus (7), Romania (10), Scandinavia (15), and Central Europe (5) (colorful shapes) projected onto modern-day populations of various sample sizes (grey dots, black labels). The full population labels are shown in Supplementary Fig. S8. In addition to the modern-day populations used in (A), the following subfigures also include (B) Han Chinese, (C) Pakistani (Punjabi), (D) additional Russians, (E) Pakistani (Punjabi) and additional Russians, and (F) Pakistani (Punjabi), additional Russians, Han Chinese, and Mexicans. The ancient samples remained the same in all the analyses. In each plot (A–F), the ancient Levantines cluster with different modern-day populations.
We note that for high dimensionality data where markers are in high LD, projected samples tend to shrink, i.e., move towards the center of the plot. Corrections to this phenomenon have been proposed in the literature, e.g., Ref. 63. This phenomenon does not affect our datasets, which are very small (Fig. 19) or LD pruned (Fig. 20).
The effect of marker choice on PCA results has received little attention in the literature. Although PCA is routinely applied to different SNP sets, the PCs are typically deemed comparable. In forensic applications, which typically employ 100–300 markers, this is a major problem.
To evaluate the effect of various markers on PCA outcomes, it is unfeasible to use our color model, although it can be used to study the effects of missing data and noise, which are common in genomic datasets and reflect the biological properties of different marker types in capturing the population structure. Remarkably, the addition of 50% (Fig. 21A) and even 90% missingness (Fig. 21B) allowed recovering the original population structure. The structure decayed when random noise was added to the latter dataset (Fig. 21C). To further explore the effect of noise, we added random markers to the dataset. An addition of 10% of noisy markers increased the dataset's disparity, but it still retained the original structure (Fig. 21D). Interestingly, even adding 100% noisy markers allowed identifying the original structure's key features (Fig. 21E). Only when adding 1000% noisy markers did the original structure disappear (Fig. 21F). Note that the introduction of noise has also reduced the percent of variation explained by the PCs. These results highlight the importance of using ancestry informative markers (AIMs) to uncover the true structure of the dataset and accounting for disruptive markers.
Figure 21. Testing the effects of missingness and noise in a PCA of six fixed-size (n=50) samples from color populations. The top plots show the effect of missingness alone or combined with noise: (A) 50% missingness, (B) 90% missingness, and (C) 90% missingness and low-level random noise in all the markers. The bottom plots test the effect of noise when added to the original markers in the above plots using: (D) 30 random markers, (E) 300 random markers, and (F) 3000 random markers. Colors include Red [1,0,0], Green [0,1,0], Blue [0,0,1], Cyan [0,1,1], Yellow [1,1,0], and Black [0,0,0].
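The drop in explained variance as uninformative markers are added can be sketched as follows. This is a simplified illustration rather than the authors' simulation: it assumes three informative RGB markers per sample with small Gaussian noise (SD 0.05), appends uniformly random noise markers, and uses a hypothetical helper, `top2_explained`, to report the share of variance captured by the first two PCs.

```python
import numpy as np


def top2_explained(X):
    """Fraction of the total variance captured by the first two PCs of X."""
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
    var = s ** 2
    return float(var[:2].sum() / var.sum())


rng = np.random.default_rng(1)

# Six color populations with 50 samples each: Red, Green, Blue, Cyan, Yellow, Black.
colors = np.array(
    [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 0]], dtype=float
)
signal = np.repeat(colors, 50, axis=0) + rng.normal(0, 0.05, size=(300, 3))

ratios = {}
for k in (0, 30, 300, 3000):
    noise = rng.uniform(0, 1, size=(300, k))  # k markers unrelated to ancestry
    ratios[k] = top2_explained(np.hstack([signal, noise]))
    print(f"{k:>4} noise markers: top two PCs explain {ratios[k]:.1%}")
```

In this sketch the informative markers are the same in every run; only the number of uninformative markers changes, and the share of variance attributed to the top two PCs shrinks accordingly.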
To evaluate the extent to which marker types represent the population structure, we studied the relationships between UK British and other Europeans (Italians and Iberians) using different types of 30,000 SNPs, a number of similar magnitude to the number of SNPs analyzed by some groups (Refs. 64, 65). According to the full SNP set, the British do not overlap with Europeans (Fig. 22A). However, coding SNPs show considerable overlap (Fig. 22B) compared with intronic SNPs (Fig. 22C). Protein coding SNPs, RNA molecules, and upstream or downstream SNPs (Fig. 22D–F, respectively) also show small overlap. The identification of outliers, already a subjective measure, may also differ based on the proportions of each marker type. These results not only illustrate how the choice of markers and populations profoundly affects PCA results but also the difficulties in recovering the population structure in exome datasets. Overall, different marker types represent the population structure differently.
Figure 22. PCA of Tuscan Italians (n=115), British (n=105), and Iberians (n=150) across all markers (p~129,000) (A) and different marker types (p~30,000): (B) coding SNPs, (C) intronic SNPs, (D) protein-coding SNPs, (E) RNA molecules, and (F) upstream and downstream SNPs. Convex hull was used to generate the European cluster.
PCA is used to infer the ancestry of individuals for various purposes; however, a minimal sample size of one may be even more subject to biases than population studies. We found that such biases can occur when individuals with Green (Fig. 23A) and Yellow (Fig. 23B) ancestries clustered near admixed Cyan individuals and Orange, rather than with Greens or by themselves, respectively. One Grey individual clustered with Cyan (Fig. 23C) when it is the only available population, much like a Blue sample clustered with Green samples (Fig. 23D).
Figure 23. Inferring single individual ancestries using reference individuals.
(A) Using even-sized samples from reference populations (n=37): Red [1,0,0], Green [0,1,0], bright Cyan [0, 0.9, 0.8], dark Cyan [0, 0.9, 0.6], heterogeneous darker Cyan [0, 0.9, 0.4] with high standard deviation (0.25), with a light Green test individual [0, 0.5, 0]. (B) Using the same reference populations as in (A) with uneven sizes: Red (n=15), Green (n=15), bright Cyan (n=100), dark Cyan (n=15), heterogeneous darker Cyan (n=100), with a Yellow test individual [1,1,0]. (C) A heterogeneous Cyan population [0, 1, 1] (n=300) with high standard deviation (0.25) and a Grey test individual [0.5, 0.5, 0.5]. (D) Red [1,0,0] (n=10), Green [0,1,0] (n=10), a heterogeneous population [1, 1, 0.5] (n=200), and a Blue test individual [0,0,1].
Arguably, one of the most famous cases of personal ancestral inference occurred during the 2020 US presidential primaries when a candidate published the outcome of their genetic test undertaken by Carlos Bustamante that tested their Native American ancestry (https://elizabethwarren.com/wp-content/uploads/2018/10/Bustamante_Report_2018.pdf). Analyzing 764,958 SNPs, Bustamante sought to test the existence of Native American ancestry using populations from the 1000 Genomes Project and Amerindians. RFMix (Ref. 66) was used to identify Native American ancestry segments and PCA, elevated to be a machine learning technique, to verify that ancestry independently of RFMix. The longest of five genetic segments, judged to be of Native American origin, was analyzed using PCA and reported to be clearly distinct from segments of European ancestry and strongly associated with Native American ancestry as it clustered with Native Americans distinctly from Europeans and Africans (Fig. 1 in their report) and between Native American samples (Fig. 2 in their report).
Bustamante concluded that "While the vast majority of the individual's ancestry is European, the results strongly support the existence of an unadmixed Native American ancestor in the individual's pedigree, likely in the range of 6–10 generations ago." We have already shown that AJs (Fig. 9C) and Pakistanis (Fig. 14D) can cluster with Native Americans. With the candidate's DNA unavailable (and their specific European ancestry undisclosed), we tested whether the two PCA patterns observed by Bustamante can be reproduced for modern-day Eurasians without any reported Native American ancestry (Pakistani, Iranian, Even Russian, and Moscow Russian) (Figs. 24A–D, respectively).
Figure 24. Evaluation of Native American ancestry for four Eurasians. (A) Using even sample sizes (n=37) for Africans, Mexican-Americans, British, Puerto Ricans, Colombians, and a Pakistani. (B) Using uneven sample sizes for Africans (n=100), Mexican-Americans (n=20), British (n=50), Puerto Ricans (n=89), Colombians (n=89), and an Iranian. (C) Analyzing a whole-Amerindian cohort of Colombians (n=93), Mexican-Americans (n=117), Peruvians (n=75), Puerto Ricans (n=102), and an Even Russian. (D) Using uneven sample sizes for Africans (n=100), Mexican-Americans (n=53), British (n=20), Puerto Ricans (n=30), Colombians (n=89), and a Moscow Russian. All the samples were randomly selected.
These analyses show that the experimenter can easily generate desired patterns to support personal ancestral claims, making PCA an unreliable and misleading tool to infer personal ancestry. We further question the accuracy of Bustamante's report, given the biased reference population panel used by RFMix to infer the DNA segments with the alleged Amerindian origin, which excluded East European and North Eurasian populations. We draw no conclusions about the candidate's ancestry.
Continue reading here:
Principal Component Analyses (PCA)-based findings in population genetic studies are highly biased and must be reevaluated | Scientific Reports -...
Posted in Genetics
You’re in control: Exercise outweighs genetics when it comes to longer life – Study Finds
Posted: August 30, 2022 at 3:01 am
SAN DIEGO – If living into your 90s seems to run in the family, don't just assume that means you will too. Our genetics make us who we are, but new research from the University of California, San Diego finds exercise trumps genes when it comes to promoting a longer life.
You don't need a medical degree to know that forgoing physical activity in favor of stagnation isn't the wisest choice for your health and longevity. But certain people are genetically predisposed to live longer than others. The research team at UCSD set out to determine if such individuals don't have to move quite as much as the rest of us to live just as long.
"The goal of this research was to understand whether associations between physical activity and sedentary time with death varied based on different levels of genetic predisposition for longevity," says lead study author Alexander Posis, M.P.H., a fourth-year doctoral student in the San Diego State University/UC San Diego Joint Doctoral Program in Public Health, in a university release.
This research project began a decade ago. In 2012, as part of the Women's Health Initiative Objective Physical Activity and Cardiovascular Health study (OPACH), study authors began keeping track of the physical activity habits among 5,446 older U.S. women (ages 63 or older). Subjects were tracked up until 2020, and wore a research-grade accelerometer for up to seven days. That device measured how much time they spent moving, the intensity of that physical activity, and their usual amount of sedentary time.
Sure enough, higher levels of light physical activity and moderate-to-vigorous physical activity were associated with a lower risk of dying during the tracking period. Additionally, more time spent sedentary was associated with a higher risk of mortality. Importantly, this observed connection between exercise and a longer life remained consistent even among women determined to have different levels of genetic predisposition for longevity.
"Our study showed that, even if you aren't likely to live long based on your genes, you can still extend your lifespan by engaging in positive lifestyle behaviors such as regular exercise and sitting less," explains senior study author Aladdin H. Shadyab, Ph.D., assistant professor at the Herbert Wertheim School of Public Health and Human Longevity Science at UC San Diego. "Conversely, even if your genes predispose you to a long life, remaining physically active is still important to achieve longevity."
In conclusion, study authors recommend that older women engage in physical activity of any intensity as regularly as possible. Doing so will lower the risk of both various diseases and premature death.
The study is published in the Journal of Aging and Physical Activity.
Read more:
You're in control: Exercise outweighs genetics when it comes to longer life - Study Finds
Posted in Genetics
An international team sets out to cure genetic heart diseases with one shot – Freethink
Posted: August 30, 2022 at 3:01 am
Armed with a £30 million grant from the British Heart Foundation, an international team of researchers from the UK, US, and Singapore is setting its sights on curing forms of genetic heart disease using gene therapy.
Called the CureHeart Project, the team (which includes researchers from Oxford, Harvard, Singapore's National Heart Research Institute, and pharma multinational Bristol Myers Squibb) will develop therapies for inherited heart muscle conditions, which impact millions and can cause sudden death, including in young people.
They plan to tackle the problem using two types of targeted techniques, called base editing and prime editing.
An international team of researchers wants to develop a one-shot cure for inherited heart muscle conditions.
"Many of the mutations seen in these patients come down to one fateful letter in their DNA code," Christine Seidman, professor of medicine and genetics at Harvard Medical School and co-lead of CureHeart, told The Guardian.
"That has raised the possibility that we could alter that one single letter and restore the code so that it is now making a normal gene, with normal function," Seidman said.
The team's work is building on successful demonstrations in animals.
"Our goals are to fix the hearts, to stabilise them where they are and perhaps to revert them back to more normal function," Seidman said.
Fixing genetic heart disease: Inherited heart muscle diseases cause abnormalities in the heart, which are passed on through families.
Many different mutations can cause them, but in total, they affect one out of every 250 people around the world, Hugh Watkins, CureHeart's lead investigator and the director of Oxford's British Heart Foundation Centre of Research Excellence, told The Guardian.
People of any age can fall victim to sudden heart failure and death, and there is generally a 50/50 chance of passing the problem along to their children.
But decades of genetic research and recent innovations in gene therapy have researchers believing that gene editing may be the answer and even, eventually, the cure.
"After 30 years of research, we have discovered many of the genes and specific genetic faults responsible for different cardiomyopathies, and how they work," Watkins said.
By using prime and base editing, very precise tools for editing DNA, the team hopes to develop an injectable cure to repair faulty heart genes, the British Heart Foundation said in a release.
"We believe that we will have a gene therapy ready to start testing in clinical trials in the next five years," Watkins told The Guardian.
According to CureHeart, their genetic goals are twofold.
When the cause is a fault in one copy of a gene, which stops the healthy copy from working, they want to switch off the faulty copy; their second approach will be to edit the broken gene sequence itself, to correct it. They've demonstrated both methods in mouse models.
Delivering cures: To achieve those goals, the team is turning to two different precision gene editing techniques: prime editing and base editing.
Both enable researchers to edit DNA strands without completely slicing through them (unlike the earlier CRISPR techniques). Prime editing allows researchers to insert or remove certain parts of the genome more precisely, with less collateral damage and fewer errors.
"Prime editors offer more targeting flexibility and greater editing precision," Broad Institute chemist David Liu told Science.
Base editing, which Science reported was invented in Liu's lab, involves even smaller edits, engineering single letters in the code.
"We may be able to deliver these therapies in advance of disease, in individuals we know from genetic testing are at extraordinary risk of having disease development and progressing to heart failure," Seidman told The Guardian.
"Never before have we been able to deliver cures, and that is what our project is about. We know we can do it and we aim to get started."
Visit link:
An international team sets out to cure genetic heart diseases with one shot - Freethink
Posted in Genetics
Yale Study Suggests That Evolution Can Be Predicted – SciTechDaily
Posted: August 30, 2022 at 3:01 am
Evolution has long been thought to be random; however, a recent study suggests otherwise.
Evolution has long been thought of as a relatively random process, with species' features being formed by random mutations and environmental factors and thus largely unpredictable.
But an international team of scientists headed by researchers from Yale University and Columbia University discovered that a specific plant lineage independently developed three similar leaf types repeatedly in mountainous places scattered across the Neotropics.
The research revealed the first examples in plants of replicated radiation, which is the repeated development of similar forms in different regions. This discovery raises the possibility that evolution is not necessarily such a random process and can be anticipated.
The study was recently published in the journal Nature Ecology & Evolution.
Similar leaf types evolved independently in three species of plants found in cloud forests of Oaxaca, Mexico, and three species of plants in a similar environment in Chiapas, Mexico. This example of parallel evolution is one of several found by Yale-led scientists and suggests that evolution may be predictable. Credit: Yale University
"The findings demonstrate how predictable evolution can actually be, with organismal development and natural selection combining to produce the same forms again and again under certain circumstances," said Yale's Michael Donoghue, Sterling Professor Emeritus of Ecology & Evolutionary Biology and co-corresponding author. "Maybe evolutionary biology can become much more of a predictive science than we ever imagined in the past."
The research team examined the genetics and morphology of the Viburnum plant lineage, a genus of flowering plants that started to spread into Central and South America from Mexico around 10 million years ago. Donoghue conducted research on this plant group for his Ph.D. dissertation at Harvard 40 years ago. At the time, he advocated an alternate theory according to which large, hair-covered leaves and small, smooth leaves both evolved early in the history of the group and later migrated separately, being scattered by birds, through the different mountain ranges.
However, the new genetic analyses presented in the study demonstrate that the two different leaf types evolved separately and simultaneously in each of many mountain regions.
"I came to the wrong conclusion because I lacked the relevant genomic data back in the 1970s," Donoghue said.
The team found that a very similar set of leaf types evolved in nine of the 11 regions studied. However, the full array of leaf types may have yet to evolve in places where Viburnum has only more recently migrated. For instance, the mountains of Bolivia lack the large hairy leaf types found in other wetter areas with little sunshine in the cloud forest in Mexico, Central America, and northern South America.
"These plants arrived in Bolivia less than a million years ago, so we predict that the large, hairy leaf form will eventually evolve in Bolivia as well," Donoghue said.
Several examples of replicated radiation have been found in animals, such as Anolis lizards in the Caribbean. In that case, the same set of body forms, or ecomorphs, evolved independently on several different islands. With a plant example now in hand, evolutionary biologists will try to discover the general circumstances under which solid predictions can be made about evolutionary trajectories.
"This collaborative work, spanning decades, has revealed a wonderful new system to study evolutionary adaptation," said Erika Edwards, professor of ecology and evolutionary biology at Yale and co-corresponding author of the paper. "Now that we have established the pattern, our next challenges are to better understand the functional significance of these leaf types and the underlying genetic architecture that enables their repeated emergence."
Reference: "Replicated radiation of a plant clade along a cloud forest archipelago" by Michael J. Donoghue, Deren A. R. Eaton, Carlos A. Maya-Lastra, Michael J. Landis, Patrick W. Sweeney, Mark E. Olson, N. Ival Cacho, Morgan K. Moeglein, Jordan R. Gardner, Nora M. Heaphy, Matiss Castorena, Al Segovia Rivas, Wendy L. Clement, and Erika J. Edwards, 18 July 2022, Nature Ecology & Evolution. DOI: 10.1038/s41559-022-01823-x
View post:
Yale Study Suggests That Evolution Can Be Predicted - SciTechDaily
Posted in Genetics
Olufunmilayo I. Olopade, MD: Cutting Into Breast Cancer Disparities With Genetic Testing – Everyday Health
Posted: August 30, 2022 at 3:01 am
Where some people see race, gender, and ZIP code as drivers of cancer risk, Olufunmilayo (Funmi) I. Olopade, MD, also sees DNA, RNA, and alleles: the stuff of genes and genetic ancestry.
Dr. Olopade's work revolves around the intersection of genetics, breast cancer, and racial health disparities. Her research on aggressive breast cancers in young Black African and Black American women has revealed variants of genetic mutations that raise risk for breast cancer and link these two communities.
The Walter L. Palmer Distinguished Service Professor of Medicine and Human Genetics and director of the Center for Clinical Cancer Genetics and Global Health at the University of Chicago Medicine, Olopade earned her MD from the University of Ibadan in Nigeria in 1980. She joined the University of Chicago faculty in 1986.
Her work on cancer risk assessment, prevention, and treatment garnered the Distinguished Clinical Scientist Award of the Doris Duke Charitable Foundation in 2000, a MacArthur Genius Fellowship in 2005, and the 2017 Mendel Medal Lecture at Villanova University.
Today, Olopade continues to build on her work by showing other researchers and clinicians how to explore ancestry-specific genetic variants that lead to breast cancer, and how to use that knowledge to tailor prevention efforts and treatment to each person's needs.
As part of Closing the Cancer Gap, a continuing series on cancer disparities, the world-renowned expert explains her quest to harness heredity and realize the promise of precision medicine for everyone, whether in Nigeria or on the South Side of Chicago.
This interview has been edited for length and clarity.
Everyday Health: You were the first person to link a mutation in the BRCA gene, known to increase risk for breast cancer, in Nigerian women to BRCA mutations in Black American women of African descent. What led to that revelation?
Olufunmilayo I. Olopade: Originally, I wasn't focused on breast cancer. I was in Chicago studying genetics and looking at lymphoma (cancer of the lymphatic system). But then I saw so many young African American women seeking bone marrow transplants because they were facing advanced breast cancers. Some were only in their twenties and came from families with a history of the disease.
I began to seek out the stories of these young women and of others, including the namesake of the Susan G. Komen Foundation (a leading organization funding breast cancer research and awareness globally). Her personal journey and those of the many women who helped the disease gain visibility were intriguing.
Then I went back to Africa and saw young women crowded into a hospital waiting room, desperately sick with advanced breast cancers. I wondered whether we could link our knowledge about the genetic basis of the aggressive breast cancers in American women to these women from Africa and of African ancestry. I felt there was an imperative here, because these fast-growing cancers, called triple-negative, contribute to a 41 percent greater risk of African American women dying from breast cancer compared with their white counterparts.
Ten to 20 percent of all breast cancers in the United States are triple-negative. These cancers are harder to control because they lack the three most common hormone receptors (proteins inside and on the surface of cells that receive messages telling cells what to do). Since many breast cancer therapies target those three receptors, we must look at other options when treating cases of triple-negative.
EH: So, the triple-negative work propelled you into studying breast cancer and genetic ancestry?
OIO: After I launched the University of Chicago Cancer Risk Clinic in 1992, my team and I spent years studying genetics and learning how to identify women at the highest risk of breast cancer. We gathered findings from a large geographical area of Africa and compared them with results found in African Americans in Chicago.
That's how we confirmed that this specific kind of breast cancer, which shared recurrent BRCA1 (breast cancer 1) mutations, existed in extended African American families with histories of breast cancer and in Africans.
As we continued to grow our knowledge about genetics, we marked the 30th anniversary of the clinic's founding in July 2022 with a name change, from the Cancer Risk Clinic to the Cancer Prevention Clinic.
The switch reflects our move beyond fundamental biology (understanding how the disease works) to using genetics to allow early preventive measures, while, hopefully, maintaining a focus on equity in the medical system. By that I mean ensuring that underserved and underrepresented communities are part of our studies and clinical trials, and that they have access to the genetic screening and counseling too often denied them.
EH: The American Association for Cancer Research 2022 Disparities Report notes that breast cancer is the most prevalent form of cancer among Black American women and predicts that 36,260 new cases will be diagnosed in 2022. Since BRCA genes are relatively rare, what other factors may be contributing to the large case numbers?
OIO: One of the main causes was apparent almost immediately when I first came to Chicago: the presence of two cities.
When I arrived from Nigeria in 1986, I couldn't believe the level of segregation. There were food deserts and medical deserts and insufficiencies in the health system in some South Side and West Side neighborhoods with predominantly African American residents.
There were so few pharmacists in some areas that people couldn't get the medications they needed and would run out. And the health infrastructure was so deficient that it was extremely difficult for the most vulnerable Chicagoans to stay healthy. To my surprise, this was happening in the most well-resourced and blessed country in the world.
Our team was able to bring in many of these people for screening. But if they needed follow-up and treatment, it became tough for them. Many had to drop out because of other issues: a lack of transportation, for example, no time off for medical appointments, being a family member's caregiver, as well as their inability to pay or a lack of insurance.
Unfortunately, the healthcare system is now reckoning with the general inattention to diseases that affect certain populations. Society has fragmented us into healthcare haves and have-nots.
EH: How can we improve the outcomes for underserved communities while benefiting everyone?
OIO: We have to find a way to get genetic testing done for everyone so that we can fully understand individuals' risks and respond accordingly. And when someone has a greater chance of developing the disease, we need to find a way to secure a breast MRI (magnetic resonance imaging), which can pick up cancer long before a mammogram can.
Assessing risk can also help us determine whether we can and should use one of the three approved drugs shown effective in clinical trials at reducing cancer risk. Some of the answers may come from a study now underway, called the WISDOM project (Women Informed to Screen Depending on Measures of Risk), which compares the effectiveness of two approved screening approaches: annual mammograms starting at age 40 for all women versus creating a personalized risk profile and screening program.
EH: What else may be keeping us from making better progress?
OIO: If we could evaluate everyone's genetic profile, we could catch the disease as early as possible instead of waiting for people to become ill. Any cancer is potentially curable if discovered early enough.
But right now, too many people don't know that genetic tests are available, too few doctors ask for them, and insurance often denies coverage. Without solving those problems, we can't take full advantage of the power of precision medicine.
EH: What do you mean by precision medicine?
OIO: I'm referring to the ability to select the right drug for the right condition at the right time. Using the appropriate treatment when it's most effective can help prevent and treat cancers with fewer side effects.
Genetic information enables us to determine who needs chemotherapy, which type is most effective, and when immunotherapy, for example, is more effective. Not all very early cancers are deadly. Some can be closely watched. Some need additional intervention or require a different kind of prevention.
In the next decade, I predict we'll see this kind of optimized treatment become available for everyone, whether in Nigeria or on the South Side of Chicago. We will make it all happen.
EH: What drew you to the field of medicine?
OIO: My father was a pastor, so when people were sick, they would come to our house for prayers. Some, of course, remained ill, and my father, whose unofficial motto was "health is wealth," would always remark about how wonderful it would be to have a doctor in the family: someone who could provide more help to these people. He strongly encouraged me to learn about medicine.
Although I'm the only one of six siblings who became a doctor, I have a sister who's a nurse, and my daughter, one of my three children, runs a healthcare company, Cancer IQ, that has created an application to help medical providers track critical genetic information.
EH: What are your current projects and goals?
OIO: We're trying to develop better ways to assess breast cancer risk, particularly through the use of image-based biomarkers in breast MRIs.
We did a study, published in the March 2019 issue of Clinical Cancer Research, showing that scheduling two MRIs a year is preferable to a single yearly mammogram for younger women at high risk for some forms of breast cancer. But MRIs are more expensive than mammograms. And, as I said before, insurance doesn't always cover them.
When I refer to biomarkers, I'm suggesting that we have the ability to do extremely accurate assessments using artificial intelligence, which can read millions of MRI images and pick out subtle changes that mammograms can't. So, from the very first screening, we can monitor these women and plan for any potential interventions, if and when they become necessary.
We also want to better understand why certain populations have much lower levels of breast cancer. Individuals of Asian or Hispanic descent, who are less prone to develop certain breast cancers, may help us pinpoint and isolate whatever particular protective factor is involved. That could potentially lead to future preventions and treatments.
And, of course, I also plan to maintain our focus on equity in access, as we continue to study the effects of new drugs on women of African descent and on the entire population. We need to ensure that we fully understand the side effects on the entire range of people taking these drugs.
Premature breast cancer death is unacceptable; too many women die too young. So, our current goal is the same as always: identify the patient, predict the risk, and prevent the cancer.
EH: What's the most challenging part of your work?
OIO: The toughest thing you face as a doctor is losing a patient. I try to remember that there is great value in creating an end-of-life plan that diminishes pain and suffering and preserves true dignity. As doctors, we have moments of victory and moments when we are humbled by what we do, but to be present with a patient at the end is so important.
Originally posted here:
Olufunmilayo I. Olopade, MD: Cutting Into Breast Cancer Disparities With Genetic Testing - Everyday Health
Posted in Genetics
Could Genetics Be the Key to Never Getting the Coronavirus? – The Atlantic
Posted: July 27, 2022 at 2:47 am
Last Christmas, as the Omicron variant was ricocheting around the United States, Mary Carrington unknowingly found herself at a superspreader event: an indoor party, packed with more than 20 people, at least one of whom ended up transmitting the virus to most of the gathering's guests.
After two years of avoiding the coronavirus, Carrington felt sure that her time had come: She'd been holding her great-niece, who tested positive soon after, "and she was giving me kisses," Carrington told me. But she never caught the bug. "And I just thought, Wow, I might really be resistant here." She wasn't thinking about immunity, which she had thanks to multiple doses of a COVID vaccine. Rather, perhaps via some inborn genetic quirk, her cells had found a way to naturally repel the pathogen's assaults instead.
Carrington, of all people, understood what that would mean. An expert in immunogenetics at the National Cancer Institute, she was one of several scientists who, beginning in the 1990s, helped uncover a mutation that makes it impossible for most strains of HIV to enter human cells, rendering certain people essentially impervious to the pathogen's effects. Maybe something analogous could be safeguarding some rare individuals from SARS-CoV-2 as well.
The idea of coronaviral resistance is beguiling enough that scientists around the world are now scouring people's genomes for any hint that it exists. If it does, they could use that knowledge to understand whom the virus most affects, or leverage it to develop better COVID-taming drugs. For individuals who have yet to catch the contagion (a fast-dwindling proportion of the population), resistance dangles like a superpower that people can't help but think they must have, says Paula Cannon, a geneticist and virologist at the University of Southern California.
As with any superpower, though, bona fide resistance to SARS-CoV-2 infection would likely be very rare, says Helen Su, an immunologist at the National Institute of Allergy and Infectious Diseases. Carrington's original hunch, for one, eventually proved wrong: She recently returned from a trip to Switzerland and found herself entwined with the virus at last. Like most people who remained unscathed until recently, Carrington had done so for two and a half years through a probable combination of vaccination, cautious behavior, socioeconomic privilege, and luck. It's entirely possible that inborn coronavirus resistance may not even exist, or that it may come with such enormous costs that it's not worth the protection it theoretically affords.
Of the 1,400 or so viruses, bacteria, parasites, and fungi known to cause disease in humans, Jean-Laurent Casanova, a geneticist and an immunologist at Rockefeller University, is certain of only three that can be shut out by bodies with one-off genetic tweaks: HIV, norovirus, and a malaria parasite.
The HIV-blocking mutation is maybe the most famous. About three decades ago, researchers, Carrington among them, began looking into a small number of people "who we felt almost certainly had been exposed to the virus multiple times, and almost certainly should have been infected, and yet had not," she told me. Their superpower was simple: They lacked functional copies of a gene called CCR5, which builds a cell-surface protein that HIV needs in order to hack its way into T cells, the virus's preferred human prey. Just 1 percent of people of European descent harbor this mutation, called CCR5-Δ32, in two copies; in other populations, the trait is rarer still. Even so, researchers have leveraged its discovery to cook up a powerful class of antiretroviral drugs, and purged the virus from two people with the help of Δ32-based bone-marrow transplants, the closest that medicine has come to developing a functional HIV cure.
The stories with those two other pathogens are similar. Genetic errors in a gene called FUT2, which pastes sugars onto the outsides of gut cells, can render people resistant to norovirus; a genomic tweak erases a protein called Duffy from the walls of red blood cells, stopping Plasmodium vivax, one of several parasites that cause malaria, from wresting its way inside. The Duffy mutation, which affects a gene called DARC/ACKR1, is so common in parts of sub-Saharan Africa that those regions have driven rates of P. vivax infection way down.
In recent years, as genetic technologies have advanced, researchers have begun to investigate a handful of other infection-resistance mutations against other pathogens, among them hepatitis B virus and rotavirus. But the links are tough to definitively nail down, thanks to the number of people these sorts of studies must enroll, and to the thorniness of defining and detecting infection at all; the case with SARS-CoV-2 will likely be the same. For months, Casanova and a global team of collaborators have been in contact with thousands of people from around the world who believe they harbor resistance to the coronavirus in their genes. The best candidates have had intense exposures to the virus (say, via a symptomatic person in their home) and continuously tested negative for both the pathogen and immune responses to it. But respiratory transmission is often muddied by pure chance; the coronavirus can infiltrate people silently, and doesn't always leave antibodies behind. (The team will be testing for less fickle T-cell responses as well.) People without clear-cut symptoms may not test at all, or may not test properly. And all on its own, the immune system can guard people against infection, especially in the period shortly after vaccination or illness. With HIV, a virus that causes chronic infections, lacks a vaccine, and spreads through clear-cut routes in concentrated social networks, it was easier to identify those individuals whom the virus had visited but not put down permanent roots within, says Ravindra Gupta, a virologist at the University of Cambridge. SARS-CoV-2 won't afford science the same ease of study.
A full analogue to the HIV, malaria, and norovirus stories may not be possible. Genuine resistance can manifest in only so many ways, and tends to be born out of mutations that block a pathogen's ability to force its way into a cell, or xerox itself once it's inside. CCR5, Duffy, and the sugars dropped by FUT2, for instance, all act as microbial landing pads; mutations rob the bugs of those perches. If an equivalent mutation exists to counteract SARS-CoV-2, it might logically be found in, say, ACE2, the receptor that the coronavirus needs in order to break into cells, or TMPRSS2, a scissors-like protein that, for at least some variants, speeds the invasive process along. Already, researchers have found that certain genetic variations can dial down ACE2's presence on cells, or pump out junkier versions of TMPRSS2, hints that there could be tweaks that further strip away the molecules. But ACE2 is very important to blood-pressure regulation and the maintenance of lung-tissue health, said Su, of NIAID, who's one of many scientists collaborating with Casanova to find SARS-CoV-2 resistance genes. A mutation that keeps the coronavirus out might very well muck around with other aspects of a person's physiology. That could make the genetic tweak vanishingly rare, debilitating, or even, as Gupta put it, "not compatible with life." People with the CCR5-Δ32 mutation, which halts HIV, are basically completely normal, Cannon told me, which means HIV kind of messed up in choosing CCR5. The coronavirus, by contrast, has figured out how to exploit something vital to its host, an ingenious invasive move.
The superpowers of genetic resistance can have other forms of kryptonite. A few strains of HIV have figured out a way to skirt around CCR5, and glom on to another molecule, called CXCR4; against this version of the virus, even people with the Δ32 mutation are not safe. A similar situation has arisen with Plasmodium vivax, which "we do see in some Duffy-negative individuals," suggesting that the parasite has found a back door, says Dyann Wirth, a malaria researcher at Harvard's School of Public Health. Evolution is a powerful strategy, and with SARS-CoV-2 spewing out variants at such a blistering clip, "I wouldn't necessarily expect resistance to be a checkmate move," Cannon told me. BA.1, for instance, conjured mutations that made it less dependent on TMPRSS2 than Delta was.
Still, protection doesn't have to be all or nothing to be a perk. Partial genetic resistance, too, can reshape someone's course of disease. With HIV, researchers have pinpointed changes in groups of so-called HLA genes that, through their impact on assassin-like T cells, can ratchet down people's risk of progressing to AIDS. And a whole menagerie of mutations that affect red-blood-cell function can mostly keep malaria-causing parasites at bay, though many of these changes come with a huge human cost, Wirth told me, saddling people with serious clotting disorders that can sometimes turn lethal themselves.
With COVID-19, too, researchers have started to home in on some trends. Casanova, at Rockefeller, is one of several scientists who have led efforts unveiling the importance of an alarm-like immune molecule called interferon in early control of infection. People who rapidly pump out gobs of the protein in the hours after infection often fare just fine against the virus. But those whose interferon responses are weak or laggy are more prone to getting seriously sick; the same goes for people whose bodies manufacture maladaptive antibodies that attack interferon as it passes messages between cells. Other factors could toggle the risk of severe disease up or down as well: cells' ability to sense the virus early on; the amount of coordination between different branches of defense; the brakes the immune system puts on itself, so it does not put the host's own tissues at risk. Casanova and his colleagues are also on the hunt for mutations that might alter people's risk of developing long COVID and other coronaviral consequences. None of these searches will be easy. But they should be at least simpler than the one for resistance to infection, Casanova told me, because the outcomes they're measuring (serious and chronic forms of disease) are that much more straightforward to detect.
If resistance doesn't pan out, that doesn't have to be a letdown. People don't need total blockades to triumph over microbes, just a defense that's good enough. And the protection we're born with isn't all the leverage we've got. Unlike genetics, immunity can be easily built, modified, and strengthened over time, particularly with the aid of vaccines. Those DIY defenses are probably what kept Carrington's case of COVID down to a mild course, she told me. Immune protection is also a far surer bet than putting a wager on what we may or may not inherit at birth. Better to count on the protections we know we can cook up ourselves, now that the coronavirus is clearly with us for good.
More here:
Could Genetics Be the Key to Never Getting the Coronavirus? - The Atlantic
Posted in Genetics
Noonan appointed Kent Professor of Genetics and Professor of Neuroscience – Yale News
Posted: July 27, 2022 at 2:47 am
James Noonan, who has made critical and novel contributions to the fields of human evolutionary genetics and neurodevelopment, was recently appointed the Albert E. Kent Professor of Genetics and Professor of Neuroscience, effective immediately.
Noonan received his B.S. in biology and English literature from the State University of New York at Binghamton in 1997, and his Ph.D. in genetics from Stanford University School of Medicine in 2004. He completed a postdoctoral fellowship in the Genomics Division at Lawrence Berkeley National Laboratory from 2004 to 2007. In 2007, he was recruited to Yale as assistant professor and was promoted to associate professor in 2013, and professor in 2021. He has a secondary appointment in Yale's Department of Neuroscience.
Noonan's research program is focused on deciphering the role of gene regulatory changes in the evolution of uniquely human traits. This work addresses a central hypothesis in human evolution, proposed more than 40 years ago: that changes in the level, timing, and location of gene expression account for biological differences between humans and other primates. Noonan has discovered thousands of human-specific genetic changes that alter gene expression and regulation, and by pioneering novel genetic models, his lab has begun to reveal how human-specific regulatory changes alter developmental traits. His work has provided key insights into the genetic origins of human biological uniqueness and has driven the rise of a new field: human evolutionary developmental biology.
Noonan's seminal research discovered two classes of gene regulatory elements implicated in human evolution. The first are Human Accelerated Regions (HARs), which encode transcriptional enhancers that are highly conserved across species and show many human-specific sequence changes (Science 2006, Science 2008). Using humanized mouse models, he has shown that HARs alter developmental gene expression and drive the evolution of novel phenotypes. As an example, he recently showed that one HAR altered expression of a transcription factor that has a role in limb development, possibly contributing to changes in skeletal patterning in human limb evolution (Nature Communications, 2022). These findings provide mechanistic insight into how HARs modified gene expression in human evolution. Using massively parallel assays, he has also comprehensively characterized the effect of thousands of human-specific sequence changes in HARs on their activity during neurodevelopment (Proceedings of the National Academy of Sciences, 2021).
He also identified thousands of human-specific changes in enhancer activity by direct analysis of developing human and nonhuman tissues. These loci, termed Human Gain Enhancers (HGEs), have gained activity in the developing human limb (Cell, 2013) and cerebral cortex (Science, 2015). These studies identified the biological pathways in limb and cortical development likely altered by human-specific regulatory changes, providing the basis for understanding their effects using genetic and experimental models.
Noonan has also contributed substantially to the educational programs of Yale School of Medicine, revolutionizing its graduate training landscape and empowering experimental genetics research across many labs at Yale. He designed the first course in genomics in the medical school more than 12 years ago, providing hundreds of students and faculty with the skills required to excel at the frontier of modern biomedical science. His training efforts have helped to set the standards of genomic research at Yale and ensured that the university remains a world leader in genomics.
Happy 200th birthday, Gregor Mendel: 5 ways the father of modern genetics impacted your life today – Clemson News
Posted: July 27, 2022 at 2:47 am
July 20, 2022
Today is Gregor Mendel's 200th birthday, and celebrations are being held all across the globe.
Why the fuss over an Augustinian monk from the 1800s who grew peas in the garden of the abbey where he lived?
Because breeding and studying those pea plants led Mendel to discover the fundamental laws of inheritance, earning him wide recognition as the father of modern genetics. Today, researchers at Clemson University and across the world are using genetics in ways Mendel would never have dreamed of: for personalized medicine that tailors disease prevention and treatment, to breed drought- and disease-resilient crops, and to improve the health of agricultural crops and animals.
"It all goes back to Mendel paying attention to what was happening right there around him. Mendel was the first person to really understand that traits can be quantified and inherited in a predictable way. That's the foundation of genetics, full stop. That's where it starts," said David F. Clayton, chair of the Clemson University Department of Genetics and Biochemistry. "Of course, in the years since, such great progress has been made in how that actually happens."
Back when Mendel started growing his pea plants, it was thought that traits in offspring were a result of a blending of traits of each parent, kind of like mixing paint.
Mendel studied seven traits of pea plants: seed color, seed shape, flower position, flower color, pod shape, pod color and stem length. He noticed that when he cross-pollinated a pea plant with yellow seeds with one that had green seeds, he didn't get plants with yellowish-green seeds. Instead, all of them were yellow. However, when that crop self-pollinated, 75% of the second generation were yellow and 25% were green. Mendel concluded that each individual had two complete sets of inheritable factors, one from each parent. He attributed that generation-skipping to some characteristics being dominant and some being recessive.
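The 75%/25% split follows from simple counting. The sketch below is only an illustration, not Mendel's own notation: it assumes the labels Y for the dominant yellow factor and y for the recessive green one, and the function names are hypothetical. Each parent passes on one of its two factors, and every combination is equally likely.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> Counter:
    """Count offspring genotypes when each parent passes on one of its two factors."""
    # Sorting puts the uppercase (dominant) letter first, so "yY" and "Yy" count as one genotype.
    return Counter("".join(sorted(pair)) for pair in product(parent_a, parent_b))


def phenotype_fractions(genotypes: Counter, dominant: str = "Y") -> dict:
    """Fraction of offspring showing the dominant versus the recessive trait."""
    total = sum(genotypes.values())
    shows_dominant = sum(n for g, n in genotypes.items() if dominant in g)
    return {"dominant": shows_dominant / total, "recessive": (total - shows_dominant) / total}


# First generation: pure yellow (YY) crossed with pure green (yy).
f1 = cross("YY", "yy")
# Second generation: the first-generation plants (all Yy) self-pollinate.
f2 = cross("Yy", "Yy")

print(f1, phenotype_fractions(f1))
print(f2, phenotype_fractions(f2))
```

Every first-generation plant carries one factor of each kind and so shows the dominant color. In the second generation, one of the four equally likely combinations carries two recessive factors, which is where the 25% comes from.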
Since then, our knowledge of genetics has grown by leaps and bounds. After all, Mendel didn't know about genes or DNA, and sequencing of the human genome wasn't completed until 2003.
Today, genetics is all around us, woven into our daily lives in small and large ways.
The Clemson University Center for Human Genetics and the College of Science will celebrate Mendel's birthday with a lecture by Daniel J. Fairbanks of Utah Valley University on Sept. 2 at 2:30 p.m. The lecture, "Gregor Mendel at the Bicentennial of his Birth: The Life and Legacy of a Scientific Genius," will be held on Zoom. It is part of the College of Science's Discover Science Lecture Series.
The College of Science pursues excellence in scientific discovery, learning, and engagement that is both locally relevant and globally impactful. The life, physical and mathematical sciences converge to tackle some of tomorrow's scientific challenges, and our faculty are preparing the next generation of leading scientists. The College of Science offers high-impact transformational experiences such as research, internships and study abroad to help prepare our graduates for top industries, graduate programs and health professions. clemson.edu/science
Population genomics of Group B Streptococcus reveals the genetics of neonatal disease onset and meningeal invasion – Nature.com
Posted: July 27, 2022 at 2:47 am
Scientists are narrowing in on why some people keep avoiding Covid. BA.5 could end that luck. – NBC News
Posted: July 27, 2022 at 2:47 am
A majority of people in the U.S. have had Covid-19 at least once (likely more than 70% of the country), White House Covid-19 Response Coordinator Ashish Jha said on Thursday, citing data from the Centers for Disease Control and Prevention.
Many have been infected multiple times. In a study that has not been peer reviewed, which looked at 257,000 U.S. veterans who'd contracted Covid at least once, 12% had a reinfection by April and about 1% had been infected three times or more.
This raises an obvious question: What is keeping that shrinking minority of people from getting sick?
Disease experts are homing in on a few predictive factors beyond individual behavior, including genetics, T cell immunity and the effects of inflammatory conditions like allergies and asthma.
But even as experts learn more about the reasons people may be better equipped to avoid Covid, they caution that some of these defenses may not hold up against the latest version of omicron, BA.5, which is remarkably good at spreading and evading vaccine protection.
"It really takes two to tango," said Neville Sanjana, a bioengineer at the New York Genome Center. "If you think about having an infection and any of the bad stuff that happens after that, it really is a product of two different organisms: the virus and the human."
In 2020, New York University researchers identified a multitude of genes that could affect a person's susceptibility to the coronavirus. In particular, they found that inhibiting certain genes that code for a receptor known as ACE2, which allows the virus to enter cells, could reduce a person's likelihood of infection.
Sanjana, who conducted that research, estimated that about 100 to 500 genes could influence Covid-19 susceptibility in sites like the lungs or nasal cavity.
Genetics is "likely to be a large contributor" to protection from Covid-19, he said. "I would never say it's the only contributor."
In July, researchers identified a common genetic factor that could influence the severity of a coronavirus infection. In a study of more than 3,000 people, two genetic variations decreased the expression of a gene called OAS1, which is part of the innate immune response to viral infections. That was associated with an increased risk of Covid-19 hospitalization.
Increasing the gene's expression, then, should have the opposite effect, reducing the risk of severe disease, though it wouldn't necessarily prevent infection altogether.
"It's very natural to get infected once you are exposed. There's no magic bullet for that. But after you get infected, how you're going to respond to this infection, that's what is going to be affected by your genetic variants," said Ludmila Prokunina-Olsson, the study's lead researcher and chief of the Laboratory of Translational Genomics at the National Cancer Institute.
Still, Benjamin tenOever, a microbiology professor at the NYU Grossman School of Medicine who helped conduct the 2020 research, said it would be difficult for scientists to pinpoint a particular gene responsible for preventing a Covid infection.
"While there might still be certainly some genetics out there that do render people completely resistant, they're going to be incredibly hard to find," tenOever said. "People have already been looking intensely for two years with no actual results."
Aside from this new coronavirus, SARS-CoV-2, four other coronaviruses commonly infect people, typically causing mild to moderate upper respiratory illnesses like the common cold.
A recent study suggested that repeated exposure to or occasional infections from these common cold coronaviruses may confer some protection from SARS-CoV-2.
The researchers found that T cells, a type of white blood cell that recognizes and fights invaders, seem to recognize SARS-CoV-2 based on past exposure to other coronaviruses. So when a person who has been infected with a common cold coronavirus is later exposed to SARS-CoV-2, they might not get as sick.
But that T cell memory probably can't prevent Covid entirely.
"While neutralizing antibodies are key to prevent an infection, T cells are key to terminate an infection and to modulate the severity of infection," said Alessandro Sette, the study's author and a professor at the La Jolla Institute for Immunology.
Sette said it's possible that some people's T cells clear the virus so quickly that the person never tests positive for Covid. But researchers aren't yet sure if that's what's happening.
"It's possible that, despite being negative on the test, it was a very abortive, transient infection that was not detected," Sette said.
At the very least, he said, T cells from past Covid infections or vaccines should continue to offer some protection against coronavirus variants, including BA.5.
Although asthma was considered a potential risk factor for severe Covid earlier in the pandemic, more recent research suggests that low-grade inflammation from conditions like allergies or asthma may have a protective benefit.
"You'll hear these stories about some individuals getting sick and having full-blown symptoms of Covid, and having slept beside their partner for an entire week during that period without having given it to them. People think that they must have some genetic resistance to it, [but] a big part of that could be if the partner beside them in any way has a higher than normal inflammatory response going on in their lungs," tenOever said.
A May study found that having a food allergy halved the risk of a coronavirus infection among nearly 1,400 U.S. households. Asthma didn't lower people's risk of infection in the study, but it didn't raise it, either.
One theory, according to the researchers, is that people with food allergies express fewer ACE2 receptors on the surface of their airway cells, making it harder for the virus to enter.
"Because there are fewer receptors, you will have either a much lower grade infection or just be less likely to even become infected," said Tina Hartert, a professor of medicine and pediatrics at the Vanderbilt University School of Medicine, who co-led that research.
The study took place from May 2020 to February 2021, before the omicron variant emerged. But Hartert said BA.5 likely wouldn't eliminate cross-protection from allergies.
"If something like allergic inflammation is protective, I think it would be true for all variants," Hartert said. "The degree to which it could be protective could certainly differ."
For many, the first explanation that springs to mind when thinking about Covid avoidance is one's personal level of caution. NYU's tenOever believes that individual behavior, more than genetics or T cells, is the key factor. He and his family in New York City are among those who've never had Covid, which he attributes to precautions like staying home and wearing masks.
"I don't think for a second that we have anything special in our genetics that makes us resistant," he said.
It's now common knowledge that Covid was easier to avoid before omicron, back when a small percentage of infected people were responsible for the majority of the virus's spread. A 2020 study, for example, found that 10% to 20% of infected people accounted for 80% of transmissions.
But omicron and its subvariants have made any social interaction riskier for everyone involved.
"It's probably far more of an equal playing field with the omicron variants than it ever was for the earlier variants," tenOever said.
BA.5, in particular, has increased the odds that people who've avoided Covid thus far will get sick. President Joe Biden is a prime example: He tested positive for the first time this week.
But even so, Jha said on Thursday in a news briefing, "I don't believe that every American will be infected."