Allelic gene conversion can produce effects resembling soft selective sweeps - 2023

Original paper: https://doi.org/10.1101/2023.12.05.570141

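Introduction

Positive selection can cause a selective sweep: as the allele favored by natural selection rises toward fixation, the haplotype carrying it displaces the others, "sweeping" away much of the genetic diversity in the region around the selected site.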

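Selective sweeps are usually divided into two kinds. In a hard sweep, a single copy of the adaptive allele is driven to fixation by selection. In a soft sweep, multiple distinct copies of the adaptive allele leave descendants after the sweep. A soft sweep can arise in two ways: recurrent mutation to the same adaptive allele, or standing genetic variation that already contained multiple copies of the adaptive allele before selection began. In general, soft sweeps should be common when a population can adapt quickly to a new environment, either because the mutation rate is high or because adaptive alleles are already present (such populations also tend to be very large, or to have had a very large historical Ne).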

There is, however, one situation that produces a pattern very similar to a soft sweep: allelic gene conversion.

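Definition of gene conversion (Wikipedia)

Gene conversion is the mechanism by which one DNA sequence in a genome replaces a homologous sequence. It can occur during meiosis between the corresponding genes on the homologous chromosomes of a heterozygote (allelic gene conversion), causing non-Mendelian inheritance at that locus in the gametes [1]. It can also occur in other circumstances between two homologous genes of the same gene family within a genome, such as members of a gene cluster or tandemly repeated genes (ectopic gene conversion), which homogenizes the gene sequences and results in concerted evolution [2][3].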

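When allelic gene conversion is present, a hard sweep acting on a single-origin de novo adaptive allele can end up looking almost identical to a soft sweep, as if several copies of the allele had been selected at once. Gene conversion lets the selected allele "jump" onto a new genetic background, that is, onto the homologous chromosome, which carries a different haplotype at the same locus. This leaves relatively high linkage disequilibrium (LD) between the regions involved, the opposite of what a standard hard sweep without gene conversion produces.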

The paper calls this "fake" soft sweep a "pseudo-soft sweep".
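To make the "jumping" mechanism concrete, here is a minimal toy sketch of my own. It is not the paper's simulation setup. It assumes a haploid Wright-Fisher population with no recombination, a single de novo adaptive mutation, and unbiased gene conversion at the selected site only. The function name `backgrounds_at_fixation` and the parameters `n_chrom`, `s`, and `conv_rate` are hypothetical, and `conv_rate` is a per-chromosome, per-generation toy rate, not a measured per-base-pair rate.

```python
import random


def backgrounds_at_fixation(n_chrom=200, s=0.1, conv_rate=0.0, rng=None,
                            max_tries=10_000):
    """Toy haploid Wright-Fisher sweep (illustrative assumptions only).

    Returns the number of distinct backgrounds that carry the adaptive
    allele once it has fixed.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        # Each chromosome is (carries_adaptive_allele, background_id).
        # A single de novo mutation starts on background 0.
        pop = [(i == 0, i) for i in range(n_chrom)]
        while True:
            carriers = sum(carries for carries, _ in pop)
            if carriers == 0:
                break  # allele lost by drift: restart (condition on fixation)
            if carriers == n_chrom:
                return len({bg for _, bg in pop})
            # Selection: carriers are sampled as parents with weight 1 + s.
            weights = [1.0 + s if carries else 1.0 for carries, _ in pop]
            parents = rng.choices(pop, weights=weights, k=n_chrom)
            offspring = []
            for carries, bg in parents:
                if rng.random() < conv_rate:
                    # Unbiased gene conversion: the selected site is overwritten
                    # with the state of a random chromosome from the parental
                    # generation, while the background label is kept.
                    carries = rng.choice(pop)[0]
                offspring.append((carries, bg))
            pop = offspring
    raise RuntimeError("adaptive allele never fixed within max_tries")
```

With `conv_rate=0.0` the function always returns 1, because every carrier descends from the original mutant on background 0, which is a hard sweep. With `conv_rate > 0` it can return more than one background even though the mutation arose only once, which is the pseudo-soft outcome described above. Running it over many replicates and counting how often the result exceeds 1 gives a rough feel for how the softened fraction depends on population size and selection strength in this toy model.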

Question addressed

Test whether pseudo-soft sweeps caused by gene conversion are common under realistic demographic histories.

Methods

Simulations were run in SLiM for model species including human, Drosophila (fruit fly), and Arabidopsis.

Conceptual diagram

[Figure: F1.large.jpg]

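Conclusions

In the simulated human populations, gene conversion "softened" only a small fraction of selective sweeps.

In the simulated Drosophila populations, gene conversion "softened" most selective sweeps.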

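I find this the most striking conclusion: most of the selective sweeps in Drosophila that look like soft sweeps may actually be hard sweeps altered by gene conversion. The same misclassification could therefore be widespread in other insect populations. If so, sweep scans used to find loci involved in local adaptation could go wrong, and the conclusions of such studies could be seriously affected.

Will there be supporting software for this in the future, for example something that estimates the frequency of gene conversion while running a sweep scan? How would it be detected? The discussion at the end of the paper does mention machine learning as a route toward such tools: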

Part of the reason for the controversy surrounding soft sweeps is that, because of their more subtle signature, detecting them with a low false positive rate is more difficult (Schrider and Kern, 2016). Thus, it would be convenient for sweep-detection efforts if positive selection rarely resulted in soft sweeps. However, we do not appear to be this fortunate—regardless of the mode of selection, signatures essentially indistinguishable from soft sweeps should be the norm in species with large population sizes and per-base pair gene conversion rates comparable to those measured in several animal species. It is also worth noting that our conclusions may not be limited to models with a single target of selection, as polygenic selection can in many cases produce selective sweeps (Thornton, 2019), which of course may also experience gene conversion. Thus, even if true soft sweeps never occur, although this appears unlikely as argued above, the presence of gene conversion necessitates methods capable of detecting the signatures of soft sweeps if we hope to be able to uncover loci responsible for recent adaptation. Continued advances on this front, such as machine learning methods that can detect soft sweeps with increasingly high accuracy (e.g. (Schrider and Kern, 2016; Kern and Schrider, 2018; Mughal and DeGiorgio, 2019; Lauterbur et al., 2023b; Arnab et al., 2023), are therefore essential if we wish to fully understand the importance of positive selection in recent evolutionary history and its role in shaping patterns of diversity.
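As a small illustration of what a soft-sweep signature looks like in data, here is a sketch of the haplotype homozygosity statistics H1, H12, and H2/H1 from Garud et al. (2015). This is a swapped-in, much simpler summary than the machine learning methods cited in the quote, and it is not something the paper itself uses as far as these notes go. The function name `haplotype_homozygosity` and the input format (one hashable haplotype label per sampled chromosome in a window) are my own assumptions.

```python
from collections import Counter


def haplotype_homozygosity(haplotypes):
    """Compute H1, H12 and H2/H1 for one genomic window.

    `haplotypes` is a sequence with one hashable label (for example a
    string of alleles) per sampled chromosome.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    # Haplotype frequencies, most common first.
    freqs = sorted((c / n for c in counts.values()), reverse=True)
    p1 = freqs[0]
    p2 = freqs[1] if len(freqs) > 1 else 0.0
    h1 = sum(p * p for p in freqs)   # expected haplotype homozygosity
    h12 = h1 + 2 * p1 * p2           # pools the two most common haplotypes
    h2 = h1 - p1 * p1                # homozygosity without the top haplotype
    return {"H1": h1, "H12": h12, "H2/H1": h2 / h1}


# One dominant haplotype (hard-sweep-like) vs two at equal frequency (soft-sweep-like).
hard_like = haplotype_homozygosity(["A"] * 10)
soft_like = haplotype_homozygosity(["A"] * 5 + ["B"] * 5)
```

In this toy input, both windows have H12 equal to 1, but H2/H1 is 0 for the single-haplotype window and 0.5 for the two-haplotype window. A pseudo-soft sweep produced by gene conversion would put the adaptive allele on several haplotypes, so a statistic like this would score it the same way as a true soft sweep. That is why haplotype-based summaries alone cannot tell the two apart.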

In the simulated Arabidopsis populations, gene conversion had little effect on selective sweeps.

Effect on recombinant haplotypes

As described above, our simulations show that allelic gene conversion has the potential to produce sweeps where the beneficial mutation is found on multiple distinct haplotypes. This suggests that pseudo-soft sweeps caused by gene conversion, much like true soft sweeps, may provide the means for a larger fraction of diversity present prior to selection to escape the sweep than expected under a hard sweep.

An example of part of the results:

[Figure: F2.large.jpg]