Section: Evolutionary Biology
Topic: Evolution, Genetics/genomics, Population biology

Detection of domestication signals through the analysis of the full distribution of fitness effects

Corresponding author(s): Gutenkunst, Ryan N. (rgutenk@arizona.edu); Ramos-Onsins, Sebastian E. (sebastian.ramos@cragenomica.es)

10.24072/pcjournal.540 - Peer Community Journal, Volume 5 (2025), article no. e35.

Get full text PDF Peer reviewed and recommended by PCI

Abstract

Domestication is a process marked by complex interactions between demographic changes and selective pressures, which together shape genetic diversity. While the phenotypic outcomes of domestication are well documented, its genetic basis—particularly the dynamics of selection—remain less well understood. To investigate these dynamics, we performed simulations designed to approximate the demographic history of large domestic mammals. These simulations used selection coefficients as a modeling tool to represent changes in selection pressures, recognizing that such coefficients are abstractions rather than direct representations of biological reality. Specifically, we analyzed site frequency spectra (SFS) under varying distributions of fitness effects (DFE) and proportions of mutations with divergent selective pressures. Our results show that the discretized deleterious DFE can be reliably inferred from the SFS of a single population, but reconstructing the beneficial DFE and demographic history remains challenging, even when using the joint SFS of both populations. We further developed a novel joint DFE inference model to estimate the proportion of mutations with divergent selection coefficients (pc), although we found that signals of classic hard sweeps can mimic increases in pc, complicating interpretation. These findings underscore both the utility and limitations of DFE inference and highlight the need for caution when interpreting demographic histories in domesticated populations based on such modeling assumptions.

Metadata
Published online:
DOI: 10.24072/pcjournal.540
Type: Research article
Keywords: Domestication, DFE, demography, selective sweeps, linked selection, 2D-SFS, forward-in-time simulations

Castellano, David 1; Vourlaki, Ioanna-Theoni 2, 3; Gutenkunst, Ryan N. 1; Ramos-Onsins, Sebastian E. 2

1 Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, USA
2 Centre for Research in Agricultural Genomics (CRAG, CSIC-IRTA-UAB-UB), Campus UAB, Vall de la Moronta s/n, Cerdanyola del Valles, Barcelona, 08193, Spain
3 Animal Breeding and Genetics Program, IRTA, Torre Marimón, 08140 Caldes de Montbui, Barcelona, Spain
License: CC-BY 4.0
Copyrights: The authors retain unrestricted copyrights and publishing rights
@article{10_24072_pcjournal_540,
     author = {Castellano, David and Vourlaki, Ioanna-Theoni and Gutenkunst, Ryan N. and Ramos-Onsins, Sebastian E.},
     title = {Detection of domestication signals through the analysis of the full distribution of fitness effects},
     journal = {Peer Community Journal},
     eid = {e35},
     publisher = {Peer Community In},
     volume = {5},
     year = {2025},
     doi = {10.24072/pcjournal.540},
     language = {en},
     url = {https://peercommunityjournal.org/articles/10.24072/pcjournal.540/}
}
TY  - JOUR
AU  - Castellano, David
AU  - Vourlaki, Ioanna-Theoni
AU  - Gutenkunst, Ryan N.
AU  - Ramos-Onsins, Sebastian E.
TI  - Detection of domestication signals through the analysis of the full distribution of fitness effects
JO  - Peer Community Journal
PY  - 2025
VL  - 5
PB  - Peer Community In
UR  - https://peercommunityjournal.org/articles/10.24072/pcjournal.540/
DO  - 10.24072/pcjournal.540
LA  - en
ID  - 10_24072_pcjournal_540
ER  - 
%0 Journal Article
%A Castellano, David
%A Vourlaki, Ioanna-Theoni
%A Gutenkunst, Ryan N.
%A Ramos-Onsins, Sebastian E.
%T Detection of domestication signals through the analysis of the full distribution of fitness effects
%J Peer Community Journal
%D 2025
%V 5
%I Peer Community In
%U https://peercommunityjournal.org/articles/10.24072/pcjournal.540/
%R 10.24072/pcjournal.540
%G en
%F 10_24072_pcjournal_540
Castellano, David; Vourlaki, Ioanna-Theoni; Gutenkunst, Ryan N.; Ramos-Onsins, Sebastian E. Detection of domestication signals through the analysis of the full distribution of fitness effects. Peer Community Journal, Volume 5 (2025), article  no. e35. doi : 10.24072/pcjournal.540. https://peercommunityjournal.org/articles/10.24072/pcjournal.540/

PCI peer reviews and recommendation, and links to data, scripts, code and supplementary information: 10.24072/pci.evolbiol.100795

Conflict of interest of the recommender and peer reviewers:
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.

Full text

The full text below may contain a few conversion errors compared to the version of record of the published article.

Introduction

The increase in human population and the emergence of modern society are closely linked to the domestication of plants and animals (Purugganan & Fuller, 2009; Driscoll et al., 2009; Larson & Burger, 2013; Amills et al., 2017). Human civilization was made possible through the domestication of surrounding life forms, where plants and animals such as wheat, dogs, pigs, or chickens were among the first to be domesticated (Dayan, 1994; Zeder et al., 2006; Zeder, 2012, 2015; Redding, 2015; Avni et al., 2017). Domestication is a process that fosters long-term mutualistic relationship, providing benefits to both humans and domesticated species (Zeder et al., 2006). This process began approximately 10-15 thousand years ago and continues to this day (Zeder et al., 2006; Larson & Burger, 2013).

Despite its foundational role in human civilization, our genomic and evolutionary understanding of domestication remains incomplete. Domestication occurs rapidly on evolutionary time scale, but it is not a discrete event; rather, it involves the gradual refinement of domesticated traits. Artificial selection during domestication is often assumed to be relatively stronger and faster than natural selection. However, evidence from plants suggests that the evolutionary rate of domesticated varieties can be similar to that of wild plants, indicating a process more akin natural selection (Purugganan & Fuller, 2009).

Domestication is also commonly associated with population bottlenecks; where only a small subset of individuals from a wild population are domesticated, potentially reducing the efficiency of natural selection (Wright et al., 2005). An additional distinction between natural and artificial selection is the use of truncation selection by modern breeders -a method that selects the top percentage of individuals for desired traits (Granleese et al., 2019). The prevalence of truncation selection in natural populations or prior to industrialization remains unknown. Truncation selection is an efficient form of directional selection (Crow & Kimura, 1979), and significant genetic load accumulation is unlikely in outcrossing species (Kondrashov, 1988; Ohta, 1989) if the population sizes remain sufficiently large (Marsden et al., 2016).

A recent comprehensive meta-analysis of the genetic costs of domestication (Moyers et al., 2018) revealed that domesticated populations generally carry more deleterious variants, or segregate at higher frequencies, compared to their wild counterparts. However, this pattern is not universal, as evidenced by studies in sorghum (Lozano et al., 2021). Such patterns are likely driven by multiple processes that reduce the effectiveness of selection in domesticated populations, a concept first observed in rice genomes (Lu et al., 2006).

Selection, both natural and artificial, can act through a few loci with strong effects or many loci with small effects, depending on the genetic architecture of the trait and the strength of selection (Jain & Stephan, 2017a; b). These two selection models are expected to produce distinct patterns of genetic diversity around selected loci (Stephan & John, 2020). Classic hard selective sweeps have been reported at a few candidate loci for key domesticated traits (reviewed by Andersson, 2012), such as the IGF2 gene region associated with lean pigs (Van Laere et al., 2003), the thyroid-stimulating hormone receptor (TSHR) in chickens (Rubin et al., 2010), and the sh4 and qSW5 loci related to seed shattering and grain width in rice (Shomura et al., 2008; Huang et al., 2012; Li et al., 2018). These cases reflect a Mendelian genetic architecture, where a small number of loci explain most of the phenotypic variance (see Courtier-Orgogozo & Martin, 2020 for a comprehensive list of genes related to domestication). In short, genomic analyses of domestication have traditionally focused on identifying strong selection footprints, often driven by loci with large effects responsible for phenotypic differences (e.g., Groenen et al., 2012; Carneiro et al., 2015; Qanbari et al., 2019; Li et al., 2020). However, Leno-Colorado et al., (2022) found that domesticated and wild pig populations did not differ in the number and type of non-synonymous fixed mutations, contradicting the idea that most domesticated traits follow a Mendelian genetic architecture. Thus, the hard selective sweep model may be the exception rather than the rule in pigs domestication.

In this study, we investigate the genomic consequences of domestication by modeling and comparing the full distribution of fitness effects (DFE) for new and standing genetic variation. A change in the selection regime can be modeled in different ways: as shifts in selection coefficients, as done here, or alternatively, as changes in the optimal value of a quantitative trait determined by a set of loci whose effects on fitness depend on their contribution to the trait and the genetic background they are in. In our approach, we infer the joint DFE for wild and domesticated populations using selection coefficients as abstractions to approximate the effects of selection. This allows us to quantify the proportion of shared genetic variants (modeled as) having diverging selection pressures, providing insights into how selective regimes may differ between these populations. We recognize that alternative frameworks, such as quantitative trait models, may offer complementary perspectives on the genetic consequences of domestication.

Previous studies estimating the DFE have primarily relied on contrasting the site frequency spectrum (SFS) of synonymous and non-synonymous mutations within a single population (1D-SFS). These methods assume that beneficial mutations contribute to divergence but not to polymorphism due to their rapid fixation (Keightley & Eyre-Walker, 2007; Boyko et al., 2009; Kim et al., 2017; Tataru et al., 2017; Barton & Zeng, 2018). Tataru et al. (2017) developed polyDFE to infer the full DFE and the proportion of adaptive substitutions (α) using polymorphism data alone. They also proposed nested models to test whether the parameters of the DFEs are shared between populations. Castellano et al. (2019) applied polyDFE to great apes and found that the shape parameter of the gamma deleterious DFE is likely conserved across these closely related species. However, populations that have diverged much more recently than great apes -such as domesticated and wild populations- tend to share a large number of genetic variants. To better leverage this shared variation, Huang et al. (2021) proposed using the SFS of both populations simultaneously (2D-SFS) to jointly estimate the deleterious DFE. Traditional 1D-SFS-based methods only provide access to the marginal DFE of the population without the need for shared variants, as they do not focus on the selection coefficients of individual mutations. In contrast, joint 2D inference, while more limited in applicability due to its reliance on substantial shared genetic variation, offers the advantage of quantifying the stability of the direction and intensity of natural selection on individual mutations.

Inferring the demographic history of domesticated populations is as important as inferring the change in the selection regime between domesticated and wild populations. Demographic processes associated with domestication have been studied across several species(Murray et al., 2010; Arnoux et al., 2021; Morell Miranda et al., 2023), with key events such as population splits between wild and domesticated groups, bottlenecks, and gene flow being inferred. These studies have also considered the influence of multiple selective sweeps (Caicedo et al., 2007). However, it has been demonstrated that ignoring background selection when analyzing demographic patterns can result in biased estimates (Beissinger et al., 2016; Comeron, 2017; Torres et al., 2020). We will revisit this issue and provide broader context in the results and discussion.

To gain insights into the inference of complex demographic histories and the DFE in the context of domestication, we employ forward-in-time simulations of an idealized domestication process. These simulations explore various demographic and selective scenarios, enabling us to evaluate the ability to detect differences in selective pressures between domesticated and wild populations. We simulate a range of genetic architectures and selective effects, including: (1) Models with a relatively small number of loci undergoing changes in their selective effects and (2) models where numerous loci exhibit divergent selective effects. We also play with the rate and mean effect of beneficial mutations to understand the role of selective sweeps. Importantly, we introduce a novel methodology based on Huang et al. (2021) that incorporates an additional parameter critical for distinguishing populations experiencing rapid selective change: the selective effects of a fraction of existing variants can change (e.g., from beneficial to deleterious, or vice versa) in domesticated populations. This method jointly infers the full DFE parameters for both wild and domesticated populations, including shifts in the selective effects of shared variants. Finally, we describe and discuss how linked selection and changes in the DFE impair our ability to accurately infer the true simulated demographic histories.

Materials and Methods

Simulation of the Domestication Process

A simulation of an idealized domestication process is developed using the forward-in-time simulator SLiM4 (Haller & Messer, 2023). The general model of the domestication process is developed in the SLiM script available in github and Zenodo (https://doi.org/10.5281/zenodo.15084050). Twenty-four different domestication scenarios are analyzed, and the parameters for each scenario are shown in Table 1. All options (flags) used to run the SLiM script are also available in github (https://github.com/CastellanoED/domesticationDFE/blob/main/slim_code_mod4_NEW.slim). We aim to model a general domestication process that resembles the genomic configuration, generation time, mutation and recombination landscape relevant to large domesticated mammals used as livestock. Note that we are not considering the recent processes of genetic improvement performed by commercial companies in the last decades. The constructed model assumes a genome containing a single chromosomal "chunk" or window, with 10,000 loci/exons of 120 base pairs in length, and each locus/exon with one-third (4-fold) neutral synonymous positions and two-thirds (0-fold) selected non-synonymous positions scattered along the locus.

The simulation parameters for each scenario (Figure 1A) are as follows: the initial population at time 0 run for 10*Ne generations to reach mutation-selection-drift equilibrium,
then splits into the domesticated and wild populations. Hereafter we refer to the Wild and Domesticated populations. We aim to mimic a realistic but still general domestication process in large livestock mammals where ancestral Ne (Na) estimates are on average around 10,000 (Murray et al., 2010; Groenen et al., 2012; Larson et al., 2014; Frantz et al., 2015; Yang et al., 2016; Librado et al., 2021; Todd et al., 2022) and the domestication process, according to archeological records, started around 10,000 years ago (Ahmad et al., 2020). The average generation time in these large domesticated mammals is about 5 years per generation (Pacifici et al., 2013). Note that in this study we had to reduce the population size and related population parameters below from 10,000 diploid individuals to Na=5,000 diploid individuals for computational reasons.

Genomic parameters: The mutation rate per site (μ) and generation is 2.5x10-7, and the population size (Na) is 5,000 diploid individuals, thus the expected θ under neutrality is 0.005. Each locus is separated from its neighbors by 3x10-6 recombination events per generation. The recombination rate per site and generation within the loci is fixed to a rate of 1.5x10-7 recombination events per site. Note that the higher recombination between loci aims to mimic their real genetic distance separation (assuming a functional site density of 5%) - this greatly speeds up the simulation as non-coding sites do not need to be simulated. In other words, we simulate 120 Kb of coding sequence in each run, which is equivalent to simulating a 2.4 Mb chromosome window with 5% coding sites. We perform 100 independent runs for each of the twenty-four scenarios.

Demographic parameters: The Domesticated populations of 5,000 diploid individuals suffer
a bottleneck, reducing their population size temporarily to 200 diploid individuals, to recover again to 5,000 diploid individuals after the bottleneck. The bottleneck lasted 100 generations. The simulation finishes 900 generations after the bottleneck. In twelve of the twenty-four simulated scenarios we allow that a ratio of 0.01 of the Wild individuals migrate to the Domesticated population during the 100 generations of the bottleneck. Thus, during the bottleneck 25% of the domesticated population comes from the wild population every generation. In the other twelve combinations there is no exchange of individuals between the Wild and Domesticated populations. This demographic history is equivalent to a 1,000 years long bottleneck followed by a 9,000 years long recovery in an ancestral population with 10,000 diploid individuals and a generation time of 5 years.

DFE parameters: the selective effects produced by domestication are modeled by changing the fitness values of a proportion (calculated with a probability of change called pc) of the existing and new mutations in the domesticated population (at the time of the split) (Table 1). This probability of change can be 0 (our negative control), 0.05 or 0.25. Domesticated and Wild populations show different proportions of beneficial and deleterious new mutations depending on the scenario. SLiM defines ‘s as the selective coefficient for the homozygote, while the inference algorithms used estimate the selective coefficient for heterozygote. Here we have assumed co-dominance and we have scaled the coefficient of selection to the ancestral population to have comparative values, that is, we multiply the Na 4 times and divide the selective coefficient twice, 4Nas/2=2Nas). The negative effects in all scenarios and populations follow a gamma distribution with a shape value of 0.3 and a mean of S = -100 (S =2Nas in the heterozygote, in which each mutation is assumed co-dominant, Na = 5,000 diploid individuals, considering the ancestral population size and s = -0.01), which is in the range of values inferred using empirical data (Boyko et al., 2008; Galtier, 2016), while variants with positive effects follow an exponential distribution. We investigate three combinations of parameters for the positive DFE plus one without positive selection: no positive selection (that is, pb = 0), pervasive & nearly neutral (with mean selective effects Sb = 1 and probability of being beneficial pb = 0.10), common & weak (Sb = 10 and pb = 0.01) and rare & strong (Sb = 100 and pb = 0.001) (Table 1).

Table 1 - Variable demographic and selective parameters across scenarios

Migration

Positive DFE

pc

Scenario ID

(W->D)

 

0

pb =0

0

1

Absent

0.05

2

 

0.25

3

pb =0.1 & Sb = 1

0

4

Pervasive and nearly neutral

0.05

5

 

0.25

6

pb = 0.01 & Sb = 10

0

7

Common and weak

0.05

8

 

0.25

9

pb = 0.001 & Sb = 100

0

10

Rare and strong

0.05

11

 

0.25

12

0.01

pb =0

0

13

Absent

0.05

14

 

0.25

15

pb =0.1 & Sb = 1

0

16

Pervasive and nearly neutral

0.05

17

 

0.25

18

pb = 0.01 & Sb = 10

0

19

Common and weak

0.05

20

 

0.25

21

pb = 0.001 & Sb = 100

0

22

Rare and strong

0.05

23

 

0.25

24

The twenty-four simulated combinations of parameters in this study. The first column represents the migration rate from the Wild to the Domesticated population, the second column refers to the DFE of new beneficial mutations and the third column shows the probabilities to have sites that change their selection coefficients in the Domesticated population (pc). Last column shows the ID we use to quickly label scenarios along the manuscript.

Types of Sites

The sites are initially divided into seven different types (named m1 to m7), being m1 neutral (synonymous) and m2 to m7 functional (non-synonymous) sites having a different selective effect when mutated (see Table 2 and Figure 1C). Mutations at m5, m6 and m7 sites generate deleterious variants in the Wild population, and mutations at m2, m3 and m4 sites generate beneficial mutations in the Wild population. The selection coefficient of mutations generated at m2 (beneficial) or at m5 (deleterious) sites are invariant for the Wild and Domesticated populations. However, the mutations at m3, m4, m6 and m7 sites will change their selective effect in the Domesticated populations relative to the Wild populations. That is, the new selective effect is drawn from the corresponding DFE section (positive or negative), independently of their value in the wild population. The selection coefficient of a given beneficial mutation at m3 sites will remain beneficial in the Domesticated population, but it will be different from the original beneficial effect at Wild. A mutation at m4 sites will change its selection coefficient from beneficial in the Wild to deleterious in the Domesticated population. Equivalently, the selection coefficient of a deleterious mutation at m6 sites will remain negative in the Domesticated population but it will be different from that found at Wild. A mutation at m7 sites will change its selection coefficient from deleterious in the Wild to beneficial in the Domesticated population (see probabilities included in Table 2). For each of the 24x100 simulation runs, we randomly pre-calculate independently the location of each site type (except for the permanent location of two non-synonymous sites followed by a synonymous site within codons) and their selective effect using an ad hoc R script (https://github.com/CastellanoED/domesticationDFE/blob/main/calculate_fitness_position_matrix.R). This hard-coding of selective effects on different sites allows us to gain insight into the relative importance of each mutation type for the domestication process.

Table 2 - Types of sites in simulated scenarios

Site

Wild

Domesticated

Probability1

m1

Neutral

No change, remain Neutral

All synonymous

m2

Beneficial

No change, remain Beneficial

pwb·(1-pc)

m3

Beneficial

Change to a different Beneficial Effect

pwb·pc·pcb

m4

Beneficial

Change to Deleterious

pwb·pc·(1- pcb)

m5

Deleterious

No change, remain Deleterious

(1- pwb)·(1-pc)

m6

Deleterious

Change to a different Deleterious Effect

(1- pwb)·pc·(1-pcb)

m7

Deleterious

Change to Beneficial

(1- pwb)·pc·pcb

1 m1 are only defined at synonymous sites (1/3 of the total sites analyzed). m2 to m7 probabilities consider only non-synonymous sites (2/3 of the total sites analyzed). pwb is the probability that mutations are positively selected in the Wild population, pc is the probability that mutations change selection coefficient in the Domesticated population, and pcb is the probability of those mutations become beneficial in the Domesticated population (note in our simulations pwb = pcb). Note that in simulated scenarios pcb=pwb.

Nucleotide variability estimates

We have counted the number of polymorphic sites and estimated the Watterson variability estimate per nucleotide (Watterson, 1975) for synonymous and non-synonymous sites for each of the 100 run simulations and for all the scenarios and populations. We have also calculated the ratio of synonymous versus non-synonymous polymorphic sites (Pn/Ps) as descriptive estimators of the observed variability at these sites (https://github.com/CastellanoED/domesticationDFE/blob/main/diversity_PnPs_slim.R).

Distribution of fitness effects (DFE): Two complementary approaches

polyDFE: 1D-SFS and 1D-DFE

We use the polyDFEv2.0 framework (Tataru & Bataillon, 2019) to estimate and compare the DFE across Wild-Domesticated population pairs by means of likelihood ratio tests (LRTs). We use the R function compareModels (from https://github.com/paula-tataru/polyDFE/blob/master/postprocessing.R) to compare pairs of models. The inference is performed only on the unfolded SFS data (divergence counts to the outgroup are not fitted), and unfolded SFS data are fitted using a DFE model comprising both deleterious (gamma distributed) and beneficial (exponentially distributed) mutations. The DFE of each Wild-Domesticated population pair is inferred using the 1D-SFS of each population. DFE is calculated assuming S=4Nes, in which s is the selective effect in the heterozygote, and Ne is the effective population size. Note that for comparative analysis Ne will be equivalent to Na. We used S/2 to contrast the simulated value with SLiM and with the inferred value in dadi (see next section). polyDFE assumes that new mutations in a genomic region arise as a Poisson process with an intensity that is proportional to the length of the region and the mutation rate per nucleotide (μ). We assume that μ remains constant across simulations (as it is the case). Both an ancestral SNP misidentification error (ε) and distortion parameters (ri) are estimated. However, we notice that the exclusion of ε does not affect the rest of estimated parameters because under the simulation conditions used here no sites are expected to be misidentified. The ri parameters are fitted independently for each frequency bin (from i = 1 to i = 19), and they are able to correct any distortion that affects equally the SFS of synonymous and non-synonymous variants (such as, in principle, demography or linked selection). Model averaging provides a way to obtain honest estimates that account for model uncertainty. To produce the model average estimates of the full DFE we weight each competing model according to their AIC following the equation 6.1 shown in the polyDFEv2 tutorial (polyDFE/tutorial.pdf at master · https://github.com/paula-tataru/polyDFE). We use the R function getAICweights (from https://github.com/paula-tataru/polyDFE/blob/master/postprocessing.R to do the model averaging) to obtain the AIC values.

dadi: 2D-SFS and 2D-DFE

dadi (Gutenkunst et al., 2009) is employed to infer the joint distribution of fitness effects (Jerison et al., 2014; Ragsdale et al., 2016; Huang et al., 2021) and the demographic history of all simulated population pairs. Following Huang et al. (2021), our model is that any mutation may have different selection coefficients sw and sd in the wild and domesticated populations, respectively. The joint DFE is the two-dimensional probability distribution quantifying the probability that a new mutation has selection coefficients sw and sd. In Huang et al. (2021),, joint DFEs with only deleterious mutations were considered. Here we extend that model to consider joint DFEs that include mutations that are beneficial in one or both populations.

Our new model for the joint DFE between the two populations is a mixture of multiple components designed to mimic the selected mutation types in the simulations (Table 2; Figure 1D). The major exception is that beneficial mutations are modeled to have a single fixed selection coefficient, rather than arising from an exponential distribution. Let pwb be the fraction of mutations that are positively selected in the Wild population, pc be the fraction of mutations that change selection coefficient in the Domesticated population, and pcb be the fraction of those mutations that become beneficial in the Domesticated population (note in our simulations pwb = pcb). To model mutation types m2 and m3, a proportion pwb · (1-pc) + pwb · pc · pcb of mutations are assumed to have the same fixed positive selection coefficient in both populations. To model m4, a proportion pwb · pc · (1-pcb) is assumed to have a fixed positive selection coefficient in the Wild population and a gamma-distributed negative selection coefficient in the Domesticated population. To model m5, a proportion (1-pwb) · (1-pc) of mutations are assumed to have equal negative gamma-distributed selection coefficients in the two populations. To model m6, a proportion (1-pwb) · pc · (1-pcb) is assumed to have independent gamma-distributed selection coefficients in the two populations. To model m7, a proportion (1-pwb) · pc · pcb mutations is assumed to have a gamma-distributed negative selection coefficient in the Wild population and a fixed positive selection coefficient in the Domesticated population. All gamma distributions are assumed to have the same shape and scale. This model is implemented in dadi as the function “dadi.DFE.Vourlaki_mixture” (https://github.com/CastellanoED/domesticationDFE/blob/main/domestication_new_dadi_functions.py). Note in our simulations the marginal 1D-DFEs of Wild and Domesticated populations are exactly the same; the difference is that in Domesticated populations a given fraction of sites (some already polymorphic, some still monomorphic) can change their selection coefficient relative to the Wild population.

To infer the parameters of the joint DFE, we followed the procedure of Huang et al. (2021), but with this new DFE model. Briefly, assuming independence between mutations, the expected joint site frequency spectrum (SFS) for all mutations experiencing selection (here nonsynonymous sites) can be computed by integrating the joint SFS for each possible pair of population-size-scaled selection coefficients Sw and Sd over the joint DFE. Given that expected SFS, the composite likelihood of the nonsynonymous data can be computed by treating it as a Poisson Random Field, as in Gutenkunst et al. (2009). The parameters of the joint DFE model can then be inferred by maximizing that likelihood, using numerical optimization. For this study, we used the default in dadi, the BOBYQA optimization algorithm as implemented by the NLOpt library. For each selection coefficient pair (Sw , Sd ), the expected SFS was calculated from a single integration of the partial differential equation (PDE) implemented by Gutenkunst et al. (2009).in the dadi software.

We integrate over our joint DFE model (Fig. 1D) by summing contributions for the discrete components of the DFE. The m2 + m3 component is simplest, being simply a weighting of the single SFS corresponding to the two positive selection coefficients assumed in the wild and domesticated populations. The m4 and m7 components are integrated over by holding sw or sd fixed and integrating over spectra calculated as the other selection coefficient is varied. The m5 component is integrated over by considering spectra in which Sw = Sd . The m4, m7, and m5 components are thus one-dimensional integrations and employ the numerical methods developed in Kim et al. (2017). The m6 component is a two-dimensional integration over independent gamma distributions and is carried out as in Huang et al. (2021). This complex summation over spectra to calculate the expected SFS under the DFE is much less computationally expensive than calculating the spectra for each (Sw , Sd) pair, so those spectra are precomputed and cached.

For inference, a new, more general demographic model with branch-independent population size changes is first fit to the synonymous mutations from each simulation, and then the newly proposed joint DFE model is fit to the non-synonymous mutations. This model (Fig 1D) is implemented as a custom model using the dadi software and evaluated using the approach of Gutenkunst et al. (2009). The one subtlety is that an if statement is used to enable flexibility as to whether T1F or T2F is larger (https://github.com/CastellanoED/domesticationDFE/blob/main/domestication_new_dadi_functions.py, function name: “Domestication_flexible_demography”). The parameters of the demographic model (Figure 1B) are estimated by running 100 optimizations per inference unit. The 2D-SFS for selected sites are precomputed conditional on the demography for 1042 values of Sw and Sd (S =2Nas, a population scaled selection coefficient for the heterozygote where Na is the ancestral population size), 102 negative and 2 positives. For the negative part of the DFE, γ values were logarithmically equally spaced between -2000 and -10-4. The expected DFE for selected sites can then be computed as a weighted sum over these cached spectra (Kim et al., 2017). The DFE parameters shape, scale, pwb, pc, and pcb are then estimated by maximizing the Poisson likelihood of the simulated data, with the non-synonymous rate of mutation influx fixed to twice that inferred for neutral sites in the demographic history fit. For the DFE inference, optimization is repeated until the best three results are within 0.5 log-likelihood units. Ancestral state misidentification is modelled, however in our simulations no sites are expected to be misidentified.

For the purpose of this work, dadi software is downloaded and installed according to the instructions provided at the following link: https://bitbucket.org/gutenkunstlab/dadi/src/master/. Since dadi operates as a module of Python, the Anaconda3 and Spyder (Raybaut P., 2009; Python 3.7, Rossum G. V & Drake F. L, 2009; Anaconda, 2016) versions are used in this study.

Inference units, and confidence intervals in demographic and DFE parameters

To obtain the sampling variance of parameter estimates and approximate confidence intervals, we use a bootstrap approach. We resample with replacement 100 times 20 independent simulation runs or chromosomal “chunks” (from a pool of 100 “chunks”) and concatenate them. Hence, each concatenated unit (or inference unit) is made of 24 Mb of coding sequence (as comparison, the human genome contains ~26 Mb of coding sequence). Uncertainties of DFE parameter inferences in polyDFE and dadi are calculated by this conventional bootstrapping, but in dadi we hold the demographic model fixed. In polyDFE the distortion introduced by demography (and linked selection) is not estimated but corrected with the ri parameters. Note that our procedure with dadi does not propagate uncertainty in demographic parameters through to the DFE parameters. To obtain the sampling variance of demographic parameter estimates with dadi we use the Godambe approach as described in Coffman et al. (2016). A final consideration on the factor of two differences across simulation and inference tools. We adjusted the population scaled selection coefficients to 2Nas in polyDFE, dadi and SLiM4 to enable a comparative study.

Results and Discussion

Studying the effect of domestication on the DFE of natural populations is particularly challenging, especially when available methods for inferring and comparing the DFE have not been evaluated using exactly the same dataset. In this study, we conduct simulations using different combinations of parameters relevant to the domestication process. A key distinction between the domestication demographic model used here and those commonly applied in speciation studies is the time scale since the split occurred. In our simulations, domesticated populations experience either large or small changes in the number and selective effects of loci under domestication, following a bottleneck period, with or without migration. Hereafter, we refer to these as the Wild and Domesticated populations.

This study focuses on the evolutionary process of domestication from the point of divergence to the present domesticated lineages. We do not account for the genetic improvement programs implemented in recent decades for some domesticated animals, which can significantly increase inbreeding levels (e.g., Makanjuola et al., 2020 estimated inbreeding levels as high as 40% in certain cattle breeds subjected to intense genomic selection). The simulated models in this work include strong selection and reductions in population size, both of which can moderately increase inbreeding levels in our simulations. However, Gilbert et al. (2022) reported that only very high selfing levels (>80%) severely affect DFE inference.

The Wild populations have a constant DFE and constant population size, but limited recombination across loci to mimic a realistic recombination landscape. Beneficial mutations arise at Wild populations following an exponential distribution, while deleterious mutations are drawn from a gamma distribution with shape 0.3 and mean S = -100 (where S = 2Nas, the selection coefficient s in the heterozygote is -1%, and Na = 5,000 diploid individuals is the ancestral effective population size, see Material and Methods: Simulating the Domestication Process). As indicated in Materials and Methods section, all mutations, beneficial and deleterious, are co-dominant. The Domesticated population originates from the Wild population through a bottleneck and a concomitant change in selective effects at a fraction of non-synonymous sites (Figure 1; Table 1). The recombination and mutation landscapes are drawn from the same distribution in the Domesticated and Wild populations.

The change in selective effects affects both new mutations that arise within the Domesticated population and existing variants that existed before the domestication event. Put simply, not only can mutations that were deleterious (or beneficial) before the population split become beneficial (or deleterious) within the domesticated population, but even if the direction of the selective effect remains the same, the intensity of selection can change. Table 2 shows all the combinations of changes in selective effects between Wild and Domesticated populations. Our simulated scenarios aim to cover a variety of possible changes in the genetic architecture (number of loci) and the strength of selection (selection coefficients) of the trait/s under domestication. Three DFEs for beneficial mutations are assumed: (i) pervasive and nearly neutral, where a large fraction of new mutations (10%) are on average nearly neutral (Sb = 1), (ii) common and weak, where beneficial mutations are still fairly common (1%) but weakly selected (Sb = 10) and (iii) rare and strong, where very few mutations (0.1%) are strongly beneficial (Sb = 100). To better understand the role of selective sweeps on downstream inference, we also include simulations without a positive DFE. Depending on the scenario, a selective change occurs only at a small (0.05) or at a substantial proportion (0.25) of sites in the Domesticated population (Table 1, “pc” column). We leave eight scenarios as negative controls; the selection coefficients of new and standing variation in the Domesticated and Wild populations are exactly the same. Finally, demographic changes affect only the Domesticated population; the Wild population evolves under a constant population size. Two versions of the same demographic model (Figure 1A) are simulated: (i) one with migration, and (ii) another without migration. When there is migration, it only occurs from the Wild to the Domesticated population during the domestication bottleneck.

Figure 1 - Joint demographic and DFE models simulated and fit. A: Illustration of the joint demographic model used in SLiM simulations. Na: Effective population size of the Ancestral population. New: Effective population size of the Wild population. Ne1d: Effective population size of the Domesticated population during the bottleneck. Ne2d: Effective population size of the Domesticated population after the bottleneck. T1: Number of generations in the bottleneck period. T2: Number of generations from the bottleneck to the present. m: Wild to Domesticated migration rate (migration occurs along T1). B: Illustration of a more general joint demographic model used in the dadi inferences. Na: Effective population size of the Ancestral population. Npre: Effective population size before the domestication split. N1div: Effective population size of the Wild population after the split. N1F: Effective population size of the Wild population at the end of the simulation. N2div: Effective population size of the Domesticated population after the split. N2F: Effective population size of the Domesticated population at the end of the simulation. Tpre: Number of generations before the domestication split. Tdiv: Number of generations after the bottleneck. T1F: Number of generations under N1F. T2F: Number of generations under N2F. Note that T1F and T2F are estimated independently and that T1F can be the same, longer or shorter than T2F. md: Wild to Domesticated migration rate. mw: Domesticated to Wild migration rate. Both migration rates occur after the domestication split. C: Illustration of the joint DFE model used in the SLiM simulations, with mutation types illustrated. In the illustration, the shadow blue regions in the plot represent the possible different types of mutations considering the selection coefficient values in each of the two populations (from gamma and exponential distributions in wild and domestic and from pwb, pc and pcb probabilities see Table 1 and 2). For example, a point in the left-upper region of the illustration represents a mutation with positive s in the Domestic population but negative in Wild population (type m7). D: Illustration of the joint DFE model used in the dadi inferences and the inferred associated parameters, in which a fixed positive selection coefficient is assumed.

Estimation of demographic parameters in Wild and Domesticated populations

In this study, we investigate the effects of natural selection—both broadly and in terms of how artificial selection alters the selective pressures acting on new and shared genetic variation—on the inference of demographic history and DFE during domestication. We do this using two commonly used inference tools (polyDFE and dadi) that assume free recombination across loci. Note that dadi first infers the demographic history and then infers the DFE assuming those inferred demographic parameters, whereas polyDFE operates independently of specific demographic histories and is designed to correct for distortions that affect both synonymous and non-synonymous site frequency spectra equally (Tataru & Bataillon, 2019). Figure 1 A and B show the simulated joint demographic model and the joint demographic model used in the dadi inferences, respectively. We have increased the complexity of the inference model by introducing additional parameters, allowing it to account not only for "simulated" or true demographic changes, but also for more complex and unknown demographic histories and the potential influence of linked selection on synonymous SFS. The diagnostic plots can be found in Supplementary Figure 1; there is good agreement between the model fits and the data.

Our findings indicate that when positive selection is absent or relatively weak (Sb = 0, Sb = 1 or Sb = 10), the estimated onset of domestication tends to be approximately twice as old as the actual simulated starting point. Additionally, the inferred bottleneck appears slightly shallower but considerably longer than the simulated value (see Figure 2 and Supplementary Table 1 for the confidence intervals). This suggests that the influence of linked selection, likely driven primarily by background selection, has the effect of elongating the inferred timeline. Consequently, it makes the inferred domestication divergence and bottleneck appear more ancient and extended, respectively. For the Wild populations we always infer a larger population expansion than for the Domesticated populations, but without a bottleneck. This signal of a recent expansion in the Wild population is expected because when we consider how linked selection affects the SFS, there are more rare synonymous polymorphisms compared to what we would expect if there was free recombination under a constant population size (Charlesworth et al., 1993, 1995; Nielsen, 2005; Zeng & Charlesworth, 2011; Messer & Petrov, 2013; Nicolaisen & Desai, 2013; Ewing & Jensen, 2016). Remarkably, when positive selection is rare and strong (Sb = 100), the inferred temporal stretch becomes even more pronounced, and the inferred demographic history of both populations overlap extensively. The inferred domestication divergence shifts to approximately 50,000 years ago, whereas the actual simulated split occurred 10,000 years ago. Additionally, the inferred bottleneck appears significantly longer and less severe, while there is an inferred large population expansion in both Wild and Domesticated populations. Although in Figure 2 there appears to be a change in population size before the domestication split, only five scenarios (with IDs 3, 7, 15, 17 and 18) are statistically significant (Supplementary Table 1 and Supplementary Figure 2). Interestingly, we find the migration rate from Wild to Domesticated (mw2d) and from Domesticated to Wild (md2w) are overestimated in most scenarios (Supplementary Table 1 and Supplementary Figure 3). We observe that neither migration nor an increase in pc appears to significantly change the inferred demographic histories that we have just described.

Figure 2 - Lines showing the inferred demographic histories for the twenty-four simulated scenarios. In salmon-orange color is represented the Wild population and in turquoise-green color the Domesticated population. The dark grey line shows the true simulated demography in Domesticated populations. The true Wild population is not shown but it is a constant population size with relative Ne = 1. The x-axis indicates the number of generations in relation to the ancestral population size Na, while the y-axis show the population size at each time in relation to Na (that is, Ne/Na, where 1 means that Ne=Na). The 95% confidence intervals calculated using the Godambe approximation can be found in Supplementary Table 1.

In summary, it was not possible to accurately determine the timing of the onset of domestication, the duration of the domestication bottleneck, or to distinguish between the presence and absence of migration between populations. We believe that these aspects are crucial for contextualizing the role of domestication in human history, and vice versa. Unfortunately, either the 2D-SFS or our “free recombination” modeling assumptions (or both) do not seem to be useful in this context.

Beyond domestication, the signal interference between selective and demographic processes has been widely studied. Linked selection significantly distorts the SFS, leading to biases in inferred demographic parameters. For example, Schrider et al. (2016) found that positive selection can mislead demographic inference, even inferring population size changes where none occurred, with selective sweeps as the primary cause. Gilbert et al. (2022) used forward simulations to report that large population expansions are inferred due to linked selection, particularly in regions of low recombination or high gene density. Finally, Johri et al. (2021) demonstrated biases due to background selection even after masking functional regions. Together with these other findings, our work underscores the persistent difficulty of accurately inferring demographic histories in the presence of linked selection using population genomic data, even when using ancestral recombination graph based approaches (Marsh & Johri, 2024).

Thus, the next question is to what extent can the nuisance ri parameters from polyDFE or this distorted inferred demography from dadi help to recover the simulated DFE parameters?

Is it possible to detect domestication as an artificial change in the marginal full DFE between the two populations?

Next, we investigate whether polyDFE captures differences in the marginal (or 1D) full DFE of Domesticated and Wild populations across the twenty-four domestication scenarios (Table 1). We run five nested models (Table 3) and compare them using likelihood ratio tests (LRTs) (Supplementary Table 2). It is important to note that in all our simulations, the marginal full DFE for new mutations in both Domesticated and Wild populations is the same within a given scenario (as detailed in Table 1). This means that the selection coefficients for sites, whether they are monomorphic or polymorphic, are drawn from the same full DFE. In simpler terms, the proportion of new mutations that are advantageous or detrimental is identical for both Domesticated and Wild populations within a given scenario.

Table 3 - List of nested polyDFE models and (co)estimated parameters.

Model ID

Negative DFE

Positive DFE

shape

S

pb

Sb

M1

Var

Var

-

-

M10

Fix

Var

-

-

M2

Var

Var

Var

Var

M20

Fix

Var

Var

Var

M30

Fix

Var

Fix

Fix

Independently estimated parameters for the Domesticated and Wild populations (Var). Jointly estimated parameters for the Domesticated and Wild populations (Fix). S is always independently estimated to accommodate potential changes in Ne between populations. The population mutation rate (Ө), the nuisance parameters (ri) and the mispolarization parameter (ε) are all independently estimated across Wild and Domesticated populations.

LRTs between different nested models allow us to address important questions about the DFE, without assuming any prior knowledge of our datasets. First, we assess whether the inferred shape of the negative DFE is similar in both populations while also examining if the estimation of the shape parameter is influenced by the presence of advantageous mutations. When comparing models that do not consider beneficial mutations (models M1 versus M10 in the second column of Supplementary Table 2), the model with a distinct shape for Domesticated and Wild populations is accepted only in two, rather unrelated, scenarios (scenarios 7 and 11). This indicates that an artificial alteration in the shape of the deleterious DFE between Domesticated and Wild populations can be inferred. Fortunately, when comparing models that take into account beneficial mutations (models M2 vs M20, third column in Supplementary Table 2), all scenarios show a shared shape of the deleterious DFE, which is expected based on the simulation parameters. These findings suggest that disregarding beneficial mutations can cause an artificial change in the inferred shape of the marginal deleterious DFE between populations, as noted previously by Tataru et al. (2017). Second, when we contrast models with and without considering the positive DFE (that is, testing the nested models M1 vs M2 and M10 vs M20), yields statistically significant results in all scenarios (see Supplementary Table 2, fourth and fifth columns). Hence, polyDFE appears to detect beneficial mutations, regardless of the true presence and strength of positive selection. Third, we investigate whether Domesticated and Wild populations could exhibit an artificial change in the beneficial DFEs as a consequence of domestication. When comparing the M20 and M30 models (refer to the last column in Supplementary Table 2), polyDFE invokes changes in the positive DFE between populations in most scenarios without migration (with IDs 1, 2, 5, 7, 8, 11 and 12). Below we characterize this putative change in the marginal DFEs between populations.

Estimation of DFE parameters in Wild and Domesticated populations

Under the polyDFE framework, we begin by extracting the Akaike Information Criterion (AIC) from every model (Table 3) and then computing the AIC-weighted parameters for all models (Tataru & Bataillon, 2019; Castellano et al., 2019). This approach is used because the true model generating real data in both Wild and Domesticated populations is unknown. Instead, under dadi's framework, we adopt an alternative methodology that utilizes very general, parameter rich and versatile joint demographic and DFE models to fit the 2D-SFS. The diagnostic plots of the new joint DFE model is shown in Supplementary Figure 1, again there is good agreement between the model fits and the data.

Inferred parameters related to the deleterious DFE: Supplementary Figure 4 and 5 depicts the distribution of parameters related to the deleterious DFE that are estimated by performing bootstrap analysis using polyDFE and dadi. We observe that both tools have a tendency to marginally overestimate the shape parameter of the gamma distribution employed to model the deleterious DFE (Supplementary Figure 4). The overestimation is particularly significant in polyDFE, when positive selection is strong. In such scenarios, dadi's shape estimation is sometimes rather noisy. Regarding the mean of the deleterious DFE (s) (Supplementary Figure 5), we observe that the inferred mean values across bootstrap replicates vary by up to 20% higher or lower, depending on the population, scenario, and inference tool. The largest misinference occurs when positive selection is strong and dadi is used and in the Domesticated population when polyDFE is used.

Inferred parameters related to the beneficial DFE: The distribution of parameters associated with the beneficial DFE, estimated by bootstrap analysis using polyDFE and dadi, is shown in Supplementary Figure 6 and Supplementary Table 3 (only dadi). Depending on the scenario, we simulate an average increase in relative fitness (sb) of 0.010, 0.001, and 0.0001. Positive selection's strength is usually substantially underestimated by polyDFE and dadi, but only polyDFE consistently overestimates the proportion of new advantageous mutations (pb), regardless of the true simulated value. Given the distribution of inferred values of pb and sb, a peak of effectively neutral advantageous mutations is being measured by polyDFE. The overall excess of effectively neutral advantageous mutations measured by polyDFE is generally balanced by the defect of effectively neutral deleterious mutations. Consequently, polyDFE seems to have limited power in identifying effectively beneficial mutations on the 1D-SFS (under these simulation conditions). More importantly, the apparent spurious difference in the marginal full DFE between populations detected by polyDFE disappears when the full DFE is discretized. We conclude that if a significant change is detected in the discretized marginal full DFE, it must be considered valid.

It is noteworthy that both polyDFE and dadi tools typically produce comparable and reasonably accurate discretized deleterious DFEs (Figure 3), despite polyDFE’s tendency to infer a peak of effectively neutral beneficial mutations. This suggests that, regardless of the inference method used, the estimation of the “effective” discretized deleterious DFE remains robust to demographic and selective changes, as well as the pervasive effects of linked selection. In contrast, recent studies indicate that in highly selfing species, the deleterious DFE is often misestimated due to the influence of linked selection (Gilbert et al., 2022), particularly strong Hill-Robertson interference (Daigle & Johri, 2024). These findings highlight that the accuracy of inferring the deleterious DFE is not universal but instead depends on factors such as the degree of selfing and inbreeding.

Thus, we conclude that both tools struggle to infer the positive DFE and tend to be overconservative and identify weaker positive selection than what has been simulated. We suspect this arises from linkage between beneficial and synonymous mutations, which may lead to an excess of high-frequency synonymous mutations and an overcorrection of the excess non-synonymous polymorphisms at high frequency, either through polyDFE’s ri parameters or dadi’s inferred demographic history. Notably, these findings are consistent with what was already pointed out by Tataru et al. (2017) and Booker (2020) using a single population. They draw attention to the challenge of inferring parameters of positive selection when counting for weakly and strongly selected mutations. Indeed, Booker (2020) emphasize that in the case of rare and strong positive selection, the SFS can be very noisy, with linked sites playing an important role, making it difficult to infer the positive DFE.

Figure 3 - Sampling distributions of estimated discretized full DFE obtained using 100 bootstrap replicates.

Estimation of the fraction of mutations with divergent selective effects (pc) between Domesticated and Wild populations

One of the main goals of this study is to determine the proportion of new and standing non-synonymous mutations with differing selection coefficients in Wild and Domesticated populations. The usage of our new joint DFE model is not limited to the current study. Our new model, created by mixing multiple distributions to mimic mutation types in our simulations (Table 2; Figure 1C-D), is suitable for usage in any recently diverged populations.

Figure 4 displays the distribution of the inferred pc for three different positive DFEs, along with simulated pc values. When positive selection is not strong, it becomes apparent that scenarios with a significant fraction of mutations with dissimilar selective effects (pc = 0.25) can readily be differentiated from those where a small (pc = 0.05) or nonexistent (pc = 0) number of sites alter their selection coefficient. However, differentiating our negative control from a positive control proves difficult when only 0.05 of the sites show a difference in their selection coefficients. Notably, we overestimate pc significantly in cases of strong positive selection, indicating that classic hard selective sweeps may mimic divergent selection in a substantial amount of non-synonymous mutations. We observe no major impact of migration on the inferred pc values across scenarios.

The overestimation of pc when positive selection is strong is not surprising, since non-synonymous mutations with stable selection coefficients between populations may be in close recombinational proximity and hitchhike with strongly beneficial mutations that are population-specific. This will exacerbate the apparent fraction of mutations with divergent selective effects. In contrast, if positive selection is weaker, recombination will be able to disentangle beneficial mutations from the rest of mutation types and simplify our estimation of pc. One way to ameliorate this problem would be to remove genomic windows with evidence of recent population-specific, complete or partial, selective sweeps and rerun our inference pipeline. For example, these could be regions with low neutral genetic diversity. However, we find this heuristic solution might be difficult to implement in practice.

Supplementary Figure 7 shows the observed level of neutral genetic diversity (measured using Watterson’s theta (Watterson, 1975) and synonymous sites, θs) and the selective constraint (i.e., the ratio of non-synonymous polymorphisms to synonymous polymorphisms per site, Pn/Ps) for each independent simulation run. Note the large decrease in the observed θs, driven entirely by linked selection in Wild populations, relative to the expected level of neutral genetic diversity (expected θs = 0.005 under free recombination). Particularly important is the reduction in the average θs across independent simulation runs in Wild populations when positive selection is rare and strong (θs is ~20% of the expected value), whereas when positive selection is weaker or absent the observed level of genetic diversity is ~40% of the expected value. In strong positive selection scenarios, there may be no heuristic correction or genomic region that escapes genetic draft (Gillespie, 2000), and our current definition and interpretation of pc would be misleading. We also observe that when positive selection is strong, genetic diversity and Pn/Ps are significantly further reduced in both Domesticated and Wild populations, causing the two distributions to largely overlap. As described above for the reconstruction of demographic history, when selective sweeps are strong, the recovered demographic history also tends to overlap between Wild and Domesticated populations. The overlap of demographic histories, neutral genetic diversity and Pn/Ps distributions could be used as a caution signal and as an indicator of strong positive selection and widespread genetic draft. Finally, migration appears to cause a minor reduction in Pn/Ps and increase genetic diversity within Domesticated populations. Thus, migration acts slightly diminishing the Pn/Ps discrepancy between Wild and Domesticated populations.

Implications for empirical analysis of populations

A scenario involving divergent populations, with one undergoing a bottleneck and a shift in the selection regime, may also be relevant in other contexts beyond domestication, such as invasive species, island colonization or recent parapatric and allopatric speciation events. In this work, we observed that selective effects affect the inference of demographic parameters by linked selection, but to different extents depending on the DFE. Background selection contributes to the misinference of domestication divergence time and the duration of the bottleneck, making them appear more ancient and extended than in our simulations. When strong selective sweeps are combined with background selection, the inferred temporal stretch becomes even more pronounced, and the inferred demographic history of both populations overlaps extensively. These demographic distortions in the inference must be considered when interpreting real data using these methods or any other methods that make similar assumptions. Nevertheless, under the assumptions used in this work, we believe that the discretized deleterious DFE is estimated with reasonable accuracy. This suggests that methods designed to infer the entire DFE could be applied first, followed by the estimation of demographic parameters using this information. Interestingly, Johri et al. (2021), using a different approach based on a single population and considering four classes of deleterious mutations, found that while DFE classes were accurately estimated, demographic parameters were not. They proposed a method to jointly infer both demography and deleterious mutations using an ABC framework. Although computationally intensive, this approach may help address some of the inference challenges highlighted in this work.

Another point of interest for empirical geneticists is the development of a new method to jointly infer the DFE between wild and domesticated and their differences in the positive part of the distribution. The 2D dadi extension algorithm allows to infer differences in pwb (the fraction of mutations that are positively selected in the wild population), pc (the fraction of mutations that change the coefficient of selection in the domesticated population), pcb (the fraction of those mutations that become beneficial in the domesticated population).

Figure 4 - Sampling distributions of inferred pc (dadi) are obtained using 100 bootstrap replicates. In light green are shown the scenarios without migration and in dark green the scenarios with migration.

Conclusions

In summary, our use of forward-in-time simulations has provided valuable insights into the inference of complex genetic demographic history and distribution of fitness effects (DFE) for both new and standing amino acid mutations in the context of domestication. Through a comparative analysis of two methods, polyDFE and dadi, and the new implementation of a full 2D-SFS full inference of DFE, we have uncovered the impact of linked selection on the reconstructed demographic history of both wild and domesticated populations. Despite biases in the timelines of domestication events and bottleneck characteristics, the estimation of discretized deleterious DFE remains remarkably reliable, demonstrating the robustness of these analytical approaches in the studied conditions. In particular, the underestimation of effectively beneficial mutations in the DFE highlights the influence of linkage between beneficial and neutral mutations, which requires further consideration in model design and interpretation. In addition, our results shed light on distinguishing scenarios of divergent selective effects between populations under weak and strong positive selection, providing a nuanced understanding of the interplay of evolutionary forces. Nevertheless, we must approach the results of this work with caution, as the simulated demographic and selective patterns are based on specific models/idealizations that may not fully capture the complexities of domestication. On the other hand, as we navigate the complex landscape of domestication, these methodological approaches contribute significantly to unraveling the evolutionary dynamics and adaptive processes that shape the genomes of domesticated organisms, and provide a foundation for future research in this critical area of study.

Acknowledgments

Thanks to Miguel de Navascués and other anonymous reviewers for their fruitful comments and suggestions. Preprint version 5 of this article has been peer-reviewed and recommended by Peer Community In Evolutionary Biology (https://doi.org/10.24072/pci.evolbiol.100795, Fumagalli, 2025).

Funding

SERO was supported by grants AGL2016-78709-R funded by MICIU/AEI/10.13039/501100011033 and by “ERDF A way of making Europe” and grant PID2020-119255GB-I00 funded by MICIU/AEI/10.13039/501100011033. This work was also supported by grants SEV‐2015‐0533 and CEX2019-000902-S funded by MICIU/AEI/10.13039/501100011033, and by the CERCA Programme / Generalitat de Catalunya. RNG and DC were supported by the National Institute of General Medical Sciences of the National Institutes of Health (R35GM149235 and R01GM127348 to RNG). ITV was supported by a predoctoral fellowship funded by MCIN/AEI/ 10.13039/501100011033 through the Grant BES-2017-081139 and by “ESF Investing in your future”. This research was also supported by NSF Grant No. PHY-1748958 and the Gordon and Betty Moore Foundation Grant No. 2919.02.

Conflict of interest disclosure

The authors declare that they comply with the PCI rule of having no financial conflicts of interest in relation to the content of the article. SERO is recommender for PCI EvolBiol and PCI Genomics.

Data, scripts, code, and supplementary information availability

We have added the Diagnostic plots 2D-SFS for synonymous and Nonsynonymous positions obtained with dadi for all the scenarios analyzed, plus seven additional Figures and three additional Tables in the Supplementary Material. This Supplementary Information is available at Zenodo (https://doi.org/10.5281/zenodo.15084050, Castellano 2025), as well as the scripts used in the analyses of this work.


References

[1] Ahmad, H. I.; Ahmad, M. J.; Jabbir, F.; Ahmar, S.; Ahmad, N.; Elokil, A. A.; Chen, J. The Domestication Makeup: Evolution, Survival, and Challenges, Frontiers in Ecology and Evolution, Volume 8 (2020), p. 103 | DOI

[2] Amills, M.; Megens, H.-J.; Manunza, A.; Ramos-Onsins, S. E.; Groenen, M. A. M. A Genomic Perspective on Wild Boar Demography and Evolution, Ecology, Conservation and Management of Wild Pigs and Peccaries, Cambridge University Press, Cambridge, 2017, pp. 376-387 | DOI

[3] Anaconda Software Distribution. Computer software. Vers. 2-2.4.0., https://anaconda.com., 2016

[4] Andersson, L. Genetics of Animal Domestication, Biodiversity in Agriculture: Domestication, Evolution, and Sustainability, Cambridge University Press, Cambridge, 2012, pp. 260-274 | DOI

[5] Arnoux, S.; Fraïsse, C.; Sauvage, C. Genomic inference of complex domestication histories in three Solanaceae species, Journal of Evolutionary Biology, Volume 34 (2021) no. 2, pp. 270-283 | DOI

[6] Avni, R.; Nave, M.; Barad, O.; Baruch, K.; Twardziok, S. O.; Gundlach, H.; Hale, I.; Mascher, M.; Spannagl, M.; Wiebe, K.; Jordan, K. W.; Golan, G.; Deek, J.; Ben-Zvi, B.; Ben-Zvi, G.; Himmelbach, A.; MacLachlan, R. P.; Sharpe, A. G.; Fritz, A.; Ben-David, R.; Budak, H.; Fahima, T.; Korol, A.; Faris, J. D.; Hernandez, A.; Mikel, M. A.; Levy, A. A.; Steffenson, B.; Maccaferri, M.; Tuberosa, R.; Cattivelli, L.; Faccioli, P.; Ceriotti, A.; Kashkush, K.; Pourkheirandish, M.; Komatsuda, T.; Eilam, T.; Sela, H.; Sharon, A.; Ohad, N.; Chamovitz, D. A.; Mayer, K. F. X.; Stein, N.; Ronen, G.; Peleg, Z.; Pozniak, C. J.; Akhunov, E. D.; Distelfeld, A. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication, Science (New York, N.Y.), Volume 357 (2017) no. 6346, pp. 93-97 | DOI

[7] Barton, H. J.; Zeng, K. New Methods for Inferring the Distribution of Fitness Effects for INDELs and SNPs, Molecular Biology and Evolution, Volume 35 (2018) no. 6, pp. 1536-1546 | DOI

[8] Beissinger, T. M.; Wang, L.; Crosby, K.; Durvasula, A.; Hufford, M. B.; Ross-Ibarra, J. Recent demography drives changes in linked selection across the maize genome, Nature Plants, Volume 2 (2016), p. 16084 | DOI

[9] Booker, T. R. Inferring Parameters of the Distribution of Fitness Effects of New Mutations When Beneficial Mutations Are Strongly Advantageous and Rare, G3 (Bethesda, Md.), Volume 10 (2020) no. 7, pp. 2317-2326 | DOI

[10] Boyko, A. R.; Boyko, R. H.; Boyko, C. M.; Parker, H. G.; Castelhano, M.; Corey, L.; Degenhardt, J. D.; Auton, A.; Hedimbi, M.; Kityo, R.; Ostrander, E. A.; Schoenebeck, J.; Todhunter, R. J.; Jones, P.; Bustamante, C. D. Complex population structure in African village dogs and its implications for inferring dog domestication history, Proceedings of the National Academy of Sciences of the United States of America, Volume 106 (2009) no. 33, pp. 13903-13908 | DOI

[11] Boyko, A. R.; Williamson, S. H.; Indap, A. R.; Degenhardt, J. D.; Hernandez, R. D.; Lohmueller, K. E.; Adams, M. D.; Schmidt, S.; Sninsky, J. J.; Sunyaev, S. R.; White, T. J.; Nielsen, R.; Clark, A. G.; Bustamante, C. D. Assessing the Evolutionary Impact of Amino Acid Mutations in the Human Genome, PLoS Genetics, Volume 4 (2008) no. 5 | DOI

[12] Caicedo, A. L.; Williamson, S. H.; Hernandez, R. D.; Boyko, A.; Fledel-Alon, A.; York, T. L.; Polato, N. R.; Olsen, K. M.; Nielsen, R.; McCouch, S. R.; Bustamante, C. D.; Purugganan, M. D. Genome-wide patterns of nucleotide polymorphism in domesticated rice, PLoS genetics, Volume 3 (2007) no. 9, pp. 1745-1756 | DOI

[13] Carneiro, M.; Piorno, V.; Rubin, C.-J.; Alves, J. M.; Ferrand, N.; Alves, P. C.; Andersson, L. Candidate genes underlying heritable differences in reproductive seasonality between wild and domestic rabbits, Animal Genetics, Volume 46 (2015) no. 4, pp. 418-425 | DOI

[14] Castellano, D. Domestication-DFE-PCI, Zenodo, 2025 | DOI

[15] Castellano, D.; Macià, M. C.; Tataru, P.; Bataillon, T.; Munch, K. Comparison of the Full Distribution of Fitness Effects of New Amino Acid Mutations Across Great Apes, Genetics, Volume 213 (2019) no. 3, pp. 953-966 | DOI

[16] Charlesworth, B.; Morgan, M. T.; Charlesworth, D. The effect of deleterious mutations on neutral molecular variation, Genetics, Volume 134 (1993) no. 4, pp. 1289-1303 | DOI

[17] Charlesworth, D.; Charlesworth, B.; Morgan, M. T. The pattern of neutral molecular variation under the background selection model, Genetics, Volume 141 (1995) no. 4, pp. 1619-1632 | DOI

[18] Coffman, A. J.; Hsieh, P. H.; Gravel, S.; Gutenkunst, R. N. Computationally Efficient Composite Likelihood Statistics for Demographic Inference, Molecular Biology and Evolution, Volume 33 (2016) no. 2, pp. 591-593 | DOI

[19] Comeron, J. M. Background selection as null hypothesis in population genomics: insights and challenges from Drosophila studies, Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, Volume 372 (2017) no. 1736, p. 20160471 | DOI

[20] Courtier-Orgogozo, V.; Martin, A. The coding loci of evolution and domestication: current knowledge and implications for bio-inspired genome editing, The Journal of Experimental Biology, Volume 223 (2020) no. Pt Suppl 1, p. jeb208934 | DOI

[21] Crow, J. F.; Kimura, M. Efficiency of truncation selection, Proceedings of the National Academy of Sciences of the United States of America, Volume 76 (1979) no. 1, pp. 396-399 | DOI

[22] Daigle, A.; Johri, P. Hill-Robertson interference may bias the inference of fitness effects of new mutations in highly selfing species, bioRxiv: The Preprint Server for Biology (2024), p. 2024 | DOI

[23] Dayan, T. Early Domesticated Dogs of the Near East, Journal of Archaeological Science, Volume 21 (1994) no. 5, pp. 633-640 | DOI

[24] Driscoll, C. A.; Macdonald, D. W.; O'Brien, S. J. From wild animals to domestic pets, an evolutionary view of domestication, Proceedings of the National Academy of Sciences, Volume 106 (2009) no. supplement_1, pp. 9971-9978 | DOI

[25] Ewing, G. B.; Jensen, J. D. The consequences of not accounting for background selection in demographic inference, Molecular Ecology, Volume 25 (2016) no. 1, pp. 135-141 | DOI

[26] Frantz, L. A. F.; Schraiber, J. G.; Madsen, O.; Megens, H.-J.; Cagan, A.; Bosse, M.; Paudel, Y.; Crooijmans, R. P. M. A.; Larson, G.; Groenen, M. A. M. Evidence of long-term gene flow and selection during domestication from analyses of Eurasian wild and domestic pig genomes, Nature Genetics, Volume 47 (2015) no. 10, pp. 1141-1148 | DOI

[27] Fumagalli, M. How the analysis of the distribution of fitness effects can reveal novel insights onto the genetics of domestication, Peer Community in Evolutionary Biology (2025) | DOI

[28] Galtier, N. Adaptive Protein Evolution in Animals and the Effective Population Size Hypothesis, PLoS genetics, Volume 12 (2016) no. 1, p. e1005774 | DOI

[29] Gilbert, K. J.; Zdraljevic, S.; Cook, D. E.; Cutter, A. D.; Andersen, E. C.; Baer, C. F. The distribution of mutational effects on fitness in Caenorhabditis elegans inferred from standing genetic variation, Genetics, Volume 220 (2022) no. 1, p. iyab166 | DOI

[30] Gillespie, J. H. Genetic drift in an infinite population. The pseudohitchhiking model., Genetics, Volume 155 (2000) no. 2, pp. 909-919

[31] Granleese, T.; Clark, S. A.; van der Werf, J. H. J. Genotyping strategies of selection candidates in livestock breeding programmes, Volume 136 (2019) no. 2, pp. 91-101 | DOI

[32] Groenen, M. A. M.; Archibald, A. L.; Uenishi, H.; Tuggle, C. K.; Takeuchi, Y.; Rothschild, M. F.; Rogel-Gaillard, C.; Park, C.; Milan, D.; Megens, H.-J.; Li, S.; Larkin, D. M.; Kim, H.; Frantz, L. A. F.; Caccamo, M.; Ahn, H.; Aken, B. L.; Anselmo, A.; Anthon, C.; Auvil, L.; Badaoui, B.; Beattie, C. W.; Bendixen, C.; Berman, D.; Blecha, F.; Blomberg, J.; Bolund, L.; Bosse, M.; Botti, S.; Bujie, Z.; Bystrom, M.; Capitanu, B.; Carvalho-Silva, D.; Chardon, P.; Chen, C.; Cheng, R.; Choi, S.-H.; Chow, W.; Clark, R. C.; Clee, C.; Crooijmans, R. P. M. A.; Dawson, H. D.; Dehais, P.; De Sapio, F.; Dibbits, B.; Drou, N.; Du, Z.-Q.; Eversole, K.; Fadista, J.; Fairley, S.; Faraut, T.; Faulkner, G. J.; Fowler, K. E.; Fredholm, M.; Fritz, E.; Gilbert, J. G. R.; Giuffra, E.; Gorodkin, J.; Griffin, D. K.; Harrow, J. L.; Hayward, A.; Howe, K.; Hu, Z.-L.; Humphray, S. J.; Hunt, T.; Hornshøj, H.; Jeon, J.-T.; Jern, P.; Jones, M.; Jurka, J.; Kanamori, H.; Kapetanovic, R.; Kim, J.; Kim, J.-H.; Kim, K.-W.; Kim, T.-H.; Larson, G.; Lee, K.; Lee, K.-T.; Leggett, R.; Lewin, H. A.; Li, Y.; Liu, W.; Loveland, J. E.; Lu, Y.; Lunney, J. K.; Ma, J.; Madsen, O.; Mann, K.; Matthews, L.; McLaren, S.; Morozumi, T.; Murtaugh, M. P.; Narayan, J.; Nguyen, D. T.; Ni, P.; Oh, S.-J.; Onteru, S.; Panitz, F.; Park, E.-W.; Park, H.-S.; Pascal, G.; Paudel, Y.; Perez-Enciso, M.; Ramirez-Gonzalez, R.; Reecy, J. M.; Rodriguez-Zas, S.; Rohrer, G. A.; Rund, L.; Sang, Y.; Schachtschneider, K.; Schraiber, J. G.; Schwartz, J.; Scobie, L.; Scott, C.; Searle, S.; Servin, B.; Southey, B. R.; Sperber, G.; Stadler, P.; Sweedler, J. V.; Tafer, H.; Thomsen, B.; Wali, R.; Wang, J.; Wang, J.; White, S.; Xu, X.; Yerle, M.; Zhang, G.; Zhang, J.; Zhang, J.; Zhao, S.; Rogers, J.; Churcher, C.; Schook, L. B. Analyses of pig genomes provide insight into porcine demography and evolution, Nature, Volume 491 (2012) no. 7424, pp. 393-398 | DOI

[33] Gutenkunst, R. N.; Hernandez, R. D.; Williamson, S. H.; Bustamante, C. D. Inferring the Joint Demographic History of Multiple Populations from Multidimensional SNP Frequency Data, PLoS Genetics, Volume 5 (2009) no. 10 | DOI

[34] Haller, B. C.; Messer, P. W. SLiM 4: Multispecies Eco-Evolutionary Modeling, The American Naturalist, Volume 201 (2023) no. 5, p. E127-E139 | DOI

[35] Huang, X.; Fortier, A. L.; Coffman, A. J.; Struck, T. J.; Irby, M. N.; James, J. E.; León-Burguete, J. E.; Ragsdale, A. P.; Gutenkunst, R. N. Inferring Genome-Wide Correlations of Mutation Fitness Effects between Populations, Molecular Biology and Evolution, Volume 38 (2021) no. 10, pp. 4588-4602 | DOI

[36] Huang, X.; Kurata, N.; Wei, X.; Wang, Z.-X.; Wang, A.; Zhao, Q.; Zhao, Y.; Liu, K.; Lu, H.; Li, W.; Guo, Y.; Lu, Y.; Zhou, C.; Fan, D.; Weng, Q.; Zhu, C.; Huang, T.; Zhang, L.; Wang, Y.; Feng, L.; Furuumi, H.; Kubo, T.; Miyabayashi, T.; Yuan, X.; Xu, Q.; Dong, G.; Zhan, Q.; Li, C.; Fujiyama, A.; Toyoda, A.; Lu, T.; Feng, Q.; Qian, Q.; Li, J.; Han, B. A map of rice genome variation reveals the origin of cultivated rice, Nature, Volume 490 (2012) no. 7421, pp. 497-501 | DOI

[37] Jain, K.; Stephan, W. Rapid Adaptation of a Polygenic Trait After a Sudden Environmental Shift, Genetics, Volume 206 (2017) no. 1, pp. 389-406 | DOI

[38] Jain, K.; Stephan, W. Modes of Rapid Polygenic Adaptation, Molecular Biology and Evolution, Volume 34 (2017) no. 12, pp. 3169-3175 | DOI

[39] Jerison, E. R.; Kryazhimskiy, S.; Desai, M. M. Pleiotropic consequences of adaptation across gradations of environmental stress in budding yeast, http://arxiv.org/abs/1409.7839, 2014 | DOI

[40] Johri, P.; Riall, K.; Becher, H.; Excoffier, L.; Charlesworth, B.; Jensen, J. D. The Impact of Purifying and Background Selection on the Inference of Population History: Problems and Prospects, Molecular Biology and Evolution, Volume 38 (2021) no. 7, pp. 2986-3003 | DOI

[41] Keightley, P. D.; Eyre-Walker, A. Joint inference of the distribution of fitness effects of deleterious mutations and population demography based on nucleotide polymorphism frequencies, Genetics, Volume 177 (2007) no. 4, pp. 2251-2261 | DOI

[42] Kim, B. Y.; Huber, C. D.; Lohmueller, K. E. Inference of the Distribution of Selection Coefficients for New Nonsynonymous Mutations Using Large Samples, Genetics, Volume 206 (2017) no. 1, pp. 345-361 | DOI

[43] Kondrashov, A. S. Deleterious mutations and the evolution of sexual reproduction, Nature, Volume 336 (1988) no. 6198, pp. 435-440 | DOI

[44] Van Laere, A.-S.; Nguyen, M.; Braunschweig, M.; Nezer, C.; Collette, C.; Moreau, L.; Archibald, A. L.; Haley, C. S.; Buys, N.; Tally, M.; Andersson, G.; Georges, M.; Andersson, L. A regulatory mutation in IGF2 causes a major QTL effect on muscle growth in the pig, Nature, Volume 425 (2003) no. 6960, pp. 832-836 | DOI

[45] Larson, G.; Burger, J. A population genetics view of animal domestication, Trends in genetics: TIG, Volume 29 (2013) no. 4, pp. 197-205 | DOI

[46] Larson, G.; Piperno, D. R.; Allaby, R. G.; Purugganan, M. D.; Andersson, L.; Arroyo-Kalin, M.; Barton, L.; Climer Vigueira, C.; Denham, T.; Dobney, K.; Doust, A. N.; Gepts, P.; Gilbert, M. T. P.; Gremillion, K. J.; Lucas, L.; Lukens, L.; Marshall, F. B.; Olsen, K. M.; Pires, J. C.; Richerson, P. J.; Rubio de Casas, R.; Sanjur, O. I.; Thomas, M. G.; Fuller, D. Q. Current perspectives and the future of domestication studies, Proceedings of the National Academy of Sciences of the United States of America, Volume 111 (2014) no. 17, pp. 6139-6146 | DOI

[47] Leno-Colorado, J.; Guirao-Rico, S.; Pérez-Enciso, M.; Ramos-Onsins, S. E. Pervasive selection pressure in wild and domestic pigs, https://www.biorxiv.org/content/10.1101/2020.09.09.289439v3, 2022 | DOI

[48] Li, N.; Xu, R.; Duan, P.; Li, Y. Control of grain size in rice, Plant Reproduction, Volume 31 (2018) no. 3, pp. 237-251 | DOI

[49] Li, X.; Yang, J.; Shen, M.; Xie, X.-L.; Liu, G.-J.; Xu, Y.-X.; Lv, F.-H.; Yang, H.; Yang, Y.-L.; Liu, C.-B.; Zhou, P.; Wan, P.-C.; Zhang, Y.-S.; Gao, L.; Yang, J.-Q.; Pi, W.-H.; Ren, Y.-L.; Shen, Z.-Q.; Wang, F.; Deng, J.; Xu, S.-S.; Salehian-Dehkordi, H.; Hehua, E.; Esmailizadeh, A.; Dehghani-Qanatqestani, M.; Štěpánek, O.; Weimann, C.; Erhardt, G.; Amane, A.; Mwacharo, J. M.; Han, J.-L.; Hanotte, O.; Lenstra, J. A.; Kantanen, J.; Coltman, D. W.; Kijas, J. W.; Bruford, M. W.; Periasamy, K.; Wang, X.-H.; Li, M.-H. Whole-genome resequencing of wild and domestic sheep identifies genes associated with morphological and agronomic traits, Nature Communications, Volume 11 (2020) no. 1, p. 2815 | DOI

[50] Librado, P.; Khan, N.; Fages, A.; Kusliy, M. A.; Suchan, T.; Tonasso-Calvière, L.; Schiavinato, S.; Alioglu, D.; Fromentier, A.; Perdereau, A.; Aury, J.-M.; Gaunitz, C.; Chauvey, L.; Seguin-Orlando, A.; Der Sarkissian, C.; Southon, J.; Shapiro, B.; Tishkin, A. A.; Kovalev, A. A.; Alquraishi, S.; Alfarhan, A. H.; Al-Rasheid, K. A. S.; Seregély, T.; Klassen, L.; Iversen, R.; Bignon-Lau, O.; Bodu, P.; Olive, M.; Castel, J.-C.; Boudadi-Maligne, M.; Alvarez, N.; Germonpré, M.; Moskal-Del Hoyo, M.; Wilczyński, J.; Pospuła, S.; Lasota-Kuś, A.; Tunia, K.; Nowak, M.; Rannamäe, E.; Saarma, U.; Boeskorov, G.; Lōugas, L.; Kyselý, R.; Peške, L.; Bălășescu, A.; Dumitrașcu, V.; Dobrescu, R.; Gerber, D.; Kiss, V.; Szécsényi-Nagy, A.; Mende, B. G.; Gallina, Z.; Somogyi, K.; Kulcsár, G.; Gál, E.; Bendrey, R.; Allentoft, M. E.; Sirbu, G.; Dergachev, V.; Shephard, H.; Tomadini, N.; Grouard, S.; Kasparov, A.; Basilyan, A. E.; Anisimov, M. A.; Nikolskiy, P. A.; Pavlova, E. Y.; Pitulko, V.; Brem, G.; Wallner, B.; Schwall, C.; Keller, M.; Kitagawa, K.; Bessudnov, A. N.; Bessudnov, A.; Taylor, W.; Magail, J.; Gantulga, J.-O.; Bayarsaikhan, J.; Erdenebaatar, D.; Tabaldiev, K.; Mijiddorj, E.; Boldgiv, B.; Tsagaan, T.; Pruvost, M.; Olsen, S.; Makarewicz, C. A.; Valenzuela Lamas, S.; Albizuri Canadell, S.; Nieto Espinet, A.; Iborra, M. P.; Lira Garrido, J.; Rodríguez González, E.; Celestino, S.; Olària, C.; Arsuaga, J. L.; Kotova, N.; Pryor, A.; Crabtree, P.; Zhumatayev, R.; Toleubaev, A.; Morgunova, N. L.; Kuznetsova, T.; Lordkipanize, D.; Marzullo, M.; Prato, O.; Bagnasco Gianni, G.; Tecchiati, U.; Clavel, B.; Lepetz, S.; Davoudi, H.; Mashkour, M.; Berezina, N. Y.; Stockhammer, P. W.; Krause, J.; Haak, W.; Morales-Muñiz, A.; Benecke, N.; Hofreiter, M.; Ludwig, A.; Graphodatsky, A. S.; Peters, J.; Kiryushin, K. Y.; Iderkhangai, T.-O.; Bokovenko, N. A.; Vasiliev, S. K.; Seregin, N. N.; Chugunov, K. V.; Plasteeva, N. A.; Baryshnikov, G. F.; Petrova, E.; Sablin, M.; Ananyevskaya, E.; Logvin, A.; Shevnina, I.; Logvin, V.; Kalieva, S.; Loman, V.; Kukushkin, I.; Merz, I.; Merz, V.; Sakenov, S.; Varfolomeyev, V.; Usmanova, E.; Zaibert, V.; Arbuckle, B.; Belinskiy, A. B.; Kalmykov, A.; Reinhold, S.; Hansen, S.; Yudin, A. I.; Vybornov, A. A.; Epimakhov, A.; Berezina, N. S.; Roslyakova, N.; Kosintsev, P. A.; Kuznetsov, P. F.; Anthony, D.; Kroonen, G. J.; Kristiansen, K.; Wincker, P.; Outram, A.; Orlando, L. The origins and spread of domestic horses from the Western Eurasian steppes, Nature, Volume 598 (2021) no. 7882, pp. 634-640 | DOI

[51] Lozano, R.; Gazave, E.; Dos Santos, J. P. R.; Stetter, M. G.; Valluru, R.; Bandillo, N.; Fernandes, S. B.; Brown, P. J.; Shakoor, N.; Mockler, T. C.; Cooper, E. A.; Taylor Perkins, M.; Buckler, E. S.; Ross-Ibarra, J.; Gore, M. A. Comparative evolutionary genetics of deleterious load in sorghum and maize, Nature plants, Volume 7 (2021) no. 1, pp. 17-24 | DOI

[52] Lu, J.; Tang, T.; Tang, H.; Huang, J.; Shi, S.; Wu, C.-I. The accumulation of deleterious mutations in rice genomes: a hypothesis on the cost of domestication, Trends in genetics: TIG, Volume 22 (2006) no. 3, pp. 126-131 | DOI

[53] Makanjuola, B. O.; Miglior, F.; Abdalla, E. A.; Maltecca, C.; Schenkel, F. S.; Baes, C. F. Effect of genomic selection on rate of inbreeding and coancestry and effective population size of Holstein and Jersey cattle populations, Journal of Dairy Science, Volume 103 (2020) no. 6, pp. 5183-5199 | DOI

[54] Marsden, C. D.; Ortega-Del Vecchyo, D.; O'Brien, D. P.; Taylor, J. F.; Ramirez, O.; Vilà, C.; Marques-Bonet, T.; Schnabel, R. D.; Wayne, R. K.; Lohmueller, K. E. Bottlenecks and selective sweeps during domestication have increased deleterious genetic variation in dogs, Proceedings of the National Academy of Sciences of the United States of America, Volume 113 (2016) no. 1, pp. 152-157 | DOI

[55] Marsh, J. I.; Johri, P. Biases in ARG-Based Inference of Historical Population Size in Populations Experiencing Selection, Molecular Biology and Evolution, Volume 41 (2024) no. 7, p. msae118 | DOI

[56] Messer, P. W.; Petrov, D. A. Frequent adaptation and the McDonald-Kreitman test, Proceedings of the National Academy of Sciences of the United States of America, Volume 110 (2013) no. 21, pp. 8615-8620 | DOI

[57] Morell Miranda, P.; Soares, A. E. R.; Günther, T. Demographic reconstruction of the Western sheep expansion from whole-genome sequences, G3 (Bethesda, Md.), Volume 13 (2023) no. 11, p. jkad199 | DOI

[58] Moyers, B. T.; Morrell, P. L.; McKay, J. K. Genetic Costs of Domestication and Improvement, The Journal of Heredity, Volume 109 (2018) no. 2, pp. 103-116 | DOI

[59] Murray, C.; Huerta-Sanchez, E.; Casey, F.; Bradley, D. G. Cattle demographic history modelled from autosomal sequence variation, Philosophical Transactions of the Royal Society B: Biological Sciences, Volume 365 (2010) no. 1552, pp. 2531-2539 | DOI

[60] Nicolaisen, L. E.; Desai, M. M. Distortions in genealogies due to purifying selection and recombination, Genetics, Volume 195 (2013) no. 1, pp. 221-230 | DOI

[61] Nielsen, R. Molecular signatures of natural selection, Annual Review of Genetics, Volume 39 (2005), pp. 197-218 | DOI

[62] Ohta, T. The mutational load of a multigene family with uniform members, Genetical Research, Volume 53 (1989) no. 2, pp. 141-145 | DOI

[63] Pacifici, M.; Santini, L.; Marco, M. D.; Baisero, D.; Francucci, L.; Marasini, G. G.; Visconti, P.; Rondinini, C. Generation length for mammals, Nature Conservation, Volume 5 (2013), pp. 89-94 | DOI

[64] Purugganan, M. D.; Fuller, D. Q. The nature of selection during plant domestication, Nature, Volume 457 (2009) no. 7231, pp. 843-848 | DOI

[65] Qanbari, S.; Rubin, C.-J.; Maqbool, K.; Weigend, S.; Weigend, A.; Geibel, J.; Kerje, S.; Wurmser, C.; Peterson, A. T.; Brisbin, I. L.; Preisinger, R.; Fries, R.; Simianer, H.; Andersson, L. Genetics of adaptation in modern chicken, PLoS genetics, Volume 15 (2019) no. 4, p. e1007989 | DOI

[66] Ragsdale, A. P.; Coffman, A. J.; Hsieh, P.; Struck, T. J.; Gutenkunst, R. N. Triallelic Population Genomics for Inferring Correlated Fitness Effects of Same Site Nonsynonymous Mutations, Genetics, Volume 203 (2016) no. 1, pp. 513-523 | DOI

[67] Raybaut, P. Spyder-documentation, 2009 (https://docs.spyder-ide.org/current/index.html)

[68] Redding, R. W. The Pig and the Chicken in the Middle East: Modeling Human Subsistence Behavior in the Archaeological Record Using Historical and Animal Husbandry Data, Journal of Archaeological Research, Volume 23 (2015) no. 4, pp. 325-368 | DOI

[69] Rossum, G.; Drake, F. L. Python 3 Reference Manual: (Python Documentation Manual Part 2), CreateSpace Independent Publishing Platform, 2009

[70] Rubin, C.-J.; Zody, M. C.; Eriksson, J.; Meadows, J. R. S.; Sherwood, E.; Webster, M. T.; Jiang, L.; Ingman, M.; Sharpe, T.; Ka, S.; Hallböök, F.; Besnier, F.; Carlborg, Ö.; Bed’hom, B.; Tixier-Boichard, M.; Jensen, P.; Siegel, P.; Lindblad-Toh, K.; Andersson, L. Whole-genome resequencing reveals loci under selection during chicken domestication, Nature, Volume 464 (2010) no. 7288, pp. 587-591 | DOI

[71] Schrider, D. R.; Shanku, A. G.; Kern, A. D. Effects of Linked Selective Sweeps on Demographic Inference and Model Selection, Genetics, Volume 204 (2016) no. 3, pp. 1207-1223 | DOI

[72] Shomura, A.; Izawa, T.; Ebana, K.; Ebitani, T.; Kanegae, H.; Konishi, S.; Yano, M. Deletion in a gene associated with grain size increased yields during rice domestication, Nature Genetics, Volume 40 (2008) no. 8, pp. 1023-1028 | DOI

[73] Stephan, W.; John, S. Polygenic Adaptation in a Population of Finite Size, Entropy (Basel, Switzerland), Volume 22 (2020) no. 8, p. 907 | DOI

[74] Tataru, P.; Bataillon, T. polyDFEv2.0: testing for invariance of the distribution of fitness effects within and across species, Bioinformatics (Oxford, England), Volume 35 (2019) no. 16, pp. 2868-2869 | DOI

[75] Tataru, P.; Mollion, M.; Glémin, S.; Bataillon, T. Inference of Distribution of Fitness Effects and Proportion of Adaptive Substitutions from Polymorphism Data, Genetics, Volume 207 (2017) no. 3, pp. 1103-1119 | DOI

[76] Todd, E. T.; Tonasso-Calvière, L.; Chauvey, L.; Schiavinato, S.; Fages, A.; Seguin-Orlando, A.; Clavel, P.; Khan, N.; Pérez Pardal, L.; Patterson Rosa, L.; Librado, P.; Ringbauer, H.; Verdugo, M.; Southon, J.; Aury, J.-M.; Perdereau, A.; Vila, E.; Marzullo, M.; Prato, O.; Tecchiati, U.; Bagnasco Gianni, G.; Tagliacozzo, A.; Tinè, V.; Alhaique, F.; Cardoso, J. L.; Valente, M. J.; Telles Antunes, M.; Frantz, L.; Shapiro, B.; Bradley, D. G.; Boulbes, N.; Gardeisen, A.; Horwitz, L. K.; Öztan, A.; Arbuckle, B. S.; Onar, V.; Clavel, B.; Lepetz, S.; Vahdati, A. A.; Davoudi, H.; Mohaseb, A.; Mashkour, M.; Bouchez, O.; Donnadieu, C.; Wincker, P.; Brooks, S. A.; Beja-Pereira, A.; Wu, D.-D.; Orlando, L. The genomic history and global expansion of domestic donkeys, Science (New York, N.Y.), Volume 377 (2022) no. 6611, pp. 1172-1180 | DOI

[77] Torres, R.; Stetter, M. G.; Hernandez, R. D.; Ross-Ibarra, J. The Temporal Dynamics of Background Selection in Nonequilibrium Populations, Genetics, Volume 214 (2020) no. 4, pp. 1019-1030 | DOI

[78] Watterson, G. On the number of segregating sites in genetical models without recombination, Theoretical Population Biology, Volume 7 (1975) no. 2, pp. 256-276 | DOI

[79] Wright, S. I.; Bi, I. V.; Schroeder, S. G.; Yamasaki, M.; Doebley, J. F.; McMullen, M. D.; Gaut, B. S. The effects of artificial selection on the maize genome, Science (New York, N.Y.), Volume 308 (2005) no. 5726, pp. 1310-1314 | DOI

[80] Yang, J.; Li, W.-R.; Lv, F.-H.; He, S.-G.; Tian, S.-L.; Peng, W.-F.; Sun, Y.-W.; Zhao, Y.-X.; Tu, X.-L.; Zhang, M.; Xie, X.-L.; Wang, Y.-T.; Li, J.-Q.; Liu, Y.-G.; Shen, Z.-Q.; Wang, F.; Liu, G.-J.; Lu, H.-F.; Kantanen, J.; Han, J.-L.; Li, M.-H.; Liu, M.-J. Whole-Genome Sequencing of Native Sheep Provides Insights into Rapid Adaptations to Extreme Environments, Molecular Biology and Evolution, Volume 33 (2016) no. 10, pp. 2576-2592 | DOI

[81] Zeder, M. A. The Domestication of Animals, Journal of Anthropological Research, Volume 68 (2012) no. 2, pp. 161-190 | DOI

[82] Zeder, M. A. Core questions in domestication research, Proceedings of the National Academy of Sciences of the United States of America, Volume 112 (2015) no. 11, pp. 3191-3198 | DOI

[83] Zeder, M. A.; Emshwiller, E.; Smith, B. D.; Bradley, D. G. Documenting domestication: the intersection of genetics and archaeology, Trends in genetics: TIG, Volume 22 (2006) no. 3, pp. 139-155 | DOI

[84] Zeng, K.; Charlesworth, B. The Joint Effects of Background Selection and Genetic Recombination on Local Gene Genealogies, Genetics, Volume 189 (2011) no. 1, pp. 251-266 | DOI