Methods to detect selection – Methods in population genomics

Checking for population structure is an essential step when performing analyses on genome-level datasets. Neglecting it can bias demographic inferences (Chikhi et al., 2010; Heller et al., 2013) or the detection of loci under selection (e.g. Nielsen et al., 2007); thus, checking for outlier individuals and assessing the global structure is required prior to any more sophisticated analysis. Conversely, selection shapes correlations (i) between alleles and the environment at selected loci and (ii) between alleles at different loci, whether or not these loci are directly under selection. These correlations are reflected, respectively, in (i) variation in polymorphism within and between populations and (ii) linkage disequilibrium (LD) between loci. If selection is widespread in the genome, inferences of population history can therefore be biased, which makes it necessary to study selection and population structure jointly.
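As a rough illustration of the LD between loci mentioned in point (ii), the sketch below computes the classical r² statistic for a pair of biallelic loci from phased haplotypes. The function name `ld_r2` and the 0/1 allele coding are assumptions made for this example only; none of the tools in the table is implied to work this way.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic loci.

    `haplotypes` is a list of (a, b) tuples, one per phased haplotype,
    with alleles coded 0/1 at the first and second locus (assumed coding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    # Frequency of allele 1 at each locus, and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D measures the departure from independence between the two loci.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined when either locus is monomorphic.
        return float("nan")
    return d * d / denom


linked = [(1, 1), (1, 1), (0, 0), (0, 0)]       # alleles always co-occur
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]     # alleles are independent
print(ld_r2(linked), ld_r2(unlinked))
```

With unphased genotypes the haplotype frequency cannot be counted directly and has to be estimated, which is one reason several LD-based tools in the table require phased data.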
This table summarizes current methods to detect selection.

Software	Class of method	Purpose	Specifics	Issues and warnings	Link	Reference
ARGWeaver/ARGweaver-D	Ancestral Recombination Graphs/coalescence	Retracing the whole process of recombination and coalescence along a genome	Provides quantitative estimates for TMRCA and topologies at each locus. ARGWeaver-D can estimate introgression. Estimates effective population size. Provides tools to extract summary statistics for the topologies retrieved. Does not require phasing (but slower).	High computing cost. Slower on unphased or low depth data. ARGWeaver-D is not part of the Anaconda (Python) distribution (http://compgen.cshl.edu/ARGweaver/doc/argweaver-d-manual.html)	Can be installed via conda: conda install -c genomedk argweaver and https://github.com/mjhubisz/argweaver and http://compgen.cshl.edu/ARGweaver/doc/argweaver-d-manual.html	(Rasmussen et al., 2014; Hubisz et al., 2020)
GAPIT3	Association	Detecting association with environmental/phenotypical features	Includes most methods for GWAS, including procedures for fast computation, mixed linear models, efficient mixed model association, Bayesian methods such as BLINK, diagnostics such as QQ plots and genotype filtering.	May be slow for very large datasets	https://github.com/jiabowang/GAPIT3	(Wang and Zhang, 2020)
GEMMA	Association	Detecting association with environmental/phenotypical features	Computationally efficient for large-scale datasets	Imports data from PLINK format	http://www.xzlab.org/software.html	(Zhou and Stephens, 2012)
GENABEL	Association	Detecting association with environmental/phenotypic features	Modularity, facilitates correction for population structure/relatedness.	Imports data from PLINK format. No longer supported!	http://www.genabel.org/	(Aulchenko et al., 2007)
PLINK	Association	Detecting association with environmental/phenotypical features	Handles a variety of tests for population structure and relatedness	Population structure/kinship need to be assessed prior to association analysis	http://pngu.mgh.harvard.edu/~purcell/plink/	(Purcell et al., 2007)
Trinculo	Association	Detecting association with environmental/phenotypical features	Specifically designed to handle categorical variables with more than 2 categories. Performs multinomial logistic regression and provides frequentist and Bayesian frameworks.	Requires the LAPACK library in Unix. Allows fine-mapping by testing for correlations between adjacent markers.	https://sourceforge.net/projects/trinculo/	(Jostins and McVean, 2016)
SAMBADA	Association/Environmental association	Detecting association with environmental/phenotypical features	Designed to be fast, underlying models have been kept simple. Allows conversion from PLINK format. Takes into account spatial autocorrelation of individual genotypes. Allows correction for population structure	Does not work with pooled data. Possibly high levels of false positives. Relatedness between samples should be assessed independently. Should be used in combination with LFMM or BayPass.	http://lasig.epfl.ch/sambada	(Stucki et al., 2016)
Relate	Coalescence with recombination	Reconstruct genome-wide genealogies for hundreds of samples	Provides quantitative estimates for TMRCA and topologies at each locus. Infers past demography (similar to PSMC methods). Infers changes in mutation rates. Performs scans for positive selection over discrete time periods.	Requires an outgroup to polarize alleles as ancestral/derived. Requires a recombination map. Does not reconstruct the ARG sensu stricto, and does not estimate uncertainty of the local genealogies	https://myersgroup.github.io/relate/index.html	(Speidel et al., 2019)
diCal-IBD	Coalescent with recombination/IBD	Predicting IBD tracts from demographic models	High IBD sharing suggests recent positive selection.	Uses diCal output to obtain expectations based on demographic scenarios	https://sourceforge.net/projects/dical-ibd/	(Tataru et al., 2014)
VolcanoFinder	Composite likelihood test	Adaptive introgression	Detects a specific signature of increase then drop in diversity near a selected locus brought into a population through introgression	Private input format. Computationally intensive, needs to be run in parallel.	http://degiorgiogroup.fau.edu/vf.html	(Setter et al., 2020)
SCCT	Conditional coalescent tree	Detecting positive selection	Designed for detecting recent positive selection. Claims to be more precise at identifying selected sites	The ancestral state of alleles must be obtained through an outgroup	https://github.com/wavefancy/scct	(Wang et al., 2014)
LFMM	Environmental association	Detecting adaptation to environmental features	Corrects for population structure using latent factors, faster than BAYENV for large datasets	Only performs association with environment	http://membres-timc.imag.fr/Olivier.Francois/lfmm/software.htm	(Frichot et al., 2013)
CLUES	Genealogies at selected loci	Estimate the time at which a beneficial allele rises in frequency	Previous version used ARGWeaver output, current version uses Relate. Provides scripts to plot the trajectory of selected alleles.	Assumes a panmictic population, neglects the effects of selection at linked sites.	https://github.com/35ajstern/clues	(Stern et al., 2019)
PALM	Genealogies at selected loci	Estimate the strength and timing of selection on polygenic traits	Uses genealogies estimated from Relate and results from GWAS to estimate timing and strength of selection for polygenic traits. Should be robust to pleiotropy and residual structure in GWAS	May overestimate selection for older events. Only tested in humans.	https://github.com/35ajstern/palm	(Stern et al., 2020)
startmrca	Genealogies at selected loci	Estimate the time at which a beneficial allele rises in frequency	Compares genealogies between carriers and non-carriers of an advantageous mutation, assuming a star genealogy at selected loci. Can handle VCF files	Requires a reference panel of non-carrier haplotypes. Sensitive to local diversity before the sweep, and to migration events during a sweep. Better suited to recent sweeps.	https://github.com/jhavsmith/startmrca	(Smith et al., 2018)
Ancestry_HMM-S	Identity-by-state tracts	Adaptive introgression	Estimates the selective coefficient of the introgressed loci through a hidden-Markov chain approach.	Requires the time and extent of introgression to be defined by the user	https://github.com/jesvedberg/Ancestry_HMM-S/	(Svedberg et al., 2020)
H12 test	LD	Detecting selection using signatures of high LD	Does not require phased data. Designed for detecting soft sweeps	Coalescent simulations are recommended to evaluate the likelihood of selection	https://github.com/ngarud/SelectionHapStats/	(Garud et al., 2015)
LDna	LD	Detecting selection using signatures of high LD	Can be used to address population structure or detect large inversions or indel polymorphism through LD	The user needs to play with parameters to ensure robustness of SNPs significantly linked	https://github.com/petrikemppainen/LDna	(Kemppainen et al., 2015)
rehh	LD	Detecting selection using signatures of high LD	Can compute both XP-EHH and Rsb. Handles several input formats	Requires phased data and high density of markers	https://cran.r-project.org/web/packages/rehh/index.html	(Gautier and Vitalis, 2012)
Scan for epistatic interaction (based on LD)	LD	Polygenic selection/Epistatic interactions	Uses genome-wide LD between a candidate locus and the rest of the genomes to identify epistatic interactions. Can test SNP-SNP interaction, or between genomic windows (summarizes genotypes through PCA)	Lack of a detailed tutorial	https://github.com/leaboyrie/LD_corpc1	(Boyrie et al., 2020)
Selscan	LD	Detecting selection using signatures of high LD	Includes the nSL statistics dedicated to soft sweep detection	Does not include utilities to specify the ancestral state of alleles. Requires phased data and high density of markers	https://github.com/szpiech/selscan	(Szpiech and Hernandez, 2014)
BALLET	Likelihood test for balancing selection	Detecting balancing selection	Designed for detecting ancient balancing selection. Does not require phasing	Requires whole-genome data and recombination map. The ancestral state of alleles must be obtained through an outgroup	http://www.personal.psu.edu/mxd60/ballet.html	(DeGiorgio et al., 2014)
Betascan2	Local associations of allele frequencies	Detecting balancing selection	Uses correlations in frequencies between genomically proximate SNPs to compute a score. Can incorporate information about ancestral/derived alleles and fixed derived variants, and normalizes the statistic according to the number of sites in a given genomic window. Very detailed tutorial and utilities.	Requires estimating the length distribution of ancestral fragments on each side of the selected site. The 95th percentile can be estimated with the formula L=-log(0.05)/(T*rho), with T the time since selection in generations and rho the effective recombination rate per generation.	https://github.com/ksiewert/BetaScan	(Siewert and Voight, 2017, 2020)
NCD statistics	Local associations of allele frequencies	Detecting balancing selection	Examines the observed and expected frequency spectra of polymorphisms in genomic windows to test for selection. Can incorporate fixed differences with an outgroup (NCD2), but this is not mandatory (NCD1)	Private input format, requires simulations to calibrate the statistics. Requires defining the expected equilibrium frequency of alleles (usually between 0.3 and 0.5). Low sensitivity below these frequencies.	https://github.com/bbitarello/NCD-Statistics	(Bitarello et al., 2018)
Bayescan	Population differentiation	Detecting positive selection and local adaptation	Incorporates uncertainty on allele frequencies due to low sample sizes	Sensitive to priors on the ratio of selected/neutral sites. False positive rates can be high under scenarios of demographic expansion, admixture and isolation by distance	http://cmpg.unibe.ch/software/BayeScan/	(Foll and Gaggiotti, 2008)
FDIST2	Population differentiation	Detecting positive selection and local adaptation	Allows controlling for hierarchical population structure	False positive rate is high when an island model cannot be assumed	http://datadryad.org/resource/doi:10.5061/dryad.v8d05	(Beaumont and Balding, 2004)
PCAdapt	Population differentiation	Detecting positive selection and local adaptation	Does not require populations to be defined. Handles admixed populations and pooled datasets	False positive rate can be high	http://membres-timc.imag.fr/Michael.Blum/PCAdapt.html	(Duforet-Frebourg et al., 2016)
SelEstim	Population differentiation	Detecting positive selection and local adaptation	Can estimate the coefficients of selection. Calibration using a pseudo-observed dataset (can be used in combination with the R function simulate.baypass() in BayPass).	Assumes a Wright-Fisher island model.	http://www1.montpellier.inra.fr/CBGP/software/selestim/	(Vitalis et al., 2014)
Bayenv, BayPass	Population differentiation/Association	Detecting positive selection and adaptation to environmental features	Less sensitive to population demographic history than previous methods. Handle pooled datasets	Significance thresholds need to be determined from pseudo-observed datasets. Calibration with neutral SNPs is recommended. BayPass better estimates the kinship matrix	http://www1.montpellier.inra.fr/CBGP/software/baypass/ ; https://bitbucket.org/tguenther/bayenv2_public/src	(Günther and Coop, 2013; Gautier, 2015)
FLK	Population differentiation/Association	Detecting positive selection and local adaptation	Less sensitive to population demographic history than previous methods	Requires an outgroup population	https://qgsp.jouy.inra.fr/index.php?option=com_content&view=article&id=50&Itemid=55	(Bonhomme et al., 2010)
LSD	Population differentiation/Population-branch test	Detecting positive selection and local adaptation	Compares the level of exclusively shared differences between internal and external branches of a population tree. Allows testing selection occurring on the ancestral branch leading to two populations.	Requires several populations to perform the test. May be less sensitive to selection on standing variation.	https://bitbucket.org/plibrado/LSD	(Librado and Orlando, 2018)
POPBAM	Summary statistics	Detecting selection using AFS, differentiation	Extracts summary statistics directly from BAM files	Does not allow for sophisticated filtering and SNP calling	http://popbam.sourceforge.net/	(Garrigan, 2013)
VCFTOOLS	Summary statistics	Detecting selection using AFS, differentiation	Extracts summary statistics from VCF files. Also allows VCF filtering and conversion	Set of summary statistics not as extensive as PopGenome	http://vcftools.sourceforge.net/	(Danecek et al., 2011)
RAiSD	Summary statistics/Allele frequency spectrum + LD	Detecting positive selection and local adaptation	Scans the genome for composite signals of selective sweeps summarized by the μ statistics. Corrects for the effects of background selection by estimating a threshold value for the statistics based on simulations with background selection	Uses a single population of interest.	https://github.com/alachins/raisd	(Alachiotis and Pavlidis, 2018)
TASSEL	Summary statistics/Association	Detecting association with phenotype	User friendly (Java interface), corrects for relatedness, allows computing summary statistics (LD, diversity)	Requires relatedness to be assessed externally (with e.g. STRUCTURE)	http://www.maizegenetics.net/tassel	(Bradbury et al., 2007)
ANGSD	Summary statistics/Association/Population Branch test	Detecting selection using AFS, differentiation, association with functional traits	Allows for association using generalized linear models	Descriptive statistics. P-values need to be evaluated through coalescent simulations.	http://www.popgen.dk/angsd/index.php/ANGSD	(Korneliussen et al., 2014)
SweeD	Summary statistics/Composite Likelihood test	Designed for whole genome data (or large continuous regions)	Supports Fasta and VCF formats. Estimates selection coefficients.	NA	http://pop-gen.eu/wordpress/software/sweed	(Degiorgio et al., 2016)
selectionTools	Summary statistics/LD	Detecting selection using AFS, differentiation and LD statistics	Allows combining several tools in a single pipeline. Includes phasing tools.	Set of available summary statistics remains limited (same as VCFtools + Fay and Wu's H)	https://github.com/MerrimanLab/selectionTools	(Cadzow et al., 2014)
PAML/CODEML	Summary statistics/phylogeny	Distribution of fitness effects/selection on coding variation	Estimates selection along branches in a phylogeny for genes of interest, contrasting patterns of synonymous and non-synonymous substitutions. A detailed tutorial is available here: https://link.springer.com/protocol/10.1007%2F978-1-4939-1438-8_4#Sec29	Slow for large datasets. Needs to be parallelised.	http://abacus.gene.ucl.ac.uk/software/paml.html	(Yang, 2007)
polyDFE2.0	Summary statistics/phylogeny	Distribution of fitness effects/selection on coding variation	Can test for invariance of DFEs across datasets (genomic regions within species, or different species). No need for divergence estimates (does not assume that the same DFE is shared between species and outgroup). Very detailed tutorial available here: https://link.springer.com/protocol/10.1007/978-1-0716-0199-0_6	Requires a large number of SNPs in each dataset for comparisons to be meaningful	https://github.com/paula-tataru/polyDFE	(Tataru and Bataillon, 2019)
POPGenome	Summary statistics/Population Branch test	Detecting selection using AFS, differentiation	Fast, embedded in R, allows using annotation files (GFF/GTF format).	Does not perform association, but can be used in combination with GENABEL within R	https://cran.r-project.org/web/packages/PopGenome/index.html	(Pfeifer et al., 2014)
ETEToolkit	Summary statistics/phylogeny	Distribution of fitness effects/selection on coding variation	ETEToolkit can call CODEML from Python and can streamline phylogenetic analyses of selection. Estimates selection along branches in a phylogeny for genes of interest, contrasting patterns of synonymous and non-synonymous substitutions.	Slow for large datasets. Needs to be parallelised.	http://etetoolkit.org/	(Huerta-Cepas et al., 2016)
HaplotypeDFEStandingVariation (no official name for the pipeline)	Summary statistics/phylogeny	Distribution of fitness effects/selection on coding variation	Uses variation in the length of tracts of Identity-by-State to infer the distribution of fitness effect.	The pipeline relies on in-house simulators and ABC to obtain more robust estimates of the DFE. Comprehensive but may be difficult to deploy for a naive user.	https://github.com/dortegadelv/HaplotypeDFEStandingVariation	(Ortega-Del Vecchyo et al., 2022)
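
The window-size formula quoted in the Betascan2 row, L=-log(0.05)/(T*rho), can be evaluated with a few lines of Python. This is only a sketch under stated assumptions: the logarithm is taken to be the natural logarithm, rho is taken to be expressed per base pair per generation (so that L comes out in base pairs), and the function name `ancestral_fragment_length` is hypothetical and not part of BetaScan.

```python
import math


def ancestral_fragment_length(t_generations, rho, tail=0.05):
    """Evaluate L = -log(tail) / (T * rho).

    t_generations: time since selection, in generations (T in the table).
    rho: effective recombination rate per generation; assumed here to be
         per base pair, so that the result is in base pairs.
    tail: tail probability; 0.05 reproduces the formula from the table.
    """
    if t_generations <= 0 or rho <= 0:
        raise ValueError("t_generations and rho must be positive")
    if not 0 < tail < 1:
        raise ValueError("tail must lie strictly between 0 and 1")
    # Natural logarithm is assumed.
    return -math.log(tail) / (t_generations * rho)


# Illustrative values only: T = 100,000 generations, rho = 1e-8 per bp.
length_bp = ancestral_fragment_length(100_000, 1e-8)
print(round(length_bp))  # about 2996 bp on each side of the selected site
```

Because T and rho enter only through their product, older selection or higher recombination both shrink the window in the same way.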

References

Alachiotis, N., & Pavlidis, P. (2018). RAiSD detects positive selection based on multiple signatures of a selective sweep and SNP vectors. Communications Biology, 1(1). doi: 10.1038/s42003-018-0085-8

Aulchenko, Y. S., Ripke, S., Isaacs, A., & van Duijn, C. M. (2007). GenABEL: An R library for genome-wide association analysis. Bioinformatics, 23(10), 1294–1296. doi: 10.1093/bioinformatics/btm108

Beaumont, M. A., & Balding, D. J. (2004). Identifying adaptive genetic divergence among populations from genome scans. Molecular Ecology, 13(4), 969–980. doi: 10.1111/j.1365-294X.2004.02125.x

Bitarello, B. D., De Filippo, C., Teixeira, J. C., Schmidt, J. M., Kleinert, P., Meyer, D., & Andres, A. M. (2018). Signatures of long-term balancing selection in human genomes. Genome Biology and Evolution, 10(3), 939–955. doi: 10.1093/gbe/evy054

Bonhomme, M., Chevalet, C., Servin, B., Boitard, S., Abdallah, J. M., Blott, S., & San Cristobal, M. (2010). Detecting Selection in Population Trees: The Lewontin and Krakauer Test Extended. Genetics, 186, 241–262. doi: 10.1534/genetics.110.117275

Boyrie, L., Moreau, C., Frugier, F., Jacquet, C., & Bonhomme, M. (2020). A linkage disequilibrium-based statistical test for Genome-Wide Epistatic Selection Scans in structured populations. Heredity. doi: 10.1038/s41437-020-0349-1

Bradbury, P. J., Zhang, Z., Kroon, D. E., Casstevens, T. M., Ramdoss, Y., & Buckler, E. S. (2007). TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics (Oxford, England), 23(19), 2633–2635. doi: 10.1093/bioinformatics/btm308

Cadzow, M., Boocock, J., Nguyen, H. T., Wilcox, P., Merriman, T. R., & Black, M. A. (2014). A bioinformatics workflow for detecting signatures of selection in genomic data. Frontiers in Genetics, 5(AUG), 1–8. doi: 10.3389/fgene.2014.00293

Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., … Durbin, R. (2011). The variant call format and VCFtools. Bioinformatics, 27(15), 2156–2158. doi: 10.1093/bioinformatics/btr330

Degiorgio, M., Huber, C. D., Hubisz, M. J., Hellmann, I., & Nielsen, R. (2016). SweepFinder2: Increased sensitivity, robustness, and flexibility. Bioinformatics. doi: 10.111/mec.13351.RR

DeGiorgio, M., Lohmueller, K. E., & Nielsen, R. (2014). A model-based approach for identifying signatures of ancient balancing selection in genetic data. PLoS Genetics, 10(8), e1004561. doi: 10.1371/journal.pgen.1004561

Duforet-Frebourg, N., Luu, K., Laval, G., Bazin, E., & Blum, M. G. B. (2016). Detecting genomic signatures of natural selection with principal component analysis: Application to the 1000 genomes data. Molecular Biology and Evolution, 33(4), 1082–1093. doi: 10.1093/molbev/msv334

Foll, M., & Gaggiotti, O. (2008). A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics, 180(2), 977–993. doi: 10.1534/genetics.108.092221

Frichot, E., Schoville, S. D., Bouchard, G., & François, O. (2013). Testing for associations between loci and environmental gradients using latent factor mixed models. Molecular Biology and Evolution, 30(7), 1687–1699. doi: 10.1093/molbev/mst063

Garrigan, D. (2013). POPBAM: Tools for evolutionary analysis of short read sequence alignments. Evolutionary Bioinformatics, 2013(9), 343–353. doi: 10.4137/EBO.S12751

Garud, N. R., Messer, P. W., Buzbas, E. O., & Petrov, D. A. (2015). Recent Selective Sweeps in North American Drosophila melanogaster Show Signatures of Soft Sweeps. PLoS Genetics, 11(2), 1–32. doi: 10.1371/journal.pgen.1005004

Gautier, M. (2015). Genome-Wide Scan for Adaptive Divergence and Association with Population-Specific Covariates. Genetics, 201(September), 1555–1579. doi: 10.1534/genetics.115.181453

Gautier, M., & Vitalis, R. (2012). rehh: An R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics, 28(8), 1176–1177. doi: 10.1093/bioinformatics/bts115

Günther, T., & Coop, G. (2013). Robust identification of local adaptation from allele frequencies. Genetics, 195(1), 205–220. doi: 10.1534/genetics.113.152462

Hubisz, M. J., Williams, A. L., & Siepel, A. (2020). Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph. PLoS Genetics, 16(8), 1–24. doi: 10.1371/JOURNAL.PGEN.1008895

Jostins, L., & McVean, G. (2016). Trinculo: Bayesian and frequentist multinomial logistic regression for genome-wide association studies of multi-category phenotypes. Bioinformatics, 32(12), 1898–1900. doi: 10.1093/bioinformatics/btw075

Kemppainen, P., Knight, C. G., Sarma, D. K., Hlaing, T., Prakash, A., Maung Maung, Y. N., … Walton, C. (2015). Linkage disequilibrium network analysis (LDna) gives a global view of chromosomal inversions, local adaptation and geographic structure. Molecular Ecology Resources, (July), 1031–1045. doi: 10.1111/1755-0998.12369

Korneliussen, T. S., Albrechtsen, A., & Nielsen, R. (2014). ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinformatics, 15(1), 356. doi: 10.1186/s12859-014-0356-4

Librado, P., & Orlando, L. (2018). Detecting signatures of positive selection along defined branches of a population tree using LSD. Molecular Biology and Evolution, 35, 1520–1535. doi: 10.1093/molbev/msy053

Ortega-Del Vecchyo, D., Lohmueller, K. E., & Novembre, J. (2022). Haplotype-based inference of the distribution of fitness effects. Genetics. doi: 10.1093/genetics/iyac002

Pfeifer, B., Wittelsburger, U., Ramos-Onsins, S. E., & Lercher, M. J. (2014). PopGenome: An efficient swiss army knife for population genomic analyses in R. Molecular Biology and Evolution, 31(7), 1929–1936. doi: 10.1093/molbev/msu136

Rasmussen, M. D., Hubisz, M. J., Gronau, I., & Siepel, A. (2014). Genome-Wide Inference of Ancestral Recombination Graphs. PLoS Genetics, 10(5). doi: 10.1371/journal.pgen.1004342

Setter, D., Mousset, S., Cheng, X., Nielsen, R., DeGiorgio, M., & Hermisson, J. (2020). VolcanoFinder: Genomic scans for adaptive introgression. PLoS Genetics, 16(6), 1–44. doi: 10.1371/journal.pgen.1008867

Siewert, K. M., & Voight, B. F. (2017). Detecting Long-Term Balancing Selection Using Allele Frequency Correlation. Molecular Biology and Evolution, 34(11), 2996–3005. doi: 10.1093/molbev/msx209

Siewert, K. M., & Voight, B. F. (2020). BetaScan2: Standardized Statistics to Detect Balancing Selection Utilizing Substitution Data. Genome Biology and Evolution, 12(2), 3873–3877. doi: 10.1093/gbe/evaa013

Speidel, L., Forest, M., Shi, S., & Myers, S. R. (2019). A method for genome-wide genealogy estimation for thousands of samples. Nature Genetics, 51(9), 1321–1329. doi: 10.1038/s41588-019-0484-x

Stern, A. J., Wilton, P. R., & Nielsen, R. (2019). An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. In PLoS Genetics (Vol. 15). doi: 10.1371/journal.pgen.1008384

Stern, A., Speidel, L., Zaitlen, N., & Nielsen, R. (2020). Disentangling selection on genetically correlated polygenic traits using whole-genome genealogies. BioRxiv, 1–30. doi: 10.1101/2020.05.07.083402

Stucki, S., Orozco-Terwengel, P., Bruford, M. W., Colli, L., Masembe, C., Negrini, R., … Consortium, N. (2016). High performance computation of landscape genomic models integrating local indices of spatial association. Molecular Ecology Resources, 17(5), 1072–1089. doi: 10.1111/j.1540-8191.2009.00972.x

Svedberg, J., Shchur, V., Reinman, S., Nielsen, R., & Corbett-Detig, R. (2020). Inferring Adaptive Introgression Using Hidden Markov Models. BioRxiv. doi: 10.1101/2020.08.02.232934

Szpiech, Z. A., & Hernandez, R. D. (2014). selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol Biol Evol, 31(10), 2824–2827. doi: 10.1093/molbev/msu211

Tataru, P., & Bataillon, T. (2019). PolyDFEv2.0: Testing for invariance of the distribution of fitness effects within and across species. Bioinformatics, 35(16), 2868–2869. doi: 10.1093/bioinformatics/bty1060

Tataru, P., Nirody, J. A., & Song, Y. S. (2014). DiCal-IBD: Demography-aware inference of identity-by-descent tracts in unrelated individuals. Bioinformatics, 30(23), 3430–3431. doi: 10.1093/bioinformatics/btu563

Vitalis, R., Gautier, M., Dawson, K. J., & Beaumont, M. A. (2014). Detecting and measuring selection from gene frequency data. Genetics, 196(3), 799–817. doi: 10.1534/genetics.113.152991

Wang, J., & Zhang, Z. (2020). GAPIT Version 3: Boosting Power and Accuracy for Genomic Association and Prediction. BioRxiv.

Wang, M., Huang, X., Li, R., Xu, H., Jin, L., & He, Y. (2014). Detecting recent positive selection with high accuracy and reliability by conditional coalescent tree. Molecular Biology and Evolution, 31(11), 3068–3080. doi: 10.1093/molbev/msu244

Yang, Z. (2007). PAML 4: Phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution, 24(8), 1586–1591. doi: 10.1093/molbev/msm088

Zhou, X., & Stephens, M. (2012). Genome-wide efficient mixed model analysis for association studies. Nature Genetics, 44(7), 821–824. doi: 10.1038/ng.2310