Checking for population structure is an essential step when performing analyses on genome-level datasets. Neglecting it can bias demographic inferences (Chikhi et al., 2010; Heller et al., 2013) or the detection of loci under selection (e.g. Nielsen et al., 2007); thus, checking for outlier individuals and assessing the global structure is required prior to any more sophisticated analysis. Conversely, selection shapes correlations i) between alleles and environment at selected loci and ii) between alleles at different loci, whether or not these loci are directly under selection. This is reflected respectively by i) variation in polymorphism within and between populations and ii) linkage disequilibrium (LD) between loci. If selection is widespread in the genome, the study of population history can therefore be biased, which makes the joint study of selection and population structure necessary.
This table summarizes current methods to detect selection.
Software | Class of method | Purpose | Specifics | Issues and warnings | Link | Reference |
---|---|---|---|---|---|---|
ARGWeaver/ARGweaver-D | Ancestral Recombination Graphs/coalescence | Retracing the whole process of recombination and coalescence along a genome | Provides quantitative estimates for TMRCA and topologies at each locus. ARGWeaver-D can estimate introgression. Estimates effective population size. Provides tools to extract summary statistics for the topologies retrieved. Does not require phasing (but slower). | High computing cost. Slower on unphased or low depth data. ARGWeaver-D is not part of the Anaconda (Python) distribution (http://compgen.cshl.edu/ARGweaver/doc/argweaver-d-manual.html) | Can be installed via conda: conda install -c genomedk argweaver and https://github.com/mjhubisz/argweaver and http://compgen.cshl.edu/ARGweaver/doc/argweaver-d-manual.html | (Rasmussen et al., 2014; Hubisz et al., 2020) |
GAPIT3 | Association | Detecting association with environmental/phenotypical features | Includes most methods for GWAS, including procedures for fast computation, mixed linear models, efficient mixed model association, Bayesian methods such as BLINK, diagnostics such as QQ plots and genotype filtering. | May be slow for very large datasets | https://github.com/jiabowang/GAPIT3 | (Wang and Zhang, 2020) |
GEMMA | Association | Detecting association with environmental/phenotypical features | Computationally efficient for large scale datasets | Imports data from PLINK format | http://www.xzlab.org/software.html | (Zhou and Stephens, 2012) |
GENABEL | Association | Detecting association with environmental/phenotypic features | Modularity, facilitates correction for population structure/relatedness. | Imports data from PLINK format. No longer supported! | http://www.genabel.org/ | (Aulchenko et al., 2007) |
PLINK | Association | Detecting association with environmental/phenotypical features | Handles a variety of tests for population structure and relatedness | Population structure/kinship need to be assessed prior to association analysis | http://pngu.mgh.harvard.edu/~purcell/plink/ | (Purcell et al., 2007) |
Trinculo | Association | Detecting association with environmental/phenotypical features | Specifically designed to handle categorical variables with more than 2 categories. Performs multinomial logistic regression and provides frequentist and Bayesian frameworks. | Requires lapack library in Unix. Allows fine-mapping by testing for correlations between adjacent markers. | https://sourceforge.net/projects/trinculo/ | (Jostins and McVean, 2016) |
SAMBADA | Association/Environmental association | Detecting association with environmental/phenotypical features | Designed to be fast, underlying models have been kept simple. Allows conversion from PLINK format. Takes into account spatial autocorrelation of individual genotypes. Allows correction for population structure | Does not work with pooled data. Possibly high levels of false positives. Relatedness between samples should be assessed independently. Should be used in combination with LFMM or BayPass. | http://lasig.epfl.ch/sambada | (Stucki et al., 2016) |
Relate | Coalescence with recombination | Reconstruct genome-wide genealogies for hundreds of samples | Provides quantitative estimates for TMRCA and topologies at each locus. Infers past demography (similar to PSMC methods). Infers changes in mutation rates. Performs scans for positive selection over discrete time periods. | Requires an outgroup to polarize alleles as ancestral/derived. Requires a recombination map. Does not reconstruct ARG sensu stricto, and does not estimate uncertainty of the local genealogies | https://myersgroup.github.io/relate/index.html | (Speidel et al., 2019) |
diCal-IBD | Coalescent with recombination/IBD | Predicting IBD tracts from demographic models | High IBD sharing suggests recent positive selection. | Uses diCal output to obtain expectations based on demographic scenarios | https://sourceforge.net/projects/dical-ibd/ | (Tataru et al., 2014) |
VolcanoFinder | Composite likelihood test | Adaptive introgression | Detects a specific signature of increase then drop in diversity near a selected locus brought into a population through introgression | Private input format. Computationally intensive, needs to be run in parallel. | http://degiorgiogroup.fau.edu/vf.html | (Setter et al., 2020) |
SCCT | Conditional coalescent tree | Detecting positive selection | Designed for detecting recent positive selection. Claims to be more precise at identifying selected sites | The ancestral state of alleles must be obtained through an outgroup | https://github.com/wavefancy/scct | (Wang et al., 2014) |
LFMM | Environmental association | Detecting adaptation to environmental features | Corrects for population structure using latent factors, faster than BAYENV for large datasets | Only performs association with environment | http://membres-timc.imag.fr/Olivier.Francois/lfmm/software.htm | (Frichot et al., 2013) |
CLUES | Genealogies at selected loci | Estimate the time at which a beneficial allele rises in frequency | Previous version used ARGWeaver output, current version uses Relate. Provides scripts to plot the trajectory of selected alleles. | Assumes a panmictic population, neglects the effects of selection at linked sites. | https://github.com/35ajstern/clues | (Stern et al., 2019) |
PALM | Genealogies at selected loci | Estimate the strength and timing of selection on polygenic traits | Uses genealogies estimated from Relate and results from GWAS to estimate timing and strength of selection for polygenic traits. Should be robust to pleiotropy and residual structure in GWAS | May overestimate selection for older events. Only tested in humans. | https://github.com/35ajstern/palm | (Stern et al., 2020) |
startmrca | Genealogies at selected loci | Estimate the time at which a beneficial allele rises in frequency | Compares genealogies between carriers and non-carriers of an advantageous mutation, assuming a star-shaped genealogy at selected loci. Can handle VCF files | Requires a reference panel of non-carrier haplotypes. Sensitive to local diversity before the sweep, and to migration events during a sweep. Better suited to recent sweeps. | https://github.com/jhavsmith/startmrca | (Smith et al., 2018) |
Ancestry_HMM-S | Identity-by-state tracts | Adaptive introgression | Estimates the selective coefficient of the introgressed loci through a hidden-Markov chain approach. | Requires the time and extent of introgression to be defined by the user | https://github.com/jesvedberg/Ancestry_HMM-S/ | (Svedberg et al., 2020) |
H12 test | LD | Detecting selection using signatures of high LD | Does not require phased data. Designed for detecting soft sweeps | Coalescent simulations are recommended to evaluate the likelihood of selection | https://github.com/ngarud/SelectionHapStats/ | (Garud et al., 2015) |
LDna | LD | Detecting selection using signatures of high LD | Can be used to address population structure or detect large inversions or indel polymorphism through LD | The user needs to play with parameters to ensure robustness of SNPs significantly linked | https://github.com/petrikemppainen/LDna | (Kemppainen et al., 2015) |
rehh | LD | Detecting selection using signatures of high LD | Can compute both XP-EHH and Rsb. Handles several input formats | Requires phased data and high density of markers | https://cran.r-project.org/web/packages/rehh/index.html | (Gautier and Vitalis, 2012) |
Scan for epistatic interaction (based on LD) | LD | Polygenic selection/Epistatic interactions | Uses genome-wide LD between a candidate locus and the rest of the genomes to identify epistatic interactions. Can test SNP-SNP interaction, or between genomic windows (summarizes genotypes through PCA) | Lack of a detailed tutorial | https://github.com/leaboyrie/LD_corpc1 | (Boyrie et al., 2020) |
Selscan | LD | Detecting selection using signatures of high LD | Includes the nSL statistics dedicated to soft sweep detection | Does not include utilities to specify the ancestral state of alleles. Requires phased data and high density of markers | https://github.com/szpiech/selscan | (Szpiech and Hernandez, 2014) |
BALLET | Likelihood test for balancing selection | Detecting balancing selection | Designed for detecting ancient balancing selection. Does not require phasing | Requires whole-genome data and recombination map. The ancestral state of alleles must be obtained through an outgroup | http://www.personal.psu.edu/mxd60/ballet.html | (DeGiorgio et al., 2014) |
Betascan2 | Local associations of allele frequencies | Detecting balancing selection | Uses correlations in frequencies between genomically proximate SNPs to compute a score. Can incorporate information about ancestral/derived alleles and fixed derived variants, and normalizes the statistics depending on the number of sites in a given genomic window. Very detailed tutorial and utilities. | Requires estimating the length distribution of ancestral fragments on each side of the selected site. The 95th percentile can be estimated with the formula L=-log(0.05)/(T*rho), with T the time since selection in generations and rho the effective recombination rate per generation. | https://github.com/ksiewert/BetaScan | (Siewert and Voight, 2017, 2020) |
NCD statistics | Local associations of allele frequencies | Detecting balancing selection | Examines the observed and expected frequency spectra of polymorphisms in genomic windows to test for selection. Can incorporate fixed differences with an outgroup (NCD2), but not mandatory (NCD1) | Private input format, requires simulations to calibrate the statistics. Requires defining the expected equilibrium frequency of alleles (usually between 0.3 and 0.5). Low sensitivity below these frequencies. | https://github.com/bbitarello/NCD-Statistics | (Bitarello et al., 2018) |
Bayescan | Population differentiation | Detecting positive selection and local adaptation | Incorporates uncertainty on allele frequencies due to low sample sizes | Sensitive to priors on the ratio of selected/neutral sites. False positive rates can be high under scenarios of demographic expansion, admixture and isolation by distance | http://cmpg.unibe.ch/software/BayeScan/ | (Foll and Gaggiotti, 2008) |
FDIST2 | Population differentiation | Detecting positive selection and local adaptation | Allows controlling for hierarchical population structure | False positive rate is high when an island model cannot be assumed | http://datadryad.org/resource/doi:10.5061/dryad.v8d05 | (Beaumont and Balding, 2004) |
PCAdapt | Population differentiation | Detecting positive selection and local adaptation | Does not require populations to be defined. Handles admixed populations and pooled datasets | False positive rate can be high | http://membres-timc.imag.fr/Michael.Blum/PCAdapt.html | (Duforet-Frebourg et al., 2016) |
SelEstim | Population differentiation | Detecting positive selection and local adaptation | Can estimate the coefficients of selection. Calibration using a pseudo-observed dataset (can be used in combination with the R function simulate.baypass() in BayPass). | Assumes a Wright-Fisher island model. | http://www1.montpellier.inra.fr/CBGP/software/selestim/ | (Vitalis et al., 2014) |
Bayenv, BayPass | Population differentiation/Association | Detecting positive selection and adaptation to environmental features | Less sensitive to population demographic history than previous methods. Handle pooled datasets | Significance thresholds need to be determined from pseudo-observed datasets. Calibration with neutral SNPs is recommended. BayPass better estimates the kinship matrix | http://www1.montpellier.inra.fr/CBGP/software/baypass/ ; https://bitbucket.org/tguenther/bayenv2_public/src | (Günther and Coop, 2013; Gautier, 2015) |
FLK | Population differentiation/Association | Detecting positive selection and local adaptation | Less sensitive to population demographic history than previous methods | Requires an outgroup population | https://qgsp.jouy.inra.fr/index.php?option=com_content&view=article&id=50&Itemid=55 | (Bonhomme et al., 2010) |
LSD | Population differentiation/Population-branch test | Detecting positive selection and local adaptation | Compares the level of exclusively shared differences between internal and external branches of a population tree. Allows testing selection occurring on the ancestral branch leading to two populations. | Requires several populations to perform the test. May be less sensitive to selection on standing variation. | https://bitbucket.org/plibrado/LSD | (Librado and Orlando, 2018) |
POPBAM | Summary statistics | Detecting selection using AFS, differentiation | Extracts summary statistics directly from BAM files | Does not allow for sophisticated filtering and SNP calling | http://popbam.sourceforge.net/ | (Garrigan, 2013) |
VCFTOOLS | Summary statistics | Detecting selection using AFS, differentiation | Extracts summary statistics from VCF files. Also allows VCF filtering and conversion | Set of summary statistics not as extensive as PopGenome | http://vcftools.sourceforge.net/ | (Danecek et al., 2011) |
RAiSD | Summary statistics/Allele frequency spectrum + LD | Detecting positive selection and local adaptation | Scans the genome for composite signals of selective sweeps summarized by the μ statistics. Corrects for the effects of background selection by estimating a threshold value for the statistics based on simulations with background selection | Uses a single population of interest. | https://github.com/alachins/raisd | (Alachiotis and Pavlidis, 2018) |
TASSEL | Summary statistics/Association | Detecting association with phenotype | User friendly (Java interface), corrects for relatedness, allows computing summary statistics (LD, diversity) | Requires relatedness to be assessed externally (with e.g. STRUCTURE) | http://www.maizegenetics.net/tassel | (Bradbury et al., 2007) |
ANGSD | Summary statistics/Association/Population Branch test | Detecting selection using AFS, differentiation, association with functional traits | Allows for association using generalized linear models | Descriptive statistics. P-values need to be evaluated through coalescent simulations. | http://www.popgen.dk/angsd/index.php/ANGSD | (Korneliussen et al., 2014) |
SweeD | Summary statistics/Composite Likelihood test | Designed for whole genome data (or large continuous regions) | Supports Fasta and VCF formats. Estimates selection coefficients. | NA | http://pop-gen.eu/wordpress/software/sweed | (Degiorgio et al., 2016) |
selectionTools | Summary statistics/LD | Detecting selection using AFS, differentiation and LD statistics | Allows combining several tools in a single pipeline. Includes phasing tools. | Set of available summary statistics remains limited (same as VCFtools + Fay and Wu's H) | https://github.com/MerrimanLab/selectionTools | (Cadzow et al., 2014) |
PAML/CODEML | Summary statistics/phylogeny | Distribution of fitness effects/selection on coding variation | Estimates selection along branches in a phylogeny for genes of interest, contrasting patterns of synonymous and non-synonymous substitutions. A detailed tutorial is available here: https://link.springer.com/protocol/10.1007%2F978-1-4939-1438-8_4#Sec29 | Slow for large datasets. Needs to be parallelised. | http://abacus.gene.ucl.ac.uk/software/paml.html | (Yang, 2007) |
polyDFE2.0 | Summary statistics/phylogeny | Distribution of fitness effects/selection on coding variation | Can test for invariance of DFEs across datasets (genomic regions within species, or different species). No need for divergence estimates (does not assume that the same DFE is shared between species and outgroup). Very detailed tutorial available here: https://link.springer.com/protocol/10.1007/978-1-0716-0199-0_6 | Comparisons require a large number of SNPs for each dataset for comparisons to be meaningful | https://github.com/paula-tataru/polyDFE | (Tataru and Bataillon, 2019) |
POPGenome | Summary statistics/Population Branch test | Detecting selection using AFS, differentiation | Fast, embedded in R, allows using annotation files (GFF/GTF format). | Does not perform association, but can be used in combination with GENABEL within R | https://cran.r-project.org/web/packages/PopGenome/index.html | (Pfeifer et al., 2014) |
ETEToolkit | Summary statistics/phylogeny | Distribution of fitness effects/selection on coding variation | ETEToolkit can call CODEML from Python and can streamline phylogenetic analyses of selection. Estimates selection along branches in a phylogeny for genes of interest, contrasting patterns of synonymous and non-synonymous substitutions. | Slow for large datasets. Needs to be parallelised. | http://etetoolkit.org/ | (Huerta-Cepas et al., 2016) |
HaplotypeDFEStandingVariation (no official name for the pipeline) | Summary statistics/phylogeny | Distribution of fitness effects/selection on coding variation | Uses variation in the length of tracts of Identity-by-State to infer the distribution of fitness effect. | The pipeline relies on in-house simulators and ABC to obtain more robust estimates of the DFE. Comprehensive but may be difficult to deploy for a naive user. | https://github.com/dortegadelv/HaplotypeDFEStandingVariation | (Ortega-Del Vecchyo et al., 2022) |
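
The window-size formula quoted in the Betascan2 row, L=-log(0.05)/(T*rho), can be evaluated with a few lines of Python. This is only a minimal sketch, not part of BetaScan itself: the function name is hypothetical, `log` is assumed to be the natural logarithm, and rho is assumed to be expressed per base pair per generation so that L comes out in base pairs. The numeric inputs are illustrative placeholders, not recommended values.

```python
import math


def ancestral_fragment_length(t_generations, rho, tail=0.05):
    """Evaluate L = -log(tail) / (T * rho) from the Betascan2 row.

    t_generations: time since selection began, in generations (T).
    rho: effective recombination rate per generation
         (assumed here to be per base pair, so L is in base pairs).
    tail: tail probability; 0.05 gives the 95th percentile.
    Assumption: "log" in the table is the natural logarithm.
    """
    if t_generations <= 0 or rho <= 0:
        raise ValueError("t_generations and rho must be positive")
    if not 0 < tail < 1:
        raise ValueError("tail must lie strictly between 0 and 1")
    return -math.log(tail) / (t_generations * rho)


# Illustrative (hypothetical) inputs: 250,000 generations, rho = 1e-8 per bp.
length_bp = ancestral_fragment_length(t_generations=250_000, rho=1e-8)
print(f"Approximate fragment length: {length_bp:.0f} bp")
```

With these placeholder values the formula gives a length of roughly 1.2 kb; older selection or a higher recombination rate shortens L, which in turn suggests a narrower scanning window.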
References
Alachiotis, N., & Pavlidis, P. (2018). RAiSD detects positive selection based on multiple signatures of a selective sweep and SNP vectors. Communications Biology, 1(1). doi: 10.1038/s42003-018-0085-8
Aulchenko, Y. S., Ripke, S., Isaacs, A., & van Duijn, C. M. (2007). GenABEL: An R library for genome-wide association analysis. Bioinformatics, 23(10), 1294–1296. doi: 10.1093/bioinformatics/btm108
Beaumont, M. A., & Balding, D. J. (2004). Identifying adaptive genetic divergence among populations from genome scans. Molecular Ecology, 13(4), 969–980. doi: 10.1111/j.1365-294X.2004.02125.x
Bitarello, B. D., De Filippo, C., Teixeira, J. C., Schmidt, J. M., Kleinert, P., Meyer, D., & Andres, A. M. (2018). Signatures of long-term balancing selection in human genomes. Genome Biology and Evolution, 10(3), 939–955. doi: 10.1093/gbe/evy054
Bonhomme, M., Chevalet, C., Servin, B., Boitard, S., Abdallah, J. M., Blott, S., & San Cristobal, M. (2010). Detecting Selection in Population Trees: The Lewontin and Krakauer Test Extended. Genetics, (186), 241–262. doi: 10.1534/genetics.110.117275
Boyrie, L., Moreau, C., Frugier, F., Jacquet, C., & Bonhomme, M. (2020). A linkage disequilibrium-based statistical test for Genome-Wide Epistatic Selection Scans in structured populations. Heredity. doi: 10.1038/s41437-020-0349-1
Bradbury, P. J., Zhang, Z., Kroon, D. E., Casstevens, T. M., Ramdoss, Y., & Buckler, E. S. (2007). TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics (Oxford, England), 23(19), 2633–2635. doi: 10.1093/bioinformatics/btm308
Cadzow, M., Boocock, J., Nguyen, H. T., Wilcox, P., Merriman, T. R., & Black, M. A. (2014). A bioinformatics workflow for detecting signatures of selection in genomic data. Frontiers in Genetics, 5(AUG), 1–8. doi: 10.3389/fgene.2014.00293
Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., … Durbin, R. (2011). The variant call format and VCFtools. Bioinformatics, 27(15), 2156–2158. doi: 10.1093/bioinformatics/btr330
DeGiorgio, M., Huber, C. D., Hubisz, M. J., Hellmann, I., & Nielsen, R. (2016). SweepFinder2: Increased sensitivity, robustness, and flexibility. Bioinformatics. doi: 10.111/mec.13351.RR
DeGiorgio, M., Lohmueller, K. E., & Nielsen, R. (2014). A model-based approach for identifying signatures of ancient balancing selection in genetic data. PLoS Genetics, 10(8), e1004561. doi: 10.1371/journal.pgen.1004561
Duforet-Frebourg, N., Luu, K., Laval, G., Bazin, E., & Blum, M. G. B. (2016). Detecting genomic signatures of natural selection with principal component analysis: Application to the 1000 genomes data. Molecular Biology and Evolution, 33(4), 1082–1093. doi: 10.1093/molbev/msv334
Foll, M., & Gaggiotti, O. (2008). A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics, 180(2), 977–993. doi: 10.1534/genetics.108.092221
Frichot, E., Schoville, S. D., Bouchard, G., & François, O. (2013). Testing for associations between loci and environmental gradients using latent factor mixed models. Molecular Biology and Evolution, 30(7), 1687–1699. doi: 10.1093/molbev/mst063
Garrigan, D. (2013). POPBAM: Tools for evolutionary analysis of short read sequence alignments. Evolutionary Bioinformatics, 2013(9), 343–353. doi: 10.4137/EBO.S12751
Garud, N. R., Messer, P. W., Buzbas, E. O., & Petrov, D. A. (2015). Recent Selective Sweeps in North American Drosophila melanogaster Show Signatures of Soft Sweeps. PLoS Genetics, 11(2), 1–32. doi: 10.1371/journal.pgen.1005004
Gautier, M. (2015). Genome-Wide Scan for Adaptive Divergence and Association with Population-Specific Covariates. Genetics, 201(September), 1555–1579. doi: 10.1534/genetics.115.181453
Gautier, M., & Vitalis, R. (2012). rehh: An R package to detect footprints of selection in genome-wide SNP data from haplotype structure. Bioinformatics, 28(8), 1176–1177. doi: 10.1093/bioinformatics/bts115
Günther, T., & Coop, G. (2013). Robust identification of local adaptation from allele frequencies. Genetics, 195(1), 205–220. doi: 10.1534/genetics.113.152462
Hubisz, M. J., Williams, A. L., & Siepel, A. (2020). Mapping gene flow between ancient hominins through demography-aware inference of the ancestral recombination graph. PLoS Genetics, 16(8), 1–24. doi: 10.1371/JOURNAL.PGEN.1008895
Jostins, L., & McVean, G. (2016). Trinculo: Bayesian and frequentist multinomial logistic regression for genome-wide association studies of multi-category phenotypes. Bioinformatics, 32(12), 1898–1900. doi: 10.1093/bioinformatics/btw075
Kemppainen, P., Knight, C. G., Sarma, D. K., Hlaing, T., Prakash, A., Maung Maung, Y. N., … Walton, C. (2015). Linkage disequilibrium network analysis (LDna) gives a global view of chromosomal inversions, local adaptation and geographic structure. Molecular Ecology Resources, (July), 1031–1045. doi: 10.1111/1755-0998.12369
Korneliussen, T. S., Albrechtsen, A., & Nielsen, R. (2014). ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinformatics, 15(1), 356. doi: 10.1186/s12859-014-0356-4
Librado, P., & Orlando, L. (2018). Detecting signatures of positive selection along defined branches of a population tree using LSD. Molecular Biology and Evolution, 35, 1520–1535. doi: 10.1093/molbev/msy053
Ortega-Del Vecchyo, D., Lohmueller, K. E., & Novembre, J. (2022). Haplotype-based inference of the distribution of fitness effects. Genetics. doi: 10.1093/genetics/iyac002
Pfeifer, B., Wittelsburger, U., Ramos-Onsins, S. E., & Lercher, M. J. (2014). PopGenome: An efficient swiss army knife for population genomic analyses in R. Molecular Biology and Evolution, 31(7), 1929–1936. doi: 10.1093/molbev/msu136
Rasmussen, M. D., Hubisz, M. J., Gronau, I., & Siepel, A. (2014). Genome-Wide Inference of Ancestral Recombination Graphs. PLoS Genetics, 10(5). doi: 10.1371/journal.pgen.1004342
Setter, D., Mousset, S., Cheng, X., Nielsen, R., DeGiorgio, M., & Hermisson, J. (2020). VolcanoFinder: Genomic scans for adaptive introgression. PLoS Genetics, 16(6), 1–44. doi: 10.1371/journal.pgen.1008867
Siewert, K. M., & Voight, B. F. (2017). Detecting Long-Term Balancing Selection Using Allele Frequency Correlation. Molecular Biology and Evolution, 34(11), 2996–3005. doi: 10.1093/molbev/msx209
Siewert, K. M., & Voight, B. F. (2020). BetaScan2: Standardized Statistics to Detect Balancing Selection Utilizing Substitution Data. Genome Biology and Evolution, 12(2), 3873–3877. doi: 10.1093/gbe/evaa013
Speidel, L., Forest, M., Shi, S., & Myers, S. R. (2019). A method for genome-wide genealogy estimation for thousands of samples. Nature Genetics, 51(9), 1321–1329. doi: 10.1038/s41588-019-0484-x
Stern, A. J., Wilton, P. R., & Nielsen, R. (2019). An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. In PLoS Genetics (Vol. 15). doi: 10.1371/journal.pgen.1008384
Stern, A., Speidel, L., Zaitlen, N., & Nielsen, R. (2020). Disentangling selection on genetically correlated polygenic traits using whole-genome genealogies. BioRxiv, 1–30. doi: 10.1101/2020.05.07.083402
Stucki, S., Orozco-Terwengel, P., Bruford, M. W., Colli, L., Masembe, C., Negrini, R., … Consortium, N. (2016). High performance computation of landscape genomic models integrating local indices of spatial association. Molecular Ecology Resources, 17(5), 1072–1089. doi: 10.1111/j.1540-8191.2009.00972.x
Svedberg, J., Shchur, V., Reinman, S., Nielsen, R., & Corbett-Detig, R. (2020). Inferring Adaptive Introgression Using Hidden Markov Models. BioRxiv. doi: 10.1101/2020.08.02.232934
Szpiech, Z. A., & Hernandez, R. D. (2014). selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol Biol Evol, 31(10), 2824–2827. doi: 10.1093/molbev/msu211
Tataru, P., & Bataillon, T. (2019). PolyDFEv2.0: Testing for invariance of the distribution of fitness effects within and across species. Bioinformatics, 35(16), 2868–2869. doi: 10.1093/bioinformatics/bty1060
Tataru, P., Nirody, J. A., & Song, Y. S. (2014). DiCal-IBD: Demography-aware inference of identity-by-descent tracts in unrelated individuals. Bioinformatics, 30(23), 3430–3431. doi: 10.1093/bioinformatics/btu563
Vitalis, R., Gautier, M., Dawson, K. J., & Beaumont, M. A. (2014). Detecting and measuring selection from gene frequency data. Genetics, 196(3), 799–817. doi: 10.1534/genetics.113.152991
Wang, J., & Zhang, Z. (2020). GAPIT Version 3: Boosting Power and Accuracy for Genomic Association and Prediction. BioRxiv.
Wang, M., Huang, X., Li, R., Xu, H., Jin, L., & He, Y. (2014). Detecting recent positive selection with high accuracy and reliability by conditional coalescent tree. Molecular Biology and Evolution, 31(11), 3068–3080. doi: 10.1093/molbev/msu244
Yang, Z. (2007). PAML 4: Phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution, 24(8), 1586–1591. doi: 10.1093/molbev/msm088
Zhou, X., & Stephens, M. (2012). Genome-wide efficient mixed model analysis for association studies. Nature Genetics, 44(7), 821–824. doi: 10.1038/ng.2310