CHEN Ke-Ke, LUO A-Rong, WANG Yun-Liang, SI Cheng-Cai,HUANG Dun-Yuan, SU Cheng-Yong, HAO Jia-Sheng,*, ZHU Chao-Dong, *
(1.College of Life Sciences, Anhui Normal University, Wuhu, Anhui 241000, China; 2.Key Laboratory ofZoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China;3.College of Life Sciences, Chongqing Normal University, Chongqing 401331, China)
Abstract: 【Aim】This study aims to explore the genetic diversity, genetic differentiation, and phylogenetic relationships among populations of the pierid butterfly Colias fieldii in China, to infer their origin and divergence time, and to preliminarily clarify the causes of their spatiotemporally evolutionary history.【Methods】 The four mitochondrial gene(COI, Cytb, NDI and ND5)sequences of 115 individuals from 23 geographic populations of C.fieldii in China collected in 2006-2018 were amplified by PCR and sequenced.The genetic diversity and genetic differentiation were analyzed using MEGA v.7.0, DnaSP v.5.0, Arlequin v.3.5.1 and other genetic analysis software.Using the closest relatives as the outgroups, the phylogenetic trees and haplotype median-joining network of C.fieldii were reconstructed with such analytical software as IQ-TREE, MrBayes v.3.1.2, Network v.4.6 and BEAST v.1.8.3, and the origin and divergence time of C.fieldii were estimated by using relaxed molecular dating method and calibrations of the previous studies.Based on the present biogeographic distribution of C.fieldii and the main earth environmental events since the Quaternary Period, the spatio-temporal pattern of its biogeographic distribution and the underlying earth environmental factors were tentatively inferred.【Results】 The aligned sizes of mitochondrial gene segments of COI, Cytb, NDI and ND5 of C.fieldii populations are 648, 699, 393 and 777 bp, respectively, and the concatenated sequence of the four genes is 2 517 bp in length, which was shown to be significantly AT biased.In total, 18 haplotypes based on the four mitochondrial gene sequences were found in 115 individuals of 23 geographic populations of C.fieldii, with the haplotype diversity(Hd)of 0.677±0.048 and nucleotide diversity(π)of 0.00066±0.00007 of the total population, showing a relatively high level of haplotype diversity and a low level of nucleotide diversity.The phylogenetic analysis showed that 18 haplotypes of C.fieldii populations were distinctly categorized into two large clades(clade I and II).Clade I included 13 haplotypes of populations from Shaanxi, Henan, Gansu, Anhui, Hubei, Sichuan, Qinghai, and some regions of Yunnan.Clade II included five haplotypes of populations from some regions of Yunnan and Tibet.The results of reconstructed haplotype median-joining network were generally consistent with those of the phylogenetic tree.AMOVA analysis indicated that a larger level of population differentiation(64.36%)occurred between the two haplotype clades and a subtle genetic differentiation(35.64%)existed within each haplotype clade of C.fieldii.The analysis results of neutrality tests and mismatch distribution indicated that populations with haplotypes in clade I did not experience a population expansion event, whereas those with haplotypes in clade II probably had a sudden demographic expansion at about 0.085 Ma in the late Pleistocene, a little earlier than the Last Glacial Maximum event, which may be caused by the warm and humid plateau climate in the interglacial period, the expansion of forest grassland and the decreased heavy rainfall on the core plateau.【Conclusion】 The genetic differentiation of C.fieldii populations is correlated significantly with the geographical distance.In addition, we proposed that C.fieldii populations originated at about 0.48 Ma in southwestern areas of China(presently the Hengduan Mountains and adjacent areas), and began to diversify into two clades and later dispersed into other low-latitude areas due to the Quaternary glacial-interglacial cycle events, Southeast Asia monsoon and different habitat environments.
Key words: Colias fieldii; genetic differentiation; haplotype; mitochondrial gene; molecular dating; phylogeography; China
Coliasfieldiiis one pierid butterfly species commonly found and distributed in Central Asia, inhabiting at areas of grassy and wooded steppes(Laiho and Ståhls, 2013; Ding and Zhang, 2017).As one of the important pollinator insects, it plays a critical role in plant reproduction(Yangetal., 2019).Previous morphological and ecological studies have provided much useful evidence for taxonomy and phylogeny of this species and related taxa(Mengetal., 2013).In recent decades, more researchers attempted to investigate the relevant studies by using molecular data criteria.For example, Wheat and Watt(2008)reconstructed the molecular phylogenetic relationships of North AmericanColiastaxa using mitochondrial genes, and their results suggested that molecular divergences among theColiasspecies were relatively small.Schovilleetal.(2011)studied the evolutionary history ofColiasbehrii, showing thatC.behriihad a very low level of genetic diversity at mitochondrial and nuclear loci.However, this study did not explore the impacts of Quaternary glacial period on distribution and genetic structure of theColiasspecies in the Qinghai-Tibet Plateau(QTP).
The Qinghai-Tibet Plateau(QTP)in China, recognized as the world’s third pole, is the highest and largest plateau in the world with an average elevation of about 4 000 m and an area of about 2.5 million km2.Because of its high elevation and specific climate, this plateau dramatically affects terrestrial fauna and ecosystems in western China and the neighboring areas.Specifically, the rapid uplift of the QTP between 1.1 and 0.6 Ma gave rise to the average altitude and initiated widespread mountain glaciers during the ice ages(Zheng, 1996; Zhouetal., 2006; Favreetal., 2015).The topographic variation of the QTP, including the forming of the Hengduan Mountains, coupled with cyclical climatic changes, namely the alternating glacial-interglacial periods in the Pleistocene exerted enormous influences on the current spatial distribution and genetic structure of many resident species(Caoetal., 2012; Lietal., 2012).Previous studies have demonstrated that the geological events as well as accompanying environmental changes might have caused rapidly evolutionary radiation of many insect groups(Grattonetal., 2008).C.fieldiiis also found in the QTP, and its genetic differentiation is closely related to the environmental background of the QTP.
Due to lack of recombination, maternal inheritance and other merits compared with nuclear genes, mitochondrial DNA(mtDNA)genes have been widely used to study molecular evolution, phylogenetics, and population genetics in animals for several decades(Galtieretal., 2009).The cytochrome oxidase subunit I(COI)gene is relatively conservative and widely used in inferring inter-and intra-species phylogenetic relationships, and the cytochrome b(Cytb)gene has a moderate rate of molecular evolution, being suitable to clarify the phylogenetic relationships between families, genera, species and even populations of a single species.In addition, the mitochondrial genes dehydrogenase subunit 5 gene(ND5)and dehydrogenase subunit I gene(NDI)which harbor a relatively fast evolutionary rates, are suitable for resolving relatively lower level relationships(Cameron, 2014; Heetal., 2016; Suetal., 2017).In recent decades, combined mitochondrial dataset was often utilized to investigate the genetic differentiations and phylogenetic relationships among insects at the inter-and intra-species level(Taoetal., 2020).
In this study, we firstly determined the sequences of four mitochondrial genes(COI,Cytb,NDIandND5)of total 115 individuals of 23 populations ofC.fieldiifrom the Qinghai-Tibet Plateau and other areas in China.Based on these datasets of four mitochondrial gene sequences, we analyzed their genetic differentiation among populations, assessed their phylogeographic structure using a variety of genetic diversity analysis methods, and analyzed their demographic history using the mismatch distribution and neutrality test analysis methods, reconstructed their phylogenetic trees, and estimated the divergence time of their main lineages with relaxed molecular dating methods.In general, we aim to clarify their divergence timescales and the related geological and environmental events in the evolutionary history ofC.fieldii.
We collected 115 adult individuals ofC.fieldiifrom 23 sampling sites in central and southwestern China.Sample size and geographical coordinates for each population were shown in Table 1.After species identification, the fresh thorax muscle tissues were immediately preserved in 100% ethanol for DNA fixation and stored at-20℃ for further genomic DNA isolation.Total genomic DNA was extracted from thorax muscle using a DNA extraction kit(Sangon Biotech, Shanghai)according to the manufacturer’s instructions.
Four mitochondrial gene(COI,Cytb,NDI, andND5)fragments were amplified by using standard short primers(Simonetal., 1994; Yagietal., 1999).All primers were synthesized by Sangon Biotech(Shanghai)shown in Table 2.PCR was performed in a PCR reaction(50 μL)of genomic DNA(20 ng/mL)1.5 μL, 10×Buffer 6.0 μL, Mg2+(25 mmol/L)8.0 μL, dNTPs(0.2 mmol/L)1.5 μL, Taq DNA polymerase 0.6 μL, 1.8 μL of each of the primers(1.0 ng/mL), and ddH2O 28.8 μL, under the following conditions: an initial denaturation at 95℃ for 5 min; 35 cycles of denaturation at 94℃ for 50 s, annealing at 46-52℃(depending on the primer pairs)for 50 s, and extension at 72℃ for 120 s; and a final extension at 72℃ for 10 min.The PCR products were sequenced on an ABI-377 automatic DNA sequences(Sangon Biotech, Shanghai).
Table 2 PCR primers used in this study
UsingGonepteryxrhamniandEuremahecabeas the outgroups, phylogenetic trees were reconstructed with the Bayesian inference(BI)and maximum likelihood(ML)methods based on different mitochondrial datasets.The jModelTest v.2.0(Darribaetal., 2012)was used to select the GTR+I+G model as the best-fit substitution model of molecular evolution.BI analysis using the Markov chain Monte Carlo(MCMC)method was performed by MrBayes v.3.1.2(Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003).ML analysis was carried out in IQ-TREE v.1.6.8 under the GTR+I+G model(Nguyenetal., 2014; Zhangetal., 2018).Finally, a ML tree in NEWICK format was visualized by FigTree v.1.4.3(Rambaut, 2009).In addition, median-joining network of haplotypes was constructed with Network v.4.6(Bandeltetal., 1999).The phylogeographic structure was tested by calculating two indices of genetic differentiationGSTandNSTusing DnaSP v.5.0(Pons and Petit, 1996; Librado and Rozas, 2009).The genetic differentiation coefficient(FST)and gene flow(Nm)were calculated using Arlequin v.3.5.1(Excoffier and Lischer, 2010).Analysis of molecular variance(AMOVA)was performed to distinguish variations within and among populations, with significance tested via 1 000 nonparametric permutations.
To infer population demographic history ofC.fieldii, Tajima’sD(Tajima, 1989)and Fu’sFS(Fu, 1997)were calculated using Arlequin v.3.5.1.In addition, mismatch distribution was used to identify historical processes between recent demographic expansion and population equilibrium(Rogers and Harpending, 1992; Cristianoetal., 2016).The validity of the two models fitting to our data was assessed by the sum of squared deviation(SSD)and Harpending’s raggedness index(HRI)(Harpending, 1994).The time of population expansion(t, time in generations)was then estimated using the formulaτ=2ut(Youetal., 2010), whereτwas the mode of mismatch distribution anduwas the mutation rate per generation for the entire sequences under this study(the value ofuwas calculated using the equationu=μk, whereμwas the mutation rate per nucleotide per generation andkwas the number of nucleotides assayed).
Based on the 18 haplotypes, usingArtogeiamelete,Pierisrapae,AporiaintercostataandAporiahippiaas the outgroups,C.fieldiiof this study and their related taxa ofC.montium,C.erate,C.croceus,Catopsiliapomona,Gonepteryxmahaguru,Gonepteryxrhamni, andEuremahecabewere used to reconstruct the species tree and estimate the divergence time simultaneously using the BEAST v.1.8.3(Drummond and Rambaut, 2007).Uncorrelated relaxed clocks were used to estimate the branch lengths, with the tree prior of the birth-death process, and the nucleotide substitution model was set to the GTR+G model.The oldest Pieridae fossilsStolopsychelibytheoideaandColiatesproserpinawere selected for fossil calibrations, and thus the minimum age of the genusPierisand the genusAporiawere constrained to be 37.2-33.9 Ma and 33.5-30 Ma, respectively(Scudder, 1875, 1889; Braby and Trueman, 2006).In addition, according to insect-host plant coevolutionary scenario and the molecular dating of the Brassicales’s radiation, the earliest divergence time of Coliadinae and Pierinae was set to be 90-85 Ma.The convergence of the MCMC chains was evaluated by ESS values above 200 in Tracer v.1.7.1(Rambautetal., 2018).The maximum-clade-credibility tree was finally visualized and edited in FigTree v.1.4.3(Carneiroetal., 2018).
The mitochondrial gene sequences ofCOI,Cytb,NDI, andND5 from all individuals ofC.fieldiiwere aligned via MAFFT v.7.1(Katohetal., 2017)and concatenated by DAMBE v.6.0(Xia, 2017).The nucleotide composition of the mitochondrial gene sequences was analyzed by MEGA v.7.0(Kumaretal., 2016).The haplotype diversity(Hd), haplotype number(h), and nucleotide diversity(π)were calculated using DnaSP v.5.0(Librado and Rozas, 2009).
Four mitochondrial genes including partialCOI(648 bp),Cytb(699 bp),NDI(393 bp)andND5(777 bp)were amplified and obtained, and the concatenated sequence of the four genes is 2 517 bp in length.The obviously AT-biased dataset with no indels or stop codons totally contained 18 haplotypes(A1-A18)(Table 3), among which six were unique haplotypes, each found only in one population, and the remaining twelve haplotypes were shared among two or more different populations.The most abundant haplotype A15, shared by 64 individuals, was distributed in 18 populations(Table 1).The calculated haplotype diversity(Hd)and nucleotide diversity(π)of the whole population were 0.677±0.048 and 0.00066±0.00007, respectively, with the haplotype diversity and nucleotide diversity of each geographic population shown in Table 1.
Table 3 Characteristics of haplotypes based on the concatenated sequence of mitochondrial genes COI, Cytb,NDI and ND5 of Colias fieldii populations in China
Our phylogenetic analyses based on the Dataset 1 showed that BI and ML trees harbored the same topologies.Both trees exhibited thatC.fieldiiof this study was made up of two major clades(I and II)with relatively strong supports(PP=1.0,BS=100)(Fig.1: A), which were compatible with the haplotype median-joining network analysis(Fig.1: B).Clade I contained 13 haplotypes from Shaanxi, Henan, Gansu, Anhui, Hubei, Sichuan, Qinghai, and some regions of Yunnan.Clade II consisted of five haplotypes from some regions of Yunnan and Tibet.Among 18 haplotypes, A7 and A2 of the population DL were detected in clades I and II, respectively.Clade I could be further divided into two subclades, each of which was not geographically specific, while clade II only contained one subclade(Fig.1: A).
Fig.1 Phylogenetic tree(A)of all the haplotypes of the concatenated sequence of mitochondrial genes COI, Cytb, NDI and ND5 of Colias fieldii populations in China inferred using both maximum likelihood(ML)and Bayesian inference(BI)methods and haplotype median-joining network(B)Bootstrap values(BS)lower than 50 and posterior probabilities(PP)lower than 0.5 were not shown.Haplotypes are represented as pie-diagrams with the slice-size proportional to the number of individuals as detailed in Table 1.A1-A18: Haplotypes 1-18.
The comparison between the fixed indicesNSTandGSTrevealed an obvious correlation between phylogeny and geography.TheNSTvalue(0.63323)was much larger than theGSTvalue(0.18432)(P<0.01), indicating the existence of a significant phylogeographic structure.The reconstructed haplotype median-joining network showed that the haplotype A7 was probably the ancestral haplotype of clade I, which subsequently diverged into two subclades(A3, A4, A7, A10 and A13; A5, A6, A8, A11, A9, A12, A15 and A1, respectively), with A3, A7 and A15 being the shared haplotypes by some populations.The haplotype A16 should be considered as the ancestral haplotype of clade II, with A14 and A18 shared each by two populations.Of all the populations in this study, populations from Yunnan had the highest number of haplotypes, with A15 being the most widely distributed one.
The results of analysis of AMOVA showed that the genetic variation among and within populations were 64.36% and 35.64%, respectively, indicating a larger population differentiation between the two clades ofC.fieldiiand a subtle differentiation within each clade(Table 4).Meanwhile, the genetic differentiation coefficient(FST)of the total population was 0.608, and the gene flow(Nm)of the total population was 0.160, indicating high population genetic differentiation(FST>0.250)and relatively limited gene flow(Nm<1.000).TheFSTandNmvalues between the two clades were 0.632 and 0.140, respectively.In addition, the calculated haplotype diversity(Hd)and nucleotide diversity(π)of clade I were 0.511±0.064 and 0.00041±0.00008, respectively, and those of clade II were 0.723±0.062 and 0.00046±0.00008, respectively.
Table 4 Analysis of molecular variance(AMOVA)of different Colias fieldii populations in China based on the concatenated sequence of mitochondrial genes COI, Cytb, NDI and ND5
Demographic analysis based on the concatenated sequence of mitochondrial genesCOI,Cytb,NDIandND5 showed that the two clades I and II ofC.fieldiipopulations have undergone different demographic histories.The un-unimodal mismatch distribution and non-significantly negative Tajima’sDvalue indicated that no obvious population expansion events happened in populations with haplotypes in clade I(Table 5).Demographic expansion of populations with haplotypes in clade II was confirmed by the significantly negative Fu’sFsvalue(Fs=-28.963,P=0), although Tajima’sDvalue(D=0.160,P>0.10)was non-significantly positive.The mismatch distribution was unimodal(Fig.2)and both SSD and HRI index tests failed to reject the hypothesis of a sudden expansion model(Ramos-Onsins and Rozas, 2002; Excoffier, 2004; Lietal., 2016).
Fig.2 Pairwise mismatch distributions of Colias fieldii populations in China with haplotypes in clades I(A)and II(B)based on the concatenated sequence of mitochondrial genes COI, Cytb, NDI and ND5
Our molecular dating results indicated that the first divergence between the subfamilies Coliadinae and Pierinae originated at about 86.22 Ma[95% confidence interval(CI)=83.33-89.27 Ma]in the Late Cretaceous(Santonian), and the two subfamilies began to diverge at about 64.75 Ma(95% CI=52.15-77.73 Ma)and 66.84 Ma(95% CI=55.29-77.87 Ma)during the Late Cretaceous and the early Paleocene, respectively.ForC.fieldii, the divergence time of the two evolutionary clades(clades I and II)occurred at about 0.48 Ma(95% CI=0.24-0.74 Ma)during the middle Pleistocene(Fig.3).Based on the value ofτand substitution rate of 0.73×10-9per site per year for four mitochondrial genes, the demographic expansion time of clade II was estimated at approximately 0.085 Ma during the Late Pleistocene.
Nucleotide and haplotype diversities can provide information about an organism’s population structure and history.Our results revealed thatC.fieldiihad a relatively high level of haplotype diversity(Hd>0.500)and a low level of nucleotide diversity(π<0.005)(Grant and Bowen, 1998).In addition, the values of haplotype and nucleotide diversities of clade I were lower than those of clade II, and the fact might be attributed to the genetic drift or inbreeding depression among clade I populations or individuals.Furthermore, the genetic differentiation between the two clades I and II was mainly associated with a wide range of ecological types, as well as the complex mountain barriers(niche variations)(Slatkin and Maddison, 1989; Wright, 1990).
The haplotype analysis showed that among all the geographic populations ofC.fieldii, the Yunnan population possessed the most haplotypes, including the unique and the most broadly distributed ones, such as haplotypes A7 and A15.The fact suggests that the southwestern Yunnan area probably be the diversification centre ofC.fieldiiin China, which is in agreement with the common view that the Hengduan Mountains is one of the most significant biodiversity hotspots in the world, with high levels of species richness and endemism(Yangetal., 2009; Cheetal., 2010; Songetal., 2010; Liaoetal., 2016).However, more genetic differentiation analyses are needed to better understand the way in which palaeogeographical and palaeoclimatic events shaped the distribution and phylogeographical patterns ofC.fieldiipopulations.The small sample size of this study is also one of the reasons for this result, so we need to examine more populations in the future.
Our phylochronological results(Fig.3)generally agreed with those of Brabyetal.(2005), Braby and Trueman(2006)and Caoetal.(2016), who determined the divergence between Coliadinae and Pierinae dated at about 86.34 Ma(95% CI=83.59-88.83 Ma)in the Late Cretaceous, but significantly differed from the results obtained by Dongetal.(2019), who provided dates of the same event at about 71.5 Ma(95%CI=93.6-49.4 Ma)(Late Cretaceous).We found that the major source of differences in dating the tree came from calibration points and their distribution in the tree.In this analysis, we adopted, in our calibration system(Fig.3), high-quality fossil dates ofPierisandAporiaand relevant dates from the host-plants(Brassicales)based on the insect-host plant coevolutionary scenario, which were likely better proxies approaching the true reference time frame.Therefore, the divergence time in this study might be earlier than the divergence time estimated without the restriction of host plant fossils.Although the time obtained in this study was slightly different from the time proposed by other scholars, there was still a large range overlap for the time estimates of 95% confidence interval.
Our results revealed that the two clades ofC.fieldiidiverged at about 0.48 Ma during the longest interglacial time after the Quaternary Wangkun glacial period(0.50-0.72 Ma, according to marine isotope stages, MIS 16-20)(Cuietal., 1998)and the Southeast Asia moonson happened about 0.57-0.48 Ma.This time of increasing warmness and wetness accompanied by mountain forest expansion might have driven some populations to disperse from the margin into the core area of QTP, which colonized at new suitable habitats.In addition, some populations later dispersed into other low-latitude areas due to the Quaternary glacial-interglacial cycle events and Southeast Asia monsoon.All these factors eventually caused the complete diversification of the two clades distributed at high-and low-altitudes.That is, the low-altitude populations should experience less isolation, while high-altitude populations were more isolated in general(Luetal., 2012).Additionally, the genetic differentiation ofC.fieldiiin this study caused by geographical isolation could be further reinforced and accumulated due to ecological divergence over time, and finally the two clades inhabited differently and evolved independently into two genetically diversified clades(Yanetal., 2010; Zhanetal., 2011).
Besides, an obvious phylogeographic structure ofC.fieldiidetected in this study is somewhat attributed to their relatively lower flight ability compared with other animals that harbor a stronger ability of movement(Grattonetal., 2008; Wheat and Watt, 2008).In these cases, such a low dispersal capacity tends to cause the gene flows to decrease continuously, thus promoting the genetic divergences among populations.
Previous studies have shown that compared with the drastic climate oscillations and glacial cycles in Europe and North America(Hewitt, 2004; Schmitt, 2007), a relatively mild Pleistocene climate and lack of ice sheets were presented in many areas of East Asia(Juetal., 2007).Population expansion of many species in Europe and North America usually occurred after LGM(Last Glacial Maximum), while the population expansions of many widespread species occurred before LGM in East Asia(Huangetal., 2010).Our study showed that the two clades ofC.fieldiimight have taken different pathways in their evolutionary histories: clade II experienced an obvious population expansion event at about 0.085 Ma in the late Pleistocene interglacial period before the LGM, while clade I did not show any signs of population expansion.Thus, the demographic expansion of clade II during this period was consistent with previous reports concerning some birds and mammals in East Asia, such as the Chinese bamboo partridgeBambusicolathoracica(Huangetal., 2010), Chinese hwameiLeucodioptroncanorum(Lietal., 2009), and tufted deerElaphoduscephalophus(Sunetal., 2016).
The high haplotype diversity and low nucleotide diversity also indicated that the clade II ofC.fieldiimight have undergone a rapid population expansion and accumulation of variation after the bottleneck effect.Accordingly, the discordance in evolutionary histories of the two clades ofC.fieldiicould be accounted for the differentiation of habitat distributions.During the extensive Pleistocene glaciation period, most regions of Qinghai and Tibet might have been heavily covered with ice, but the plain areas remained relatively ice-free, under which condition milder climate might have mitigated demographic stresses for the populations of plain areas relative to the extremes experienced by the QTP populations(Zhangetal., 2000; Quetal., 2010).Additionally, these different evolutionary histories were probably caused by the different environmental events they experienced, that is, during late Pleistocene interglacial period(MIS5), the reinforced winter monsoon and decreased heavy rainfall on the core plateau synergistically promoted the population expansion of clade II.Similar cases were also reported in birds, plants, and other insects(Wangetal., 2010; Leietal., 2014; Yeetal., 2016).