Structural Variation Analysis of Mutated Nannochloropsis oceanica Caused by Zeocin Through Genome Re-Sequencing

2018-08-28 09:07:36 LIN Genmei, ZHANG Zhongyi, GUO Li, DING Haiyan, and YANG Guanpin
Journal of Ocean University of China, 2018, Issue 5

LIN Genmei, ZHANG Zhongyi, GUO Li, DING Haiyan, and YANG Guanpin

1) Laboratory of Marine Genetics and Breeding, Ocean University of China, Qingdao 266003, China

2) College of Marine Life Sciences, Ocean University of China, Qingdao 266003, China

3) Institutes of Evolution and Marine Biodiversity, Ocean University of China, Qingdao 266003, China

(Received August 15, 2017; revised March 20, 2018; accepted April 25, 2018)

© Ocean University of China, Science Press and Springer-Verlag GmbH Germany 2018

Abstract Zeocin can cause double strand breaks of DNA and thus may be employed as a mutagen. In this study, two strains of Nannochloropsis oceanica, the wild and the Zeocin-tolerant strains, were re-sequenced to verify this function of Zeocin. The results showed that Zeocin can mutate the N. oceanica genome and cause structural variation. Zeocin either swept away or selected the alleles of genes functioning in ubiquitin-mediated proteolysis, alpha-linolenic acid metabolism, ascorbate and aldarate metabolism, ribosome biogenesis, and circadian rhythm, indicating that N. oceanica may have adjusted its protein, carbohydrate, and lipid metabolism, and changed its ribosome biosynthesis and living rhythm, to survive in Zeocin-containing medium. In addition, Zeocin-caused mutation may have influenced the expression of a set of transcription factors. It was concluded that Zeocin effectively caused structural variation of the N. oceanica genome and forced the microalga to select out the alleles of a set of genes around these variations in order to adapt to Zeocin-containing medium. Further studies on the genetic basis of the phenotypic adaptation of this haploid and asexual microalga, and on the application of Zeocin to its genetic improvement, are very important.

Key words Nannochloropsis oceanica; Zeocin; mutation; genome re-sequencing; structural variation

1 Introduction

Zeocin is a member of the bleomycin/phleomycin family isolated from Streptomyces verticillus. It has long served as a selection marker in microalgal genetic transformation (Jin et al., 2001; Kilian et al., 2011; Radakovits et al., 2012; Stevens et al., 1996; Zaslavskaia et al., 2000). Zeocin also causes double strand breaks (DSBs) of DNA (Berdy, 1980) and is thus a potential mutagen. However, this mutagenic function of Zeocin has not yet been proved. Genome re-sequencing is a method for uncovering the sequence and structural variations of a genome by comparing the new sequence with either the reference or that of a control. A Zeocin-tolerant strain of Nannochloropsis oceanica was isolated previously by mutating the alga in Zeocin-containing medium. By comparing the transcriptomes of the wild and the Zeocin-tolerant strains, the mutagenic function of Zeocin on N. oceanica has been verified (Lin et al., 2017). In this study, we re-sequenced the wild type and Zeocin-tolerant strains of N. oceanica and directly observed the type and position of the mutations caused by Zeocin, thus verifying its mutagenic function at the DNA level. Our findings will help to understand the mutation mechanism of Zeocin, which can support its application in elite strain breeding of microalgae.

2 Materials and Methods

2.1 Strains

Two strains of N. oceanica, the wild type (WS) and the Zeocin-tolerant (ZS), were used in this study. WS was provided by the Key Laboratory of Mariculture of Chinese Ministry of Education, Ocean University of China, while ZS was mutated from WS by gradually increasing Zeocin in f/2 medium to a maximum of 10.0 µg mL⁻¹ (Lin et al., 2017). Before sequencing, both strains were purified through two rounds of single colony culture and then cultured in f/2 medium without Zeocin to obtain an appropriate amount of biomass for DNA isolation.

2.2 DNA Extraction and Sequencing

Genomic DNA was extracted using the OMEGA HP Plant DNA Kit. The concentration and purity of DNA were checked using a NanoDrop 2000c Spectrophotometer (Thermo Fisher Scientific, USA). The quality of DNA was checked through electrophoresis in 1% agarose gel.

To build a shotgun library, high-quality genomic DNA was fragmented randomly by ultrasonication, and fragments between 150 bp and 800 bp in length were retrieved by electrophoresis. After the ends were polished with T4 DNA polymerase, Klenow DNA polymerase and T4 polynucleotide kinase, a single A base was added to the 3' ends of the fragments, which were then ligated to adaptors carrying a single T overhang. The ligation product was purified through electrophoresis and amplified through PCR to serve as the sequencing library. The library was sequenced on the Illumina HiSeq 4000 platform.

Soapfilter (v2.2) was used for sequence quality control. Raw reads containing adaptors, reads with more than 5% undetermined bases (N), and low-quality reads (> 50% of bases with Qphred ≤ 10) were discarded. The clean reads were mapped to the genome of N. oceanica IMET1, which was used as the reference (https://www.ncbi.nlm.nih.gov/genome/13215?genome_assembly_id=292543; http://www.bioenergychina.org/fg/d.wang_genomes/; 31.5 Mb in assembly size, 28.0 Mb in effective size and 47.8% in GC content), with BWA (v0.7.12-r1039).
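The three read filters described above can be sketched as a single predicate. This is a minimal illustration of the stated thresholds only, not Soapfilter's actual implementation; the function name `passes_qc`, its parameters, and the use of exact substring matching for adaptor detection are assumptions made for the example.

```python
def passes_qc(seq, quals, adaptors=(), max_n_frac=0.05,
              low_q=10, max_low_q_frac=0.50):
    """Return True if a read survives the three filters stated in the text.

    seq      -- read bases as a string
    quals    -- Phred quality scores, one int per base
    adaptors -- adaptor sequences to screen for (hypothetical input;
                exact substring matching is a simplification)
    """
    if not seq or len(seq) != len(quals):
        return False
    upper = seq.upper()
    # Filter 1: discard reads that contain an adaptor sequence.
    if any(a and a.upper() in upper for a in adaptors):
        return False
    # Filter 2: discard reads with more than 5% undetermined bases (N).
    if upper.count("N") / len(upper) > max_n_frac:
        return False
    # Filter 3: discard reads in which more than 50% of bases have Q <= 10.
    low = sum(1 for q in quals if q <= low_q)
    if low / len(quals) > max_low_q_frac:
        return False
    return True
```

A read with exactly 50% low-quality bases is kept under this reading of the thresholds, because the text specifies "> 50%".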

Three mutation types were recorded: single nucleotide polymorphism (SNP), insertion/deletion (InDel) and structural variation (SV). InDel refers specifically to insertions and deletions of 1-50 bp, while SV mainly included deletion (DEL), inversion (INV), inter-chromosomal translocation (InterCT) and intra-chromosomal translocation (IntraCT). Some SV may have resulted in copy number variation. SAMtools (v1.3) (Li et al., 2009; Li, 2011) and GATK (McKenna et al., 2010) were used for SNP and InDel calling, while Breakdancer (v1.31) was used for SV calling.
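The variant classes defined above can be expressed as a small classifier. The sketch below is illustrative only: it assumes that the SV caller reports the type codes DEL, INV, ITX and CTX (the codes commonly emitted by BreakDancer), and the helper names `classify_small_variant` and `classify_sv` are hypothetical, not part of any tool used in the study.

```python
# Assumed SV caller type codes mapped to the SV classes named in the text.
SV_CLASSES = {
    "DEL": "deletion (DEL)",
    "INV": "inversion (INV)",
    "ITX": "intra-chromosomal translocation (IntraCT)",
    "CTX": "inter-chromosomal translocation (InterCT)",
}

INDEL_MAX_BP = 50  # InDel is defined in the text as 1-50 bp


def classify_small_variant(ref, alt):
    """Classify a ref/alt allele pair as 'SNP', 'InDel' or 'other'."""
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"
    size = abs(len(ref) - len(alt))
    if 1 <= size <= INDEL_MAX_BP:
        return "InDel"
    # Identical alleles, or length changes beyond 50 bp, fall outside
    # the SNP/InDel definitions used here.
    return "other"


def classify_sv(code):
    """Map a caller type code to the SV class used in the text."""
    return SV_CLASSES.get(code, "unclassified")
```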

2.3 Annotation

Genes around each SV were picked out and annotated against the Nr (NCBI non-redundant protein sequences), GO (gene ontology, http://www.geneontology.org/) and KEGG (Kyoto encyclopedia of genes and genomes, http://www.genome.jp/kegg/) databases. The genes annotated in this study and those consistently identified as differentially expressed genes previously (Lin et al., 2017) were used to characterize the mutated N. oceanica.
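Picking out the genes around an SV amounts to an interval-overlap query. The text does not state how far from an SV a gene may lie, so the sketch below uses a hypothetical `flank` parameter (5000 bp by default, chosen arbitrarily), hypothetical gene identifiers, and a single-interval representation of each SV; translocations with two breakpoints would need one query per breakpoint.

```python
def genes_near_sv(sv, genes, flank=5000):
    """Return IDs of genes overlapping an SV extended by `flank` bp per side.

    sv    -- (chrom, start, end)
    genes -- iterable of (gene_id, chrom, start, end)
    flank -- assumed window size in bp; not taken from the paper
    """
    chrom, sv_start, sv_end = sv
    lo = max(0, sv_start - flank)
    hi = sv_end + flank
    return [
        gene_id
        for gene_id, g_chrom, g_start, g_end in genes
        # Two closed intervals overlap when each starts before the other ends.
        if g_chrom == chrom and g_start <= hi and g_end >= lo
    ]


# Usage with made-up coordinates:
example_genes = [
    ("g1", "chr1", 1000, 2000),
    ("g2", "chr1", 9000, 9500),
    ("g3", "chr2", 1000, 2000),
    ("g4", "chr1", 20000, 21000),
]
nearby = genes_near_sv(("chr1", 3000, 4000), example_genes)
```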

3 Results

3.1 Re-Sequencing Profiles of Genomes of ZS and WS

Re-sequencing generated 23.22 and 22.08 million raw reads for ZS and WS, from which 18.82 and 18.59 million clean reads (150 bp in length) were obtained, respectively. In total, 94.61% and 94.21% of the clean reads of ZS and WS, respectively, were mapped to the non-alternative reference sequences. Of the 42524 SNP and 9185 InDel identified between the reference and ZS and WS, only 162 SNP (0.38%) and 336 InDel (3.66%) existed between ZS and WS. In contrast, 1559 SV were found between the reference and ZS and WS, of which 1284 (82.36%) existed between ZS and WS (Table 1). Repairing double strand breaks of DNA may yield SNP and InDel; however, such SNP and InDel may also exist between our strains and the reference genome. The majority of SNP and InDel should originate from the pronounced genomic difference between our strains and the reference, which implies that very rich variation exists among N. oceanica strains, including the strain from which the reference genome was derived. We therefore prefer to believe that Zeocin mutated the N. oceanica genome mainly by causing structural variation. The SNP and InDel between ZS and WS may be associated with Zeocin tolerance; however, we did not investigate this relationship further in this study because of their scarcity and the uncertainty of their relationship with Zeocin.
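The proportions quoted above follow directly from the reported counts. The short check below recomputes them; the variable names are chosen for this example, and only the counts themselves come from the text.

```python
# (variants differing between ZS and WS, variants found against the reference)
counts = {
    "SNP": (162, 42524),
    "InDel": (336, 9185),
    "SV": (1284, 1559),
}


def percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to 2 decimals."""
    return round(100.0 * part / total, 2)


shares = {kind: percent(part, total) for kind, (part, total) in counts.items()}

for kind, share in shares.items():
    print(f"{kind}: {share}% of reference-level variants differ between ZS and WS")
```

The contrast between well under 5% for SNP and InDel and more than 80% for SV is what supports attributing the ZS/WS differences mainly to structural variation.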

Table 1 The type and number of structural variation identified in ZS and WS

The SV shared by ZS and WS should have existed before the mutation and be associated with Zeocin tolerance. The genes differing between ZS and WS either existed before the mutation and were unfavorable for survival in Zeocin-containing medium (WS-specific), or were mutated by Zeocin and were essential for tolerating Zeocin (ZS-specific). In total, 2153 genes around 1040 SV (66.71% of the total) functionally matched annotated genes of the reference genome. Of these genes, 930 around either ZS- or WS-specific SV (hereafter ZS-specific or WS-specific genes) were assigned to 22 GO terms of 3 major functional groups: biological process, cellular component, and molecular function (Fig.1). Additionally, 755 genes were included in 114 KEGG pathways of 5 major branches: cellular processes, environmental information processing, genetic information processing, metabolism, and organismal systems (Fig.2). In addition, 261 genes around SV shared by ZS and WS were assigned to 119 GO terms, and 213 genes around SV shared by ZS and WS were assigned to 66 KEGG pathways. It was clear that the Zeocin-mutated and Zeocin tolerance-associated genes are involved in a wide range of cellular functions and metabolic pathways.
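The split into shared, ZS-specific and WS-specific SV is a set comparison of the two call sets. The sketch below assumes each SV call is reduced to a hashable key such as (type, chromosome, position) and that identical calls match exactly; real breakpoints are imprecise, so a production comparison would need a position tolerance. The function name `partition_svs` and the example keys are hypothetical.

```python
def partition_svs(zs_calls, ws_calls):
    """Split SV calls into shared, ZS-specific and WS-specific sets.

    Each call is a hashable key, e.g. (sv_type, chrom, position).
    Exact key matching is a simplifying assumption.
    """
    zs, ws = set(zs_calls), set(ws_calls)
    return {
        "shared": zs & ws,        # present in both strains
        "ZS-specific": zs - ws,   # only in the Zeocin-tolerant strain
        "WS-specific": ws - zs,   # only in the wild strain
    }


# Usage with made-up calls:
zs_example = [("DEL", "chr1", 100), ("INV", "chr2", 500), ("CTX", "chr3", 42)]
ws_example = [("DEL", "chr1", 100), ("DEL", "chr4", 900)]
groups = partition_svs(zs_example, ws_example)
```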

3.2 Enrichments Analysis of ZS-Specific and WS-Specific Genes

Fig.1 GO terms into which the genes around ZS-specific and WS-specific SV were assigned.

Fig.2 KEGG pathways into which the genes around ZS-specific and WS-specific SV were assigned.

It was interesting to notice that the 22 GO terms represented by the 930 ZS-specific and WS-specific genes were all significantly enriched (P ≤ 0.05). Of them, 9 (0006812, 0006811, 0022857, 0016820, 0005215, 0015075, 0022891, 0015399 and 0015405) are related to transporter activity and transmembrane movement, 4 (0016887, 0042626, 0043492, and 0015662) to ATPase activity, 3 (0030554, 0032559, and 0005524) to ATP binding, 3 (0006511, 0019941, and 0043632) to protein catabolic processes, 1 (0031461) to cullin-RING ubiquitin ligase complex, 1 (0006281) to DNA repair, and 1 (1902589) to single-organism organelle organization (Table 2). The enrichment revealed that Zeocin tolerance is associated with a wide range of cellular functions and metabolic processes.
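The text does not name the statistical test behind the P values. A common choice for GO and KEGG enrichment is the one-sided hypergeometric test, sketched below purely as an assumption about how such a P value could be obtained; the function name `enrichment_p` and all counts in the usage line are hypothetical and are not data from this study.

```python
from math import comb


def enrichment_p(k, n, K, N):
    """One-sided hypergeometric P value, P(X >= k).

    N -- annotated genes in the background
    K -- background genes annotated with the term
    n -- genes in the study set (e.g. ZS- and WS-specific genes)
    k -- study-set genes annotated with the term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of drawing k, k+1, ..., upper term genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Usage with made-up counts: 12 of 930 study genes fall in a term that
# covers 40 of 9000 background genes.
p_value = enrichment_p(12, 930, 40, 9000)
is_enriched = p_value <= 0.05
```

When many terms are tested at once, a multiple-testing correction is normally applied to such P values; whether one was used here is not stated.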

Table 2 GO terms significantly enriched by ZS-specific and WS-specific genes

Among the 114 KEGG pathways represented by the 755 ZS-specific and WS-specific genes, 5 were significantly enriched (P ≤ 0.05, Table 3), with which the physiological mechanism underlying Zeocin tolerance of N. oceanica could be characterized. Zeocin either swept away (WS-specific) or selected (ZS-specific) the alleles of genes functioning in ubiquitin-mediated proteolysis, alpha-linolenic acid metabolism, ascorbate and aldarate metabolism, ribosome biogenesis, and circadian rhythm (Table 3). These findings indicated that N. oceanica may have adjusted its protein, carbohydrate, and lipid metabolism, and changed its ribosome biosynthesis and living rhythm, to survive in Zeocin-containing medium.

Table 3 KEGG pathways significantly enriched by ZS-specific and WS-specific genes

Besides the 5 basal transcription factors enriched in KEGG pathways, a few more transcription factors were found, which may play important roles in DNA replication and gene transcription. These included the winged helix-turn-helix transcription repressor, a member of a protein family that binds DNA through interactions between the wings and the groove (Gajiwala and Burley, 2000), and the transcription factor IIS (TFIIS), which regulates the rate of specific transcription elongation by RNA polymerase II and thereby affects the efficiency of transcription (Reinberg and Roeder, 1987; Bengal et al., 1991). TFIIS is able to affect UV-inhibited transcription, localize the excision repair complex and remove the transcription-blocking lesion caused by UV exposure (Jensen and Mullenders, 2010). Cys2His2 zinc finger proteins belong to a large family of DNA-, RNA- and protein-binding proteins and are key regulators of stress-responsive gene expression (Görner et al., 1998). Ap2 is an ethylene-responsive transcription factor that participates in metabolism and growth and responds to environmental stimuli (Licausi et al., 2013). The identification of these genes encoding transcription-regulating factors gives us the opportunity to determine their relationships with the Zeocin tolerance of microalgae.

4 Discussion

Through genome re-sequencing of a Zeocin-mutated strain and a wild strain of N. oceanica, we found that Zeocin caused mainly structural variation of the N. oceanica genome, as expected. This microalga could tolerate high Zeocin concentrations by sweeping off and selecting out the alleles of genes encoding proteins functioning in protein, carbohydrate, and lipid metabolism, as well as ribosome biosynthesis and life rhythm. With these physiological changes, either alone or in combination, many variants were obtained (Guo and Yang, 2015; Liang et al., 2017; Lin et al., 2017). Detailed studies are needed to determine the genetic basis of the phenotypic changes. Moreover, the DEG between ZS and WS were found to be enriched mainly in KEGG pathways involved in growth reduction (Lin et al., 2017), which is in accordance with what we found in this study. Our findings will also help studies concerning these basic biological processes, and a set of mutants among the Zeocin-tolerant cells will be isolated. N. oceanica is characterized by haploidy and asexual reproduction (Pan et al., 2011). It will be an appropriate material for adaptive and evolutionary research (Dragosits and Mattanovich, 2013).

A large number and a wide range of SNP and InDel were verified, indicating that many mutations exist among N. oceanica cells, including the sequenced ones. Zeocin causes mainly double strand breaks of DNA, and it may further cause InDel during the repair of DNA breaks. As expected, Zeocin functioned as a mutagen. It caused an extremely large proportion of SV that are specific to ZS. It also provided the selection pressure for Zeocin tolerance: a set of alleles of the genes around WS-specific SV were swept away.

The variations (SNP, InDel and SV) may either have been caused by Zeocin or have existed between our strains and the reference before the re-sequencing. In the genome of N. oceanica (Pan et al., 2011), more than 3000 microsatellites are found, far more than in the genome of Phaeodactylum tricornutum (about 30 in total) (Bowler et al., 2008). Such rich variation implies that the genome of N. oceanica is highly flexible. DNA repair and cellular response to DNA damage stimulus were found to be significantly enriched cellular processes during mutation, which may underlie the ease of structural modification of the N. oceanica genome.

Acknowledgements

This study was funded by the National Natural Science Foundation of China (No. 31270408) and the National High Technology Research and Development Program (863 Program) of China (No. 2014AA022001).