Identification of overlapping differentially expressed genes in hepatocellular carcinoma, breast cancer, and depression by bioinformatics analysis

2022-10-13 13:53 Zhan-Fang Xie, Gang-Gang Li
Precision Medicine Research, 2022, Issue 3

Zhan-Fang Xie, Gang-Gang Li,2*

1Department of Pharmacy, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou 450052, China. 2Marshall Medical Research Center, The Fifth Affiliated Hospital of Zhengzhou University, Zhengzhou 450052, China.

Abstract

Keywords: differentially expressed genes; hepatocellular carcinoma; breast cancer; depression; bioinformatics

Background

Patients with hepatocellular carcinoma (HCC) or breast cancer (BC) are more likely to experience depression than the general population. Studies show that 10%-25% of cancer patients are depressed [1]. However, to date, no systematic attempt has been made to delineate the potential relationships between HCC, BC and depression with respect to differentially expressed genes (DEGs).

To better explain the potential genetic connections between HCC, BC and depression, we used DNA microarray data to look for overlapping DEGs among these conditions. Gene expression analysis methods are increasingly considered promising tools for clinical application in medical oncology [2, 3]. DNA microarray analysis can be used for diagnostic purposes and to identify similarities between multiple conditions through comparative analysis, especially when integrated with Gene Ontology (GO) enrichment, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and protein-protein interaction (PPI) network analyses [4].

Here, we evaluated the raw datasets from the Gene Expression Omnibus (GEO) database, namely GSE19665 (HCC), GSE65194 (BC) and GSE12654 (depression). Each dataset was analyzed with GEO2R, an online tool for biomarker analysis in GEO datasets [5]. PPI network, KEGG pathway and GO enrichment analyses were then applied to the GEO series microarray data of patients with HCC, BC or depression. The flow chart of our experimental program is presented in Figure 1.

Materials and Methods

Microarray data

It is assumed that gene regulatory networks for different diseases may share commonalities. At the outset of this study, we screened gene expression profiles from the GEO database. Expression microarray datasets GSE19665 (10 HCC samples, 10 control samples), GSE65194 (14 BC samples, 11 control samples), and GSE12654 (11 depression samples, 15 control samples) were included.

Identification of DEGs

The expression profiles were analyzed in GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/); only genes with |log2(fold change (FC))| > 1.0 and adjusted P < 0.05 were considered DEGs. We then listed the DEGs of HCC, BC and depression and screened for overlapping DEGs in the following combinations: HCC and depression; BC and depression; HCC and BC; HCC, BC and depression.
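The selection and overlap steps above can be illustrated with a minimal Python sketch. It is not the authors' workflow (the study used GEO2R); the gene names, fold changes and P values are invented placeholders, and `select_degs` is a hypothetical helper that applies only the two thresholds stated in the text.

```python
# Hypothetical per-gene results, shaped like rows exported from GEO2R:
# (gene symbol, log2 fold change, adjusted P value). All values are invented.
hcc_rows = [("GENE_A", 2.3, 0.001), ("GENE_B", -1.4, 0.020),
            ("GENE_C", 0.4, 0.001), ("GENE_D", 1.8, 0.030)]
bc_rows = [("GENE_A", 1.9, 0.004), ("GENE_B", -2.2, 0.010),
           ("GENE_E", 3.0, 0.200)]
dep_rows = [("GENE_A", -1.2, 0.040), ("GENE_D", 1.1, 0.010),
            ("GENE_B", -0.8, 0.001)]


def select_degs(rows, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return genes with |log2FC| > lfc_cutoff and adjusted P < p_cutoff."""
    return {gene for gene, log2_fc, adj_p in rows
            if abs(log2_fc) > lfc_cutoff and adj_p < p_cutoff}


hcc = select_degs(hcc_rows)
bc = select_degs(bc_rows)
dep = select_degs(dep_rows)

# The four combinations screened in the text, as plain set intersections.
overlaps = {
    "HCC and depression": hcc & dep,
    "BC and depression": bc & dep,
    "HCC and BC": hcc & bc,
    "HCC, BC and depression": hcc & bc & dep,
}

for name, genes in overlaps.items():
    print(name, sorted(genes))
```

Note that a pairwise intersection such as `hcc & bc` also contains the genes shared by all three conditions; subtracting the triple overlap gives the genes shared by that pair only.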

Volcano plot and heat map analyses

In Microsoft Office Excel 2017, the following formula was used to calculate the z-score for each data point in the population: z = (raw intensity - mean intensity) / standard deviation [6]. In the current research, we converted the raw microarray data of HCC, BC and depression in this way to obtain corrected expression intensities. Each gene was then represented by a single point on the volcano plot. Afterward, ShengXinRen software (https://shengxin.ren) was used to draw the heat maps of the most strongly dysregulated genes in HCC, BC and depression.
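As a worked illustration of the z-score, read as the deviation from the mean divided by the standard deviation, the following Python sketch standardizes a short list of intensities. The numbers are invented, and the population standard deviation (`statistics.pstdev`) is an assumption based on the text's reference to "the population"; the spreadsheet function the authors used is not stated.

```python
from statistics import mean, pstdev


def z_scores(intensities):
    """Standardize values as (value - mean) / population standard deviation."""
    mu = mean(intensities)
    sigma = pstdev(intensities)
    if sigma == 0:
        raise ValueError("all intensities are identical; z-score is undefined")
    return [(value - mu) / sigma for value in intensities]


# Invented raw intensities for one probe across eight samples.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(z_scores(raw))
```

For this toy input the mean is 5 and the population standard deviation is 2, so the first value, 2.0, maps to a z-score of -1.5.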

GO and KEGG analyses of DEGs

Figure 1 Flowchart of the analysis of overlapping DEGs in hepatocellular carcinoma, breast cancer and depression. FC, fold change; DEGs, differentially expressed genes; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; PPI, protein-protein interaction.

To cluster the biological functions of the main hubs, we assessed the key pathways with the Database for Annotation, Visualization and Integrated Discovery (DAVID; v.6.8: https://david.abcc.ncifcrf.gov/home.jsp) [7]. The OmicShare database (https://www.omicshare.com/) was used to visualize the KEGG and GO enrichment results. Enriched GO terms for biological processes, molecular functions and cellular components were identified with P < 0.05. KEGG provides a large-scale collection of data on biological systems, together with interpretations and definitions of molecular-level functions [8]. Consequently, we used KEGG pathway analysis through the OmicShare database for the functional annotation of the DEGs shared by HCC and BC.
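The idea behind an enrichment P value can be sketched with a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of DEGs in a term by chance. This is a generic illustration under assumed inputs; the statistics reported by DAVID and OmicShare may be computed differently, and `enrichment_p_value` and its parameters are hypothetical names.

```python
from math import comb


def enrichment_p_value(n_background, n_in_term, n_selected, n_overlap):
    """One-sided hypergeometric tail P(X >= n_overlap).

    n_background: genes in the background set
    n_in_term:    background genes annotated to the term
    n_selected:   genes in the DEG list
    n_overlap:    DEGs annotated to the term
    """
    upper = min(n_in_term, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Invented toy numbers: 20 background genes, 5 in the term, 4 DEGs, 2 of them in the term.
p_value = enrichment_p_value(20, 5, 4, 2)
print(f"P = {p_value:.4f}")
```

In practice a P value of this kind is computed for every term, so a multiple-testing correction is usually applied before terms are called enriched.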

PPI network analysis

PPI data were obtained from the STRING database (https://string-db.org/), which contains most of the known information on human PPIs [9]. We set the organism to “Homo sapiens” to obtain the target interaction network map, which was saved as a TSV file. The interactions in the network were visualized with Cytoscape (v.3.2.1: https://www.cytoscape.org/) [10]. To ensure a high degree of confidence, the minimum required interaction score was set to 0.9. Further, nodes not connected to the main network were excluded in order to reduce the false detection rate.
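The two filtering steps described above, a minimum interaction score and the removal of nodes outside the main network, can be sketched in Python as follows. The edge list is invented, scores are assumed to be on a 0-1 scale, and `high_confidence_main_network` is a hypothetical helper; the authors performed these steps in STRING and Cytoscape.

```python
from collections import defaultdict, deque


def high_confidence_main_network(edges, min_score=0.9):
    """Keep edges with score >= min_score, then keep only the largest
    connected component. Returns a dict mapping node -> set of neighbours."""
    adjacency = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)

    largest = set()
    visited = set()
    for start in list(adjacency):
        if start in visited:
            continue
        # Breadth-first search collects one connected component.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    queue.append(neighbour)
        visited |= component
        if len(component) > len(largest):
            largest = component

    return {node: set(adjacency[node]) for node in largest}


# Invented interactions: (protein, protein, combined score on a 0-1 scale).
edges = [("A", "B", 0.99), ("A", "C", 0.95), ("B", "D", 0.92),
         ("E", "F", 0.93), ("C", "G", 0.40)]
network = high_confidence_main_network(edges)
print(sorted(network))
```

Here the C-G edge falls below the cutoff and the E-F pair forms a separate small component, so only A, B, C and D remain.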

Statistical analyses

Univariate Cox regression analysis was applied, and P < 0.05 was considered statistically significant.

Results

Identification of DEGs

RNA-binding protein with multiple splicing; GATA binding protein 6; histone cluster 1, HIST1H2AM; WAS/WASL interacting protein family member 1; angiotensin II receptor type 1 (AGTR1); ETS variant 6; TNF receptor superfamily member 17; aurora kinase A; integrin subunit alpha 6; neurofibromin 1 (NF1) and fibrinogen like protein 2 were identified as overlapping DEGs related to HCC, BC and depression (Figure 2A).

Other than these 11 genes, ankyrin repeat and SOCS box containing 4; KIF25 antisense RNA 1; thyroid stimulating hormone receptor; corticotropin-releasing hormone binding protein (CRHBP); IL2-inducible T-cell kinase; FYN binding protein; G protein-coupled receptor 171 and natural cytotoxicity triggering receptor 1 were listed as the overlapping DEGs between HCC and depression (Figure 2A). Transcription factor AP-2 beta; glutaredoxin 3; zinc finger RNA-binding protein; lysine acetyltransferase 2B; Cbl proto-oncogene; phosphodiesterase 5A; elongator acetyltransferase complex subunit 4; iron-responsive element-binding protein 2; ADAM metallopeptidase domain 12; natural killer cell triggering receptor; MYC induced nuclear antigen; putative homeodomain transcription factor 1; activating transcription factor 3; mucin 1, cell surface associated; and dickkopf WNT signaling pathway inhibitor 1 were overlapping DEGs between BC and depression (Figure 2A). Of note, 1,942 DEGs were shared by HCC and BC; 1,931 of these were shared by HCC and BC only, and 11 were also shared with depression.

Figure 2 Venn diagram and volcano plots. (A) Venn diagram based on overlapping DEGs of HCC, BC, depression and controls. (B) Volcano plot representing the DEGs of HCC (GSE19665), satisfying the criteria of |log2(FC)| > 1.0 and P < 0.05; (C) Volcano plot representing the DEGs of BC (GSE65194), satisfying the criteria of |log2(FC)| > 1.0 and P < 0.05; (D) Volcano plot representing the DEGs of depression (GSE12654), satisfying the criteria of |log2(FC)| > 1.0 and P < 0.05. Genes with significant expression are indicated by red and green dots. DEGs, differentially expressed genes; HCC, hepatocellular carcinoma; BC, breast cancer; FC, fold change.

Gene expression profiling

The volcano plot of the genes remaining after filtering showed that, of the 31,051 genes compared between the 10 normal samples and 10 HCC samples, 1,091 were dysregulated according to |log2(FC)| > 1.0 and P < 0.05 (Figure 2B). Of the 25,186 genes compared between the 11 normal samples and 14 BC samples, 8,560 were dysregulated (Figure 2C). Of the 12,625 genes compared between the 15 normal samples and 11 depression samples, 71 were dysregulated (Figure 2D). The majority (n = 904) of the DEGs found in HCC were down-regulated, and 187 gene transcripts were up-regulated in the liver. The heat map of the 26 most up-regulated and down-regulated DEGs in HCC is shown in Figure 3A. In BC, 5,278 up-regulated and 3,282 down-regulated genes were found; the heat map of DEGs in BC is shown in Figure 3B. In depression, 37 up-regulated and 34 down-regulated genes were found; the heat map of DEGs in depression is shown in Figure 3C.

GO term enrichment analysis

The GO enrichment results showed that the DEGs were significantly enriched in biological processes, including metabolic processes, biological regulation, and cellular processes (Figure 4). For cellular component analysis, organelle, membrane and protein-containing complexes were significantly enriched (Figure 4). In addition, molecular function analysis revealed that the DEGs were enriched in binding, catalytic activity and molecular function processes (Figure 4). The GO term enrichment results all met the inclusion criteria of false discovery rate < 0.01 and P < 0.01.

Figure 3 Heat map (top 26 up-regulated and down-regulated genes). Higher values represent up-regulation. Genes were found to be dysregulated by |log2(FC)| > 1.0 and P < 0.05. (A) DEGs expression heat map in HCC; (B) DEGs expression heat map in BC; (C) DEGs expression heat map in depression. HCC, patients with hepatocellular carcinoma; BC, patients with breast cancer; D, depression; C, control subjects.

Figure 4 GO analysis of the overlapping DEGs between HCC and BC by DAVID and visualized by OmicShare. P < 0.05 was treated as significant for further analysis. GO, Gene Ontology; DEGs, differentially expressed genes; HCC, hepatocellular carcinoma; BC, breast cancer; DAVID, Database for Annotation, Visualization and Integrated Discovery.

KEGG pathway analysis

The top 20 enriched KEGG pathways are shown in Figure 5. The DEGs were enriched in the cell cycle pathway, p53 signaling pathway, Staphylococcus aureus infection, HTLV-I infection, PI3K-AKT signaling pathway, cell adhesion molecules (CAMs), intestinal immune network for IgA production, FoxO signaling pathway, viral myocarditis, melanoma, Rap1 signaling pathway, rheumatoid arthritis and proteoglycans in cancer.

PPI network analysis

To screen for proteins with significant interactions, we used a minimum required interaction score of ≥0.9 in the STRING database as the primary criterion. The PPI network data constructed on the STRING platform were then imported into Cytoscape, where the high-confidence PPI network was mapped (Figure 6). The top 10 hub nodes were selected in descending order of degree: cyclin-dependent kinase 1 (CDK1), cyclin B1 (CCNB1), cyclin B2 (CCNB2), cyclin A2 (CCNA2), MAD2 mitotic arrest deficient-like 1 (MAD2L1), aurora kinase B (AURKB), baculoviral IAP repeat containing 5, histone cluster 2, H2be (HIST2H2BE), extra spindle pole bodies like 1 (ESPL1) and H2B histone family member S (H2BFS).
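Ranking hub nodes by degree amounts to counting, for each protein, how many distinct partners it has in the network and sorting in descending order. The Python sketch below shows this on an invented edge list; `top_hubs` is a hypothetical helper, and ties are broken alphabetically here purely so that the output is deterministic.

```python
from collections import Counter


def top_hubs(edges, k=10):
    """Return the k nodes with the highest degree as (node, degree) pairs."""
    # Treat the network as undirected and ignore duplicate edges and self-loops.
    unique_edges = {frozenset((a, b)) for a, b in edges if a != b}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Invented interactions; ("B", "A") duplicates ("A", "B") and is counted once.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A"), ("D", "E")]
hubs = top_hubs(edges, k=3)
print(hubs)
```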

Discussion

HCC is one of the most common malignant tumors and is extremely harmful to human health [11, 12]. In clinical practice, the incidence and recurrence rates of depression are higher in HCC and BC patients [13]. Recent advances in molecular epigenetics have found that depression increases the risk of HCC via epigenetic downregulation of hypocretin [14]. Stimulating the hypothalamic-pituitary-adrenal (HPA) axis also results in the upregulation of pro-inflammatory cytokines, which may induce depression-like behaviors in HCC and BC [15]. The progression of cancer also results in the release of pro-inflammatory cytokines, which leads to the stimulation and hyperactivation of the HPA axis [16]. Our results showed that the number of overlapping DEGs between HCC and BC was considerably higher than the number shared among HCC, BC and depression.

Figure 5 KEGG analysis of the overlapping DEGs between HCC and BC by DAVID and visualized by OmicShare. P < 0.05 was treated as a significant pathway for further analysis. DEGs, differentially expressed genes; HCC, hepatocellular carcinoma; BC, breast cancer; DAVID, Database for Annotation, Visualization and Integrated Discovery.

Figure 6 Constructed PPI network for the overlapping DEGs between HCC and BC with PPI scores ≥0.9 in the STRING database. The top 10 hubs are indicated as pink dots. DEGs, differentially expressed genes; HCC, hepatocellular carcinoma; BC, breast cancer; PPI, protein-protein interaction; STRING, Search Tool for the Retrieval of Interacting Genes/Proteins.

Functional annotation suggested that the DEGs shared by HCC and BC were predominantly related to biological regulation, cellular and metabolic processes, and response to stimulus. Like the KEGG analysis, these results pointed to genes involved in cellular processes. Based on the interaction score, most of the 1,931 DEGs were found to be interconnected in the PPI network. The top 10 hub nodes with the highest degrees were found in the following pathways: i) cell cycle-related pathways, including cellular senescence, gap junction and the p53 signaling pathway; ii) cancer-related pathways, including JAK-STAT pathways and extracellular matrix-receptor pathways; iii) other pathways, including the MAPK, CAM and FoxO signaling pathways.

Among the overlapping DEGs related to HCC, BC and depression, NF1 was found to be closely associated with several key cancer-associated proteins, including Ras, SOS and integrin subunit alpha 6, which are abnormally overexpressed in various types of cancer. In this study, NF1 appears to be a negative regulator of the Ras signal transduction pathway. As one of the 20 most strongly down-regulated genes in HCC, thyroid stimulating hormone receptor was recognized as an overlapping DEG between HCC and depression [17]. The differential expression of CRHBP has been reported to be significantly associated with a higher risk of suicidal behavior [18]. Interestingly, CRHBP has also been shown to be related to HCC [19]. In addition, the PTEN, FoxO, PI3K-AKT/mTOR and MAPK pathways have also been confirmed to be related to [20-23]. It has been previously shown that histone cluster 1, HIST1H2AM is significantly associated with depression and can induce deacetylation of PTEN [24].

Although recent studies have shown that cell cycle signaling is associated with cell cycle arrest and apoptosis, its role in HCC and BC still needs to be clarified. To our knowledge, overexpression of wild-type or mutant p53 can down-regulate Bcl-2 expression, thereby leading to apoptosis [25]. Understanding the molecular interface between cell cycle arrest and apoptosis after cell cycle activation is essential. Interestingly, AGTR1, the gene encoding the angiotensin II receptor, was also identified as an overlapping DEG between BC and depression. AGTR1 has been reported to be related to depression and frontotemporal morphology differences, and it is a molecular marker that distinguishes benign from malignant BC [26, 27]. These reports are consistent with the results of our study.

PPI network analysis revealed that CDK1 and CCNB1 were the most significant hub proteins. The CDK1 and CCNB1 genes encode the CDK1 and cyclin B1 proteins, which interact to participate in centrosome duplication, chromosome segregation, and cell cycle regulation [28]. In addition, the hub proteins AURKB and HIST2H2BE have been shown to be closely related to cancer [29], and in the PPI network they interacted with CCNB2, CCNA2, MAD2L1, ESPL1 and H2BFS. Our results suggested that the top 10 proteins might be related to metabolic processes, the cell cycle, CAMs, and the PI3K-AKT and FoxO signaling pathways, consistent with the GO analysis. These results suggest an underlying connection between HCC, BC and depression.

Cancer patients are more likely to be depressed than the general population, which contributes to the poor prognosis of cancer patients. However, psychosocial stress associated with diagnosis fails to fully explain the increase in the prevalence of depression in cancer patients [30]. Hormones and their receptors, transporter proteins, and enzymes play an intricate role in the endocrine system, including the HPA axis [31]. In addition, depression is related to elevated plasma interleukin-6 (IL-6) in cancer patients [32]. These patients exhibit HPA axis dysfunction, which is characterized by cortisol decline [33].

Therefore, IL-6 and cortisol may be useful markers for diagnosing depression in cancer patients, owing to their high sensitivity and specificity. Furthermore, depression and estrogen are closely related in the regulation of the hypothalamic-pituitary-gonadal (HPG) axis. Moreover, changes to the HPG axis during stress are specific and depend on the type and duration of the stimulus [34]. Accordingly, hormonal imbalance acting on the HPA and HPG axes may adversely affect liver, breast and/or cognitive processes.

This study has limitations. We could not perform experiments to probe the potential oncogenic mechanisms linking HCC, BC and depression, and we do not have follow-up data for patients. Two further limitations apply to the approach itself. First, the analysis relies on public datasets: it lacks sequencing data from samples collected for this study, and the findings were not validated in an external dataset. Second, future work would benefit from using R software so that the analysis parameters can be customized.

In conclusion, these results indicate that the genetic profiles of HCC, BC and depression may yield valuable insights into the mechanisms underlying synergistic genetic regulation and may provide novel ideas for developing homeopathic therapies based on bioinformatics data.