Association of long non-coding RNA HOTAIR and MALAT1 variants with cervical cancer risk in Han Chinese women

2019-11-19
THE JOURNAL OF BIOMEDICAL RESEARCH, 2019, Issue 5

Meiqun Jia, Lulu Ren, Lingmin Hu, Hongxia Ma, Guangfu Jin, Dake Li, Ni Li, Zhibin Hu (7, ✉), Dong Hang (✉)

1Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China;

2Department of Gynecologic Oncology, the Affiliated Tumor Hospital of Nantong University (Nantong Tumor Hospital), Nantong, Jiangsu 226361, China;

3Department of Reproduction, the Affiliated Changzhou Maternity and Child Health Care Hospital of Nanjing Medical University, Changzhou, Jiangsu 213003, China;

4Jiangsu Key Lab of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Medicine, Nanjing Medical University, Nanjing 211166, China;

5Department of Gynecologic Oncology, Nanjing Maternity and Child Health Hospital, Nanjing, Jiangsu 210004, China;

6Program Office for Cancer Screening in Urban China, National Cancer Centre/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100021, China;

7State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, Jiangsu 211166, China.

Abstract

The long noncoding RNAs (lncRNAs) HOTAIR and MALAT1 are implicated in the development of multiple cancers. Genetic variants within HOTAIR and MALAT1 may affect gene expression, thereby modifying genetic susceptibility to cervical cancer. A case-control study was designed, including 1 486 cervical cancer patients and 1 536 healthy controls. Based on the RegulomeDB database, 11 SNPs were selected and genotyped using Sequenom's MassARRAY. Univariate and multivariate logistic regression models were used to calculate odds ratios (OR) and 95% confidence intervals (CI). We found that the A allele of rs35643724 in HOTAIR was associated with increased risk of cervical cancer, while the C allele of rs1787666 in MALAT1 was associated with decreased risk. Compared to individuals with 0-1 unfavorable allele, those with 3-4 unfavorable alleles showed 18% increased odds of having cervical cancer. Our findings suggest that HOTAIR rs35643724 and MALAT1 rs1787666 might represent potential biomarkers for cervical cancer susceptibility.

Keywords: cervical cancer, variant, long noncoding RNA, HOTAIR, MALAT1

Introduction

Cervical cancer is the fourth most common malignancy and the fourth leading cause of cancer-related deaths among women worldwide[1]. Approximately 527 600 new cases of cervical cancer are diagnosed each year, and more than 85% occur in developing countries[2]. Cervical infection with high-risk human papillomavirus (HPV) has been recognized as the major risk factor for the disease[3]. Although 80% of women are likely to acquire HPV infections during their lifetime, most of the infections regress without intervention. Less than 4% of the infections become persistent, and even fewer lead to premalignant lesions and cancer[4-5]. Previous studies demonstrated that a family history of cervical cancer was associated with cervical cancer risk, supporting the critical role of genetic susceptibility in cervical carcinogenesis[6].

Genome-wide association studies (GWAS), which scan the entire genome for common genetic variants, have identified over 450 SNPs associated with susceptibility to different types of cancer[7]. Of note, only 7% of these loci are located in protein-coding regions, whereas 93% lie in noncoding regions[8-9]. For cervical cancer, three GWAS analyses have been undertaken, showing that several genetic variants within immune-related genes (e.g., MICA and HLA) were associated with the risk of cervical cancer[10-12]. However, the identified variants explained only a small fraction of the heritability of cervical cancer. Although candidate gene association studies also reported associations between cervical cancer and SNPs involved in DNA repair, cell cycle, apoptosis, and other processes, the findings were inconsistent[13]. Therefore, further exploration of novel susceptibility biomarkers is needed to improve the identification of individuals at higher risk[14].

Long noncoding RNAs (lncRNAs) are noncoding transcripts longer than 200 nucleotides and have been described as the largest subclass of the noncoding transcriptome in humans[15]. Accumulating evidence shows that Hox transcript antisense intergenic RNA (HOTAIR) plays an oncogenic role in the tumorigenicity of different cancers, such as gastric, colorectal, breast, and cervical cancer[16]. In addition, deregulated expression of HOTAIR in HPV16-positive cervical cancer was shown to be E7-dependent, suggesting that HOTAIR could be a potential target of the HPV16 E7 oncoprotein[17]. It was also reported that genetic variations within HOTAIR might affect the interaction with HPV16 E7[18]. Another lncRNA, termed metastasis-associated lung adenocarcinoma transcript 1 (MALAT1), was first found in association with lung cancer[19]. A prior study showed that MALAT1 expression was upregulated in cervical cancer cells compared with normal cells, and that MALAT1 might interact with HPV, promoting cell proliferation and migration[20]. Therefore, HOTAIR and MALAT1 may serve as two important regulators in cervical cancer development[21].

Genetic variants within HOTAIR have been linked with the susceptibility to multiple cancers including gastric cancer[22], breast cancer[23], and ovarian cancer[24]. Our previous study also suggested that functional variants within specific lncRNAs might modify the risk of cervical cancer[25]. However, no prior study has evaluated the association of functional variants of HOTAIR and MALAT1 with risk of cervical cancer.

Therefore, to extend our knowledge, we screened potentially functional SNPs within HOTAIR and MALAT1 based on the RegulomeDB database, which includes high-throughput experimental evidence for genetic variants[26], and genotyped 11 SNPs in a case-control study of 1 486 cervical cancer cases and 1 536 age-matched healthy controls from a Han Chinese population.

Materials and methods

Study population

This study was approved by the institutional review board of Nanjing Medical University, and all participants provided written informed consent before enrollment. The enrollment criteria were described previously[25]. In brief, a total of 1 486 newly diagnosed and histologically confirmed cervical cancer patients were consecutively recruited from the First Affiliated Hospital of Nanjing Medical University and the Nantong Tumor Hospital in Jiangsu Province from March 2006 to December 2010. Patients with a history of cancer or with cancer metastasized from other organs were excluded. The 1 536 controls were randomly selected from a pool of more than 30 000 individuals who participated in a community-based screening program for noninfectious diseases in Changzhou, Jiangsu Province during the same period. All controls reported no history of cancer and were matched to the cases by age (±5 years). All participants were unrelated ethnic Han Chinese and were interviewed to complete a standardized questionnaire collecting information on demographic data, menstrual and reproductive history, and environmental exposure history such as smoking and alcohol drinking. After the interview, approximately 5 mL of venous blood was collected from each participant.

SNP selection

We used RegulomeDB (http://regulome.stanford.edu), which includes high-throughput experimental data sets from the Encyclopedia of DNA Elements (ENCODE) and other sources, to identify putative functional variants. RegulomeDB presents a scoring system with categories ranging from 1 to 6 based on the degree of experimental and computational evidence for the functional consequence of a given variant; a lower score indicates stronger evidence that a variant is located in a functional region. We considered SNPs located within 20 kb upstream or downstream of HOTAIR and MALAT1 and included those with RegulomeDB scores ranging from 1 to 3a. Based on the criteria of minor allele frequency (MAF) >0.05 and linkage disequilibrium (LD) <0.8 in Han Chinese, we found a total of 14 potentially functional SNPs.
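
The selection criteria above can be illustrated with a minimal sketch. It assumes the candidate SNPs, their RegulomeDB scores, MAF values, and pairwise LD values have already been retrieved; the SNP identifiers, the ordering of score sub-categories, and the greedy pruning rule are assumptions made for illustration and are not taken from the study.

```python
from dataclasses import dataclass

# RegulomeDB categories from strongest to weakest evidence.
# The sub-category ordering is an assumption of this sketch.
SCORE_ORDER = ["1a", "1b", "1c", "1d", "1e", "1f",
               "2a", "2b", "2c", "3a", "3b", "4", "5", "6"]


@dataclass(frozen=True)
class Snp:
    rsid: str
    regulome_score: str
    maf: float  # minor allele frequency in the reference population


def passes_filters(snp, worst_score="3a", min_maf=0.05):
    """True if the score lies between 1a and worst_score and MAF exceeds min_maf."""
    rank = SCORE_ORDER.index(snp.regulome_score)
    return rank <= SCORE_ORDER.index(worst_score) and snp.maf > min_maf


def prune_by_ld(snps, pairwise_ld, max_ld=0.8):
    """Greedy pruning (an assumed rule): keep a SNP only if its LD with
    every SNP already kept is below max_ld. Missing pairs count as LD 0."""
    kept = []
    for snp in snps:
        if all(pairwise_ld.get(frozenset((snp.rsid, k.rsid)), 0.0) < max_ld
               for k in kept):
            kept.append(snp)
    return kept


# Hypothetical candidates; these identifiers are placeholders, not real SNPs.
candidates = [
    Snp("snp_a", "2b", 0.21),
    Snp("snp_b", "1f", 0.12),
    Snp("snp_c", "4", 0.30),   # score weaker than 3a
    Snp("snp_d", "2a", 0.03),  # MAF not above 0.05
    Snp("snp_e", "3a", 0.18),  # in high LD with snp_a
]
pairwise_ld = {frozenset(("snp_a", "snp_e")): 0.92}

filtered = [s for s in candidates if passes_filters(s)]
selected = prune_by_ld(filtered, pairwise_ld)
print([s.rsid for s in selected])
```

In this toy input, only the first two placeholder SNPs survive all three criteria.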

Genotyping and quality control

Genomic DNA was isolated from leukocyte pellets of venous blood by the standard phenol-chloroform method. Candidate SNPs were genotyped using Sequenom's MassARRAY® iPLEX assay according to the manufacturer's instructions. Genotyping was performed blind to case-control status. Because primer design failed for three SNPs, they were excluded, and 11 SNPs were successfully genotyped with a call rate >90%. Two water controls in each 384-well plate were used as blank controls for quality control, and approximately 20% of samples were randomly selected for repeat genotyping, yielding a concordance rate of >99%. Samples with overall genotype completion rates <90% were excluded, leaving 1 356 cervical cancer cases and 1 496 healthy controls in the final analysis.
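
The sample-level completion filter and the repeat-genotyping concordance described above can be sketched as follows. The data layout (one list of genotype calls per sample, with `None` marking a failed call) and the sample identifiers are assumptions for illustration only.

```python
def completion_rate(genotypes):
    """Fraction of non-missing genotype calls; None marks a failed call."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def filter_samples(samples, min_rate=0.90):
    """Keep samples whose completion rate is at least min_rate
    (samples below 90% are excluded, as in the text)."""
    return {sid: calls for sid, calls in samples.items()
            if completion_rate(calls) >= min_rate}


def concordance(first_run, repeat_run):
    """Share of identical calls among positions called in both runs."""
    pairs = [(a, b) for a, b in zip(first_run, repeat_run)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no positions were called in both runs")
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical samples, each with calls for 11 SNPs.
samples = {
    "s1": ["AA"] * 11,
    "s2": ["AG"] * 10 + [None],      # 10/11 called, retained
    "s3": ["GG"] * 9 + [None, None],  # 9/11 called, excluded
}
retained = filter_samples(samples)
print(sorted(retained))
```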

Statistical analysis

The difference in demographic characteristics between cases and controls was evaluated by the χ2 test for categorical variables and Student's t-test for continuous variables. Deviation of the genotype distribution from Hardy-Weinberg equilibrium for each SNP was tested by a goodness-of-fit χ2 test among controls. The associations between SNPs and cervical cancer risk were estimated by odds ratios (OR) and 95% confidence intervals (CI) from univariate and multivariate logistic regression analyses in three genetic models (additive, dominant, and recessive). Each model makes different assumptions about the genetic effect, as described elsewhere[27]. The χ2-based Q test was used to assess the heterogeneity between subgroups. The statistical analyses were performed using SPSS 17.0 and PLINK software. All tests were two-sided and P<0.05 was defined as statistically significant.
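
As an illustration of the Hardy-Weinberg check, the sketch below computes a goodness-of-fit χ2 statistic with one degree of freedom from genotype counts among controls. It is a textbook formulation written for this example, not the study's own code (the analyses were run in SPSS and PLINK), and the genotype counts are hypothetical.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Takes observed counts of the three genotypes and returns
    (chi2, p_value) with one degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts.
chi2, p_value = hwe_chi2(420, 760, 316)
print(round(chi2, 3), round(p_value, 3))
```

A SNP would be flagged for exclusion when the p-value among controls falls below the chosen threshold.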

Results

Table 1 shows the characteristics of 1 356 cervical cancer cases and 1 496 healthy controls. We found that cases had lower menarche age (P<0.01), higher parity (P<0.01), and a higher proportion of smoking (P<0.001) than controls. No statistically significant difference was observed in the distribution of age, menopausal status, and family history of cervical cancer between cases and controls.

Among the 11 genetic variants, rs1194337 was excluded as it was not in Hardy-Weinberg equilibrium among the controls (P<0.01). As shown in Table 2, by using multivariate logistic regression, we found that the A allele of rs35643724 in HOTAIR was significantly associated with increased risk of cervical cancer (additive model: adjusted OR=1.13, 95% CI=1.00-1.28, P=0.047). In Table 3, the C allele of rs1787666 in MALAT1 was associated with decreased risk of cervical cancer (additive model: adjusted OR=0.86, 95% CI=0.76-0.97, P=0.01).
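
To make the genetic models concrete, the following sketch shows how the number of effect alleles is commonly coded under the additive, dominant, and recessive models, together with a crude (unadjusted) odds ratio and Wald 95% CI from a 2×2 table. It does not reproduce the adjusted multivariate logistic regression estimates reported above; the function names and the counts are hypothetical.

```python
import math


def code_genotype(effect_allele_count, model):
    """Map the number of effect alleles (0, 1, or 2) to a model predictor."""
    if effect_allele_count not in (0, 1, 2):
        raise ValueError("allele count must be 0, 1, or 2")
    if model == "additive":
        return effect_allele_count
    if model == "dominant":
        return 1 if effect_allele_count >= 1 else 0
    if model == "recessive":
        return 1 if effect_allele_count == 2 else 0
    raise ValueError(f"unknown model: {model}")


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval."""
    odds_ratio = ((exposed_cases * unexposed_controls)
                  / (unexposed_cases * exposed_controls))
    se = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                   + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical carrier counts under a dominant coding.
print(crude_odds_ratio(300, 700, 250, 750))
```

Under the additive coding, the odds ratio from a logistic model is interpreted per additional copy of the effect allele.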

The combined effect of these two variants on cervical cancer risk was further analyzed. In Table 4, we found a dosage effect of rs35643724 and rs1787666 on cervical cancer risk (P for trend=0.001). Individuals with 3-4 unfavorable alleles had 18% higher odds of having cervical cancer than those with 0-1 unfavorable allele (additive model: adjusted OR=1.18, 95% CI=1.06-1.30, P=0.002).
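
The unfavorable-allele count behind this combined analysis can be sketched as below. It assumes that the unfavorable allele is the A allele at rs35643724 and the non-C allele at rs1787666, which follows from the direction of the reported associations; the middle group of exactly two alleles is inferred from the 0-1 and 3-4 groups named in the text.

```python
def unfavorable_allele_count(n_a_rs35643724, n_c_rs1787666):
    """Count unfavorable alleles (0 to 4) across the two SNPs.

    n_a_rs35643724: copies of the risk-increasing A allele (0, 1, or 2).
    n_c_rs1787666:  copies of the protective C allele (0, 1, or 2);
                    each non-C copy counts as unfavorable.
    """
    for n in (n_a_rs35643724, n_c_rs1787666):
        if n not in (0, 1, 2):
            raise ValueError("allele counts must be 0, 1, or 2")
    return n_a_rs35643724 + (2 - n_c_rs1787666)


def risk_group(count):
    """Assign the count to a group; the '2' group is an assumption."""
    if count <= 1:
        return "0-1"
    if count == 2:
        return "2"
    return "3-4"


# Example: one A allele at rs35643724 and no C allele at rs1787666.
print(risk_group(unfavorable_allele_count(1, 0)))
```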

We further performed stratified analysis by age, menarche age, parity, and menopausal status. As shown in Table 5, the association for rs35643724 was statistically significant in the subgroup of menarche age >16 (P for heterogeneity=0.01). Also, we observed that the pathogenic effects of the rs35643724 A allele were prominent in the subgroups of parity 0-1, nonsmokers, and no family history of cancer; the association for rs1787666 was statistically significant in the subgroups of menarche age >16, parity 0-1, premenopause, nonsmokers, and family history of cancer. However, heterogeneity tests showed nonsignificant differences between those subgroups.
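
The χ2-based Q test for heterogeneity between subgroups can be sketched with the standard inverse-variance (Cochran) formulation. The sketch assumes each subgroup supplies an OR with its 95% CI, from which the standard error of the log OR is recovered; the subgroup values are hypothetical, and the p-value helper covers only the two-subgroup case (one degree of freedom).

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) recovered from a Wald confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def cochran_q(estimates):
    """estimates: list of (log_or, se) pairs, one per subgroup.
    Returns (Q, degrees of freedom)."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = (sum(w * b for w, (b, _) in zip(weights, estimates))
              / sum(weights))
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return q, len(estimates) - 1


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2))


# Hypothetical subgroup results: (OR, lower CI bound, upper CI bound).
subgroups = [(1.30, 1.08, 1.56), (1.02, 0.87, 1.20)]
estimates = [(math.log(o), se_from_ci(lo, hi)) for o, lo, hi in subgroups]
q, df = cochran_q(estimates)
print(round(q, 3), df, round(chi2_sf_1df(q), 3))
```

With more than two subgroups, Q would be compared against a χ2 distribution with the corresponding degrees of freedom, which this helper does not implement.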

Discussion

In the current study, we systematically investigated the association of potentially functional variants within lncRNA HOTAIR and MALAT1 with cervical cancer risk. We found that HOTAIR rs35643724 was associated with increased risk of cervical cancer and MALAT1 rs1787666 was associated with decreased risk, suggesting that these two variants might serve as novel susceptibility biomarkers for cervical cancer.

Previous studies have reported that several HOTAIR variants (e.g., rs920778, rs7958904, and rs874945) are associated with the risk of different cancers, such as esophageal cancer, colorectal cancer, gastric cancer, and breast cancer[28]. A prior case-control study including 510 cervical cancer cases and 713 healthy controls found an association between HOTAIR rs920778 and cervical cancer[29], and another study of 1 209 cases and 1 348 controls identified HOTAIR rs7958904 in relation to cervical cancer[30]. Unlike these two studies, which used tagging SNPs, we screened potentially functional SNPs by using RegulomeDB and provided the first evidence that HOTAIR rs35643724 might be a causal variant associated with cervical cancer risk. We observed no strong LD between rs35643724 and those reported SNPs. On the other hand, several MALAT1 SNPs (rs1194338, rs4102217, rs619586, and rs618586) have been linked with colorectal cancer, hepatocellular cancer, and breast cancer, suggesting a critical role of MALAT1 variants in cancer susceptibility[31-33]. We screened potentially functional variants within MALAT1 and found, for the first time, that rs1787666 was associated with susceptibility to cervical cancer.

Table 1 Demographic and clinical characteristics of 1 356 cervical cancer patients and 1 496 healthy controls

The RegulomeDB score of rs35643724 was 2b and ChIP-seq data suggested that the SNP was likely to affect the binding of EZH2 protein. Overexpression of EZH2 is closely related to FIGO stage, lymph node metastasis, and a poor prognosis of cervical cancer[34-36]. Future studies are needed to confirm the functional effects of rs35643724 on EZH2 binding and HOTAIR expression. On the other hand, the RegulomeDB score of rs1787666 was 1f and ChIP-seq data indicated that the SNP was likely to affect the binding of multiple proteins to MALAT1. Previous studies showed that the expression of MALAT1 in cervical cancer tissues was significantly higher than in adjacent normal tissues[37-38]. Therefore, we speculated that rs1787666 might lower the expression of MALAT1, thereby reducing the risk of cervical cancer.

Table 2 Association of variations within lncRNA HOTAIR and cervical cancer risk

Table 3 Association of variations within lncRNA MALAT1 and cervical cancer risk

Table 4 Combined effect of rs35643724 and rs1787666 on cervical cancer risk

This study has several limitations. First, although our study was relatively large, the findings still need to be validated in additional subjects. Second, although we provided functional evidence by using bioinformatics tools, functional studies need to be conducted to validate the biological effects of the two SNPs. Third, we did not collect cervical tissue specimens to detect HPV infections among the participants. It remains unknown whether the identified associations would be modified by exposure to HPV infection.

In conclusion, our study showed that HOTAIR rs35643724 was associated with higher risk of cervical cancer, while MALAT1 rs1787666 was associated with lower risk, suggesting a role of long noncoding RNAs in cervical carcinogenesis. Further studies are warranted to confirm the biological function of the variants.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (81502873), the Natural Science Foundation of Jiangsu Province (BK20150997), the Priority Academic Program for the Development of Jiangsu Higher Education Institutions (Public Health and Preventive Medicine), the Innovation Fund of the State Key Laboratory of Reproductive Medicine (SKLRMGC201802), the Clinical Medicine Research Fund of the Chinese Medical Association (17020420711), and the Top-notch Academic Programs Project of Jiangsu Higher Education Institutions (PPZY2015A067).