Lei Wu, Cong Wang, Xianzheng Tan, Zixuan Cheng1, Ke Zhao1, Lifen Yan, Yanli Liang, Zaiyi Liu, Changhong Liang1
1School of Medicine, South China University of Technology, Guangzhou 510006, China; 2Department of Radiology, Guangdong General Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; 3School of Automation Science and Engineering, South China University of Technology, Guangzhou 510641, China
Abstract Objective: To predict preoperative staging using a radiomics approach based on computed tomography (CT) images of patients with esophageal squamous cell carcinoma (ESCC). Methods: This retrospective study included 154 patients (primary cohort: n=114; validation cohort: n=40) with pathologically confirmed ESCC. All patients underwent a preoperative CT scan from the neck to the abdomen. High-throughput quantitative radiomics features were extracted from the CT images of each patient. A radiomics signature was constructed using the least absolute shrinkage and selection operator (Lasso). Associations between the radiomics signature, tumor volume and ESCC staging were explored. The diagnostic performance of the radiomics approach and of tumor volume for discriminating between stages I−II and III−IV was evaluated and compared using receiver operating characteristic (ROC) curves and net reclassification improvement (NRI). Results: A total of 9,790 radiomics features were extracted. Ten features were selected to build a radiomics signature after feature dimension reduction. The radiomics signature was significantly associated with ESCC staging (P<0.001), and discriminated early from advanced stage ESCC better than tumor volume in both the primary [area under the receiver operating characteristic curve (AUC): 0.795 vs. 0.694, P=0.003; NRI=0.424] and validation cohorts (AUC: 0.762 vs. 0.624, P=0.035; NRI=0.834). Conclusions: The quantitative approach has the potential to identify stage I−II and III−IV ESCC before treatment.
Keywords: Esophageal cancer; tumor staging; diagnostic imaging; tumor volume
Esophageal cancer (EC), the eighth most frequent malignant disease and the sixth most prevalent cause of disease-associated deaths worldwide, had an estimated 456,000 new cases and 400,000 deaths in 2012 (1). EC prognosis is strongly associated with the stage at diagnosis. Most patients with EC are diagnosed at a locally advanced stage, and the 5-year survival rate is very low (less than 20%). However, for early stage (stage I−II) patients, the survival rate could be up to 85% (2,3). Additionally, the choice among surgical resection, chemoradiation and other therapeutic approaches depends on accurate preoperative staging (4). Therefore, accurate preoperative staging is important for predicting prognosis and choosing a suitable therapeutic strategy for patients with EC.
In clinical practice, computed tomography (CT) is universally used for preoperative diagnostics and remains the mainstay for preoperative staging of EC. However, because of the poor contrast resolution of the esophageal wall, it is difficult to distinguish the different histologic layers on CT. CT is mainly used to evaluate regional spread into adjacent structures (T4) and distant metastases (M1) (5-7). Recently, radiomics has become an active field of research. Radiomics, a noninvasive, quantitative and low-cost approach, can objectively and comprehensively evaluate tumor heterogeneity by extracting high-throughput quantitative features from medical images through data characterization algorithms (8,9). These features have the potential to reveal disease characteristics and provide valuable information for personalized therapy (10,11). Some previous studies have shown that combining clinical parameters with quantitative radiomics features, as a predictive biomarker or radiomics signature, could enhance predictive accuracy in oncology (12-14). Using texture analysis, some studies have shown that several texture features of the tumor, such as entropy and uniformity, were associated with early or advanced stage in EC (15,16). However, using multiple imaging biomarkers to predict the stage of EC based on CT images has not been explored. To our knowledge, there have been no reports on whether a radiomics approach could predict the stage of EC based on CT images.
We hypothesized that the radiomics approach could be helpful in differentiating stage I−II from stage III−IV EC. Tumor volume is an important independent prognostic indicator that has been extensively studied (17,18). We therefore excluded tumor volume from the radiomics features and analyzed its performance in discriminating stage I−II from stage III−IV patients separately. The purpose of this study was mainly to investigate the feasibility of preoperative staging using a radiomics approach based on CT images in patients with esophageal squamous cell carcinoma (ESCC).
Institutional Review Board approval of Guangdong General Hospital, Guangdong Academy of Medical Sciences was obtained, and written informed consent was waived by the Institutional Review Board. The data of this study were collected from the Institutional Picture Archiving and Communication System for patients with ESCC who underwent radical surgery at Guangdong General Hospital, Guangdong Academy of Medical Sciences between January 2008 and August 2016. All the patients underwent an enhanced CT scan from the neck to the abdomen.
A total of 211 consecutive patients were initially enrolled in this study, of whom 57 were excluded according to the following exclusion criteria: 1) clinicopathological information was incomplete (n=27); 2) pathologically diagnosed with adenocarcinoma (n=2); 3) cases with unknown histological grade (n=3); or 4) absence of preoperative contrast-enhanced CT (n=25). The remaining 154 patients were included in this study and met the following inclusion criteria: 1) pathologically confirmed ESCC; 2) underwent radical surgery for ESCC; 3) standard contrast-enhanced CT was performed within 1−2 weeks prior to surgery; and 4) complete clinical information was available. We randomly divided the data into a primary cohort and a validation cohort at a ratio of about 3:1. We trained models in the primary cohort and then validated them in the validation cohort. Tumor staging was performed according to the American Joint Committee on Cancer TNM Staging System Manual, 8th Edition (19). The clinicopathologic characteristics of patients in the primary and validation cohorts are presented in Table 1.
All patients underwent non-enhanced and contrast-enhanced CT of the esophagus performed using a 64-channel multi-detector CT scanner (LightSpeed VCT, GE Medical Systems, Milwaukee, Wis, USA). The acquisition parameters were as follows: 120 kV; 160 mAs; 0.5-second rotation time; detector collimation, 64 × 0.625 mm; field of view, 350 mm × 350 mm; and matrix, 512 × 512. After routine non-enhanced CT, contrast-enhanced CT was performed after a 25-second delay following intravenous administration of 85 mL of iodinated contrast material (Ultravist 370; Bayer Schering Pharma, Berlin, Germany) at a rate of 3.0 mL/s with a pump injector (Ulrich CT Plus 150, Ulrich Medical, Ulm, Germany). All images were reconstructed with a slice thickness of 5.0 mm.
Although the CT images were obtained with the same scanner and the same protocol, the intensities of the images may vary because of uncontrollable factors such as room humidity, temperature and slice location. Those factors influence the gray-level ranges and thus affect the extraction of image features. We normalized the segmented regions of interest (ROIs) to reduce the influence of image intensity variation in two steps: 1) Gray-level range selection: this step normalized image intensities into the range (μ−3σ, μ+3σ) (where μ is the gray-level mean, and σ is the gray-level standard deviation), as shown by Collewet et al. (20). 2) Image quantization: this step specifies the number of bits for each pixel to quantize the resulting gray-level range as follows (21):
Table 1 Characteristics of patients in primary and validation cohorts
Where Range is a discrete value (8, 16, 32, 64), I is the intensity of the original ROI, and Φ is the set of pixels in the ROI area. In this study, 16 discrete gray levels were used for resampling.
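The two normalization steps can be sketched in Python as follows. This is a minimal illustration, not the authors' in-house Matlab code: it assumes that intensities outside (μ−3σ, μ+3σ) are clipped to that range and that the clipped range is divided uniformly into `levels` bins numbered from 1. The function name `normalize_roi` is hypothetical, and the exact formula in reference (21) may differ in offset or rounding.

```python
from statistics import mean, pstdev

def normalize_roi(intensities, levels=16):
    """Clip ROI intensities to mu +/- 3*sigma, then quantize to `levels` gray levels.

    Assumption: uniform quantization with output levels 1..levels.
    """
    mu = mean(intensities)
    sigma = pstdev(intensities)
    lower, upper = mu - 3 * sigma, mu + 3 * sigma

    # Step 1: gray-level range selection (clip outliers to the 3-sigma range).
    clipped = [min(max(v, lower), upper) for v in intensities]

    c_min, c_max = min(clipped), max(clipped)
    if c_max == c_min:
        # A constant ROI carries no contrast; map everything to level 1.
        return [1] * len(clipped)

    # Step 2: uniform quantization; the maximum value is folded into the top bin.
    return [
        min(int((v - c_min) / (c_max - c_min) * levels) + 1, levels)
        for v in clipped
    ]

quantized = normalize_roi([0, 10, 20, 30], levels=4)
```

With 16 levels, as used in the study, every pixel in the ROI ends up with an integer gray level between 1 and 16 before texture features are computed.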
Radiomics features were extracted from late arterial phase CT images with a 5.0 mm thickness for ESCC patients (Figure 1). The methodology of radiomics feature extraction is presented in Supplementary Material S1. Quantitative radiomics features were extracted using in-house radiomics analysis software run in Matlab 2016b (Mathworks, Natick, USA). The radiomics features included the following categories: 1) first-order statistics features (8); 2) size and shape-based features (8); 3) texture features; and 4) wavelet features.
Figure 1 Flowchart of radiomics feature extraction. An example of image segmentation and feature extraction in a patient with poorly differentiated, middle thoracic, stage III ESCC. (A) Original CT image; (B) Manual segmentation of the region of interest (ROI) on slice (A); (C) Feature extraction from the ROI, quantifying tumor intensity, shape, texture and wavelet texture.
The ROI was manually outlined along the tumor boundaries on each slice by an experienced radiologist. Interobserver reproducibility was analyzed using the images of 30 randomly selected patients, whose ROIs were delineated independently by two experienced radiologists (doctors 1 and 2, with 11 and 13 years of clinical experience in chest CT interpretation, respectively); the remaining images were outlined by doctor 1. We used the intraclass correlation coefficient (ICC) to determine the agreement in feature values between the observers.
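As an illustration of the reproducibility check, the sketch below computes an ICC for one feature measured by several observers. The text does not state which ICC form was used; this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), and the function name `icc_2_1` is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table of ratings: one row per subject, one column per observer.

    Assumption: two-way random effects, absolute agreement, single measurement.
    """
    n = len(ratings)       # number of subjects (e.g. 30 patients)
    k = len(ratings[0])    # number of observers (e.g. 2 radiologists)
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Perfect agreement between two observers gives an ICC of 1.
perfect = icc_2_1([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
```

In a pipeline of this kind, the function would be applied once per feature, and a feature would count as robust when its ICC reaches the chosen threshold.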
To eliminate redundant features, we removed highly correlated features by calculating the correlation coefficient between the features. Then, the least absolute shrinkage and selection operator (Lasso) logistic regression model (22) was used to identify the most useful predictive features. Lasso is a method of regression analysis that performs feature selection and regularization to improve prediction accuracy via penalized estimation functions. The tuning parameter (λ) was selected by 5-fold cross-validation using the minimum criterion, with the area under the receiver operating characteristic curve (AUC) as the performance measure. In this process, most covariate coefficients were shrunk to zero, and the remaining variables with non-zero coefficients were selected by Lasso. Finally, the radiomics signature was built by combining those variables in the primary cohort and validated in the validation cohort. A radiomics score (Rad-score) was calculated for each ESCC patient in the primary and validation cohorts. We classified patients as stage I−II or III−IV according to the cutoff value of the Rad-score. The higher the Rad-score, the higher the probability of stage III−IV.
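A Rad-score of this kind is typically a linear combination of the selected features weighted by their non-zero Lasso coefficients. The sketch below illustrates that idea under stated assumptions: the feature names, coefficients and intercept are invented for illustration (the actual formula is given in the supplementary material and is not reproduced here), and the default cutoff of 0.054 is the value reported in the caption of Figure 3.

```python
def rad_score(features, coefficients, intercept):
    """Linear predictor: intercept plus the weighted sum of the selected features.

    `coefficients` holds only features with non-zero Lasso coefficients.
    """
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )

def predict_stage(score, cutoff=0.054):
    """Classify by Rad-score: above the cutoff means stage III-IV, otherwise I-II."""
    return "III-IV" if score > cutoff else "I-II"

# Hypothetical coefficients and feature values, for illustration only.
example_coefficients = {"entropy": 0.5, "uniformity": -0.25}
example_features = {"entropy": 2.0, "uniformity": 1.0}

score = rad_score(example_features, example_coefficients, intercept=0.1)
stage = predict_stage(score)
```

Because the score is monotone in the predicted probability of a logistic model, thresholding the Rad-score is equivalent to thresholding that probability.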
We first assessed the potential association between the radiomics signature and stage in the primary cohort and then validated it in the validation cohort using the Mann-Whitney U test. The predictive performance of the radiomics signature was assessed in terms of its discrimination and classification ability. Receiver operating characteristic (ROC) curves were plotted for each cohort; the AUC, sensitivity, and specificity were calculated to evaluate the classification ability.
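The AUC is closely related to the Mann-Whitney U statistic: it equals the proportion of (stage III−IV, stage I−II) patient pairs in which the advanced-stage patient has the higher score, with ties counted as one half. The sketch below shows this relationship together with sensitivity and specificity at a fixed cutoff. It is a plain-Python illustration with hypothetical function names and made-up scores, not the pROC-based analysis used in the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity when scores above `cutoff` are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up scores: label 1 = stage III-IV, label 0 = stage I-II.
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores[:3], scores[3:])
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.45)
```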
A combination model, which added volume to the radiomics signature, was built using a logistic regression model, and its predictive performance was evaluated by means of discrimination. ROC curves were plotted and the AUC, sensitivity, specificity, and accuracy were calculated. Similarly, volume was used as the single variable in a univariate regression model, and its predictive performance was evaluated in the same way as for the combination model.
To compare the discrimination ability of those models in staging, we compared their AUCs using the DeLong test in the primary and validation cohorts. The net reclassification improvement (NRI) was also calculated in the two cohorts. NRI is often used to quantify how well a new model reclassifies subjects compared with an old model (23). It is also used to assess whether one set of predictive effects is better than another. The value of NRI can be positive or negative. A positive value means a net improvement of the model in discriminating patients' tumor stage.
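To make the NRI concrete, the sketch below computes a category-free (continuous) NRI from the predicted probabilities of an old and a new model. The text does not state which NRI variant was used, so the category-free form is an assumption here, and the function name and example probabilities are hypothetical.

```python
def continuous_nri(p_old, p_new, labels):
    """Category-free NRI for two sets of predicted probabilities.

    For events (label 1), an upward move in predicted probability is an improvement;
    for non-events (label 0), a downward move is an improvement.
    """
    up_event = down_event = n_event = 0
    up_nonevent = down_nonevent = n_nonevent = 0

    for old, new, y in zip(p_old, p_new, labels):
        if y == 1:
            n_event += 1
            if new > old:
                up_event += 1
            elif new < old:
                down_event += 1
        else:
            n_nonevent += 1
            if new > old:
                up_nonevent += 1
            elif new < old:
                down_nonevent += 1

    event_part = (up_event - down_event) / n_event
    nonevent_part = (down_nonevent - up_nonevent) / n_nonevent
    return event_part + nonevent_part

# Made-up probabilities: both events move up; one non-event moves down, one moves up.
nri = continuous_nri(
    p_old=[0.5, 0.5, 0.5, 0.5],
    p_new=[0.7, 0.6, 0.3, 0.6],
    labels=[1, 1, 0, 0],
)
```

Under this definition the NRI ranges from −2 to 2, which is why values close to 1 are possible.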
All statistical analyses were performed with R software (version 3.4.0, R Foundation for Statistical Computing, Vienna, Austria). The following R packages were used. The “glmnet” package was used for Lasso logistic regression. AUC values were compared with the “pROC” package. The “Hmisc” package was used for calculating NRI. The “ggplot2” and “pROC” packages were used to draw ROC curves.
All statistical tests in this study were two-sided, with P<0.05 considered statistically significant. Differences in gender, stage, tumor location, and histologic grade between the primary and validation cohorts were assessed using the Chi-square test for independent samples. Continuous variables such as age, tumor volume, and radiomics score were analyzed using the Mann-Whitney U test.
We retrospectively analyzed 154 patients with ESCC who were treated between 2008 and 2016 (stage I−II, n=69; stage III−IV, n=85). A total of 114 patients were assigned to the primary cohort (84 males and 30 females; mean age, 57.48±8.49 years), while 40 patients were assigned to the validation cohort (32 males and 8 females; mean age, 59.35±8.44 years). Clinical characteristics of patients in the primary and validation cohorts with stages I−II and III−IV are summarized in Table 1. There were no significant differences in gender, age, primary site, histologic grade, or tumor volume between the primary and validation cohorts (P=0.277−0.876).
In total, we extracted 9,790 radiomics features from CT images. These may contain many redundant and highly correlated features. To find robust and valuable features, the following steps were performed:
Features with ICC≥0.90 were identified as robust features. After robustness assessment, 6,140 features (ICC: 0.900−0.998) were selected from 9,790 features (ICC: 0.373−0.998). Then, the correlation coefficient was calculated for each pair of features to remove highly correlated features. For each feature pair with a correlation coefficient ≥0.9, the more predictive feature was retained and the other was discarded. After this correlation analysis, 218 features remained. Finally, 10 features were selected from the 218 features with Lasso (Figure 2). All those calculations were implemented in the primary cohort. The radiomics signature was built with the 10 selected features in the primary cohort and validated in the validation cohort. For linear models to produce a reasonably robust estimate, Peduzzi et al. (24,25) recommended that the number of observations per variable be at least 10. Therefore, it is reasonable to select 10 variables for 114 samples in the primary cohort. In addition, the Rad-score was calculated with a formula constructed from those 10 selected variables. The Rad-score calculation formula is presented in Supplementary Material S2 and Supplementary Figure S1.
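The correlation-based pruning step can be sketched as a greedy filter: features are visited from most to least predictive, and a feature is kept only if its correlation with every feature already kept is below the threshold. This is an illustration under assumptions that the text does not specify: Pearson correlation is used, the threshold is applied to its absolute value, and "predictiveness" is supplied as a precomputed score per feature (for instance a univariate AUC). The function names and toy data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def drop_correlated(features, relevance, threshold=0.9):
    """Keep the more predictive feature of every highly correlated pair.

    features:  dict mapping feature name -> list of values (one per patient)
    relevance: dict mapping feature name -> predictiveness score (assumed given)
    """
    ordered = sorted(features, key=lambda name: relevance[name], reverse=True)
    kept = []
    for name in ordered:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy data: "a" and "b" are almost perfectly correlated; "b" is more predictive.
toy_features = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.1],
    "c": [4.0, 1.0, 3.0, 2.0],
}
toy_relevance = {"a": 0.7, "b": 0.8, "c": 0.6}
selected = drop_correlated(toy_features, toy_relevance)
```

With the figures reported above, 10 selected variables for 114 patients corresponds to 11.4 observations per variable, just above the recommended minimum of 10.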
Figure 2 Radiomics feature selection using the least absolute shrinkage and selection operator (Lasso) logistic regression model. (A) Tuning the penalization parameter lambda (λ) using 5-fold cross-validation and the minimum criterion in the Lasso model. The area under the receiver operating characteristic curve (AUC) was plotted versus log (λ). Log (λ)=−3.006, with λ=0.049, was chosen; (B) Lasso coefficient profiles of the 218 radiomics features. The vertical gray line was drawn at the value selected using 5-fold cross-validation in (A), where the optimal λ yielded 10 features with non-zero coefficients.
A significant difference in radiomics score was observed between stages I−II and III−IV in the primary cohort (P<0.001), which was validated in the validation cohort (P<0.001). Compared with stage I−II patients, patients in stage III−IV had a higher Rad-score in the primary (median: 0.486 vs. −0.206) and validation cohorts (median: 0.361 vs. −0.491). The radiomics signature showed significant discrimination between stages I−II and III−IV. The AUC was 0.795 (95% CI: 0.714−0.875) in the primary cohort and 0.762 (95% CI: 0.600−0.924) in the validation cohort.
The performance of the radiomics signature in discriminating stage I−II from III−IV ESCC in the primary and validation cohorts is presented using ROC curves in Supplementary Figure S2A, and the Rad-score for each patient in the primary and validation cohorts is shown in Figure 3.
Figure 3 Bar charts of Rad-score for each patient in the primary cohort (A) and validation cohort (B). Red bars indicate the Rad-score of stage I−II ESCC, while light green bars indicate the Rad-score of stage III−IV ESCC. The blue dotted line shows the cut-off value (0.054) of the Rad-score; above the line indicates stage III−IV, below the line indicates stage I−II. Red bars above the blue dotted line or light green bars below the blue dotted line indicate misclassification.
The predictive performance of volume for discrimination of stage I−II vs. III−IV yielded an AUC of 0.694 (95% CI: 0.597−0.790) and 0.624 (95% CI: 0.427−0.821) in the primary and validation cohorts, respectively. The combination model, which added volume to the radiomics signature, yielded an AUC of 0.801 (95% CI: 0.722−0.880) in the primary cohort and 0.780 (95% CI: 0.628−0.932) in the validation cohort. The performance of the volume and combination models in discriminating stage I−II from III−IV ESCC in the primary and validation cohorts is presented using ROC curves in Supplementary Figure S2B.
The predictive performance of volume, the radiomics signature, and the combination model was compared using the DeLong test. As shown in Table 2, volume differed significantly from both the radiomics signature and the combined model in the primary and validation cohorts. Although there was no significant difference in predictive performance between the radiomics signature and the combined model, the AUC, sensitivity, specificity, and accuracy of the combined model were better than those of the radiomics signature in the validation cohort. In addition, the NRIs among models were analyzed (Table 3).
Our results show that in discriminating between stage I−II and III−IV ESCC, the radiomics approach is superior to volume. The radiomics signature model is better than volume (primary cohort: AUC, 0.795 vs. 0.694, P=0.003; validation cohort: AUC, 0.762 vs. 0.624, P=0.035); the combination model is better than volume (primary cohort: AUC, 0.801 vs. 0.694, P<0.011; validation cohort: AUC, 0.780 vs. 0.624, P=0.047). These results were further supported by the analysis of the NRI (radiomics signature vs. volume: primary cohort, NRI=0.424, Z-statistics=2.37, P=0.017; validation cohort, NRI=0.834, Z-statistics=2.92, P=0.004; and combination model vs. volume: primary cohort, NRI=0.829, Z-statistics=3.74, P<0.0001; validation cohort, NRI=0.921, Z-statistics=3.83, P=0.001).
In this study, we developed and validated a radiomics signature based on CT images for the preoperative individualized prediction of stage I−II or III−IV ESCC. Based on the significant difference in Rad-score, the radiomics signature successfully differentiated between stages I−II and III−IV of ESCC before treatment. Additionally, by comparing the volume, radiomics signature and combination model, our results showed that the radiomics approach has potential value in preoperative differentiation of stage I−II and III−IV ESCC.
In clinical practice, CT, magnetic resonance imaging (MRI), positron emission tomography (PET), and endoscopic ultrasound (EUS) each have their own advantages and disadvantages in the staging of EC. The current UK guidelines recommend combining these modalities to stage EC, as each imaging modality provides unique staging information (26,27). Consolidating the information provided by these modalities improves the probability of accurate staging. However, the use of these modalities is limited because they are expensive and time-consuming. CT is a low-cost, readily available, and noninvasive imaging modality. Currently, the primary role of CT is the detection of distant metastases in EC. Our study constructed a CT-based radiomics signature and demonstrated its potential to discriminate between stage I−II and III−IV ESCC.
Table 2 Predictive performance of discrimination of all models
Table 3 NRI comparisons of inter-models in primary and validation cohorts
To build the radiomics signature, we selected 10 potential predictors from 6,140 candidate radiomics features by both eliminating highly correlated features and applying Lasso logistic regression. Most of the radiomics features' regression coefficients were shrunk to zero by Lasso during model fitting, which makes the resulting model easier to interpret. In addition, Lasso not only selects potential predictors and combines them into the radiomics signature, but also helps to avoid overfitting (28). In the radiomics approach of this study, as in most multimarker analyses (29-31), we combined several individual radiomics features to discriminate stage I−II from III−IV ESCC, and achieved satisfactory discrimination (AUC: 0.795 in the primary cohort; AUC: 0.762 in the validation cohort). Compared with previous studies on tumor staging, the performance of our radiomics signature was better than that reported in lung (AUC: 0.762 vs. 0.64) (32) and colorectal cancers (AUC: 0.762 vs. 0.708) (33); the accuracy was higher than that reported in EC (accuracy <0.50) (34). Similarly, Dong et al. (35) used PET/CT images combined with texture features to stage ESCC and obtained a satisfactory result (AUC: 0.789). Liu et al. (36) applied one texture feature extracted from CT images to identify the overall stage (I−II and III−IV) of ESCC and achieved good performance (AUC: 0.778). Although their results are slightly superior to ours, the studies of Dong et al. (35) and Liu et al. (36) were limited by the lack of independent validation and by small sample sizes, and need further study before they can be applied in clinical practice. As a possible alternative for preoperative staging, we constructed and validated a radiomics signature that can be used as an independent predictor to discriminate between stage I−II and III−IV ESCC.
The significant differences in AUC and NRI between the volume and radiomics approaches indicate that the radiomics approach predicts stage better than volume does. It is not surprising that the radiomics approach yielded a satisfactory performance in tumor staging. Tumor volume is a morphological characteristic that only reflects the size of the tumor, and the information it provides on tumor heterogeneity is limited. In contrast, the radiomics approach can more fully reveal tumor heterogeneity and biological behavior by combining quantitative texture features and morphological features (8,9).
Some previous studies reported that the volume of ESCC could be used as a prognostic factor for radiotherapy and chemotherapy assessment, lymph node metastasis and tumor staging (15,37-39). However, in our study tumor volume showed poorer discrimination between early and advanced EC than the radiomics approach. Additionally, accuracy was not significantly improved when volume was combined with the radiomics signature, and there was no significant difference between the radiomics signature and the combined model. This may be due to two main reasons: 1) volume may be a predictive factor in univariate analysis, but it is no longer an effective predictor when combined with other potential predictors in multivariate analysis; 2) although contrast-enhanced CT images were used, CT cannot accurately reveal the length of EC and the depth of invasion, which can affect the accuracy of staging. Therefore, more predictive factors should be taken into account to reduce the bias caused by a single predictor.
There were several limitations in our study. First, we used thick-slice CT images rather than thin-slice images for the radiomics approach. Zhao et al. (40) found that thin-slice images reflected the texture features of tumors more fully than thick-slice images. For the measurement of tumor volumes, thin-slice images had less measurement variability. We will further study the effect of thin-slice CT images on the staging of ESCC and compare them with thick-slice images to confirm whether the thick-slice images are comparable. Second, all data in this study were derived from the same institution, and our findings lack multi-center validation. We will further investigate whether the findings are applicable to other institutions. Third, the tumor length and the depth of tumor wall invasion were not included in this study. Zeybek et al. (41) found that tumor length and invasion depth were independent prognostic factors in predicting survival and tumor staging for EC. In the future, we will investigate the effect of adding those factors to our model.
This study explored the radiomics approach based on CT images as a feasible method to identify stage I−II and III−IV ESCC before treatment. We constructed a multi-feature radiomics model from extracted radiomics features. This model showed satisfactory performance in the preoperative identification of stage I−II and III−IV ESCC. Therefore, as a noninvasive and quantitative method, the radiomics approach has the potential to guide individualized treatment decisions through preoperative staging.
All the authors have contributed significantly and have approved the manuscript. This work was supported by the National Key R&D Program of China (No. 2017YFC1309100), National Natural Science Foundation of China (No. 81771912), and Science and Technology Planning Project of Guangdong Province (No. 2017B020227012).
Conflicts of Interest: The authors have no conflicts of interest to declare.
Chinese Journal of Cancer Research, 2018, Issue 4