Risk prediction platform for pancreatic fistula after pancreatoduodenectomy using artificial intelligence

2020-11-30 06:55
World Journal of Gastroenterology, 2020, Issue 30

In Woong Han, Kyeongwon Cho, Youngju Ryu, Sang Hyun Shin, Jin Seok Heo, Dong Wook Choi, Myung Jin Chung, Oh Chul Kwon, Baek Hwan Cho

Abstract

Key words: Postoperative pancreatic fistula; Pancreatoduodenectomy; Neural networks; Recursive feature elimination

INTRODUCTION

Despite advancements in surgical technique and operative management, postoperative pancreatic fistula (POPF) is still widely considered to be the greatest contributor to major morbidity and mortality after pancreatoduodenectomy (PD), with an incidence of 10%-30%[1-5]. Furthermore, it frequently delays the timely delivery of adjuvant therapies, and reduces overall patient survival[6]. The indications of PD have been widening, and the procedure is offered to an increasing number of elderly patients with multiple comorbidities[7,8], prompting the need to accurately define which patients are fit for PD and could tolerate a potentially life-threatening POPF.

Recently, the management of POPF has undergone a paradigm shift from a standardized, uniform approach that could not reflect an individual's characteristics to a proactive mitigation strategy. The new strategy uses various predictive systems to enable early prediction and prevention and to optimize individual treatment decisions[6,9,10]. Previous predictive systems[6,9,10] reflect POPF incidence to some degree and have the merit of simplicity, but their predictive accuracy has been questioned[11,12]. Further research is therefore needed to predict POPF more accurately.

Machine learning (ML) is an artificial intelligence (AI) technology that has been adopted in many areas of modern society, including medical science. In ML, computational models composed of multiple processing layers learn various data representations with multiple levels of abstraction[13]. ML is currently being used not only in surgery[5,14,15] but also in other areas, such as pharmacogenomics, image classification, and medical decision support systems[16-20]. Therefore, in this study we aimed to develop a new risk prediction platform for POPF after PD using ML algorithms. We expected that a patient’s predicted POPF risk from such a platform could direct their clinical management and prevent or mitigate untoward outcomes.

MATERIALS AND METHODS

Patient selection

Under institutional review board approval (No. SMC 2017-01-017), we retrospectively collected clinicopathological variables for 1846 patients who underwent PD to treat various periampullary tumors at Samsung Medical Center (Seoul, Republic of Korea) between January 2007 and December 2016. Among them, we excluded 77 (4.2%) patients who had metastasis from sites other than the primary tumor origin, had direct invasion from the primary tumor into adjacent organs, underwent surgery for a recurrence, or lacked medical information about POPF. We analyzed the remaining 1769 patients (1079 men and 690 women).

Definition and Selection of pre- and intraoperative input variables

Data on preoperative, intraoperative, and postoperative outcomes were collected and maintained on a web-based database (MDB, Seoul, Korea). We originally analyzed 38 preoperative and intraoperative variables that could be associated with POPF. Preoperative laboratory data, such as serum C-reactive protein, amylase, lipase, and carbohydrate antigen (CA) 19-9 levels measured just before the operation, were entered into the algorithms. Co-existing pancreatitis was defined as classic features of pancreatitis on a preoperative computed tomography (CT) scan or in intraoperative findings. Underlying heart diseases included hypertension on medication, coronary or valvular heart disease, or various arrhythmic diseases. The location of tumors and the pancreatic duct (pduct) diameter were determined or measured by preoperative CT scan. Total intraoperative fluid infusion consisted of the total amount of intravenous crystalloid, colloid, volume expanders, and transfused blood. The pancreatic texture was determined as soft or hard by the surgeon during the operation.

Some of the continuous variables had missing data, which we filled by median imputation[19], a common approach for dealing with missing values in ML algorithms. None of the categorical variables had missing values. We used one-hot encoding for categorical variables in which exactly one category is assigned to each patient: for example, a categorical variable with three mutually exclusive choices was transformed into three individual binary variables. When the choices within a categorical variable were not exclusive (e.g., “A”, “B”, “both A and B”, and “none”), binary encoding was used by splitting each choice into a separate column and converting it to a binary code[21,22]. As a result, 44 encoded variables were input into the ML models (Table 1).
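The preprocessing described above can be illustrated with a minimal sketch. It is not the authors' code: the column names, category labels, and values are hypothetical, and pandas is assumed only for convenience.

```python
import numpy as np
import pandas as pd

# Hypothetical toy records; names and values are illustrative, not study data.
records = pd.DataFrame({
    "lipase": [35.0, np.nan, 50.0, 41.0],  # continuous, one missing value
    "tumor_location": ["pancreas", "bile_duct", "ampulla", "pancreas"],  # exclusive
    "biliary_drainage": ["endoscopic", "both", "none", "percutaneous"],  # non-exclusive
})

def encode(df: pd.DataFrame) -> pd.DataFrame:
    out = pd.DataFrame(index=df.index)
    # Median imputation: replace missing values with the column median.
    out["lipase"] = df["lipase"].fillna(df["lipase"].median())
    # Binary encoding of non-exclusive choices: "both" sets two columns to 1,
    # "none" leaves both at 0.
    out["drain_endoscopic"] = df["biliary_drainage"].isin(["endoscopic", "both"]).astype(int)
    out["drain_percutaneous"] = df["biliary_drainage"].isin(["percutaneous", "both"]).astype(int)
    # One-hot encoding: one binary column per mutually exclusive category.
    onehot = pd.get_dummies(df["tumor_location"], prefix="loc").astype(int)
    return pd.concat([out, onehot], axis=1)

encoded = encode(records)
print(encoded)
```

In this toy example one three-level variable becomes three columns and one four-level non-exclusive variable becomes two, which is how an encoding of this kind can turn 38 original variables into a larger set of model inputs.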

Surgical techniques and perioperative management

In cases of cholangitis or jaundice, preoperative endoscopic or percutaneous biliary drainage was performed. After the introduction of a definition of borderline resectable pancreatic cancer, 24 (1.4%) patients received neoadjuvant treatment using various regimens. All surgical procedures were performed by five experienced pancreatic surgeons at Samsung Medical Center who performed more than 50 PDs annually. To create the pancreatic anastomosis, 1761 (99.5%) patients underwent pancreaticojejunostomy (PJ) and 8 (0.5%) patients underwent pancreaticogastrostomy. Pancreatoenteric anastomosis with stents was conducted in 1185 patients (70.0%) (Table 1). At the end of each surgical procedure, two or three drains were placed adjacent to the PJ anastomosis and on the right side of the superior mesenteric arterial resection margin. Serum and drain fluid amylase levels were routinely measured on postoperative days 1-3 and 5, 6, or 7, if the drains were maintained. In this study, we used the definition of POPF from the 2016 update of the International Study Group of Pancreatic Surgery (ISGPS) definition and grading of POPF[23]. As a result, grade A fistula was excluded from the POPF classification in this study. Drains adjacent to the PJ anastomosis were removed if no evidence of a leak was found in an abdominal CT scan on postoperative days 5-7. Patients who experienced POPF received proper management, including conservative, interventional, or surgical treatment, depending on each patient’s clinical condition.

Table 1 Clinicopathologic variables included in the machine learning algorithms

Machine learning algorithms as artificial intelligence

Two ML algorithms, random forest (RF) and neural network (NN), were used to predict POPF. RF is an ensemble learning algorithm that builds multiple decision trees and aims for better performance by taking the mode or mean of the individual trees' predictions[24]. An NN is an ML algorithm that emulates the synaptic structure of the brain[13]. It contains one or more hidden layers between the input and output layers[13]. Recursive feature elimination (RFE), a feature selection method that repeatedly removes the weakest features until the maximum area under the curve (AUC) is reached[25,26], was used to identify the subset of features used in the final NN model. At each RFE step, we tuned the hyperparameters of the NN (such as the number of hidden layers, number of nodes, learning rate, batch size, and dropout rate) by grid search to maximize performance[27].

These AI-driven POPF prediction algorithms were developed using MATLAB Release 2017b and Python software with the TensorFlow library.
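As a rough illustration of RFE driven by AUC, the sketch below uses scikit-learn's `RFECV` on synthetic data. Several things here are assumptions rather than the study's method: the data are generated, scikit-learn stands in for the MATLAB/TensorFlow tooling, and a logistic regression replaces the NN because `RFECV` needs an estimator that exposes coefficients for ranking features. How the study ranked features inside the NN, and its grid search at each step, are not reproduced.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data: 12 candidate features, 4 of them informative,
# with an imbalanced outcome (hypothetical, not the study cohort).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=4, n_redundant=0,
    weights=[0.85, 0.15], random_state=0,
)

# Logistic regression is a stand-in estimator; its coefficients rank features.
estimator = LogisticRegression(max_iter=1000)

# Remove one feature per step and keep the subset with the best
# cross-validated AUC.
selector = RFECV(
    estimator,
    step=1,
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
selector.fit(X, y)

print("features kept:", selector.n_features_)
print("selected mask:", selector.support_)
```

The attribute `ranking_` on the fitted selector gives the elimination order, which is the kind of information plotted per RFE step in Figure 1.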

Data analysis and statistical methods

The characteristics of the study population were described for each dataset, including the mean and standard deviation for each variable. For the development of the ML algorithms, the total dataset was split into a training set and a test set. The training set was used to derive the POPF prediction algorithms, and the test set was used to evaluate the derived algorithms. To evaluate our ML approaches, we used stratified 5-fold cross-validation. This randomly divides all the data into 5 partitions (folds), keeping a similar distribution of positive and negative cases in each. A model is then trained on four of the partitions and tested on the remaining fold. By rotating the folds used for training and testing, this process is performed 5 times. The whole cross-validation was also repeated 10 times with different random splits of the dataset, and the performance of the models was evaluated at the end. These processes give a more reliable estimate of a model's generalized performance and guard against overfitting to a particular sample split. Because the outputs from the ML-driven POPF prediction algorithms are probabilistic estimates of risk, the performance for the test data was evaluated using AUC. The clinical meaning of the AI-driven risk factors for POPF was identified using a sliding window approach[28]. All statistical and mechanical analyses assessing algorithm performance were done by Cho K and Cho BH from the Medical AI Research Center, Samsung Medical Center, using GraphPad Prism version 5.00 for Windows (GraphPad Software, San Diego, CA) and MATLAB Release 2017b (MathWorks, Inc., Natick, MA) software.
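The evaluation scheme described above (stratified 5-fold cross-validation, repeated 10 times, scored by AUC) can be sketched as follows. This is an illustration under stated assumptions: the data are synthetic, scikit-learn is used in place of the study's software, and the random forest settings are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic stand-in data with roughly 12.5% positive cases (hypothetical).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4,
    weights=[0.875, 0.125], random_state=0,
)

# 5 stratified folds, repeated 10 times with different random splits.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)

# Arbitrary model settings, chosen only to keep the example fast.
model = RandomForestClassifier(n_estimators=25, random_state=0)

# One AUC per held-out fold: 5 folds x 10 repeats = 50 scores.
auc_scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
mean_auc = auc_scores.mean()

print(f"{len(auc_scores)} folds, mean AUC = {mean_auc:.3f}")
```

Averaging over all 50 held-out folds corresponds to the "5-fold average AUC with 10 repetitions" reported in the Results.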

RESULTS

Clinicopathological characteristics and outcomes

Table 1 provides the clinicopathologic details of the 1769 patients. Among them, grade B or C POPF occurred in 221 (12.5%) patients according to the ISGPS 2016 definition, and 130 (7.3%) patients had an American Society of Anesthesiologists’ (ASA) score ≥ 3. The mean body mass index (BMI) was 22.5 kg/m², and the mean albumin level was 4.0 g/dL. The mean operating time was 443.2 min. The mean total fluid input and estimated blood loss during the operation were 3129.5 and 962.4 mL, respectively. A soft pancreas was observed in 750 (42.4%) patients. The mean diameter of the pancreatic duct was 4.2 mm. The most common tumor location was the pancreas, which occurred in 568 (32.1%) patients. Presumed pancreatitis was observed in 370 (20.9%) patients (Table 1). Thirty-day postoperative mortality occurred in 23 (1.3%) patients.

Development of machine learning models using random forest and neural network

Table 2 summarizes the results from each algorithm using three different configurations of the dataset. First, the data with complete values for the 38 original variables were input into the two ML algorithms. The average AUCs over the 5-fold cross-validation with 10 repetitions were 0.67 with the RF and 0.74 with the NN. When complete data for 34 original variables (without serum C-reactive protein, amylase, lipase, and CA 19-9 levels) were input into the algorithms, the 5-fold average AUCs were 0.67 with the RF and 0.72 with the NN. For the configuration with missing-value treatment, we input data from all 1769 patients into the ML algorithms. The 5-fold average AUCs were then 0.68 with the RF and 0.71 with the NN. All of these AUCs are summarized in Table 2.

Machine learning models using neural network with recursive feature elimination

Using all 1769 data samples after missing data treatment, we could further improve the AUC from 0.71 to 0.74 using the NN with the RFE method (Table 2). Sixteen risk factors for POPF were identified using the NN with the RFE method: pancreatic duct diameter, BMI, preoperative serum albumin, lipase level, amount of intraoperative fluid infusion, age, platelet count, extrapancreatic location of tumor, combined venous resection, coexisting pancreatitis, neoadjuvant radiotherapy, ASA score, sex, soft texture of the pancreas, underlying heart disease, and preoperative endoscopic biliary decompression (Figure 1). The post hoc analysis revealed nonlinear relationships by showing the response of the NN model to each input variable at every RFE step. Ten discrete points covered the observed range of variation for each corresponding variable. We found several patterns of NN output response, in which the predicted POPF risk seemed to have a positive, negative, or biphasic relationship with each variable. The contribution profiles of the top 16 variables are shown in Supplement Figure 1. Based on these multiple and complex relationships among the risk factors for POPF after PD, we made a network connections illustration to improve understanding (Figure 2).
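A response profile of the kind described above can be sketched by sweeping one input across ten points of its observed range and recording the model output. The sketch makes assumptions the text does not specify: the other inputs are held at their medians, and `toy_risk` is a hypothetical stand-in for a trained model, built so that one variable has a negative effect and the other a biphasic one.

```python
import numpy as np

def response_profile(predict_risk, X, feature, n_points=10):
    """Sweep one feature over its observed range; hold the rest at their medians."""
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), n_points)
    probe = np.tile(np.median(X, axis=0), (n_points, 1))
    probe[:, feature] = grid
    return grid, predict_risk(probe)

def toy_risk(X):
    # Hypothetical model: risk falls with column 0 and is U-shaped in column 1.
    z = -0.8 * X[:, 0] + 0.5 * X[:, 1] ** 2 - 1.0
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))  # synthetic inputs, not study data

grid0, risk0 = response_profile(toy_risk, X, feature=0)  # negative relationship
grid1, risk1 = response_profile(toy_risk, X, feature=1)  # biphasic relationship

for x, r in zip(grid1, risk1):
    print(f"x = {x:+.2f}  predicted risk = {r:.3f}")
```

Because the other inputs are fixed, such a profile shows only how the model output moves along one axis; it does not by itself establish a direct relationship between that variable and POPF.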

Establishment of risk prediction platform for postoperative pancreatic fistula using artificial intelligence

The NN algorithm using RFE, which had the best performance across the metrics of discrimination, calibration, and overall performance, was integrated into an interactive interface. We designed our clinical decision tool to collect values entered by a clinician, feed those values into the pre-trained algorithm, retrieve the result, and output that result to the clinician in real time. This POPF prediction platform is available as an open-access, web-based application programmed to be accessible and adaptable for use on desktops, tablets, and smartphones. It is freely available at https://popfrisk.smchbp.org/.

Table 2 Prediction performance of the various dataset for postoperative pancreatic fistula

Figure 1 Performance of the neural network models optimized within each recursive feature elimination step. 1: Pancreatic duct diameter; 2: Body mass index; 3: Serum albumin; 4: Amount of intraoperative fluid infusion; 5: Age; 6: Platelet count; 7: Extrapancreatic location of tumor; 8: Combined venous resection; 9: Co-existing pancreatitis; 10: Serum lipase; 11: Neoadjuvant radiotherapy; 12: ASA score; 13: Sex; 14: Soft texture of pancreas; 15: Underlying heart disease; 16: Preoperative endoscopic biliary decompression; 17: Hemoglobin; 18: Serum total bilirubin; 19: Operative time; 20: Intraoperative transfusion; 21: Neoadjuvant chemotherapy; 22: Anastomotic methods (1); 23: Serum amylase; 24: Anastomotic methods (2-1); 25: Pancreatic duct stent (1); 26: White blood cell count; 27: Type of surgery (1); 28: Serum carbohydrate antigen 19-9; 29: Serum C-reactive protein; 30: Estimated blood loss; 31: Combined vascular resection; 32: Pancreatic duct stent (2); 33: Preoperative percutaneous biliary drainage; 34: Underlying cerebrovascular disease; 35: Combined organ resection; 36: Type of surgery (2); 37: Type of surgery (3); 38: Anastomotic methods (2-2); 39: Underlying liver disease; 40: Underlying chronic kidney disease; 41: Underlying pulmonary disease; 42: Underlying cerebrovascular disease; 43: Diabetes mellitus; 44: Preoperative endoscopic pancreatic drainage; ASA: American Society of Anesthesiologists; AUC: Area under the curve.

DISCUSSION

POPF is a serious inherent risk of pancreatic resection. The best option for managing POPF is undoubtedly prevention using a preoperative and intraoperative POPF risk assessment that guides response measures postoperatively[6,9,10]. Theoretically, ML could offer an opportunity to improve the accuracy of risk assessment by exploiting the complex interactions among risk factors that affect POPF. To the best of our knowledge, this study presents the first ML algorithm for predicting POPF using multiple pre- and intraoperative variables derived from a large, single-institutional dataset. The maximum AUC was 0.74, obtained by the NN with the RFE method (Figure 1). Ultimately, a patient’s predicted POPF risk could direct their clinical management and prevent or mitigate untoward outcomes.

Figure 2 Illustration of artificial intelligence algorithm for 16 risk factors affecting postoperative pancreatic fistula. PV-SMV: Portal vein-superior mesenteric vein; ASA: American Society of Anesthesiologists; ERBD: Endoscopic retrograde biliary drainage; ENBD: Endoscopic nasobiliary drainage; POPF: Postoperative pancreatic fistula.

AI in the form of ML discovers intricate structures in large datasets by using a backpropagation algorithm to indicate how a machine should change the internal parameters it uses to compute the representation in each layer based on the representation in the previous layer[13]. ML can identify latent variables that are unlikely to be observed but might be inferred from other variables[18,19,24]. For example, the NN with RFE algorithm found many pre- and perioperative predictors for POPF that we used in our final modeling (Figure 1). Previously developed risk assessment models[6,9,10] implicitly assume that the risk factors are related to POPF in a linear fashion. Those models could thus oversimplify complex, nonlinear relationships among many risk factors. Even if it might be cumbersome to calculate the risk of POPF using 16 variables in actual clinical care, our AI-driven risk platform better incorporates multiple risk factors and can account for more nuanced relationships between the risk factors and POPF (Figure 2). An example of nonlinearity is shown in Supplement Figure 1; note that the effect of each variable differs across its range.

These ML algorithms found 16 risk factors for POPF (Figure 1). These risk factors can be categorized into 3 groups: the technically demanding group, the intraoperative volume status–related group, and the poor general condition group (Figure 2). The risk factors in the technically demanding group (soft pancreas[2,5,6,9], small pancreatic duct[6,9,29], extrapancreatic lesion[6], absence of preoperative pancreatitis or low lipase level[30], absence of preoperative endoscopic biliary decompression, absence of neoadjuvant radiotherapy, and high BMI[7]) indicate potential difficulty in reconstructing the pancreatic-enteric anastomosis, which could cause POPF. Patients with pancreatic cancer, chronic pancreatitis, or neoadjuvant treatment have increased pancreatic fibrosis and a lower incidence of POPF than other PD patients[30,31]. Also, it is well known that preoperative endoscopic biliary drainage is frequently associated with procedure-related pancreatitis[32]. As a result, those procedures might reduce the risk of POPF. The risk factors in the intraoperative volume status–related group (large intraoperative fluid administration, concomitant portal vein-superior mesenteric vein resection, and low platelet count) could cause ischemia and poor healing of the pancreatic-enteric anastomosis, which is compounded by tissue edema from aggressive volume replacement in a rebound fashion[6,33]. The resultant swelling of the anastomosis can cause duct occlusion or suture disruption. The risk factors in the poor general condition group (old age, underlying heart disease, low preoperative serum albumin level, and low ASA score) could be related to poor nutritional status, which is considered to correlate with a high risk of POPF[34,35].

Notably, a high probability of POPF in patients characterized by only 1 or 2 classical fistula risk factors could not be determined. In this study, we found 16 risk factors by using AI algorithms (Figure 1), but controversy remains about the true risk factors for POPF. The varying results from different studies could be influenced by study design, the composition of the patient populations, or statistical methods. For example, there is still debate about whether intraoperative volume status, such as intraoperative blood loss, transfusion, or the amount of fluid administration, is a risk factor for POPF[6,9,33,36]. Some reports suggest intraoperative volume status as an independent risk factor for POPF[6,33] because of pancreatic parenchymal and intestinal edema from aggressive volume replacement, but other studies have denied its adverse effect because estimation of blood loss during surgery is unreliable and inaccurate[3,9]. We think this discrepancy about the prognostic value of different risk factors for POPF could be a fundamental interpretation error caused by the assumption of linearity and an attempt to simplify what is not actually simple. Therefore, ML algorithms such as those used in this study could be an important tool for POPF risk assessments.

Other recently proposed risk prediction models[6,9,10] have the advantage of being easily performed because they use only 3–6 variables. However, previously unknown risk factors for POPF are still being newly identified. For example, preoperative sarcopenia, an age-related decrease in muscle mass, has been identified as a risk factor for POPF[37-39]. Existing models cannot reflect new factors for POPF as they emerge but must be re-analyzed and developed from scratch. To make matters worse, as the number of potential risk factors increases, the complexity of the conventional models can cause over-fitting, yielding implausible results. We addressed that possibility by using active and appropriate choices in pre-training, hyper-parameter selection, and regularization in our AI-driven algorithms[19]. Because AI is scalable, there is no need to develop a new model; even if many new variables affecting POPF are introduced, it is possible to just continue adding them to the original model. As the amount of pancreatectomy data continues to grow, the creation and deployment of learning systems as accessible tools could significantly enhance the prognosis and management of POPF. New learning algorithms and architectures that are currently being developed, such as convolutional[40] or recurrent[41] NNs, will accelerate this progress.

This AI-driven risk prediction platform for POPF could assist the drive toward personalized medicine by better tailoring risk management to individual patients. For example, after a risk evaluation, high-risk patients could be selected for a multiple-drain strategy and postoperative prophylactic octreotide use. In this way, we expect our platform to help select patients who need more intense therapy and establish effective (and cost-effective) treatment strategies for POPF. Various mitigation strategies have been proposed to reduce the occurrence and morbidity of POPF, including technical variations such as pancreaticogastrostomy reconstruction[2,42], dunking/invaginating anastomosis[1,43,44], absorbable mesh patches[45,46], and the use of intraperitoneal drains[29], anastomotic stents[47], and prophylactic somatostatin analogues[4,48,49]. As a part of those efforts, we have an ongoing trial of this risk score wherein we are applying a somatostatin analogue during postoperative days 0–3 in high-risk patients. Future prospective studies could stratify treatments based on the outcome of this platform and provide comprehensive treatment algorithms.

This study has several limitations. First, co-existing pancreatitis, which is bound to be a subjective finding, was defined as classic features of pancreatitis on a preoperative CT scan or in intraoperative findings. Also, the input data used to develop the risk prediction platform were pre- and intraoperative variables. Entering both pre- and intraoperative variables into the algorithms improves the predictability of POPF, but a model that uses only preoperative variables may be more helpful clinically. Therefore, we plan to conduct modeling with only preoperative variables using multicenter data. Second, the NN’s output response to changes in each input variable only partially revealed the variables’ nonlinear relationships to POPF risk. Therefore, the pattern of the output response should not be understood as a direct relationship between an input variable and POPF risk. Nevertheless, this process could help inform further explorations of diverse predictive risk factors and the future development of new risk prediction approaches and algorithms. Finally, although the study drew on a large institution, it was conducted on patients at a single center and lacked external validation. A follow-up study will therefore perform external validation on patients from multiple institutions.

In conclusion, ML algorithms are promising tools for the prediction of POPF that can be integrated into clinically useful decision tools. Compared with established POPF risk prediction methods, our ML algorithms predict POPF risk more accurately. After external validation, this new platform could be used to select patients who need more intense therapy and to preoperatively establish an effective treatment strategy.

ARTICLE HIGHLIGHTS

ACKNOWLEDGEMENTS

We thank the study research nurse, Hyemin Kim, for her tremendous work in the acquisition of data, and we thank Sang Eun Lee for dedicated support with medical illustration at Samsung Medical Information & Media Services at Samsung Medical Center.