Miguel Jimenez Perez, Rocio Gonzalez Grande
Abstract Although artificial intelligence (AI) was initially developed many years ago, it has experienced spectacular advances over the last 10 years for application in the field of medicine, and is now used for diagnostic, therapeutic and prognostic purposes in almost all fields. Its application in the area of hepatology is especially relevant for the study of hepatocellular carcinoma (HCC), as this is a very common tumor, with particular radiological characteristics that allow its diagnosis without the need for a histological study. However, the interpretation and analysis of the resulting images is not always easy, in addition to which the images vary during the course of the disease, and prognosis and treatment response can be conditioned by multiple factors. The vast amount of data available lends itself to study and analysis by AI in its various branches, such as deep learning (DL) and machine learning (ML), which play a fundamental role in decision-making as well as in overcoming the constraints involved in human evaluation. ML is a form of AI based on automated learning from a set of previously provided data and on training algorithms to organize and recognize patterns. DL is a more extensive form of learning that attempts to simulate the working of the human brain, using much more data and more complex algorithms. This review specifies the type of AI used by the various authors. However, well-designed prospective studies are needed in order to avoid as far as possible any bias that may later affect the interpretability of the images and thereby limit the acceptance and application of these models in clinical practice. In addition, professionals now need to understand the true usefulness of these techniques, as well as their associated strengths and limitations.
Key Words: Artificial intelligence; Machine learning; Hepatocellular carcinoma; Diagnosis; Treatment; Prognosis
In the era of Big Data, the need for the efficient management of the great amount of information available has led to the development and application of artificial intelligence (AI) and its various techniques in the general field of medicine. Although the concept of AI arose in the 1950s[1], it was not until a few years ago that it really came to experience its breakthrough.
The term AI refers to those computer programs that try to reproduce human cognitive functions, like learning or problem solving. Machine learning (ML), initially developed as a branch of AI, analyzes data in order to create algorithms that can detect patterns of behavior from which predictive models can be established. Different ML techniques, such as support vector machines (SVM), artificial neural networks (ANNs) or classification and regression trees, have all been used in multiple studies in the field of medicine[2]. Technological advances over the last ten years have resulted in the appearance of deep learning (DL) as a new model of ML that develops multi-layered neural network algorithms, using such techniques as the convolutional neural network (CNN), a multilayer form of ANN that has proven of great use in the analysis of radiological images[3,4].
Although the application of AI in various fields of medicine has shown promising results, we should nevertheless be aware of its limitations. The retrospective design of many of these studies and the use of not particularly suitable databases, with their inherent bias, can affect the accuracy of AI. Thus, it is necessary to draw up prospective, well-designed, multicenter studies that are free from bias which could affect their interpretability and thereby limit their acceptance and application in clinical practice. And of course, we must not forget such other aspects as cost-effectiveness, regulations by the health authorities and ethical considerations.
AI has been used in the field of hepatology for the diagnosis, treatment and prognostic prediction of various disorders, though with special relevance in the study of hepatocellular carcinoma (HCC), as this is a very common tumor. The American Cancer Society estimated the number of new cases of liver and intrahepatic bile duct cancer during 2020 at 42810, with 30160 deaths[5]. HCC has particular radiological features that enable its diagnosis without the need for any histological study. Accordingly, the analysis of imaging tests takes on special relevance, as their interpretation is not always easy, in addition to which they vary over the course of the disease, as do the prognosis and response to treatment, which are all affected by multiple factors. This all results in a vast amount of data whose integration and efficient analysis lend themselves to study by AI. Indeed, several studies have recently been undertaken to aid in the decision-making process and overcome the limitations of human evaluation[4,6].
Figure 1 gives a schematic idea of how AI, with its variants, could be used to study a patient with HCC, whether diagnosed or suspected. AI is able to perform a combined analysis of radiological, clinical and histological data, producing information that can aid in diagnostic accuracy, tumor staging, and treatment planning using methods of segmentation and evaluation of the presence of microvascular invasion, in addition to giving a prognostic estimate.
Figure 1 Graphic presentation of the applications of artificial intelligence in the approach to hepatocellular carcinoma.
In the field of liver cancer, the use of AI techniques to aid traditional diagnostic techniques is promising. A CNN is a multilayer ANN interconnected in such a way that all input data traverse the various layers, in which the information is processed to produce output data. It can be considered an advanced form of DL with its own learning capacity. CNNs can increase the diagnostic yield of ultrasound studies, abdominal computed tomography, abdominal magnetic resonance imaging (MRI), positron emission tomography (PET) and histology.
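To make the idea of data traversing successive layers more concrete, the following is a minimal sketch of a single convolution, activation and pooling stage written in plain Python. It is an illustration only and is not taken from any of the studies reviewed here: the function names, the 5 x 5 toy image and the 2 x 2 edge-detecting kernel are all assumptions, and a real CNN would learn its kernels from training data and stack many such stages.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image ("valid" mode) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Non-linear activation: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Keep the strongest response in each non-overlapping 2 x 2 block."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


def forward(image, kernel):
    """One convolutional stage: convolution, then activation, then pooling."""
    return max_pool2x2(relu(conv2d(image, kernel)))


# Hypothetical toy "image": dark on the left, bright on the right.
toy_image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1], [-1, 1]]

features = forward(toy_image, edge_kernel)
```

In this toy case the pooled feature map is non-zero only in the region that contains the edge, which is the kind of local pattern that deeper layers would combine into more abstract features.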
HCC usually, though not always, develops in a cirrhotic liver. Accordingly, clinical practice guidelines recommend regular abdominal ultrasound in patients with hepatic cirrhosis; indeed, it is considered the method of choice for screening of space-occupying lesions. Ultrasound is therefore the main tool to evaluate liver disease and detect new lesions. However, image interpretation is not always easy and may present interobserver variability.
To assess the underlying disease, Bharti et al[7] proposed an ANN model to differentiate four stages of liver disease using data obtained from ultrasound images: normal liver, chronic liver disease, cirrhosis, and HCC. The classification accuracy of the model was 96.6%[7]. Liu et al[8] designed an algorithm to classify ultrasound images. They selected the liver capsule to determine the presence of cirrhosis, even in early stages when the usual findings reported by the radiologist, such as a nodular liver outline, enlarged porta or splenomegaly, are still not obvious. Using their analysis of the morphology of the liver capsule, they were able to determine the presence or absence of cirrhosis, with an area under the curve of 0.968[8].
The human yield when characterizing a liver lesion from ultrasound images is limited. Schmauch et al[9] designed a DL system able to detect and classify space-occupying lesions in the liver as benign or malignant. After supervised training using a database of 367 images together with the radiological reports, the resulting algorithm detected and characterized the lesions with a mean area under the receiver operating characteristic curve of 0.93 and 0.916, respectively. Although the system requires validation, it could increase the diagnostic yield of ultrasound and warn of possibly malignant lesions[9].
AI has also been used in contrast-enhanced ultrasound (C-US), improving its ability to identify characteristics suggestive of cancer. Guo et al[10] demonstrated that DL applied to the behavior of liver lesions seen on C-US in three phases (arterial, portal and late) increased the accuracy, sensitivity and specificity of the study[10].
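Because several of the studies discussed here report an area under the receiver operating characteristic curve, a short sketch of how that figure can be computed may be helpful. The version below uses the pairwise (Mann-Whitney) interpretation: the AUC is the probability that a randomly chosen malignant lesion receives a higher model score than a randomly chosen benign one. The function name, labels and scores are hypothetical and are not data from any cited study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from binary labels (1 = malignant, 0 = benign)
    and model scores, using the pairwise-comparison definition."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present to compute an AUC")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # malignant lesion correctly ranked higher
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical example: six lesions with model scores between 0 and 1.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
example_auc = roc_auc(example_labels, example_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which values such as 0.93 should be read.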
When a follow-up ultrasound shows a new liver lesion, other imaging studies are undertaken, mainly dynamic contrast-enhanced computed tomography (CT) or MRI, to obtain a more precise evaluation. The radiological behavior of liver lesions in dynamic CT or MRI studies is useful for characterization of the lesion. If a liver lesion > 1 cm fulfils certain radiological criteria that are almost pathognomonic for HCC, such as hyperenhancement in the arterial phase and washout in the portal or late phases in a cirrhotic patient, no further studies or histological confirmation are needed for its diagnosis. However, liver nodules often present an indeterminate behavior on CT and a biopsy of the lesion is required, as recommended in the European Association for the Study of the Liver guidelines[11]. This, though, requires assuming the risks involved in the procedure. The alternative is close follow-up, as indicated in the American Association for the Study of Liver Diseases guidelines[12], which involves a high number of studies and the possibility of not diagnosing a malignant lesion in time. Mokrane et al[13] undertook a retrospective analysis of 178 patients with cirrhosis and liver nodules in whom the Liver Imaging Reporting and Data System criteria were unable to distinguish the neoplastic from the non-neoplastic lesions, thereby necessitating a biopsy; 77% proved to be malignant on biopsy. Using DL techniques to classify nodules as HCC or non-HCC achieved an area under the curve (AUC) of 0.70. Another retrospective study, by Yasaka et al[14], analyzed the yield of an ANN, composed of three layers, classifying liver masses using contrast-enhanced CT into five categories: A, classic HCC; B, malignant tumors apart from HCC (cholangiocarcinoma, hepatocholangiocarcinoma or metastasis); C, indeterminate masses, dysplastic nodules or early HCC and benign masses other than cysts or hemangiomas; D, hemangiomas; E, cysts. After supervised training using over 55000 image sets, the authors obtained a high accuracy for the classification of liver lesions, especially for the differentiation between categories A-B and C-D.
Quantifying the tumor load may be useful, particularly for the detection of tumor recurrence in follow-up CT studies. As tumor relapses can be small and go unnoticed, Vivanti et al[15] described an automated method for detecting recurrence, based on the initial appearance of the tumor, its CT behavior, and the quantification of the tumor load at baseline and during the follow-up. The technique had a high rate of true positives in the identification of tumor recurrence, with an accuracy of 86%.
Liver segmentation is of great importance to assess liver lesions and for planning the ideal treatment. However, manual segmentation is made more difficult by the heterogeneity of the lesions or their diffuse borders. Li et al[16] proposed a CNN that can segment liver tumors based on CT images, with an accuracy of 82.67% ± 1.43%, better than that of traditional techniques, thereby favoring suitable treatment planning.
CNN applied to MRI has also been analyzed. Hamm et al[17] developed and validated a DL system based on a CNN that classifies MRI liver lesions, with an accuracy of 92%, a sensitivity of 92% and a specificity of 98%, and an average computation time of 5.6 ms.
Other studies have associated additional MRI sequences and risk factors plus the patient’s clinical data to apply an automated classification system cataloguing liver lesions as adenoma, cyst, hemangioma, HCC and metastasis, with a sensitivity and specificity of 0.8/0.78, 0.93/0.93, 0.84/0.82, 0.73/0.56 and 0.62/0.77, respectively[18]. Zhang et al[19] described a training model using MRI in 20 patients to classify liver tissues. Their results were promising, improving the yield of the reference models used.
Preis et al[20] evaluated the yield of fluorine-18 fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) using a neural network to analyze liver uptake of 18F together with patient and laboratory data. They achieved a high sensitivity and specificity to detect liver malignancy unidentified visually, showing that this technique could complement the radiologist in the interpretation of PET, though their main aim was to evaluate metastatic liver disease, where 18F-FDG PET/CT has greater applicability.
The histopathological classification of a liver lesion and the differentiation of the tumor type are crucial for treatment planning and a prognostic estimate of the disease, though this can sometimes prove challenging even for expert pathologists. Kiani et al[21] used AI as a support for the pathologists, focusing on the histological differentiation between HCC and cholangiocarcinoma. They analyzed prospectively the impact of this aid on the diagnostic yield of 11 pathologists, finding that it made no change to their mean accuracy.
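When a classifier assigns each lesion to one of several categories, sensitivity and specificity are usually reported per category by treating that category as "positive" and all others as "negative" (a one-vs-rest reading). The sketch below illustrates this calculation under that assumption; the function name and the eight example lesions are hypothetical and do not reproduce the data of the cited work.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """One-vs-rest sensitivity and specificity for a single lesion category."""
    tp = fn = tn = fp = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if truth == target:
            if prediction == target:
                tp += 1   # target lesion correctly identified
            else:
                fn += 1   # target lesion missed
        else:
            if prediction == target:
                fp += 1   # other lesion wrongly labelled as target
            else:
                tn += 1   # other lesion correctly not labelled as target
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical reference diagnoses and model outputs for eight lesions.
truth = ["HCC", "HCC", "HCC", "cyst", "cyst", "adenoma", "adenoma", "HCC"]
model = ["HCC", "HCC", "adenoma", "cyst", "HCC", "adenoma", "adenoma", "HCC"]

hcc_sens, hcc_spec = sensitivity_specificity(truth, model, "HCC")
```

Reporting the pair for each category, rather than a single overall accuracy, shows where a model is weak, for example a category with acceptable sensitivity but low specificity.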
Others have described how a deep CNN using previous histopathological images of HCC can make an automated diagnosis of HCC and distinguish healthy tissue from tumor tissue, in addition to identifying certain biological predictors[22]. Table 1 shows the main studies using AI techniques in the diagnosis of HCC.
The individual biological variability between patients in the behavior of HCC hinders evidence-based clinical assessment that is applicable to all patients. Accordingly, powerful, standardized risk stratification systems are needed in order to optimize treatment strategies and assess their effects. It is here that AI can play an important role in the therapeutic approach to HCC. Most studies on the use of AI in the treatment of HCC are aimed at the analysis of certain particular tumor characteristics, such as radiological, histological or genetic features, or a combination of the clinical data in order to predict the response to a particular treatment. This, in turn, will allow for the suitable selection of patients for particular treatment options.
In usual clinical practice the diagnosis and treatment assessment of HCC are commonly done with such imaging techniques as C-US, CT and MRI, after analyzing certain tumor features like vascularization or behavior after the administration of contrast material[23]. However, these characteristics are liable to subjectivity in their interpretation by the radiologist, in addition to which high-resolution dimensional images are lacking. A new technology, radiomics, has recently appeared in the area of radiology and cancer[24]. Although it is not yet in extensive use in clinical practice, it has nevertheless awoken great interest. This technology allows for the extraction of a great amount of quantifiable objective data contained in the radiological images and their later association with the underlying biological processes. Analysis of all these data with AI software can provide useful diagnostic and prognostic information with predictive accuracy[24,25].
Early tumor recurrence after surgical resection is associated with a poor prognosis. The preoperative identification of patients at high risk of recurrence is fundamental to avoid unnecessary treatment. Computer models have been developed that analyze certain tumor characteristics and aid in the preoperative prediction of the risk of recurrence or the evaluation of survival after resection.
Vascular microinvasion (VMI) has been established as an independent predictive factor of recurrence, associated with poor results after tumor resection[26]. Although the preoperative availability of information about VMI would be of much benefit, radiological techniques currently in clinical use do not provide for an adequate direct diagnosis.
Several studies have managed to elaborate radiomic signatures that enable prediction of the preoperative status of VMI, based on contrast-enhanced CT[27,28] or MRI[29]. However, these techniques involve radiological exposure, and are laborious to perform and costly. Recently, Dong et al[30] published a study using radiomic algorithms based on grayscale ultrasound images to elaborate radiomic signatures with the potential to aid in the prediction of VMI, with promising results. Ji et al[31] created predictive models for recurrence after surgical resection using radiomic techniques to analyze contrast-enhanced CT images, with a C-index of 0.633-0.699. In conjunction with the inclusion of clinical data, the model can be used to establish a personalized risk stratification facilitating the individual management of HCC.
Survival after surgical resection has also been assessed in several studies using ML techniques[32-34], and more recently with more advanced DL models based on digitalized histological images of the tumor. Saillard et al[35] drew up a predictive model of survival after resection, attaining a C-index for survival prediction of 0.78. A recent prospective study by Schoenberg et al[36] involving 180 patients also led to a predictive model based on the analysis of 26 preoperative routine clinical variables, obtaining a predictive value of 0.78.
Transcatheter arterial chemoembolization (TACE) is the treatment of choice for intermediate stage B HCC in the Barcelona Clinic Liver Cancer (BCLC) classification[37]. Adequate selection of patients who might benefit from this treatment is vital in order to avoid unnecessary procedures that can sometimes have undesirable secondary effects for the patient and generate wasted costs for the health system. Studies have been developed based on AI techniques to attempt to predict the response to treatment with TACE and aid adequate patient selection. Most of these studies are based on imaging analysis, though some have used genomic signatures. Morshid et al[38] elaborated a fully automated ML algorithm using the combination of quantitative characteristics of CT images plus pretreatment patient clinical data to predict the response to TACE. They achieved a prediction accuracy rate of 74.2% using a combination of the BCLC stage plus quantitative image features, compared with using the BCLC stage alone. Peng et al[39] validated a DL model to predict the response to TACE using CT images from a total of 789 patients in three different hospitals. They obtained an accuracy of 84% and an AUC of 0.97 to predict complete response. Liu et al[40] constructed and validated a DL model (DL radiomics-based C-US model) based instead on the quantitative analysis of C-US cine recordings. It was highly reproducible and had an AUC of 0.93 (95%CI: 0.80-0.98) to predict the response to TACE.
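The C-index quoted for recurrence and survival models measures how well predicted risk orders patients by their observed outcome. As a hedged illustration, the sketch below implements a basic form of Harrell's concordance index that handles censored follow-up; the function name and the four-patient example are hypothetical, and published studies may use refined variants (for example, different handling of tied times).

```python
def concordance_index(times, events, risk_scores):
    """Basic Harrell's C-index.

    times       : follow-up time for each patient
    events      : 1 if recurrence/death was observed, 0 if censored
    risk_scores : model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event, higher risk: agrees
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (months), event indicators and predicted risks.
follow_up = [2, 4, 6, 8]
observed = [1, 1, 0, 1]
predicted_risk = [0.9, 0.4, 0.5, 0.1]

c_index = concordance_index(follow_up, observed, predicted_risk)
```

As with the AUC, 0.5 indicates ordering no better than chance and 1.0 perfect ordering, so values of 0.63 to 0.78 represent moderate to good discrimination.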
Note: AI: Artificial intelligence; HCC: Hepatocellular carcinoma; CNN: Convolutional neural network; ML: Machine learning; DL: Deep learning; C-US: Contrast-enhanced ultrasound; CT: Computed tomography; MRI: Magnetic resonance imaging; PET: Positron emission tomography.
Other studies have used ML techniques combining MRI with clinical data to predict the response to TACE. Abajian et al[41] studied 36 patients who underwent MRI before TACE. They developed a predictive model of response with an accuracy of 78%, sensitivity of 62.5% and specificity of 82%.
The efficacy of TACE has also been examined by survival analysis of the patients after its application. Mähringer-Kunz et al[42] developed a prediction model of survival after TACE by constructing an ANN, using all the parameters of the main conventional prediction scores (ART[43], ABCR[44] and SNACOR[45]). They predicted one-year survival with an AUC of 0.77, sensitivity of 78% and specificity of 81%, results that were better than those of the conventional scores mentioned.
Although most studies evaluating the use of AI to assess TACE have used radiomics, some have also evaluated the prediction of response to TACE using genetic analysis. Ziv et al[46] studied genetic mutations using SVM techniques to predict the tumor response after TACE, though it was a retrospective study with a low number of cases.
Radiofrequency ablation (RFA), a therapy aimed at curing HCC in early stages[37], has also been evaluated. Liang et al[47] drew up a predictive model of HCC recurrence based on SVM. They studied 83 patients with HCC who underwent RFA, obtaining an AUC of 0.69, sensitivity of 67% and specificity of 86%, with which they were able to identify patients at a high risk of recurrence. Table 2 summarizes the studies using AI techniques in the treatment of HCC.
Table 2 Studies applying artificial intelligence for the treatment of hepatocellular carcinoma
Note: All studies were retrospective in design, except the study by Liang et al[47], which was prospective. AI: Artificial intelligence; CNN: Convolutional neural network; ML: Machine learning; DL: Deep learning; VMI: Vascular microinvasion; C-US: Contrast-enhanced ultrasound; AUC: Area under the curve; Acc: Accuracy; Sen: Sensitivity; Spe: Specificity; CT: Computed tomography; C-CT: Contrast-enhanced CT; MRI: Magnetic resonance imaging; C-MRI: Contrast-enhanced MRI; TACE: Transcatheter arterial chemoembolization; RFA: Radiofrequency ablation.
The prediction of overall survival of HCC, apart from the application of any therapy, has also been assessed using AI techniques. Current evidence on the relation between abnormalities in DNA methylation and HCC[48-50] was the basis for the study by Dong et al[51]. These authors used ML techniques (SVM) to analyze DNA methylation data from 377 HCC samples, and constructed three risk categories to predict overall survival, with a mean 10-fold cross-validation score of 0.95.
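For readers unfamiliar with the term, a 10-fold cross-validation score is obtained by splitting the samples into ten parts, training on nine and scoring on the held-out part, repeating this ten times and averaging. The following is a minimal sketch of that procedure only; it does not reproduce the SVM or the methylation data of the cited study. The helper names are hypothetical, `train_and_score` stands in for whatever model is being evaluated, and only the sample count of 377 is taken from the text.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_mean(n_samples, train_and_score, k=10, seed=0):
    """Average the score obtained when each fold is held out in turn.

    train_and_score(train_idx, test_idx) is a caller-supplied function
    (an assumption of this sketch) that fits a model on the training
    indices and returns its score on the held-out indices.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for held_out in folds:
        held_out_set = set(held_out)
        train_idx = [i for fold in folds for i in fold
                     if i not in held_out_set]
        scores.append(train_and_score(train_idx, held_out))
    return sum(scores) / len(scores)


# 377 samples, as in the study described above; each fold holds 37 or 38.
folds_377 = k_fold_indices(377, k=10)
```

Because every sample is used for testing exactly once, the averaged score is less dependent on one particular train/test split than a single hold-out estimate, although it does not replace validation on an independent cohort.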
Larger studies are needed comparing the yield of medical professionals with the support of AI vs other professionals without such support in order to demonstrate its benefit as an aid in medicine. In particular, to assess liver masses and study HCC, these trials should focus on aspects related to treatment and prognosis, such as the characterization of hepatic lesions catalogued as indeterminate, the presence of vascular invasion and the response to percutaneous therapy. Another important aspect is the use of AI in the analysis of the behavior of HCC in cirrhotic and non-cirrhotic patients, as well as the differentiation of primary and metastatic liver lesions[52] and especially the differential diagnosis with cholangiocarcinoma, which can be complicated with currently available techniques and whose management and prognosis are completely different from those of HCC. At the same time, it is also necessary to start training health-care professionals to be prepared for the future incorporation of AI in daily practice in the field of liver cancer.
The incorporation of AI technologies in medicine has represented one of the most relevant advances in recent years. Its use will doubtless continue to increase, due to its usefulness in the processing and analysis of the enormous amount of data currently available. Nevertheless, we should be aware that certain constraints still exist that can limit its acceptance and applicability in clinical practice. Health care professionals must learn the true usefulness of AI and accept that it must coexist with indispensable human evaluation, recognizing that AI is here to support human intelligence, never to replace it. Despite the great progress represented by AI, it is nevertheless vital to guarantee that medical protocols remain rigorously transparent.
World Journal of Gastroenterology, 2020, Issue 37