Comparison of four non-alcoholic fatty liver disease detection scores in a Caucasian population

2020-06-28 06:37
World Journal of Hepatology, 2020, Issue 4

Lars Lind, Lars Johansson, Håkan Ahlström, Jan W Eriksson, Anders Larsson, Ulf Risérus, Joel Kullberg, Jan Oscarsson

Lars Lind, Jan W Eriksson, Anders Larsson, Department of Medical Sciences, Uppsala University, Uppsala 75185, Sweden

Lars Johansson, Håkan Ahlström, Joel Kullberg, Antaros Medical AB, BioVenture Hub, Mölndal 43153, Sweden

Håkan Ahlström, Joel Kullberg, Department of Surgical Sciences, Uppsala University, Uppsala 75185, Sweden

Ulf Risérus, Department of Public Health and Caring Sciences, Clinical Nutrition and Metabolism, Uppsala University, Uppsala 75122, Sweden

Jan Oscarsson, Global Medicines Development, AstraZeneca, Mölndal 43150, Sweden

Abstract

Key words: Comparison; EFFECT studies; Fatty liver; Non-alcoholic fatty liver disease; Non-invasive indices; Screening

INTRODUCTION

Non-alcoholic fatty liver disease (NAFLD) is a common disorder, with an estimated prevalence ranging from 20% to 35% in the general population; the prevalence is approximately doubled in the obese population[1-4]. NAFLD can be diagnosed using liver biopsy, ultrasound, magnetic resonance imaging (MRI), or spectroscopy; however, these investigations may not be readily available in primary care. Thus, the general physician needs simple screening tools, since not all obese subjects can be referred for imaging or biopsy.

In order to identify simpler and more cost-effective approaches to diagnosing NAFLD, several scores based on easily measurable biochemical and clinical parameters have been developed, such as the fatty liver index (FLI)[5], hepatic steatosis index (HSI)[6], lipid accumulation product (LAP)[7], and NAFLD liver fat score (LFS)[8]. However, only one study has evaluated these scores directly in the same population, using population-based NHANES data and ultrasound to diagnose NAFLD[9]; this study found that LFS was the best score for NAFLD detection.

MRI-proton density fat fraction (PDFF) can quantitatively assess the degree of liver steatosis as percent of the liver volume and can detect mild steatosis more accurately than ultrasound[10]. Since the extent to which the different scores can predict NAFLD in a high-risk individual vs a non-selected individual is unknown, the present study was conducted to compare the ability of the abovementioned four scores to predict NAFLD in two sample sets, a population-based sample and a sample at high risk for NAFLD, using MRI-PDFF, which can accurately quantify liver fat. In both samples, NAFLD was diagnosed by MRI-PDFF using the median of the fat fraction values inside the delineated total liver volume. The hypothesis tested was that the different scores performed differently in the two samples.

MATERIALS AND METHODS

Study populations

The EFFECT studies: In the EFFECT I study (ClinicalTrials.gov NCT02354976)[11], screened patients were eligible for inclusion in the treatment part of the study provided they were 40-75 years old and had a body mass index (BMI) of 25-40 kg/m², a serum triglyceride level of 1.7 mM (150 mg/dL) or higher, and liver PDFF > 5.5% of liver volume. Exclusion criteria were diabetes mellitus, a history of other hepatic disease, an inability to undergo MRI scanning, and significant alcohol intake (over 14 units per week for both women and men).

The EFFECT II study (ClinicalTrials.gov NCT02279407)[12] had similar inclusion and exclusion criteria to the EFFECT I study, with the exception that eligible patients must have had a prior history of type 2 diabetes, and serum triglyceride levels were not considered for inclusion.

Only data from the screening parts of the EFFECT I and II studies, including both patients who were randomized and screen failures, were used in the present study (Table 1). Data from 140 and 170 patients in the EFFECT I and EFFECT II studies, respectively, for whom a successful MRI liver scan was performed were pooled as a high-risk sample for NAFLD. Further details on the EFFECT I and II studies have recently been published[11,12].

The prospective investigation of obesity, energy and metabolism study: The prospective investigation of obesity, energy and metabolism (POEM) study was a population-based study investigating individuals (all aged 50 years) from Uppsala[13]. Of 502 individuals recruited (50% women), a successful MRI liver scan was performed in 310 individuals (Table 1).

None of these subjects reported a significant alcohol intake, as defined above for EFFECT I and II participants.

Liver fat measurement using MRI

MRI was used to determine PDFF using a water-fat separated scan with large liver coverage collected in a single breath hold, as described earlier[11,12]. The body coil was used to collect a spoiled, three-dimensional, six-gradient echo with axial orientation. For the EFFECT studies, imaging was performed at seven different sites. Six of these used a 1.5T scanner and one used a 3T system. One of the sites used water-fat reconstruction supplied by the system vendor. Data from the other sites and from the POEM study were reconstructed using in-house developed software that included T2 and a multi-peak lipid spectrum in the signal model. The POEM study data were collected on a 1.5T system. Images from the EFFECT studies were sent for centralized analysis at the imaging core laboratory at Antaros Medical (Mölndal, Sweden), and the POEM MRI data were analyzed at the Department of Radiology, Uppsala University. The liver was segmented by trained operators from the axial slices of the water image using the software ImageJ (National Institutes of Health, Bethesda, Maryland, United States, https://imagej.nih.gov/ij/). The border of the liver was avoided to reduce partial volume effects. Analysis of the EFFECT data was performed by one trained operator and of the POEM data by another operator. PDFF was determined using the median of the fat fraction values inside the delineated liver volume. The coefficient of variation for repeated examinations and analyses of liver PDFF was 5.3%, as determined by test-retest scanning and analysis of data from 10 healthy volunteers.
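The final classification step can be illustrated with a minimal sketch. It assumes the voxel-wise fat fractions inside the delineated liver volume are already available as a list of percentages; the function names and the toy values are hypothetical and are not taken from the study's analysis software. The 5.5% cut-off is the one used in this study to define NAFLD.

```python
from statistics import median

# Cut-off (percent) used in this study to define NAFLD from liver PDFF.
NAFLD_PDFF_THRESHOLD = 5.5


def liver_pdff(fat_fraction_percent):
    """Summarise the liver as the median of the voxel-wise fat fractions (%)."""
    if not fat_fraction_percent:
        raise ValueError("no voxels inside the delineated liver volume")
    return median(fat_fraction_percent)


def has_nafld(fat_fraction_percent):
    """Binary outcome: median liver PDFF strictly above the 5.5% cut-off."""
    return liver_pdff(fat_fraction_percent) > NAFLD_PDFF_THRESHOLD


if __name__ == "__main__":
    # Hypothetical voxel values, for illustration only.
    voxels = [3.1, 4.0, 4.2, 6.5, 20.0]
    print(liver_pdff(voxels), has_nafld(voxels))
```

Using the median rather than the mean makes the summary less sensitive to a few extreme voxels, for example near vessels or the liver border.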

Blood analyses

The EFFECT studies: Fasting blood samples were collected in the morning. Patients were instructed to fast for a minimum of 10 h. Plasma glucose levels were analyzed using a hexokinase enzymatic method with a Glucose HK Gen.3 reagent kit (Roche Diagnostics, Indianapolis, IN, United States). Plasma insulin levels were measured using the Access Ultrasensitive Insulin assay (Beckman Coulter, Inc., Brea, CA, United States), a simultaneous one-step immunoenzymatic (sandwich) assay. Serum levels of total cholesterol and triglycerides were measured using the Cholesterol Gen.2 reagent and Triglyceride reagent, respectively (Roche Diagnostics). High-density lipoprotein cholesterol (HDL-C) and low-density lipoprotein cholesterol (LDL-C) concentrations were measured using direct HDL-C and LDL-C assays (HDLC3 third generation reagents and LDL-C plus second generation assay; Roche Diagnostics). Other analytes such as gamma-glutamyl transferase (GGT), alanine aminotransferase (ALT), and aspartate aminotransferase (AST) were measured in the local hospitals using conventional methods[11,12].

Table 1 Basic characteristics of the EFFECT and prospective investigation of obesity, energy and metabolism samples

The POEM study: All samples were collected in the morning after an overnight fast. Fasting plasma glucose and lipids were measured by conventional methods at the clinical chemistry laboratory at the University Hospital in Uppsala. Serum insulin was measured using a microtiter-based enzyme-linked immunosorbent assay (ELISA; 10-1113-01, Mercodia, Uppsala, Sweden). The assay was calibrated against the first international reference preparation 66/304 for human insulin[13].

Fatty liver disease algorithms

FLI was calculated using the following formula[5].

HSI was calculated using the following formula[6]: HSI = 8 × ALT/AST ratio + BMI (+ 2, if diabetes mellitus; + 2, if female).

LAP was calculated using the following formula[7]: LAP = (waist - 65) × triglycerides in men and (waist - 58) × triglycerides in women.

NAFLD LFS was calculated using the following formula[8]: LFS = -2.89 + [1.18 × MetS (Yes: 1, No: 0)] + [0.45 × diabetes mellitus (Yes: 2, No: 0)] + (0.15 × insulin) + (0.04 × AST) - [0.94 × (AST/ALT)], where MetS is the metabolic syndrome according to the International Diabetes Federation criteria[14].
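As a hedged illustration, the four scores can be written as short functions. HSI, LAP, and LFS follow the formulas printed above. The FLI equation is not reproduced in this text, so the version below uses the equation as published by Bedogni et al. (the source cited as [5]); treat it as an assumption about what the authors applied. The units are also assumptions: waist in cm, triglycerides in mg/dL for FLI and in mmol/L for LAP, insulin in mU/L, and liver enzymes in U/L. All function and parameter names are hypothetical.

```python
import math


def fatty_liver_index(tg_mg_dl, bmi, ggt_u_l, waist_cm):
    """FLI per the published Bedogni equation (assumed; not printed in this text)."""
    x = (0.953 * math.log(tg_mg_dl) + 0.139 * bmi
         + 0.718 * math.log(ggt_u_l) + 0.053 * waist_cm - 15.745)
    return 100.0 * math.exp(x) / (1.0 + math.exp(x))


def hepatic_steatosis_index(alt, ast, bmi, diabetes, female):
    """HSI = 8 x ALT/AST + BMI (+2 if diabetes mellitus; +2 if female)."""
    return 8.0 * alt / ast + bmi + (2.0 if diabetes else 0.0) + (2.0 if female else 0.0)


def lipid_accumulation_product(waist_cm, tg_mmol_l, female):
    """LAP = (waist - 65) x TG in men, (waist - 58) x TG in women."""
    offset = 58.0 if female else 65.0
    return (waist_cm - offset) * tg_mmol_l


def nafld_liver_fat_score(mets, diabetes, insulin_mu_l, ast, alt):
    """LFS as printed above; note that diabetes is coded 2 (yes) or 0 (no)."""
    return (-2.89
            + 1.18 * (1 if mets else 0)
            + 0.45 * (2 if diabetes else 0)
            + 0.15 * insulin_mu_l
            + 0.04 * ast
            - 0.94 * (ast / alt))


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    print(fatty_liver_index(tg_mg_dl=150, bmi=30, ggt_u_l=40, waist_cm=100))
    print(hepatic_steatosis_index(alt=40, ast=20, bmi=30, diabetes=True, female=False))
    print(lipid_accumulation_product(waist_cm=100, tg_mmol_l=2.0, female=False))
    print(nafld_liver_fat_score(mets=True, diabetes=False, insulin_mu_l=10, ast=20, alt=40))
```

FLI is bounded between 0 and 100 by construction, whereas the other three scores are unbounded, so the scores are comparable only through a rank-based measure such as the ROC AUC.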

Both the EFFECT studies and the POEM study were approved by the Ethics Committee of Uppsala University, and all participants gave their informed consent.

Statistical analysis

To evaluate the effectiveness of the four different scores in predicting NAFLD, a logistic regression model was used with liver PDFF > 5.5% (binary) as the dependent variable and the score as the independent variable. From the logistic regression model, the area under the curve (AUC) for sensitivity vs 1 - specificity was calculated. To compare the AUC values from the two sample sets, their respective logistic regression models were compared by C-statistics.
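The AUC described above has a simple rank interpretation: it is the probability that a randomly chosen subject with NAFLD has a higher score than a randomly chosen subject without NAFLD. With a single predictor, the fitted probability from a logistic regression is a monotonic transform of the score, so when the fitted coefficient is positive the AUC can be computed directly from the raw score. The sketch below shows this rank-based calculation; it is not the Stata procedure used by the authors, the function name is hypothetical, and the example data are made up.

```python
def roc_auc(scores, has_nafld):
    """Rank-based AUC: share of case/non-case pairs in which the case
    has the higher score, counting ties as one half."""
    cases = [s for s, y in zip(scores, has_nafld) if y]
    controls = [s for s, y in zip(scores, has_nafld) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Made-up scores and outcomes (liver PDFF > 5.5% yes/no), illustration only.
    score = [12.0, 35.0, 48.0, 60.0, 71.0, 90.0]
    nafld = [False, False, True, False, True, True]
    print(roc_auc(score, nafld))
```

The pairwise loop is quadratic in sample size, which is acceptable for samples of a few hundred subjects such as those analysed here.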

For the exploratory analysis, a logistic regression model was used with NAFLD (binary) as the dependent variable and the variables included in the LFS equation as the independent variables. A backward stepwise procedure was used to eliminate independent variables with P > 0.05.

STATA 14 (Stata Inc., College Station, TX, United States) was used for all analyses.

RESULTS

General population (POEM study)

Very few subjects (n = 5) with liver fat > 5.5%, indicating NAFLD, had a BMI < 25 kg/m² (Figure 1). The prevalence of NAFLD was 23% in the population-based sample. FLI showed the highest receiver operating characteristic (ROC) AUC value (0.82), while the ROC AUC values for the other three indices were similar (0.77-0.78; Figure 2, Table 2). However, the ROC AUC for FLI differed significantly only from that for the LAP score (P = 0.005), not from that for LFS (P = 0.08) or HSI (P = 0.12).

High-risk population (EFFECT studies)

The relationship between BMI and liver fat in the EFFECT studies is shown in Figure 3. The prevalence of NAFLD was 74% in the high-risk sample. LFS showed the highest ROC AUC value (0.80; Figure 4), and the ROC AUC for LFS was significantly higher than that for FLI (P = 0.0019) and LAP (P = 0.0022), but not HSI (P = 0.11).

Since the EFFECT studies consisted of two overweight/obese high-risk subgroups, namely, patients diagnosed with diabetes or hypertriglyceridemia, we performed a sensitivity analysis with stratification for these two subgroups.

No major differences in the detection of NAFLD were observed between the two subgroups, except for LAP, which performed best in patients with hypertriglyceridemia (Table 3).

Exploratory analysis

LFS showed the highest ROC AUC value in the high-risk population. Since LFS is rather cumbersome to calculate due to the many variables included in the LFS equation, we investigated whether the number of variables could be reduced without any loss in ROC AUC. Using the regression coefficients from the logistic regression analysis, the formula 0.27 × fasting insulin (mU/L) - 2.6 × AST/ALT ratio resulted in a higher ROC AUC than that for the original LFS in the high-risk population, but the difference was not statistically significant (simplified LFS ROC AUC, 0.8404; original LFS, 0.7994; P = 0.12). In the population-based sample, however, the simplified version of LFS resulted in a lower ROC AUC than that for the original LFS (simplified LFS ROC AUC, 0.7464; original LFS, 0.7774; P = 0.33). Furthermore, the ROC AUC for the simplified LFS was lower than that for FLI in the population-based sample (P = 0.039 for difference) but was not higher than that for the other scores.
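For illustration, the simplified score reported above reduces to a two-term expression. The sketch below implements exactly that expression; the function and parameter names are hypothetical, and the units are assumed to be mU/L for fasting insulin and the same unit for AST and ALT. No intercept or cut-off is reported in the text, so the value is meaningful only for ranking subjects, as in a ROC analysis.

```python
def simplified_lfs(insulin_mu_l, ast, alt):
    """Simplified LFS from the exploratory analysis:
    0.27 x fasting insulin (mU/L) - 2.6 x AST/ALT ratio."""
    if alt <= 0:
        raise ValueError("ALT must be positive")
    return 0.27 * insulin_mu_l - 2.6 * (ast / alt)


if __name__ == "__main__":
    # Hypothetical values: higher insulin and a lower AST/ALT ratio raise the score.
    print(simplified_lfs(insulin_mu_l=10, ast=20, alt=40))
    print(simplified_lfs(insulin_mu_l=20, ast=20, alt=40))
```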

DISCUSSION

In accordance with our hypothesis, the NAFLD scores investigated demonstrated different NAFLD detection abilities in the two samples. Of the four evaluated scores, FLI was preferable in the population-based sample (NAFLD prevalence, 23%), whereas LFS performed best in the high-risk sample (NAFLD prevalence, 74%). The prevalences of NAFLD found in this study were similar to those found in other population-based studies[1-4,9] and in high-risk groups, such as patients with diabetes[15,16].

Figure 1 Relationship between body mass index and liver fat in the population-based prospective investigation of obesity, energy and metabolism study.

FLI is a simple score that can be applied by the general practitioner and was found to be useful in the general population as a screening tool. It should, however, be remembered that this would only be the first step in the characterization of NAFLD, demanding further investigations, for example, transient elastography or MR elastography, and eventually a biopsy.

In a study comparing three of the scores used in the present study vs liver histology (gold standard) in a sample of patients with a high prevalence of liver steatosis (95%)[17], FLI, LFS, and HSI performed almost equally well in diagnosing liver steatosis (AUC: 0.80-0.83). In another study comparing the different NAFLD scores vs NAFLD diagnosed by imaging, LFS performed best in the population-based NHANES sample, with a NAFLD prevalence of 18% measured by ultrasound[9]. In the NHANES-based study, use of LFS resulted in an AUC of 0.77 in the total sample, which is similar to the AUC for LFS in the population-based sample used in the current study; in our low-risk population, however, FLI performed better than LFS. This difference in the performance of NAFLD scores between the two population-based sample sets could be due to differences in the sensitivity of the techniques used for NAFLD diagnosis in the two studies. The limited sensitivity of ultrasound for detecting mild steatosis might have led to an underestimation of NAFLD prevalence in NHANES[9]. Further, the Scandinavian population included in the current population-based study was almost exclusively non-Hispanic Caucasian, which could also have influenced the performance of the NAFLD scores, since LFS and FLI performed almost equally in the non-Hispanic Caucasian subpopulation included in the NHANES study.

In the POEM study, very few cases of NAFLD were detected among subjects with a BMI < 25 kg/m², which is consistent with other studies[18]. Thus, there is clearly a limited need to screen for NAFLD in subjects with normal weight. In the high-risk population (EFFECT studies), all patients had a BMI > 25 kg/m² and had either type 2 diabetes or elevated serum triglycerides (> 1.7 mmol/L). Not surprisingly, almost three-fourths of the population showed NAFLD measured with abdominal MRI-PDFF. In this high-risk population, LFS performed significantly better than FLI.

In the POEM study, 60% of subjects had a BMI > 25 kg/m², with a NAFLD prevalence of 35% in the overweight/obese subgroup of the population. In this moderate-risk population, the need for NAFLD screening is greater than that in the general population, as also suggested by other studies[18]. Thus, future studies to determine an optimal screening tool for NAFLD should be performed in an overweight/obese population, which constitutes more than half of the middle-aged population in many countries.

Since LFS contains many variables and can be quite complicated to calculate in the clinical setting, we tried to simplify this score by using data on fasting insulin and AST/ALT only. This resulted in a simplified LFS that performed at least as well as the original LFS in the high-risk sample but was less efficient in the population-based sample. If this finding can be reproduced by others in a high-risk group, the simplified LFS could be an attractive tool in the clinical setting for screening of NAFLD in high-risk individuals.

Another observation in the high-risk sample was that LFS performed almost equally in the overweight/obese diabetes and hypertriglyceridemia subgroups. Thus, diabetes alone did not have a major impact on the predictive power of LFS, since many patients in the diabetes subgroup had hypertriglyceridemia.

Table 2 Area under the curve of the receiver operating characteristic curves for the liver fat scores in the prospective investigation of obesity, energy and metabolism study

The strength of this study is the evaluation and comparison of four different scores for NAFLD diagnosis in two different samples, high-risk and low-risk, using a validated, highly accurate, and reproducible method to quantify liver fat content, MRI-PDFF[19]. The coefficient of variation for this method was found to be low (5.3%) in healthy volunteers. However, it is a limitation that we have not evaluated the coefficient of variation in populations with a high proportion of liver steatosis, such as the sample based on the EFFECT studies. The C-statistic used to compare discrimination between the scores is known to be a rather weak test, demanding large samples to reach significance when the difference in AUC is within the 2%-3% range. With sample sizes of around 300, as in the present study, we therefore have limited power to detect significant differences in discrimination between the scores. In the present study, we took a very detailed history of previous diseases and alcohol intake to exclude causes of liver steatosis other than NAFLD. Thus, although it cannot be excluded that we missed some cases of liver disease other than NAFLD, the vast majority of individuals included in the present study are not likely to have any liver disease other than NAFLD.

In conclusion, the four investigated scores for NAFLD diagnosis performed differently in the population-based setting compared with the high-risk setting. FLI was preferable in the population-based setting, while LFS, or a simplified version of LFS, performed best in the high-risk setting.

Table 3 Area under the curve of the receiver operating characteristic curves for the subgroups (diabetes and hypertriglyceridemia) in the high-risk population in the EFFECT studies

Figure 2 Relationship between the four scores in the detection of non-alcoholic fatty liver disease and measured liver fat > 5.5% given as receiver operating characteristic curves and area under the curve in the population-based prospective investigation of obesity, energy and metabolism study.

Figure 3 Relationship between body mass index and liver fat in the EFFECT studies.

Figure 4 Relationship between the four scores in the detection of non-alcoholic fatty liver disease and measured liver fat > 5.5% given as receiver operating characteristic curves and area under the curve in the high-risk population investigated in the EFFECT studies.

ARTICLE HIGHLIGHTS

Research background

Non-alcoholic fatty liver disease (NAFLD) is a common disorder, with an estimated prevalence of 20% to 35% in the general population. Several non-invasive indices based on routinely available biochemical and physical parameters have been developed for the detection of NAFLD. However, data comparing the efficacy of these indices within a population-based sample are lacking.

Research motivation

To better understand the applicability of different non-invasive indices for detecting NAFLD in a population-based sample [based on the prospective investigation of obesity, energy and metabolism (POEM) study] vs a high-risk sample (based on the EFFECT studies).

Research objectives

To compare the efficacy of four non-invasive indices, fatty liver index (FLI), hepatic steatosis index (HSI), lipid accumulation product (LAP), and NAFLD liver fat score (LFS), in predicting NAFLD in population-based samples comprising normal and high-risk individuals.

Research methods

NAFLD screening was performed in a population-based sample of 50-year-old individuals in Uppsala, Sweden (n = 310; POEM study) and a high-risk population comprising patients with a body mass index > 25 kg/m² and either high plasma triglycerides (≥ 1.7 mM) or type 2 diabetes (n = 310; EFFECT studies). NAFLD was defined as liver fat > 5.5% using magnetic resonance imaging-proton density fat fraction. FLI, HSI, LAP, and NAFLD LFS were assessed. A logistic regression model was used to evaluate the effectiveness of the different scores.

Research results

The prevalence of NAFLD was 23% in POEM. FLI showed the highest ROC AUC (0.82) for detection of NAFLD and was significantly better than the LAP score only (P = 0.005 vs LAP, P = 0.08 vs LFS, P = 0.12 vs HSI). The other three indices performed similarly in POEM (ROC AUC 0.77-0.78). The prevalence of NAFLD was 74% in EFFECT; LFS performed best (ROC AUC 0.80) in this sample. The ROC AUC for LFS was significantly higher than that for FLI (P = 0.0019) and LAP (P = 0.0022), but not HSI (P = 0.11). We performed a sensitivity analysis with stratification for the two high-risk subgroups (patients with diabetes or hypertriglyceridemia) from the EFFECT studies. LAP performed best in patients with hypertriglyceridemia. No major differences were observed between the other scores.

Research conclusions

The four investigated NAFLD scores performed differently in the population-based vs high-risk setting. FLI was preferable in the population-based setting, while LFS performed best in the high-risk setting.

Research perspectives

In the population-based vs high-risk setting, the indices performed differently. FLI was preferable in the population-based setting, while LFS performed best in the high-risk setting.

ACKNOWLEDGEMENTS

The authors thank the participants of the study, the study investigators, and the staff at the recruiting hospitals involved in this study.