Comparison of tumor regression grading systems for locally advanced gastric adenocarcinoma after neoadjuvant chemotherapy

2022-01-10

Zi-Ning Liu, Yin-Kui Wang, Li Zhang, Yong-Ning Jia, Shan Fei, Xiang-Ji Ying, Yan Zhang, Shuang-Xi Li, Yu Sun, Zi-Yu Li, Jia-Fu Ji

Zi-Ning Liu, Yin-Kui Wang, Yong-Ning Jia, Shan Fei, Xiang-Ji Ying, Yan Zhang, Shuang-Xi Li, Zi-Yu Li, Jia-Fu Ji, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Gastrointestinal Cancer Center, Peking University Cancer Hospital and Institute, Beijing 100142, China

Li Zhang, Yu Sun, Department of Pathology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Peking University Cancer Hospital and Institute, Beijing 100142, China

Abstract

BACKGROUND

Current tumor regression grade (TRG) evaluations are based on various systems, which causes confusion for oncologists and pathologists when interpreting results. The recent six-tier system (JGCA2017-TRG) recommended by the Japanese Gastric Cancer Association (JGCA) is worth investigating, as four-tier TRG systems are favored in various parts of the world.

AIM

To compare the predictive accuracies of five published TRG systems.

METHODS

Data were retrospectively collected from patients with locally advanced gastric cancer (LAGC) who underwent neoadjuvant chemotherapy followed by D2 lymphadenectomy between January 2005 and January 2014 at our institution. Outcomes were overall survival (OS) and disease-free survival (DFS), which were evaluated separately using the following TRG systems: JGCA2017, JGCA, Becker, AJCC/CAP, and Mandard.

RESULTS

All five published TRG systems were independent predictors for OS and DFS. Concordance indices of the JGCA2017, JGCA, Becker, AJCC/CAP, and Mandard systems were 0.651/0.648, 0.652/0.649, 0.693/0.695, 0.688/0.685, and 0.674/0.675 for OS and DFS, respectively. The four-tier Becker system showed the highest c-index, which was significantly greater than that of the six-tier JGCA2017 and five-tier JGCA systems (P < 0.05 for both OS and DFS). When residual tumor percentages were reset as “no residual tumor”, < 10%, < 100%, and “no response”, the rearranged cutoff values achieved a maximum c-index of 0.728 for OS and 0.737 for DFS, which was superior to the other five systems.

CONCLUSION

The newly introduced six-tier JGCA-TRG system does not improve prognostic stratification. The four-tier Becker system is more suitable for LAGC patients. A population-based study is warranted to define the optimal criterion for TRG in LAGC patients.

Key Words: Gastric cancer; Neoadjuvant chemotherapy; Tumor regression grade; Survival; Concordance index

INTRODUCTION

Although surgical resection is the mainstay of therapy for locally advanced gastric cancer (LAGC), neoadjuvant chemotherapy (NACT) has been widely adopted for LAGC in Europe and, more recently, in China because of solid evidence that it reduces the risk of recurrence and improves overall survival[1-3]. The theoretical benefits of NACT are downstaging of the primary tumor, an increased R0 resection rate, and treatment of potential micrometastases.

The effects of NACT on the tumor can be evaluated histopathologically in the subsequent resection specimen by applying a pathological tumor regression grading (TRG) system. More than five TRG systems are currently in common use for gastric cancer (GC) worldwide, with different principles, different numbers of tiers, and different cutoff values[4,5]. This variety places a large burden on oncologists and pathologists and makes it hard to interpret results from different systems in similar clinical contexts. Pathologists may also need to be familiar with more than one TRG system in daily practice[4].

Currently, most pathologists favor four-tier TRG systems in gastrointestinal cancer. These include the Becker system and the American Joint Committee on Cancer (AJCC)/College of American Pathologists (CAP) system, which have superior inter-rater agreement with no loss of discriminatory ability[4,6]. In October 2017, the 15th edition of the Japanese Classification of Gastric Carcinoma proposed a new six-tier pathological regression evaluation for GC based on the previous Japanese Gastric Cancer Association (JGCA) TRG system[7]. It subdivided JGCA-TRG grade 2 (residual tumor 1%-33%) into grade 2a (residual tumor 10%-33%) and grade 2b (residual tumor < 10%) according to the results of JCOG1004-A[8,9]. This new classification has drawn attention mainly in East Asia and little in Western countries[10]. However, as both the JGCA and AJCC/CAP criteria showed good consistency in Chinese patients, extensive validation is warranted to verify the adjustment in the new JGCA-TRG system[11]. Furthermore, an optimized histopathological evaluation system for predicting patient prognosis is urgently needed to resolve this contentious issue.

Therefore, the present study sought to validate the utility of the new JGCA-TRG system (JGCA2017-TRG) by comparing it with other TRG systems and by exploring meaningful cutoff values of residual tumor percentage in a dataset of 413 LAGC patients who underwent D2 lymphadenectomy following NACT.

MATERIALS AND METHODS

Patients

Data were obtained from a retrospective database of all patients who received NACT followed by curative gastrectomy at the Peking University Cancer Hospital and Institute (“the Institute”) from January 1, 2005 to January 1, 2014.

The inclusion criteria were: (1) Diagnosis of gastric adenocarcinoma proven by preoperative pathology; (2) No signs of distant metastasis at first visit; (3) Complete perioperative medical records and documentation of NACT at the Institute; and (4) Curative gastrectomy with D2 lymph node resection performed at the Institute.

The exclusion criteria were as follows: (1) Insufficient clinicopathological records; (2) Radiotherapy or targeted therapy before surgery; (3) Unavailable specimen information; (4) R1/R2 resection or suspected metastasis at the time of surgery; (5) Non-adenocarcinoma diagnosis based on postoperative histological findings (except for complete response cases); (6) Remnant gastric cancer; and (7) Death within 30 d after surgery.

Regimen and radical surgery

Except for eight patients who could not continue for reasons such as poor economic status or severe adverse events, all patients received at least two cycles of chemotherapy. In summary, 364 patients received platinum-based doublet regimens, 25 patients received Taxol-based doublet regimens, and 24 patients received Taxol-platinum-based triplet regimens. Supplementary Table 1 describes the detailed dosing regimens.

To assess the influence of treatment duration, three 14 d cycles of FOLFOX or POS were regarded as equivalent to two 21 d cycles of treatment. Dose reduction or withdrawal was applied in cases of severe adverse events during chemotherapy; this was determined by the clinician according to the Common Terminology Criteria for Adverse Events (CTCAE) version 4.0, as in our previous study[12]. After two to three chemotherapy cycles, the antitumor effect was evaluated using abdominal computed tomography (CT). In most cases, two or three alignment cycles were performed. The therapy was prematurely terminated in cases of disease progression. Otherwise, gastrectomy or continued NACT was considered after obtaining informed consent and approval from patients. Subtotal or total gastrectomy plus D2 lymphadenectomy was performed according to the JGCA guideline[13].
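The cycle conversion described above can be illustrated with a minimal sketch. The text only states that three 14 d cycles count as two 21 d cycles; scaling other cycle counts in proportion to total treatment days is an assumption made here for illustration, and the function name is hypothetical.

```python
def equivalent_21d_cycles(n_cycles: int, cycle_days: int) -> float:
    """Express n cycles of a given length as equivalent 21-day cycles.

    Assumption: equivalence is proportional to total treatment days,
    which reproduces the stated rule (3 x 14 d = 2 x 21 d).
    """
    if n_cycles < 0 or cycle_days <= 0:
        raise ValueError("n_cycles must be >= 0 and cycle_days must be > 0")
    return n_cycles * cycle_days / 21


# Three 14-day cycles (42 days) correspond to two 21-day cycles.
print(equivalent_21d_cycles(3, 14))
```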

Histopathological examinations

The pathological preparation of the surgical specimens commenced immediately after the operation. After recording the localization, measurement, and complete inclusion of visible tumor or suspected tumor areas, a surgeon identified the lymph node groups in the specimen. These were dissected and labeled separately from the main stomach specimen. Generally, the stomach tissue was fixed in 10% neutral buffered formalin overnight and then embedded in paraffin wax. Sections of 5 μm thickness were cut and stained with hematoxylin and eosin (H&E) for microscopic examination, all according to standard procedures. The histological patterns, degrees of differentiation, extent of tumor invasion, number of regional lymph node metastases, and lymphovascular invasion (LVI) were recorded in each patient’s pathology report. This information was then integrated according to the 8th AJCC Cancer Staging Manual and World Health Organization pathologic classifications by two oncologists (Liu ZN and Wang YK)[14,15].

All normal sections were stained with H&E and preserved in paraffin. From December 2017, two designated pathologists (Zhang L and Sun Y) were responsible for reviewing the extent of tumor regression. All patients’ H&E slides were reexamined using bright-field fluorescence microscopy for discrimination between necrotic or heat-fixed tissue and viable tissue. The extent of tumor regression was evaluated and recorded according to: (1) The amount of viable tumor vs fibrotic tissue, which ranged from a total lack of tumor regression to complete response with no viable tumor identified; and (2) The percentage of viable residual tumor, which was calculated by dividing the viable residual tumor area by the total tumor area. Tumor regression grades were then allocated according to the JGCA2017, JGCA, Becker, AJCC/CAP, and Mandard systems (Table 1)[7,8,16-18]. An example of each JGCA2017 grade is shown in Figure 1. The review task ended in December 2019.
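As an illustration of the percentage-based principle, the following minimal sketch computes the residual tumor percentage from two area measurements and maps it to a JGCA2017 grade. The 10% and one-third cutoffs for grades 2a/2b follow the text; the one-third/two-thirds split between grades 1a and 1b, the handling of exact boundary values, and the `any_regression` flag used to separate grade 0 from grade 1a are assumptions made here, and the function names are hypothetical. This is not the pathologists' actual scoring procedure.

```python
def residual_tumor_percentage(viable_area: float, total_area: float) -> float:
    """Viable residual tumor area divided by total tumor area, in percent."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    if not 0 <= viable_area <= total_area:
        raise ValueError("viable_area must lie between 0 and total_area")
    return 100.0 * viable_area / total_area


def jgca2017_grade(residual_pct: float, any_regression: bool = True) -> str:
    """Map a residual tumor percentage to a six-tier JGCA2017 grade.

    Assumed cutoffs: 0% -> grade 3; <10% -> 2b; 10% to <1/3 -> 2a;
    1/3 to <2/3 -> 1b; >=2/3 -> 1a; no regression at all -> grade 0.
    """
    if not 0 <= residual_pct <= 100:
        raise ValueError("residual_pct must be between 0 and 100")
    if not any_regression:
        return "0"   # no response
    if residual_pct == 0:
        return "3"   # no viable tumor identified
    if residual_pct < 10:
        return "2b"
    if residual_pct < 100 / 3:
        return "2a"
    if residual_pct < 200 / 3:
        return "1b"
    return "1a"


# Hypothetical measurement: 1.2 cm2 viable tumor in a 15 cm2 tumor bed (8%).
pct = residual_tumor_percentage(1.2, 15.0)
print(round(pct, 1), jgca2017_grade(pct))
```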

Table 1 Criteria of five tumor regression grading systems

Figure 1 Tumor regression grading according to 15th Japanese Classification of Gastric Carcinoma criteria.

Data collection

In addition to histopathological features, the patient characteristics collected were age, sex, body mass index (BMI), American Society of Anesthesiologists (ASA) score, ECOG performance status, tumor location, tumor diameter (on the short axis), type of resection, type of NACT regimen, complication grade according to the Clavien-Dindo classification, NACT cycles, survival time, and survival status[19]. The follow-up methods were described in our earlier study[20]. Disease-free survival (DFS) was calculated from the date of surgery to the date of recurrence or metastasis.

Statistical analysis

Continuous variables were summarized as the median (interquartile range, IQR) and were compared across groups using the Kruskal-Wallis test. Categorical variables were analyzed using the Chi-squared test. The relationships between clinical and pathological factors and long-term DFS and OS were assessed using univariate log-rank tests and a multivariate Cox proportional hazards model. Tumor or treatment characteristics that achieved a P value < 0.10 in univariate analysis were included in the multivariate analysis. The prognostic strength and the discrimination ability of each TRG system were assessed using the concordance index (c-index ± SE), with a concordance index of 1 indicating perfect prediction and 0.5 indicating no discrimination. The c-index was calculated and compared using the “survcomp” R package[21]. Testing for trends was based on various statistical hypotheses when necessary. For all analyses, P < 0.05 was considered statistically significant. Statistical analyses were performed using Stata SE (Stata Statistical Software, release 15.1; StataCorp, College Station, TX, United States).
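To make the c-index concrete, the sketch below computes Harrell's concordance index for right-censored data by direct pair counting. It is a simplified illustration in Python, not the “survcomp” implementation used in the study: the handling of ties (tied risk scores count as half-concordant, tied times are skipped) is an assumption, the standard error is not computed, and the toy data are invented.

```python
from typing import Sequence


def harrell_c_index(times: Sequence[float],
                    events: Sequence[int],
                    risk_scores: Sequence[float]) -> float:
    """Harrell's c-index by exhaustive pair counting.

    A pair (i, j) is usable when subject i had an observed event and a
    strictly shorter time than subject j. The pair is concordant when the
    subject who failed earlier also has the higher risk score.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores: half credit (assumption)
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data: follow-up in months, event indicator, ordinal grade used as risk.
months = [5, 10, 15, 20]
died = [1, 1, 0, 1]
grade_as_risk = [3, 2, 2, 1]
print(harrell_c_index(months, died, grade_as_risk))
```

Because a TRG system has only four to six levels, many pairs share the same grade, so the tie-handling convention can noticeably affect the value obtained.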

RESULTS

Patient characteristics

A total of 413 patients met the inclusion criteria and were included in this study (Figure 2). All achieved total tumor clearance (R0). The patients had a median age of 61 years (range 24-82) and were predominantly male (73.61%). Tumor localization was proximal (including esophagogastric junction Siewert III) in 166 cases, body in 51 cases, and distal in 170 cases, and 26 patients had tumor involvement of the whole stomach (linitis plastica). Most patients received a preoperative 5-FU-based oxaliplatin doublet regimen (88.14%), and 105 patients did not receive adjuvant treatment after complete resection (25.42%). The demographic data of these patients are shown in Table 2, stratified by the JGCA2017-TRG system.

Figure 2 Selection of patients for inclusion.

Tumor regression assessment

According to the JGCA system, 26 cases were grade 0 (6.30%), 205 were grade 1a (49.64%), 78 were grade 1b (18.89%), 68 were grade 2 (16.46%), including 29/39 (7.02%/9.44%) in grades 2a/2b according to the JGCA2017 classification, and 36 were grade 3 (8.72%; Table 2, Supplementary Table 2). Similarly, the subgroup frequencies according to the Becker, AJCC/CAP, and Mandard systems are presented in Supplementary Table 3-6, respectively. Significant differences were found in the ypT, ypN, ypTNM, and LVI stages in all five systems. The correlation coefficients with ypT were 0.619, 0.587, 0.662, 0.639, and 0.616 for the JGCA2017, JGCA, Becker, AJCC/CAP, and Mandard systems, respectively. On the other hand, no statistically significant association was found between the NACT regimen and the TRG grade or between the duration of NACT and the TRG grade in any system.
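The grade frequencies above can be checked with a few lines of arithmetic. The sketch below takes the reported counts per JGCA2017 grade, confirms that they sum to the cohort size, and recomputes each percentage; the variable names are illustrative only.

```python
# Reported number of patients per JGCA2017 grade.
counts = {"0": 26, "1a": 205, "1b": 78, "2a": 29, "2b": 39, "3": 36}

total = sum(counts.values())  # should equal the cohort size of 413

# Share of the cohort in each grade, in percent, rounded to two decimals.
percentages = {grade: round(100 * n / total, 2) for grade, n in counts.items()}

# Grade 2 in the five-tier JGCA system is the union of grades 2a and 2b.
grade2_pct = round(100 * (counts["2a"] + counts["2b"]) / total, 2)

for grade, pct in percentages.items():
    print(f"grade {grade}: {counts[grade]} patients ({pct}%)")
print(f"grade 2 (2a + 2b): {grade2_pct}%")
```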

Table 2 Clinical and demographic characteristics of the study population

BMI: Body mass index; ASA: American Society of Anesthesiologists; ECOG: Eastern Cooperative Oncology Group; LVI: Lymphovascular invasion; NACT: Neoadjuvant chemotherapy; TRG: Tumor regression grade.

Survival analysis and performance evaluation

The median follow-up was 62 mo, with an IQR of 4.5 to 210 mo. At the final follow-up, 209 patients had recurrence, and 200 had died of cancer. Kaplan-Meier curves for OS and DFS based on each system are presented in Figures 3 and 4. In the univariate analyses, all five regression classification systems had prognostic relevance (Table 3). Although all five systems revealed statistical trends towards increasing risk for OS and DFS (Ptrend < 0.001), JGCA2017 grade 2a showed a higher OS risk than grade 1b, although the intergroup difference was not statistically significant (HR: 1.06; 95%CI: 0.59-1.89; P = 0.855). The c-index for the six-tier JGCA2017, five-tier JGCA, four-tier Becker, four-tier AJCC/CAP, and five-tier Mandard systems was 0.651 ± 0.027, 0.652 ± 0.027, 0.693 ± 0.033, 0.688 ± 0.031, and 0.674 ± 0.028, respectively, for OS, and 0.648 ± 0.028, 0.649 ± 0.028, 0.695 ± 0.034, 0.685 ± 0.031, and 0.675 ± 0.028, respectively, for DFS. The four-tier Becker system had the highest c-index and was statistically significantly more accurate in predicting survival and recurrence than the six- or five-tier JGCA systems (Becker vs JGCA2017, P = 0.006 for OS, P = 0.002 for DFS; Becker vs JGCA, P = 0.007 for OS, P = 0.003 for DFS). The c-indices were comparable between the Becker and AJCC/CAP systems (P = 0.397 for OS and P = 0.273 for DFS), and between the Becker and Mandard systems (P = 0.148 for OS and P = 0.136 for DFS), while the four-tier AJCC/CAP system was more accurate than the five-tier Mandard system for OS (P = 0.039) under similar evaluation principles.

Figure 4 Kaplan-Meier curves for disease-free survival of five tumor regression grade systems.

Table 3 Univariate analyses for overall survival and disease-free survival using a Cox proportional hazards model

BMI: Body mass index; ASA: American Society of Anesthesiologists; ECOG: Eastern Cooperative Oncology Group; NACT: Neoadjuvant chemotherapy; TRG: Tumor regression grade; HR: Hazard ratio; DFS: Disease-free survival; OS: Overall survival.

Multivariate analyses for OS and DFS were then performed, including features that were related to poorer survival in univariate analysis (P < 0.10): BMI, ECOG, tumor location, diameter on the short axis, differentiation, histology type, LVI, resection type, adjuvant chemotherapy, and ypT and ypN stages. After adjusting for potential confounders in the multivariate Cox regression model, BMI, histology type, LVI, and the ypN stage were independent predictors for OS, while LVI and the ypN stage were independent risk factors for DFS. All five TRG systems showed significant differences when setting the “complete response” group as a reference (Table 4). However, the increase in the hazard ratio was not entirely in accord with the increase in the TRG grade in the JGCA2017, AJCC/CAP, and Mandard systems. In fact, the intergroup differences were not statistically significant when the “complete response” group was excluded from each system. Only a marginal difference was found between JGCA2017-TRG grade 0 (no response) vs 2b (< 10%) for OS (HR: 1.84; 95%CI: 0.90-3.75; P = 0.096) and DFS (HR: 1.87; 95%CI: 0.92-3.83; P = 0.085).
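The relationship between a hazard ratio, its 95%CI, and its P value can be illustrated with a short sketch. It assumes a Wald test on the log hazard ratio with a symmetric confidence interval on the log scale, which is the usual Cox model output but is not stated explicitly in the text; the function name is hypothetical, and small discrepancies from the reported P values are expected because the published HR and CI are rounded.

```python
import math


def wald_p_from_hr_ci(hr: float, lower: float, upper: float,
                      z_crit: float = 1.959964) -> float:
    """Approximate two-sided Wald P value from a hazard ratio and its 95%CI.

    Assumes the CI is exp(log(HR) +/- z_crit * SE), so that
    SE = (log(upper) - log(lower)) / (2 * z_crit).
    """
    if not 0 < lower < upper:
        raise ValueError("CI bounds must satisfy 0 < lower < upper")
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# JGCA2017 grade 0 vs grade 2b, as reported for OS and DFS.
p_os = wald_p_from_hr_ci(1.84, 0.90, 3.75)
p_dfs = wald_p_from_hr_ci(1.87, 0.92, 3.83)
print(f"OS:  P is approximately {p_os:.3f} (reported 0.096)")
print(f"DFS: P is approximately {p_dfs:.3f} (reported 0.085)")
```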

BMI: Body mass index; ECOG: Eastern Cooperative Oncology Group; AC: Adjuvant chemotherapy; TRG: Tumor regression grade; HR: Hazard ratio; DFS: Disease-free survival; OS: Overall survival.

Rearranged cutoff values based on current residual tumor percentage

The preceding comparison of the five systems showed that the Becker system provided the best prognostic differentiation between subgroups across the whole patient cohort. The AJCC/CAP system, although it had the second-highest c-index, did not provide better intergroup discrimination in multivariate analysis. According to the JGCA2017 criteria, two cutoff values of residual tumor percentage, 10% and 100%, were of more clinical significance than any other commonly used cutoff percentage except for total regression. Despite the intergroup differences being marginal, a higher c-index of 0.728 ± 0.035 for OS and 0.737 ± 0.035 for DFS could be achieved based on the following rearranged residual tumor percentage cutoffs: 0 (no residual tumor; reference), < 10% (HR: 4.61; 95%CI: 1.02-20.73; P = 0.047 for OS; HR: 4.46; 95%CI: 0.99-20.08; P = 0.051 for DFS), 10%-99% (HR: 5.98; 95%CI: 1.40-25.63; P = 0.016 for OS; HR: 5.93; 95%CI-25.29; P = 0.016 for DFS), and no response (HR: 8.36; 95%CI: 1.82-38.44; P = 0.006 for OS; HR: 8.33; 95%CI: 1.82-38.23; P = 0.006 for DFS). There was a significant difference in prognostic ability for DFS (P = 0.046) and a borderline difference for OS (P = 0.073) between the rearranged cutoffs and the Becker system (Table 5).
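The rearranged four-tier grouping can be written as a simple classification rule. The sketch below is a minimal illustration that assumes “no response” is recorded as 100% residual tumor, in line with the 10% and 100% cutoffs named above; the function name and category labels are hypothetical.

```python
def rearranged_trg(residual_pct: float) -> str:
    """Assign the rearranged four-tier category from residual tumor percentage.

    Assumption: a case with no response is coded as 100% residual tumor.
    """
    if not 0 <= residual_pct <= 100:
        raise ValueError("residual_pct must be between 0 and 100")
    if residual_pct == 0:
        return "no residual tumor"   # reference group
    if residual_pct < 10:
        return "<10%"
    if residual_pct < 100:
        return "10%-99%"
    return "no response"


# Hypothetical residual tumor percentages for five patients.
for pct in (0, 4, 10, 65, 100):
    print(pct, "->", rearranged_trg(pct))
```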

Table 5 The pairwise comparison of C-indexes between different tumor regression grade based on Cox regression for overall survival

DISCUSSION

Neoadjuvant chemotherapy followed by surgery and adjuvant chemotherapy is the current standard treatment for LAGC[1,2]. Although the benefit of this multimodality treatment was first confirmed by the MAGIC trial in 2006, NACT had been used in GC for 30 years[22,23]. Although TRG systems are widely used to assess treatment response in gastrointestinal tract tumors, response rates are consistently poorer in GC than in esophageal or colorectal cancer[5]. This might be due to the lack of chemoradiation and sensitive regimens in the preoperative setting. Because preoperative therapies are currently limited and few patients respond, findings on the prognostic value of TRG systems in LAGC vary. Additional complexities arise when studies are based on different TRG systems, especially when TRG and ypTNM systems are compared as independent predictors of patient survival[24-29]. Becker et al[16] investigated 480 patients with LAGC undergoing surgical resection and found TRG grade 2-3 (10%-100% residual tumor) to be an independent risk factor for patient OS; this reinforced the efficacy of the Becker TRG system. Ikoma et al[25] reviewed 356 LAGC patients receiving D0-D2 lymphadenectomy following NACT or neoadjuvant chemoradiotherapy (NACRT), finding that the residual tumor < 50% group was associated with a shorter OS but was not an independent predictor. Derieux first demonstrated the predictive value of the Mandard system in GC, observing poorer OS and DFS in patients with a high proportion of residual cancer cells (Mandard TRG 4) and no response (Mandard TRG 5)[29].

Therefore, when verifying the prognostic value of the histological response, considerable work should be done on determining an optimal tumor response classification for GC. Currently, two major principles are common to these systems for grading tumor regression: (1) Estimating residual tumor in relation to fibrotic changes, e.g., the Mandard, AJCC/CAP, and Dworak systems[17,18,30]; and (2) Estimating the proportion of residual tumor in relation to the previous tumor site, e.g., the Becker and JGCA systems[16,31]. Although both are semiquantitative principles, the use of different systems reveals great regional disparities. The majority of pathologists consider the estimation of residual tumor to be easier than the assessment of therapy-induced fibrosis[4], which potentially means better inter-rater consistency for the residual tumor percentage method[32,33]. Most recently, an international survey summarized the preferences of 173 pathologists worldwide for the various TRG systems in gastrointestinal cancer[4]. According to the published results, the AJCC/CAP and Mandard systems were widely adopted in North America and Europe, respectively. However, the questionnaires from East Asia (one from Japan and the other from Korea) accounted for only two of the 173 valid responses, with no input from China[34]; it is doubtful whether these two contributions could fully represent the three countries that account for approximately one-third of the worldwide GC population. Overall, global diversity creates obstacles to the comparison of studies that use different standards.

A comparison of different histological response systems for GC was conducted by Zhu et al[28]. This study included 192 patients and found that five TRG systems (Mandard, JGCA, AJCC/CAP, Becker, and China) were not independent predictors of patient survival. Although the predictive abilities of each system were not measured, the Mandard and JGCA systems were recommended due to their superior prognosis prediction abilities. This was because a higher hazard ratio was found in the “no response” patients. In JCOG1004-A, 173 patients who received surgery following NACT were stratified according to different residual tumor cutoff percentages of 10%, 33%, 50%, and 67%. The 10% cutoff was found to be the best predictor of survival for various pathological types[9]. While this 10% cutoff finding was remarkable and coincided with Becker’s cutoff method (described above)[35], whether the current five- or six-tier JGCA standard provided the optimal discrimination value was not further investigated.

In the present study, c-index analysis was used to compare the discrimination value of five TRG systems, including the most recent JGCA2017-TRG system. Both the five-tier JGCA and the six-tier JGCA2017 systems scored significantly lower c-indexes than the four-tier AJCC/CAP and Becker systems. Because both the JGCA2017 and JGCA systems divide the residual tumor range into narrower intervals than the four-tier Becker system, the results of the present study indicated that five- or six-tier grading systems performed no better (and even worse) than four-tier systems in evaluating GC patients. The c-index comparison suggested that the four-tier Becker system had the best predictive value for GC patients. Because of their relatively wide grading intervals, four-tier systems based on residual percentages also mean a lower workload, easier understanding of protocols, and less inter-observer disagreement[4,36].

On the other hand, based on the JGCA2017 criteria, the present study revealed that grade 2b (1%-10%) was likely to predict longer OS and DFS than grade 0 (no response). Interestingly, when the percentages of residual tumor were reset to “no residual tumor”, < 10%, < 100%, and “no response”, the c-index of the rearranged cutoff values was significantly higher than that of the Becker system for DFS and borderline higher for OS. Similar results using these revised cutoffs were reported by Zhu et al[28], wherein an overtly higher HR was observed for grade 3 compared with the other JGCA grades, and by Becker et al[35], who demonstrated the independent predictive ability of the Becker system by using a cutoff of < 10% residual tumor. The results of the present study suggested that among moderate-to-poor responders (residual tumor 10%-99%), the response rate may not have a decisive impact on hazard stratification because NACT accounted for only a small part of the prognostic improvement among significant covariates in this group of GC patients. Meanwhile, a complete or subtotal response (0%-10%) often indicated fairly good sensitivity to chemotherapy, and the opposite held for non-responders (no regression), who receive no benefit from chemotherapy, only toxicity. Although the non-responders accounted for only 6.3% of all patients, this “break off both ends” approach may provide a way of screening for chemosensitivity and predicting prognosis in GC patients. However, a larger sample size is required to verify this proposal.

There were some limitations to this study. First, it was restricted by its single-center retrospective nature. Second, although histopathology was performed by two pathologists with over 10 years of experience, the inter- and intra-observer variability of the actual TRG classification was not analyzed. Third, despite the involvement of many covariates, the macroscopic information may not be sufficient. According to JCOG1004-A, the TRG cutoff standard may not be recommended for Borrmann type IV patients, and this information was not available in the current dataset[9]. Furthermore, this study did not consider the intestinal and diffuse types according to the Lauren classification, which are thought to be independent prognostic factors for survival[37]. Statistically, collinearity between the TRG and ypT categories is inevitable and would have affected the multivariable analysis results: the Pearson's coefficients with ypT were 0.619, 0.587, 0.662, and 0.639 for the JGCA2017, JGCA, Becker, and AJCC/CAP systems, respectively. To reduce the impact of multicollinearity, studies with an increased sample size are warranted.

CONCLUSION

In conclusion, although all five TRG systems could be used as independent predictors of LAGC patient survival, the six-tier JGCA-TRG system did not improve prognostic stratification and may reduce reproducibility and increase the workload of histological response evaluation. Patient survival can be effectively discriminated by the Becker system, which uses the residual tumor percentage rather than an estimate of the fibrosis/residual tumor ratio. Apart from when using the Becker classification, the group of non-responders with no regression was predicted to have a poorer prognosis. A large population-based study is still required to find the optimal criteria and validate the boundary settings of current TRG systems for LAGC patients.

ARTICLE HIGHLIGHTS

Research background

Various tumor regression grade (TRG) systems exist for gastric cancer (GC), but the most suitable one is not yet known.

Research motivation

We aimed to identify the most accurate TRG criteria for predicting patient prognosis.

Research objectives

To collect clinical data and post-treatment pathological samples from 413 patients with locally advanced GC (LAGC) after neoadjuvant chemotherapy.

Research methods

This was a retrospective clinical study in which the LAGC patients’ specimens were reviewed by two pathologists and the TRG grades were re-evaluated. The predictive abilities of five TRG criteria were then assessed and statistically compared based on survival/risk prediction models.

Research results

The four-tier Becker system showed the highest predictive ability among the five common TRG criteria. The TRG criteria could achieve optimal prediction when the residual tumor percentages were reset as “no residual tumor”, < 10%, < 100%, and “no response”.

Research conclusions

The four-tier Becker system is more suitable and should be recommended for LAGC patients.

Research perspectives

A population-based study is warranted to define the optimal criterion for TRG for GC.