Claire PM van Helsdingen, Audrey CHM Jongen, Wouter J de Jonge, Nicole D Bouvy, Joep PM Derikx
Abstract
Key words: Anastomotic leak; Consensus; Colorectal surgery; Postoperative complication; Morbidity; Colorectal anastomosis; Definition
Colorectal resection with the creation of an anastomosis is performed as part of the treatment for various colorectal diseases, such as cancer, inflammatory bowel diseases and diverticulitis. Colorectal anastomotic leakage (CAL) remains one of the most feared complications after colorectal surgery, with a reported incidence between 1.5% and 23%[1,2]. Despite the emerging knowledge about CAL and the improvement of surgical techniques, the incidence has remained stable over the last decades. If CAL occurs, morbidity and mortality rates increase and health-related quality of life decreases[3-5]. CAL also results in reduced long-term overall survival, an increased risk of cancer recurrence and higher healthcare costs[6-8].
Although CAL regularly occurs as a complication of colorectal surgery and is a commonly used outcome measure in clinical and experimental research, a generally accepted definition is lacking. Multiple definitions have been proposed in past years. The United Kingdom Surgical Infection Study Group introduced the definition “A leak of luminal contents from a surgical join between two hollow viscera”[9]. The International Study Group of Rectal Cancer (ISREC) proposed the definition “A defect of the intestinal wall integrity at the colorectal or colo-anal anastomotic site leading to a communication between the intra- and extraluminal compartments” combined with a grading system[10]. Nevertheless, none of these is widely accepted[11,12]. Van Rooijen et al[13] performed a consensus-based survey among Dutch and Chinese surgeons and concluded that there is no uniform definition of CAL. Our research group recently performed a systematic review on the definition of CAL, to gain insight into the different definitions and to show the effect of different definitions on the incidence rates of CAL. To this end, 2938 abstracts and 1382 full-text articles were reviewed. Eventually, only 347 articles contained a definition of CAL, which was striking given that CAL was one of the outcome measures in these studies. The definitions used in the papers varied strongly in composition and consisted of clinical parameters, radiological findings, treatment consequences and grading systems. Almost 66% of the articles used clinical signs and symptoms in their definition or method of diagnosis, varying from fever and abdominal pain to purulent discharge from wound or drain. Grading systems for CAL were used in 19% of the formulated definitions. The most common was the ISREC severity grading system of CAL (Supplementary material 1), followed by the Clavien-Dindo (CD) classification (Supplementary material 2)[10,14].
It also became clear that the reported incidence of CAL depends on the definition used[15].
In conclusion, there is great variety in the definitions of CAL, which hampers further investigation and intervention studies. Because a general definition of CAL is lacking, it is difficult to compare study outcomes and quality of hospital care. Therefore, there is an urgent need for a widely accepted definition that will increase both the understanding between clinicians and researchers and the comparability of clinical trials. The aim of this Delphi study is to minimize the variation in the definitions of CAL by reaching consensus on a definition of CAL, with a subaim of gaining insight into the various components, such as clinical parameters, laboratory tests, radiological findings and findings during reoperation, that the definition of CAL should contain. The Delphi technique is a widely used and accepted consensus method based on the opinions of experts, which will allow us to formulate a recommendation for a general definition that is eventually supported by a panel of international experts.
The preliminary results of the systematic review on the definition of CAL performed by our research group served as the basis of the survey[15]. For the current study we used the RAND/UCLA appropriateness method (RAM)[16]. The process consisted of two survey rounds followed by a third round in which the participants received our recommendation. This modified Delphi took place from January 30 through June 3, 2019. We used the recommendations of Sinha et al[17] for the reporting of the results. A summary of the consensus process is shown in Figure 1.
An international expert panel of colorectal surgeons and researchers was formed. We invited authors who had published three or more articles about CAL in past years. In addition, we tried to create an equal distribution in terms of discipline (clinicians and researchers) and countries where the authors were employed. The authors were selected from the articles used in our review, which included articles published between January 1990 and January 2016. To include the authors of papers published between January 2016 and January 2019, we performed an additional literature search. The search was conducted using PubMed with the following terms: Colorectal anastomotic leakage, anastomotic leak, anastomosis and colorectal, both as normal search terms and as MeSH terms. The participants did not know the identity of the other participants, to prevent them from influencing each other. We sent a participation letter by e-mail with a brief explanation of the aim of the study and included two appendices with a more extended description of the consensus process and an overview of the literature. The e-mail also contained the link to the first questionnaire. Because of a low response rate, we invited a second group, who were selected in the same way as the participants in the first group.
The questionnaires were developed and distributed using SurveyMonkey (SurveyMonkey Inc, Palo Alto, CA, United States; www.survey-monkey.com; for the questions used see Supplementary material 3). The questions were divided in nine categories: General definition, clinical parameters, laboratory tests, radiological findings, findings during relaparoscopy or relaparotomy, grading systems, timing and distinction between colon and rectum. The first questionnaire consisted of 11 rating questions, one multiple-choice question and one open-ended question. The participants were requested to rate the appropriateness of each statement made in the rating questions on a 1 to 9 Likert scale, where 1 equals “inappropriate” and 9 equals “appropriate”. In two questions the terms “inappropriate” and “appropriate” were replaced by the terms “completely disagree” and “completely agree”, respectively. In one of the rating questions we asked the participants to select 9 items from a list of 12 items and to rank those 9 items from 1 to 9, where 1 equals most contributing and 9 least contributing. Each question was followed by a text field in which the participants could make remarks and provide arguments for their answers. The first survey had an additional category to collect the participant characteristics.
Figure 1 Flow diagram of the consensus process.
All participants who completed the first round received the results of the survey by e-mail, in the form of the average group response and their individual score; the participants did not receive any specific answers of other respondents. The e-mail also contained the link to the second questionnaire, which was also developed and distributed using SurveyMonkey. The second survey consisted of the same statements as the first survey, regardless of whether consensus was reached or not. However, some statements were adapted and some items were added, based on comments and suggestions of the expert panel. The statements in the second round were rated in the same way as in the first round. The second survey gave the participants the possibility to reconsider their answers and to criticize the average group response (see Supplementary material 4 for the questions used).
As a third and final round, all statements were analyzed and a recommendation regarding the definition of CAL was presented (see Supplementary material 5). The recommendation, along with an overview of all reviewed statements, was sent by e-mail to all the participants who completed the second round. The participants were asked to reply whether they agreed or disagreed with our recommendation.
Consensus was reached if statements were rated “appropriate” (panel median 7-9) or “inappropriate” (panel median 1-3) without disagreement. Disagreement was measured by the interpercentile range adjusted for symmetry (IPRAS), according to the method used by Moossdorff et al[18]. MS Excel 2016 (Microsoft Corp, Redmond, WA, United States) and IBM SPSS (SPSS 24.0, IBM, Chicago, IL, United States) were used to conduct the analyses. The statistical methods for this study were reviewed by Prof. Dr. Jos W.R. Twisk from the Amsterdam UMC.
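The consensus rule described above can be illustrated with a short sketch. The median bands (1-3, 7-9) come from the text; the 30th and 70th percentiles, the constants 2.35 and 1.5, and the rule "disagreement if IPR > IPRAS" are taken from the general RAND/UCLA user's manual and are an assumption here, since this article does not state them. The function names and the percentile interpolation are hypothetical choices and may differ from what Excel or SPSS computed in the study.

```python
from statistics import median


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in 0..100) of an already sorted list."""
    if not sorted_vals:
        raise ValueError("no ratings supplied")
    k = (len(sorted_vals) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def classify_item(ratings):
    """Classify one statement from its 1-9 panel ratings.

    Assumed constants (RAND/UCLA manual, not stated in the article):
    IPRAS = 2.35 + 1.5 * asymmetry index, with disagreement when IPR > IPRAS.
    """
    vals = sorted(ratings)
    p30, p70 = percentile(vals, 30), percentile(vals, 70)
    ipr = p70 - p30                       # interpercentile range
    asymmetry = abs(5 - (p30 + p70) / 2)  # distance of the IPR centre from the scale midpoint
    ipras = 2.35 + 1.5 * asymmetry
    disagreement = ipr > ipras

    med = median(vals)
    if med >= 7:
        label = "appropriate"
    elif med <= 3:
        label = "inappropriate"
    else:
        label = "uncertain"

    return {
        "median": med,
        "label": label,
        "disagreement": disagreement,
        # Consensus as defined in the text: appropriate or inappropriate, without disagreement.
        "consensus": label != "uncertain" and not disagreement,
    }


if __name__ == "__main__":
    # Made-up ratings for illustration only; these are not study data.
    print(classify_item([8, 8, 9, 7, 8, 9, 9, 8, 7, 8]))
    print(classify_item([1, 1, 1, 2, 9, 9, 9, 9, 8, 1]))
```

In this sketch a tightly clustered high rating counts as consensus, whereas a polarized panel produces a wide interpercentile range and is flagged as disagreement.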
Fifty-eight surgeons and researchers were invited by e-mail to participate in this Delphi study. Twenty-three responded positively and completed the first questionnaire (40% response rate), 31 did not respond and four could not participate because of a lack of time or because they were no longer clinically active. The second survey was completed by 21 participants (91% response rate). Eventually, 19 panel members (17 surgeons, one surgeon who is also a researcher and one researcher) finished the third round (90% response rate). Of the participants who finished all three rounds, 14 were currently employed in a hospital in Europe, four in North America and one in Asia. For the list of panel members and the characteristics of the expert panel who completed this Delphi study, see Table 1 and Figure 2, respectively.
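As a quick arithmetic check, the per-round response rates can be recomputed from the counts given above. This is only a sketch: the function name and the assumption that each rate uses the number of participants approached in that round as its denominator are ours.

```python
def response_rate(completed: int, approached: int) -> int:
    """Response rate as a whole-number percentage."""
    return round(100 * completed / approached)


# (round, completed, approached) as reported in the text.
ROUNDS = [
    ("Round 1", 23, 58),
    ("Round 2", 21, 23),
    ("Round 3", 19, 21),
]

if __name__ == "__main__":
    for label, completed, approached in ROUNDS:
        print(f"{label}: {completed}/{approached} = {response_rate(completed, approached)}%")
```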
The first questionnaire consisted of 31 ranking items. Twenty-three items were rated appropriate without disagreement (74%). Five items (16%) were rated uncertain, four without disagreement and one with disagreement. Finally, three items were rated inappropriate without disagreement. After round 1, consensus existed on 26 out of 31 items (84%) and uncertainty and/or disagreement on five items. The multiple-choice question asked which existing general definition was most suitable according to the participants: 11 (48%) indicated the ISREC definition, and the other 12 indicated multiple definitions. The open-ended question asked the panel to give a range of postoperative days (POD) within which a leak must occur to be defined as CAL; the suggested ranges varied from 1 d up to 365 d. The most frequently suggested range was up to 30 d (31%).
The second questionnaire was an adjusted version of the first survey. Based on remarks from the participants, three questions were rephrased, which resulted in 15 additional items. The category clinical parameters consisted of one ranking question in a different format than the other ranking questions, so we converted the question to the same format (1-9 Likert scale). The newly formulated question contained 12 items. One item was added to the category laboratory tests and we divided two items into four items in the category grading systems, so eventually the survey consisted of 46 ranking items. Thirty-three items were rated appropriate without disagreement, nine items were rated uncertain without disagreement and four items were rated inappropriate without disagreement. In conclusion, after two rounds consensus was reached on 37 out of 46 items (80%). There were no changes in the items that had already reached consensus in round one. In the multiple-choice question we excluded the option to choose multiple general definitions, which led to 15 participants indicating the ISREC definition as most suitable (71%). See Table 2 for the summary of the items on which consensus was reached.
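The item tallies for both rounds can be checked in the same way. The sketch below counts an item as having reached consensus when it was rated appropriate or inappropriate without disagreement, as defined in the methods; the data structure and function name are hypothetical.

```python
# Item counts per round as reported in the text.
ROUND_TALLIES = {
    "round 1": {"appropriate": 23, "uncertain": 5, "inappropriate": 3, "total": 31},
    "round 2": {"appropriate": 33, "uncertain": 9, "inappropriate": 4, "total": 46},
}


def consensus_summary(tally):
    """Return (items with consensus, whole-number percentage of all items)."""
    rated = tally["appropriate"] + tally["uncertain"] + tally["inappropriate"]
    if rated != tally["total"]:
        raise ValueError("category counts do not add up to the item total")
    reached = tally["appropriate"] + tally["inappropriate"]
    return reached, round(100 * reached / tally["total"])


if __name__ == "__main__":
    for name, tally in ROUND_TALLIES.items():
        reached, pct = consensus_summary(tally)
        print(f"{name}: consensus on {reached}/{tally['total']} items ({pct}%)")
```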
The final round consisted of 11 recommendations, based on the outcomes of the two questionnaires (Table 3). Sixteen participants fully agreed with our recommendations (84%). The other three participants partly agreed with our recommendations.
After the first two rounds only nine items (19%) lacked consensus. Five of these items were in the category clinical parameters, namely tachypnea, (sub-)febrile temperature, postoperative ileus, oliguria and agitation. The panel members rated the laboratory tests leukocytosis, procalcitonin (PCT) and neutrophil to lymphocyte ratio as uncertain. The last item that did not achieve consensus was the radiological finding of an abscess not near the anastomosis.
Despite the relevance of CAL in daily clinical practice and in research, there is still no uniform definition of CAL. We performed a Delphi analysis using the RAM and reached consensus on the definition of CAL in 80% of the statements after two rounds. The ISREC definition of CAL was most frequently advised for use in both daily clinical practice and research. According to the experts, purulent discharge from the drain, a rectovaginal fistula and a defect found by digital rectal examination contribute the most to the suspicion of CAL. Furthermore, the serum markers CRP and CRP in combination with leukocytosis are valuable in the diagnostic process of CAL. PCT, albumin and urea are not deemed useful. Radiological criteria for computed tomography scan (CT-scan) based diagnosis of CAL should be extravasation of endoluminal contrast, an abscess around or near the anastomosis, air around the anastomosis and free intra-abdominal air. Opinion was divided with regard to an abscess found not near the anastomosis on CT-scan. The findings that need to be considered as CAL during re-operation are necrosis of the anastomosis, necrosis of the blind loop, dehiscence of the anastomosis and signs of peritonitis. Moreover, the definition should contain a grading system, and, according to the panel, both the ISREC-classification and the CD-classification are appropriate. The expert panel agreed that there should not be a fixed range of POD within which a leak must occur to be defined as CAL. Furthermore, there should be a distinction in the definition between early anastomotic leakage (EAL) and late anastomotic leakage (LAL). Finally, colonic anastomotic leakage and rectal anastomotic leakage should be seen as separate problems.
Table 1 List of panel members
The heterogeneous clinical presentation of CAL remains a challenge for clinicians in the diagnostic process; therefore, there is a need for more specific tests. Serum markers are an important tool in the follow-up after colorectal surgery. Potentially contributing laboratory tests found in the literature included CRP, leukocytes, PCT, neutrophil to lymphocyte ratio, albumin, urea and creatinine[12,19-22]. In the first round of this Delphi analysis only CRP was rated as an appropriate laboratory test; leukocytosis was rated uncertain. In the second survey, the item “combination of CRP and leukocytosis” was added, which was rated appropriate. One of the experts indicated that the trend in the two laboratory tests would be considered more reliable than the absolute values. Smith et al[23] investigated postoperative trajectory testing of the serum biomarkers CRP, PCT, white cell count (WCC) and gamma-glutamyl transferase. CRP, PCT and WCC were potential markers, with the highest accuracy for CRP, which had an area under the receiver operator characteristic curve of 0.961 [binomial 95% confidence interval (CI): 0.921-0.982]. The combination of CRP and WCC had an area under the receiver operator characteristic curve of 0.958[23]. Urea, creatinine and albumin were unanimously rated inappropriate; therefore, we suggest that these tests no longer be considered relevant for the diagnosis of CAL. PCT and neutrophil to lymphocyte ratio remained uncertain. Even though CRP and a combination of CRP and leukocytosis were rated appropriate, these laboratory tests are not specific for CAL. More research is needed to investigate more specific serum markers for CAL and the use of trajectory testing.
Figure 2 Panel members characteristics. A: Specialty; B: Continent employed; C: Country employed.
In addition to the clinical situation of the patient and biochemical tests, radiological examination plays an important role in the diagnostic process of CAL. The CT-scan is the preferred modality[24-26]. Our panel considered extravasation of endoluminally administered contrast, a collection around the anastomosis, an abscess near the anastomosis, perianastomotic air and free intra-abdominal air all appropriate radiological findings for CAL. The only remark was that defining free air on a CT-scan as CAL depends on the POD and on whether the operation was performed open or laparoscopically. The abscess not near the anastomosis is still a topic of debate; earlier studies have also described the indistinctness of pelvic abscesses[12,27,28]. The above-mentioned radiological findings should be described in every radiology report in which the clinical question is whether CAL is present. Creating this kind of standard radiological report is important for daily clinical practice as well as for research purposes. Likewise, the signs of anastomotic leakage found during relaparotomy or relaparoscopy, such as necrosis of the anastomosis, necrosis of the blind loop, dehiscence of the anastomosis and any signs of peritonitis, should be described in the operation report.
Using a grading system for CAL is considered important to improve the comparison of outcomes of hospital care and clinical studies. According to our panel members, both the ISREC-classification and the CD-classification are appropriate systems. The ISREC-classification is known as a valid system facilitating the comparison of clinical results, which is clear and easy to use[29]. However, the ISREC-classification has several limitations. It was developed specifically for low anterior resections in rectal cancer, where colorectal or colo-anal anastomoses are constructed, and therefore should not be used for colon-colon anastomoses. Another limitation of the ISREC-classification is that it is only useful in clinical practice but not in research, and it cannot be compared with other complications[10]. In contrast, the CD-classification can compare the impact of different complications and is therefore also useful in research. However, that means that the CD-classification is not specific for anastomotic leaks and thus does not take into account the severity or sequelae of the intervention to correct the leak. Furthermore, the CD-classification does not determine whether a leak is present or absent. According to our panel, a weakness of the CD-classification is that it is a combination of therapeutic actions and outcomes and that the outcomes do not necessarily correspond with the leak, but may be due to comorbidities. In summary, the CD-classification is widely accepted, not specific for CAL and more useful in research. On the other hand, the ISREC-classification is specific for anastomotic leaks in patients after low anterior resections and is more useful in daily clinical practice. Since both grading systems are suitable, but for different purposes, they should be used together when grading CAL.
Consensus was reached regarding whether there should be a fixed range of POD in which the leak can occur to define it as CAL. According to our panel members there should not be a range of days. Remarkably, the majority of the panel members (62%) gave a range of days in the open-ended question. However, the ranges varied widely from one day post-operatively to 365 d post-operatively. This is in line with the distinction between EAL and LAL. Recent papers showed that there are some differences between these two groups[30,31]. EAL appears to have different risk factors than LAL, namely younger age, increased Body Mass Index, laparoscopically performed anastomosis, emergency operation and no diverting ileostomy, which are more related to surgical difficulty. Independent risk factors for LAL include high Charlson Comorbidity Index, high American Society of Anesthesiologists score, preoperative complications and preoperative radiotherapy, which are more patient-related factors[32]. This raises the question of whether there are two different types of AL. However, multiple definitions of EAL and LAL are used in the literature. The cut-off points used for LAL also vary widely, between > 6 POD, > 90 POD and after hospital discharge[30,32-37]. Our panel members agreed, based on clinical experience, that a distinction should be made between EAL and LAL, but more research is needed to really prove this difference and to define the optimal cut-off point.
Table 2 Summary of the consensus on the definition of colorectal anastomotic leakage after two rounds
Furthermore, the experts agreed upon the statement that colonic anastomoses and rectal anastomoses should be seen as different entities. In the preliminary results of our review a comparison was made between incidence rates of colonic anastomotic leakage and rectal anastomotic leakage. It showed a significantly higher risk of rectal anastomotic leakage [Odds ratio: 0.71 (95%CI: 0.693-0.736), P value ≤ 0.001][15]. The reason for this higher risk could possibly be the difference in anatomy, but also the different surgical techniques that are used to construct the anastomosis and a different microbiome[38].
This study has several limitations. An inherent limitation of any consensus method is the number of panel members. According to the RAM, the panel should consist of a minimum of seven members[16]. Our panel of 19 members amply meets the advised panel size of seven. Most of the panel members were employed in Europe and North America; only one was employed in Asia. Of the 56 surgeons and/or researchers who were invited, 14 were based in Asia and only three of them responded to our invitation. All our communication (invitation letter, appendix, questionnaire) was in English. The low participation rate among the Asian experts could possibly be due to difficulty with the English language. This low participation rate is in contrast to the high response rate on a consensus survey among Dutch and Chinese surgeons, where the questionnaire, originally in Dutch, was translated into Chinese[13]. Our suggestion for future research is to translate the questionnaire into different languages, if the goal is to achieve an even global distribution within the expert panel. Notwithstanding the uneven distribution across the world, we succeeded in creating an international panel of experts who were all employed at different institutes. The last limitation was that our panel consisted mainly of colorectal surgeons (96%), so the group was very homogeneous. This could cause information bias. Considering that this definition will be used primarily by colorectal surgeons, we consider this an appropriate panel composition for this Delphi study.
Table 3 Recommendations final round
In conclusion, consensus was reached regarding the definition of CAL. The panel recommends that the ISREC definition be used as the general definition of colorectal anastomotic leakage. When defining CAL, the ISREC grading system should be complemented with the Clavien-Dindo classification.
Using our modified Delphi method, a consensus-based recommendation for the definition of CAL was formed that can be widely incorporated in the field.
This study shows that there is an urgent need for a uniform definition of CAL. The consensus-based recommendation for the definition of CAL is a step forward in achieving this uniform definition. It now needs to be incorporated in clinical practice and in research to improve the quality of research outcomes.
The authors gratefully acknowledge our panel members for their active participation and their valuable contribution to this study.
World Journal of Gastroenterology, 2020, Issue 23