Development and validation of the perioperative recovery scale for integrative medicine

2019-07-25 06:46
Traditional Medicine Research, 2019, Issue 4

Li Zhou, Bi-Ying Su, Shao-Nan Liu, Xiao-Yan Li, Li-Xing Cao, Li-Ming Lu, Ze-Huai Wen6*, Zhi-Qiang Chen*

1 The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China.
2 Key Unit of Methodology in Clinical Research, Guangdong Provincial Hospital of Chinese Medicine, Guangzhou, China.
3 Guangdong Provincial Hospital of Integrated Traditional Chinese and Western Medicine, Foshan, China.
4 Department of Surgery, Guangdong Provincial Hospital of Chinese Medicine, Guangzhou, China.
5 Guangzhou University of Chinese Medicine, Guangzhou, China.
6 National Center for Design Measurement and Evaluation in Clinical Research, Guangzhou University of Chinese Medicine, Guangzhou, China.

Background

Every year, more than 230 million major surgical operations are performed across the globe [1]. Owing to improvements in surgical methods and infection control, morbidity and mortality related to major surgery are on the decline [2]. Nevertheless, the quality of life of postoperative patients with complications has not improved [3-5]. Derogar et al. [5] demonstrated that postoperative complications exert a long-lasting negative effect on quality of life among patients who survive at least 5 years after a cancer-related esophagectomy.

Traditional Chinese medicine (CM), including Chinese herbal medicine, acupuncture, and massage, combined with Western medicine (WM), an approach also called integrative medicine (IM), is becoming increasingly popular for treating postoperative complications and improving quality of life [6-10]. As early as 610 A.D., the ancient Chinese medical book Zhu Bing Yuan Hou Lun recorded the application of traditional CM after surgical operations. It stated that patients with intestinal rupture caused by trauma should drink only porridge during the 20 days after intestinal anastomosis. If a patient ate too much after the surgery, he would suffer from stomachache. Moreover, taking Qianxie Powder (a Chinese medicine prescription consisting of iron scraps, Fructus Anisi Stellati, etc.) helped to relieve the pain. Wai Ke Zheng Zong, published in the Ming Dynasty of China (1617 A.D.), also recorded that Hui Xiang Cao powder (a Chinese medicine prescription consisting of Foeniculum vulgare and Alpinia Rhizoma) was used for postoperative analgesia, and Crinis Carbonisatus was used for postoperative hemostasis.

In the fields of psychology and medicine, various scales are used as outcome measurements. Scales have been developed to assess postoperative recovery; however, many are of limited usability in the CM/IM context. In 2008, Kluivers [11] conducted a systematic review of recovery-specific quality of life (QOL) instruments. The results showed that no validated instrument existed for the assessment of general postoperative recovery according to the quality criteria for measuring the properties of health status questionnaires. Only two scales were recommended: the Post-discharge Surgical Recovery Scale [12] and the Quality of Recovery-40 (QoR-40) [13]. Allvin et al. [14] have also developed a scale containing the five domains of physical symptoms, physical functions, psychological, social and activity. Stark and Myles [15] also developed a simplified version called QoR-15, which is based on QoR-40 and provides an accurate, extensive, and yet efficient evaluation of postoperative health. A limitation of these instruments is that they cannot adequately assess treatment efficacy for CM/IM.

Major differences exist between CM and WM in terms of their methods of evaluating health status on the basis of dissimilarities in human physiology and pathology. Practitioners of CM usually collect patient-reported information including objective symptoms such as diet, sleep, pain, subjective feelings, and emotions to make a diagnosis and to evaluate treatment efficacy [16]. QOL or patient-reported outcomes (PROs) could be a more reliable approach to evaluation. However, there is a dearth of suitable scales to evaluate the efficacy of both CM and IM during the perioperative period. Thus, developing a scale that is consistent with the clinical features of IM during the perioperative period can not only assess patient recovery, but also better reflect the treatment characteristics of CM/IM.

Several other instruments have been developed based on Chinese medical theory or for integrative medical research; however, few have addressed postoperative quality of life [17-20]. Wang et al. [21] developed a scale for postoperative rehabilitation, which may be the first to consider the characteristics of CM. However, this scale has problems: it includes only 10 items, which is insufficient for a thorough analysis of postoperative patient condition. The creators of the scale focused only on the direct influence of operations, potentially neglecting indirect influences such as mental health and the overall assessment of patient health. Furthermore, the sample size in Wang's study may have been too small. Therefore, the primary goal of this study is to develop a scale that can be applied to the evaluation of the recovery status of postoperative patients.

Methods

For our study, we made slight modifications to the standard PROs procedures recommended by the Food and Drug Administration (FDA) of the U.S. Department of Health and Human Services [22]. We conducted the procedures in accordance with the following five stages: (1) conceptualisation and item generation; (2) item reduction and development of the initial version; (3) preliminary evaluation of the validity and reliability of the initial version; (4) further selection of items and generation of a test version of the perioperative recovery scale for integrative medicine (PRSIM); and (5) a final evaluation of the scale for validity, reliability and responsiveness, constituting the final version (Figure 1).

Conceptualisation and item generation

In accordance with the guidance of CM theories and the guidelines for developing psychometric scales, the theoretical framework and item pool of the PRSIM were initially devised with the aid of a literature review, other scales and expert consultation. Several instruments related to surgery and anesthesia, published between 1976 and 2010, were identified in the literature review. In light of the purpose of this scale, the initial theoretical framework for the perioperative IM recovery scale was developed through comprehensive analysis of the items of previous scales and expert opinions. Next, the scale was constructed, including the domains of physical function, mental function, activity function, general health perceptions and pain. In addition to these domains, a social function domain was included to assess patient condition before surgery and after discharge from a longer-term perspective. We decided that, according to CM theory, the domains of the scale should encompass the Five organs (the heart, lungs, liver, kidneys and spleen), Qi-blood (Qi is believed to be a vital force forming part of any living entity; https://en.wikipedia.org/wiki/Qi) and Yin-Yang (Yin and Yang is a concept of dualism in ancient Chinese philosophy, describing how seemingly opposite or contrary forces may actually be complementary, interconnected, and interdependent in the natural world, and how they may give rise to each other as they interrelate to one another; https://en.wikipedia.org/wiki/Yin_and_yang).

The pathogenesis of surgical diseases in CM is mainly attributed to disorders of the Five organs, Qi-blood and Yin-Yang. The original prognosis of surgical illness in CM is also related to the Five organs. In accordance with the CM theory mentioned above, six domains were included. In total, 122 items were included in the item pool, based on the theoretical framework of the scale. The number of items in the six domains was 49, 24, 25, 7, 12 and 5 for physical function, mental function, activity function, social function, pain and general health assessment, respectively (see additional file 1). Details have been published by Su et al. [23]. To select appropriate response descriptors for the PRSIM, response scale analysis was conducted on capacity, frequency, intensity and evaluation, with a total of 20 scale descriptors [24] (see additional file 2).

Figure 1 Flow chart of PRSIM development

Item reduction, development and evaluation of the initial version

The main task at this stage was item selection. When developing scales, researchers typically utilize both expert consultations and patient investigation. After the first round of expert consultations, 42 of the existing 122 items were retained [25]. The aim of the second round of expert consultations was to select 25 items out of the 42, which would cover the same domains as before. Of these 25 items, 13 were grouped under the dimension of physical function, 5 under mental function, 2 under activity, 2 under pain and 3 under general health perceptions. We deleted 2 items and retained 23, in accordance with the results of four types of screening: dispersion, correlation coefficients, factor analysis and stepwise regression. Fifty patients from orthopedics, general surgery and gynecology participated in the test. Of these 50, 21 received a test-retest assessment; the retest interval was 24 to 48 hours after surgery, and the majority were assessed within 24 hours of surgery [13, 14]. After the initial evaluation of the scale, the test-retest reliability was 0.727, the split-half reliability was 0.739, and the Cronbach's α coefficient was 0.821. These results indicated acceptable reliability. There was good content validity, but construct validity showed some divergence from the theoretical framework. Details have been previously published by Liu et al. [25] (see additional file 2).

Further selecting items and generating the final version of the PRSIM

An expert consultation meeting was organized to discuss the existing problems based on the preliminary investigation version. Experts were asked to present specific proposals for this version. Then, the members of a focus group discussed the existing proposals and decided which questions to retain. The focus group also summarized suggestions raised by the patients. A second expert consultation meeting was held to discuss the results of the first meeting and the suggestions of the focus group. With the experts' views in mind, we modified the preliminary investigation version. Ambiguous words or sentences as well as incomprehensible items or answers were modified to develop a version that would be easy to implement and understand. After two expert consultation meetings, one patient investigation, and one focus group discussion, 20 items were agreed upon for the five domains of direct influence, indirect influence, activity, mental function and general health perceptions (see additional file 3).

Final evaluation of the PRSIM

The final version of the PRSIM was evaluated by investigating participants who had been recruited from the departments of general surgery, gynecology, breast surgery and orthopedics at Guangdong Provincial Hospital of Chinese Medicine (GPHCM). In order to measure discriminant validity, the study also included healthy persons from the GPHCM Physical Examination Center whose physical examination results had not reported any major diseases. The estimated total sample size was 200, according to the principle of sample size estimation for psychometric tests, with 5-10 participants for each item of the scale [26]. The inclusion criteria for patients were as follows: (1) aged 18-75; (2) conscious, with some degree of mental comprehension and ability to communicate with others either by speech or writing; and (3) had agreed to participate in the study and had signed the informed consent. The exclusion criteria were: (1) patients with mental disorders; and (2) patients who had undergone minor operations (e.g., gynecological hysteroscopy).

Statistical analysis

All data were input by double-entry with EpiData (Jens M. Lauritsen & Michael Bruus, v.3.1, 270108), and then double-checked. Correlation analysis, exploratory and confirmatory factor analysis (CFA), analysis of variance, Cronbach's α coefficient, and structural equation modeling were performed using PASW Statistics 18.0 (IBM SPSS Inc., Armonk, New York, USA) and Amos 19.0 (Amos Development Corporation, Meadville, USA).

The FDA guidance gives no recommendation for calculating the sample size needed to evaluate the validity of a PRO scale [22]. According to the principle of sample size estimation for psychometric tests, with 5-10 participants for each item of the scale [26], the estimated total sample size for the validity and reliability tests of the PRSIM was 200.
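
The per-item rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_size_range` is hypothetical, and the 20-item count is taken from the final version of the PRSIM described earlier. The upper bound corresponds to the estimate of 200.

```python
def sample_size_range(n_items, per_item_low=5, per_item_high=10):
    """Return (minimum, maximum) participants under a per-item rule of thumb.

    Hypothetical helper: multiplies the number of scale items by the lower
    and upper number of participants recommended per item.
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return n_items * per_item_low, n_items * per_item_high


# The final PRSIM has 20 items; 5-10 participants per item are assumed.
low, high = sample_size_range(20)
print(f"Recommended sample size: {low} to {high} participants")
```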

Construct validity was first examined through factor analysis, provided that the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity supported its use. After the exploratory factor analysis (EFA), we conducted CFA using structural equation modeling to verify whether the conceptual model fit. Discriminant validity was evaluated by comparing the results between patients and healthy people using an independent t-test. The responsiveness of the scale was assessed by comparing patient data with another dataset collected from the same patients at a different time, using a paired t-test.
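
As a minimal sketch of the two comparisons described above, the following Python code computes the test statistics only. It assumes the pooled-variance (Student) form of the independent t-test, which the text does not specify, and all scores are hypothetical. The analyses in this study were run in PASW Statistics, not with this code.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(group_a, group_b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


def paired_t(first, second):
    """Paired t statistic and degrees of freedom for repeated measurements."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical domain scores: patients vs. healthy people (discriminant validity).
patients = [12, 14, 11, 15, 13]
healthy = [18, 17, 19, 16, 20]
print(independent_t(patients, healthy))

# Hypothetical scores for the same patients on day 1 and day 4 (responsiveness).
day1 = [10, 12, 11, 13, 9]
day4 = [13, 14, 12, 17, 12]
print(paired_t(day1, day4))
```

The p-value would then be read from the t distribution with the returned degrees of freedom.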

Internal consistency reliability was evaluated by calculating Cronbach's α, and a value of α greater than or equal to 0.7 was considered satisfactory [27]. Test-retest reliability was measured using the intra-class correlation coefficient (ICC). The split-half reliability was calculated by correlation and Cronbach's α, with the odd-numbered items as one group and the even-numbered items as another.
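
The internal consistency and split-half calculations can be illustrated with a short Python sketch. It uses the standard formula for Cronbach's α and a plain Pearson correlation between the odd-item and even-item half scores; whether a Spearman-Brown correction was applied in this study is not stated, so none is applied here. The response data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def odd_even_split(rows):
    """Correlation between odd- and even-item half scores, plus each half's alpha."""
    odd = [row[0::2] for row in rows]   # items 1, 3, 5, ...
    even = [row[1::2] for row in rows]  # items 2, 4, 6, ...
    r = pearson_r([sum(h) for h in odd], [sum(h) for h in even])
    return r, cronbach_alpha(odd), cronbach_alpha(even)


# Hypothetical responses: 5 respondents x 4 items on a five-point scale.
responses = [
    [1, 2, 1, 2],
    [2, 2, 3, 3],
    [3, 4, 3, 4],
    [4, 5, 5, 4],
    [5, 5, 4, 5],
]
alpha = cronbach_alpha(responses)
half_r, alpha_odd, alpha_even = odd_even_split(responses)
print(f"alpha={alpha:.2f}, split-half r={half_r:.2f}")
```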

Results

Sample characteristics

A total of 354 participants, more than the estimated sample size of 200, were investigated. Of the 354 participants, 349 were included in the final analysis after excluding 2 whose age did not meet the inclusion criteria and 3 for whom more than 20% of the data were missing. The participants used a five-point Likert scale to rate their symptoms for the 20 items. Higher scores indicated better recovery. Only 288 participants recorded the time taken to complete the scale; the mean time was 5.94 min (min: 1, max: 20). The characteristics of the participants are shown in Table 1.

Table 1 Characteristics of participants

Validity

Construct validity was assessed with EFA. The results of the KMO and Bartlett's test of sphericity demonstrated a sufficient number of significant correlations to perform factor analysis. EFA with varimax rotation extracted five factors that jointly accounted for 50.4% of the variance (see Table 2).

Construct validity was also tested with CFA by establishing a five-factor model according to the original scale structure [23]. The results are presented in Figure 2. The following data support a good fit for the model: Chi-Square/DF = 1.907, GFI = 0.92, AGFI = 0.90, NFI = 0.80, CFI = 0.89, SRMR = 0.06, RMSEA = 0.051 (95% CI: 0.042~0.060).
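
As a consistency check on the fit indices reported above, the RMSEA point estimate can be recovered from the Chi-Square/DF ratio and the sample size with the commonly used formula RMSEA = sqrt(max(χ²/df − 1, 0) / (N − 1)). The sketch below assumes N = 349, the number of participants in the final analysis; the exact N used in the CFA and the formula variant used by Amos are assumptions, not details taken from this study.

```python
from math import sqrt


def rmsea_from_ratio(chi2_over_df, n):
    """RMSEA point estimate from the chi-square/df ratio and sample size n."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return sqrt(max(chi2_over_df - 1.0, 0.0) / (n - 1))


# Reported ratio 1.907; N = 349 is assumed from the final analysis sample.
value = rmsea_from_ratio(1.907, 349)
print(f"RMSEA = {value:.3f}")
```

Under these assumptions the result rounds to 0.051, in line with the reported value.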

Discriminant validity was demonstrated by comparing the 349 patients with 51 healthy people for whom a physical examination in a hospital had shown no obvious organic lesions. Since 3 items (5, 18 and 20) in the scale concerned pain and postoperative evaluation, only the other 17 were included in the analysis. The results showed good discriminant validity for the PRSIM (Table 3). Responsiveness was demonstrated with results at different time points from the first to the fourth day (Table 4).

Reliability

Table 5 shows the internal consistency of the items on the scale. Cronbach's α for the scale was 0.70, and the values for the five domains of direct influence, indirect influence, activity, mental function, and general health perceptions were 0.51, 0.45, 0.52, 0.31 and 0.54, respectively. Test-retest reliability was calculated for the 50 participants who had repeated the test after 24 hours. One case was deleted because it was missing more than 20% of the data. The ICC for the total score of the PRSIM was 0.91 (95% CI: 0.85~0.95), which indicated satisfactory results. ICCs of the five domains were 0.67, 0.78, 0.81, 0.47 and 0.84, respectively (Table 5). The split-half reliability was calculated by correlation and Cronbach's α: the correlation between the odd-numbered and even-numbered items was 0.66, and the Cronbach's α values of the odd-numbered and even-numbered items were 0.61 and 0.68, respectively.

Discussion

The PRSIM is a general scale designed for use with patients in the perioperative period. A total of 349 patients from four inpatient departments took part in this evaluation of the psychometric properties of the PRSIM. This is the first study to develop and validate a scale on the recovery status of postoperative patients in a general surgical setting combining CM and WM. The PRSIM can be applied to postoperative patients receiving either CM or IM. This study indicates that the PRSIM is reliable and accurate in assessing perioperative recovery.

Table 2 Exploratory factor analysis of the PRSIM

Figure 2 Confirmatory factor analysis model for the PRSIM

Table 3 Discriminant validity of the PRSIM between patients and healthy people

Table 4 Responsiveness of the PRSIM(n=50)

Perioperative recovery is worthy of attention because operations create bodily and psychological discomfort beyond the removal of lesions. To our knowledge, no other general scale for perioperative recovery applicable to both CM and IM has been widely adopted. In accordance with the standard procedures of PROs recommended by the FDA [22], we used the combination of CM theory and the context of Chinese culture to develop the PRSIM. It contained 20 items with 5-point Likert scales, covering five domains (see additional file 3): direct influence, indirect influence, activity, mental function and general health perceptions. These five domains should accurately reflect overall patient recovery.

With regard to the internal consistency of the PRSIM, Cronbach's α reached 0.70 for the total score of the PRSIM, indicating satisfactory consistency. The corrected item-total correlations were all well above the recommended level of 0.2, which also showed that the PRSIM was acceptable. The ICCs for the test-retest reliability of the PRSIM were all above 0.4, which is not satisfactory but acceptable. This may have been due to the small sample size. In addition, the split-half reliability also proved to be high.

Construct validity is an approach whereby a questionnaire is tested against a hypothesis [28]. The PRSIM demonstrated high construct validity in EFA and CFA. The PRSIM was broken down into five factors, and the values of the five-factor model were high. Similar studies have not utilized CFA [13, 15]. However, 2 items (18 and 19) had double loadings; they remained in factor 3 because their loadings were higher on factor 3 than on factor 1. Thus, CFA further supported the construct of the PRSIM.

Discrimination was high in the direct influence and activity domains, but not in the indirect influence, mental function or general health perceptions domains. There was no statistically significant difference between the general health perception scores of patients and healthy people. In the other domains, however, healthy people scored higher than patients. The findings for indirect influence and mental function may be explained by improvements in surgical techniques: minimally invasive surgery shortens operation time and reduces intraoperative bleeding.

The attribute of responsiveness is regarded as a constituent of scale validation [29]. With the exception of the activity domain, the results for responsiveness were not ideal, but scores tended to be higher on the fourth day than on the first. More time may be necessary to assess post-surgery patient response.

Several limitations should be kept in mind. These include the small sample size compared to other large psychometric studies of this kind, the use of a single-site survey, which limited the representativeness of the population, and the fact that convergent construct validity was not calculated. Nonetheless, these results can be considered a first step in the validation process for perioperative recovery in IM. They present a promising starting point for future research on this topic. In future studies, this scale could be applied to larger populations and different settings.

Conclusion

The PRSIM was developed to evaluate the recovery of postoperative patients based on PROs, taking CM theory into account. Acceptable reliability and validity of the PRSIM have been demonstrated. The PRSIM has the potential to be applied in clinical and research settings to evaluate the efficacy of therapies, particularly in studies utilizing either CM or IM. In the future, the scale will be tested in larger populations, as well as in multi-center and international contexts.