Reliability of Chinese web-based ocular surface disease index questionnaire in dry eye patients: a randomized, crossover study

2021-06-11 00:49

Xin-Mei Zhang, Lan-Ting Yang, Qing Zhang, Qing-Xia Fan, Can Zhang, Yue You, Chen-Guang Zhang, Tie-Zhu Lin, Ling Xu, Salissou Moutari, Jonathan E. Moore, Emmanuel E. Pazo, Wei He

1Department of Ophthalmology, He Eye Specialists Hospital, Shenyang 110034, Liaoning Province, China

2The Second Affiliated Hospital of Dalian Medical University, Dalian 116000, Liaoning Province, China

3School of Mathematics and Physics, Queen's University Belfast, University Road, Belfast, Northern Ireland BT7 1NN, United Kingdom

4Cathedral Eye Clinic, 89‐91 Academy Street, Belfast, Northern Ireland BT1 2LS, United Kingdom

5Biomedical Sciences Research Institute, University of Ulster, Coleraine, Northern Ireland BT52 1SA, United Kingdom

Abstract

● KEYWORDS: dry eye disease; ocular surface disease index; Rasch analysis; test-retest reliability; web-based questionnaire

INTRODUCTION

The worldwide prevalence of dry eye disease (DED) is estimated to be anywhere from 5% to 50%, and DED is reported to be highly prevalent in China and globally[1‐4]. Its prevalence continues to rise due to factors such as increased multimedia screen usage, an ageing population, and environmental factors. Several objective clinical tests are available for evaluating DED, but the inherent variability of clinical features in DED favors the use of subjective assessment[5‐6]. Among the various DED questionnaires, the ocular surface disease index (OSDI) has been one of the most popular DED assessment questionnaires since its conception in 1997[7]. DED can lead to discomfort and impaired vision, along with decreased quality of life and work productivity[8‐9]. DED is generally managed with artificial tears, warm compresses, omega‐3 fatty acids, anti‐inflammatory drugs, tetracyclines, secretagogues, intense pulsed light (IPL), cholinergics, lacrimal plugs, systemic immunosuppressives, eyelid massage and expression, serum tears, and amniotic membrane biologic corneal bandage lenses, to name a few[10‐13].

Traditionally, survey data in research have been collected using paper questionnaires[14]. However, in recent years this method has faced challenges, as multiple reports have stated that response rates have declined by approximately 1% per annum in various countries[15]. Recent smartphones offer advanced computing and communication capabilities, and smartphones, along with web‐based health‐related services, are transforming clinical research settings. Since 2011, the World Health Organization (WHO) has recognized the use of mobile phones and other electronic devices for medical and public health practices under the umbrella of mHealth[16]. WeChat (Tencent Holdings Ltd., China), a smartphone application, has a large user base in Asia and offers a real‐time platform for sharing information. Currently, the use of web‐based questionnaires on the WeChat platform is growing rapidly in the field of telemedicine. Additionally, multiple studies have utilized and validated WeChat‐based questionnaires for health‐related research and clinical practice[17‐19]. In the last decade, due to the increased adoption of the internet, researchers have adopted web‐based data entry and direct e‐mail for collecting data[20‐21]. Since internet‐based questionnaires are increasingly gaining popularity in survey research, it is imperative to test the instruments’ reliability. Scholars have explored methods of validation, administration, real‐world considerations, and reliability of electronic versions of patient‐reported outcome measures (PROMs)[22‐23], and a growing number of clinical researchers support the use of web‐based survey methods and instruments to reduce the logistical hurdles associated with large‐sample survey research[24‐25]. The meta‐analysis by Gwaltney et al[26] suggests that there is an overall high level of agreement between paper and electronic versions of health‐related questionnaires. 
The review included peer‐reviewed articles from the fields of allergies, asthma, alcoholism, cardiology, diabetes, gastrointestinal disease, pain assessment, psychiatry, and rheumatology. On the other hand, a study utilizing the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire‐Core 30 reported minor but statistically significant differences of 3 to 7 mean score points (on a 100‐point scale) related to the method of questionnaire administration[27]. Clayton et al[28], comparing the equivalence of the web‐based and paper‐based subscale of the OSDI in 68 DED patients, who were primarily Caucasian (n=43), found no statistically significant difference between the paper‐based and web‐based versions. However, the reliability of the OSDI in the Chinese language has not been rigorously assessed. Additionally, it has been documented that health‐related questionnaire scores have the potential to be culturally biased or neutral[29‐30]. Therefore, the current study aims to assess the reliability of the web‐based OSDI questionnaire in the Chinese language (C‐OSDI) for evaluating the ocular surface health of DED participants in comparison with paper‐based administration of the C‐OSDI.

SUBJECTS AND METHODS

Ethical Approval

This study was reviewed and approved by the Institutional Review Board of He Eye Specialist Hospital, Shenyang, China and followed the tenets of the Declaration of Helsinki. All participants signed an informed consent form after receiving a detailed explanation of the study and the possible consequences of participating in it. Data were collected from the participants between September 2019 and December 2019. In this randomized, crossover design study, all participants had previously been clinically diagnosed with DED and completed both the paper‐based and web‐based versions of the same C‐OSDI questionnaire. The C‐OSDI questionnaire quantitatively measures the subjective symptoms of DED[31].

Two hundred and fifty‐four consecutive consenting Chinese adults were enrolled in this prospective study. Diagnostic criteria: 1) at least 1 of 6 symptoms (dryness, burning, sandiness, tiredness, discomfort of the eye, and/or blurred vision) with non‐invasive tear break‐up time (NITBUT) ≤10s[5]; 2) at least 1 of 6 symptoms (dryness, burning, sandiness, tiredness, discomfort, and blurred vision) accompanied by corneal fluorescein staining (CFS) score[32]. Inclusion criteria: full legal age; diagnosis of DED; ability to follow study guidelines, to read and comprehend the questionnaire without help or support, and to complete the entire study protocol; and provision of signed consent. Exclusion criteria: inability to give informed consent; participation in other studies (burden of participation); best‐corrected visual acuity (BCVA) <20/20; previous ocular surgery or trauma; acute inflammation; blepharal dysraphism; history of blepharal and periorbital skin disease or allergies in the last month; history of herpes zoster infection; rheumatic immune systemic diseases; pregnancy; breastfeeding; and use of photosensitive drugs/foods.

Experimental Design

The pretesting and pilot testing phase of the study consisted of an evaluation of the usability, accessibility, and clarity of the web‐based version of the C‐OSDI questionnaire by 3 ophthalmologists and 3 non‐experts. This was conducted to assess the functionality of the web‐based C‐OSDI questionnaire, which was identical to the validated paper‐based OSDI questionnaire.

Three hundred patients between 18 and 62 years of age voluntarily participated in the study. Participants were assessed for eligibility during their initial visit to the clinic, and eligible participants were invited to enroll in the study. Enrolled participants were asked to complete the questionnaires during their visit to the hospital under the observation of trained physicians. This study was designed as a 2‐arm, prospective, crossover, randomized study. All participants were required to complete both versions of the same C‐OSDI questionnaire (paper‐based and web‐based). Participants in group A first completed the paper version, followed by the web version on their personal smartphone, with a 20min break between the two sessions. Participants in group B first completed the web version on their personal smartphone, followed by the paper version, with a 20min break between the two sessions (Figure 1). Because symptoms of DED can vary from day to day and with environmental conditions, both groups completed both versions of the C‐OSDI questionnaire on the same day. The 20min interval between sessions was also intended to mitigate the carry‐over effect of the previous questionnaire.

Randomization

Participants were randomly assigned to either group A or group B in a 1:1 ratio using a computer‐generated randomization list with a specified seed and a block size of 4. Prior to the administration of the questionnaires, written instructions were provided to all participants, and the questionnaires were completed at the hospital under the supervision of three trained medical doctors (Fan QX, Zhang C, and You Y).
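A permuted-block list of this kind can be sketched in a few lines of Python. This is only an illustration of the general technique, not the software used in the study: the function name `block_randomize`, the seed value 2019, and the use of Python's `random` module are all assumptions made for the example.

```python
import random


def block_randomize(n_participants, block_size=4, seed=2019):
    """Return a list of 'A'/'B' allocations built from permuted blocks (1:1 ratio).

    The seed value is hypothetical; the study states only that a seed was specified.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for a 1:1 allocation ratio")
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    allocations = []
    while len(allocations) < n_participants:
        # Each block holds equal numbers of A and B in random order.
        block = ["A"] * (block_size // 2) + ["B"] * (block_size // 2)
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_participants]


# Hypothetical usage: one allocation per patient assessed for eligibility.
allocation_list = block_randomize(300)
```

Because every complete block of 4 contains two allocations to each group, the group sizes can never differ by more than 2 at any point during enrollment.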

Questionnaire

The OSDI (Allergan Inc, Irvine, CA, USA) is a frequently used instrument for assessing DED. It comprises 12 items, and the final score ranges from 0 (no symptoms) to 100 (severe symptoms) points[7]. The 12 items of the questionnaire are grouped into three subscales. The authors followed the guidelines for self‐administered questionnaire design to reduce the risk of errors (Figure 2)[33]. Industry‐standard guidelines for translation were employed to achieve a scientifically accurate translation of the OSDI questionnaire from English to Chinese[34].

Clinical Assessment

A full ophthalmic examination was performed, including BCVA (Snellen) at 4 m, corneal and conjunctival examination with a slit lamp microscope, and intraocular pressure (IOP) measurement. Subjects were evaluated for DED before the administration of the C‐OSDI questionnaires using the following assessments. NITBUT was measured using the Keratograph 5M (Oculus, Germany); three consecutive measurements were obtained, and the median value was recorded and used in the final analysis. Tear film lipid layer (TFLL) interferometry was performed with the DR‐1 (Kowa, Nagoya, Japan) to assess TFLL quality, which was graded from 1 to 5 according to the Yokoi DE severity grading system[35]. CFS: after instillation of fluorescein, the cornea was evaluated using the Efron system and scored between 0 and 4[36]. Conjunctival hyperemia (CH) was assessed using the Keratograph 5M (Oculus, Germany), with redness scores (RS; accurate to 0.1 unit) generated by the device[37].

Figure 1 Study flow diagram.

Figure 2 Screenshot of web-based C-OSDI version.

Calculations of Questionnaire Scores

Responses to the paper questionnaires were manually transferred into a password‐protected electronic spreadsheet by three trained medical doctors (Fan QX, Zhang C, and You Y). Responses to the web‐based questionnaires were automatically transcribed after the participant completed the questionnaire and were downloaded into a password‐protected electronic spreadsheet. All questionnaires were checked for completeness on a per‐item basis, and incomplete questionnaires were not included in the final analysis. The total C‐OSDI score was obtained by following the official guidelines[7]. Direct comparison of individual items, subscales, and total scores was the primary aim of this study. Following the completion of both versions of the C‐OSDI questionnaire, participants were asked to indicate whether they preferred the paper‐based version, the web‐based version, or both versions of the questionnaire.
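For readers unfamiliar with OSDI scoring, the standard formula multiplies the sum of the item responses (each graded 0 to 4) by 25 and divides by the number of items answered, which yields a 0 to 100 score. The sketch below illustrates that formula; the function name `osdi_total_score` and the use of `None` for a "not applicable" response are assumptions made for the example, and in this study only questionnaires with all 12 items completed were analyzed.

```python
def osdi_total_score(responses):
    """Compute an OSDI total score (0-100) from item responses.

    responses: list of item responses, each an integer from 0 to 4,
    or None for an item marked as not applicable (hypothetical convention).
    """
    answered = [r for r in responses if r is not None]
    if not answered:
        raise ValueError("no items answered")
    if any(r not in (0, 1, 2, 3, 4) for r in answered):
        raise ValueError("each response must be an integer from 0 to 4")
    # (sum of responses x 100) / (items answered x 4) simplifies to sum x 25 / answered
    return sum(answered) * 25 / len(answered)


# A respondent answering 1 on all 12 items scores 25.0.
example_score = osdi_total_score([1] * 12)
```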

Statistical Analysis

The sample size for this crossover comparison of means between the groups was derived from the equation (1-ρ)/2, where ρ is an estimate of the expected correlation between the two modes of administration[26]. All statistical analyses for this study were conducted using SPSS (IBM, version 25). Questionnaires with missing values (items not filled) were not included in the final analysis. Descriptive sociodemographic characteristics of the patients were determined by analyzing the frequency distribution of the overall data. Reliability, internal consistency, discrepancy of responses, and the rate of consistency between paper‐based and web‐based responses were assessed. Reliability was calculated for the 12 individual items, the 3 subscales (ocular symptoms, vision‐related, and environmental triggers), and the total C‐OSDI score under the OSDI guidelines[7]. The Shapiro‐Wilk test indicated that the paired samples were not normally distributed. Due to the ordinal nature of the data, the Wilcoxon test was utilized to detect possible statistically significant differences in the test of parallel forms of reliability for the 12 items, 3 subscales, and the total C‐OSDI score. The mean values of the paper‐based and web‐based measures were calculated, and the Spearman rank correlation coefficient (Spearman ρ) for each item, subscale, and total score was used to assess consistency. To assess test‐retest reliability, the intraclass correlation coefficient (ICC; two‐way random‐effects model) was used. In this study, P<0.05 (2‐tailed) was considered statistically significant (alpha=0.05). The psychometric properties of the C‐OSDI questionnaire were analyzed using Rasch analysis. For further information and background on Rasch analysis in ophthalmic research, the work of McNeely et al[38] is recommended.
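To make the ICC concrete, the following sketch computes a two‐way random‐effects ICC from paired scores using the usual ANOVA mean squares. It assumes the absolute‐agreement, single‐measurement form, often written ICC(2,1); the text above states only that a two‐way random‐effects model was used, so this choice, like the function name `icc_2_1`, is an assumption. The study itself used SPSS rather than code of this kind.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one per participant; each row holds that
    participant's score under each mode, e.g. [paper, web].
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of administration modes
    if n < 2 or k < 2:
        raise ValueError("need at least 2 participants and 2 modes")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants (rows), modes (columns), and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Under this form, a systematic shift between modes (for example, every web score two points higher than its paper counterpart) lowers the coefficient, whereas a consistency‐type ICC would ignore such a shift.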

RESULTS

The final analysis included 254 participants who were classified as having DED under the criteria put forth by the Dry Eye Workshop (DEWS)[32]. Initially, 300 patients were assessed for eligibility; however, 46 participants were excluded (Figure 1). The remainder, in groups A (n=127) and B (n=127), completed their questionnaires consecutively. Because there were no significant differences in response behavior, sociodemographic status, or therapy setting between the participants in the two groups, the groups were pooled in the final analysis. Table 1 shows the sociodemographic and clinical characteristics of the study group. The study consisted of 129 (51%) male and 125 (49%) female outpatients with a mean age of 27.90±9.06y. The mean BCVA for both eyes was -0.10±0.01 logMAR, and the mean IOP for both eyes was 14.09±1.40 mm Hg. A total of 258 paper and web‐based questionnaires were available, of which 254 had all 12 items completed, and therefore only 254 were selected for the final data analysis. Additionally, in group A, four patients failed to complete both versions of the questionnaire (Figure 1). Parallel test‐retest reliability of the paper‐based and web‐based scale scores was assessed for each item, subscale, and total score (Table 2). Item 11 was found to have the lowest level of agreement (Spearman ρ=0.806, ICC=0.824). In the present study, the standard deviation (SD) of the total C‐OSDI score was 12.78 for the paper‐based version and 12.43 for the web‐based version. To assess the effects of the type of questionnaire (web‐based or paper‐based) and the sequence of administration, random‐effects 2‐way ANOVA was used. Since the order of administration was balanced (50%; n=127), no interaction was found between the type and administration order of the questionnaires. Similarly, there was no effect of the type of questionnaire or administration order. 
As shown in Table 2, reliability indexes were within the acceptable range, with Pearson correlations greatest for item 1 (0.965) and intraclass correlations ranging from 0.824 (item 11) to 0.989 (total C‐OSDI score). Mean scores were significantly different for item 5 and subscale 1 according to Wilcoxon signed rank tests for paired samples. However, the total C‐OSDI score showed no significant difference. Table 2 also displays the Spearman rho correlation values for all individual items, subscales, and the total score. All 12 items, the subscales, and the total score demonstrated a comparable correlation (>0.8). The distributions of web‐based and paper‐based total C‐OSDI scores (0‐100 points, where higher values reflect a worse state) are illustrated as box plots (Figure 3). The marginally higher mean total web‐based C‐OSDI score (29.87 points) vs the paper‐based score (29.63 points) can be ascribed to the few outliers depicted in the boxplot, and the difference was not found to be statistically significant (P=0.09; Table 3). The whisker of the web‐based C‐OSDI boxplot interquartile range was within that of the paper‐based version. The Bland‐Altman chart (Figure 4) illustrates that the individual total C‐OSDI scores of the two versions of the questionnaire are mostly close to one another. However, 13 out of 254 participants had total C‐OSDI scores beyond the SD on the web‐based version. Figure 5 illustrates a positive correlation between the total C‐OSDI scores of the two questionnaires. The Wilcoxon signed rank test was used to assess parallel reliability in single items, subscales, and the total score of the C‐OSDI (Table 3). No systematic location difference was observed for continuous variables except for item 5 (poor vision) and subscale 1. However, most of the items received the same response (ties) in both versions of the questionnaire. These findings suggest high parallel reliability. 
A moderately statistically significant difference could only be identified in subscale 1 (ocular symptoms). Additionally, the IQRs of item 5 for the paper‐based and web‐based questionnaires were also different (0‐2 and 1‐2, respectively). Although the web‐based mean total score was slightly higher, by 0.24 points, the difference from the paper‐based version was not statistically significant. The most commonly used metrics in Rasch analysis for assessing the randomness of the responses to items are the mean‐square fit statistics (Outfit MNSQ and Infit MNSQ). The values of both Outfit MNSQ and Infit MNSQ are expected to be around 1, and any value far from 1 suggests either a low or a high degree of randomness in the responses to the items, which could jeopardize the quality of the fitted model. Low values of Outfit MNSQ and Infit MNSQ indicate that the responses to the items are easily predictable, which could result in overfitting of the model. High values of Outfit MNSQ and Infit MNSQ indicate that the responses to the items are very unpredictable, which could result in misfitting of the model. The infit and outfit mean‐square statistics are below the 1.5 threshold and above the 0.5 threshold for all items in both the web‐based and paper‐based OSDI questionnaires. Therefore, all the items are relevant and capture the underlying latent trait (Table 4). On the other hand, the item characteristic curves for all items in both the web‐based (Figure 6) and paper‐based (Figure 7) OSDI questionnaires show that ratings 2 and 3 are redundant and that only three rating categories, namely 0, 1, and 4, are enough to capture the underlying latent trait.
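The two mean‐square fit statistics discussed above can be illustrated with a short sketch. It assumes that a Rasch model has already been fitted and has supplied, for each person, the model‐expected response to an item and the model variance of that response; the function names and input lists are hypothetical, and the fitting step itself is not shown.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for a single item.

    observed[i]: response of person i to the item
    expected[i]: model-expected response of person i (from a fitted Rasch model)
    variance[i]: model variance of that response (from a fitted Rasch model)
    """
    n = len(observed)
    sq_resid = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(sq_resid, variance)) / n
    # Infit: squared residuals weighted by the model variance (information).
    infit = sum(sq_resid) / sum(variance)
    return infit, outfit


def within_thresholds(value, low=0.5, high=1.5):
    """True if a mean-square value lies strictly between the 0.5 and 1.5 thresholds."""
    return low < value < high
```

Outfit is more sensitive to unexpected responses from persons far from the item's difficulty, while infit down‐weights them, which is why the two statistics are usually reported together.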

Table 1 Demographic and clinical information on study participants

Figure 3 Boxplot distribution of web-based and paper-based C-OSDI total scores.

Figure 4 Bland-Altman analysis for clinical agreement between the web-based C-OSDI and paper-based C-OSDI final scores revealed a clinical difference (bias) of -0.25 units.
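As a rough illustration of how a Bland‐Altman bias like the one in the Figure 4 caption is obtained, the sketch below computes the mean paired difference and the conventional 95% limits of agreement (bias ± 1.96 SD of the differences). The paper‐minus‐web sign convention is an assumption chosen because it matches the negative bias reported; the function name and the example scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(paper, web):
    """Return (bias, lower_limit, upper_limit, n_outside) for paired scores.

    Differences are taken as paper minus web (assumed sign convention).
    """
    diffs = [p - w for p, w in zip(paper, web)]
    bias = mean(diffs)
    sd = stdev(diffs)          # sample SD of the paired differences
    lower = bias - 1.96 * sd   # lower 95% limit of agreement
    upper = bias + 1.96 * sd   # upper 95% limit of agreement
    n_outside = sum(1 for d in diffs if d < lower or d > upper)
    return bias, lower, upper, n_outside


# Hypothetical total scores for four participants.
bias, lower, upper, n_outside = bland_altman([10, 20, 30, 40], [12, 19, 33, 41])
```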

Figure 5 Correlation between web-based and paper-based OSDI total scores.

Table 2 Parallel test-retest reliability of single items, subscale and total score

Table 3 Rank test-retest reliability of single items, subscale and total score (Wilcoxon rank test)

Figure 6 Items characteristic curves of web-based C-OSDI items.

Results of the user preference survey were analyzed separately (Table 5). Of the 254 patient surveys that were completed, 9% (24/254) reported a preference for the paper‐based OSDI questionnaire, 72% (182/254) preferred the web‐based questionnaire, and 19% (48/254) preferred both versions of the C‐OSDI questionnaire. No significant associations were found between version preference and age, gender, education level, or level of DED severity. The median time to complete the paper‐based C‐OSDI was 109.5s, while that for the web‐based C‐OSDI was 61s.

DISCUSSION

This study assessed the test‐retest reliability of the self‐administered C‐OSDI questionnaire via a web‐based user interface. In accordance with international guidelines, the validation of a web‐based version must demonstrate measurement properties equivalent to those of its predecessor. This can be measured by correlation and intraclass correlation. In general, reliability was found to be good for the web‐based C‐OSDI questionnaire as measured with the ICC and the Wilcoxon signed rank test. Spearman rho correlation analysis demonstrated that the mean differences were close to zero, implying high reliability of the web‐based version of the C‐OSDI. Additionally, Rasch analysis revealed a high degree of responses and predictability of the items.

Table 4 Infit and outfit mean square values for web-based and paper-based questionnaires

Figure 7 Items characteristic curves of paper-based C-OSDI items.

Table 5 User preference and time analysis

The meta‐analysis by Patrick et al[39] stated that the average correlation between paper‐based and electronic administration was 0.90, without significant changes across studies relying on ICC or weighted kappa. Findings from our current study indicate that test‐retest reliability of the web‐based C‐OSDI questionnaire, as measured by the ICC, was good (>0.80) for the subscales and total score. The Rasch analysis results suggest that all 12 items contribute to capturing the OSDI latent trait. Hence, they are all useful and should be kept in the questionnaire. However, the item characteristic curves for the 12 items/questions, in both the web‐based and paper‐based questionnaires, showed that three rating categories are adequate instead of five. Additionally, good face validity was demonstrated, as 72% of the respondents preferred using the web‐based version over the paper‐based version when assessing their DED. Surprisingly, in this current study only 28% of the patients did not favor the web‐based version alone: 9% preferred the paper‐based C‐OSDI questionnaire and 19% preferred both versions. This could possibly be a biased indicator, as the primary objective was to validate the web‐based version of the C‐OSDI, or it could indicate that the transition to internet‐based assessment is well accepted in China due to wide‐scale mobile internet coverage in the last half decade. An important follow‐up of this study will be longitudinal monitoring of electronically administered, self‐reported C‐OSDI scores to assess whether web‐based assessment can be of value to clinicians and researchers in routine clinical DED care.

Screening tests such as the OSDI questionnaire enable clinicians to discover ocular surface alterations in a population early, allowing for prompt treatment, care, and monitoring. A screening test should be quick, easy to use, and inexpensive, and should be able to be administered by nonspecialized personnel. The OSDI is one such popular tool in clinical practice for DED. The present findings indicate that assessing DED using the web‐based version was easy and reliable and, importantly, that the web‐based version fulfils the criteria for mitigating the need for a paper‐based C‐OSDI. However, migration to the web‐based C‐OSDI could present some limitations, since variation in difficulty between some items was discovered. The mean scores of item 5 and subscale 1 were found to be significantly different in the Wilcoxon rank test, as individual participants' responses differed significantly with respect to symptom severity. Therefore, this could further influence the final score[40]. Finally, it is possible that the randomized crossover test‐retest study design can facilitate a carryover effect. While interpreting the findings of our study, it must be taken into consideration that the randomized crossover test‐retest design suffers from threats to internal validity, but the within‐patient design offers better statistical power and reduces the need for a large sample size. To compensate for the carryover effect, participants were given adequate time between test and retest. McNeely et al[38] suggest that Rasch analysis validated questionnaires such as the OSDI are centered on a single cohort and therefore in certain situations might not yield the most accurate assessment. However, further investigation is needed and will be carried out to validate these findings regarding the difficulty of items in the Chinese version of the OSDI. 
Although participants completed the web‐based questionnaire in a shorter time than the paper‐based questionnaire, it should be noted that patients might have memorized their responses to whichever questionnaire was administered first; however, this could not be quantitatively assessed. Follow‐up data will be analyzed in a forthcoming study to assess the responsiveness of the web‐based C‐OSDI.

To summarize, the web‐based C‐OSDI shows good reliability and could possibly reduce the use of the paper‐based C‐OSDI in assessing and monitoring individuals with DED. Additionally, good test‐retest reliability suggests that the web‐based C‐OSDI can be used for clinical studies that have a relatively moderate sample size.

ACKNOWLEDGEMENTS

Conflicts of Interest: Zhang XM, None; Yang LT, None; Zhang Q, None; Fan QX, None; Zhang C, None; You Y, None; Zhang CG, None; Lin TZ, None; Xu L, None; Moutari S, None; Moore JE, None; Pazo EE, None; He W, None.