Abbas Rezaianzadeh, Marjan Zare, Hamidreza Tabatabaee, Mohsen Ali-Akbarpour, Hossain Faramarzi, Mostafa Ebrahimi
1 Colorectal Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
2 Department of Epidemiology, School of Health, Shiraz University of Medical Sciences, Shiraz, Iran
3 Research Center for Health Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
4 Department of Community Medicine, Medical School, Shiraz University of Medical Sciences, Shiraz, Iran
5 Department of Communicable Diseases, Shiraz University of Medical Sciences, Shiraz, Iran
Keywords: Cutaneous leishmaniasis; Malaria; Prospective permutation scan statistics; Fars province; Iran
ABSTRACT Objective: To determine whether permutation scan statistics is more efficient at detecting prospective spatial-temporal outbreaks of cutaneous leishmaniasis (CL) or of malaria in Fars province, Iran, in 2016. Methods: Time-series data comprising 29 177 CL cases and 357 malaria cases recorded during 2010-2015 were used to predict CL and malaria cases in 2016. Space-time analysis was then used to check whether the predicted cases were uniformly distributed over time and space. To test this uniformity, permutation scan statistics was applied prospectively to detect statistically significant and non-significant outbreaks. Finally, the findings were compared to determine whether permutation scan statistics worked better for CL or for malaria in the area. Prospective permutation scan modeling was performed using SaTScan software. Results: A total of 5 359 CL and 23 malaria cases were predicted for 2016 using time-series models. The applied time-series models were well fitted according to the autocorrelation function, partial autocorrelation function sample/model, and residual analysis criteria (the significance level was set at 0.1). The results indicated two significant prospective spatial-temporal outbreaks for CL (P<0.05), including the Most Likely Cluster, and one non-significant outbreak for malaria (P>0.05) in the area. Conclusions: Both CL and malaria follow a space-time trend in the area, but prospective permutation scan modeling works better for detecting CL spatial-temporal outbreaks. This is to be expected, since clusters are defined as accumulations of cases at specified times and places. Although this method seems to work better for a high-frequency disease such as CL, it can also find non-significant outbreaks, which is clinically important for both high- and low-frequency infections, i.e., CL and malaria.
Malaria and leishmaniasis are among the most important infections caused by protozoan parasites. Both infections are vector-borne, and efforts to lower infection rates usually focus on controlling the sources and reservoirs. Malaria is highly prevalent in equatorial regions of the world[1-3]. The World Health Organization (WHO) reported 214 million malaria cases in 2015, among which 434 000 deaths were observed worldwide. Malaria ranks first in mortality among infectious diseases, and almost 90% of deaths occur in Africa. The related symptoms are recurrent fever, chills, convulsion, nausea and dizziness, abdominal pain, and, in severe cases, coma and death. There are four types of malaria parasite, namely Plasmodium falciparum, Plasmodium ovale, Plasmodium malariae, and Plasmodium vivax. Malaria is transmitted by the bite of female Anopheles mosquitoes. The incubation period depends on the type of malaria and ranges from 12 to 28 days from the bite to the onset of fever, the most common sign of infection. The risk factors for infection are race, sex, age, job, temperature, humidity, and lifestyle, including the use of insecticides and nets in open areas[4]. Located in the northern hemisphere, Iran is supposed to have malaria under control, but the incidence rate of the disease has been reported to be 100 to 1 000 cases per million inhabitants. Besides, 90% of malaria cases occur in southern parts of Iran, including Fars province. Plasmodium vivax is responsible for almost 80% of these cases[5].
The second most important vector-borne protozoan disease is leishmaniasis. Leishmaniasis includes cutaneous leishmaniasis (CL) and visceral leishmaniasis. Humans and animals are reservoirs of leishmaniasis, but it has been reported that humans are the only source for CL. CL is transmitted by sandflies and usually causes ulcers on exposed parts of the body, mostly on the face, hands, and feet. Sometimes the ulcer is so large that it deforms the infected organ. CL is not usually lethal but imposes a large burden on society. CL is prevalent in tropical regions, such as Saudi Arabia, Iran, Pakistan, and Afghanistan[2,6-12]. Currently, 350 million people in 88 countries are at risk of being infected by sandflies. In addition, 12 million new infections are reported annually, of which 1 500 000 cases are CL[13-15]. In Iran, 19 000 new cases occurred yearly, among which 3 000 cases belonged to Fars province. One-third of the 6 000 reported cases in Fars province occurred in urban areas and two-thirds in rural parts. The incidence rate of CL was 1 070-1 440 cases per 1 000 000 people in Fars province[10].
Malaria and leishmaniasis are potentially time- and space-related infections, and investigating their temporal and spatial features could help disease prediction and prevention, ultimately improving people's health status[16]. The classical way of studying a space- and time-related phenomenon is to categorize in the space dimension, do the investigation in the time dimension, and interpret the results within the space categories, or vice versa. However, newer methods that investigate spatial-temporal outcomes, such as malaria and CL, in both dimensions at once could yield more reliable findings about these diseases.
A key use of time-series methods is the estimation of data and their prediction into the future. The informative and analytical graphs derived from time-series analysis are another cornerstone for exploring time, seasonality, and residual trends in estimated and predicted data, and an efficient tool for assessing goodness of fit criteria. This methodology works well for evaluating time-dependent data.
The classical way of assessing time- and space-related data was to categorize on one dimension (e.g., time), do the analysis on the other dimension (e.g., space), and interpret the results within the time categories. This strategy can be time-consuming, and interpreting the results by category can sometimes be cumbersome and confusing.
Permutation scan statistics, a newer method for analyzing a phenomenon in the space and time dimensions simultaneously, is used to find both past and present outbreaks from time- and space-related data. The former is called retrospective and the latter prospective permutation scan modeling[16,17]. Considering the two dimensions at the same time can yield more interesting and informative results. In addition to evaluating the time and space features of the outcome, the method can include a variety of covariates in the model, and it reports the past and present clusters that occurred at an exact time and place, with associated P values to distinguish a true outbreak from an accidental one.
The current study aims to employ prospective permutation scan statistics to find future outbreaks of CL and malaria and to determine whether this model is more efficient in finding the future clusters of a low-prevalence (malaria) or a high-prevalence (CL) disease.
Time-series designs including 29 177 CL cases recorded during 2010-2015 and 357 malaria cases recorded during 2010-2015 were applied to predict CL and malaria cases in 25 different cities of Fars province in 2016. Since all eligible patients were entered into the study using census data, there was no need for sample size calculation.
In Iran, Fars, Esfahan, and Kerman are the three large provinces where CL and malaria are endemic. Fars is a southern province located near the Persian Gulf. It lies between 27°3' and 31°40' northern latitude and 50°36' and 55°35' eastern longitude. The latitude/longitude coordinates of all 25 cities of the province were obtained using the Google Earth online system (US Department of State Geographer 2016). Since Fars province is located in a geographically favorable part of the northern hemisphere, the four seasons of the year are quite distinct in the province, producing a variety of geographical and meteorological climates.
A total of 29 177 CL cases recorded during 2010-2015 and 357 malaria cases recorded during 2010-2015 were included in the study. All monthly recorded cases in every city were registered and maintained at the Contagious Disease Control Center of Shiraz University of Medical Sciences, Shiraz, Iran. CL positive cases were diagnosed through polymerase chain reaction, culture, or smear. All subjects whose symptoms of CL began between January 1, 2010 and December 31, 2015 were entered into the study. Furthermore, any fever was treated as suspected malaria until proved otherwise. To confirm positive malaria cases, accepted laboratory tests, such as microscopic examination and rapid diagnostic tests, were performed. The subjects whose symptoms of malaria began between May 1, 2010 and December 31, 2015 were entered into the study.
All ethical steps, including data collection and analysis as well as reporting the results, were in accordance with the standards approved by the Ethics Committee of the Ministry of Health, Treatment, and Medical Education under ethics number: IR.SUMS.REC.1396.S755. Indeed, the process of work was completely anonymous and the results were reported to the study participants.
Statistical, time-series, map drawing, and space-time cluster analyses were done using SPSS version 22, ITSM 2002, ArcGIS version 10, and SaTScan version 9.4.4, respectively.
Minimum, maximum, and relative frequency were used to describe the data. The Kolmogorov-Smirnov test was applied to assess normality and the Kruskal-Wallis test to compare mean ranks. In addition, a variety of time-series models were fitted to find the best predictive trend of cases over time. Finally, prospective permutation scan statistics was used to detect prospective spatial-temporal outbreaks.
Time-series methods were fitted to the time-dependent data to predict future cases. From a set of time-series models, i.e., autoregressive, moving average, and combinations of these two basic models such as ARMA, SARMA (using classical seasonal decomposition), and SARIMA (using integrated seasonal decomposition), the model that best fitted the data was chosen. The Akaike Information Criterion (AIC) is a goodness of fit criterion for time-series models that estimates the relative quality of statistical models for a given set of data. For a collection of models derived from time-series analysis of a data set, AIC estimates the quality of each model relative to the others. Thus, AIC provides a means for model selection. The Bayesian Information Criterion (BIC) is another criterion for model selection among a finite set of models. It is based, in part, on the likelihood function and is closely related to AIC. The model with the lowest AIC/BIC value is preferred. The Auto Correlation Function (ACF) and Partial Auto Correlation Function (PACF) model/sample plots evaluate the conformance of observed and fitted patterns in the data. In addition, residual analysis includes several tests, each examining one assumption that must be met for time-series modeling. Generally, a variety of time-series models are applied and the best ones are selected based on goodness of fit criteria, such as residual analysis, lower AIC/BIC statistics, and ACF/PACF conformity. A model with a lower AIC/BIC is preferable, showing less divergence of observed from fitted values. Finally, the more closely the ACF and PACF of the model match those of the sample, the better the model's fit[18].
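The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ITSM 2002 procedure used in the study: the errors are assumed Gaussian, the residual lists and parameter counts are made up, and the helper names (gaussian_log_likelihood, aic, bic, pick_best) are hypothetical.

```python
import math

def gaussian_log_likelihood(residuals):
    """Maximised Gaussian log-likelihood implied by a model's residuals."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of the error variance
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln(L), k = number of parameters."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L), n = sample size."""
    return k * math.log(n) - 2 * log_lik

def pick_best(candidates):
    """candidates maps a model name to (residuals, number_of_parameters).
    Returns the name with the lowest AIC and a dict of (AIC, BIC) per model."""
    scores = {}
    for name, (residuals, k) in candidates.items():
        ll = gaussian_log_likelihood(residuals)
        scores[name] = (aic(ll, k), bic(ll, k, len(residuals)))
    best = min(scores, key=lambda name: scores[name][0])
    return best, scores

# Illustrative residuals only (not study data): model "B" fits more tightly.
example = {
    "A": ([1.0, -1.0] * 5, 2),
    "B": ([0.5, -0.5] * 5, 3),
}
best_model, all_scores = pick_best(example)
```

In this toy comparison the extra parameter of model "B" is outweighed by its smaller residual variance, so it receives the lower AIC.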
Some health events vary across time periods and places. To evaluate a space- and time-dependent outcome, there are two options. The easier, classical way involves grouping in the time dimension, doing the analysis in the space dimension, and interpreting the results within the time categories, or vice versa. However, with the spatial-temporal method introduced by Kulldorff, the space-time behavior of a variable can be evaluated simultaneously[17]. Space-time permutation scan modeling tests whether cases follow a constant risk over space and time. In space-time permutation modeling, the permutation scan statistic is defined on cylindrical windows. The base of the window is circular. The center of the circle is the centroid of one of the cities, and its size varies from zero to 50% of the at-risk population. The height of the cylinder represents time and varies from zero to 50% of the study period. The window moves across the area and time. The window in which the observed number of cases exceeds the expected number with the largest likelihood ratio is reported as a potential outbreak. Scan statistics can detect retrospective as well as prospective clusters using past data. The statistical significance of each cluster is tested by Monte Carlo hypothesis testing, in which the likelihood ratio obtained from the observed data is compared with those derived from Monte Carlo simulation. This methodology is well-grounded and is able to cope with potential confounders. Using the hypergeometric distribution, the space-time permutation scan statistic estimates the expected number of cases under the assumption of a defined total population. This methodology has been described in detail by Kulldorff[16,17].
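As a rough illustration of how one cylinder is scored, the sketch below computes the expected count under space-time independence (cases in the zone over all months, times cases in the time window over all cities, divided by the total) and the logarithm of the Poisson-approximated generalized likelihood ratio described by Kulldorff for the space-time permutation model. The city names, monthly counts, and function names are hypothetical; SaTScan additionally searches over all candidate cylinders, which is omitted here.

```python
import math

def expected_in_cylinder(counts, zone, months):
    """counts[city] is a list of monthly case counts.
    Expected cases in the cylinder (zone x months) if space and time are independent."""
    total = sum(sum(row) for row in counts.values())
    zone_total = sum(sum(counts[city]) for city in zone)
    time_total = sum(counts[city][m] for city in counts for m in months)
    return zone_total * time_total / total

def observed_in_cylinder(counts, zone, months):
    """Observed cases inside the cylinder."""
    return sum(counts[city][m] for city in zone for m in months)

def log_glr(c, mu, total):
    """Log generalized likelihood ratio for a cylinder with c observed and
    mu expected cases out of `total`; 0 when there is no excess of cases."""
    if c <= mu:
        return 0.0
    value = c * math.log(c / mu)
    if c < total:
        value += (total - c) * math.log((total - c) / (total - mu))
    return value

# Made-up monthly counts for three hypothetical cities (not study data).
counts = {"A": [1, 1, 8], "B": [2, 2, 2], "C": [3, 3, 3]}
mu = expected_in_cylinder(counts, zone=["A"], months=[2])   # 10 * 13 / 25
c = observed_in_cylinder(counts, zone=["A"], months=[2])    # 8
score = log_glr(c, mu, total=25)
```

The cylinder with the largest score over all candidate zones and time windows would be the most likely cluster; its significance still has to be assessed by Monte Carlo replication.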
In this work, space-time permutation scan modeling was done using SaTScan software, version 9.4.4. To scan for outbreaks with high rates of infection, time precision was set to month. Space-time prospective analysis was used with the default circular spatial window shape. The number of replications was set at 9 999 for both CL and malaria. It is noteworthy that, with a smaller sample size, setting a larger number of replications results in higher study power. The upper limit for the base of the cylinder was set at 50% of the at-risk population. Moreover, the minimum and maximum temporal cluster lengths were set at 1 month and 50% of the study period, respectively. Finally, only clusters with no geographical overlap were reported for both CL and malaria outbreaks.
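The role of the replications can be illustrated with a minimal Python sketch. It assumes the usual Monte Carlo convention in which the P value is the rank of the observed statistic among the observed and simulated statistics, and it shows one permutation step in which onset months are shuffled across cases while cities stay fixed. The function names and the tiny case list are hypothetical, and this is not SaTScan's implementation.

```python
import random

def permute_times(cases, rng):
    """cases: list of (city, month) pairs. Shuffle the months across cases,
    keeping each case's city, so both marginal totals are preserved."""
    months = [month for _, month in cases]
    rng.shuffle(months)
    return [(city, month) for (city, _), month in zip(cases, months)]

def monte_carlo_p_value(observed_stat, simulated_stats):
    """Rank-based P value: (1 + number of simulated statistics >= observed)
    divided by (1 + number of replications)."""
    rank = 1 + sum(1 for s in simulated_stats if s >= observed_stat)
    return rank / (len(simulated_stats) + 1)

# With 9 999 replications the smallest attainable P value is 1/10 000.
smallest_p = monte_carlo_p_value(5.0, [1.0] * 9999)

# One permutation of a made-up case list (seeded for reproducibility).
rng = random.Random(0)
cases = [("A", 1), ("A", 2), ("B", 2), ("C", 3)]
shuffled = permute_times(cases, rng)
```

Because the smallest attainable P value is 1/(replications + 1), 9 999 replications allow P values down to 0.0001.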
Using time-series modeling on 29 177 CL cases recorded during 2010-2015 and 357 malaria cases recorded during 2010-2015 in 25 cities of Fars province, prospective permutation scan statistics analysis was conducted on 5 359 CL and 23 malaria predicted cases in 2016.
The maximum and minimum numbers of predicted cases were 1 964 and 8 for CL and 12 and 0 for malaria, respectively. The frequency of CL and malaria cases in the 25 cities of Fars province, Iran is shown in Table 1.
The distributions of CL and malaria cases were right-skewed. The Kolmogorov-Smirnov normality test revealed non-normal distributions (P<0.05). The Kruskal-Wallis test showed no significant difference in the mean ranks of CL and malaria cases among the cities (P>0.05).
Table 1. CL and malaria cases with percentage of total cases based on 5 different climates in Fars province, Iran, 2016.
3.2.1. Time-series results predicting CL cases in 2016
As a rule of thumb, a time series supports prediction over roughly one-fifth of its observed length. In the present study, the monthly recorded cases of CL and malaria in the 25 cities of Fars province were predicted for the 12 months of 2016 using the CL cases recorded over 72 months and the malaria cases recorded over 68 months.
Using the 29 177 CL cases recorded during 2010-2015, an auto-regressive (1) time-series model with a seasonality of 12 and a quadratic trend under classical transformation was applied to predict CL cases in 2016. The number of cases during 2010-2015, together with the cases predicted for 2016 by the same model, is depicted in Figure 1.
The ACF and PACF of the sample and the model are shown in Figure 2. The X-axis indicates the lag at which the autocorrelation is computed, and the Y-axis indicates the value of the correlation (between -1 and 1). A positive correlation shows that large values correspond with large values at the specified lag; a negative correlation shows that large values correspond with small values at the specified lag.
Accordingly, the model was judged to be good because the sample and model correlations largely overlapped relative to the confidence bands.
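For readers who want to reproduce the kind of plot described above, the following sketch computes a sample ACF and the approximate 95% white-noise bands (plus or minus 1.96 divided by the square root of the series length). It is a minimal illustration with hypothetical function names and a made-up alternating series; ITSM's own routines may differ in detail, and a constant series would need special handling because its variance is zero.

```python
import math

def sample_acf(series, max_lag):
    """Sample autocorrelations at lags 0..max_lag (lag 0 is always 1)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares; assumed non-zero
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

def white_noise_band(n):
    """Half-width of the approximate 95% band for white-noise correlations."""
    return 1.96 / math.sqrt(n)

# Made-up alternating series: strong negative lag-1, positive lag-2 correlation.
acf_values = sample_acf([1, -1, 1, -1, 1, -1, 1, -1], max_lag=2)
band = white_noise_band(8)
significant_lags = [k for k, r in enumerate(acf_values) if k > 0 and abs(r) > band]
```

Lags whose correlation falls outside the band are the ones a fitted model is expected to reproduce.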
Figure 1. Observed and predicted CL cases by month of onset in Fars province, Iran in 2016.
Figure 2. ACF and PACF sample/model auto-regressive (1) for CL in 2016. The green correlations are derived from the sample, the red correlations are taken from the fitted model, and the two dashed horizontal lines are confidence bands.
3.2.2. Time-series results predicting malaria cases in 2016
Using the 357 malaria cases recorded during 2010-2015, a moving-average (1) time-series model with a seasonality of 12 and a linear trend under classical transformation was applied to predict malaria cases in 2016. The number of cases during 2010-2015, together with the cases predicted for 2016 by the same model, is depicted in Figure 3. The ACF and PACF of the sample and the model are shown in Figure 4. Accordingly, the model was judged to be fairly good because the sample and model correlations largely overlapped relative to the confidence bands.
For better understanding of goodness of fit, residual analyses with AIC values for CL and malaria have been presented in Table 2.
Figure 3. Observed and predicted malaria cases by month of onset in Fars province, Iran in 2016.
Figure 4. ACF and PACF sample/model moving-average (1) for malaria in 2016.
Table 2. Residual analysis results with AIC scores for CL and malaria.
At the 0.1 significance level, almost all test results for CL and malaria indicated that the models were well fitted. The order of the minimum-AIC Yule-Walker model for the residuals needs to be zero to be compatible with white noise (mean=0, SD=1). This test assessed the mean of the white noise residuals. The lower the AIC/BIC scores, the better the model's goodness of fit. Among all tests, only the McLeod-Li P value needs to be less than 0.1, while the P values of all other tests should be greater than or equal to 0.1 for the models to be considered good based on residual analysis.
3.3.1. Most Likely Clusters (MLC)
A cluster is statistically significant when its test statistic is greater than the critical value for the chosen significance level. The standard Monte Carlo critical value for the 0.05 significance level was 3.69 for CL. In addition, the sequential Monte Carlo procedure terminated the calculation after 67 replications, which resulted in no critical values for malaria.
The MLC of CL occurred in December and contained almost 18% (980/5 359) of the total cases in 2016. This cluster was composed of Zarindasht, Darab, Lar, Jahrom, Neireez, Estahban, Fasa, and Khonj. Additionally, the MLC of malaria occurred from August to December 2016. It included Farashband, Firoozabad, and Shiraz, accounting for 39% (9/23) of the total cases in 2016. However, this cluster was not statistically significant.
3.3.2. Secondary Clusters (SC)
The statistically significant SC for CL occurred from July to December 2016 and included 48% (2 565/5 359) of the total cases, in Bavanat, Pasargad, Arsenjan, Eghlid, Marvdasht, and Kharame. However, another, non-significant SC occurred in Rostam and Sepidan from October to December, including almost 2% (122/5 359) of the total cases in 2016. There was no SC for malaria in 2016, since in most locations and time periods the observed cases were fewer than the expected cases. The results of the prospective permutation scan statistics from which the MLCs and SCs of CL and malaria were derived are presented in Table 3.
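The cluster shares quoted in this section follow directly from the case counts and the predicted totals (5 359 CL and 23 malaria cases). The short Python check below simply redoes that arithmetic; the dictionary keys are descriptive labels chosen here, not identifiers from SaTScan output.

```python
TOTAL_CL = 5359       # predicted CL cases in 2016
TOTAL_MALARIA = 23    # predicted malaria cases in 2016

def percent_share(cases, total):
    """Share of the total, in percent."""
    return 100.0 * cases / total

# (cases in cluster, total cases) as reported in the text.
clusters = {
    "CL most likely cluster": (980, TOTAL_CL),
    "CL significant secondary cluster": (2565, TOTAL_CL),
    "CL non-significant secondary cluster": (122, TOTAL_CL),
    "Malaria most likely cluster": (9, TOTAL_MALARIA),
}

shares = {name: round(percent_share(c, t)) for name, (c, t) in clusters.items()}
```

Rounded to whole percentages, the four shares come to 18, 48, 2, and 39, matching the figures given for the CL and malaria clusters.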
3.3.3. Sub-cluster analysis
Doing the same analysis as mentioned above within each cluster resulted in finding sub-clusters. In other words, to determine the city in a cluster that caused it to be significant, sub-cluster analysis was done.
The sub-cluster prospective permutation scan statistics showed one statistically significant cluster for CL, which occurred in Fasa from 1/10/2016 to 31/12/2016 (P<0.05); however, there was no statistically significant cluster for malaria (P>0.05).
Of all the cities in the detected clusters for CL, only Fasa, located in the MLC, was statistically significant. It contained 3% (148/5 359) of the total cases in 2016. For malaria, since there were no statistically significant clusters, no sub-clusters were found.
The results of the current research revealed both significant and non-significant prospective clusters of CL in Fars province in 2016. The statistically significant MLC contained 18% of the total cases in December 2016 and included Zarindasht, Darab, Lar, Jahrom, Neireez, Estahban, Fasa, and Khonj. Besides, the first SC of CL contained 48% of the total cases from July to December 2016 and included Bavanat, Pasargad, Arsenjan, Eghlid, Marvdasht, and Kharame. The second SC of CL, which was non-significant, contained 2% of the total cases from October to December 2016 and included Rostam and Sepidan. In a similar study conducted in Fars province, retrospective space-time cluster analysis for 2010-2015 revealed six significant clusters and sub-clusters in the area. It is noteworthy that the non-significant cluster of the current study was the MLC of the cited study, which contained 13% of the total cases from 2010 to 2015 and almost 97% of the total cases from 1/7/2010 to 30/11/2010. This was in agreement with the clinical significance of the cluster detected in 2016. Because no severe changes have been observed in the environmental and epidemiological factors associated with CL occurrence in Fars province since 2010, the detected non-significant cluster carries clinical and epidemiological importance for allocating funds, estimating the need for medical facilities, and preparing for future outbreaks in the area.
Table 3. Results of prospective permutation scan statistics for CL and malaria in Fars province, Iran from 1 January to 31 December 2016.
The above-mentioned study also reported a quadratic trend in CL occurrence from 2010 to 2015, which increased sharply until 2014 and then decreased until 2015. This is in concordance with the CL trend in 2016. In addition, a retrospective SC detected in that study contained Zarindasht, Darab, and Lar, which were also cities in the MLC of the present study. These results support the efficiency of permutation scan statistics in finding clusters of a highly prevalent endemic infection, i.e., CL, in the area. Moreover, sub-cluster analysis revealed common outbreaks in both studies. Accordingly, Fasa was the site of a past outbreak and a canonical site for CL from 1/9/2011 to 28/2/2013. In the current study, there was also a prospective outbreak of CL in December, January, and February[16].
In another study, ordinary least squares regression showed a relationship between CL and rainy days. This is in partial agreement with the present research, in which CL occurred from July to December, the rainy period in most tropical areas of the study[12,19]. Indeed, a previous study showed that the seasonal transmission of the disease tended towards summer and spring. This is not completely consistent with the occurrence of outbreaks in the present study, which happened partially in summer and mostly in fall[9].
To validate the results of the current work, the same permutation scan statistics was done retrospectively on real CL and malaria data in 2016. According to the findings, almost the same MLCs and SCs were detected covering the same cities and time frames. Experts confirmed that non-significant outbreaks had clinical importance and the detected areas were canonical sites for CL and malaria in the region.
The present study revealed one non-significant prospective outbreak and no sub-clusters for malaria. The only non-significant prospective outbreak occurred from 1/8/2016 to 31/12/2016. It included Farashband, Firoozabad, and Shiraz and contained 39% (9/23) of the total cases in 2016. A purely temporal cluster analysis showed an outbreak from June to September (during 2004 to 2006), overlapping the time span of the outbreak detected in the current study[20].
The transmission period of malaria in the area is from April to November, which is in agreement with the outbreak time frame found in the current study, i.e., from August to December. Moreover, some studies have shown a decreasing trend of malaria occurrence that is in contrast to the linear increasing trend detected in the present study[21,22].
Similar to other cluster detection methods, space-time analysis can detect more than one cluster in time and space. Yet, how to prioritize among the detected clusters is debatable, especially when statistical and clinical significance differ.
In space-time cluster analysis, Oliveira's F measure, which is derived from SaTScan and ranges from 0 to 1, can be calculated for each cluster. The closer Oliveira's F is to 1, the more likely a cluster is to be a true one. However, this measure is not computable with prospective permutation statistics, in which the hypergeometric distribution is applied to estimate the expected number of cases. In other words, Oliveira's F is computable only when the Poisson distribution is applied to estimate the expected number of cases in purely spatial clustering. To address this drawback, sub-cluster analysis was used in our study as a tool to recognize a true cluster. Another limitation of the current study was that the reporting system for infectious diseases is passive and, consequently, the predicted cases of CL and malaria might have been underestimated. This drawback did not affect the detection of clusters, but it had a severe impact on estimating the disease load, determining the necessary medical facilities, and estimating the disease prevalence.
In conclusion, prospective permutation scan statistics could detect both statistical and clinical outbreaks of CL and malaria, but it seemed to work more efficiently with CL as an endemic and highly prevalent disease in the area.
Conflict of interest statement
The authors declare that there is no conflict of interest.
Acknowledgements
This article was extracted from the PhD dissertation (proposal No. 12439) written by Marjan Zare and approved by the Research Vice-chancellor of Shiraz University of Medical Sciences. Hereby, the authors would like to thank Ms. A. Keivanshekouh at the Research Improvement Center of Shiraz University of Medical Sciences for improving the use of English in the manuscript.
Asian Pacific Journal of Tropical Biomedicine, 2018, Issue 10