Prediction of flyrock induced by mine blasting using a novel kernel-based extreme learning machine


Mehdi Jamei, Mahdi Hasanipanah, Masoud Karbasi, Iman Ahmadianfar, Somaye Taherei

a Faculty of Engineering, Shohadaye Hoveizeh Campus of Technology, Shahid Chamran University of Ahvaz, Dashte Azadegan, Iran

b Department of Mining Engineering, University of Kashan, Kashan, Iran

c Institute of Research and Development, Duy Tan University, Da Nang, 550000, Vietnam

d Department of Water Engineering, Faculty of Agriculture, University of Zanjan, Zanjan, Iran

e Department of Civil Engineering, Behbahan Khatam Alanbia University of Technology, Behbahan, Iran

f Department of Computer Sciences, Faculty of Mathematics and Computer Sciences, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Keywords: Blasting, Flyrock distance, Kernel extreme learning machine (KELM), Local weighted linear regression (LWLR), Response surface methodology (RSM)

ABSTRACT

Blasting is a common method of breaking rock in surface mines. Although fragmentation with a proper size is the main purpose, other undesirable effects such as flyrock are inevitable. This study is carried out to evaluate the capability of a novel kernel-based extreme learning machine algorithm, called the kernel extreme learning machine (KELM), by which the flyrock distance (FRD) is predicted. Furthermore, three other data-driven models, including local weighted linear regression (LWLR), response surface methodology (RSM) and boosted regression tree (BRT), are also developed to validate the main model. A database gathered from three quarry sites in Malaysia is employed to construct the proposed models using 73 sets of spacing, burden, stemming length and powder factor data as inputs and FRD as the target. Afterwards, the validity of the models is evaluated by comparing the corresponding values of several statistical metrics and validation tools. Finally, the results verify that the proposed KELM model, on account of the highest correlation coefficient (R) and lowest root mean square error (RMSE), is more computationally efficient, leading to a better predictive capability than the LWLR, RSM and BRT models for all datasets.

1. Introduction

Drilling and blasting is a common method of extracting minerals and ores in surface mining operations. Several desirable factors may be of interest in conducting blasting operations, such as improving the efficiency of the drilling operations and reducing the total costs. About 85% of the energy in blasting operations is dispersed in the earth with adverse effects (Hajihassani et al., 2015; Hasanipanah et al., 2015). One of these adverse problems is flyrock, which means the movement or throwing of rock fragments under excessive pressure due to an unexpected blast of explosives. Previous studies showed that the impermissible throwing of rocks has caused most of the damages and hazards reported in mines (Khandelwal and Singh, 2005; Khandelwal and Monjezi, 2013; Jahed Armaghani et al., 2016; Hasanipanah and Amnieh, 2020; Hasanipanah et al., 2020a). Rock mass resistance, powder factor (PF), and blast energy are key parameters influencing the flyrock distance (FRD). Based on the aforementioned studies, any inconsistency between the above factors (rock mass resistance, PF and blast energy) may result in the occurrence of the flyrock phenomenon. For example, a proper value of PF can produce proper blast energy and lead to a good fragmentation size, while any increase or decrease in the PF value can lead to improper blast energy and produce undesirable effects such as flyrock. Note that the PF is the amount of explosive (kg or g) needed to fragment 1 m3 (or cm3) of rock (Hustrulid, 1999). According to previous research, such as Ghasemi et al. (2012), Hasanipanah and Amnieh (2020) and Jahed Armaghani et al. (2020), there are three types of mechanisms causing flyrock in surface mines.

Cratering occurs when the ratio of stemming height (St) to blasthole diameter is very small (Ghasemi et al., 2012; Hasanipanah and Amnieh, 2020). In the cratering occurrence, the flyrock can be projected in any direction from a crater at the hole collar (Ghasemi et al., 2012). Using inadequate stemming materials leads to rifling. During rifling, blast gases may create a hole along the low-resistance path, resulting in the ejection of the collar rock as harmful flyrock (Ghasemi et al., 2012). The convergence of explosive charges in major and huge geological structures, or their proximity to brittle plates, leads to face bursting. In this situation, the burden conditions usually control FRD in front of the face. In Fig. 1, these three types of mechanisms are shown. We studied the previous works to categorize flyrock based on controllable and non-controllable factors. Proper design can manage the controllable factors, while natural non-controllable factors cannot be modified (Bajpayee et al., 2004). The controllable factors include inaccurate drilling, improper burden (B) and spacing (S), and inadequate St. Note that St is the blast-hole height minus the length of the explosive column, B is the distance between the individual rows of blast-holes, and S is the distance between blast-holes in any given row (Hustrulid, 1999). Besides, uncontrollable factors in the flyrock phenomenon include issues such as rock mass properties and unspecified geological conditions (Ghasemi et al., 2012; Faradonbeh et al., 2016). One of the reasons for the unspecified geological conditions is the uncertainty of geological testing techniques.

To predict FRD in surface mines, there are two categories of methods. The first category comprises the machine methods, in which the physical mechanisms are investigated (Roth, 1975; Little and Blair, 2010). In other words, the machine methods, such as the mechanistic Monte Carlo models (Little and Blair, 2010), are based on mechanistic modeling in which the physical mechanisms are clearly identified (Ghasemi et al., 2012). The second category comprises the empirical equations, which are not related to the machine methods. In the literature, some studies, such as McKenzie (2009) and Trivedi et al. (2014), have used empirical equations to predict FRD. The following are the advantages and drawbacks of these methods (Lundborg et al., 1975; Monjezi et al., 2012).

The most important advantage of the machine methods is their complete combination of comprehensive components such as rock fall, trajectory and air drag. Besides, the factors mentioned above are not site-specific. On the other hand, an obvious drawback of these methods is their need for several input factors, such as partial velocity, mass and throwing angle, to measure FRD. These factors are mostly site-specific, leading to some undesired problems (Ghasemi et al., 2012).

Fig. 1. Mechanisms causing flyrock in surface mines (Han et al., 2020).

Empirical methods are favorably useful in predicting FRD when dealing with simple methods or similar formulas. The site-specific nature of empirical techniques is their main drawback, because the statistical data are applicable only to the site areas in which they were measured. It is worth noting that other models usually combine the practical and mechanical components. Several researchers have recently employed methods based on artificial intelligence (AI) to solve mining and civil engineering problems (e.g. Alimohammadlou et al., 2014; Sezer et al., 2014; Sevgen et al., 2019; Zhou et al., 2020a, b, c, d, 2021a, b; Huang et al., 2020, 2021; Bardhan et al., 2021; Can et al., 2021), including flyrock caused by blasting. Flyrock has inevitable adverse effects; it is not possible to eliminate them completely, but damage can be prevented to some extent by minimizing them to a reasonable level. Estimating the various occurrences of the flyrock phenomenon is one of the most effective ways to control and prevent flyrock accidents. Designers and technicians intend to minimize flyrock to prepare a safe place for equipment and workers in the area surrounding the mine. For this purpose, different empirical models have been developed for the prediction of flyrock considering the effects of certain burdens and the hole diameter (Roth, 1975). Due to the effect of other factors on flyrock, empirical models generally cannot provide strong predictions (Ghasemi et al., 2012). As mentioned earlier, AI methods have recently been employed in the literature to predict FRD. Nguyen et al. (2021) offered a hybrid of support vector machine (SVM) and the whale optimization algorithm to predict FRD. Then, the performance of the hybrid model was compared with two other data-driven models, namely the artificial neural network (ANN) and the gradient boosting machine. Their results confirmed the effectiveness of the proposed model in accurately predicting FRD. In another study, Ye et al. (2021) investigated the application of random forest and genetic programming to predict FRD. Their results demonstrated that the performance of genetic programming is better than that of random forest, and that it could be a good tool in this field. Recently, a combination of support vector regression and the gray wolf optimization algorithm was applied to predicting FRD by Jahed Armaghani et al. (2020). They indicated the integrity and reliability of the combined model in predicting FRD.

The main motivation behind this research is to provide models with more accurate and appropriate performance than the previous models in estimating FRD. In our analysis, a novel kernel-based extreme learning machine (ELM) algorithm, called the kernel ELM (KELM), is proposed to predict FRD based on four features including S, B, St and PF. Moreover, three other data-driven models, including local weighted linear regression (LWLR), response surface methodology (RSM), and boosted regression tree (BRT), are also developed for the sake of comparison. The potential of these four models is examined for the first time for FRD estimation. Furthermore, some efficient validation tools, uncertainty analysis, and sensitivity analysis are conducted to precisely justify the capability of the data-driven models. The results demonstrate that the KELM, as the best predictive model, has considerable capability to accurately predict FRD and, owing to the simplicity of setting its hyperparameters, it prevents over-fitting.

The rest of this paper is organized as follows: Section 2 is devoted to the statistical description of the collected datasets and the pre-processing of the datasets used in the simulation. Section 3 explains the methodology of the data-driven models and all validation devices. Section 4 includes the application results and discussion, data outlier detection, uncertainty analysis, and sensitivity analysis. Finally, the conclusions are presented in Section 5.

2. Case studies and statistical analysis

To accomplish the objectives of this study, three granite quarry sites, namely the Ulu Tiram, Pengerang and Masai quarry sites, located in Malaysia, are investigated (Fig. 2). The rock quality designation (RQD) and uniaxial compressive strength (UCS) of the aforementioned sites vary in the ranges of 45%-80% and 30-110 MPa, respectively.

To fragment the rock masses, the drilling and blasting method was used at the sites. The drilling process was performed using holes with diameters of 75 mm, 115 mm and 150 mm for the Ulu Tiram, Pengerang and Masai sites, respectively. Furthermore, in the blasting process, ANFO was used as the main explosive material. More details regarding the studied cases can be found in Hasanipanah et al. (2017).

The blasting operations produce some undesirable effects such as flyrock, ground vibration and air-overpressure. These unfavorable effects should be considered at each blasting site. In the present study, a comprehensive AI-based investigation is performed to predict FRD at the aforementioned sites. Some blast design parameters, including S, B, St and PF, were measured at the sites, and the values of FRD for each blasting round were also carefully measured. After each blasting event, the relevant videos were reviewed to determine the locations of the maximum rock projections. Then, the maximum horizontal distance between the blasting point and the landed fragments was taken as the flyrock value. For this purpose, a hand-held global positioning system (GPS) was used.

In total, 73 sets of data, collected from 73 blasting events, including four input parameters (S, B, St and PF) and one output parameter (FRD), were prepared and used in the modeling processes. Note that many researchers have used datasets of this size to evaluate and predict FRD in their studies. For example, Tonnizam Mohamad et al. (2013), Trivedi et al. (2014), Armaghani et al. (2014), Faradonbeh et al. (2016), Zhou et al. (2020d), Fattahi and Hasanipanah (2020), and Hasanipanah and Amnieh (2020) used 39, 95, 44, 76, 65, 80 and 62 datasets, respectively, in their studies to predict FRD. Therefore, the 73 datasets used in the present study can be considered an acceptable number for evaluating FRD.

In training the models, 70% of all data (52 samples) were used, and the remaining datasets (21 samples) were allocated to the testing stage. The statistics of the data used in the simulation of FRD are tabulated in Table 1. The descriptive statistics, especially the skewness and kurtosis, confirmed the Gaussian behavior of the implemented variables except for St, which had a kurtosis factor equal to 3.254.

Fig. 3 displays the dispersion of the data points and box plots of the normalized values of the datasets for visual assessment of the distribution function. According to this figure, St, on account of its maximum skewness (1.682), is far from the normal distribution, whereas the rest of the variables, with lower skewness, are close to the normal distribution. In addition, the Anderson-Darling test (Scholz and Stephens, 1987) on all datasets ascertains that the FRD sample passes the normality test, which can be a positive point in modeling.

The degree of linear correlation between the input and target parameters has a significant effect on determining the type of data mining method. The linear dependence of all data sets is illustrated in a correlation matrix in Fig. 4.

It is clear that spacing (rp = -0.6415), on account of having the highest absolute Pearson correlation coefficient, is recognized as the most effective (inversely correlated) feature for FRD estimation. Regarding the low values of linear correlation between the rest of the input data and FRD, it is not possible to judge definitively on the importance of the other input variables, and it seems reasonable to perform a sensitivity analysis based on the data mining models (Jamei and Ahmadianfar, 2020a, b). Eventually, the predictive function of FRD based on the four mentioned inputs is defined as

FRD = f(S, B, St, PF)

It is noteworthy that all the variables are normalized to the range [0, 1] to reduce the modeling complexity and the computational cost, using the formulation Xnor = (X - Xmin)/(Xmax - Xmin), where Xnor denotes the normalized value, X is the original value of the variable, and Xmax and Xmin are the maximum and minimum values of the variable, respectively.
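For illustration only, the following Python sketch applies the min-max normalization above and the 70/30 training/testing split described earlier; the array names and the random placeholder data are hypothetical stand-ins for the measured S, B, St, PF and FRD values, not the actual database.

import numpy as np

def min_max_normalize(X):
    """Scale each column of X to [0, 1] using (X - Xmin) / (Xmax - Xmin)."""
    X = np.asarray(X, dtype=float)
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min), x_min, x_max

# Hypothetical example: 73 blasting records with columns [S, B, St, PF] and target FRD.
rng = np.random.default_rng(0)
X = rng.uniform(size=(73, 4))       # placeholder for the measured inputs
y = rng.uniform(100, 300, size=73)  # placeholder for the measured FRD (m)

X_norm, x_min, x_max = min_max_normalize(X)

# 70/30 split: 52 samples for training and 21 for testing, as in this study.
idx = rng.permutation(len(X_norm))
train_idx, test_idx = idx[:52], idx[52:]
X_train, X_test = X_norm[train_idx], X_norm[test_idx]
y_train, y_test = y[train_idx], y[test_idx]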

Fig. 2. The study area and granite quarry sites in this research.

Table 1 Statistical indicators associated with the variables used in FRD modeling.

Fig. 4. The correlation plot for the input and target variables.

3. Methodology

3.1. KELM

ELM was first brought up by Huang et al. (2006). The ELM calculates the output weights in just one step without any iteration. The main advantages of this method are its high learning speed and easy implementation. This method is a novel version of the single layer feed-forward network (SLFN) which, unlike the ANN, is random in nature. The ELM randomly generates the input weights and hidden layer biases, and then keeps them fixed throughout the training process. Recently, a novel version of the ELM, i.e. the KELM, has been introduced by Huang et al. (2011). The KELM uses the main advantages of the ELM and kernel functions (KFs) simultaneously. It was demonstrated that it provides a better prediction efficiency than the ELM with a lower computational cost (Chen et al., 2020a, b). The ELM output with M hidden nodes can be defined as

yk = Σ(j=1 to M) αj h(aj·xk + bj),  k = 1, 2, …, N    (1)

where xk = [xk1, xk2, …, xkn]T denotes the input vector with n nodes; yk = [yk1, yk2, …, ykn]T represents the output vector with n nodes; aj stands for the weight vector connecting the jth hidden node and the input nodes; αj is the weight vector linking the output nodes and the jth hidden node; bj symbolizes the bias of the jth hidden node; and h indicates the nonlinear activation function of the hidden layer.

Eq. (1) can be expressed in matrix form as

Hα = Z

where Z is the expected output, and H denotes the hidden layer output matrix, whose entry in the kth row and jth column is h(aj·xk + bj).

In this research, the radial basis function (RBF) is employed as the KF, which is described by

In order to find the optimal value of the weight vector α, the ELM needs an objective function (OF) that should be minimized. The OF can be expressed as minimizing ||Hα - Z||, whose least-squares solution is

α = H†Z

where H† denotes the Moore-Penrose inverse matrix (MPIM) of H. According to the orthogonal projection technique and the theory of ridge regression (Hoerl and Kennard, 1970), the regularization factor (F) was considered in the optimization process; thus, the solution α can be calculated as

α = H^T (I/F + H H^T)^-1 Z

The ELM achieves high performance, but its random nature is its major drawback. Hence, Huang et al. (2011) introduced the KELM to overcome this shortcoming. Fig. 5 illustrates the structure of the KELM. The KELM uses the kernel matrix (Kernel(x, xj)) instead of the activation function h(x). The kernel matrix based on an orthogonal projection procedure can be expressed as

Ω = H H^T,  with Ω(i, j) = h(xi)·h(xj) = Kernel(xi, xj)

Therefore, instead of the activation function, KELM needs a KF.

Fig. 5. KELM structure utilized in modeling the FRD values.

where ρ represents the kernel width of RBF.
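To make the KELM workflow concrete, the minimal Python sketch below implements a ridge-regularized kernel regression of the kind described above. The exact RBF normalization used here, exp(-||x - xj||^2/ρ), and the closed-form solution alpha = (Omega + I/F)^(-1) y are assumptions of this sketch rather than details quoted from the paper; the hyperparameter values F = 5000 and ρ = 30 are those reported later in Section 3.7.

import numpy as np

def rbf_kernel(A, B, rho):
    """Gaussian (RBF) kernel matrix between row sets A and B; rho is the kernel width."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / rho)

class KELM:
    """Minimal kernel extreme learning machine for regression (illustrative sketch)."""
    def __init__(self, F=5000.0, rho=30.0):
        self.F = F      # regularization coefficient
        self.rho = rho  # RBF kernel width

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        omega = rbf_kernel(self.X_train, self.X_train, self.rho)
        n = omega.shape[0]
        # Ridge-regularized kernel solution: alpha = (Omega + I/F)^-1 y
        self.alpha = np.linalg.solve(omega + np.eye(n) / self.F, np.asarray(y, float))
        return self

    def predict(self, X):
        k = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.rho)
        return k @ self.alpha

# Usage with the normalized arrays from the earlier sketch (hypothetical names):
# model = KELM(F=5000, rho=30).fit(X_train, y_train)
# y_pred = model.predict(X_test)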

3.2. LWLR

In the conventional linear regression methods, all training samples are considered with the same weights. This causes the problem of under-fitting the data by this type of regression, and this under-fit negatively impacts the accuracy of the model (Wang et al., 2016). To resolve this problem, the LWLR method, which is an advanced version of linear regression, has been introduced (Cleveland and Devlin, 1988). The LWLR method is one of the lazy learning methods. Assume that the estimated output yθ(x) is determined by regression using the following equation:

yθ(x) = θ^T x

where x is the training input data and θ stands for the coefficient vector for the training input. The LWLR method minimizes the following objective function to achieve the least error:

J(θ) = (Y - Xθ)^T W (Y - Xθ)

where yi is the observed data value, W is the diagonal matrix of the weights, X denotes the matrix of the input training data, and Y is the estimated output vector. In order to minimize the above objective function, we must differentiate the above equation with respect to θ and set the result to zero (Jamei and Ahmadianfar, 2020b):

Therefore, θ in the LWLR model can be obtained as follows (Wang et al., 2016; Jamei and Ahmadianfar, 2020b):

θ = (X^T W X)^-1 X^T W Y

The LWLR method uses KFs, similar to the SVM model, to give the close points greater weight than the other points. In the present study, a polynomial kernel of the following form is used:

K(x, xi) = (x·xi + C)^μ

where μ is the polynomial degree and C is a constant coefficient. In this study, the LWLR model is developed using the Weka 3.8.4 software.
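A minimal Python sketch of locally weighted linear regression is given below. Using the polynomial kernel value (x·xi + C)^μ directly as the sample weight and adding a small ridge term for numerical stability are assumptions of this sketch, not details taken from the Weka implementation.

import numpy as np

def poly_kernel(x, Xtr, C=1.275, mu=1.07):
    """Polynomial kernel (x . xi + C)^mu used here as the local weight (assumed form)."""
    return (Xtr @ x + C) ** mu

def lwlr_predict(x, Xtr, ytr, C=1.275, mu=1.07, ridge=1e-8):
    """Locally weighted linear regression evaluated at a single query point x."""
    Xb = np.hstack([np.ones((len(Xtr), 1)), Xtr])    # add intercept column
    xb = np.hstack([1.0, x])
    W = np.diag(poly_kernel(x, Xtr, C, mu))          # diagonal weight matrix
    A = Xb.T @ W @ Xb + ridge * np.eye(Xb.shape[1])  # small ridge for stability
    theta = np.linalg.solve(A, Xb.T @ W @ ytr)       # theta = (X'WX)^-1 X'Wy
    return xb @ theta

# Usage with the arrays from the earlier sketches (hypothetical names):
# y_hat = np.array([lwlr_predict(x, X_train, y_train) for x in X_test])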

3.3. RSM

In the current study, RSM is used to investigate the effect of the independent variables (S, B, St and PF) on the output (response) variable (FRD) and also to provide an optimal regression relationship for FRD prediction. The RSM is a statistical tool for modeling and analyzing the behavior of the process (input) variables on the response (output) variable (Bucher and Bourgund, 1990). Using RSM, the most information can be obtained with a minimum of experimental data. The 2nd-order RSM model includes linear and quadratic terms of the input variables as well as their interactions. The RSM model for the mentioned case can be expressed as follows (Hamid et al., 2016):

FRD = β0 + Σi βi xi + Σi βii xi^2 + Σi Σj>i βij xi xj
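As an illustration only, a second-order response surface of this kind can be fitted by ordinary least squares on expanded features; the scikit-learn pipeline below is a sketch under that assumption and is not necessarily the tool used in the original study.

import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Second-order response surface: linear, quadratic and interaction terms of S, B, St, PF.
rsm = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                    LinearRegression())
# rsm.fit(X_train, y_train)                 # hypothetical arrays from the earlier sketch
# y_pred = rsm.predict(X_test)
# The fitted coefficients play the role of the beta terms in the quadratic RSM equation:
# betas = rsm.named_steps["linearregression"].coef_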

Fig. 6. Architecture of BRT approach.

3.4. BRT

The BRT model is a combination of regression trees and the boosting technique (Friedman, 2001; Rätsch et al., 2001). This model is one of several techniques that improve the performance of a single model by using a combination of multiple single models (Aertsen et al., 2010). BRT uses a combination of two algorithms: (i) the classification and regression tree (CART) model and (ii) the construction and combination of a set of models by the boosting technique (Naghibi and Dashtpagerdi, 2017). Boosting is a way to increase a model's accuracy, and it works on the basis of building, combining and averaging a large number of models (Aertsen et al., 2010). This process builds better and more accurate models compared to a single model. BRT overcomes the largest weakness of a single decision tree, which is a relatively poor fit. In BRT, only the first tree is created from the entire training data; subsequent trees are grown on the data remaining from the previous tree (Elith et al., 2008). Trees are not built on all data and only use some of the data. At each stage, each data set is categorized, and this classification is used as a weight to fit the next tree. Boosting operations are performed to improve the predictive power of the regression tree. This operation is similar to the model averaging process, in which the combined results of several models are used, except that the boosting operation is a step-by-step process, meaning that in each iteration step, the models are fitted to part of a training dataset (Elith et al., 2008; Naghibi and Dashtpagerdi, 2017). Therefore, two important parameters, i.e. the shrinkage parameter and the learning rate parameter, are proposed in the model. The shrinkage parameter specifies the percentage of training data used in each iteration and is determined by the user. The learning rate parameter indicates the contribution of each tree to the modeling process (Elith et al., 2008). This method has several advantages, including the fact that it can analyze large volumes of data at high speed, is less sensitive to over-fitting, does not require data distribution assumptions, and is also able to determine the most important factors in the modeling process (Westreich et al., 2010). The architecture of the BRT approach is depicted in Fig. 6.
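For concreteness, the sketch below builds a boosted regression tree ensemble with scikit-learn's GradientBoostingRegressor; the study itself used Matlab's LSBoost (see Section 3.7), so this is only an analogous setup, and every parameter value except the learning rate reported later (0.66) is an assumption.

from sklearn.ensemble import GradientBoostingRegressor

# Boosted regression trees: each new shallow tree is fitted to the residuals of the
# current ensemble. learning_rate shrinks the contribution of every added tree, and
# subsample controls the fraction of training data drawn at each boosting step.
brt = GradientBoostingRegressor(
    n_estimators=200,    # number of trees (assumed value)
    learning_rate=0.66,  # learning rate reported as optimal in this study
    max_depth=3,         # tree complexity (assumed value)
    subsample=0.8,       # stochastic boosting fraction (assumed value)
    random_state=0,
)
# brt.fit(X_train, y_train)      # hypothetical arrays from the earlier sketch
# y_pred = brt.predict(X_test)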

3.5. Theory of leverage approach

Checking the validity of the model and detecting outlier data are among the most essential parts of the model development process (Rousseeuw and Leroy, 2005). Among the plenty of numerical and graphical methods of outlier detection, the leverage approach is a well-known method in the field of model applicability range and outlier data detection (Shateri et al., 2015). In this method, the residual is defined as the difference between the observed and predicted data, and the following equation is used to construct the H matrix (Rousseeuw and Leroy, 2005):

H = X (X^T X)^-1 X^T

where the input matrix X has n rows (number of samples) and m columns (number of variables used in modeling). The diagonal elements of the H matrix are known as the hat indices. In the process, the hat index is plotted against the standardized residual in a diagram called the Williams plot. The warning hat index (H*) value is calculated from the relation H* = 3(m + 1)/n. In the Williams plot, the placement of the majority of the data within the range of -3 to +3 standardized residuals and below H* indicates the statistical validity of the model and defines its applicability domain.
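The hat indices and standardized residuals needed for a Williams plot can be computed as in the short Python sketch below; standardizing the residuals by their overall sample standard deviation is an assumption of this sketch.

import numpy as np

def williams_diagnostics(X, y_obs, y_pred):
    """Hat indices, standardized residuals and warning leverage for a Williams plot."""
    X = np.asarray(X, dtype=float)
    H = X @ np.linalg.pinv(X.T @ X) @ X.T  # hat matrix H = X (X'X)^-1 X'
    hat = np.diag(H)                       # leverage (hat index) of each sample
    resid = np.asarray(y_obs, float) - np.asarray(y_pred, float)
    std_resid = resid / resid.std(ddof=1)  # standardized residuals
    n, m = X.shape
    h_star = 3 * (m + 1) / n               # warning hat index H* = 3(m + 1)/n
    return hat, std_resid, h_star

# Samples with hat > h_star or |std_resid| > 3 fall outside the applicability domain.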

3.6. Statistical criteria for evaluation of models

For the goodness-of-fit assessment of the provided AI models, five efficient performance metrics were taken into account, i.e. the correlation coefficient (R), root mean square error (RMSE), mean absolute percentage error (MAPE), and Theil's inequality coefficients comprising the prediction accuracy, or Theil 1 (U1), and the prediction quality, or Theil 2 (U2) (Zhang et al., 2019, 2020a, b, c, d, 2021; Wang et al., 2020; Hasanipanah et al., 2020b; Shahrour and Zhang, 2021; Zheng et al., 2021; Zhou et al., 2021c). According to the Lewis categorization for the MAPE indicator, predictive models with MAPE less than 10% yield "excellent" results, those with 10% ≤ MAPE ≤ 20% can be evaluated as having "good" performance, those with 20% ≤ MAPE ≤ 50% are considered "acceptable", and those with MAPE > 50% lead to "inaccurate" outcomes.

The correlation coefficient (R), RMSE, MAPE, prediction accuracy (U1) and prediction quality (U2) are respectively expressed as

The Kling-Gupta efficiency (KGE) multi-objective index (Gupta et al., 2009) was examined for reliably selecting the optimal predictive model. This decision criterion depends on the correlation coefficient, the variability error and the bias of the models as follows:

KGE = 1 - sqrt[(R - 1)^2 + (StDp/StDo - 1)^2 + (FRDp,mean/FRDo,mean - 1)^2]

where StDp and StDo are the standard deviations of the predicted and observed FRD values, respectively, and FRDp,mean and FRDo,mean are the corresponding mean values. It should be mentioned that a KGE value of unity indicates perfect consistency between the measured and predicted values.
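The sketch below computes these metrics for a pair of observed/predicted vectors. The specific expressions used for Theil's U1 and U2 are common conventions assumed here, since the paper's exact formulas are not reproduced in this excerpt.

import numpy as np

def evaluate(o, p):
    """R, RMSE, MAPE, Theil U1/U2 (assumed forms) and KGE for observed o and predicted p."""
    o, p = np.asarray(o, float), np.asarray(p, float)
    r = np.corrcoef(o, p)[0, 1]
    rmse = np.sqrt(np.mean((o - p) ** 2))
    mape = 100.0 * np.mean(np.abs((o - p) / o))
    u1 = rmse / (np.sqrt(np.mean(o ** 2)) + np.sqrt(np.mean(p ** 2)))
    u2 = np.sqrt(np.sum((p - o) ** 2)) / np.sqrt(np.sum(o ** 2))
    kge = 1.0 - np.sqrt((r - 1) ** 2
                        + (p.std(ddof=1) / o.std(ddof=1) - 1) ** 2
                        + (p.mean() / o.mean() - 1) ** 2)
    return {"R": r, "RMSE": rmse, "MAPE": mape, "U1": u1, "U2": u2, "KGE": kge}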

3.7. Model development

In this research, a comprehensive intelligent data analysis is conducted for the precise prediction of FRD using a novel KELM model. For this purpose, four input variables were adopted, comprising S, B, St and PF. Three robust data-driven models, including LWLR, RSM and BRT, were employed to validate the predictive performance of the KELM. The procedure for predicting FRD using the provided predictive models is presented in Fig. 7. The KELM model was implemented in the Matlab environment based on an RBF KF, three layers and four input neurons, and its two crucial settings, i.e. the regularization coefficient (F) and the width of the KF (ρ), were specified by a trial-and-error process. The best values of F and ρ were obtained as 5000 and 30, respectively. Besides, in order to develop the LWLR model, a polynomial kernel was employed. The optimum setting parameters, C = 1.275 and an exponent value of μ = 1.07, were acquired through a trial-and-error process and several model executions. The summary of the setting parameters for all provided models is listed in Table 2.

Moreover, in this research, an ensemble BRT based on the "fitrensemble" function (Matlab Software) was built using the least-squares boosting (LSBoost) aggregation algorithm and a 10-fold cross-validation method. The optimum regularization settings that gave the best results are reported in Table 2. Fig. 8 shows the learning rate variation in the BRT modeling versus R and RMSE in the range of 0.01-1. The optimal learning rate of 0.66 resulted in the best performance of the BRT model.
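A cross-validated sweep of the boosting learning rate, analogous to the 0.01-1 search described above, can be sketched in Python as follows; scikit-learn is used here instead of Matlab's fitrensemble, and the number of trees and tree depth are assumed values.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

def tune_learning_rate(X, y, rates=np.linspace(0.01, 1.0, 20)):
    """Sweep the boosting learning rate with 10-fold CV and return the best value."""
    scores = []
    for lr in rates:
        model = GradientBoostingRegressor(learning_rate=lr, n_estimators=200,
                                          max_depth=3, random_state=0)
        mse = -cross_val_score(model, X, y, cv=10,
                               scoring="neg_mean_squared_error").mean()
        scores.append(np.sqrt(mse))  # cross-validated RMSE for this learning rate
    best = rates[int(np.argmin(scores))]
    return best, dict(zip(rates.round(3), np.round(scores, 3)))

# best_lr, rmse_by_rate = tune_learning_rate(X_train, y_train)  # hypothetical arrays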

4. Results and discussion

4.1. Application results and analysis

As mentioned earlier, four data-driven models (i.e. RSM, KELM, LWLR and BRT) are used to estimate the FRD. According to Section 2, a total number of 73 data points have been collected for modeling. Fig. 9 illustrates the decision trees obtained by the BRT model. Also, the optimal input variables achieved by the RSM model using a quadratic approximation are listed in Table 3.

Table 4 presents the values of the six performance metrics obtained by the four models for the training, testing and all datasets. As reported in Table 4, the BRT model has the highest R (0.983) and a lower MAPE (4.5102) when compared with the KELM, LWLR and RSM models in the training phase. In terms of RMSE (12.3984), KGE (0.9514), U1 (0.0265) and U2 (0.0565), the KELM model outperforms the other models. In the testing phase, according to Table 4, the LWLR had the maximum value of R (0.976) and the minimum values of RMSE (13.325), U1 (0.0282) and U2 (0.0561), outperforming the KELM, RSM and BRT models, while the KELM model had the maximum value of KGE (0.9679) and the minimum value of MAPE (5.4947) compared to the other models. Tables 5 and 6 report the ranks of all models in the training and testing phases, respectively. Based on these tables, the KELM obtains the best rank (rank = 1.33), followed by the BRT (rank = 2), RSM (rank = 3) and LWLR (rank = 3.67), respectively, at the training stage. In addition, the best rank belongs to the LWLR model (rank = 1.33), followed by the KELM (rank = 1.67), RSM (rank = 3) and BRT (rank = 4), respectively, in the testing phase. Overall, by averaging the mean ranks obtained for the training and testing phases, it was found that the best rank belongs to the KELM (rank = 1.5), followed by the LWLR (rank = 2.5), and the RSM and BRT (rank = 3), respectively.

Fig. 7. Flowchart associated with the process of FRD estimation using four data-driven approaches.

Table 2 The characteristics and setting parameters of the proposed AI-based approaches.

Fig. 8. The learning rate variation versus the correlation coefficient in the testing phase.

The measured versus predicted FRD values at the training and testing stages are displayed in Fig. 10. According to this figure, the error of the estimated values obtained by all models, when compared with the measured values, is within the range of ±10%. This figure shows that the RSM, LWLR and KELM have a reasonable precision, and their estimated FRD values are closer to the 45° line.

Fig. 11 demonstrates the physical behaviors of all models to assess their capability in tackling the nonlinear trend of the FRD values for all datasets, where the intervals picked for accurate evaluation are zoomed beside each plot. Taking a deeper look into the calculated results indicates that the KELM, RSM and LWLR models could successfully capture the non-linearity of the FRD values. The distribution of FRD values achieved by the KELM, LWLR, RSM and BRT models for all datasets is displayed in the form of box plots in Fig. 12. As shown in this figure, the KELM has the best distribution of FRD values compared with the LWLR, RSM and BRT models. This means that the proposed KELM presents the superior efficiency among all data-driven models.

Fig. 13 exhibits the relative deviation (RD) distribution in scatter form for the KELM, RSM, LWLR and BRT models in the training and testing phases. From this figure, the KELM, with an RD range of -15% ≤ RD ≤ 13%, provides the least under-/over-estimation compared with the LWLR, RSM and BRT models, whose RD ranges are -20.5% ≤ RD ≤ 17%, -15.7% ≤ RD ≤ 13.6%, and -70% ≤ RD ≤ 15.3%, respectively. According to the obtained results, the KELM model yields a more compressed error distribution than the LWLR, RSM and BRT models.

Fig. 14 illustrates the pie plots of the RD values for all provided models. The pie plots indicate that about 62% of the FRD values estimated by the KELM have an RD of less than 5%, while only 56.76%, 49.32% and 54.79% of the FRD values estimated by the LWLR, RSM and BRT, respectively, have an RD of less than 5%. Also, 5.54% of all datasets have an RD of more than 10% for the KELM model, while 14.2%, 16% and 17.5% of the FRD values predicted by the LWLR, RSM and BRT models, respectively, have an RD of more than 10%. Consequently, the results reveal that the KELM model is recognized as a more reliable and robust predictive model in comparison with the LWLR, RSM and BRT models for estimating FRD.

It is noteworthy that the extracted relationship obtained from the RSM model is written as

4.2. Sensitivity analysis

Fig. 9. Achieved decision trees from the BRT approach to estimate the FRD.

Table 3 The optimum outcome of the RSM approach using a quadratic approximation. Note: SE is the standard error, and tstat is the t-statistic.

Table 4 Quantitative evaluation of AI-based approaches for predicting FRD.

Sensitivity analysis is one of the most crucial parts of data-driven modeling, as it gives reliable and useful information on the influence of each feature on the simulation of the phenomenon. There are different sensitivity analysis methods, including the cosine amplitude method (CAM) (Khandelwal et al., 2016), partial derivative sensitivity analysis (PDSA), the relevancy factor (Chen et al., 2014), and consecutively excluding input variables from the predictive model (Ahmadianfar et al., 2019). Sequentially excluding input variables (features) is one of the most reliable sensitivity analysis approaches and was employed in this study on account of its greater compatibility with AI models. Therefore, the KELM, as the main contribution of the current paper, was employed to carry out the sensitivity analysis. For this purpose, four combinations were compared against the benchmark KELM model (with all features), and the results are reported in Table 7. One can infer that the parameter S, whose exclusion caused the largest accuracy reduction of the original model (R = 0.3859, RMSE = 50.1327, U1 = 0.1079 and U2 = 0.2127), is recognized as the most influential feature. Moreover, the parameter B, based on metrics of R = 0.8127, RMSE = 31.8143, U1 = 0.0678 and U2 = 0.135, was identified as the second most effective parameter, followed by the PF (R = 0.8505, RMSE = 28.5504, U1 = 0.0609 and U2 = 0.1211). Fig. 15 depicts the selected metrics (R, KGE, U1 and U2) of the sensitivity analysis for the four combinations and the benchmark (original) model. The parameters S and St, with the lowest and highest conformity metrics (R and KGE), were diagnosed as the most significant and the least significant features in the modeling, respectively.
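The sequential-exclusion procedure can be sketched as follows: each input is dropped in turn, the model is refitted, and the degradation of R and RMSE on the test set is recorded. The model_factory argument and the reuse of the KELM class from the earlier sketch are illustrative assumptions.

import numpy as np

def exclude_one_feature_sensitivity(model_factory, X, y, X_test, y_test, names):
    """Refit the model with one input removed at a time and record the accuracy drop."""
    results = {}
    for j, name in enumerate(names):
        cols = [c for c in range(X.shape[1]) if c != j]
        m = model_factory().fit(X[:, cols], y)
        p = m.predict(X_test[:, cols])
        r = np.corrcoef(y_test, p)[0, 1]
        rmse = np.sqrt(np.mean((y_test - p) ** 2))
        results[name] = {"R": r, "RMSE": rmse}
    return results

# Example (hypothetical names, reusing the KELM sketch defined earlier):
# results = exclude_one_feature_sensitivity(lambda: KELM(F=5000, rho=30),
#                                           X_train, y_train, X_test, y_test,
#                                           names=["S", "B", "St", "PF"])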

Table 5 Rank of all models for training stage.

Table 6 Rank of all models for testing stage.

4.3. Uncertainty analysis

Fig. 10. Comparison between the measured and estimated FRD values for all predictive models in the training and testing phases: (a) KELM, (b) LWLR, (c) RSM and (d) BRT.

Fig. 11. Capability assessment of the provided data-driven models to capture the nonlinear relationship between the measured and predicted FRD values: (a) KELM, (b) LWLR, (c) RSM and (d) BRT.

Fig. 12. The distribution of FRD values for all datasets in RSM, LWLR, KELM and BRT.

Fig. 13. Variation of RD versus measured FRD for evaluating the accuracy of all predictive models in the training and testing phases: (a) KELM, (b) LWLR, (c) RSM and (d) BRT.

Fig. 14. Cumulative absolute RD for (a) KELM, (b) LWLR, (c) RSM and (d) BRT models.

Table 7 The sensitivity analysis results using the KELM model.

To quantitatively evaluate the uncertainty of the models in the FRD estimation, an uncertainty analysis was performed on the models. To calculate the uncertainty, first, the error value of each sample was calculated as ei = FRDo,i - FRDp,i, and then the mean error ē and the standard deviation of the estimation error Se were computed using the following equations:

ē = (1/n) Σ(i=1 to n) ei

Se = sqrt[ Σ(i=1 to n) (ei - ē)^2 / (n - 1) ]
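A minimal sketch of this uncertainty calculation is given below; the ±1.96·Se band used to form an approximate 95% uncertainty interval is an assumption of the sketch, not necessarily the exact band reported in Table 8.

import numpy as np

def uncertainty_band(frd_obs, frd_pred, z=1.96):
    """Mean error, standard deviation of the error and an approximate 95% band."""
    e = np.asarray(frd_obs, float) - np.asarray(frd_pred, float)  # e_i = FRDo,i - FRDp,i
    e_bar = e.mean()             # mean prediction error
    s_e = e.std(ddof=1)          # standard deviation of the error
    lower, upper = e_bar - z * s_e, e_bar + z * s_e
    return e_bar, s_e, (lower, upper)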

Fig. 15. Qualitative conformity metrics for the sensitivity analysis execution.

Table 8 Uncertainty estimates of FRD for different models.

4.4. Applicability domain assessment

In the present study, the leverage approach was used to determine the outlier data and the statistical validity of the models. Fig. 16 shows the Williams diagrams of the KELM, RSM, LWLR and BRT models. According to this figure, all data points lie within the range of -3 to +3 standardized residuals and below the warning hat index H*; therefore, no outlier is detected, and all four models are statistically valid within their applicability domains.

Fig. 16. Williams diagrams specifying the applicability domain for all data-driven paradigms.

5. Conclusions

Any blasting event in surface mines produces a sudden ejection of rock pieces, called flyrock, which may result in human injuries, fatalities and property damage. Therefore, a precise prediction of FRD is important, especially for safety issues. To this end, the present research aimed at introducing a novel KELM model to predict FRD. To make a fair comparison, three other data-driven models, i.e. LWLR, RSM and BRT, were also employed. Then, several statistical and graphical tools were considered to evaluate the predictive performance of the provided models. Based on the obtained results, the following conclusions can be summarized:

(1) Among all considered models, the KELM, in terms of R = 0.9706 and RMSE = 13.0461 m and rank scores of 1.33 and 1.67 for training and testing, respectively, exhibited a higher performance capacity to predict FRD over all datasets. However, the LWLR method, on account of the best rank score (1.33), showed the best performance (R = 0.976 and RMSE = 13.325 m) in the testing phase, and the KELM method was in second place with a slight difference. Accordingly, the KELM and LWLR models could be appropriate tools for the accurate prediction of FRD and have the capacity to generalize to other fields, while RSM and BRT stood in the next ranks, respectively.

(2) The error analysis demonstrated that the KELM model, due to the lowest RD band (-15% ≤ RD ≤ 13%), has the best performance in predicting FRD in comparison with the other approaches.

(3) Based on the sensitivity analysis, it was found that the parameters S and St, with the lowest and highest conformity metrics (R and KGE), were diagnosed as the most significant and the least significant parameters in the modeling, respectively.

(4) Based on the uncertainty analysis, the LWLR and KELM models have the least uncertainty, while the BRT model has the highest uncertainty in FRD prediction.

(5) Finally, to detect outlier data and the applicability domain of the models, the leverage approach was used. The Williams plots showed no outlier data in the models, and all four models were statistically valid and correct.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.