Geomechanical characterization of volcanic rocks using empirical systems and data mining techniques

2018-03-01

T. Miranda, L.R. Sousa, A.T. Gomes, J. Tinoco, C. Ferreira

a University of Minho, Guimarães, Portugal

b State Key Laboratory of Geomechanics and Deep Underground Engineering, China University of Mining and Technology, Beijing, China

c University of Porto, Porto, Portugal

1. Introduction

Preliminary calculation of the geomechanical parameters of rock masses can be carried out using empirical classification systems. These systems consider, among others, properties such as the strength of the rock, density, condition and orientation of discontinuities, groundwater conditions and the stress state. To evaluate these properties, a numerical rating is assigned to each and, subsequently, a final geomechanical index is obtained by applying a numerical expression associated with the system. The result allows the rock mass to be assigned to a class that is associated with important design information, in some cases including construction sequences, support needs and geomechanical parameters.

The most widely used systems are the rock mass rating (RMR), Q and geological strength index (GSI) (Bieniawski, 1989; Barton, 2000; Hoek et al., 2002). For the deformability evaluation, there are several analytical solutions relating the deformability modulus to geomechanical coefficients. These expressions should always be used within their application limits. New subsystems were also developed, such as the QTBM system (Barton, 2000). This subsystem, starting from the Q system, allows the prediction of several parameters related to excavation in TBM (tunnel boring machine) tunnels, and also constitutes an important development for the characterization of geomechanical parameters. Some countries developed their own empirical classification systems, such as the Chinese BQ classification system (Feng and Hudson, 2011) and the MR system, developed in Portugal and applied in Brazil (Rocha, 1976; Miranda, 2003). The Chinese BQ system was developed to aid in the stability evaluation of engineering structures in rock masses, providing rock mass characterization in design and construction.

The diversity and variability of rock masses imply the adoption of distinct methodologies for their characterization. Most of the time, characterization relies on the application of empirical systems, without ever dispensing with in situ and laboratory characterization tests. These empirical systems have undergone constant modifications arising from the knowledge and experience acquired over time. Innovative work was carried out using data mining (DM) processes in geotechnical engineering to uncover new and useful predictive models in databases of geotechnical data through knowledge discovery in databases (KDD) processes (Miranda and Sousa, 2012; Sousa et al., 2012; Miranda et al., 2013). These processes define the main procedures for transforming raw data into useful knowledge. Thus, refining existing classification systems and/or developing new empirical systems is the natural path to follow as experience and knowledge grow.

Several models were developed using different sets of input information, which allows their use under different levels of knowledge about the rock mass and can be helpful in the decision-making process. Some of the estimated models use less information than the original formulations while maintaining a high accuracy level. The relevance of the Q index for determining rock mass strength parameters was already known, since the relation tan⁻¹(Jr/Ja) is used to approximate the inter-block shear strength, where Jr and Ja are the parameters of the Q system related to the discontinuity characteristics. This assumption was later confirmed by Barton (2013), which means that the Q index can also be used to compute strength parameters of jointed rock masses treated as a continuum, corroborating the idea that this index is a very complete and useful parameter. The results of several expressions for calculating the deformability modulus were compared. A methodology to define a single final value for this parameter was established and validated against the results of reliable in situ tests. It was verified that some expressions may not be adequate for application to specific rock masses (Miranda and Sousa, 2012).

For volcanic rocks, a new empirical system was developed by adapting the RMR system and using a classification developed in São Paulo for the design of several tunnels in basaltic formations (Ojima, 1981; Menezes et al., 2005; Moura and Sousa, 2007). This followed the experience acquired in Brazil during the construction of a large number of large dams on volcanic foundations, in particular the Itaipú dam, at the time the largest hydroelectric undertaking worldwide, the Água Vermelha dam and the Jupiá, Ilha Solteira and Três Irmãos dams, among others (Pedro et al., 1975; Cabrera, 1988; Herrera, 2005; Silveira, 2009; Sadowski, 2012).

Later, another adaptation of the empirical system developed in Brazil was applied to volcanic road tunnels at Madeira Island (Menezes et al., 2005; Moura and Sousa, 2007). New tools of computer science, namely those based on artificial intelligence (AI), can play an important role in generating calculation tools that make it possible to incorporate that experience and knowledge (Russell and Norvig, 2003). The application of DM techniques to well-organized data gathered from large geotechnical works can provide the basis for the development of models that can be very useful in future projects. They also make it possible to validate the new empirical system for the geomechanical characterization of volcanic rocks. In heterogeneous volcanic rock formations, the geomechanical characterization becomes more complex because their texture depends on the manner of eruption (Stoffer, 2002). The deterministic definition of the parameters and zoning is also difficult. In this context, empirical systems such as RMR have been applied statistically using the Monte Carlo method, which makes it possible to simulate various scenarios. Such a study has already been performed for the Caniçal tunnel at Madeira Island in volcanic formations, which yielded a probabilistic description of the strength and deformability parameters (Costa et al., 2003).

The purpose of this paper is to analyze the geomechanical behavior of volcanic rock formations, characterize them, and develop an empirical system to classify them, as well as to apply DM techniques in order to develop new models. The new system has been designated the volcanic rock system (VRS). Geotechnical information was collected from samples from several Atlantic Ocean islands, including the Madeira, Azores and Canarias archipelagos, taking into consideration data from different sources (Costa et al., 2003; Cafofo and Sousa, 2007; Concha-Dimas and Vargas-Godinez, 2007; González de Vallejo et al., 2007; Moura and Sousa, 2007; Serrano and Olalla, 2007; Simic, 2007; Amaral et al., 2016). The various rock types are described with particular emphasis on the Madeira Island rock formations and their geomechanical properties.

In islands with volcanic rock masses, various critical infrastructure systems, such as transportation networks, hydroelectric projects and building foundations, have been successfully constructed despite slope stability problems. The nature of the topography and land use strongly affect the construction activities. In Madeira Island, a new road transport system has been constructed that enables easy access to many parts of the island. More than 40% of the length of this new transportation system is in tunnels (Moura and Sousa, 2007). Mention should also be made of the innovative underground pumped hydroelectric system (UPHS) of Socorridos, part of a multi-purpose scheme at Madeira Island. It is a critical undertaking involving underground upper and lower reservoirs built in very heterogeneous volcanic rocks, which posed complex challenges (Cafofo and Sousa, 2007).

In addition, DM techniques were applied to the VRS in order to predict volcanic rock mass classes (Miranda, 2007; Miranda et al., 2013; Tinoco et al., 2016). Different algorithms were developed for the different approaches used with the VRS and RMR classification systems. The application of these AI techniques also made it possible to validate the new empirical system (Miranda, 2007; Miranda et al., 2013).

In this paper, aspects related to the geomechanical characterization of volcanic rocks in Atlantic Ocean islands are discussed in the next section, with emphasis on the Madeira, Canarias and Azores islands, from which most of the information in the organized database was obtained. Section 3 analyzes the new empirical system VRS, while Section 4 presents the database, a summary of the geomechanical properties of the volcanic rock formations and the representative correlations obtained. Sections 5-7 cover the application of DM techniques to the database using different approaches, the evaluation of the models, and the analysis and interpretation of the results. The most relevant conclusions of the study are presented in the last section.

2. Geomechanical characterization of volcanic rocks in islands of Atlantic Ocean

Volcanic rocks typically exhibit large natural variability of lithological formations and heterogeneities, with large variations in geometry. Standard empirical systems are frequently used for the geomechanical characterization of volcanic rocks. However, it is necessary to adapt and refine existing systems in order to take into consideration the special features of volcanic rocks. In addition, the seismic activity and volcanism registered in the islands pose specific questions in the analysis of the characteristics of volcanic rocks (Stoffer, 2002; Gaspar et al., 2007).

In this paper, most of the geomechanical information was obtained from Madeira Island. The geology of Madeira Island comprises five volcanic complexes, as indicated in Fig. 1 (Cafofo and Sousa, 2007; Moura and Sousa, 2007). The rock types encountered in Madeira are basalts, breccias and tuffs. Table 1 shows the geomechanical characteristics of these formations, obtained by applying the RMR classification system to the Covão tunnel, which forms part of the Socorridos hydroelectric scheme (Cafofo and Sousa, 2007). The deformability modulus of the rock mass was estimated from the RMR values using the Serafim and Pereira (1983) formula, and the strength of the rock mass was obtained using the Hoek and Brown (1997) failure criterion.

In the Canarias Islands, geomechanical characteristics of volcanic rock formations were obtained from González de Vallejo et al. (2007), who performed an extensive study of Tenerife Island. This study included the description and characterization of the volcanic materials of this island, mainly from field data, bibliographical data collection (including the analysis of geotechnical and research projects) and expert judgment. As a result of this study, a database was prepared with more than 400 data entries, mainly from bibliographical collection. Some of the results are illustrated in Table 2, which shows a summary of geotechnical properties.

Fig. 1. Geological formations at Madeira Island. β1 - Mio-Pliocenic volcanic; β2 - Pos-Miocenic volcanic complex; β3 - Pos-Miocenic volcanic complex; β4 - superior basaltic complex; and β5 - modern basaltic lava.

Table 1. Geomechanical properties of volcanic rocks at Madeira Island (Cafofo and Sousa, 2007).

The Azores islands are located in the North Atlantic Ocean and are composed of volcanic formations. They occur at the intersection of three tectonic plates (Fig. 2), which may explain the seismic and volcanic activities in the islands (Gaspar et al., 2007). Results of geotechnical tests were obtained by the Regional Laboratory of Civil Engineering for the representative geological formations (Malheiro and Nunes, 2007). The general stratigraphy of the different geological formations was synthesized in five profiles covering the majority of occurrences, as illustrated in Fig. 3.

In volcanic regions like the Azores islands, volcanic cavities pose a serious geotechnical problem. It is usual to use deep boreholes or other non-destructive techniques for the characterization of these openings (Sousa and Oliveira, 2004; Jover Carmona et al., 2007; Serrano et al., 2007; Signorelli et al., 2007; Malheiro et al., 2015). Lava tubes (also called lava tunnels) can occur, as well as shaft caves. In the Azores, there are about 200 natural caves, the most significant being in the Pico and São Miguel islands (Gaspar et al., 2007). The Carvão cave is located in the western part of Ponta Delgada City, in São Miguel Island, and it presents a very complex situation. It has a total length of about 2500 m and is divided into three main separate sections. The last section is located under João do Rego Street, which gives rise to the name Cave of João do Rego Street. Another significant occurrence in the Azores archipelago is the Towers Cave in Pico Island, where stabilization of a lava tube is required. This cave is the largest lava tube known in the islands, with a total length of 5150 m and a height of 15 m (Costa et al., 2008).

Table 2. Geotechnical properties of volcanic rocks at Canarias Islands (adapted from González de Vallejo et al., 2007).

Fig. 2. Main tectonic structures in Azores area (Gaspar et al., 2007). MAR - Mid-Atlantic ridge; EAFZ - East Azores fracture zone; TR - Terceira rift; GF - Gloria fault.

3. Adaptation of an empirical system to volcanic rocks

An empirical system is developed for the characterization of volcanic rocks and is designated as VRS. The VRS is an adaptation of the RMR system and includes a classification developed in São Paulo for tunnels in basaltic formations (Ojima, 1981; Menezes et al., 2005; Moura and Sousa, 2007).

The new empirical system is based, like the RMR system, on the consideration of six geological-geotechnical parameters to which relative weights are attributed. The final VRS index value, which varies between 0 and 100, is obtained through the algebraic sum of these weights (Fig. 4). With this index, it is possible to obtain strength properties, deformability moduli, and a description of the rock mass quality, as well as recommendations for excavation, support needs and support loads, using correlations with other geomechanical indices.

The following geomechanical parameters were considered: P1 - UCS; P2 - rock weathering characteristics; P3 - intensity of jointing; P4 - discontinuity conditions; P5 - presence of water; P6 - disposition of blocks. Different weights are assigned to each parameter, as illustrated in Fig. 5.
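As a minimal sketch of this calculation scheme, the index can be computed as the sum of the six parameter weights. The function name `vrs_index` and the example weights below are hypothetical; actual weights must be read from Fig. 5.

```python
def vrs_index(weights):
    """Return the VRS index as the algebraic sum of the six parameter weights.

    `weights` maps the parameter names P1..P6 to the weights assigned to a
    rock mass (to be taken from Fig. 5 of the paper).
    """
    expected = {"P1", "P2", "P3", "P4", "P5", "P6"}
    if set(weights) != expected:
        raise ValueError("weights must be given for exactly P1..P6")
    total = sum(weights.values())
    # The paper states that the final index varies between 0 and 100.
    if not 0 <= total <= 100:
        raise ValueError("VRS index must lie between 0 and 100")
    return total


# Hypothetical weights, for illustration only (not taken from Fig. 5).
example = {"P1": 10, "P2": 12, "P3": 15, "P4": 14, "P5": 8, "P6": 6}
print(vrs_index(example))
```

The range check only enforces the 0 to 100 interval stated in the text; the mapping from index value to rock mass class is given in Fig. 5 and is not reproduced here.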

Fig. 4. Calculation scheme for the VRS index.

In relation to RMR, the properties considered for P1, P4 and P5 are identical, but their weights differ. The parameter for discontinuity orientation, P6, introduced by Bieniawski (1989) as an adjustment of the sum of the remaining five parameters, was difficult to assign a weight to, because it depends on groundwater conditions. Instead, it was replaced by another parameter related to the disposition of blocks. This parameter is considered in order to evaluate block stability. Four situations were considered: very favorable, favorable, acceptable and not acceptable block dispositions, which refer to the stability of the geotechnical structure. For P2, the VRS considers the rock weathering effect, which is not considered by the RMR system, while P3 is related to the joint intensity, combining the effects of parameters P2 (RQD) and P3 (discontinuity spacing) considered by the RMR system.

The meaning of the different parameters is given in Fig. 5. A rock mass is classified into one of six classes. A rock mass designated as class VI has a behavior conditioned by the deformability and strength characteristics of the rock, while a formation designated as class I behaves in accordance with the characteristics of the discontinuities. For rock masses in the other classes, behavior is determined by the combination of both types of characteristics.

4. Database with volcanic rocks and statistical results

The collected data were organized and structured in a database composed of 99 examples with 29 attributes, which are described in Table 3 (Cafofo and Sousa, 2007; Concha-Dimas and Vargas-Godinez, 2007; González de Vallejo et al., 2007; Ito et al., 2007; Moura and Sousa, 2007; Simic, 2007).

Fig. 3. Stratigraphic profiles at Azores Island (Malheiro and Nunes, 2007).

Fig. 5. Volcanic rock mass classification and weights.

The data were mainly obtained from Madeira Island (76%), with the rest from the Canarias Islands (18%) and Mexico (6%). The information covers different rock types, namely basalt (42), breccia (33), tuff (15) and pyroclasts (9). In the database, the deformability modulus of the rock mass (ERM) was derived from the Serafim and Pereira (1983) formula, assuming the restriction RMR < 80 (Miranda, 2003). GSI was only calculated for RMR > 23, according to the Hoek and Brown (1997) criterion. The values of cohesion and internal friction angle were obtained with the software RocData (Rocscience, 2015), considering a depth of 20 m for all cases.
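The two empirical relations used to populate the database can be sketched as follows. The paper does not write them out, so the forms below are assumptions based on how these relations are commonly quoted: ERM = 10^((RMR - 10)/40) in GPa for Serafim and Pereira (1983), and GSI = RMR - 5 for RMR > 23 for Hoek and Brown (1997), with RMR taken in its 1989 version. The function names are illustrative.

```python
def deformability_modulus_gpa(rmr):
    """Rock mass deformability modulus (GPa) from RMR.

    Assumed form of the Serafim and Pereira (1983) relation:
    E_RM = 10 ** ((RMR - 10) / 40), applied only for RMR < 80 as in the paper.
    """
    if rmr >= 80:
        raise ValueError("relation restricted to RMR < 80 in this study")
    return 10 ** ((rmr - 10) / 40)


def gsi_from_rmr(rmr):
    """GSI from RMR, assumed as GSI = RMR - 5 and defined only for RMR > 23.

    Returns None outside the range of validity, mirroring the database,
    where GSI was only calculated for RMR > 23.
    """
    if rmr <= 23:
        return None
    return rmr - 5


print(deformability_modulus_gpa(50))  # modulus for a mid-range RMR
print(gsi_from_rmr(60))
```

Returning `None` for RMR ≤ 23 keeps missing GSI values explicit instead of extrapolating the relation outside its stated range.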

In Table 4, the average values (avg.) and standard deviations (st. dev.) of the geomechanical properties of the volcanic rock formations are presented, taking into consideration the empirical coefficients VRS, RMR and GSI, as well as UCS, ERM, c and φ. Some of the relevant statistical results are also presented in this table for different rock types and parameters.

Table 3. Data considered in the database.

Some representative correlations were obtained between VRS coefficients and RMR values. Fig. 6 shows the linear relationship between VRS and RMR. Another representative correlation, in the form of an exponential expression, was obtained between ERM and VRS (Fig. 7).
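An exponential relation of the form ERM = a·exp(b·VRS) can be fitted by ordinary least squares on ln(ERM). The sketch below illustrates that fitting step only: it assumes this log-linear approach (the paper does not state how the curve in Fig. 7 was fitted), the function name `fit_exponential` is hypothetical, and the data points are synthetic, not values from the database.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by linear least squares on ln(y).

    Returns the pair (a, b). All y values must be strictly positive.
    """
    n = len(x)
    ln_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ln_y = sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ln_y) for xi, li in zip(x, ln_y))
    b = sxy / sxx                          # slope in log space
    a = math.exp(mean_ln_y - b * mean_x)   # intercept mapped back
    return a, b


# Synthetic points generated from a = 0.5 and b = 0.05 (illustration only).
vrs = [20, 35, 50, 65, 80]
erm = [0.5 * math.exp(0.05 * v) for v in vrs]
a, b = fit_exponential(vrs, erm)
print(round(a, 6), round(b, 6))
```

Fitting in log space weights relative rather than absolute errors, which matters when the modulus spans more than one order of magnitude.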

The correlations between ERM and VRS for each rock type are illustrated in Fig. 8. The correlation coefficient is very low for tuff and very similar among the other rock types.

5. Application of data mining techniques to the database

In this paper, we make use of the flexible learning capabilities of DM techniques to predict volcanic rock mass (VR) classes. In particular, four different approaches were followed:

(1) VR class prediction based on a decision tree algorithm using the attributes P1, P2, P3, P4, P5 and P6 from the VRS classification system.

(2) VR class prediction based on a decision tree algorithm using the attributes P1, P2, P3, P4, P5 and P6 from the RMR classification system.

(3) VRS value prediction based on three different DM algorithms, i.e. multiple regression (MR), artificial neural networks (ANNs) and support vector machines (SVMs), and considering four of the most relevant variables from the VRS classification system. Based on the VRS value, the corresponding class can then be calculated.

Fig. 6. Linear correlation between VRS and RMR coefficients.

Fig. 7. Deformability modulus of the rock mass versus VRS coefficient.

(4) RMR value prediction based on three different DM algorithms (the same as in (3) above), and considering four of the most relevant variables from the RMR classification system (P1, P2, P3 and P4). Based on the RMR value, the corresponding class is then calculated.

DM techniques are very powerful tools able to solve complex problems, and have been widely used in computer science and other fields over the last decade. Indeed, these tools have already been successfully applied to different knowledge domains (Javadi et al., 2012; Liao et al., 2012; Garg et al., 2014), including civil engineering (Miranda et al., 2011; Miranda and Sousa, 2012; Gomes Correia et al., 2013; Tinoco et al., 2014a, b, 2016; He et al., 2015). DM techniques have also been used in the study of rock masses (Martins and Miranda, 2012; Miranda et al., 2013, 2014). The many successful applications of DM in solving real complex problems are evidence of its potential and, at the same time, the reason these techniques were chosen for this study. Notable applications of DM in rock mechanics are the Venda Nova II and Bemposta II hydroelectric schemes, built recently in the North of Portugal, for which new models were built for the strength and deformability parameters of granite formations (Miranda and Sousa, 2012). New geomechanical models were also evaluated at the former Homestake gold mine at Lead, USA, where DM techniques were applied to the RMR, Q and GSI empirical systems and Bayesian networks (BNs) were learned and tested for the prediction of RMR values (Sousa et al., 2012). In addition, new rockburst indices were generated by applying DM techniques to a database of rockburst laboratory tests (He et al., 2015).

Table 4. Summary of geomechanical properties of volcanic rock formations.

Fig. 8. Deformability modulus of the rock mass versus VRS for each rock type.

A decision tree (DT) is a directed, acyclic flow chart that represents a set of rules distinguishing classes or values in a hierarchical form. These rules are extracted from the data using rule induction techniques, and appear in an “if-then” structure expressing a simple conditional logic. Source data are split into subsets based on the attribute test values, and the process is repeated recursively. Graphically, decision trees present a tree structure formed by three main components:

(1) The top node, or root, which represents all the data.

(2) Branches, which connect nodes. Each internal node represents a test on an attribute, while the branches denote the outcomes of the test.

(3) Leaves, which are the terminal nodes and represent classes or values.

After a tree is learned, it can be used to classify or calculate the value of a new object. There are two types of decision trees, i.e. classification and regression trees (CART) (Berry and Linoff, 2000). These two types of trees use the same structure. The only difference is the type of the target variable. Classification trees are used to predict the class to which data belong, while regression trees are used to estimate the value of a continuous variable based on the induced mathematical expressions.

The CART algorithm is one of the most popular algorithms for inducing decision trees and was used in this work. It splits the data using a predictor that can be employed several times at different levels. At each stage, the data are partitioned so that the cases in each of the two subsets created are more homogeneous than those in the parent set. It grows only binary trees (i.e. trees where only two branches can attach to a single root or node); thus, despite its high flexibility, it can sometimes be unreliable and computationally slow.
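A single binary partitioning step of this kind can be sketched as below. The sketch assumes the Gini impurity as the homogeneity measure, which is a common choice for classification trees but is not stated in the paper; the function names and the small data set are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure subset)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_split(values, labels):
    """Find the threshold t of the test `value <= t` that minimizes the
    size-weighted Gini impurity of the two resulting subsets.

    Returns (threshold, weighted_impurity); threshold is None if no split
    improves on the impurity of the unsplit data.
    """
    n = len(values)
    best = (None, gini(labels))
    candidates = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical weights of one parameter and the class of each case.
p4 = [5, 8, 10, 18, 20, 25]
cls = ["IV", "IV", "IV", "II", "II", "II"]
print(best_binary_split(p4, cls))
```

A full tree would apply this search over every attribute and recurse on the two subsets until a stopping rule is met; only the single-attribute step is shown here.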

The CART algorithm is capable of constructing trees that can be applied to regression or classification problems with good results. Nevertheless, the fully automated process may result in an over-structured, inefficient tree. Moreover, many of the branches may reflect noise or outliers in the training data. Tree pruning attempts to identify and remove such branches and simplify the tree, with the goal of improving accuracy on new data. The greatest benefit of the decision tree approach is that the trees are easy to understand and interpret. They use a “white box” model, i.e. the induced rules are clear and easy to explain as they use a simple conditional logic. The main drawback is that they get harder to manage as the complexity of the data increases, leading to a larger number of branches in the tree.

An ANN is a computational model based on the structure and functions of biological neural networks (Kenig et al., 2001). The information is processed through the interaction among several neurons. ANNs are considered nonlinear statistical data modeling tools in which complex relationships between inputs and outputs are modeled or patterns are found. This technique is capable of modeling complex nonlinear mappings and is robust when exploring noisy data. In this paper, the multilayer perceptron containing only feed-forward connections, with one hidden layer of H processing units, was adopted. Because the network's performance is sensitive to H (a trade-off between fitting accuracy and generalization capability), we adopt a grid search over {0, 2, 4, 6, 8} during the learning phase to find the best H value. This grid search only considered the training data, dividing it into fitting (70%) and validation (30%) data, where the validation error was used to select the best H. After selecting the best H value, the ANN is retrained with the whole training data. The activation function of the hidden nodes was set to the popular logistic function 1/(1 + e^(-x)).
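The selection of H described above can be sketched independently of any particular neural network library. In the sketch below, `fit` and `error` are hypothetical callables supplied by the user (the paper relies on the rminer package for this step, and its API is not reproduced here); only the 70/30 hold-out logic and the grid {0, 2, 4, 6, 8} come from the text.

```python
import random


def select_hidden_units(train, fit, error, grid=(0, 2, 4, 6, 8), seed=0):
    """Choose the number of hidden units H by a hold-out grid search.

    train : sequence of training examples
    fit   : callable (examples, h) -> model        (hypothetical interface)
    error : callable (model, examples) -> float    (hypothetical interface)

    The training data are shuffled and split into fitting (70%) and
    validation (30%) subsets; the H with the lowest validation error is
    kept and the model is refitted on the whole training set.
    """
    rows = list(train)
    random.Random(seed).shuffle(rows)
    cut = int(round(0.7 * len(rows)))
    fitting, validation = rows[:cut], rows[cut:]
    scores = {h: error(fit(fitting, h), validation) for h in grid}
    best_h = min(scores, key=scores.get)
    return best_h, fit(rows, best_h)


# Toy stand-ins: the "model" is just h, and the error is minimal at h = 4.
best_h, model = select_hidden_units(
    range(10),
    fit=lambda examples, h: h,
    error=lambda m, examples: abs(m - 4),
)
print(best_h)
```

Because the validation subset is carved out of the training data only, the test folds used later for model assessment never influence the choice of H.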

SVMs are based on the concept of decision planes that define decision boundaries. A decision plane separates a set of objects having different class memberships. SVMs were initially proposed for classification tasks (Cortes and Vapnik, 1995). It then became possible to apply SVMs to regression tasks after the introduction of the ε-insensitive loss function, where ε is the width of the ε-insensitive zone (Smola and Schölkopf, 2004). The main purpose of the SVM is to transform the input data into a high-dimensional feature space using a nonlinear mapping. The SVM then finds the best linear separating hyperplane, related to a set of support vector points, in the feature space. This transformation depends on a kernel function. In this paper, the popular Gaussian kernel was adopted. Its performance is affected by three parameters: γ, the parameter of the kernel; C, a penalty parameter; and ε (only for regression) (Safarzadegan Gilan et al., 2012). The heuristics proposed by Cherkassky and Ma (2004) were used to define two of these parameters: C = 3 (for a standardized output) and ε, computed from the values predicted by a 3-nearest neighbor algorithm and from N, the number of examples. A grid search (similar to the one used for the ANN) over 2^k, with k ∈ {-1, -3, -7, -9}, was adopted to optimize the kernel parameter γ.
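The two building blocks named above, the ε-insensitive loss and the Gaussian kernel, can be written in a few lines. The kernel is shown in the common parameterization exp(-γ‖x - z‖²), which is an assumption since the paper does not give the expression; the function names are illustrative.

```python
import math


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss that ignores errors smaller than epsilon and grows linearly
    with the absolute error beyond that tolerance."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def gaussian_kernel(x, z, gamma):
    """Gaussian (RBF) kernel, assumed here as exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Search grid for the kernel parameter, as described in the text.
gamma_grid = [2.0 ** k for k in (-1, -3, -7, -9)]

print(epsilon_insensitive_loss(2.0, 1.0, 0.25))
print(gaussian_kernel((0.0, 0.0), (1.0, 1.0), gamma_grid[0]))
```

Smaller γ values give a smoother kernel with wider influence of each support vector, which is why the grid spans several orders of magnitude.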

In MR, several independent variables are linearly combined to predict the dependent (output) variable (Hastie et al., 2009). Due to its additive nature, this model is easy to interpret and is widely used in regression tasks. However, one of its main limitations is its inefficiency at modeling problems of a nonlinear nature. MR was essentially used in this paper as a baseline for comparison.

All experiments were conducted using the R statistical environment (Team, 2009) and supported by the RMiner package (Cortez, 2010), which facilitates the implementation of several DM algorithms, i.e. DT, ANN and SVM algorithms, as well as different validation approaches such as cross-validation.

6. DM model evaluation

For model evaluation and comparison, three classification metrics based on the confusion matrix were used (Hastie et al., 2009): recall, precision and F1-score.

The recall measures the proportion of cases of a certain class that were properly captured by the model. In other words, the recall of a certain class is given by True Positives/(True Positives + False Negatives). On the other hand, the precision measures the correctness of the model when it predicts a certain class. More specifically, the precision of a certain class is given by True Positives/(True Positives + False Positives). The F1-score was also calculated, which represents a trade-off between the recall and precision for a given class. The F1-score corresponds to the harmonic mean of precision and recall, according to the following expression: 2 × precision × recall/(precision + recall). For all three metrics, the higher the value is, the better the predictions are. For approaches (3) and (4) (regression models), and for an easy comparison of different models, we take advantage of the regression error characteristic (REC) curve proposed by Bi and Bennett (2003), which plots the error tolerance on the x-axis versus the percentage of points predicted within the tolerance on the y-axis.
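The metrics defined above can be computed directly from paired lists of observed and predicted values. The sketch below is illustrative: the function names and the small data sets are hypothetical, and the REC curve is built on the absolute error, which is one common choice of error measure.

```python
def class_metrics(observed, predicted, target):
    """Return (recall, precision, F1-score) for one class."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == target and p == target)
    fn = sum(1 for o, p in pairs if o == target and p != target)
    fp = sum(1 for o, p in pairs if o != target and p == target)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return recall, precision, f1


def rec_curve(observed, predicted, tolerances):
    """Return (tolerance, fraction of points with |error| <= tolerance)
    pairs, i.e. the points of a regression error characteristic curve."""
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    n = len(errors)
    return [(t, sum(1 for e in errors if e <= t) / n) for t in tolerances]


# Hypothetical class labels (classification approaches).
obs_cls = [2, 2, 2, 3, 3, 4]
pred_cls = [2, 2, 3, 3, 2, 4]
print(class_metrics(obs_cls, pred_cls, target=2))

# Hypothetical index values (regression approaches).
obs_val = [10, 20, 30, 40]
pred_val = [12, 20, 25, 41]
print(rec_curve(obs_val, pred_val, tolerances=[0, 1, 2, 5]))
```

A model whose REC curve lies above another's for every tolerance predicts a larger share of points within that tolerance, so curves closer to the top-left corner indicate better regression models.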

Fig. 9. Decision trees for (a) VRS and (b) RMR systems.

Table 5. Models comparison based on recall, precision and F1-score.

The generalization performance of the models was assessed over 10 runs of a k-fold cross-validation approach (k = 10), where the data (P) are randomly divided into k mutually exclusive subsets (P1, P2, …, Pk) of equal size (Hastie et al., 2009). Training and testing are performed k times, and the overall error of the model is taken as the average of the errors obtained in each iteration. In this scheme, all of the data are used for both training and testing. This method requires approximately k times more computation, because k models must be fitted.
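The validation scheme can be sketched as follows. The `fit` and `error` callables are hypothetical placeholders for any learner and error measure (the paper uses the rminer package for this step); with 99 examples the folds cannot be exactly equal, so the sketch lets their sizes differ by at most one.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds
    whose sizes differ by at most one."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv_error(data, fit, error, k=10, runs=10, seed=0):
    """Average test error over `runs` repetitions of k-fold cross-validation.

    fit   : callable (train_examples) -> model        (hypothetical interface)
    error : callable (model, test_examples) -> float  (hypothetical interface)
    """
    rng = random.Random(seed)
    run_errors = []
    for _ in range(runs):
        fold_errors = []
        for test_idx in k_fold_indices(len(data), k, rng):
            held_out = set(test_idx)
            train = [row for i, row in enumerate(data) if i not in held_out]
            test = [data[i] for i in test_idx]
            fold_errors.append(error(fit(train), test))
        run_errors.append(sum(fold_errors) / k)
    return sum(run_errors) / runs


# Toy check: a mean predictor on constant data has zero error.
toy = [1.0] * 20
cv = repeated_cv_error(
    toy,
    fit=lambda train: sum(train) / len(train),
    error=lambda m, test: sum(abs(m - t) for t in test) / len(test),
)
print(cv)
```

Repeating the whole procedure with different random partitions reduces the dependence of the estimate on any single assignment of examples to folds.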

In addition to the model accuracy, its interpretability is also of high importance, especially from an engineering viewpoint. However, SVM and ANN algorithms in particular, which rely on complex statistical analyses, are frequently referred to as “black boxes” due to their high complexity. To overcome this drawback of data-driven models, Cortez and Embrechts (2013) proposed a novel visualization approach based on sensitivity analysis (SA), which is used in this paper. SA is a simple method applied after the training phase that measures the model responses when a given input is changed, allowing the quantification of the relative importance of each attribute as well as its average effect on the target variable.

In particular, the global sensitivity analysis (GSA) method is applied, which is able to detect interactions among input variables. This is achieved by performing a simultaneous variation of F inputs. Each input is varied through its range with L levels, with the remaining inputs fixed at a given baseline value. In this work, the average input variable value was adopted as the baseline and L was set to 12, which provides a useful level of detail with a reasonable amount of computational effort.

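With the sensitivity response of the GSA, different visualization techniques can be computed. The input importance bar plot shows the relative influence (Ra) of each input variable in the model. To measure this effect, the gradient metric (ga) was first calculated for all inputs. After that, the relative influence was computed as

$$R_a = \frac{g_a}{\sum_{i=1}^{I} g_i} \times 100\ (\%), \qquad g_a = \sum_{j=2}^{L} \frac{\left|\hat{y}_{a,j} - \hat{y}_{a,j-1}\right|}{L-1}$$

where I is the number of inputs and ŷa,j is the sensitivity response for xa,j.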
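A one-dimensional version of this sensitivity analysis can be sketched as below. It assumes that the gradient metric of an input is the mean absolute change of the model response between consecutive levels, and that the relative influence is each gradient divided by the sum of all gradients, which is how the measure of Cortez and Embrechts (2013) is commonly stated. The function name and the toy model are hypothetical, and the simultaneous variation of several inputs is not covered.

```python
def sensitivity_importance(predict, baseline, ranges, levels=12):
    """Relative influence of each input from a 1-D sensitivity analysis.

    predict  : callable (list of input values) -> model response
    baseline : values at which the non-varied inputs are held
    ranges   : (low, high) pair for each input
    levels   : number of levels L spanning each input's range

    Returns the relative influences as fractions that sum to 1.
    """
    gradients = []
    for a, (low, high) in enumerate(ranges):
        responses = []
        for j in range(levels):
            x = list(baseline)
            x[a] = low + (high - low) * j / (levels - 1)
            responses.append(predict(x))
        # Gradient metric: mean absolute change between consecutive levels.
        g = sum(abs(responses[j] - responses[j - 1])
                for j in range(1, levels)) / (levels - 1)
        gradients.append(g)
    total = sum(gradients)
    return [g / total if total else 0.0 for g in gradients]


# Toy linear model in which the first input is three times as influential.
toy_model = lambda x: 3.0 * x[0] + 1.0 * x[1]
importance = sensitivity_importance(
    toy_model, baseline=[0.5, 0.5], ranges=[(0.0, 1.0), (0.0, 1.0)]
)
print([round(v, 3) for v in importance])
```

Multiplying the returned fractions by 100 gives the percentages shown in an input importance bar plot.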

7. DM results analysis and interpretation

7.1. A classification approach

A hierarchical volcanic rock mass rating was developed based on a DT algorithm, taking as model inputs the variables P1, P2, P3, P4, P5 and P6 from the VRS classification system (hereafter named HVR). A similar approach was followed considering instead the variables P1, P2, P3, P4, P5 and P6 from the RMR system (hereafter named HRMR). Fig. 9 depicts the decision trees obtained with these two approaches. Table 5 summarizes the recall, precision and F1-score values for each class according to all models proposed in this work.

Fig. 10. (a) HVR and (b) HRMR performance.

Fig. 11. HVR and HRMR relative importance.

Fig. 10 shows the observed versus predicted classes using the HVR and HRMR models. For each observed class (x-axis), the percentage of each predicted class (y-axis) is shown. Fig. 10a shows that the HVR model is unable to correctly identify class 1 and that its best performance is for class 4. Around 90% of the VR cases observed as class 1 (true condition) were classified by the HVR model as class 2 and 10% as class 3. The HRMR model (Fig. 10b) is also unable to correctly identify every VR class: the proposed DT is not able to identify classes 1 and 5. The best performance is observed for class 2, for which an F1-score of around 73% was achieved (see Table 5). For classes 3 and 4, the proposed system performs moderately well.

Fig. 11 illustrates the relative importance of each input variable for both the HVR and HRMR models. According to the proposed DT based on the HVR system, the three most relevant variables are P2, P4 and P1, with an influence close to 30% each. According to the DT based on the RMR system, the three most relevant variables are P1, P4 and P2, with a total influence of around 94%.

7.2. A regression approach

Instead of predicting the VR class directly, a different approach was followed in which VRS and RMR values were first predicted and then used to calculate the VR class based on the VRS and RMR systems. For this, three different DM algorithms were applied, i.e. MR, ANN and SVM. As mentioned previously, only the four most relevant variables are used as input variables, and these are obtained from an SA performed on a model trained with all six variables of the VRS or RMR systems. Thus, only the attributes P1, P2, P3 and P4 from the VRS and RMR systems were considered.

Fig. 12 depicts the REC curves for both approaches, considering variables from the VRS and RMR systems. Two main observations can be made. On the one hand, a better performance is achieved using attributes from the VRS. On the other hand, ANN and MR present very similar performances, both superior to that of the SVM.
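A REC curve (Bi and Bennett, 2003) plots, for each error tolerance, the fraction of cases predicted within that tolerance; the closer the curve is to the upper left corner, the better the model. A minimal sketch of the computation in Python is given below, using absolute error as the error measure. The function name `rec_curve` and the toy values are hypothetical and are not results from this work.

```python
def rec_curve(observed, predicted, tolerances):
    """Return the REC accuracy for each tolerance: the fraction of cases
    whose absolute error does not exceed that tolerance."""
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    n = len(errors)
    return [sum(e <= t for e in errors) / n for t in tolerances]


# Hypothetical rating values (assumption, for illustration only).
observed = [50, 62, 35, 71, 44]
predicted = [48, 65, 45, 70, 44]
tolerances = [0, 2, 5, 10]
accuracy = rec_curve(observed, predicted, tolerances)
for t, a in zip(tolerances, accuracy):
    print(f"tolerance {t}: {100 * a:.0f}% of cases")
```

Comparing two models then amounts to comparing their curves over the same tolerances: the model whose accuracy is higher at each tolerance dominates the other.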

Taking ANN as a reference, Fig. 13 compares observed versus predicted values according to the ANN models. Fig. 14 shows that, considering variables from either the VRS or the RMR system, F1-score values close to 0.8 were achieved for all classes. The only exception is class 1, for which an F1-score of 0.75 was achieved with the approach where the variables from the VRS system are used as model attributes. Although the results are very similar, using P1, P2, P3 and P4 from the VRS system results in a more accurate VR classification. Based on these results, it is clear that a higher performance is achieved by following a regression approach, i.e. predicting VRS and RMR values first, than by predicting VR classes directly.

Regarding the relative importance of each input variable, Fig. 14 shows that, according to the ANN models, P4 is the most influential variable in VR class identification, with a relative importance of more than 30%. The second most influential variable is P2.

Fig. 12. ANN, SVM and MR models comparison based on REC curves: (a) VRS and (b) RMR.

Fig. 14. Relative importance of each variable according to ANN, SVM and MR models based on VRS and RMR systems.

8.Conclusions

An empirical geomechanical classification system, termed VRS, was developed specifically for volcanic rocks by adapting the more traditional RMR system. A database of volcanic rocks was created using mainly geomechanical information from different archipelagos. The VRS was calibrated and correlated with the RMR system.

DM techniques were applied to predict VR classes. The DM algorithms used include MR, ANN and SVM. All experiments were conducted in the R environment, supported by the RMiner software.

ANN models were used to compare observed and predicted values. Parameter P4 (discontinuity conditions) from the VRS is the most relevant variable and P2 (rock weathering characteristics) is the second most influential parameter. Considering variables from the VRS and RMR systems, two main observations can be made: a better performance is achieved using attributes from the VRS; and the ANN and MR algorithms present very similar performances, both superior to that of the SVM.

Considering the hierarchical models, the HVR model was unable to correctly identify class 1 and its best performance was observed for class 4. The HRMR model also failed to identify some VR classes correctly: the proposed DT is not able to identify classes 1 and 5. Its best performance is observed for class 2. For classes 3 and 4, the proposed system performs reasonably well.

Although this initial classification attempt requires further improvement, it shows that the use of DM tools in the study of volcanic rocks could be very useful, with significant cost reductions. Moreover, sensitivity analysis can help to clarify (for human understanding) these highly complex models, in particular by measuring the relative importance of the model attributes.

For future work, it is proposed that the database be enriched with information from other regions, particularly from Brazil. This will make it possible to establish new correlations for different volcanic formations and to refine the proposed empirical system. This extension will also make it possible to define, for several volcanic formations, a probabilistic description of the strength parameters based on the generalized Hoek-Brown criterion, and of the deformability.

Conflict of interest

The authors wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Acknowledgements

The authors are greatly indebted to Mr. J. Pires Barreto for the information provided about the volcanic rocks included in the database. The authors also wish to thank Dr. Karim Karam, who obtained his PhD from the Massachusetts Institute of Technology, for his valuable comments.

References

Amaral C,Malheiro A,Amaral P.Contribution for the characterization of chemical properties of Azorean rocks(basalts,trachytes,welded ignimbrites,surtseyan tuffs and limestones).In:Natural hazards workshop;2016.

Barton N.TBM tunnelling in jointed and faulted rock.Rotterdam:A.A.Balkema;2000.

Barton N.Shear strength criteria for rock,rock joints,rockfill,interfaces and rock masses.In:Constitutive modeling of geomaterials-advances and new applications.Beijing:Springer;2013.p.1-12.

Berry M,Linoff G.Mastering data mining:the art and science of customer relationships management.New York:John Wiley and Sons;2000.

Bi J,Bennett K.Regression error characteristics curves.In:Proceedings of the 20th international conference on machine learning.Washington:AAAI Press;2003.p.43-50.

Bieniawski Z.Engineering rock mass classifications.John Wiley and Sons;1989.

Cabrera J.Foundation investigation and treatment for the main dam,Itaipu Project.In:Proceedings of the 2nd international conference on case histories in geotechnical engineering;1988.p.185-94.

Cafofo P,Sousa LR.Innovative underground works at Socorridos,Madeira island,Portugal.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.73-80.

Cherkassky V,Ma Y.Practical selection of SVM parameters and noise estimation for SVM regression.Neural Networks 2004;17(1):113-26.

Concha-Dimas A,Vargas-Godinez J.Effects of flow structure in lavas from Sierra de Guadalupe,Northern Mexico City,on point load index and rock mass quality evaluation.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.67-72.

Cortes C,Vapnik V.Support vector networks.Machine Learning 1995;20(3):273-97.

Cortez P,Embrechts MJ.Using sensitivity analysis and visualization techniques to open black box data mining models.Information Sciences 2013;225:1-17.

Cortez P.Data mining with neural networks and support vector machines using the R/rminer tool.In:Proceedings of advances in data mining-applications and theoretical aspects;2010.p.572-83.

Costa M,Nunes JC,Constância JP,Borges P,Barcelos P,Pereira F,Farinha N,Góis J.Azores volcanic caves.Ponta Delgada.2008.

Costa P,Sousa LR,Baião C,Rosa S.Caniçal tunnel,Madeira island.Geotechnical analysis.In:Proceedings of the 4th workshop on applications of computational mechanics in geotechnical engineering;2003.p.63-73.

Feng XT,Hudson J.Rock engineering design.London:Taylor and Francis Group;2011.

Garg A,Garg A,Tai K,Sreedeep S.An integrated SRM-multi-gene genetic programming approach for prediction of factor of safety of 3-d soil nailed slopes.Engineering Applications of Artificial Intelligence 2014;30:30-40.

Gaspar JL,Queirós G,Ferreira T.Geological hazards at the Azores region.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.11-8.

Gomes Correia A,Cortez P,Tinoco J,Marques R.Artificial intelligence applications in transportation geotechnics. Geotechnical and Geological Engineering 2013;31(3):861-79.

González de Vallejo LI,Hijazo T,Ferrer M,Seisdedos J.Geomechanical characterization of volcanic materials in Tenerife.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.21-8.

Hastie T,Tibshirani R,Friedman J.The elements of statistical learning:data mining,inference,and prediction.New York:Springer-Verlag;2009.

He M,Sousa LR,Miranda T,Zhu G.Rockburst laboratory tests database-application of data mining techniques.Engineering Geology 2015;185:116-30.

Herrera RL.The application of rock mechanics to the analysis of rock foundations of concrete dams.MSc Thesis.São Paulo:University of São Paulo;2005(in Portuguese).

Hoek E,Brown E.Practical estimates of rock mass strength.International Journal of Rock Mechanics and Mining Sciences 1997;34(8):1165-86.

Hoek E,Carranza-Torres C,Corkum B.Hoek-Brown failure criterion-2002 edition.In:Proceedings of the 5th North American rock mechanics symposium;2002.p.267-73.

Ito Y,Agui K,Kusakabe Y,Sakamoto T.Rock failures in volcanic rock area in Hokkaido.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.155-9.

Javadi AA,Ahangar-Asr A,Johari A,Faramarzi A,Toll D.Modelling stress-strain and volume change behaviour of unsaturated soils using an evolutionary based data mining technique,an incremental approach.Engineering Applications of Artificial Intelligence 2012;25(5):926-33.

Jover Carmona F,Signorelli S,Pacheco Cabrera M,Zafrilla S.The stability state of the Jameos del Agua lava tube cave over the auditorium by accurate site investigations.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.81-9.

Kenig S,Ben-David A,Omer M,Sadeh A.Control of properties in injection molding by neural networks.Engineering Applications of Artificial Intelligence 2001;14(6):819-23.

Liao S,Chu P,Hsiao P.Data mining techniques and applications.A decade review from 2000 to 2011.Expert Systems with Applications 2012;39(12):11303-11.

Malheiro A,Nunes JC,Sousa LR,Marques F.Conservation of volcanic caves at Azores islands,Portugal.In:International symposium on scientific problems and long-term preservation of large-scale ancient underground engineering;2015.p.1-8.

Malheiro A,Nunes JC.Volcano stratigraphic profiles for the Azores region:a contribution for the EC8 regulations and the characterization of volcanic rocks geomechanical behavior.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.59-64.

Martins F,Miranda T.Estimation of the rock deformation modulus and RMR based on data mining techniques.Geotechnical and Geological Engineering 2012;30(4):787-801.

Menezes AT,Varela FM,Sousa LR.Road tunnel Caniço-Camacha,Madeira island.Geotechnical study.In:Proceedings of the 2nd Portuguese-Spanish geotechnical symposium;2005.p.179-89(in Portuguese).

Miranda T.Contribution to the calculation of geomechanical parameters for underground structures modelling in granite formations.MSc Thesis.Guimarães:University of Minho;2003(in Portuguese).

Miranda T,Gomes Correia A,Santos M,Sousa LR,Cortez P.New models for strength and deformability parameters calculation in rock masses using data mining techniques.International Journal of Geomechanics 2011;11(1):44-58.

Miranda T,Sousa LR,Roggenthen W,Sousa RL.Application of data mining techniques for the development of new rock mechanics constitutive models.In:Springer series in geomechanics and geoengineering.Springer;2013.p.735-40.

Miranda T,Sousa LR,Tinoco J.Updating of the hierarchical rock mass rating(HRMR)system and a new subsystem developed for weathered granite formations.International Journal of Mining Science and Technology 2014;24:769-75.

Miranda T,Sousa LR.Application of data mining techniques for the development of geomechanical characterization models for rock masses.In:Innovative numerical modeling in geomechanics.London:Taylor and Francis;2012.p.245-64.

Miranda T.Geomechanical parameters evaluation in underground structures.Artificial intelligence,Bayesian probabilities and inverse methods.PhD Thesis.Guimarães:University of Minho;2007.

Moura F,Sousa LR.Road tunnels at Madeira island,Portugal.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.201-6.

Ojima L.Methodology of rock mass classifications for tunneling.LNEC Thesis.Lisbon:Laboratório Nacional de Engenharia Civil(LNEC);1981(in Portuguese).

Pedro J,Sousa LR,Ramos J,Teles M.Study by the finite element method of the Água Vermelha dam foundation.Technical Report.Lisbon:LNEC;1975 (in Portuguese).

Rocha M.Underground structures.Technical Report.Lisbon:LNEC;1976(in Portuguese).

Rocscience.RocData,Version 5.003.2015.https://www.rocscience.com.

Russell S,Norvig P.Artificial intelligence:a modern approach.2nd ed.Prentice Hall;2003.

Sadowski G.A short review on the importance of colonnades,entablatures and“fault joints”for the excavation of basaltic rocks.Soils and Rocks 2012;35(3):257-302.

Safarzadegan Gilan S,Bahrami Jovein H,Ramezanianpour A.Hybrid support vector regression-particle swarm optimization for prediction of compressive strength and RCPT of concretes containing metakaolin.Construction and Building Materials 2012;34:321-9.

Serafim JL,Pereira P.Considerations on the geomechanical classification of Bieniawski.In:International symposium on engineering geology and underground construction;1983.p.33-42.

Serrano A,Olalla C.Strength and deformability of low density pyroclasts.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.35-43.

Serrano A,Perucho A,Estaire J.Foundations on grounds with caverns.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.123-8.

Signorelli S,Jover Carmona FL,Pacheco Cabrera ML,Zafrilla S.The Jameos del Agua cave(Lanzarote,Canary Islands):some morphological and geological features of a spectacular lava tube adapted to auditorium.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.45-51.

Silveira J.Practical evaluation of Lugeon’s and Pautre’s criteria for leakages through concrete dams.São Paulo:International Commission on Large Dams(ICOLD);2009.

Simic D.Foundation of the “Los Tilos” arch bridge in La Palma island.In:International workshop on volcanic rocks,proceedings of the 11th ISRM congress.ISRM;2007.p.113-21.

Smola A,Schölkopf B.A tutorial on support vector regression.Statistics and Computing 2004;14(3):199-222.

Sousa LR,Miranda T,Roggenthen W,Sousa RL.Models for geomechanical characterization of the rock mass formations at DUSEL using data mining techniques.In:Proceedings of the 46th US rock mechanics/geomechanics symposium(ARMA 2012).American Rock Mechanics Association;2012.

Sousa LR,Oliveira M.Volcanic cave at João do Rego street.Technical Report 142/04.Lisbon:LNEC;2004(in Portuguese).

Stoffer P.Rocks and geology of the San Francisco Bay region.Bulletin 2195.U.S.Department of the Interior;2002.

Team R.R:a language and environment for statistical computing.Vienna,Austria:R Foundation for Statistical Computing;2009.

Tinoco J,Gomes Correia A,Cortez P.A novel approach to predicting Young’s modulus of jet grouting laboratory formulations over time using data mining techniques.Engineering Geology 2014a;169:50-60.

Tinoco J,Gomes Correia A,Cortez P.Support vector machines applied to uniaxial compressive strength prediction of jet grouting columns.Computers and Geotechnics 2014b;55:132-40.

Tinoco J,Gomes Correia A,Cortez P.Jet grouting column diameter prediction based on a data-driven approach.European Journal of Environmental and Civil Engineering 2016.https://doi.org/10.1080/19648189.2016.1194329.