Real-time analysis and prediction of shield cutterhead torque using optimized gated recurrent unit neural network

2022-08-24 10:02

Song-Shun Lin, Shui-Long Shen, Annan Zhou

a Department of Civil Engineering, School of Naval Architecture, Ocean, and Civil Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China

b Key Laboratory of Intelligent Manufacturing Technology, Department of Civil and Environmental Engineering, College of Engineering, Shantou University, Shantou, 515063, China

c Discipline of Civil and Infrastructure, School of Engineering, Royal Melbourne Institute of Technology (RMIT), Melbourne, Victoria 3001, Australia

d Department of Civil and Environmental Engineering, National University of Singapore, 117576, Singapore

Keywords: Earth pressure balance (EPB) shield tunneling; Cutterhead torque (CHT) prediction; Particle swarm optimization (PSO); Gated recurrent unit (GRU) neural network

ABSTRACT An accurate prediction of earth pressure balance (EPB) shield moving performance is important to ensure safe tunnel excavation. A hybrid model is developed based on particle swarm optimization (PSO) and a gated recurrent unit (GRU) neural network. PSO is utilized to assign the optimal hyperparameters of the GRU neural network. The work comprises four main steps: data collection and processing, hybrid model establishment, model performance evaluation, and correlation analysis. The developed model provides an alternative for handling time-series data from tunnel projects. In addition, a novel framework for model application is presented to provide guidelines for practice. A tunnel project is used to evaluate the performance of the proposed hybrid model. Results indicate that both geological and construction variables are significant to the model performance. Correlation analysis shows that construction variables (main thrust and foam liquid volume) display the highest correlation with the cutterhead torque (CHT). This work provides a feasible and applicable alternative for estimating the performance of shield tunneling.

1. Introduction

In recent years, subway construction has developed at an unprecedented pace in China (Qiao et al., 2019; Peng et al., 2021). The earth pressure balance (EPB) construction method has been widely applied to tunnel excavation in urban areas due to its fast construction speed, low cost and environmental friendliness (Qian and Lin, 2016; Qiao et al., 2019). Tunnel excavation inevitably disturbs the surrounding soil, changing its stress state and causing soil deformation (Sirivachiraporn and Phienwej, 2012; TanLong, 2021). To mitigate the impacts on the surrounding environment and ensure safe tunnel excavation, onsite workers have to regulate the construction variables manually according to the geological conditions and the status of tunnel construction. Cutterhead torque (CHT) is one of the significant construction variables in tunnel excavation and is highly correlated with excavation safety in engineering practice. Under these circumstances, real-time analysis and estimation of shield moving performance become important. Tunnel excavation is a complex activity involving various aspects, e.g. construction technology and management. Interactions between underground structures and soils are complicated. Moreover, the uncertainty and fuzziness of geological environments pose great challenges to data analysis (Zhang et al., 2021a, b). Existing analysis of construction variables mainly relies on empirical and analytical analysis, numerical simulations and experimental methods.
Empirical analysis for tunnel projects emphasizes the investigation of case histories and engineering practice (Esmailzadeh et al., 2018; Foderà et al., 2020). Zhang et al. (2005) investigated the impacts of cutting wheel rotation, thrust force and earth pressure on shield CHT via multivariate statistical analysis, and proposed a mathematical model for shield CHT estimation in soft soil. Chen et al. (2012) developed a modified equation to calculate thrust, in which the pressure distribution pattern and the effects of soil compression were considered. Wang et al. (2017) investigated the evolution pattern of the disc cutter through theoretical and experimental approaches. Zhao et al. (2019) proposed a theoretical TBM excavation torque model for mixed strata that considered the interactions between the rock-soil interface and the cutterhead. Ren et al. (2018) proposed a prediction model on the basis of friction energy and stress analysis. Analytical approaches require a series of assumptions on the physical issues related to tunnel construction (Cheng et al., 2020a, 2021). These issues are usually transformed into partial differential equations. As for numerical simulations, rigorous boundary conditions and physical assumptions are essential (Chen et al., 2020; Yin et al., 2020). Moreover, the accuracy of simulations relies heavily on geological parameters obtained from construction sites. Agrawal et al. (2021) illustrated the interactions between the rock and the cutter, in which simulations were conducted to compute wear and the normal and tangential forces under different conditions. She et al. (2022) developed a disc cutter prediction model considering the effects of the dense core, whose length was derived from theoretical analysis. As for experimental methods, the design of procedures is time-consuming (Chen et al., 2018; Song et al., 2019). Moreover, the results of the above-mentioned methods can only be applied to specific geological conditions. Besides, analyses of tunnel projects via these methods mainly focus on environmental impacts, shield cutterhead wear, and grouting effects during tunnel construction. Reasonable analysis of construction variables can provide references for the regulation of parameters and thus reduce the risk in tunnel construction.

Datasets of tunnel construction are large, the relationships among variables are nonlinear, and the construction variables form a time series. Comprehensive investigation of construction variables is therefore a significant step for tunnel construction. In recent years, machine learning techniques have provided an alternative way to deal with the issues discussed above (Jin and Yin, 2020; Jin et al., 2020; Liu et al., 2021; Shahrour and Zhang, 2021; Zhang et al., 2022). Machine learning techniques are powerful tools for analyzing nonlinear relationships among variables (Ghimire et al., 2019; Jin et al., 2019a, b; Bardhan et al., 2021a, 2021b; Busari and Lim, 2021; Mahmoodzadeh et al., 2021). There are numerous machine learning methods (Ray et al., 2021; Yin et al., 2016; Tao et al., 2020a, b; Zhang et al., 2020a, b; Tang and Na, 2021), e.g. the back-propagation neural network (Chen et al., 2019; Ranasinghe et al., 2017) and the support vector machine (Cheng et al., 2020b). Methods for nonlinear and time-series datasets include the artificial neural network (ANN), particle swarm optimization-support vector machine (PSO-SVM), and convolutional neural network-long short-term memory (CNN-LSTM). ANN and PSO-SVM are mainly used for data with nonlinear relationships, while CNN-LSTM is widely adopted for time-series data with nonlinear relationships. The gated recurrent unit (GRU) was developed on the basis of LSTM, which consumes considerable computing resources (Chung et al., 2014). A GRU neural network can address both the nonlinear and the time-series characteristics of data while requiring fewer computing resources. Moreover, the values of hyperparameters (such as the learning rate) greatly affect the model's performance (Gao et al., 2020; Zhang et al., 2021c). PSO provides an alternative way to obtain optimal hyperparameters for the GRU neural network.

To address the aforementioned problems, machine learning techniques and optimization algorithms are integrated to estimate the EPB shield moving performance. A hybrid CHT prediction model integrating PSO with a GRU neural network is proposed. Moreover, a framework for model application is illustrated, which aims to provide references for engineering practice.

Fig. 1. Steps of the PSO algorithm.

2. Methodology

2.1. PSO

PSO was proposed to simulate the foraging behavior of birds. It operates on a swarm of particles, each described by its velocity vector Vi(t) and location vector xi(t). PSO determines the optimal location through particle movement: at each iteration, particles move toward the best known locations and update their own velocities. The best location and global fitness are obtained after n iterations or once the convergence requirement is met. The velocity and location of particles are updated (Kennedy and Eberhart, 1995) by

$$V_i(t+1) = \omega V_i(t) + c_1 \gamma_1 \left[P_i(t) - x_i(t)\right] + c_2 \gamma_2 \left[P_g(t) - x_i(t)\right] \tag{1}$$

$$x_i(t+1) = x_i(t) + V_i(t+1) \tag{2}$$

where ω is the inertia weight, c1 and c2 are the acceleration coefficients, γ1 and γ2 are random numbers distributed in the interval [0, 1], Pi(t) is the personal best location of the ith particle, and Pg(t) is the global best location among all particles.

In the standard PSO, the inertia weight is set to 1. The inertia weight is a parameter affecting the velocity of the current particle. A larger inertia weight favors global search, while a smaller value benefits local search. An improvement is made on the standard PSO (termed IPSO) to increase its global optimization ability and convergence speed. The modified inertia weight is given by

where ωmax and ωmin are the maximum and minimum values of the inertia weight, respectively; k is the current iteration; and kmax is the maximum number of iterations.
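As a hedged illustration, one widely used schedule that matches these symbols is a linearly decreasing inertia weight. Assuming that form (the exact IPSO expression may differ), the weight at iteration k and a worked value for the settings reported in Section 3.3 (ωmax = 0.9, ωmin = 0.4, kmax = 150) would be:

```latex
\omega(k) = \omega_{\max} - \left(\omega_{\max} - \omega_{\min}\right)\frac{k}{k_{\max}},
\qquad
\omega(75) = 0.9 - (0.9 - 0.4)\,\frac{75}{150} = 0.65
```

Under this assumption the swarm starts with ω = 0.9 (emphasis on global search) and ends with ω = 0.4 (emphasis on local search).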

Fig. 1 summarizes the basic steps of the PSO algorithm according to the concept of Kennedy and Eberhart (1995), which are described below.

(1) Step 1: Define the target function and initialize the parameters of the PSO algorithm. The initialized parameters include the inertia weight, swarm size, acceleration coefficients, and the velocities and locations of particles.

(2) Step 2: Calculate the fitness values of particles. The fitness value and location of each particle are compared with those of its current personal best. If the fitness value is better than that of the current personal best, the personal best is updated; otherwise, the personal best is kept. The global best is identified from the best location among all particles and its corresponding fitness value.

(3) Step 3: Update the location and velocity of particles by Eqs. (1) and (2).

(4) Step 4: Judge whether the current state meets the terminal criteria. If so, the PSO algorithm is terminated and the optimal solution is output. Otherwise, return to Step 2.
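The four steps above can be sketched in plain Python. This is a minimal illustration rather than the authors' implementation: the function name `pso_minimize`, the box `bounds`, the toy `sphere` objective, the position clamping and the linearly decreasing inertia weight are all assumptions made for the example. The default settings mirror those reported in Section 3.3.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, k_max=150,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, v_max=1.0, seed=0):
    """Minimize `objective` inside a box; returns (best_location, best_fitness)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: initialize locations and velocities of all particles.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    # Step 2 (first pass): personal bests and the global best.
    p_best = [xi[:] for xi in x]
    p_fit = [objective(xi) for xi in x]
    g_idx = min(range(n_particles), key=p_fit.__getitem__)
    g_best, g_fit = p_best[g_idx][:], p_fit[g_idx]

    # Step 4: the only terminal criterion here is the iteration budget k_max.
    for k in range(k_max):
        # Assumed schedule: inertia weight decreases linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * k / k_max
        for i in range(n_particles):
            for d in range(dim):
                g1, g2 = rng.random(), rng.random()
                # Step 3: velocity update, clamped to [-v_max, v_max].
                v[i][d] = (w * v[i][d]
                           + c1 * g1 * (p_best[i][d] - x[i][d])
                           + c2 * g2 * (g_best[d] - x[i][d]))
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # Step 3: location update, kept inside the search box.
                lo, hi = bounds[d]
                x[i][d] = max(lo, min(hi, x[i][d] + v[i][d]))
            # Step 2: refresh the personal best and, if needed, the global best.
            fit = objective(x[i])
            if fit < p_fit[i]:
                p_best[i], p_fit[i] = x[i][:], fit
                if fit < g_fit:
                    g_best, g_fit = x[i][:], fit
    return g_best, g_fit

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(xi * xi for xi in x)

best_x, best_f = pso_minimize(sphere, bounds=[(-5.0, 5.0)] * 2)
```

In the hybrid model, `objective` would be replaced by a function that trains the GRU network with the candidate hyperparameters and returns its fitness value.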

2.2. GRU neural network

The recurrent neural network (RNN) differs from a general neural network in the connection mode of its neurons: information in an RNN flows through a directional loop, whereas in a general neural network it is transferred directly from the input layer to the output layer. As a variant of the long short-term memory (LSTM) neural network (Hochreiter and Schmidhuber, 1997), the GRU neural network has great potential for dealing with time-series data.

The GRU neural network is mainly composed of four layers: input layer, GRU hidden layer, dense layer and output layer. Compared with the LSTM unit, the GRU has two modifications. First, the input, output and forget gates of the LSTM unit are replaced by reset and update gates (Chung et al., 2014; Abdelgwad et al., 2021). Second, the cell state and output of the LSTM unit are merged into a single state (ht). The structure of the GRU is presented in Fig. 2. The reset gate controls how the new input is combined with the previous memory, while the update gate reflects the significance of the information in the current state.

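The gate computations are given (Chung et al., 2014) by

$$r_t = \sigma\left(\alpha_r x_t + \beta_r h_{t-1} + b_r\right)$$

$$z_t = \sigma\left(\alpha_z x_t + \beta_z h_{t-1} + b_z\right)$$

$$\mu_t = \tanh\left(\alpha_h x_t + \beta_h \left(r_t \odot h_{t-1}\right) + b_h\right)$$

$$h_t = \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \mu_t$$

where xt and ht are the input and output of the GRU at time step t, respectively; ht-1 is the output at the previous time step; αr and βr are the weights for the reset gate, while αz and βz are the weights for the update gate; br and bz are the offset vectors for the reset and update gates, respectively; μt is the candidate hidden state; rt and zt are the outputs of the reset gate and update gate, respectively; αh and βh are the weights for the hidden state; bh is the offset vector for the hidden state; σ(·) and tanh(·) are the sigmoid and hyperbolic tangent activation functions, respectively; and ⊙ is the element-wise product.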
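A single GRU step can be sketched in plain Python to make the role of each gate concrete. The sketch assumes the standard formulation of Chung et al. (2014), in which the α weights act on the input xt, the β weights act on the previous state ht-1, and the new state is ht = (1 − zt) ⊙ ht-1 + zt ⊙ μt. The helper names `affine` and `gru_step` and the example weights are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def affine(alpha, x, beta, h, b):
    """Return alpha @ x + beta @ h + b for list-of-lists matrices and list vectors."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(a_row, x))
            + sum(b_ij * h_j for b_ij, h_j in zip(b_row, h)) + b_i
            for a_row, b_row, b_i in zip(alpha, beta, b)]

def gru_step(x_t, h_prev, params):
    """One GRU step; `params` maps "r", "z", "h" to (alpha, beta, bias) triples."""
    alpha_r, beta_r, b_r = params["r"]
    alpha_z, beta_z, b_z = params["z"]
    alpha_h, beta_h, b_h = params["h"]
    # Reset and update gates, each squashed into (0, 1) by the sigmoid.
    r_t = [sigmoid(a) for a in affine(alpha_r, x_t, beta_r, h_prev, b_r)]
    z_t = [sigmoid(a) for a in affine(alpha_z, x_t, beta_z, h_prev, b_z)]
    # Candidate state: the reset gate scales the previous state element-wise.
    gated = [r * h for r, h in zip(r_t, h_prev)]
    mu_t = [math.tanh(a) for a in affine(alpha_h, x_t, beta_h, gated, b_h)]
    # Assumed convention: z_t weights the candidate, (1 - z_t) the previous state.
    return [(1.0 - z) * h + z * m for z, h, m in zip(z_t, h_prev, mu_t)]

# Hypothetical 2-input, 2-unit cell with small fixed weights.
w = [[0.1, -0.2], [0.3, 0.05]]
params = {"r": (w, w, [0.0, 0.0]), "z": (w, w, [0.0, 0.0]), "h": (w, w, [0.0, 0.0])}
h_t = gru_step([1.0, -1.0], [0.0, 0.0], params)
```

With all weights and offsets set to zero, both gates equal 0.5 and the candidate state is zero, so the step simply halves the previous state; this gives a quick sanity check of the update rule.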

Fig. 2. Structure of GRU.

Fig. 3. Flowchart of the developed model.

2.3. Establishment of hybrid prediction model

In the developed hybrid model, PSO and IPSO are utilized to obtain optimal hyperparameters for the GRU neural network. The flowchart of model establishment is presented in Fig. 3, and the steps are described below.

(1) Step 1: Collect the datasets from the engineering project. In tunnel engineering, data of geological variables are obtained and processed from the geological investigation report, while the datasets of construction variables are collected automatically from the EPB shield machine. The construction variable dataset in a tunneling project is a time series.

(2) Step 2: Divide the datasets into training and testing datasets. Generally, training datasets account for 80% and testing datasets for 20%.

(3) Step 3: Initialize the parameters of PSO (IPSO) and the GRU neural network. The initialization of PSO parameters is described in Section 2.1. The initialization of the GRU neural network mainly covers its structure (input layer, GRU hidden layers, dense layer and output layer), the number of neurons in each layer, the time steps, the optimization method and the activation functions.

(4) Step 4: Determine the target function and compute the fitness values of particles. The fitness values of particles are calculated as

The best fitness value is obtained by comparing the location of each particle with its local optimal location. When the fitness value of a particle is smaller than that of its current personal best, that location becomes the new personal best. The global optimal location is obtained by comparing the current personal best locations with the global best location of all particles.

(5) Step 5: Judge whether the current situation meets the terminal criteria. The optimal hyperparameters are obtained when the terminal criteria are met. Otherwise, Step 4 is repeated.

(6) Step 6: Train the GRU neural network with the optimal hyperparameters to obtain the optimal GRU estimation model. The established model is then used to predict the target variable.
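The data-handling side of Steps 2 to 5 can be sketched as follows. To stay self-contained, the sketch replaces GRU training with a small linear predictor fitted by stochastic gradient descent, uses the validation mean squared error as an assumed fitness function, and scores a short list of candidate learning rates instead of running a full swarm. The synthetic sine series and all function names are hypothetical.

```python
import math

def split_ordered(samples, train_ratio=0.8):
    """Chronological split: the first 80% for training, the rest held out."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]

def make_windows(series, time_steps=5):
    """Pair each window of `time_steps` past values with the value that follows it."""
    return [(series[i:i + time_steps], series[i + time_steps])
            for i in range(len(series) - time_steps)]

def predict(w, b, x):
    return sum(w_i * x_i for w_i, x_i in zip(w, x)) + b

def train_stand_in(windows, lr, epochs=50):
    """Stand-in for GRU training: linear predictor fitted by stochastic gradient descent."""
    w, b = [0.0] * len(windows[0][0]), 0.0
    for _ in range(epochs):
        for x, y in windows:
            err = predict(w, b, x) - y
            w = [w_i - lr * err * x_i for w_i, x_i in zip(w, x)]
            b -= lr * err
    return w, b

def fitness(lr, fit_part, valid_part):
    """Assumed fitness: validation mean squared error for one candidate learning rate."""
    w, b = train_stand_in(fit_part, lr)
    return sum((predict(w, b, x) - y) ** 2 for x, y in valid_part) / len(valid_part)

series = [math.sin(0.1 * t) for t in range(300)]      # synthetic stand-in for CHT records
windows = make_windows(series, time_steps=5)           # time step of 5, as in Section 3.3
train, test = split_ordered(windows)                   # Step 2: 80% / 20%
fit_part, valid_part = split_ordered(train)            # keep the test set untouched
candidates = [0.001, 0.01, 0.1]                        # locations a particle might visit
scores = {lr: fitness(lr, fit_part, valid_part) for lr in candidates}
best_lr = min(scores, key=scores.get)                  # lowest fitness wins
```

In a full implementation the `fitness` function would be called by the PSO or IPSO search for each particle location, and the final model would be retrained with `best_lr` before being evaluated on `test`.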

2.4. Performance indicators

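Performance indicators are adopted to evaluate the prediction accuracy of the proposed model. In this article, the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are adopted. In their standard forms, the three indicators are

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$

where y_i and ŷ_i are the measured and predicted values of the ith sample, respectively, and N is the number of samples.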
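The three indicators can be computed with a few lines of Python. The sketch uses the standard textbook definitions, with MAPE expressed in percent; the sample torque values are made up for illustration.

```python
import math

def rmse(measured, predicted):
    """Root mean square error."""
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured))

def mae(measured, predicted):
    """Mean absolute error."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def mape(measured, predicted):
    """Mean absolute percentage error in percent; measured values must be nonzero."""
    return 100.0 * sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / len(measured)

# Hypothetical measured and predicted values.
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(rmse(measured, predicted), mae(measured, predicted), mape(measured, predicted))
```

RMSE penalizes large errors more heavily than MAE, while MAPE is scale-free, which is why the three are usually reported together.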

3. Case study

3.1. Project descriptions

A tunnel engineering project in Shenzhen (Shenzhen International Airport) is adopted to evaluate the performance of the proposed hybrid model. The total length of the tunnel is 3.295 km. The cover depth of the tunnel (from ground surface to tunnel crown) ranges from 8 m to 15 m. Part of the geological profile of the construction site is displayed in Fig. 4b. Composite strata are the distinct characteristic of the geological conditions: the strata consist of soft soil in the upper section and hard rock (slightly weathered rock) in the lower section of the tunnel. Fig. 4 presents the framework of model application. Datasets are essential for establishing the hybrid model and are described in Section 3.2. In this study, the GRU neural network consists of input layers, GRU hidden layers, dense layers and output layers, which are presented in Fig. 4a. The steps of model establishment are described in Section 2.3.

3.2. Data descriptions

Safe tunnel construction relies heavily on geological conditions. Hence, the geological conditions within the tunnel line are considered, and the geological data have to be processed prior to model development. Due to the spatial variability of lithology, the strata within the tunnel line are divided into different layers based on the geological investigation report. The division of strata within the tunnel line is presented in Fig. 4c. There are m strata within the tunnel line, with thicknesses denoted as h1, h2, …, hm. The thickness-weighted average of the jth geological variable is introduced to account for the effects of the different strata within the tunnel line:

$$\gamma_j = \frac{\sum_{i=1}^{m} h_i \eta_{ij}}{\sum_{i=1}^{m} h_i}$$

where γj is the average value of the jth geological variable, hi is the thickness of the ith stratum, ηij is the measured value of the jth geological variable in the ith stratum, and m is the number of strata within the tunnel line.
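As a small worked example, the averaging can be written as a thickness-weighted mean, assuming that each stratum contributes in proportion to its thickness within the tunnel line. The function name and the three-layer values below are hypothetical.

```python
def thickness_weighted_average(thicknesses, values):
    """Average a geological variable over strata, weighting each stratum by its thickness."""
    total = sum(thicknesses)
    if total <= 0:
        raise ValueError("total stratum thickness must be positive")
    return sum(h * v for h, v in zip(thicknesses, values)) / total

# Hypothetical profile: three strata with thicknesses in m and unit weights in kN/m^3.
h = [2.0, 3.0, 1.0]
unit_weight = [18.0, 19.0, 25.0]
gamma = thickness_weighted_average(h, unit_weight)   # (36 + 57 + 25) / 6
```

A thin hard-rock layer therefore shifts the average less than a thick soft-soil layer with the same contrast in properties.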

Geological variables are determined and processed based on the geological investigation report. Data of geological variables are processed prior to tunnel excavation, while data of construction variables are obtained from the automatic data collection system of the EPB shield machine. The stratigraphic distribution is plotted (Fig. 4b) based on the geological investigation report. In engineering practice, tunnel segments are also displayed on the map of stratigraphic distribution (Fig. 4e). Fig. 4d shows the segment-time relations. Each dataset is therefore defined within one tunnel segment, and the geological variables within the tunnel line are processed on that basis. During tunnel construction, segments are installed one by one along the tunnel line, so the number of installed segments increases with construction time (Fig. 4d). The data of each tunnel segment consist of the values of the geological and construction variables. The processed geological data, combined with the construction data obtained from the EPB shield machine over time, can be viewed as a time series in a broad sense. According to the construction situation, there are five geological variables and nine construction variables. A total of 1000 datasets were collected and processed from the tunnel engineering project, of which 80% are used as training datasets and 20% as testing datasets. Table 1 presents the statistical characteristics of the considered variables for the 1000 datasets. Fig. 5 displays the distribution of the construction variables.

Before model establishment, the data (training and testing datasets) have to be transformed into the interval [-1, 1] through Eq. (11). Scaling the data to [-1, 1] reduces the influence of the different scales of the variables on the model performance.
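A common way to map data into [-1, 1] is min-max scaling. The sketch below assumes that form, which may differ in detail from the transformation used in the study; the function name and the optional `lo` and `hi` arguments are hypothetical.

```python
def scale_to_minus1_1(values, lo=None, hi=None):
    """Min-max scale `values` into [-1, 1]; `lo` and `hi` default to the data range."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("cannot scale a constant variable")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

# Hypothetical values of one construction variable.
scaled = scale_to_minus1_1([10.0, 15.0, 20.0])   # [-1.0, 0.0, 1.0]
```

Passing `lo` and `hi` explicitly allows the same range to be reused for several subsets of one variable.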

Fig. 4. Framework of the model application: (a) GRU neural network, (b) Stratigraphic distribution, (c) Divisions of strata within tunnel line, (d) Segment-time relations, and (e) Tunnel.

3.3. Results

For the hybrid model establishment, the PSO and IPSO algorithms are used to optimize the hyperparameters of the GRU neural network. The hyperparameter values are represented as location vectors in the PSO and IPSO algorithms. In this study, the optimized hyperparameter of the GRU neural network is the learning rate. As for the parameters of the PSO and IPSO algorithms, the number of particles is set to 20, which is enough to search for the optimal location vectors (Trelea, 2003; Zhang et al., 2020b). The remaining settings are c1 = c2 = 2, kmax = 150, ωmax = 0.9, ωmin = 0.4 and [Vmin, Vmax] = [-1, 1]. The maximum number of iterations (150) is one terminal criterion; convergence of the fitness value to a constant is another. The PSO and IPSO algorithms search for a location vector (hyperparameters) that lowers the fitness value. The GRU neural network is initialized with four layers (i.e. input layer, GRU hidden layer, dense layer, and output layer). In general, neural networks with more hidden layers perform better in predicting the target variable but require higher computational cost. Moreover, neural networks with 1-2 hidden layers meet the requirement of estimation errors in geotechnical engineering. According to previous simulation results (Gao et al., 2020; Zhang et al., 2021b), a GRU neural network with one hidden layer performs better than one with two GRU hidden layers. Thus, the number of hidden layers is set to 1. After trials and simulations based on the developed model and collected datasets, the time step is set to 5. The numbers of neurons in the GRU hidden layer and dense layer are set to 20 and 15, respectively. The adaptive moment estimation optimizer is adopted (Kingma and Ba, 2015). Thirteen variables (Table 1) are fed to the input layer of the GRU neural network, while CHT is the output variable.

Table 1 Description of variables.

Different datasets (geological and construction) may have different impacts on the prediction of CHT. For this reason, different kinds of datasets are combined for investigation. In this study, the geological and construction variables are grouped into three categories, which are subsequently used as model input: geological variables, construction variables and all variables. The description is presented in Table 2.

Fig. 5. Distribution of the construction variables.

To analyze the robustness and feasibility of applying PSO and IPSO to the GRU neural network, the fitness values at different iterations are calculated. In this study, ten-fold cross-validation (CV) is adopted (Stone, 1974). Fitness values within 150 iterations are displayed in Fig. 6. As shown in Fig. 6, there is no further reduction of the fitness values in any scenario once the number of iterations exceeds 90. The scenarios show different evolution patterns in terms of iterations to convergence and fitness values. Fitness values of the developed IPSO-GRU model with all variables remain steady except for obvious variations in the initial iterations. Besides, scenario S5 converges quickly, within nearly 40 iterations, compared with the other scenarios. This indicates that the hyperparameters of model S5 have a lower impact on the prediction of CHT. The minimum fitness values of the considered scenarios (i.e. S1, S2, S3, S4, S5 and S6) are 0.0253, 0.022, 0.0187, 0.0163, 0.0116 and 0.0087, respectively. The lowest value is obtained in S6, while the highest is yielded by S1. According to the above analysis, the developed model (IPSO-GRU) with all variables outperforms the other models.
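The index bookkeeping behind ten-fold cross-validation can be sketched as below. The sketch assumes contiguous, unshuffled folds applied to the 800 training samples; how the folds were actually formed in the study is not stated, and the function name is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        valid_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, valid_idx
        start += size

# Assumed use: ten folds over the 800 training samples.
folds = list(k_fold_indices(800, k=10))
```

Each sample serves as validation data exactly once, and a fitness value can then be taken as the error averaged over the ten validation folds.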

Predicted results of CHT for the training and testing datasets in all scenarios (Table 2) are displayed in Fig. 7. RMSE, MAE and MAPE values of the training and testing datasets are calculated for each scenario. As seen in Fig. 7, the developed models (PSO-GRU, IPSO-GRU) with construction variables and with all variables show the lowest estimation errors for the training dataset, as reflected by the three performance indicators. The models with only geological variables show higher estimation errors for the training dataset. For the testing datasets, the CHT points are mainly concentrated around the line (measured = predicted) in scenarios S3-S6, which indicates that the prediction errors are acceptable in geotechnical engineering. The CHT points in S1 and S2 are more scattered. Values of RMSE, MAE and MAPE of the training and testing datasets under the considered scenarios are displayed in Table 3. It can be observed that the developed hybrid model performs better on the training datasets than on the testing datasets. Besides, the RMSE, MAE and MAPE values of S3, S4, S5 and S6 are lower than those of S1 and S2, which indicates that the developed models are sensitive to the input datasets to some extent. Geological conditions are significant to tunnel construction, but it is difficult to estimate the construction parameters with only geological variables. The values of the three indicators under all scenarios show that both the geological and construction variables have a great impact on the model performance. Finally, the improvement of the PSO algorithm benefits both the hyperparameter optimization and the model performance.

Table 2 Scenarios setting description.

3.4. Correlation analysis and discussions

Fig. 6. The fitness values for optimization process.

Estimation of CHT for the EPB shield machine is a key issue for tunnel projects. Accurate prediction of EPB performance reduces the risk of tunnel excavation, so the factors influencing CHT should be evaluated comprehensively. Because the relationship between the input variables and the estimated variable is unclear, correlation analysis is conducted. Data of the input variables and the target variable (CHT) are time series. Pearson's correlation coefficient (PCC) is widely applied in correlation analysis between input variables and a target variable. The value of PCC lies in the range [-1, 1]. A higher absolute value of PCC indicates a stronger correlation between two variables, and the sign of PCC reflects whether the correlation is negative or positive. Thus, PCC is used to analyze the influence of the input variables on the target variable. The equation for PCC is given (Chen et al., 2019) by

$$\rho_{P_i,T} = \frac{\sum_{i=1}^{n}\left(P_i - \bar{P}\right)\left(T_i - \bar{T}\right)}{\sqrt{\sum_{i=1}^{n}\left(P_i - \bar{P}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(T_i - \bar{T}\right)^2}}$$

where ρPi,T is the PCC value of the input variable and target variable; Pi and Ti are the ith values of the input variable and target variable, respectively; P̄ and T̄ are their mean values; and n is the number of samples.
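The coefficient can be computed directly from its standard definition, as in the sketch below. The function name and the short example series are hypothetical.

```python
import math

def pearson(p, t):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(p)
    p_mean, t_mean = sum(p) / n, sum(t) / n
    cov = sum((pi - p_mean) * (ti - t_mean) for pi, ti in zip(p, t))
    p_dev = math.sqrt(sum((pi - p_mean) ** 2 for pi in p))
    t_dev = math.sqrt(sum((ti - t_mean) ** 2 for ti in t))
    if p_dev == 0.0 or t_dev == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (p_dev * t_dev)

# Hypothetical input variable and target values.
thrust = [1.0, 2.0, 3.0, 4.0]
torque = [2.1, 3.9, 6.2, 7.8]
rho = pearson(thrust, torque)
```

A value close to 1 or -1 signals a nearly linear relationship, whereas a value near 0 only rules out a linear one; a nonlinear dependence may still exist.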

Fig. 7. Comparison of predicted and measured CHT for training and testing datasets.

Table 3 Values of three performance indicators under different scenarios.

Fig. 8. Results of correlation analysis.

PCC values are displayed in Fig. 8. As seen in Fig. 8, MT, FLV, VRa, CMa, GV, NWCa and EP are positively correlated with CHT, while SSC, IFAa, GP, UWa, P and DS are negatively correlated with CHT. Among them, the construction variables MT and FLV display the highest correlation with CHT, with PCC values larger than 0.35. The absolute PCC values of the other input parameters are smaller than 0.3. In other words, the correlation between the input variables and the predicted variable is weak, which implies that no simple linear correlation exists. The developed model provides an alternative means of handling nonlinear correlations between input and target variables.

The novelty of this study lies in adopting PSO and IPSO to optimize the hyperparameters of the GRU neural network. The developed hybrid model shows potential for capturing the time-series and nonlinear characteristics of data from tunneling projects, and the proposed framework for model application can provide guidelines for engineering practice. Nevertheless, the developed model has some limitations: (1) various factors (including geological and construction variables) can influence the target variable, but only a limited set of variables is considered in this study according to the construction situation of the project; (2) only 1000 datasets are used to evaluate the performance of the developed hybrid model, and more data (e.g. data from different tunnel projects) should be adopted for evaluation; and (3) the performance of the developed model relies on the specific parameter values used in this study.

4. Conclusions

This study proposed a hybrid model based on PSO and a GRU neural network and illustrated the application of the developed model in practice. The procedure can be summarized as data collection and processing, hybrid model establishment, model performance verification and correlation analysis. The following conclusions can be drawn:

(1) The developed model integrates the PSO and IPSO algorithms with the GRU neural network. PSO and IPSO are used to assign the optimal hyperparameters of the GRU neural network. The developed IPSO-GRU model with all variables performs best in estimating the target parameter. Besides, both geological and construction variables are significant for the model establishment and performance.

(2) A flowchart of the hybrid model for CHT estimation is proposed, which aims to provide an alternative way to deal with time-series data in tunnel engineering. Moreover, a framework for model application is illustrated to give guidelines for engineering practice. A case study of a tunnel project is adopted to check the performance of the proposed model.

(3) Correlation analysis is conducted to evaluate the correlation between the input variables and the target parameter (CHT). Results show that the construction variables MT and FLV display the highest correlation with CHT. The other variables show a weak correlation with CHT, with absolute values of Pearson's correlation coefficient smaller than 0.3.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The research work was funded by “The Pearl River Talent Recruitment Program” of Guangdong Province in 2019 (Grant No. 2019CX01G338), and the Research Funding of Shantou University for New Faculty Member (Grant No. NTF19024-2019).