Application in soft sensing modeling of chemical process based on K-OPLS method

2020-04-21 01:21

LI Jun, LI Kai

(School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China)

Abstract: To address soft sensing modeling for chemical processes with strong nonlinearity and complexity, a soft sensing modeling method based on kernel-based orthogonal projections to latent structures (K-OPLS) is proposed. Orthogonal projections to latent structures (O-PLS) is a general linear multivariate data modeling method that removes systematic variation in the descriptor variables (inputs) that is orthogonal to the response variables (outputs). Within the O-PLS framework, the K-OPLS method maps the descriptor variables to a high-dimensional feature space by means of the “kernel technique” and calculates the predictive components and the response-orthogonal components of the model there. The K-OPLS method thus captures the nonlinear relationship between the descriptor and response variables, which improves model performance and, to a certain extent, model interpretability. To verify its validity, the K-OPLS method was applied to soft sensing modeling of the butane (C4) content at the bottom of a debutanizer column, the quality indices of the key product outputs of an industrial fluidized catalytic cracking unit (FCCU), and the H2S and SO2 concentrations in a sulfur recovery unit (SRU). Compared under the same conditions with support vector machine (SVM), least-squares support vector machine (LS-SVM), support vector machine with principal component analysis (PCA-SVM), extreme learning machine (ELM), kernel-based extreme learning machine (KELM) and kernel-based extreme learning machine with principal component analysis (PCA-KELM), the experimental results show that the K-OPLS method has superior modeling accuracy and good generalization ability.

Key words: kernel method; kernel-based orthogonal projections to latent structures (K-OPLS); soft sensing; chemical process

0 Introduction

In chemical production, the monitoring of product quality and environmental pollution has attracted increasing attention in recent years. Effective control strategies are needed to achieve product quality control and to implement environmental monitoring[1], so it is particularly important to monitor the key variables of the chemical process. Under existing technical conditions, however, these variables are often difficult to measure directly or unsuitable for fast on-line measurement, and quality requirements can only be guaranteed indirectly by controlling other measurable variables. Moreover, on-line measuring instruments are costly, require downtime for maintenance, and sometimes respond with a considerable time lag, which hinders optimal operation in process automation and affects the profits of enterprises. A soft-sensing model[2-3] describes, by means of a mathematical model, the functional relationship between product quality and the measurable key operating variables, controlled variables and disturbance variables. Estimating product quality from process operation data in this way is an effective means of solving the above problems.

Data-driven soft sensor modeling has been applied successfully[4-7], but it still faces challenges such as over-fitting, stability, robustness and the complexity of the learning process. To overcome these difficulties, neural networks, fuzzy logic, support vector machines, wavelet analysis, nonlinear filters and other techniques have been used in soft sensor modeling research[8-16]. Neural networks[11,14], support vector machines (SVM)[9,12] and other computational intelligence models are the main modeling tools.

Based on projections to latent structures (PLS), Trygg et al.[17] proposed orthogonal projections to latent structures (O-PLS). This method integrates the data filtering technique of orthogonal signal correction and separates the variation that is orthogonal to the output from the structure of the predictive model, thus enhancing model interpretability. Kernel-based orthogonal projections to latent structures (K-OPLS) was first proposed in Ref.[18]. It introduces the “kernel transformation” technique, mapping the data into a nonlinear high-dimensional feature space and transforming the descriptor variable matrix into a kernel matrix. While preserving the O-PLS model framework, it gives the model a powerful nonlinear mapping between the descriptor variables and the response variables, which further improves predictive performance. The K-OPLS method has been successfully applied to the modeling of different chemical processes.

Considering the strong nonlinear modeling ability of the K-OPLS method, this paper studies soft sensing of chemical processes based on K-OPLS. The K-OPLS method is applied to soft sensing modeling of the butane (C4) content at the bottom of a debutanizer column and to prediction of the quality indices of an industrial fluidized catalytic cracking unit (FCCU), in order to realize quality monitoring of the related products. The method is also applied to predict the concentrations of hydrogen sulfide (H2S) and sulfur dioxide (SO2) in a sulfur recovery unit (SRU), in order to monitor the tail gas composition and detect environmental pollution. Under the same conditions, the K-OPLS method is compared with extreme learning machine (ELM), kernel-based extreme learning machine (KELM), kernel-based extreme learning machine with principal component analysis (PCA-KELM), support vector machine with principal component analysis (PCA-SVM), least-squares support vector machine (LS-SVM) and SVM, as well as with results from the related literature, to verify its effectiveness.

1 K-OPLS method

1.1 Model training with K-OPLS method

Given $N$ pairs of data $(x_l, y_l)$, $l=1,\ldots,N$, the descriptor variable matrix and the response variable matrix are defined as $X=[x_1\ x_2\ \cdots\ x_N]^T\in R^{N\times N_a}$ and $Y=[y_1\ y_2\ \cdots\ y_N]^T\in R^{N\times D}$, respectively. The input data are mapped to a high-dimensional feature space, $x_l\to\varphi(x_l)\in R^{M^*}$, so that the input data matrix becomes $\Phi(X)=[\varphi(x_1)\ \varphi(x_2)\ \cdots\ \varphi(x_N)]^T$. The kernel function and the corresponding kernel matrix are $k(x_i,x_j)=\varphi(x_i)^T\varphi(x_j)$ and $K=\Phi(X)\Phi(X)^T$.

Following the O-PLS method, K-OPLS replaces the input data matrix $X$ with the kernel matrix $K$ in a dual form. In each iteration of the algorithm, $K$ is deflated by the components that are orthogonal to $Y$. During model training, $K$ appears in two deflated forms. The first plays a role similar to the predictive weight matrix $W_p$ in O-PLS, with dimension $N\times D$, where $D$ is the number of predictive components; for this purpose $K$ must be retained throughout the algorithm. The second is the deflation corresponding to the $Y$-orthogonal variation. Let $K_{j,i}$ denote the different deflated forms of $K$, where $K_{1,1}$ is the original kernel matrix $K$. The first form, $K_{1,i}$, carries the predictive weight information and is used to calculate the predictive scores; the second form, $K_{i,i}$, is used to calculate the components orthogonal to $Y$. If the training data kernel matrix is denoted by $K_{tr,tr}$, it must first be centered as

(1)

where $I_{tr}$ is the $N\times N$ identity matrix and $1_{tr}$ is a column vector of ones of length $N$.
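Eq. (1) is not reproduced in this text. The following is a minimal sketch that assumes the usual double-centering form $\tilde K_{tr,tr}=(I_{tr}-\frac{1}{N}1_{tr}1_{tr}^T)\,K_{tr,tr}\,(I_{tr}-\frac{1}{N}1_{tr}1_{tr}^T)$, which is equivalent to subtracting the mean feature vector in feature space. The function name `center_train_kernel` is hypothetical and not taken from the paper.

```python
import numpy as np

def center_train_kernel(K):
    """Double-center an N x N training kernel matrix (assumed form of Eq. (1))."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # I_tr - (1/N) 1_tr 1_tr^T
    return J @ K @ J

# Illustration with a linear kernel on random data: centering the kernel
# gives the same matrix as centering the data columns before forming X X^T.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K_centered = center_train_kernel(X @ X.T)
Xc = X - X.mean(axis=0)
```

With a linear kernel, `K_centered` coincides with `Xc @ Xc.T`, which is a convenient sanity check before switching to a nonlinear kernel.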

The specific steps of the K-OPLS algorithm are as follows:

Step 1: The load vector of the output matrix is obtained by eigenvalue decomposition, which can be expressed as

(2)

where $Y\in R^{N\times D}$ is the output matrix; $C_p\in R^{D\times D}$ is the output load matrix; $\Sigma_p\in R^{D\times D}$ is the diagonal matrix of eigenvalues; and eigs(·) is the eigenvalue decomposition function in Matlab, which returns the eigenvectors corresponding to the $D$ largest eigenvalues.

This step corresponds to the singular value decomposition of $Y^TX$ in the O-PLS method.

Step 2: The score matrix $U_p$ of $Y$ is calculated from the known output load matrix $C_p$ as

$$U_p = Y C_p, \tag{3}$$

where $U_p\in R^{N\times D}$ is the score matrix of $Y$.

This step is consistent with the O-PLS method.

Step 3: Let $A_o$ be the number of $Y$-orthogonal components. Setting the initial $i=1$, the predictive score matrix after deflation is calculated by

(4)

This step corresponds to $T_p=XW_p$ in the O-PLS method, where $T_p$ is the predictive score matrix after the $i$th affine deflation.

Step 4: The $i$th load vector orthogonal to $Y$ is calculated by

(5)

This step corresponds to the singular value decomposition of $E^TT_p$ in the O-PLS method, where $E$ is the residual matrix of $X$.

Step 5: The $i$th score vector orthogonal to $Y$ is calculated by

(6)

This step corresponds to calculating the orthogonal score vector $t_o$ in the O-PLS method, with $t_o=Xw_o$, where $w_o$ is a weight vector orthogonal to the output vector.

Step 6: The $i$th score vector orthogonal to $Y$ is normalized by

(7)

(8)

Step 7: The $i$th deflation of the $W_p$-related kernel matrix, in one direction, for the $Y$-orthogonal variation is calculated by

(9)

Step 8: The $i$th deflation of the kernel matrix, in both directions, for the $Y$-orthogonal variation is calculated by

(10)

In the O-PLS method, this step corresponds to the deflated $X$ after sequentially removing the $i$th $Y$-orthogonal variation.

Step 9: Let $i=i+1$. If $i\le A_o$, return to Step 4; otherwise the deflation process stops, and the updated predictive score matrix is calculated from the deflated kernel matrix by

(11)

This step corresponds to $T_p=XW_p$ in O-PLS, where $X$ is the affine-deflated matrix.

Step 10: After all $Y$-orthogonal components have been removed, the regression coefficient matrix is calculated by

(12)

where $B_t\in R^{D\times D}$ is the regression coefficient matrix relating $U_p$ to $T_p$.
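The displayed equations of the training steps are not reproduced in this text. As a hedged reconstruction, the block below writes out the dual-form expressions that follow from the O-PLS correspondences stated in each step and from the standard K-OPLS formulation of Ref.[18]. The assignment of expressions to equation numbers (2) and (4)-(12), and the symbols $c_o$ and $\sigma_o$ for the leading eigenpair in Step 4, are assumptions rather than the authors' exact notation. The factor $\Sigma_p^{-1/2}$ appears because, in the linear case, $Y^TKY=(X^TY)^T(X^TY)$, so $\Sigma_p$ holds squared singular values of $Y^TX$.

```latex
\begin{align}
[C_p,\ \Sigma_p] &= \operatorname{eigs}\!\left(Y^{T} K_{1,1} Y,\ D\right) \tag{2}\\
T_p &= K_{1,i}^{T}\, U_p\, \Sigma_p^{-1/2} \tag{4}\\
[c_o,\ \sigma_o] &= \operatorname{eigs}\!\left(T_p^{T}\,(K_{i,i}-T_pT_p^{T})\,T_p,\ 1\right) \tag{5}\\
t_o &= (K_{i,i}-T_pT_p^{T})\,T_p\,c_o\,\sigma_o^{-1/2} \tag{6}\\
\lVert t_o\rVert &= \sqrt{t_o^{T}t_o} \tag{7}\\
t_o &\leftarrow t_o/\lVert t_o\rVert \tag{8}\\
K_{1,i+1} &= K_{1,i}\,(I_{tr}-t_ot_o^{T}) \tag{9}\\
K_{i+1,i+1} &= (I_{tr}-t_ot_o^{T})\,K_{i,i}\,(I_{tr}-t_ot_o^{T}) \tag{10}\\
T_p &= K_{1,A_o+1}^{T}\, U_p\, \Sigma_p^{-1/2} \tag{11}\\
B_t &= (T_p^{T}T_p)^{-1}\,T_p^{T}\,U_p \tag{12}
\end{align}
```

For example, (5) follows from $E=X-T_pW_p^T$ with $W_p^TW_p=I$ and $XW_p=T_p$, which gives $T_p^TEE^TT_p=T_p^T(K-T_pT_p^T)T_p$, so the feature-space matrix $X$ never has to be formed explicitly.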

1.2 K-OPLS method model prediction

Given a test data set $(x_j, y_j)$, $j=1,2,\ldots,N_y$, the test input and output matrices are expressed as

$$X_{te}=[x_1\ \cdots\ x_{N_y}]^T\in R^{N_y\times N_a},$$

$$Y_{te}=[y_1\ \cdots\ y_{N_y}]^T\in R^{N_y\times D},$$

respectively. With the kernel matrix between testing and training data denoted by $K_{te,tr}$, the predictive output of the model is calculated as follows:

Step 1: The kernel matrix formed by testing and training data is calculated by

$$K_{te,tr}=\langle\Phi(X_{te}),\Phi(X)\rangle, \tag{13}$$

where $\Phi(X_{te})=[\varphi(x_1)\ \varphi(x_2)\ \cdots\ \varphi(x_{N_y})]^T$.

Step 2: The kernel matrix of testing and training data is centralized by

(14)

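where $1_{te}$ is a column vector of ones of length $N_y$.

Step 3: Setting the initial $i=1$, the predictive score matrix of the test data after the $i$th deflation is calculated by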

(15)

This step corresponds to calculating the test predictive score matrix $T_p$ after the $i$th deflation in the O-PLS method.

Step 4: The score vector orthogonal to $Y$ for the testing and training data kernel matrix is calculated by

(16)

Step 5: The score vector orthogonal to $Y$ in the test data is normalized by

(17)

(18)

Step 6: For the $Y$-orthogonal variation, the kernel matrix of testing and training data after the $i$th affine deflation in one direction is calculated by

(19)

Step 7: For the $Y$-orthogonal variation, the kernel matrix of testing and training data after the $i$th affine deflation in both directions is calculated by

(20)

This step corresponds to the matrix $X_{te}$ obtained by affine deflation in the direction orthogonal to $Y_{te}$ in O-PLS.

Step 8: Set $i=i+1$. If $i\le A_o$, return to Step 3; otherwise the deflation stops. Based on the testing and training data kernel matrix after affine deflation, the updated predictive score matrix is calculated by

(21)

This step corresponds to calculating the predictive score matrix $T_{pte}$ in the O-PLS method, in which $T_{pte}=X_{te}W_p$.

Step 9: The predicted output of the model is calculated by

(22)

This step is consistent with the predicted output of the O-PLS method.
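The training and prediction procedures of Sections 1.1 and 1.2 can be sketched in a few dozen lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the displayed equations are not available in this text, so the expressions in the code follow the standard dual-form K-OPLS formulation; all function names (`center_kernels`, `kopls_train`, `kopls_predict`, `rbf`) and the synthetic data are hypothetical; the responses are mean-centered before training; and the predictive scores are recomputed at the start of every deflation pass, which is one reading of the loop.

```python
import numpy as np

def center_kernels(K_tr, K_te):
    """Center train and test kernels with training statistics (assumed Eqs. (1), (14))."""
    n = K_tr.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    ones_te = np.ones((K_te.shape[0], n)) / n           # (1/N) 1_te 1_tr^T
    return J @ K_tr @ J, (K_te - ones_te @ K_tr) @ J

def kopls_train(K, Y, n_orth):
    """Training Steps 1-10 for a centered kernel K (N x N) and centered Y (N x D)."""
    n = K.shape[0]
    evals, Cp = np.linalg.eigh(Y.T @ K @ Y)             # Step 1: load matrix of Y
    S = np.diag(evals ** -0.5)                          # Sigma_p^(-1/2)
    Up = Y @ Cp                                         # Step 2: score matrix of Y
    K1i, Kii, comps = K.copy(), K.copy(), []
    for _ in range(n_orth):
        Tp = K1i.T @ Up @ S                             # Step 3: predictive scores
        R = Kii - Tp @ Tp.T
        w, v = np.linalg.eigh(Tp.T @ R @ Tp)            # Step 4: leading eigenpair
        co, so = v[:, -1], w[-1]
        to = R @ Tp @ co / np.sqrt(so)                  # Step 5: Y-orthogonal score
        norm = np.linalg.norm(to)                       # Step 6: normalization
        to = to / norm
        comps.append((Tp, co, so, to, norm, K1i, Kii))  # stored for prediction
        P = np.eye(n) - np.outer(to, to)
        K1i = K1i @ P                                   # Step 7: one-sided deflation
        Kii = P @ Kii @ P                               # Step 8: two-sided deflation
    Tp = K1i.T @ Up @ S                                 # Step 9: updated scores
    Bt = np.linalg.lstsq(Tp, Up, rcond=None)[0]         # Step 10: regress Up on Tp
    return {"Cp": Cp, "Up": Up, "S": S, "comps": comps, "Tp": Tp, "Bt": Bt}

def kopls_predict(model, K_te):
    """Prediction Steps 3-9 for a centered test/train kernel K_te (Ny x N)."""
    Up, S = model["Up"], model["S"]
    Kte_1, Kte_i = K_te.copy(), K_te.copy()
    for Tp, co, so, to, norm, K1i, Kii in model["comps"]:
        Tp_te = Kte_1 @ Up @ S                                    # Step 3
        to_te = (Kte_i - Tp_te @ Tp.T) @ Tp @ co / np.sqrt(so)    # Step 4
        to_te = to_te / norm                                      # Step 5: training norm
        upd = np.outer(to_te, to)
        Kte_1 = Kte_1 - upd @ K1i.T                               # Step 6
        Kte_i = (Kte_i - upd @ Kii) @ (np.eye(len(to)) - np.outer(to, to))  # Step 7
    Tp_te = Kte_1 @ Up @ S                                        # Step 8
    return Tp_te @ model["Bt"] @ model["Cp"].T                    # Step 9

def rbf(A, B, sigma=1.5):
    """Gaussian kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Synthetic demonstration data (not from the paper).
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(80, 3))
y = np.sin(X[:, :1]) + 0.5 * X[:, 1:2] ** 2             # nonlinear response, N x 1
X_tr, X_te, y_tr = X[:60], X[60:], y[:60]

K_tr, K_te = center_kernels(rbf(X_tr, X_tr), rbf(X_te, X_tr))
y_mean = y_tr.mean(axis=0)
model = kopls_train(K_tr, y_tr - y_mean, n_orth=2)
y_hat = kopls_predict(model, K_te) + y_mean             # test predictions
y_fit = kopls_predict(model, K_tr) + y_mean             # re-prediction of training data
```

Feeding the centered training kernel through `kopls_predict` reproduces the training fit `Tp @ Bt @ Cp.T`, which is a useful consistency check between the two procedures. Storing every intermediate kernel matrix, as done here for clarity, costs memory proportional to the number of orthogonal components times $N^2$.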

2 Chemical process application examples

In this section, the K-OPLS method is applied to soft sensor modeling of the butane (C4) content at the bottom of a debutanizer column, the quality indices of the key product outputs of an industrial fluidized catalytic cracking unit (FCCU), and the H2S and SO2 concentrations in an SRU.

The modeling results of the K-OPLS method are compared with those of SVM[19], LS-SVM[20], PCA-SVM[21], ELM[22], KELM[23] and PCA-KELM[24] under the same conditions. The Gaussian kernel function is selected for K-OPLS, KELM and SVM:

$$k(x_i,x_j)=\exp\left(-\frac{\lVert x_i-x_j\rVert^2}{2\sigma^2}\right), \tag{23}$$

where $\sigma$ is the kernel parameter.
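As an illustration of Eq. (23), the Gaussian kernel matrix between two sets of samples can be computed in vectorized form. This is a small sketch; the function name `gaussian_kernel` and the two-point example are hypothetical.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """k(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for every pair of rows of A and B."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    sq_dist = np.maximum(sq_dist, 0.0)      # guard against tiny negative round-off
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

# Two points at Euclidean distance 5, with sigma = 5.
X = np.array([[0.0, 0.0], [3.0, 4.0]])
K = gaussian_kernel(X, X, sigma=5.0)
```

Here the off-diagonal entry equals $\exp(-25/50)=\exp(-0.5)$ and the diagonal entries equal 1. A larger $\sigma$ flattens the kernel toward 1 for all pairs, while a smaller $\sigma$ makes the kernel matrix approach the identity.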

Furthermore, SVM is implemented with the LIBSVM software in the experiments. To evaluate the models, the correlation coefficient is used in addition to the MSE:

(24)

where cov(·) is the covariance function. The higher the correlation coefficient, the stronger the correlation between the predicted and actual outputs.
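Eq. (24) is not reproduced in this text. The sketch below assumes the usual Pearson form, $r=\mathrm{cov}(y,\hat y)/\sqrt{\mathrm{cov}(y,y)\,\mathrm{cov}(\hat y,\hat y)}$, and pairs it with the MSE. The function names and the four-point example are hypothetical.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean square error between measured and predicted outputs."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation built from the covariance matrix (assumed form of Eq. (24))."""
    c = np.cov(y_true, y_pred)              # 2 x 2 covariance matrix
    return float(c[0, 1] / np.sqrt(c[0, 0] * c[1, 1]))

y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8])
err = mse(y, y_hat)                         # mean of [0.01, 0.01, 0.04, 0.04]
r = correlation_coefficient(y, y_hat)
```

The two indices are complementary: the correlation coefficient is insensitive to a constant offset or scale error in the prediction, whereas the MSE penalizes both. When literature values are reported as RMSE, they can be compared through MSE = RMSE².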

2.1 Soft sensing modeling of butane (C4) content at the debutanizer bottom

The debutanizer column is an important component of the desulfurization and naphtha fractionation unit in a refinery. Propane (C3) and butane (C4) are removed from the naphtha stream at the top of the column. The debutanizer must meet the quality control requirement of minimizing the C4 content in the bottom stream that is fed to the naphtha fractionation unit. Because of the performance differences between gas chromatographs, the on-line measuring instrument takes about 45 min to deliver a measurement of the dominant variable C4. Therefore, establishing a soft sensor model to estimate the C4 content is vital to meeting the quality control requirements of the debutanizer.

To realize dynamic soft-sensing modeling, a large number of sensors are installed in the fractionation unit. As shown in Fig.1, the seven grey rings represent the seven auxiliary variables $u_1$-$u_7$: $u_1$ is the top temperature, $u_2$ is the top pressure, $u_3$ is the reflux flow, $u_4$ is the flow to the next process, $u_5$ is the sixth tray temperature, and $u_6$ and $u_7$ are the temperatures of different regions at the bottom. The sampling period of the auxiliary variables is about 12 min. The dominant variable is the C4 content, which is the output of the model. For a more detailed description of the process, see Refs.[1] and [14]. Considering the delay of the dominant variable output, the NARX model of Ref.[1], shown in Eq.(25), is adopted:

$$\begin{aligned} y(k)=f(&u_1(k),u_2(k),u_3(k),u_4(k),u_5(k),\\ &u_5(k-1),u_5(k-2),u_5(k-3),(u_6(k)+u_7(k))/2,\\ &y(k-1),y(k-2),y(k-3),y(k-4)), \end{aligned} \tag{25}$$

where the model output $y(k)$ is the C4 content and $f(\cdot)$ is the K-OPLS model. In the experimental comparison, $f(\cdot)$ is replaced by the other soft sensing methods, such as ELM and SVM. A total of 2 394 data samples were collected and normalized. The first half of the data set was used for training and the rest for testing.
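The regressor vector of Eq.(25) has 13 entries and needs four past outputs, so the first usable time index is $k=4$ (zero-based). The sketch below builds the regressor matrix and applies a first-half/second-half split. The data are random placeholders rather than the debutanizer data set, the function name is hypothetical, and whether the split in the paper was taken before or after forming the lagged samples is not stated, so splitting afterwards is an assumption.

```python
import numpy as np

def build_narx_regressors(U, y):
    """U: (N, 7) array holding u1..u7; y: (N,) C4 content. Returns (Phi, target)."""
    k = np.arange(4, len(y))                       # first index with 4 past outputs
    cols = [U[k, j] for j in range(5)]             # u1(k) ... u5(k)
    cols += [U[k - d, 4] for d in (1, 2, 3)]       # u5(k-1), u5(k-2), u5(k-3)
    cols.append((U[k, 5] + U[k, 6]) / 2.0)         # (u6(k) + u7(k)) / 2
    cols += [y[k - d] for d in (1, 2, 3, 4)]       # y(k-1) ... y(k-4)
    return np.column_stack(cols), y[k]

rng = np.random.default_rng(0)
U, y = rng.random((2394, 7)), rng.random(2394)     # placeholder data only
Phi, target = build_narx_regressors(U, y)

half = len(target) // 2                            # first half trains, rest tests
Phi_tr, Phi_te = Phi[:half], Phi[half:]
y_tr, y_te = target[:half], target[half:]
```

Because the regressors contain past values of $y$, running the model on-line requires either delayed laboratory or analyzer measurements of $y$ or feeding back the model's own past predictions.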

1: Debutanizer column; 2: Top temperature sensor of T102; 3: Top pressure sensor of T102; 4: Reflux flow sensor of T102; 5: Flow sensor; 6: Tray temperature sensor of T102; 7: Bottom temperature sensor of E108A; 8: Bottom temperature sensor of E108B

Fig.1 Flow chart of debutanizer column

The kernel parameter of the Gaussian kernel function, $\sigma=12$, is chosen by five-fold cross-validation. In the K-OPLS method, the number of $Y$-orthogonal components $A_o$ is 12. To verify the estimation accuracy of the proposed method, soft sensor models constructed by the other methods are compared under the same conditions. For the ELM algorithm, the sigmoid function is selected as the node activation function and the number of hidden-layer neurons is 20; the number of principal components is set to 10 in PCA-SVM and PCA-KELM.
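The paper does not describe how the five-fold cross-validation was organized. The following sketch shows one common way to select $\sigma$ from a candidate grid by mean validation MSE. Two points are assumptions made only to keep the example short and runnable: the regressor is a kernel ridge stand-in, not K-OPLS, and the folds are drawn by random shuffling, which may be optimistic for time-series data. All names and the synthetic data are hypothetical.

```python
import numpy as np

def rbf(A, B, sigma):
    """Gaussian kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def krr_fit_predict(X_tr, y_tr, X_te, sigma, lam=1e-3):
    """Stand-in regressor (kernel ridge), NOT K-OPLS; it only makes the loop runnable."""
    alpha = np.linalg.solve(rbf(X_tr, X_tr, sigma) + lam * np.eye(len(X_tr)), y_tr)
    return rbf(X_te, X_tr, sigma) @ alpha

def select_sigma(X, y, sigmas, n_folds=5, seed=0):
    """Return the sigma with the lowest mean validation MSE, plus all scores."""
    idx = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(idx, n_folds)
    scores = {}
    for sigma in sigmas:
        errs = []
        for f in range(n_folds):
            val = folds[f]
            tr = np.concatenate([folds[g] for g in range(n_folds) if g != f])
            pred = krr_fit_predict(X[tr], y[tr], X[val], sigma)
            errs.append(np.mean((y[val] - pred) ** 2))
        scores[sigma] = float(np.mean(errs))
    return min(scores, key=scores.get), scores

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(100, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=100)
best_sigma, cv_scores = select_sigma(X, y, sigmas=[0.1, 1.0, 12.0])
```

To tune K-OPLS itself, the stand-in `krr_fit_predict` would be replaced by a routine that trains and applies a K-OPLS model, and the grid could be extended to cover the number of $Y$-orthogonal components $A_o$ as a second hyperparameter.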

Table 1 compares the correlation coefficients and mean square errors (MSE) of the different methods on the test data set. K-OPLS attains the highest correlation coefficient (0.999 5) and the lowest MSE ($2.96\times10^{-5}$) among the compared methods.

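Table 1 Performance comparison of K-OPLS and other methods

| Model | Correlation coefficient | MSE |
| --- | --- | --- |
| SVM | 0.997 1 | $1.75\times10^{-4}$ |
| LS-SVM | 0.940 1 | $3.82\times10^{-3}$ |
| PCA-SVM | 0.997 1 | $1.75\times10^{-4}$ |
| ELM | 0.995 2 | $2.76\times10^{-4}$ |
| KELM | 0.999 2 | $4.55\times10^{-5}$ |
| PCA-KELM | 0.999 3 | $4.25\times10^{-5}$ |
| K-OPLS | 0.999 5 | $2.96\times10^{-5}$ |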

Furthermore, the method is compared with results reported in the related literature on the test data set. Under the same NARX model, Ref.[1] uses an MLP with a 16-12-1 structure and obtains a correlation coefficient of 0.985; the training and test data each account for 1/2 of the total data set and are selected by taking every other sample. Ref.[25] uses a static multi-model soft sensing method that combines affine clustering, Gaussian processes and Bayesian decision-making; its training data account for about 1/4 of the total data set, obtained by taking one sample out of every four. Its root mean square error (RMSE) is 0.093 5, which corresponds to an MSE of $8.74\times10^{-3}$. Ref.[26] uses a JT-LSSVM static soft-sensing method with real-time online learning; the training and test data each account for 1/2 of the total data set, and the correlation coefficient is 0.913 2. Latent factor analysis was also applied to the same experiment in Ref.[13]. On these indices, the K-OPLS results in this paper (correlation coefficient 0.999 5, MSE $2.96\times10^{-5}$) are superior to those reported in the papers above.

Fig.2 shows the comparison between the model estimation and the actual value of the method on the test data set.

Fig.2 Comparison between K-OPLS estimation of C4 in bottom flow and corresponding actual data on test set

Fig.3 shows the corresponding estimation error on the test data set. It can be seen that the K-OPLS method has good estimation ability.

Fig.3 Error curve of the test data in K-OPLS method

Fig.4 Estimation graphs of probability density functions of predicted score matrix

From the abscissa range of Fig.4, it can be seen that the probability density estimate of the predictive component lies in the interval [-0.05, 0.15], while those of the orthogonal components lie mainly in the intervals [-0.029, -0.028] and [-0.1, 0.1]. This shows that the contribution of the orthogonal components to the predictive ability of the model is essentially zero, so they can be regarded as structural noise to be eliminated.

2.2 Modeling for soft sensor in industrial fluidized catalytic cracking unit (FCCU)

The fluidized catalytic cracking unit is the core unit of secondary petroleum processing. Fig.5 shows a simplified flow chart of a typical FCCU[12,28]. The FCCU consists of a reactor-regenerator subsystem, a fractionator subsystem, an absorber-stabilizer subsystem and a gas desulfurization subsystem. Its function is to convert high-boiling-point, high-molecular-weight crude oil into light hydrocarbon products such as gasoline. The main objective of the fractionator subsystem is to crack crude oil according to the fractionation process. The key product outputs of the FCCU include gasoline, light diesel oil (LDO) and liquefied petroleum gas (LPG). Their quality indices are usually obtained by off-line analysis once a day with traditional detection methods, which cannot meet the needs of real-time monitoring of industrial product quality. Therefore, it is necessary to build soft sensor models with the three quality indices of the key product outputs as the dominant variables.

Fig.5 Simplified flow chart of a typical FCCU refinery

Six auxiliary variables, which are regarded as input variables of soft sensor model, and three dominant variables used in soft sensor modeling are shown in Table 2. The obtained sample data need to be normalized.

Table 2 Primary products and secondary variables

The experiment chooses the kernel parameter of Gauss kernel function. In the K-OPLS method, the number of $Y$-orthogonal components is set to 13. To verify the estimation accuracy of the proposed method, soft sensor models constructed by the other methods are compared under the same conditions. For the ELM algorithm, the sigmoid function is selected as the node function and the number of hidden-layer neurons is set to 20. The number of principal components is set to 4 in PCA-SVM and PCA-KELM.

Fig.6(a)-(c) gives the comparison results between the predicted and actual output values of three key products on the test set based on K-OPLS method, in which the dotted lines are the actual values of the product output and the solid lines are the predicted values. From the results of Fig.6(a)-(c), we can see that the K-OPLS method has achieved good predictive results.

Fig.6 Predictive results of three product yields

Table 3 gives the numerical comparison of the seven methods on the test set. After parameter tuning, K-OPLS attains the lowest RMSE for all three product yields, although its margin over PCA-KELM is small (for example, 0.538 8 versus 0.540 7 for the gasoline yield).

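Table 3 Performance comparison of K-OPLS methods and other models in FCCU

| Method | RMSE, gasoline yield | RMSE, LDO yield | RMSE, LPG yield |
| --- | --- | --- | --- |
| SVM | 0.856 8 | 0.861 3 | 0.748 5 |
| LS-SVM | 0.994 1 | 0.946 0 | 0.743 3 |
| PCA-SVM | 0.611 6 | 0.613 7 | 0.586 5 |
| ELM | 1.193 1 | 0.640 3 | 0.554 2 |
| KELM | 0.793 7 | 0.780 2 | 0.671 1 |
| PCA-KELM | 0.540 7 | 0.540 8 | 0.541 8 |
| K-OPLS | 0.538 8 | 0.538 7 | 0.539 0 |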

This method is also compared with the results of Ref.[12], in which Liu et al. adopted an adaptive least squares support vector machine (ALSSVR) modeling method. That method uses a sliding window and a two-stage recursive learning framework to track the time-varying dynamics of the process. Its RMSE values for the gasoline, LDO and LPG outputs are 1.19, 1.38 and 1.00, respectively, whereas the K-OPLS values in Table 3 are about 0.54 for each product. The experimental results show the potential of the K-OPLS method in the field of soft sensor modeling.

2.3 Soft sensing modeling of H2S and SO2 concentration in SRU

The role of the SRU is to remove environmental pollutants from acid gas before it is discharged into the air. Acid gas is one of the most dangerous air pollutants and the main cause of acid rain. The sulfur is recovered by the SRU as a by-product. The SRU consists of four identical sub-units, also known as sulfur production lines.

The four production lines work in parallel, and each can convert acid gas into sulfur. Two kinds of acid gas are processed: one, rich in H2S and known as MEA gas, comes from the gas washing unit; the other, rich in H2S and NH3 and known as SWS gas, comes from the sour water stripping unit. Considering one of the production lines, the simplified SRU process is shown in Fig.7.

1, 2: Heater; 3: Combustion chamber; 4: Water condenser; 5, 7: Catalytic reactor; 6, 8: Condenser

Fig.7 Simplified scheme of SRU

The acid gas is burned with air in the reactor F101, which is composed of two separate combustion chambers. MEA gas enters the main combustion chamber; because it burns in an air deficit, an appropriate air flow (AIR_MEA) is supplied to regulate the combustion. The SWS gas mainly enters the second combustion chamber, where an appropriate air flow (AIR_SWS) is supplied for regulation. SWS gas burns in a separate combustor with excess air to produce nitrogen and nitrogen oxides, which prevents the formation of ammonium salts in the equipment. The gas flow into the second combustion chamber is kept constant by adding some MEA gas (MEA_SPILLING_AIR). The air flow is controlled by the equipment operator to ensure an appropriate stoichiometric ratio in the exhaust gas.

Based on the analysis of the exhaust gas composition, the control effect can be improved by a closed-loop algorithm that further adjusts the air flow rate (AIR_MEA_2). The combustion products pass through the water condenser E101, the catalytic reactor R101 and the condenser E102, and then through the catalytic reactor R102 and the condenser E103; in total, about 90% of the sulfur is eventually collected. Air provides the oxygen for the chemical reaction and is an important parameter in the conversion of H2S, the key component of the exhaust gas. Too much or too little air flow changes the stoichiometric ratio of H2S to SO2 in the exhaust gas. To improve the sulfur extraction process, an on-line analyzer measures the residual H2S and SO2 concentrations in the tail gas in order to monitor the performance of the conversion process and control the air supply ratio to the SRU. Because H2S and SO2 frequently damage the sensors, the on-line analyzer has to be maintained frequently. Therefore, it is necessary to design a “soft sensor” to predict the concentrations of H2S and SO2; that is, the dominant variable outputs of the soft sensor are the H2S and SO2 concentrations. According to expert experience and control requirements, five auxiliary variables, all in units of $\mathrm{m^3\cdot h^{-1}}$ and sampled with a period of 1 min, are used as model inputs: $U_1$ is the MEA gas flow (MEA_GAS); $U_2$ is the air flow (AIR_MEA); $U_3$ is the second air flow (AIR_MEA_2); $U_4$ is the gas flow in the SWS region (SWS_GAS + MEA_SPILLING); and $U_5$ is the air flow in the SWS region (AIR_SWS + MEA_SPILLING_AIR). The concentration outputs of H2S and SO2 can be obtained by measuring the auxiliary variables at standard temperature and pressure. A detailed description of the SRU process and data acquisition can be found in Ref.[1].

The NMA time series model[1] shown in Eqs.(26) and (27) is adopted:

$$\begin{aligned} y_1(k)=f_1(&x_1(k),x_1(k-5),x_1(k-7),x_1(k-9),\\ &\ldots,x_5(k),x_5(k-5),x_5(k-7),x_5(k-9)), \end{aligned} \tag{26}$$

$$\begin{aligned} y_2(k)=f_2(&x_1(k),x_1(k-5),x_1(k-7),x_1(k-9),\\ &\ldots,x_5(k),x_5(k-5),x_5(k-7),x_5(k-9)), \end{aligned} \tag{27}$$

where $y_1(k)$ is the H2S concentration output, $y_2(k)$ is the SO2 concentration output, and $f_1(\cdot)$ and $f_2(\cdot)$ are K-OPLS models. Two K-OPLS soft sensor models are therefore constructed for estimating the H2S and SO2 concentrations, each with a 20-dimensional input (five auxiliary variables at four time lags). A total of 10 081 data samples were collected during the process. After normalization, the first 80% of the data were used for training and the rest for testing.
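The 20-dimensional input of Eqs.(26) and (27) stacks each of the five auxiliary variables at lags 0, 5, 7 and 9. The sketch below forms this regressor matrix and an 80%/20% split. The data are random placeholders rather than the SRU data set, the function name is hypothetical, and applying the split after forming the lagged samples is an assumption.

```python
import numpy as np

LAGS = (0, 5, 7, 9)

def build_nma_regressors(X, Y, lags=LAGS):
    """X: (N, 5) inputs x1..x5; Y: (N, 2) columns [H2S, SO2]. Returns (Phi, targets)."""
    k = np.arange(max(lags), len(X))               # first index with all lags available
    # Column order follows Eq.(26): x1 at all lags, then x2 at all lags, and so on.
    cols = [X[k - d, j] for j in range(X.shape[1]) for d in lags]
    return np.column_stack(cols), Y[k]

rng = np.random.default_rng(0)
X, Y = rng.random((10081, 5)), rng.random((10081, 2))   # placeholder data only
Phi, targets = build_nma_regressors(X, Y)

n_train = int(0.8 * len(Phi))                      # first 80% trains, rest tests
Phi_tr, Phi_te = Phi[:n_train], Phi[n_train:]
y1_tr, y2_tr = targets[:n_train, 0], targets[:n_train, 1]   # H2S and SO2 targets
```

Unlike the NARX model used for the debutanizer, this structure contains no past outputs, so the soft sensor can keep running while the on-line analyzer is out of service. The two models $f_1$ and $f_2$ share the same regressor matrix and differ only in the target column.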

The kernel parameter of the Gaussian kernel function is chosen as $\sigma=4$. In the K-OPLS method, the number of $Y$-orthogonal components $A_o$ is set to 16. To verify the estimation accuracy of the proposed method, soft sensor models constructed by the other methods are compared under the same conditions. For the ELM algorithm, the sigmoid function is selected as the node function and the number of hidden-layer neurons is set to 40; the number of principal components is set to 15 in both PCA-SVM and PCA-KELM. In addition, to avoid memory overflow, the experimental data can be restricted to 40% of the total data set.

Tables 4 and 5 compare the correlation coefficients and mean square errors on the test set for predicting the H2S and SO2 concentrations with the different methods. K-OPLS achieves the highest correlation coefficient for both gases and the lowest MSE for SO2; for H2S, its MSE ($2.92\times10^{-4}$) is second only to that of PCA-SVM ($2.70\times10^{-4}$). Overall, K-OPLS gives the best or near-best predictive accuracy for the strongly nonlinear sulfur recovery process.

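Table 4 Performance index for estimation results of H2S based on different models on the test set

| Model | Correlation coefficient | MSE |
| --- | --- | --- |
| SVM | 0.745 7 | $1.02\times10^{-3}$ |
| LS-SVM | 0.803 0 | $5.12\times10^{-4}$ |
| PCA-SVM | 0.865 9 | $2.70\times10^{-4}$ |
| ELM | 0.819 0 | $5.45\times10^{-4}$ |
| KELM | 0.871 5 | $3.11\times10^{-4}$ |
| PCA-KELM | 0.871 5 | $3.11\times10^{-4}$ |
| K-OPLS | 0.882 3 | $2.92\times10^{-4}$ |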

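Table 5 Performance index for estimation results of SO2 based on different models on the test set

| Model | Correlation coefficient | MSE |
| --- | --- | --- |
| SVM | 0.818 9 | $8.42\times10^{-4}$ |
| LS-SVM | 0.811 8 | $8.47\times10^{-4}$ |
| PCA-SVM | 0.842 0 | $7.18\times10^{-4}$ |
| ELM | 0.831 1 | $6.63\times10^{-4}$ |
| KELM | 0.836 8 | $7.12\times10^{-4}$ |
| PCA-KELM | 0.836 8 | $7.12\times10^{-4}$ |
| K-OPLS | 0.891 4 | $4.72\times10^{-4}$ |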

The results of the method in this paper are also compared with results from the related literature on the test set. Fortuna et al.[1] applied the NMA model of Eqs.(26) and (27) to predict the concentrations of H2S and SO2 using MLP, RBF neural network, adaptive neuro-fuzzy system and NLSQ, respectively. Their training and test data are small subsets of 1 000 samples randomly selected from the whole data set, and the NLSQ method was the most accurate: for H2S the MSE is 0.000 8 and the correlation coefficient is 0.848, and for SO2 the MSE is 0.000 4 and the correlation coefficient is 0.905. Compared with these values, the K-OPLS method is clearly better for H2S (MSE $2.92\times10^{-4}$, correlation coefficient 0.882 3) and comparable for SO2 (MSE $4.72\times10^{-4}$, correlation coefficient 0.891 4), where the NLSQ indices in Ref.[1] are slightly better.

Fig.8 shows the comparison between the estimated and actual values of the model on the test set when estimating the concentration of H2S and SO2, respectively. As shown in Fig.8, K-OPLS has good estimation accuracy. The analysis results further confirm that K-OPLS is an effective soft sensor modeling method.

Fig.8 Comparison between K-OPLS estimations and corresponding measured output of H2S and SO2 on test set

3 Conclusion

Soft instrumentation based on soft sensor modeling has the advantages of low cost and self-adaptation. It provides a cheap and fast real-time estimation method for the process dominant variables that need to be monitored, and makes it possible to design effective and fault-tolerant control strategies. Based on collected historical data of the auxiliary and dominant variables, a soft sensing modeling method based on K-OPLS is proposed for chemical processes with strongly nonlinear characteristics. It is applied to the butane (C4) content at the bottom of a debutanizer column and to the modeling of an industrial fluidized catalytic cracking unit (FCCU) for quality control of the related products, and to the prediction of H2S and SO2 concentrations in a sulfur recovery unit (SRU) to estimate the tail gas composition. From the comparison with existing methods and the related literature on the model performance indices, the following conclusions are drawn.

The K-OPLS method maps data to a high-dimensional feature space through the “kernel method” while preserving the O-PLS algorithm framework. K-OPLS can therefore model the predictive variation and the $Y$-orthogonal variation between the descriptor variable matrix $X$ and the response variable matrix $Y$ separately. Compared with conventional kernel learning methods, the intrinsic structural noise in the data, carried by the variation orthogonal to $Y$, can be effectively identified, which enhances the model’s interpretability.

The K-OPLS method shows high approximation accuracy in the three case studies, which demonstrates its effectiveness and application potential. It can be trained on relatively large data sets, converges accurately with good robustness, and has strong nonlinear modeling performance. It is an effective soft sensor modeling method and can also be transferred to other chemical process environments.