Fuzzy identification of nonlinear dynamic system based on selection of important input variables

2022-06-27 00:28

LYU Jinfeng, LIU Fucai, and REN Yaxue

1. Engineering Research Center of the Ministry of Education for Intelligent Control System and Intelligent Equipment, Yanshan University, Qinhuangdao 066004, China; 2. School of Mathematics and Information Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao 066004, China

Abstract: Input variable selection (IVS) has proved pivotal in nonlinear dynamic system modeling. To optimize the model of a nonlinear dynamic system, a fuzzy modeling method that determines the premise structure by selecting the important inputs of the system is studied. First, a simplified two-stage fuzzy curves method is proposed, which sorts all candidate inputs by their relevance to the output, selects the important input variables of the system, and identifies the structure. Second, to reduce the complexity of the model, the standard fuzzy c-means clustering algorithm and the recursive least squares algorithm are used to identify the premise parameters and conclusion parameters, respectively. Then, the effectiveness of IVS is verified on two well-known benchmark problems. Finally, the proposed identification method is applied to a real variable load pneumatic system. The simulation experiments indicate that the IVS method in this paper has a positive influence on the approximation performance of Takagi-Sugeno (T-S) fuzzy modeling.

Keywords: Takagi-Sugeno (T-S) fuzzy modeling, input variable selection (IVS), fuzzy identification, fuzzy c-means clustering algorithm.

1.Introduction

There are many forms of motion in nonlinear systems, most of which are complex and uncertain. In general, a large number of data points are used to build a model that describes the nonlinear system and solves practical problems. Among nonlinear models, the fuzzy model can capture the relation between input and output, especially for a complex and imprecise system. Although any element that may affect the output can be considered as a possible input, some inputs may be noise. Therefore, in the fuzzy modeling of a nonlinear time-varying system, determining the optimal input variables is a problem that must be solved. Input variable selection (IVS) uses certain methods and criteria to find the significant variables among numerous candidate input variables [1]. Determining important input variables, eliminating redundant input variables and reducing the total number of input variables are of great significance for decreasing computational cost, reducing model dimensions and improving prediction accuracy. At present, IVS is mainly used in neural network prediction and feature extraction [2−5], where it effectively reduces the dimension of the input variables and the complexity of the model, and improves the generalization ability of the model. However, the selection of the input variables of the fuzzy model has received little attention.

The Takagi-Sugeno (T-S) fuzzy model built from experimental data is regarded as an effective means of modeling an actual system [6−17]. T-S fuzzy system identification consists of two parts: structure identification and parameter estimation. Structure identification consists of structure identification I (selecting and deciding the system inputs) and structure identification II (setting the number of fuzzy rules and dividing the fuzzy space). Parameter identification covers the antecedent and consequent parameters of the fuzzy rules. According to Sugeno [7], each part of fuzzy identification has a different influence on the identification result. Sugeno believed that structure identification I is the most important, with a proportion of 100, followed by structure identification II with a proportion of 10, and finally parameter identification with a proportion of only 1. Therefore, IVS plays a great role in improving the identification performance of the model. For a real complex system, many factors affect the output of the system. If all the factors are considered as inputs of the model, the number of fuzzy rules grows exponentially with the system dimension, which is unrealistic in practical applications. For the rule-based fuzzy identification problem, selecting important input variables is an important means of addressing the “curse of dimensionality”. In the existing literature, the input variables of fuzzy models are mostly determined by experience. Therefore, it is important for fuzzy system identification to develop a more robust IVS approach that can accurately describe the interdependence between the candidate input variables and the output variables, and can quickly sort the input variables according to their correlation.

The commonly used IVS methods fall into two categories: model-based and model-free [1]. Model-free IVS methods include correlation coefficient (CC), partial correlation coefficient (PCC), mutual information (MI), partial mutual information (PMI), covariance matrix (CM), random forest (RF), etc.; model-based IVS methods include input omission (IO), combined neural path strength (CNPS), etc. In [1], four input variable selection methods (PC, PMI, IO and CNPS) were comprehensively compared. Moreover, the IVS-based artificial neural network flow prediction model was verified on two water basin datasets. In [18], MI was utilized to select input variables, and a dynamic neural network model was established to obtain the feeding conditions of broilers. In [19], the input variables were selected by forward selection (FS), the Gamma test (GT) and principal component analysis (PCA), and an artificial neural network model was established to predict the human response to odor perception. A prediction model of carbon content in fly ash and a classification model of crude oil samples were established by RF in [20] and [21], respectively. The resulting models have a higher accuracy and faster calculation speed. The literature on the selection of input variables for the fuzzy model is sparse. In order to determine the input variables of the Takagi-Sugeno-Kang (TSK) fuzzy model, Banakar et al. adopted the loop algorithm and the genetic algorithm, and established the dynamic system model by combining the modified mountain clustering (MMC) algorithm and the structure tree (ST) algorithm [22]. Lin et al. used two-stage fuzzy curves and surfaces (TSFCS) to select important input variables for nonlinear system structure identification [23]. With these fuzzy curves and surfaces, the correlation between the candidate input variables and the output variables was determined. The correlation was used to sort the candidate input variables, select the important input variables with strong correlation, and eliminate the variables with poor correlation.

In existing research on fuzzy modeling for complex nonlinear dynamic systems, the inputs of the model are mostly selected or determined by experience. Though the determination of input variables can contribute greatly to the performance of fuzzy identification, research on this part is not yet mature. To address this shortcoming, this paper studies the performance that the fuzzy model shows under the effect of IVS. In order to verify the accuracy and effectiveness of the fuzzy identification method with IVS, this paper compares with previous research results in two aspects: the number of input variables and the number of fuzzy rules. The results have theoretical and practical significance for the application of fuzzy systems to nonlinear dynamic systems. In order to improve the fuzzy modeling accuracy of nonlinear time-varying systems, a fuzzy identification method based on IVS is proposed. The TSFCS method in [23] is simplified to identify the important input variables of the T-S fuzzy model. On this basis, the traditional fuzzy c-means (FCM) clustering algorithm and the recursive least squares (RLS) method are utilized to identify the premise parameters and conclusion parameters, respectively. The algorithm preprocesses the possible input variables offline, which reduces the complexity of structure identification and can obtain a high identification accuracy without complicated parameter optimization.

The rest of this article is structured as follows. Section 2 gives an outline of the T-S fuzzy model. In Section 3, a fuzzy identification method is proposed, which uses the FCM algorithm to identify the antecedent parameters after the significant input variables have been determined. In Section 4, two classical examples verify the performance of the established model. Furthermore, the identification method with IVS is applied to the identification of a real pneumatic loading system in Section 5. In the last section, the work of this paper is summarized.

2.Preliminaries

2.1 T-S fuzzy model

The T-S fuzzy model is a fuzzy rule-based model, which approximates the nonlinear system by local linear subsystems and realizes global nonlinearity by fuzzy reasoning. The premise of each rule is a fuzzy variable, and the conclusion is a linear function of input and output. The T-S fuzzy model with $N$ groups of sample data $(x_{k1},x_{k2},\cdots,x_{kr},y_k)$ can be described as

The global output of the fuzzy model is gained by weighted processing of local output and given by

where

The matrix equation (5) is obtained by substituting $N$ pairs of sample data into (4).

is the $k$th row of $X$. $P^{*}=(X^{T}X)^{-1}X^{T}Y$ is the required parameter vector.
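For reference, a standard form of the T-S model that is consistent with the definitions above can be written as follows. This is a sketch under assumed notation, not necessarily the authors' exact equations: the symbols $A^i_j$ (fuzzy sets), $w^i_k$ (rule firing strength), $\lambda^i_k$ (normalized weight) and the column ordering of $X$ are assumptions.

```latex
% Rule i (i = 1, ..., c) evaluated on the kth sample
R^i:\ \text{IF } x_{k1} \text{ is } A^i_1 \text{ and } \cdots \text{ and } x_{kr} \text{ is } A^i_r
\ \text{THEN } y^i_k = p^i_0 + p^i_1 x_{k1} + \cdots + p^i_r x_{kr}

% Global output as the weighted combination of the local outputs
\hat{y}_k = \frac{\sum_{i=1}^{c} w^i_k\, y^i_k}{\sum_{i=1}^{c} w^i_k}
          = \sum_{i=1}^{c} \lambda^i_k \left(p^i_0 + p^i_1 x_{k1} + \cdots + p^i_r x_{kr}\right),
\qquad \lambda^i_k = \frac{w^i_k}{\sum_{j=1}^{c} w^j_k}

% Stacking the N samples gives a system that is linear in the consequent parameters
Y = XP,\qquad
X_k = \left[\lambda^1_k,\ \lambda^1_k x_{k1},\ \ldots,\ \lambda^1_k x_{kr},\ \ldots,\
            \lambda^c_k,\ \lambda^c_k x_{k1},\ \ldots,\ \lambda^c_k x_{kr}\right],\qquad
P = \left[p^1_0,\ \ldots,\ p^1_r,\ \ldots,\ p^c_0,\ \ldots,\ p^c_r\right]^{T}
```

Under this arrangement the parameter vector has $c(r+1)$ entries, and the least-squares solution is the $P^{*}$ given above.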

The purpose of consequent parameter identification is to determine the coefficients of the linear regression model (1). In order to avoid the matrix inversion, the RLS method is used to obtain the parameter estimation vector $P$ by iterative optimization, and the recursive algorithm is

where $l=0,1,2,\cdots,N-1$, and $S_l$ is a matrix for auxiliary calculation.

The initial condition is $P_0=0$, $S_0=\alpha I$, where $\alpha$ is a constant greater than 10 000 and $I$ is the $L\times L$ identity matrix. The optimal conclusion parameter matrix in the sense of mean square error (MSE) is obtained by using (6) and (7), and the conclusion parameters and MSE are output after the recursion.
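The recursion can be illustrated with a minimal sketch, assuming the textbook RLS update with the initial condition stated above ($P_0=0$, $S_0=\alpha I$). The function name `rls` and the default `alpha=1e6` are illustrative choices, and the update formulas are the standard ones rather than a transcription of (6) and (7).

```python
import numpy as np

def rls(X, Y, alpha=1e6):
    """Recursive least squares for Y ≈ X @ P, processing one sample row at a time."""
    N, L = X.shape
    P = np.zeros(L)              # P_0 = 0
    S = alpha * np.eye(L)        # S_0 = alpha * I, with alpha large (assumed 1e6 here)
    for x, y in zip(X, Y):
        Sx = S @ x
        # Auxiliary matrix update; S is symmetric, so S x x^T S = outer(Sx, Sx)
        S = S - np.outer(Sx, Sx) / (1.0 + x @ Sx)
        # Parameter update driven by the one-step prediction error
        P = P + S @ x * (y - x @ P)
    return P
```

With a large `alpha` the result is numerically close to the batch solution $(X^{T}X)^{-1}X^{T}Y$, but no matrix inverse is ever formed.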

2.2 Fuzzy c-means clustering algorithm

There are many methods to divide the fuzzy space and estimate the premise parameters, such as the Gustafson-Kessel clustering algorithm [24], the Gath-Geva clustering algorithm [25], FCM and the fuzzy c-regression model (FCRM) [26−31]. Among them, the FCM algorithm is effective in practical applications, being simple to implement while giving a high identification accuracy.

The purpose of FCM is to minimize the objective function

In the above equation, $N$ is the sample quantity, $c$ is the number of fuzzy rules, and $m$ represents the weight of the membership function, usually an integer greater than 1. The value of $m$ affects the accuracy of identification, and $m=2$ is usually taken. $u_{ik}$ characterizes the membership of the $k$th sample point $x_k=(x_{k1},x_{k2},\cdots,x_{kr})$ in the $i$th cluster, with $u_{ik}\geq 0$; $v_i$ is the $i$th cluster center. For all $i=1,\cdots,c$, $u_{ik}$ and $v_i$ are calculated by (9)−(11):

$U=(u_{ik})$ is the fuzzy partition matrix and $V=(v_1,v_2,\cdots,v_c)^{T}$ is the matrix composed of the cluster centers.

According to the given data point (xk,yk), the following steps give a detailed offline calculation method:

Step 1 Initialize the number of iterations $l=0$, set the value of $c$, the weight index $m$ and the stop criterion $\varepsilon$, and initialize the matrix $U$ randomly.

Step 2 Update the matrix $V$ by (9).

Step 3 Use (11) to update the distance $d_{ik}$ to each cluster.

Step 4 Update the matrix $U$ by (10).

Step 5 If $\|U^{(l+1)}-U^{(l)}\|<\varepsilon$, stop; otherwise return to Step 2.
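The five steps above can be sketched as follows. This is a minimal illustration that assumes the standard FCM update formulas with Euclidean distance and a stopping test on the change in $U$. The function name `fcm`, the `max_iter` cap, the `seed` argument and the small floor on the distances are assumptions added for the example.

```python
import numpy as np

def fcm(Z, c, m=2.0, eps=1e-6, max_iter=300, seed=0):
    """Fuzzy c-means on data Z of shape (N, r). Returns U (c, N) and V (c, r)."""
    rng = np.random.default_rng(seed)
    N = Z.shape[0]
    # Step 1: random partition matrix whose columns sum to 1
    U = rng.random((c, N))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Step 2: cluster centers as membership-weighted means
        V = (Um @ Z) / Um.sum(axis=1, keepdims=True)
        # Step 3: Euclidean distance of every sample to every center, shape (c, N)
        d = np.linalg.norm(Z[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)                    # avoid division by zero
        # Step 4: membership update u_ik = d_ik^(-2/(m-1)) / sum_j d_jk^(-2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        # Step 5: stop when the partition matrix no longer changes
        converged = np.linalg.norm(U_new - U) < eps
        U = U_new
        if converged:
            break
    return U, V
```

Because each column of `U` sums to 1, the memberships can be used directly as normalized rule weights when the consequent parameters are estimated.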

3.Fuzzy identification method based on IVS

This part presents the innovations of this article in detail. First, the IVS algorithm is discussed to determine the important input variables for structure identification. Second, the FCM algorithm is used for parameter identification. Then, the fuzzy model is constructed by combining IVS with FCM. Compared with previous fuzzy models, the advantage of this model is that it avoids the complex iterative optimization of the fuzzy model parameters and reduces the amount of calculation while maintaining a high fuzzy modeling accuracy. It is suitable for online identification of practical systems and has theoretical and practical guiding significance for the application of fuzzy systems to nonlinear dynamic systems.

3.1 Improved IVS

The IVS method based on TSFCS is more suitable for nonlinear systems with strong interdependence between input variables. The two-stage fuzzy curve gives the correlation between a single input variable and the output, and the two-stage fuzzy surface describes the interdependence between two different input variables relative to the output. In order to simplify the model, the correlation between input variables is not considered for now, and the difference calculation in the original algorithm is abandoned. The simplified TSFCS method is called the two-stage fuzzy curve (TSFC) method, through which the correlation indexes between all possible inputs and the output are obtained and sorted. The detailed process is discussed below. See Fig.1 for the flow chart of the algorithm.

Fig.1 Schematic diagram of TSFC

3.1.1 The first stage fuzzy curves

The first stage fuzzy curves are designed on the principle that the more important an input is, the more strongly it is related to the output. Suppose $x_1,x_2,\cdots,x_M$ are the possible input variables of the fuzzy system model, $y$ is the output, and $N$ is the sample quantity.

First, in each $x_i$-$y$ space, a Gaussian membership function $\mu_{ki}(x_i)$ is constructed:

3.1.2 The second stage fuzzy curves

If $P_{\tilde{y}_i}$ is equal to $P_{\tilde{y}_j}$ for $i\neq j$, it is impossible to identify which input variable is more important. The fuzzy curve in the second stage aims to solve this problem. The second stage fuzzy curve is given through the fuzzy curves in (13) as follows:

The variable list sorted from most to least important is obtained by calculating the performance index function $P_i$ for each variable and arranging the values in ascending order, since a smaller index indicates a more important input. $P_i=0$ means a good performance; $P_i=1$ means a poor performance.
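The ranking idea can be illustrated with a simplified stand-in. The sketch below builds a basic fuzzy curve for each candidate input (a Gaussian membership function centered at every sample, followed by centroid defuzzification) and scores it with a normalized mean square error, so that values near 0 indicate a relevant input and values near 1 an irrelevant one. It is not the two-stage index of (17): the normalization by the output variance, the `width` of 10% of the input range, and the function names are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_curve_index(x, y, width=0.1):
    """Normalized fit error of a fuzzy curve of y over one candidate input x.

    Assumes x is not constant (otherwise the membership width would be zero).
    """
    b = width * (x.max() - x.min())                     # assumed Gaussian width
    # mu[j, k]: membership of sample x_j in the fuzzy set centered at x_k
    mu = np.exp(-((x[:, None] - x[None, :]) / b) ** 2)
    curve = (mu @ y) / mu.sum(axis=1)                   # centroid defuzzification at each sample
    # Ratio of the curve's MSE to the MSE of a constant (mean) predictor
    return np.mean((curve - y) ** 2) / np.mean((y - y.mean()) ** 2)

def rank_inputs(S, y):
    """Rank the columns of S (N, M) from most to least relevant to y."""
    p = np.array([fuzzy_curve_index(S[:, i], y) for i in range(S.shape[1])])
    return np.argsort(p), p                             # ascending index: best first
```

An input on which the output genuinely depends yields a curve that tracks $y$ closely and hence a small index, whereas a noise input yields a nearly flat curve and an index close to 1.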

3.2 The proposed T-S fuzzy identification approach

The T-S fuzzy identification approach discussed in this article is illustrated in Fig.2, which combines the TSFC method and the FCM algorithm.

Fig.2 T-S fuzzy system identification approach based on TSFC

The detailed steps of the fuzzy modeling are described as below:

Step 1 Use the TSFC method to sort the input variables;

Step 2 Set the termination threshold $\varepsilon$ and the number of input variables $r$, and conduct the fuzzy subset partition (determine $c$);

Step 3 Calculate the antecedent parameters according to (10);

Step 4 Get $X$ from (5);

Step 5 Solve $P$ through (6) and (7);

Step 6 Calculate the model performance evaluation index MSE. If the MSE meets the required identification accuracy, the identification algorithm terminates; otherwise, increase $c$ and go to Step 3.
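Steps 2 to 6 can be sketched end to end as follows, under several stated assumptions: the ranking from Step 1 is supplied by the caller as `ranked`, the FCM memberships are used directly as normalized rule weights, the consequent parameters are solved by batch least squares (`numpy.linalg.lstsq`) in place of the RLS recursion, and $c$ is increased by one per pass. The names `identify`, `c0`, `c_max` and `target_mse` are hypothetical.

```python
import numpy as np

def fcm_memberships(Z, c, m=2.0, eps=1e-6, max_iter=300, seed=0):
    """Compact FCM returning only the partition matrix U of shape (c, N)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, Z.shape[0]))
    U /= U.sum(axis=0)
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ Z) / Um.sum(axis=1, keepdims=True)
        d = np.fmax(np.linalg.norm(Z[None] - V[:, None], axis=2), 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0)
        converged = np.linalg.norm(U_new - U) < eps
        U = U_new
        if converged:
            break
    return U

def design_matrix(Z, U):
    """Row k is [u_1k, u_1k*z_k, ..., u_ck, u_ck*z_k] (Step 4)."""
    Ze = np.hstack([np.ones((Z.shape[0], 1)), Z])
    return np.hstack([u[:, None] * Ze for u in U])

def identify(S, y, ranked, r, c0=2, c_max=10, target_mse=1e-4):
    """Use the r best-ranked inputs and grow c (c0 <= c_max) until the MSE target is met."""
    Z = S[:, ranked[:r]]                                 # Step 2: keep the r selected inputs
    for c in range(c0, c_max + 1):
        U = fcm_memberships(Z, c)                        # Step 3: antecedent memberships
        X = design_matrix(Z, U)                          # Step 4
        P, *_ = np.linalg.lstsq(X, y, rcond=None)        # Step 5 (batch stand-in for RLS)
        mse = float(np.mean((X @ P - y) ** 2))           # Step 6
        if mse <= target_mse:
            break
    return c, P, mse
```

The batch solve is used here only to keep the sketch short; for exact data it yields essentially the same parameters as the recursion, and an RLS routine could be substituted in Step 5 without changing the rest.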

4.Simulation research

The main goal of this section is to verify the effectiveness of IVS based on the TSFC method through several typical nonlinear system models.To determine the reliable dynamic characteristics, two methods, FCM and TSFC+FCM algorithms, are used to carry out a comparative study.

4.1 Example 1: Mackey-Glass chaotic system

4.1.1 IVS based on TSFC

The Mackey-Glass chaotic system [32] is considered a standard case, which is widely applied in research on fuzzy model performance. It is obtained by

One thousand data points are obtained by (18). The TSFC method of Section 3 is applied with $s_i=x(t-i)$ $(i=1,2,\cdots,18)$ and $y=x(t+1)$: the index function $P_i$ of each variable $s_i$ is calculated and all variables are sorted according to the value of $P_i$. The six most important input variables $s_1$, $s_2$, $s_3$, $s_4$, $s_5$ and $s_{18}$ are screened out, and the corresponding values of $P_i$ are 0.0611, 0.1164, 0.1967, 0.3003, 0.4203 and 0.4960, respectively. These six input variables represent $x(t-1)$, $x(t-2)$, $x(t-3)$, $x(t-4)$, $x(t-5)$ and $x(t-18)$.
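Building the candidate inputs $s_i=x(t-i)$ and the target $y=x(t+1)$ from a recorded series can be sketched as below. The helper name `lagged_candidates` and its arguments are hypothetical, and the series itself is assumed to be available already, so the generating equation (18) is not reproduced here.

```python
import numpy as np

def lagged_candidates(x, max_lag=18, horizon=1):
    """Return S with columns s_i = x(t-i), i = 1..max_lag, and target y = x(t+horizon)."""
    x = np.asarray(x, dtype=float)
    # Time indices for which every lag and the target exist in the series
    t = np.arange(max_lag, len(x) - horizon)
    S = np.column_stack([x[t - i] for i in range(1, max_lag + 1)])
    y = x[t + horizon]
    return S, y
```

Column `i - 1` of `S` holds the candidate $s_i$, so a ranking over the columns maps directly back to the lags.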

4.1.2 Experimental results

One thousand sets of data obtained by (18) are equally divided into two groups to verify the prediction performance of the model.The first group of data is used for model training, and the latter group of data is used for model testing.

In general, the six variables $x(t-1)$, $x(t-2)$, $x(t-3)$, $x(t-4)$, $x(t-5)$ and $x(t-6)$ are selected as model inputs. In this article, the variables selected above, $x(t-1)$, $x(t-2)$, $x(t-3)$, $x(t-4)$, $x(t-5)$ and $x(t-18)$, are used for fuzzy modeling. The initial values of $x(0)$ to $x(17)$ are all set to 1.2 and the rule number is $c=2$. The performance of our fuzzy model is shown in Fig.3, where Fig.3 (a) compares the original output with the predicted output of the model, and Fig.3 (b) depicts the corresponding errors.

Fig.3 Contrast of established model with actual model for Example 1

The traditional FCM algorithm and the proposed TSFC+FCM algorithm are both used to model Example 1 for comparison. With two fuzzy rules in both models, the modeling accuracy (MSE1) of the FCM algorithm and the TSFC+FCM algorithm is $4.5655\times10^{-5}$ and $4.1627\times10^{-5}$, and the prediction accuracy (MSE2) is $4.6050\times10^{-5}$ and $4.2957\times10^{-5}$, respectively. From the experimental results, it can be concluded that the prediction performance of fuzzy identification based on the TSFC method is better than that without IVS.

In addition, we compare our model performance with other results in the literature in Table 1, and our model has a higher identification accuracy. The training MSE is $4.1627\times10^{-5}$ and the prediction MSE on the test data is $4.2957\times10^{-5}$.

Table 1 Performance contrast of different fuzzy models for Example 1

4.2 Example 2: Box-Jenkins system

4.2.1 IVS based on TSFC

The Box-Jenkins data set [35] includes 296 sets of measured input and output values of a gas furnace process. At time $k$, the system input is the gas flow $u(k)$ and the system output is the CO2 concentration $y(k)$. The purpose of this experiment is to select the optimal inputs and predict $y(k+1)$. For this purpose, given

The importance index $P_i$ of each candidate input $s_i$ is calculated according to (17) based on the TSFC method, and the values are arranged in ascending order. Here, the first six input variables $s_6$, $s_5$, $s_7$, $s_4$, $s_8$ and $s_3$ are screened out; their values of $P_i$ are 0.0705, 0.1494, 0.1605, 0.2604, 0.3061 and 0.4295, respectively, and the corresponding actual inputs are $y(k-1)$, $u(k-4)$, $y(k-2)$, $u(k-3)$, $y(k-3)$ and $u(k-2)$.

4.2.2 Experimental results

In this section, the model generalization is tested in two situations. In the first case, the full data set is used to establish the model to verify its superiority; in the second case, the data set is equally divided into two parts, the former for model training and the latter for examining the prediction ability of the model.

In most of the available literature, the input variables of the Box-Jenkins system are $u(k)$, $u(k-1)$, $u(k-2)$, $y(k-1)$, $y(k-2)$ and $y(k-3)$. Here, the variables selected above are used as the inputs of the model. Fig.4 and Fig.5 show the model approximation performance in the two cases, respectively.

Fig.4 Contrast of established model with actual model for Example 2 (Case 1)

Fig.5 Contrast of established model with actual model for Example 2 (Case 2)

For the first case, Fig.4 (b) shows that the modeling error over data groups 200−290 is relatively large compared with that over data groups 1−200. In [29], a classic work on fuzzy identification, and in [8], the modeling error on the Box-Jenkins gas furnace data also showed an increasing trend in the same range. For the second case, in the testing stage of [36], the test error also increased over the 160−290 range. It can be concluded that, since the gas furnace data are measurements obtained through experiments, there is inevitably noise interference, which affects the modeling results.

For Case 1 and Case 2, the simulation results obtained by the TSFC+FCM algorithm and the FCM algorithm are displayed in Table 2 and Table 4. From Table 2 and Table 4, the identification performance of the model based on TSFC is clearly better than that of the model without IVS for the same number of input variables. For both cases, detailed comparisons of our model performance with others are given in Table 3 and Table 5, which show that our model has a higher accuracy than other models in the literature. It should be noted that the results in Table 3 and Table 5 are mostly obtained by using a complex parameter optimization process. In [10], the T-S fuzzy system with the fewest fuzzy rules was established by using the iterative vector quantization clustering method, and in [37], the number of fuzzy rules was optimized after the fuzzy structure had been established by FCRM.

Table 2 Performance comparison of IVS based on TSFC for Example 2 (Case 1)

Table 3 Performance contrast of different fuzzy models for Example 2 (Case 1)

Table 4 Performance comparison of IVS based on TSFC for Example 2 (Case 2)

Table 5 Performance comparison of different fuzzy models for Example 2 (Case 2)

The gravitational search algorithm (GSA) was used to optimize the antecedent parameters of the fuzzy model in [14,36]. Yan et al. presented an improved hybrid backtracking search algorithm (IHBSA) to search for the optimal number of fuzzy clusters and the corresponding cluster centers at the same time in [15]. However, the algorithm proposed in this paper avoids the complex iterative optimization process, which is the advantage of this model.

5.Application to the variable load pneumatic loading system

Owing to its simple structure, small volume, freedom from pollution and wide applicability, the pneumatic loading system has attracted much attention in recent research. It has been widely applied in industrial automation, aerospace, health care and other fields [38]. However, the complexity of the gas flow, the charge and exhaust characteristics of the two chambers of the cylinder, the nonlinear flow of the electric proportional valve and other factors give the pneumatic loading system strong nonlinearity and strong coupling, which makes modeling and controlling the variable load pneumatic loading system a difficult problem in industrial applications. There are two approaches to building the system model in practice. One is to establish a mechanism model according to the dynamic laws of system operation; the other is to establish a data model by collecting experimental data of system operation. In our research, the fuzzy model of the pneumatic loading system is built with a data-driven approach.

The pneumatic variable load test machine is suitable for the precise loading of small loads and for continuously variable loading. It can simulate the movement behavior of the moving interfaces of ground and space mechanisms, and realize friction and wear simulation tests between various moving interfaces under gravity and microgravity environments. The structure block diagram of the pneumatic loading system controlled by the electric proportional valve is shown in Fig.6. The hardware system is mainly composed of an air source, a pneumatic couplet, an electric proportional valve, an industrial control computer, a pressure sensor, a data acquisition card, a cylinder, etc. Among them, the SMC ITV2050 pilot electric proportional valve is used to convert the electric signal into the pressure signal; the MCL-L pull and press type sensor is used as the detection element of the system; the D/A and A/D conversions are both handled by the Advantech PCI1710. The system controller runs on an IPC-610h industrial computer, which is used to collect, process, calculate and output the gas pressure signal.

Fig.6 Structural diagram of the pneumatic system

Within the allowable scope of the dynamic system, this system uses square wave load as the excitation signal, as shown in (19):

It is continuously applied to the system in the open-loop state, and the input and output data of the system are collected for offline modeling. The sampling period and sampling duration are 0.1 s and 100 s respectively, and 1000 samples $[u(t),y(t)]$ are acquired, where $u(t)$ is the current input of the pneumatic proportional valve and $y(t)$ is the actual pressure output of the pneumatic loading system.

First, the appropriate input variables are selected by using 1000 sets of training data denoted as

According to the TSFC method, the first six important input variables are screened out in sequence: $s_{12}$, $s_9$, $s_8$, $s_{10}$, $s_{13}$ and $s_7$, the values of $P_i$ of which are 0.0212, 0.0262, 0.0375, 0.0577, 0.0598 and 0.0702, respectively. The corresponding actual variables are $y(t-1)$, $u(t-4)$, $y(t-2)$, $u(t-3)$, $y(t-3)$ and $u(t-2)$. The first four and the first six variables are separately chosen to establish the T-S fuzzy model, and the rule number is taken as 3. Then, FCM and RLS are used to identify the premise parameters and conclusion parameters.

The approximation of the established model output to the actual system output is displayed in Fig.7 (a), and the error between the two is shown in Fig.7 (b). The performance comparison of the input variable selection model of the pneumatic loading system is shown in Table 6. Compared with the model with traditional input variables, this model has a clearly higher accuracy. When the first six or the first four selected variables are adopted as model inputs, the MSE of the fuzzy model is 2.0168 or 3.0505, respectively. However, when six or four conventional variables are taken as model inputs, the MSE is 14.8809 or 24.4626. The approximation error index of the proposed fuzzy modeling method is thus markedly reduced. The model based on IVS can effectively overcome the impact of time delay on system modeling, which is of great importance for modeling actual dynamic systems.

Fig.7 Comparison of established model and actual system for pneumatic system

Table 6 Performance comparison of IVS based on TSFC for pneumatic system

6.Conclusions

In this study, a fuzzy model with important input variables is established to improve the identification accuracy of the fuzzy system and reduce the complexity of the model. First, the improved IVS algorithm is adopted, that is, TSFC is used to screen the necessary input variables for model identification. Then, FCM is used to identify the premise parameters and the RLS method is used to identify the conclusion parameters. Finally, we apply the algorithm to fuzzy identification benchmark problems and an actual system. The results reveal that the identification performance of the T-S fuzzy model based on TSFC is effectively improved. Compared with previous fuzzy models, the combination of the TSFC and FCM algorithms preprocesses the important input variables offline, which reduces the complexity of fuzzy identification and achieves a higher identification accuracy without optimizing parameters.