RBF neural network regression model based on fuzzy observations

2013-01-02

Zhu Hongxia Shen Jiong Su Zhigang

(1 School of Energy and Environment, Southeast University, Nanjing 210096, China)
(2 School of Energy and Power Engineering, Nanjing Institute of Technology, Nanjing 211167, China)

Radial basis function (RBF) neural networks have been applied and evaluated in a wide variety of fields [1-7]. Most recent RBF neural networks assume perfect knowledge of the response values for the learning samples; that is, the observations are supposed to be precise (i.e., point-valued). However, in many real-life situations such standard observations cannot be obtained: information about the response usually comes from measuring devices or sensors with limited precision. It is therefore necessary to extend RBF neural networks to deal with imprecise data and to propose a new methodology for the imprecise setting. To date, there is little literature on extending RBF neural networks to imprecise data. Cheng and Lee [1] proposed a fuzzy version of the RBF neural network in which the weight coefficients are assumed to be fuzzy. The output of such a fuzzy RBF neural network is therefore fuzzy, and the use of fuzzy weight coefficients usually increases learning complexity. In practice, it is often more appropriate to obtain a precise prediction, even though the training samples can only be imprecise. In this paper, we suppose that the imprecise data are represented by fuzzy membership functions and investigate RBF network regression with crisp inputs and fuzzy outputs. Unlike the existing family of fuzzy RBF neural networks, the proposed method does not require the weight coefficients to be fuzzy, which reduces the learning complexity, and its prediction output is a precise point value.

Two obstacles prevent classical RBF neural networks from dealing with imprecise data. The first is how to determine the radial basis functions (i.e., the centers and widths of the nodes in the hidden layer) when the response is a fuzzy membership function. The second is how to identify the linear functions (i.e., the weight coefficients of the nodes in the hidden layer) when the observations of the responses are fuzzy membership functions. To solve the first problem, we propose a data-driven automatic method. This method treats the input data and output data separately, but it considers both the structure of the input data and the performance of the RBF neural network, so as to find the optimal number of nodes in the hidden layer with acceptable accuracy. To identify the final linear behaviors, a novel algorithm for estimating parameters in a fuzzy setting is needed. Recently, a significant contribution has been the extension of the expectation-maximization (EM) algorithm [8] to fuzzy data, the so-called fuzzy EM algorithm [9]. Using the fuzzy EM algorithm, the weight coefficients of the RBF neural network can be identified when the observations are fuzzy membership functions. We therefore propose a fuzzy observations-based RBF neural network (FORBFNN) regression model that can be built automatically from data.

1 Fuzzy EM Algorithm

Let X, referred to as the complete-data vector, be a random vector taking values in the sample space χ and describing the result of a random experiment. The probability density function (pdf) of X is denoted by g(x; ψ), where ψ = {ψ1, ψ2, …, ψd}T is a column vector of unknown parameters with parameter space Ω.

If x, a realization of X, is known exactly, we can compute the maximum likelihood estimate (MLE) of ψ as any value maximizing the complete-data likelihood function:

L(ψ; x) = g(x; ψ)

(1)

(2)

(3)

the likelihood function (2) can be written as a product of n terms,

(4)

and the observed-data log likelihood is

(5)

The E-step consists in the calculation of

(6)

The M-step requires the maximization of Q(ψ, ψ(q)) with respect to ψ over the parameter space Ω, i.e., finding ψ(q+1) such that

Q(ψ(q+1), ψ(q)) ≥ Q(ψ, ψ(q))  ∀ψ ∈ Ω

The fuzzy EM algorithm alternately repeats the E- and M-steps until the increment of observed-data likelihood becomes smaller than some threshold.
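An interval-valued observation is a crisp special case of a fuzzy observation, and it already illustrates the alternation: the E-step replaces each imprecisely observed value by its conditional expectation under the current parameters, and the M-step re-estimates the parameters from those expectations, stopping when the likelihood increment falls below a threshold. The sketch below (an illustrative pure-Python example, not the paper's implementation; the function name and data are hypothetical) estimates the mean of a normal distribution with known variance from interval-valued observations:

```python
import math

def phi(z):  # standard normal pdf
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):  # standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def em_interval_mean(intervals, sigma, m0=0.0, tol=1e-10, max_iter=500):
    """EM estimate of the mean of N(m, sigma^2) when each observation is
    only known to lie in an interval [a_i, b_i] (a crisp special case of
    a fuzzy observation)."""
    m = m0
    prev_ll = -math.inf
    for _ in range(max_iter):
        betas = []
        ll = 0.0
        for a, b in intervals:
            za, zb = (a - m) / sigma, (b - m) / sigma
            mass = Phi(zb) - Phi(za)          # P(a < X < b)
            # E-step: E[X | a < X < b] for a truncated normal
            betas.append(m + sigma * (phi(za) - phi(zb)) / mass)
            ll += math.log(mass)              # observed-data log likelihood
        m = sum(betas) / len(betas)           # M-step: MLE of the mean
        if ll - prev_ll < tol:                # tiny likelihood increment: stop
            break
        prev_ll = ll
    return m

# Observations known only up to unit-width intervals around noisy readings
intervals = [(x - 0.5, x + 0.5) for x in (1.2, 2.1, 1.7, 2.4, 1.9)]
m_hat = em_interval_mean(intervals, sigma=1.0)
```

With symmetric intervals the estimate settles near the mean of the interval midpoints, as expected.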

2 Proposed RBF Neural Network Based on Fuzzy Observations

2.1 Identification of radial basis functions

The basic topology of the RBF neural network comprises, in sequence, a hidden layer and a linear processing unit forming the output layer. Fig.1 depicts this topology for a multi-input single-output network, where c represents the number of nodes in the hidden layer. The set of input-output data pairs is denoted T = {(ui, xi) ∈ Rp × R | xi = f(ui), i = 1, 2, …, n}, where n is the number of training samples, ui = {ui1, ui2, …, uip}T is the i-th p-dimensional input vector and xi is the i-th output variable. Gaussian-type RBF functions of the following form are selected:

(7)

(8)

Fig.1 Basic topology of an RBF neural network
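Since the body of Eq.(7) is lost in this copy, the sketch below assumes the standard Gaussian basis φk(u) = exp(−‖u − vk‖²/(2sk²)) combined linearly by the weights wk, as Fig.1 and the linear output layer suggest; all names and numbers are illustrative:

```python
import math

def rbf_predict(u, centers, widths, weights):
    """Output of a multi-input single-output Gaussian RBF network:
    x_hat = sum_k w_k * exp(-||u - v_k||^2 / (2 * s_k^2))."""
    out = 0.0
    for v, s, w in zip(centers, widths, weights):
        d2 = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
        out += w * math.exp(-d2 / (2.0 * s * s))
    return out

# A toy network with c = 2 hidden nodes over 2-dimensional inputs
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [2.0, -1.0]
y = rbf_predict([0.0, 0.0], centers, widths, weights)  # 2 - exp(-1)
```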

This section presents the strategy to identify radial basis functions when the observations are in the following form:

(9)

To address the proposed strategy, a new performance measure is first needed, which is called mean square fuzzy expectation error and defined as

(10)

The proposed strategy is an iterative procedure with two termination conditions: 1) the approximation error reaches a given acceptable performance ε, i.e., MSEfuzzy ≤ ε; 2) the number of nodes in the hidden layer (i.e., the node base) exceeds a given maximum number Rmax, i.e., c > Rmax. If either condition is satisfied, the iteration terminates. These two conditions ensure a desirable tradeoff between accuracy and the size of the node base according to the designer's intuition or expertise.

First, an initial node in the hidden layer is generated by a simple method. The center and width of this node are determined by

(11)

(12)

With this initial node base, the weight coefficients are identified using the fuzzy EM algorithm, which is detailed in the following subsection.

Secondly, a new node in the hidden layer is constructed. The vector with the worst MSEfuzzy,i, denoted by ui′, is taken as the candidate center of the new node:

(13)

Because the candidate center is chosen only on the basis of the performance error, an outlier could be selected as a new center. Although data preprocessing may detect and eliminate outliers, it is still necessary to reduce the effect of noisy data and to exclude the chance of an outlier becoming a center. In addition, the new candidate center should not be too close to the existing centers. Therefore, the following conditions should be satisfied:

(14a)

(14b)

where Δ1 and Δ2 are constants, and μi,i′ is the membership degree of the i-th datum belonging to the i′-th cluster, determined as follows [11]:

(15)

Condition (14a) prevents an outlier from becoming a new center, and condition (14b) ensures that the new center is not located too close to the existing centers. The constants Δ1 and Δ2 can be defined as

(16)

(17)

If the selected vector ui′ satisfies (14), it is declared the center of a new node. Otherwise, it is marked as an outlier and the process of selecting the vector with the worst performance is repeated, excluding the outliers. When none of the remaining vectors satisfies (14), the procedure is terminated to avoid over-fitting. The center and width of the new node are defined as

vnew,j = ui′,j, j = 1, 2, …, p, or equivalently vnew = ui′

(18)

(19)

Finally, once the new node is added to the node base, the node number increases by one, i.e., c = c + 1, and we set vc = vnew, sc = snew. Because of the added node, the node base should be updated: the centers vk of the previous (c - 1) nodes in the node base are kept, whereas their widths sk (k = 1, 2, …, c - 1) are updated according to Eq.(19), replacing the index i′ with the index k (k = 1, 2, …, c - 1).

From the above interpretation, the computational complexity of the proposed strategy is O(nRmax).
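The candidate-selection step described above can be sketched as follows (illustrative Python, not the paper's implementation; the membership check of condition (14a) is omitted for brevity and only the distance check of condition (14b) is kept):

```python
import math

def select_new_center(inputs, errors, centers, delta2):
    """Greedy candidate selection: take the sample with the worst error
    as the new center, unless it lies within delta2 of an existing
    center; in that case fall back to the next-worst sample.  Returns
    None when no admissible candidate remains (stop adding nodes)."""
    order = sorted(range(len(inputs)), key=lambda i: errors[i], reverse=True)
    for i in order:
        dmin = min(
            (math.dist(inputs[i], c) for c in centers),
            default=float("inf"),
        )
        if dmin > delta2:
            return inputs[i]   # accept as the center of the new node
    return None

# Toy data: the worst-error samples sit on top of an existing center,
# so the next admissible sample is chosen instead.
inputs = [[0.0], [0.5], [3.0], [3.1]]
errors = [0.1, 0.2, 0.9, 0.8]
new_center = select_new_center(inputs, errors, [[3.0]], delta2=0.5)
```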

2.2 Identification of weight coefficients of RBF by fuzzy EM algorithm

For the following discussion, we first transform the estimated output in Eq.(8) to the following vector or matrix form:

(20)

where

According to the above interpretations, the complete-data pdf can be defined as

(21)

By using the complete-data pdf, the complete-data log likelihood is computed as

(22)

(23)

(24)

(25)

where Φ(·) denotes the cumulative distribution function (cdf) of the standard normal distribution, and x* denotes (x - m)/σ for all x. It is easy to obtain that

∫ab xN(x; m, σ2)dx = m(Φ(b*) - Φ(a*)) + σ(φ(a*) - φ(b*))

(26)

(27)

where the denominator is given by Eq.(25). The numerator is

(28)

which can be computed using Eq.(26) and

∫ab x2N(x; m, σ2)dx = (m2 + σ2)(Φ(b*) - Φ(a*)) + 2mσ(φ(a*) - φ(b*)) + σ2(a*φ(a*) - b*φ(b*))

(29)

We finally compute

(30)

The numerator is

(31)

which can be computed using Eq.(29) and

m3(Φ(b*)-Φ(a*))

(32)
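Assuming the standard truncated-normal identities, whose leading terms match the surviving fragments of Eqs.(26) and (29), the partial moments needed by the E-step can be computed as follows (illustrative Python; φ denotes the standard normal pdf and Φ is evaluated via the error function):

```python
import math

SQRT2 = math.sqrt(2.0)

def phi(z):  # standard normal pdf
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):  # standard normal cdf
    return 0.5 * (1.0 + math.erf(z / SQRT2))

def partial_moments(m, sigma, a, b):
    """Integrals of 1, x and x^2 against the N(m, sigma^2) density over
    [a, b], via the standard truncated-normal identities."""
    za, zb = (a - m) / sigma, (b - m) / sigma
    dPhi = Phi(zb) - Phi(za)               # integral of the pdf, Eq.(25)
    dphi = phi(za) - phi(zb)
    i0 = dPhi
    i1 = m * dPhi + sigma * dphi           # leading term matches Eq.(26)
    i2 = ((m * m + sigma * sigma) * dPhi   # leading term matches Eq.(29)
          + 2.0 * m * sigma * dphi
          + sigma * sigma * (za * phi(za) - zb * phi(zb)))
    return i0, i1, i2

# Symmetric truncation about the mean: the first moment vanishes
i0, i1, i2 = partial_moments(0.0, 1.0, -1.0, 1.0)
```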

The M-step requires maximizing Q(ψ, ψ(q)) with respect to ψ. This can be achieved by differentiating Q(ψ, ψ(q)) with respect to w and σ, which results in

Equating these derivatives to zero and solving for w and σ, we obtain the following unique solution:

w(q+1) = (HHT)-1Hβ(q)

(33)

σ(q+1)=

(34)

When the iteration terminates, we obtain the regression weight coefficients w and thus the final RBF neural network regression model, trained from crisp inputs and fuzzy membership-function observations.
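The weight update of Eq.(33) is a normal-equations solve; a minimal numerical sketch is given below (pure Python with illustrative names and data; a real implementation would use a linear-algebra library):

```python
def solve_linear(A, y):
    """Solve A w = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

def m_step_weights(H, beta):
    """w = (H H^T)^{-1} H beta as in Eq.(33): H is the c x n matrix of
    hidden-node activations, beta the vector of E-step expectations."""
    c, n = len(H), len(H[0])
    HHt = [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(c)]
           for i in range(c)]
    Hb = [sum(H[i][k] * beta[k] for k in range(n)) for i in range(c)]
    return solve_linear(HHt, Hb)

# Sanity check: expectations generated from known weights are recovered
H = [[1.0, 0.5, 0.2], [0.3, 1.0, 0.7]]
w_true = [2.0, -1.0]
beta = [sum(w_true[i] * H[i][k] for i in range(2)) for k in range(3)]
w_hat = m_step_weights(H, beta)
```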

3 Simulations

In this section, we validate the performance of the FORBFNN by using a numerical simulation, in which the behavior of a nonlinear system is defined as

x = u sin u,  u ∈ [0, 10]

(35)

To model the situation where the response x can only be imprecisely observed, triangular fuzzy membership functions (see Eq.(24)) are adopted. The cores and supports of these fuzzy membership functions are generated according to the following two-step strategy:

Step 1 Generate the cores xi of the fuzzy observations, xi = f(ui) + εi, where εi ~ N(0, δmax).

In the simulation, four study cases for the deviation δi, i.e., δi ∈ {[0, 0.01], [0, 1], [0, 2], [0, 3]}, are considered. A very wide range of imprecision is not considered, because it would make the training samples useless for describing the given system. In each study case, the size of the training set is n = 21, the accuracy threshold is ε = 10-5 and the maximum node number is Rmax = 4, 5, 6, 7. In addition, we assume that no outliers exist in the data sets; therefore, the parameter η in Eq.(16) can be set to zero.
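Training data for one study case can be generated along these lines (illustrative Python; the paper's exact support-generation step is not preserved in this copy, so the spread scheme below is an assumption labeled as such in the code):

```python
import math
import random

random.seed(0)  # reproducible toy run

def make_fuzzy_samples(n=21, delta_max=1.0):
    """Generate triangular fuzzy observations of x = u sin(u), u in [0, 10]:
    the core is a noisy reading of the true response, xi = f(ui) + eps_i
    with eps_i ~ N(0, delta_max), and the support half-width is drawn from
    [0, delta_max] (an assumed scheme, since the paper's Step 2 is missing
    from this copy)."""
    samples = []
    for i in range(n):
        u = 10.0 * i / (n - 1)                    # evenly spaced inputs
        core = u * math.sin(u) + random.gauss(0.0, delta_max)
        spread = random.uniform(0.0, delta_max)   # half-width of the support
        samples.append((u, (core - spread, core, core + spread)))
    return samples

data = make_fuzzy_samples()  # 21 (input, triangular fuzzy number) pairs
```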

To validate the performance, 101 testing samples are produced according to

(37)

The numerical results are shown in Tab.1, and four graphical results randomly selected from the 100 trials in each study case are shown in Fig.2.

Tab.1 Approximation and prediction errors (mean ± one standard deviation) in different ranges of imprecision

Fig.2 Four data sets and prediction results randomly selected from 100 trials for four study cases. (a) δi∈[0,3]; (b) δi∈[0,2]; (c) δi∈[0,1]; (d) δi∈[0,0.01]

Fig.2 illustrates the prediction results of the FORBFNN model for different ranges of imprecision. The predicted curves approach the true behavior, and the difference between them becomes smaller as the imprecision decreases. In particular, the difference approaches zero in the precise and certain case.

Tab.1 presents the approximation and prediction accuracies when the maximum number of hidden-layer nodes Rmax takes different values in different ranges of imprecision; the results show the performance of the FORBFNN model numerically. For a given range of imprecision, the Rmax corresponding to the highest approximation accuracy is chosen as the node number, provided there is no over-fitting. For instance, in the first case δi ∈ [0, 3] in Tab.1, the highest approximation accuracy appears when Rmax = 5; therefore, the number of nodes in the node base is 5, i.e., c = 5. We call a model over-fitted if its approximation error is small whereas its prediction error is high, as in the case δi ∈ [0, 3] in Tab.1. Over-fitting tends to occur when the imprecision is high, which suggests constructing the FORBFNN with a small node base in such cases. In addition, the performance of the FORBFNN improves as the number of hidden-layer nodes increases, up to a limit.

In summary, the FORBFNN can deal with imprecise data, and its performance is determined by the range of imprecision: the lower the imprecision, the higher the approximation and prediction accuracies.

4 Conclusion

This paper proposes a fuzzy observations-based RBF neural network for problems in which the response of a system can only be represented by fuzzy membership functions. In this approach, the weight coefficients used to combine the outputs of the hidden-layer nodes are identified by the fuzzy EM algorithm, and both the performance accuracy and the size of the node base (i.e., the complexity of the produced model) are considered simultaneously. The performance of the FORBFNN is illustrated by simulations.

Some issues still need to be studied for wider application of the proposed method, such as how to construct fuzzy data from running data. Once this issue is solved, the method can be widely used in engineering practice.

[1] Cheng C B, Lee E S. Fuzzy regression with radial basis function network [J]. Fuzzy Sets and Systems, 2001, 119(2): 291-301.

[2] Lu S W, Basar T. Robust nonlinear system identification using neural-network models [J]. IEEE Transactions on Neural Networks, 1998, 9(3): 407-429.

[3] Li Y, Qiang S, Zhuang X, et al. Robust and adaptive backstepping control for nonlinear systems using RBF neural networks [J]. IEEE Transactions on Neural Networks, 2004, 15(3): 693-701.

[4] Panda S S, Chakraborty D, Pal S K. Flank wear prediction in drilling using back propagation neural network and radial basis function network [J]. Applied Soft Computing, 2008, 8(2): 858-871.

[5] Rivas V M, Merelo J J, Castillo P A, et al. Evolving RBF neural networks for time-series forecasting with EvRBF [J]. Information Sciences, 2004, 165(3/4): 207-220.

[6] Wei H K, Song W Z, Li Q. A RBF network based online modeling method for real-time cost model in power plant [J]. Proceedings of the CSEE, 2004, 24(7): 246-252. (in Chinese)

[7] Kumar R, Ganguli R, Omkar S N. Rotorcraft parameter estimation using radial basis function neural network [J]. Applied Mathematics and Computation, 2010, 216(2): 584-597.

[8] Dempster A P, Laird N M, Rubin D B. Maximum likelihood from incomplete data via the EM algorithm [J]. Journal of the Royal Statistical Society B, 1977, 39(1): 1-38.

[9] Denoeux T. Maximum likelihood estimation from fuzzy data using the EM algorithm [J]. Fuzzy Sets and Systems, 2011, 183(1): 72-91.

[10] Zadeh L A. Probability measures of fuzzy events [J]. Journal of Mathematical Analysis and Applications, 1968, 23(2): 421-427.

[11] Hoppner F, Klawonn F. Improved fuzzy partitions for fuzzy regression model [J]. International Journal of Approximate Reasoning, 2003, 32(2/3): 85-102.