Hong-bo PENG, Xiong-wei JIANG
(College of Aeronautical Engineering, Civil Aviation University of China, Tianjin 300300, China)
Abstract: To address the fuzzy boundaries between aeroengine maintenance levels and the resulting low decision accuracy, a decision-making method for engine maintenance levels based on the large margin nearest neighbor algorithm and the k-nearest neighbor algorithm is proposed. First, the large margin nearest neighbor (LMNN) algorithm is used to learn a transformation matrix from the historical maintenance data of the engine. Then, the engine monitoring data are mapped to the optimized feature space by the transformation matrix. Finally, the k-nearest neighbor algorithm is used to build the decision-making model with the optimized data as training samples; the model determines the maintenance level by evaluating the state of the engine before it is removed from the aircraft. The method is verified using the performance parameters and maintenance level data of an aeroengine, and its decision accuracy is higher than that of the commonly used support vector machine and neural network models.
Key words: Large margin nearest neighbor, Transformation matrix, K-nearest neighbor, Maintenance level decision-making
An aeroengine is a complex piece of equipment integrating mechanical, electrical and fluid systems. Its maintenance and support bear directly on the flight safety of aircraft. The maintenance level of an engine reflects the depth of maintenance in actual work, and each maintenance level corresponds to a certain workscope of engine maintenance[1]. The maintenance level is not only directly related to the maintenance cost, but also affects the performance of the engine after maintenance and its time on wing. Therefore, scientific decision-making on the maintenance level is of great significance.
In the early stage, domestic and foreign airlines determined engine maintenance levels mainly from the experience of engine engineers and the technical support provided by engine manufacturers. This approach is only applicable to small fleets. As engine fleets grow and the number of engine types increases, engine engineers need a sufficient basis for decisions in fleet management. Scholars at home and abroad have therefore carried out much research in this field. The main approach borrows from the theory of engine fault diagnosis[2], adopting machine learning algorithms to determine the maintenance level from the engine monitoring data. Xie et al. used variable precision rough set theory to mine the internal relationship between engine monitoring parameters and maintenance level in order to extract decision rules[3]. Wang treated the decision-making of engine maintenance level as a pattern classification problem and proposed a decision-making model based on the least squares support vector machine[4]. Building on the support vector machine model, Zheng used the particle swarm optimization algorithm to optimize the adjustable parameters of the model, which improved the accuracy of maintenance level decision-making[5]. From the perspective of engine performance degradation, Jia used fuzzy comprehensive evaluation to determine the maintenance level of engine modules through a combination of performance parameters[6]. Liu applied the extreme learning machine to the decision-making of engine maintenance level and proposed an improved algorithm based on singular value decomposition to solve the problem of matrix singularity, which enhanced the stability of the established decision-making model[7]. Che used a deep learning method based on the deep belief network to mine hidden features in the monitoring parameters and established a deep network model to determine the engine maintenance level[8].
The machine learning algorithms used in the above research usually assume that the engine monitoring data obey a Gaussian distribution, but the distribution type of the engine monitoring parameters is usually unknown. In addition, the training process of these models is complex and there are many model parameters to be optimized, which hinders the application and promotion of the research results. Over the whole life of an aeroengine, engine overhaul is mostly caused by performance degradation or major failure. Engines with the same degree of performance degradation or the same type of failure show similar trends in their performance parameters. Besides, there are unknown correlations between some performance parameters. For these reasons, this paper proposes a decision-making method based on the LMNN (Large Margin Nearest Neighbor) algorithm, which can effectively process the monitoring data and optimize the sample feature space.
The large margin nearest neighbor classification algorithm can accurately establish the relationship between attributes and categories without assuming that the data obey a certain distribution. It is widely used in text recognition[9], action recognition[10] and other fields. This paper uses the engine historical maintenance data directly as training samples and adopts the large margin nearest neighbor algorithm to obtain the transformation matrix from the sample data. The engine monitoring data optimized by the transformation matrix are then used as the input, and the k-nearest neighbor algorithm is used to establish a decision-making model. The proposed decision-making method is verified with the actual maintenance data of the CF6 engines of an airline.
During aeroengine operation, condition monitoring is used to ensure the reliability and performance of the engine. Usually, the monitoring data returned from the sensors are collected by the monitoring software and, after conversion, stored in the database as deviation values, so as to eliminate the interference of flight state and external environment. The main monitoring parameters of the engine include exhaust gas temperature, low-pressure rotor speed, high-pressure rotor speed, fuel flow, low-pressure rotor vibration, high-pressure rotor vibration, oil temperature and oil pressure.
An abnormal trend in the monitoring parameters corresponds to a certain failure or degree of performance degradation of the engine. The degree of performance degradation and the failure type are the most important factors in the engine maintenance level decision. Therefore, the core problem of engine maintenance level decision-making guided by condition monitoring information is the effective processing of the monitoring data and the establishment of the mapping relationship between monitoring parameters and maintenance level.
The aeroengine is generally designed with a modular structure. The maintenance levels of the whole engine and of each module should be determined during maintenance. With the development of the civil aviation maintenance industry, the whole-engine maintenance level has little practical engineering significance and serves only as a management decision. Current engine maintenance work is mainly organized around the individual modules of the engine. The maintenance engineer needs to determine the workscope according to the maintenance level of each module. Therefore, in the actual engine maintenance process, all airlines treat the determination of module maintenance levels as a key task.
The maintenance level is usually determined by the engine manufacturers. The major engine manufacturers divide the maintenance levels in different ways, but the basic idea is the same: the maintenance level is divided into several levels ordered by maintenance depth from shallow to deep. For example, the module maintenance level of GE’s CF6 engine is divided into three levels, as shown in Table 1.
Table 1 Aeroengine module maintenance level classification
VC is suitable for engines without obvious hardware failure or performance problems. POH is used to restore the engine exhaust gas temperature margin and reduce the fuel consumption rate. OH is a thorough inspection and repair of engine parts.
The idea of the LMNN-based decision-making method for engine maintenance level is to use a metric learning algorithm to find an appropriate transformation metric, which shortens the distance between samples of the same class and widens the distance between samples of different classes. As a result, the boundaries between the engine monitoring data of different maintenance levels become more distinct, so a simple classification algorithm can achieve high accuracy when making the maintenance level decision.
The process is divided into two main stages. First, the historical engine maintenance data are collected, the feature transformation matrix is obtained by the large margin nearest neighbor algorithm, and the raw data are optimized with this matrix. Second, the transformed historical engine maintenance data are taken as training samples and the transformed current engine monitoring data as testing samples; the k-nearest neighbor algorithm is used to classify maintenance levels, the K value is updated and the classification accuracy evaluated, and the K value that yields the highest classification accuracy is taken as the final parameter of the engine maintenance level decision model. The framework of the decision method is shown in Fig.1.
Fig.1 Decision-making method frame diagram
The large margin nearest neighbor algorithm is a supervised metric learning method based on the Mahalanobis distance[11]. The main idea of the algorithm is to construct an objective function from the feature attributes and category information of the samples, and then solve the resulting optimization problem by semidefinite programming to obtain the metric matrix. The metric matrix maps the training samples to a new space in which samples of the same class move closer together and samples of different classes move further apart.
The expression of Mahalanobis distance is:
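The typeset formula is not reproduced in this text. A standard form that agrees with the definitions of M and L given next, offered as an assumption rather than as the original expression, is:

```latex
\begin{equation}
d_M(x_i, x_j) = (x_i - x_j)^{T} M\,(x_i - x_j)
             = \lVert L\,(x_i - x_j) \rVert_2^{2},
\qquad M = L^{T} L
\tag{1}
\end{equation}
```

This is the squared form of the distance normally used in LMNN. Writing M as a product of L with its transpose keeps M positive semidefinite.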
In Formula 1, M is the Mahalanobis matrix and L is the feature transformation matrix. The key to learning the Mahalanobis distance is to obtain the transformation matrix L. According to the idea of LMNN, the optimization model is designed as follows:
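The optimization model itself is not reproduced in this text. A sketch in the usual LMNN form, written with the symbols defined for Formula 2 and to be read as an assumption (the original may weight the two terms differently, for example with $1-\mu$ on the first), is:

```latex
\begin{equation}
\min_{M \succeq 0}\;
\sum_{i,j} d_M(x_i, x_j)
\;+\; \mu \sum_{i,j,l} (1 - y_{il})\,\varepsilon_{ijl}
\tag{2}
\end{equation}
```

The first sum pulls each sample toward its same-class nearest neighbors. The second sum penalizes different-class samples that come within a unit margin of those neighbors.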
In Formula 2, $x_i$ is the input vector, $x_j$ is a nearest neighbor of $x_i$ from the same class, and $x_l$ is a nearest neighbor of $x_i$ from a different class. $\mu$ is the balance coefficient. $y_{il}\in\{0,1\}$ indicates whether samples $x_i$ and $x_l$ have the same category label, taking 1 if they do and 0 otherwise. $\varepsilon_{ijl}=[1+d_M(x_i,x_j)-d_M(x_i,x_l)]_+$ is the slack variable.
First, the monitoring data and maintenance level information of the aeroengine before each shop visit are collected. The initial training sample set $S_{train}=[X_1,X_2,\ldots,X_n]$ and the corresponding category label set $C_{train}=[c_1,c_2,\ldots,c_n]$ are constructed as the input of the optimization model. Here $n$ is the number of samples, and each sample is a vector of $p$ engine monitoring parameter values, $X_i=(x_{i1},x_{i2},\ldots,x_{ip})^T$. $c_i$ is the category label of sample $X_i$ and indicates the maintenance level of the engine. In this paper, 1, 2 and 3 represent VC, POH and OH respectively.
The transformation matrix L is initialized randomly. The subgradient descent method is used to solve the optimization problem of Formula 2. Under the constraints, the optimal transformation matrix L is obtained by iterative updating. The transformation matrix L is then used to transform the features of the initial training sample set, as shown in Formula 3:
$$S'_{train}=L\times S_{train} \qquad (3)$$
Let $T=\{S'_{train}\}$ be the training sample set of the engine maintenance level decision model.
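The solution procedure described above can be sketched in plain Python. This is a minimal illustration, not the authors' MATLAB implementation. The function names (`sq_dist`, `target_neighbors`, `lmnn_loss_and_grad`, `fit_lmnn`), the unweighted pull term, the fixed step size and the choice to return the lowest-loss iterate are all assumptions.

```python
import random


def sq_dist(L, a, b):
    """Squared distance ||L(a - b)||^2, i.e. d_M(a, b) with M = L^T L."""
    diff = [ai - bi for ai, bi in zip(a, b)]
    proj = [sum(row[k] * diff[k] for k in range(len(diff))) for row in L]
    return sum(v * v for v in proj)


def target_neighbors(X, y, k):
    """For each sample, indices of its k nearest same-class samples (Euclidean)."""
    p = len(X[0])
    identity = [[1.0 if r == c else 0.0 for c in range(p)] for r in range(p)]
    targets = []
    for i in range(len(X)):
        same = [j for j in range(len(X)) if j != i and y[j] == y[i]]
        same.sort(key=lambda j: sq_dist(identity, X[i], X[j]))
        targets.append(same[:k])
    return targets


def lmnn_loss_and_grad(L, X, y, targets, mu):
    """Return the LMNN loss and one subgradient with respect to L."""
    p = len(X[0])
    loss = 0.0
    # S accumulates weighted outer products; the subgradient is 2 * L * S.
    S = [[0.0] * p for _ in range(p)]

    def add_outer(a, b, w):
        d = [ai - bi for ai, bi in zip(a, b)]
        for r in range(p):
            for c in range(p):
                S[r][c] += w * d[r] * d[c]

    for i, neigh in enumerate(targets):
        for j in neigh:
            d_ij = sq_dist(L, X[i], X[j])
            loss += d_ij                          # pull term
            add_outer(X[i], X[j], 1.0)
            for imp in range(len(X)):
                if y[imp] == y[i]:
                    continue                      # only other classes can intrude
                hinge = 1.0 + d_ij - sq_dist(L, X[i], X[imp])
                if hinge > 0:                     # margin violated: push term active
                    loss += mu * hinge
                    add_outer(X[i], X[j], mu)
                    add_outer(X[i], X[imp], -mu)
    grad = [[2.0 * sum(L[r][k] * S[k][c] for k in range(p)) for c in range(p)]
            for r in range(p)]
    return loss, grad


def fit_lmnn(X, y, k=3, mu=0.5, lr=0.1, max_iter=1000, seed=0):
    """Fixed-step subgradient descent on L from a random start (illustrative)."""
    rng = random.Random(seed)
    p = len(X[0])
    L = [[rng.uniform(-1.0, 1.0) for _ in range(p)] for _ in range(p)]
    targets = target_neighbors(X, y, k)
    best_L, best_loss = L, float("inf")
    for _ in range(max_iter):
        loss, grad = lmnn_loss_and_grad(L, X, y, targets, mu)
        if loss < best_loss:                      # keep the best iterate seen so far
            best_L, best_loss = L, loss
        L = [[L[r][c] - lr * grad[r][c] for c in range(p)] for r in range(p)]
    return best_L, best_loss
```

A fixed step does not guarantee a monotone decrease of the loss, which is why the sketch keeps the best iterate. A practical implementation would normally add a stopping criterion and an adaptive step size.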
The KNN (K-Nearest Neighbor) algorithm is a classification algorithm based on the principle of similarity. Its idea is to search for the K nearest neighbors of the testing sample in the training data set and apply a simple majority vote among those K neighbors to determine the category of the testing sample[12]. To determine the engine maintenance level, the k-nearest neighbor algorithm is used to classify the testing samples and thus give the final maintenance level. The algorithm steps are as follows:
(1) Collect the monitoring parameters of the engine and establish the testing sample $X_{test}=[x_1,x_2,\ldots,x_p]^T$, then use the transformation matrix L to transform the features of the testing sample: $X'_{test}=L\times X_{test}$.
(2) Calculate the Euclidean distance from $X'_{test}$ to each sample $X'_i$ in the training sample set T: $d(X'_{test},X'_i)=\sqrt{\sum_{k=1}^{p}(x'_{test,k}-x'_{ik})^2}$.
(3) Select the K samples closest to the testing sample from the training sample set. To determine the value of K, this paper uses 5-fold cross validation and takes the average classification accuracy as the evaluation standard. Find the category with the largest number of samples among the K nearest neighbors, then assign that category label to the testing sample. The labels (1, 2, 3) represent the engine maintenance levels (VC, POH, OH).
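Steps (1) to (3) can be sketched as follows. The helper names `transform` and `knn_predict` are hypothetical. A tie in the vote falls to the label seen first among the nearest neighbors, which is an implementation choice and not something the text specifies.

```python
import math
from collections import Counter


def transform(L, x):
    """Step (1): apply the feature transformation x' = L x."""
    return [sum(l_rk * x_k for l_rk, x_k in zip(row, x)) for row in L]


def knn_predict(train_X, train_y, test_x, k):
    """Steps (2) and (3): majority vote among the k closest training samples.

    train_X and test_x are assumed to be already transformed by L.
    """
    # Step (2): Euclidean distance from the test sample to every training sample.
    dists = sorted(
        (math.dist(x, test_x), label) for x, label in zip(train_X, train_y)
    )
    # Step (3): count labels among the k nearest neighbors and take the majority.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

With labels coded 1, 2 and 3, the returned value maps directly to VC, POH or OH.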
In this paper, we use the CF6 engine maintenance data of a domestic airline fleet from recent years as the simulation object, and take the high pressure turbine (HPT) module of the engine as the experimental object. According to the maintenance plan guide document developed by GE (General Electric Company), the maintenance level of the CF6 engine’s high pressure turbine includes three levels: VC (general inspection), POH (gas path performance recovery) and OH (overhaul). The monitoring parameters reflecting HPT performance include exhaust gas temperature deviation (DEGT), high pressure rotor speed deviation (DN2), fuel flow deviation (DFF), low pressure rotor vibration deviation (ZVB1F) and high pressure rotor vibration deviation (ZVB2F). The data, consisting of the actual HPT maintenance levels and the monitoring parameter values, are shown in Table 2.
We select 80 of the 94 data sets as modeling samples to obtain the transformation matrix L and the best K value, and use the remaining 14 data sets as testing samples to verify the effectiveness of the method. The whole process is implemented in MATLAB.
Table 2 HPT monitoring parameters and maintenance level
First, we use the 80 data sets as input samples of the LMNN optimization model, set the maximum number of iterations to 1000, the learning rate to 0.1, and the number of target neighbors to 3. After 67 iterations, the transformation matrix L is obtained.
The transformation matrix L is used to transform the features of the initial training samples and the testing samples. The t-SNE (t-distributed stochastic neighbor embedding)[13] visualizations of the data before and after the feature transformation are shown in Fig.2 and Fig.3 respectively.
Fig.2 Raw data sample distribution
Fig.2 shows the visualization of the raw data. The boundaries between the three maintenance levels are not clear and the data points belonging to the different levels are interleaved. Fig.3 shows the visualization of the data optimized by the transformation matrix. Comparing the two figures, after the feature transformation the boundaries between different maintenance levels are clearer and the maintenance levels are easier to distinguish.
Fig.3 Sample distribution after feature transformation
We input the modeling samples after feature transformation into the k-nearest neighbor classifier, then use 5-fold cross validation, randomly dividing the modeling samples into five groups, to determine the best K value. Because the number of VC-level samples in the modeling samples is limited, the value range of K is set to 1-7. The average classification accuracy is highest when K is 3, so K = 3 is selected as the final value for the model. The line chart of the classification results is shown in Fig.4.
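The K selection described here can be sketched as below. The function names `cv_accuracy` and `best_k`, the way the folds are formed and the tie-breaking toward the smallest K are assumptions for illustration. The inputs are assumed to be samples already transformed by L.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted((math.dist(t, x), c) for t, c in zip(train_X, train_y))[:k]
    return Counter(c for _, c in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, k, n_folds=5, seed=0):
    """Average KNN accuracy over n_folds random folds (needs len(X) >= n_folds)."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]   # random, near-equal groups
    scores = []
    for held_out in folds:
        held = set(held_out)
        tr_X = [X[i] for i in idx if i not in held]
        tr_y = [y[i] for i in idx if i not in held]
        hits = sum(knn_predict(tr_X, tr_y, X[i], k) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / n_folds


def best_k(X, y, k_values=range(1, 8), n_folds=5, seed=0):
    """Return the K in k_values with the highest average accuracy.

    Ties go to the smallest K, because max() keeps the first maximum.
    """
    return max(k_values, key=lambda k: cv_accuracy(X, y, k, n_folds, seed))
```

Using the same seed for every candidate K means all candidates are compared on identical folds.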
Fig.4 Cross validation classification results
Finally, we input the testing samples after feature transformation into the decision-making model to obtain the maintenance level predictions, and compare the predicted maintenance level of each testing sample with its actual maintenance level. The predicted results are shown in Fig.5.
Fig.5 Prediction results of maintenance level
We set up a comparative experiment on the same data, in which KNN, LS-SVM (Least Squares Support Vector Machine)[14] and CNN (Convolutional Neural Network)[15] are used to predict the maintenance level of the testing samples. For KNN, the K value is set to 3. For LS-SVM, the Gaussian kernel function is selected and the model parameters are optimized by a genetic algorithm, and the penalty factor C and kernel function parameter γ are set to 0.64 and 0.002 respectively. For CNN, the number of layers is set to 3, the radial basis function is selected as the neuron activation function, and the numbers of neurons in the layers are set to {5, 9, 1}. Accuracy is one of the most common evaluation indexes. In addition, the numbers of excessive maintenance cases, shortage of maintenance cases and accurate predictions are counted for all methods as further evaluation indexes. The comparison results of the four methods are shown in Table 3.
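The evaluation indexes can be computed as in the sketch below. It assumes, since the text does not define the terms explicitly, that a predicted level deeper than the actual one counts as excessive maintenance and a shallower one as shortage of maintenance. The function name `evaluate` is hypothetical.

```python
def evaluate(actual, predicted):
    """Count accurate, excessive and insufficient maintenance decisions.

    Levels are coded 1 (VC) < 2 (POH) < 3 (OH). Assumption: predicted > actual
    is excessive maintenance, predicted < actual is shortage of maintenance.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal length")
    pairs = list(zip(actual, predicted))
    accurate = sum(p == a for a, p in pairs)
    excessive = sum(p > a for a, p in pairs)
    shortage = sum(p < a for a, p in pairs)
    return {
        "accuracy": accurate / len(pairs),
        "accurate": accurate,
        "excessive": excessive,
        "shortage": shortage,
    }
```

Reporting the two error directions separately matters because they carry different operational risk.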
Table 3 Statistical table of test results
Compared with the other methods, the method proposed in this paper has better overall performance. The convolutional neural network is not suited to small sample sizes and has the lowest accuracy, whereas engine maintenance such as gas path performance recovery or overhaul is a small-sample event. LS-SVM classifies the small-sample data better, but its classification results contain more cases of shortage of maintenance. In actual engine maintenance, shortage of maintenance is more likely to cause engine operation accidents than excessive maintenance. Because some training data points at the edges of the maintenance levels are interlaced with each other, the accuracy of KNN alone is far lower than that of the proposed method. After the raw data are optimized by LMNN, the classification accuracy increases greatly and meets the requirements of engineering application.
In this paper, a decision-making method for engine maintenance levels has been proposed. Based on the condition monitoring data and maintenance records of the aeroengine, the proposed method uses the large margin nearest neighbor algorithm to optimize the raw data and the k-nearest neighbor algorithm to classify the engine maintenance level. Where the performance parameters of different maintenance levels overlap and their boundaries are unclear, the large margin nearest neighbor algorithm optimizes the distribution of the data points representing the parameter values and makes the maintenance levels easier to determine. The proposed method makes full use of the existing monitoring data of the aeroengine. It is also easier to implement and generalizes better than the other methods compared.