Mahmoud Ragab
1 Information Technology Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
2 Centre of Artificial Intelligence for Precision Medicines, King Abdulaziz University, Jeddah 21589, Saudi Arabia
3 Mathematics Department, Faculty of Science, Al-Azhar University, Naser City 11884, Cairo, Egypt
Abstract: Rainfall prediction has become popular in real-time environments owing to recent technological developments. Accurate and fast rainfall prediction models can be designed using machine learning (ML), statistical models, and related approaches. In addition, feature selection approaches can be used to mitigate the curse of dimensionality. In this context, this paper presents a novel chaotic spider monkey optimization with optimal kernel ridge regression (CSMO-OKRR) model for accurate rainfall prediction. The goal of the CSMO-OKRR technique is to predict rainfall accurately from weather data. The proposed technique comprises three major processes, namely feature selection, prediction, and parameter tuning. First, the CSMO algorithm is employed to derive a useful subset of features and reduce the computational complexity. Next, the KRR model is used to predict rainfall from the weather data. Finally, the symbiotic organism search (SOS) algorithm is employed to tune the parameters of the KRR model. A series of simulations is performed to demonstrate the performance of the CSMO-OKRR technique with respect to different measures. The simulation results show that the CSMO-OKRR technique outperforms existing techniques.
Keywords: Rainfall prediction; statistical techniques; machine learning; kernel ridge regression; symbiotic organism search; parameter tuning
Rainfall remains a leading meteorological parameter affecting many aspects of day-to-day life. With effects ranging from damage to infrastructure during floods to disruption of transportation networks, the socio-economic impact of rainfall is considerable [1]. Floods and similar events are consequences of climate change, and they are predicted to occur more often and to have catastrophic effects in the coming years. Recent studies have emphasized that weather conditions could increase air pollution (another major topic in recent climate change discourse) in the summer and winter seasons [2]. It is worth repeating that increasing air pollution leads to serious health issues such as asthma and lung problems. Thus, as a mitigation measure, several researchers have proposed and investigated rainfall forecasting methods to help prepare for such events [3]. However, the statistical and mathematical algorithms used require substantial computing power and are time consuming, with limited effect.
Conventional rainfall forecasting relies on statistical models that assess the correlations among rainfall, geographic coordinates (namely latitude and longitude), and atmospheric factors (such as pressure, wind speed, humidity, and temperature) [4]. However, the complexity of rainfall, i.e., its non-linearity, makes it hard to forecast. Accordingly, efforts have been made to reduce the non-linearity through wavelet analysis, spectrum analysis, and empirical mode decomposition (EMD).
In recent times, owing to progress in the field of pattern recognition, several methods have become available for forecasting rainfall in place of the traditional methods, which relied on linear mathematical curves, guidelines, and mathematical relationships supported by operator experience [5]. Machine learning (ML) methods are extensively employed to solve hydrological problems, including rainfall forecasting. ML-based models use their self-learning capability to capture hidden features of echo variations, and they display good association and memory capability [6]. Their use as numerical prediction and classification models in weather prediction shows the broad potential of applying neural network systems to radar echo extrapolation.
In particular, deep learning (DL) methods have lately been employed for processing meteorological big data; they show strong technical performance and advantages and have gained considerable interest [7]. Researchers have employed artificial neural networks (ANN) for forecasting rainfall with a variety of models. These studies built rainfall simulation models that offer precise rainfall prediction data together with its temporal and spatial distribution. One study applied ANN to short-term rainfall forecasting for an urban catchment, and the results illustrated that the ANN method using low lags performed best in terms of forecast accuracy indices [8]. Another forecast daily rainfall using a resilient propagation learning method, and the numerical results showed that the method is superior to several regression models in terms of the forecast accuracy index.
This paper presents a novel chaotic spider monkey optimization with optimal kernel ridge regression (CSMO-OKRR) model for accurate rainfall prediction. The proposed CSMO-OKRR technique comprises three major processes, namely feature selection, prediction, and parameter tuning. First, the CSMO algorithm is employed to derive a useful subset of features and reduce the computational complexity. Next, the KRR model is used to predict rainfall from the weather data. Finally, the symbiotic organism search (SOS) algorithm is employed to tune the parameters of the KRR model. A series of simulations is performed to demonstrate the performance of the CSMO-OKRR technique in terms of different measures.
Hu et al. [9] deployed long short-term memory (LSTM) and ANN methods to simulate the rainfall-runoff process based on flood events from 1971 to 2013 in the Fen River basin, observed by one hydrologic station and fourteen rainfall stations in the catchment. The experimental data comprise ninety-eight rainfall-runoff events, of which eighty-six were used as the training set and the remainder as the testing set. Xiang et al. [10] explored the short-to-long-term variations within the original rainfall time series using an ensemble EMD based analysis of three rainfall data sets gathered by meteorological stations situated in Kunming, Lincang, and Mengzi, Yunnan Province, China. Considering time efficiency and prediction accuracy, a new integrated method based on the data extracted using EMD was presented.
Poornima et al. [11] presented an intensified LSTM-based recurrent neural network (RNN) for rainfall forecasting. The neural network (NN) is trained and tested on benchmark rainfall data sets, and the trained network produces forecasted rainfall attributes. Pham et al. [12] developed and compared AI techniques such as adaptive networks for predicting daily rainfall in Hoa Binh province, Vietnam. Xiang et al. [13] developed a predictive method based on a seq2seq structure and LSTM to estimate hourly rainfall-runoff. The method was applied to two Midwestern watersheds in Iowa, Upper Wapsipinicon River and Clear Creek, to predict hourly runoff over a twenty-four-hour period using rainfall forecasts, rainfall observations, empirical monthly evapotranspiration, and runoff observations from every station in the two watersheds. Venkatesh et al. [14] presented a rainfall forecasting method based on a generative adversarial network (GAN) to examine rainfall statistics of India and forecast upcoming rainfall. In the presented method, a convolutional neural network (CNN) is used as the discriminator and an LSTM as the generator, since LSTM is well suited to predicting time series data such as rainfall.
In this study, an effective CSMO-OKRR technique is developed to predict rainfall. The proposed technique involves pre-processing, feature selection, prediction, and SOS-based parameter tuning. The SOS algorithm helps to considerably enhance the predictive outcomes of the KRR model.
The perturbation rate is the most significant parameter of SMO, as it governs the convergence behavior of the algorithm. In general, the perturbation rate is a linearly increasing function of the iteration count. However, because many applications are nonlinear, a nonlinear function may affect the efficiency of SMO. Thus, to enhance the accuracy of SMO, new variants have been introduced [15] in which the perturbation rate is adapted, yielding better global search efficiency and a preferable rate of convergence. For a metaheuristic approach, exploration and exploitation are two significant stages for escaping local optima and obtaining an accurate solution. It has been observed that, with a linearly increasing perturbation rate, SMO becomes trapped in local optima in the final iterations because of poor divergence. Consequently, in the presented optimization method, the perturbation rate is altered by a chaotic increasing function rather than a linear one. Because a chaotic model is divergent and nonlinear by nature, it yields good outcomes for global optimization. It creates an oscillating trajectory and forms a fractal pattern.
Here, $z_t$ denotes the chaotic value at the $t$-th iteration and lies in $[0,1]$. Chaotic behavior is obtained for $\mu=4$. In the presented model, the perturbation rate parameter is altered according to this chaotic behavior.
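The recurrence that defines the chaotic value is not reproduced in this text. The description (values confined to [0, 1], chaotic behavior at μ = 4) matches the logistic map, so the minimal sketch below assumes the form z_{t+1} = μ z_t (1 − z_t). The function name and the seed value 0.3 are illustrative choices, not taken from the paper.

```python
def logistic_map(z0, n_steps, mu=4.0):
    """Return [z_0, ..., z_n] under z_{t+1} = mu * z_t * (1 - z_t).

    The logistic form is an assumption inferred from the text
    (z_t in [0, 1], chaotic regime at mu = 4).
    """
    if not 0.0 <= z0 <= 1.0:
        raise ValueError("z0 must lie in [0, 1]")
    values = [z0]
    for _ in range(n_steps):
        z = values[-1]
        values.append(mu * z * (1.0 - z))
    return values


# Illustrative seed; any value in (0, 1) that is not a fixed point works.
seq = logistic_map(0.3, 50)
```

With μ = 4 the sequence stays inside [0, 1] but does not settle, which is the oscillating, non-repeating behavior the text relies on to vary the perturbation rate.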
In the equation, max-it denotes the maximum number of iterations, $t$ denotes the current iteration, and the value of $pr$ is initialized randomly between 0 and 1. CSMO is used to select optimal features, called codewords, from the features extracted by SURF. To this end, it clusters the features, and the centroid of each resulting optimal cluster is called a codeword. CSMO employs the intra-class variance as the fitness function to generate optimal codewords, and the number of clusters is called the codebook size. Let $S$ denote the number of features $(x_1,x_2,\ldots,x_S)$ to be grouped into $n$ clusters $(C_1,C_2,\ldots,C_n)$. Thus, each individual in the CSMO population has $n$ decision parameters that denote the centroids of the $n$ clusters. To estimate the fitness value of each individual, every extracted feature is assigned to the class $C_j$ to which it has the minimal Euclidean distance.
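The fitness evaluation described above can be sketched as follows. Each feature is assigned to its nearest centroid by Euclidean distance, and the fitness is taken as the mean squared distance to the assigned centroid. The exact definition of intra-class variance is not given in the text, so this form, the function name, and the toy data are assumptions made for illustration.

```python
import math


def intra_cluster_variance(features, centroids):
    """Assign each feature to its nearest centroid and return
    (mean squared distance to the assigned centroid, labels).

    The mean-squared-distance form of the variance is an assumption.
    """
    labels = []
    total = 0.0
    for x in features:
        dists = [math.dist(x, c) for c in centroids]
        j = min(range(len(centroids)), key=dists.__getitem__)
        labels.append(j)
        total += dists[j] ** 2
    return total / len(features), labels


# Toy example: S = 4 features, n = 2 candidate centroids (one individual).
features = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids = [(0.0, 1.0), (10.0, 1.0)]
fitness, labels = intra_cluster_variance(features, centroids)
```

In this toy case every feature lies at distance 1 from its nearest centroid, so the fitness is 1.0; an optimizer such as CSMO would search for centroids that make this value smaller.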
The CSMO algorithm iterates until the termination condition is met, minimizing the intra-class variance. After the completion of CSMO, the individual with the best fitness value is taken as the output, and the returned cluster centroids are called codewords [16]. After the codebook is created, all the images are mapped to these codewords and represented as histograms of codewords. Fig. 1 depicts the flowchart of the SMO technique.
Figure 1: Flowchart of SMO
Following the feature selection process, the next stage is to carry out the classification process using the KRR model. In benchmark ridge regression (RR) [17], the role of the hidden neurons is to map the input layer to the hidden layer; that is, the hidden layer of RR maps data from the data space to a high-dimensional space in which each dimension corresponds to a hidden neuron. Therefore, the efficiency of RR depends mainly on the hidden layer. To avoid the above-mentioned hidden layer selection issue, KRR was used for classifying each microarray medical data set. Fig. 2 showcases the structure of KRR. In KRR, a positive regularization coefficient $C$ is introduced to make the model more stable and better generalized [18]. Here, the resultant weight $\beta$ can be formulated by
where $C$ indicates the regularization coefficient, $H$ represents the hidden neuron output matrix, and $T$ denotes the output matrix.
Now, rather than knowing the hidden neuron feature mapping $h(x)$, its corresponding kernel $k(u,v)$ is computed. The absence of $L$, i.e., the hidden layers, in KRR additionally simplifies the KRR computational procedure. The kernel matrix based on the Mercer condition is described by the following equation
Figure 2: RR structure
Therefore, the output of kernel ridge regression is expressed by
Let $\theta_{RR}=HH^{T}$ and let $k(x_i,x_j)$ be the kernel function of the hidden neurons of a single hidden layer feedforward neural network (SLFN). Among the distinct kernel functions satisfying the Mercer condition, the radial basis function kernel (RBFK) and the wavelet kernel are considered in this study. The RBFK is a local kernel function in which $\lambda$ and $Y$ are used as the parameters. The complex wavelet kernel function, in turn, employs the vector $[d,e,f]$ as its parameters. Depending on the data set, proper tuning of these parameters and an appropriate choice of kernel function are needed to obtain optimal outcomes.
Radial basis kernel
Wavelet kernel
KRR is beneficial compared with RR since there is no need to know the hidden neuron feature mapping or to set the number of hidden layers $L$. It attains good generalization, is faster than the support vector machine (SVM), and is more stable than RR.
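The equations for the weight, the kernel matrix, and the output are not reproduced in this text. The sketch below therefore assumes the standard regularized kernel form: coefficients are obtained by solving $(I/C + K)\,\alpha = T$ with $K_{ij}=k(x_i,x_j)$, and a new point is predicted as $\sum_i \alpha_i\,k(x, x_i)$. The RBF kernel form $\exp(-\gamma\|a-b\|^2)$, the parameter name `gamma`, the function names, and the synthetic data are all illustrative assumptions rather than the paper's settings.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2); gamma is an assumed parameter name."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_fit(X, T, C=100.0, gamma=1.0):
    """Solve (I / C + K) alpha = T, where K is the kernel matrix on X."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(np.eye(len(X)) / C + K, T)


def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Prediction is the kernel row against the training set times alpha."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha


# Synthetic one-dimensional regression problem (not the rainfall data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
T = np.sin(X[:, 0])
alpha = krr_fit(X, T, C=1e3, gamma=0.5)
pred = krr_predict(X, alpha, X, gamma=0.5)
```

Note that no hidden-layer size appears anywhere in the code: only `C` and the kernel parameter need to be chosen, which is what the parameter tuning stage is for.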
To properly tune the parameters involved in the KRR model, the SOS algorithm is applied. SOS is a population-based metaheuristic inspired by natural ecosystems and was introduced by Cheng and Prayogo [19]. SOS exploits the symbiotic relationships between two dissimilar species. The symbiotic relationships that are common in the real world are parasitism, mutualism, and commensalism. Mutualism is an interdependent relationship between two organisms in which both benefit from the interaction. The relationship between flowers and bees is an example of mutualism: bees move among the flowers and collect nectar, which becomes honey [20-23]. This activity also benefits the flowers, as it assists them in pollination. The mutualism phase is modeled as follows:
where $P_j$ denotes an organism chosen randomly to interact with $P_i$, and $P_i$ represents the $i$-th member of the population.
The two organisms interact on a mutual basis to survive in the ecosystem. MV denotes the mutual vector, $rnd$ denotes a random value drawn from a uniform distribution over $[0,1]$, $k$ denotes the generation, $P_{best}$ represents the best organism attained in the $k$-th generation, and $BF$ indicates the benefit factor [24]. MV and $BF$ are evaluated by the following equations [24]:
The round operator is used to set the value of $BF$ to 1 or 2. $BF$ identifies whether an organism benefits fully or partially from the interaction between individuals of the population.
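The mutualism equations themselves are not reproduced in this text. The sketch below assumes the commonly stated SOS formulation: the mutual vector is the midpoint of the two organisms, each benefit factor is `round(1 + rnd)`, and each organism moves toward $P_{best}$ relative to the mutual vector scaled by its benefit factor. Drawing a separate random value per dimension, the function name, and the toy vectors are illustrative assumptions.

```python
import random


def mutualism_step(p_i, p_j, p_best, rng=random):
    """One mutualism update for organisms p_i and p_j (assumed SOS form)."""
    # Mutual vector: midpoint of the two interacting organisms.
    mv = [(a + b) / 2.0 for a, b in zip(p_i, p_j)]
    # Benefit factors: round(1 + rnd) yields either 1 or 2.
    bf1 = round(1 + rng.random())
    bf2 = round(1 + rng.random())
    new_i = [a + rng.random() * (b - m * bf1) for a, b, m in zip(p_i, p_best, mv)]
    new_j = [a + rng.random() * (b - m * bf2) for a, b, m in zip(p_j, p_best, mv)]
    return new_i, new_j, mv, (bf1, bf2)


# Toy two-dimensional organisms, e.g. candidate (C, kernel parameter) pairs.
rng = random.Random(0)
new_i, new_j, mv, bfs = mutualism_step([0.0, 2.0], [2.0, 4.0], [1.0, 1.0], rng)
```

When SOS is used for parameter tuning, each organism would encode one candidate setting of the KRR parameters, and its fitness would be the resulting prediction error.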
This section investigates the performance of the CSMO-OKRR technique under distinct batch sizes (BS) and sample counts. The results, provided in Tab. 1 and Fig. 3, show that the CSMO-OKRR technique offers effective predictive outcomes.
Table 1: Result analysis of CSMO-OKRR technique with different batch sizes
For instance, under 100 samples with BS-8, the CSMO-OKRR technique predicted a rainfall of 0.04743 against an actual rainfall of 0.04922. Under 500 samples with BS-8, it predicted 0.06359 against an actual value of 0.06460. Under 1000 samples with BS-8, it predicted 0.05060 against an actual value of 0.04932. Under 1500 samples with BS-8, it predicted 0.05824 against an actual value of 0.05671. Finally, under 2000 samples with BS-8, it predicted 0.06079 against an actual value of 0.06125.
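As a quick worked check of how close these BS-8 predictions are, the absolute difference between each predicted and actual value quoted above can be computed directly. The values are copied from the text; the variable names are illustrative.

```python
# (predicted, actual) rainfall under BS-8, keyed by sample count.
pairs = {
    100: (0.04743, 0.04922),
    500: (0.06359, 0.06460),
    1000: (0.05060, 0.04932),
    1500: (0.05824, 0.05671),
    2000: (0.06079, 0.06125),
}

# Absolute prediction error per sample count, rounded to five decimals
# to match the precision of the reported values.
abs_err = {
    n: round(abs(predicted - actual), 5)
    for n, (predicted, actual) in pairs.items()
}
```

The resulting differences are 0.00179, 0.00101, 0.00128, 0.00153, and 0.00046 for 100, 500, 1000, 1500, and 2000 samples, respectively.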
Fig. 4 showcases the predicted rainfall values under varying sample counts and BSs. Tab. 2 provides a comparative study of the CSMO-OKRR technique in terms of MSE under distinct BSs. The experimental values demonstrate that the CSMO-OKRR technique attains low MSE values for all samples. For instance, with BS-8 and 100 samples, the CSMO-OKRR technique offered MSE values of 0.00179, 0.00106, 0.00046, 0.00060, 0.00018, and 0.00053, respectively. With 200 samples, it offered MSE values of 0.00012, 0.00073, 0.00000, 0.00026, 0.00017, and 0.00075, respectively. With 300 samples, it obtained MSE values of 0.00146, 0.00071, 0.00011, 0.00058, 0.00013, and 0.00032, respectively. With 400 samples, it offered MSE values of 0.00179, 0.00184, 0.00070, 0.00002, 0.00017, and 0.00018, respectively.
Figure 3: Result analysis of CSMO-OKRR technique
Figure 4: Prediction analysis of CSMO-OKRR technique
Table 2: MSE analysis of CSMO-OKRR technique with varying BS
With 500 samples, the CSMO-OKRR technique offered MSE values of 0.00101, 0.00229, 0.00106, 0.00060, 0.00056, and 0.00064, respectively. With 600 samples, it obtained MSE values of 0.00047, 0.00198, 0.00030, 0.00017, 0.00045, and 0.00019, respectively. With 700 samples, it offered MSE values of 0.00085, 0.00194, 0.00090, 0.00085, 0.00065, and 0.00043, respectively. With 800 samples, it offered MSE values of 0.00053, 0.00211, 0.00005, 0.00067, 0.00055, and 0.00000, respectively. With 900 samples, it obtained MSE values of 0.00142, 0.00002, 0.00124, 0.00056, 0.00002, and 0.00044, respectively. With 1000 samples, it offered MSE values of 0.00128, 0.00002, 0.00124, 0.00056, 0.00002, and 0.00044, respectively.
Finally, a comparative MSE and RMSE analysis of the CSMO-OKRR technique with recent methods is offered in Tab. 3 [25]. Fig. 5 illustrates the comparative MSE analysis of the CSMO-OKRR technique with existing techniques. The figure shows that the convolutional neural network (CNN) model produced the least effective outcome, with the maximum MSE of 0.000437. Next, the modified synchronous reference frame (MSRF) with DL techniques, named MSRF-DL (600) and MSRF-DL (500), resulted in slightly reduced MSE values of 0.000350 and 0.000353, respectively. These were followed by the MSRF-DL (300) and SRF-DL (800) techniques, which achieved competitive MSE values of 0.000331 and 0.000339, respectively. However, the CSMO-OKRR technique resulted in the minimal MSE of 0.000300.
Table 3: Comparative analysis of CSMO-OKRR technique with existing approaches in terms of MSE and RMSE
Fig. 6 showcases the comparative RMSE analysis of the CSMO-OKRR technique with existing algorithms. The figure shows that the CNN approach produced the least effective outcome, with the highest RMSE of 0.0209. The MSRF-DL (600) and MSRF-DL (500) methods resulted in somewhat lower RMSE values of 0.0187 and 0.0188, respectively. Likewise, the MSRF-DL (300) and SRF-DL (800) techniques achieved competitive RMSE values of 0.0182 and 0.0184, respectively. Finally, the CSMO-OKRR technique resulted in the lowest RMSE of 0.0173.
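Since RMSE is the square root of MSE, the two sets of reported values can be cross-checked against each other. The short sketch below does this for the figures quoted in the text; the dictionary layout and the tolerance of half a unit in the fourth decimal place are illustrative choices.

```python
import math

# (MSE, RMSE) pairs as reported in the comparison with existing methods.
reported = {
    "CNN": (0.000437, 0.0209),
    "MSRF-DL (600)": (0.000350, 0.0187),
    "MSRF-DL (500)": (0.000353, 0.0188),
    "MSRF-DL (300)": (0.000331, 0.0182),
    "SRF-DL (800)": (0.000339, 0.0184),
    "CSMO-OKRR": (0.000300, 0.0173),
}

# A pair is consistent if sqrt(MSE) matches the reported RMSE to four decimals.
consistent = {
    name: abs(math.sqrt(mse) - rmse) < 5e-5
    for name, (mse, rmse) in reported.items()
}

# Method with the lowest reported MSE.
best = min(reported, key=lambda name: reported[name][0])
```

All six pairs agree to the reported precision, and the lowest MSE belongs to CSMO-OKRR, in line with the comparison described above.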
From the above tables and figures, it is evident that the CSMO-OKRR technique achieves more effective rainfall prediction performance than the other techniques. A detailed analysis of the CSMO-OKRR technique against recently presented models was also carried out in terms of different evaluation measures. The obtained results demonstrate the improved outcomes of the CSMO-OKRR technique over the other techniques. The proposed CSMO-OKRR technique resulted in the lowest MSE and RMSE values of 0.000300 and 0.0173, respectively. Therefore, the CSMO-OKRR technique can be used as an effective tool for rainfall prediction.
Figure 5: MSE analysis of CSMO-OKRR technique with existing methods
Figure 6: RMSE analysis of CSMO-OKRR technique with existing methods
This paper has presented an effective CSMO-OKRR technique for rainfall prediction. The technique involves pre-processing, feature selection, prediction, and SOS-based parameter tuning, and it is intended to predict rainfall accurately from weather data. First, the CSMO algorithm derives a useful subset of features and reduces the computational complexity. Next, the KRR model predicts rainfall from the weather data. Finally, the SOS algorithm tunes the parameters of the KRR model, which considerably enhances its predictive outcomes. A series of simulations was performed to demonstrate the performance of the CSMO-OKRR approach with respect to different measures, and the results showed improved outcomes compared with existing techniques. Therefore, the CSMO-OKRR technique can be used as an effective tool for rainfall prediction. In the future, hybrid DL models can be used to improve the outcomes.
Acknowledgement: This work was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, under Grant No. (D-356-611-1443). The authors, therefore, gratefully acknowledge DSR technical and financial support.
Funding Statement: This work was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, under Grant No. (D-356-611-1443).
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
Computers, Materials & Continua, 2022, Issue 8