Yunan Qiu, Zhenyu Lu, and Shanpu Fang
Dear editor,
Regional precipitation, an important component of the hydrological system, plays a key role in the whole water cycle [1]. Dramatic changes in regional precipitation over a short period can seriously affect the local ecological environment and daily life. Short-term heavy precipitation refers to precipitation events with rainfall of more than 20 mm in one hour or 50 mm in three hours [2].
Accurate and timely prediction of future precipitation is therefore an important research topic in meteorology. However, precipitation prediction has always been challenging because of its complex spatial and temporal dependence.
Over time, cloud water content changes gradually, which in turn affects precipitation at the next moment [3].
In addition, because of wind, the current precipitation is related to the past precipitation of surrounding areas.
Many methods for precipitation prediction have been proposed in the computer science field. Some consider temporal correlation, including [4], [5], but these methods capture only the dynamic changes of the data and ignore spatial dependence. Others describe spatial features by introducing convolutional neural networks (CNN) into the spatial modeling network [6], [7]. However, CNNs are typically applied to Euclidean data [8], such as images and regular grids, which does not match the distribution of automatic weather stations in urban and rural areas, so they do not work well on this problem.
To address this problem, we use a spatiotemporal convolution network (STCN) method to predict precipitation from automatic station data. Our contributions are threefold: 1) We process the data and establish precipitation datasets for 70 automatic stations in Jiangsu Province. 2) We propose the EEMD-STCN model, which integrates a graph convolution network (GCN), a gated recurrent unit (GRU), and ensemble empirical mode decomposition (EEMD). The model can be used for both single-step and multi-step prediction. 3) We apply the proposed method to the established dataset; the results show that it expresses the time-series relationship of short-term precipitation more effectively and achieves higher prediction accuracy.
Methodology:
Constructing adjacency matrix: To construct the relationship matrix between automatic stations, we calculate the pairwise straight-line distance between automatic station sensors from their longitude and latitude, and use a Gaussian kernel with a threshold to establish the adjacency matrix [9].
Given the nature of the problem, we keep only the positive entries of the distance matrix and the correlation matrix, and set the rest to 0.
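The thresholded Gaussian kernel described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the kernel form exp(-d²/σ²), the choice of σ as the standard deviation of all pairwise distances, the threshold value 0.1, and the use of the haversine formula for station distance are all assumptions, and the three coordinates are approximate city locations used only as stand-ins for real stations.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def gaussian_adjacency(coords, threshold=0.1):
    """Build a thresholded Gaussian-kernel adjacency matrix from (lat, lon) pairs.

    Assumed form: w_ij = exp(-d_ij^2 / sigma^2) if that value >= threshold, else 0,
    with sigma the standard deviation of all pairwise distances.
    """
    n = len(coords)
    dist = [[haversine_km(*coords[i], *coords[j]) for j in range(n)] for i in range(n)]
    flat = [d for row in dist for d in row]
    mean = sum(flat) / len(flat)
    sigma = math.sqrt(sum((d - mean) ** 2 for d in flat) / len(flat))
    if sigma == 0.0:
        raise ValueError("all stations coincide; the kernel width is undefined")
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            w = math.exp(-(dist[i][j] ** 2) / (sigma ** 2))
            adj[i][j] = w if w >= threshold else 0.0  # drop weak (distant) links
    return adj

# Hypothetical stand-in coordinates (approximate city locations, not the station list).
coords = [(32.06, 118.80), (31.30, 120.58), (34.26, 117.18)]
adj = gaussian_adjacency(coords)
for row in adj:
    print([round(w, 3) for w in row])
```

With these stand-in points, the nearest pair keeps a nonzero weight while the most distant pair is cut to 0 by the threshold, which is the sparsifying effect the threshold is meant to have.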
Improved STCN model with EEMD:
• Temporal graph convolutional network: The temporal graph convolutional network (TGCN) is a spatiotemporal forecasting algorithm for traffic flow proposed by Zhao et al. in 2019 [11]. Our prediction task is similar to traffic flow prediction in two respects: 1) Similar problem objectives: predict the values at the next few moments by analyzing the temporal and spatial relationship between the current station data and the surrounding station data; 2) Similar data composition: equally spaced numerical data from multiple stations over a certain time range, together with the spatial relationship between stations. Therefore, we improve the model with EEMD to make it more suitable for precipitation prediction.
• Ensemble empirical mode decomposition: In practice, the predicted value at time t is often similar to the real value at time t−1. That is, the model tends to take the real value at the previous time step as the predicted value for the next one, which produces a lag (hysteresis) between the predicted and real curves. In this case, the model is effectively using only the precipitation one hour before the time to be predicted, rather than learning a non-linear mapping from the patterns in the input data. Ensemble empirical mode decomposition (EEMD) decomposes a complex signal into a finite number of intrinsic mode functions (IMF). Each IMF contains local characteristics of the original signal at a different time scale. Because EEMD adapts automatically to non-stationary time series through the IMFs [12], it can express the time-series relationship of short-term precipitation more effectively and yield higher prediction accuracy.
• Spatiotemporal convolution network with EEMD: To capture the spatiotemporal characteristics of the data and reduce the impact of autocorrelation on the results, we propose a spatiotemporal convolution model combined with EEMD, as shown in Fig. 1.
The specific calculation process of each IMF is as follows:
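As a point of reference, the T-GCN cell of [11] is commonly written as below. Treating each IMF as the input X_t of an unmodified cell is an assumption of this sketch, and the symbols (\hat{A}, W_0, W_1, W_u, W_r, W_c, b_u, b_r, b_c) follow the usual T-GCN notation rather than anything defined in this letter.

```latex
% Two-layer graph convolution on the station graph (A: adjacency matrix, I: identity)
\tilde{A} = A + I, \qquad
\hat{A} = \tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}, \qquad
f(X_t, A) = \sigma\!\left(\hat{A}\,\mathrm{ReLU}\!\left(\hat{A} X_t W_0\right) W_1\right)

% GRU update with the graph-convolved input in place of the raw input
u_t = \sigma\!\left(W_u\,[\,f(X_t, A),\, h_{t-1}\,] + b_u\right)
r_t = \sigma\!\left(W_r\,[\,f(X_t, A),\, h_{t-1}\,] + b_r\right)
c_t = \tanh\!\left(W_c\,[\,f(X_t, A),\, (r_t \odot h_{t-1})\,] + b_c\right)
h_t = u_t \odot h_{t-1} + (1 - u_t) \odot c_t
```

Here \tilde{D} is the degree matrix of \tilde{A}, u_t and r_t are the update and reset gates, c_t is the candidate state, and h_t is the hidden state passed to the next time step.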
In summary, the proposed model can deal with complex spatial dependence and temporal dynamic changes, and reduce the impact of autocorrelation on the results.
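The hysteresis effect described above can be checked numerically by asking which time shift best aligns a prediction curve with the observations. The sketch below is illustrative only: the function names, the `max_lag` parameter, and the synthetic hourly values are assumptions introduced here, and the "persistence" forecast simply copies the previous observation.

```python
def rmse_at_lag(truth, pred, lag):
    """RMSE between pred[t] and truth[t - lag] over the overlapping time range."""
    pairs = [(pred[t], truth[t - lag]) for t in range(lag, len(truth))]
    return (sum((p - y) ** 2 for p, y in pairs) / len(pairs)) ** 0.5

def best_lag(truth, pred, max_lag=3):
    """Return the shift (in time steps) at which the prediction best matches the truth.

    A result of 0 means the curves are aligned; a result of 1 means the prediction
    mostly reproduces the previous observation, i.e. it lags by one step.
    """
    return min(range(max_lag + 1), key=lambda k: rmse_at_lag(truth, pred, k))

# Synthetic hourly precipitation (mm), not taken from the dataset in this letter.
truth = [0.0, 0.0, 1.2, 5.4, 9.8, 4.1, 0.6, 0.0, 0.0, 2.3, 7.5, 3.0]
persistence = [truth[0]] + truth[:-1]  # forecast for hour t is the value at hour t-1

print(best_lag(truth, persistence))  # prints 1: the forecast lags by one step
print(best_lag(truth, truth))        # prints 0: a perfect forecast has no lag
```

A model whose output yields a best lag of 1 on held-out data is behaving much like the persistence forecast, which is the failure mode the EEMD preprocessing is intended to reduce.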
Fig. 1. The overall framework of the proposed model.
Experiments: Setups: We selected the hourly precipitation of 67 automatic stations (excluding three remote stations) in Jiangsu Province between June and September from 2016 to 2019 as the original dataset. For missing and abnormal values caused by automatic station failures, we apply linear interpolation based on neighboring time steps.
In the experiment, we use 80% of the data as the training set and the remaining 20% as the test set. We forecast the precipitation in Jiangsu Province in the next 1 hour, 2 hours, and 3 hours. We set the learning rate to 0.001, the batch size to 64, the number of training epochs to 100, the number of hidden units to 64, and the number of IMFs to 13.
We use the following indicators to evaluate the prediction performance of the proposed model: root mean squared error (RMSE), mean absolute error (MAE), coefficient of determination (R²), and threat score (TS) [13].
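The gap-filling step can be sketched as below for a single station series. This is a minimal illustration, not the authors' preprocessing code: marking missing or abnormal hours with `None` and copying the nearest valid value into gaps at the very start or end of the series are both assumptions made here.

```python
def interpolate_gaps(series):
    """Fill None entries by linear interpolation between the nearest valid neighbours.

    Gaps at the start or end have a neighbour on one side only, so they are filled
    with the nearest valid value (an assumption of this sketch).
    """
    filled = list(series)
    valid = [i for i, v in enumerate(filled) if v is not None]
    if not valid:
        raise ValueError("series has no valid observations")
    for i in range(valid[0]):                      # leading gap
        filled[i] = filled[valid[0]]
    for i in range(valid[-1] + 1, len(filled)):    # trailing gap
        filled[i] = filled[valid[-1]]
    for left, right in zip(valid, valid[1:]):      # interior gaps
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled

# Hypothetical hourly readings (mm) with two failed hours.
print(interpolate_gaps([1.0, None, None, 4.0]))  # prints [1.0, 2.0, 3.0, 4.0]
```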
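The four indicators can be computed as in the sketch below, using their standard definitions. The threat score is taken as hits / (hits + misses + false alarms) for events at or above a rainfall threshold; the threshold value and the sample numbers are hypothetical and are not the ones used in the experiments.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def threat_score(y_true, y_pred, threshold):
    """TS = hits / (hits + misses + false alarms) for events >= threshold."""
    hits = misses = false_alarms = 0
    for t, p in zip(y_true, y_pred):
        observed, forecast = t >= threshold, p >= threshold
        if observed and forecast:
            hits += 1
        elif observed:
            misses += 1
        elif forecast:
            false_alarms += 1
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")  # undefined when no event occurs

# Hypothetical observed and predicted hourly precipitation (mm).
y_true = [0.0, 2.0, 4.0, 6.0]
y_pred = [0.0, 3.0, 3.0, 6.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
print(threat_score(y_true, y_pred, threshold=2.5))
```

Unlike RMSE and MAE, TS ignores correct forecasts of "no event", so it is not inflated by the many dry hours in a precipitation record.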
Experimental results: Firstly, we compare the prediction results of the model with the related deep learning models: graph convolutional network (GCN) [14], gated recurrent unit (GRU) [15], and TGCN.
Table 1 shows the predictions of precipitation in the next 1 hour, 2 hours, and 3 hours by the proposed model and related models. The table shows that the proposed model performs well on most evaluation indexes, especially those for heavy precipitation at longer lead times.
Fig. 2 shows the predicted values and the real values at a single site. In this comparison, EEMD-STCN is the closest to the real values, and its accuracy shows almost no decline for the next two and three hours. The trend of the TGCN prediction for the next hour is close to the curve of the actual values, but its predictions for the next two and three hours become significantly worse. Compared with the actual values, the predicted values of GCN and GRU show an obvious delay and cannot reflect the real pattern of the precipitation data.
Table 1. Prediction of the Next 3 Hours of Precipitation by the Proposed Model and Related Models
Fig. 2. Prediction results of different models on precipitation in next 1 hour, 2 hours, and 3 hours.
Then, we compare the prediction results of the proposed model with two numerical prediction models: T639_L60 [16] and GRAPES_MESO [17].
Table 2 shows the prediction results of the proposed model and the two numerical models for the accumulated precipitation in the next three hours at a given time. * means that the values are small enough to be negligible. The table shows that the proposed model performs well on most evaluation indexes, especially the indexes for medium and high precipitation levels.
Table 2. Prediction of 3 Hours of Precipitation by the Proposed Model and Numerical Models
Figs. 3 and 4 compare the numerical models with the prediction results of the proposed model at different times during large-scale moderate precipitation in Jiangsu Province. The figures show that the precipitation zone predicted by the proposed model is basically consistent with the observations, and that the model has some predictive ability for medium- and high-intensity precipitation.
Fig. 3. Comparison of forecast results with GRAPES at 2019/09/29 12:00–15:00.
Fig. 4. Comparison of forecast results with T639 at 2019/09/16 2:00–5:00.
Conclusions: This paper presents a short-term precipitation prediction model based on a spatiotemporal convolution network and ensemble empirical mode decomposition. A series of comparative experiments shows that the proposed model can handle complex spatial dependence and temporal dynamics and obtains better prediction results. In the future, we will consider introducing an attention mechanism that allows the model to better capture important temporal and spatial characteristics and to improve prediction accuracy for extreme values.
Acknowledgments: This work was supported by the National Natural Science Foundation of China (61773220) and the Key Program of the National Natural Science Foundation of China (U20B2061).
IEEE/CAA Journal of Automatica Sinica, 2022, Issue 4