Chao Ren , Chunran Zou , Zehui Xiong , Han Yu ,Zhao-Yang Dong , and Niyato Dusit
Dear Editor,
This letter presents a novel and efficient adversarial robustness verification method for tree-based smart grid dynamic security assessment (DSA).Based on tree algorithms technique, the data-driven smart grid DSA has received significant research interests in recent years.However, the well-trained tree-based DSA models with high accuracy are always vulnerable caused by some physical noises or attacks, which can misclassify the DSA results.Only with the accuracy index is not enough to represent the performance of the treebased DSA models.To provide formal robustness guarantee and select the trusted tree-based DSA models, this letter proposes an efficient adversarial robustness verification strategy with a sound robust index to quantify the ability of tree-based DSA models against any adversarial attack.Analysis results verifies the proposed strategy can achieve up to ~564X speedup.
Tree-based data-driven models have been identified as a promising approach to achieve real-time stability assessment of power grids[1].With the real-time measurements, the well-trained tree-based model in the offline process can directly deliver the highly accurate stability assessment result [2].However, due to some practical issues,such as false data injection, noise manipulation, communication error,or even with cyber-attack by adversarial attack algorithms [3]-[6],the tree-based models designed for power systems may be vulnerable to these scenarios and make the misclassification, which can verify that high accuracy is not equal to good robustness and tree-based models only with the high accuracy performances cannot guarantee to be valid all the time.Thus, the robustness and security of the treebased models have become a severe concern [7].
This letter proposes an efficient adversarial robustness verification strategy to evaluate the ability of tree-based models to resist adversarial attack algorithms, and systematically analyzes the acceleration for ensemble trees.Besides, an attack-independent sound robust index is designed, which can provide formal robustness verification for safety-critical applications, such as DSA.
Related work: For DSA problem, given the fault database, the data-driven DSA models can be trained by various tree algorithms.The feature inputs to the DSA models are P/Q power generation, load demand, and bus voltage magnitudes; the output is the corresponding stability status [6].Based on tree structures, tree algorithms can be divided into single decision tree and ensemble tree.For tree with ensemble learning, it consists of bagging and boosting methods.The bagging aims to train weak learners based on bootstrap sampling set in parallel, and such method includes random forest (RF) and extra tree (ET).The core idea of boosting is to promote weak learners to strong learners, and it includes adaptive boosting (AdaBoost), gradient boosting (GBDT) and extreme gradient boosting (XGBoost).
Notations and problem description: In order to provide formal robustness guarantee for the different tree-based DSA models (single or ensemble), an efficient adversarial robustness verification strategy with a precise robust index is proposed to quantify the ability of treebased DSA models against any adversarial attack.For ease of notation, the proposed adversarial robustness verification tree-based(ARVT) strategy for single tree-based and ensemble tree-based DSA model are referred as ARVT-S and ARVT-E, respectively.The purpose of ARVT is to measure the distance from the original input to the closest box decision boundary with the high computational efficiency, especially for the ensemble tree-based DSA models.We firstly introduce ARVT-S how to exactly calculate for verifying the single tree-based model, then we convert robustness verification for ensemble tree-based model into the max-clique problem on a multipartite graph with bounded boxicity.
Note that directly solving (1) cannot ensure to achieve the minimal adversarial perturbation due to the non-convexity.Therefore, adversarial attack algorithms can only obtain the upper bound ofz, which can not provide a sound safety guarantee, even if the attack fails to obtain the adversarial examples, it does not mean no adversarial example exists.
Adversarial robustness verification: It aims to determine whether exists the adversarial examples within a radiuszfixed ball region around x,Ball∞(x,z)={xˆ ∈Rm|‖xˆ-x‖∞≤z}.It can be seen to determine whether (2) is true.
Note that (2) are designed to calculate the global optimal exact value,which can be regarded as the lower bound ofz, implying that no adversarial example exists withinBall(x,z).Through giving the exact“Yes/No” answer, a binary search can obtain the value ofz.Hence, it can provide a sound safety guarantee solution against any adversarial attack.
Proposed ARVT-S method: Distinguish from neural network-based models, tree-based models are non-continuous step functions, so existing robustness verification for neural networks [4] are not suitable for tree-based models.Assuming that the single decision tree hasnleaf nodes, for each given instance x withmfeatures, starting from the root node, x will traverse several internal nodes until it reaches a leaf node.Each internal nodeidecides x to pass to the left or right child node via comparing the feature value and its threshold.Each leaf node has a valuevi, which indicates the predictedclass label for aclassification tree.The purpose of ARVT-S is tocalculate amdimensional box for each leaf nodes such that any instance in this box will definitely exist in this leaf node.Based on the tree structure,the box of leaf nodeiis defined by the Cartesian product [8], formalized in (3).
EachBidenotes the decision boundary of a leaf node as Fig.1.
Theorem 1: Given an instance x ∈Rmand a boxB=(l1,r1]×···×(lm,rm].The smallest ∞ - norm distance from x toBcan be calculated as
Proof for Theorem 1: Given the constraint onbj, the minimal dis-
Fig.1.Illustration of Bi for the tree-based models (Left: exist adversarial examples; Right; no adversarial example).
A higher RIT typically indicates the better robustness of tree-based model, since it is equal to have a larger decision boundary.
Simulation results: The proposed method was implemented using the MATLAB programming language, and the experimental evaluations were conducted on a computer with the following specifications: An Intel(R) Xeon(R) W-2133 CPU with a clock speed of 3.6-GHz, 16-GB RAM, and a GPU with NVIDIA GeForce RTX 3070.The analysis process is on the New England 10-machine 39-bus system to validate the performance.In the database generation process,massive operating points are generated by randomly sampling generation and load within a certain range based on Monte-Carlo method.The detailed database description can refer to [6].Eight different faults are studied which are the three-phase faults with inter-area corridor trip.Transient stability criterion is utilized to label the instances, where 60% of samples are randomly sampled for training and the remaining 40% serve as testing data for each fault.
In this case study, the maximum number of nodes in a clique is set as 2 that can ensure allGtrees are enumerated; the maximum number of binary searches for finding the largestzis set as 10 that the proposed ARVT can be verified.Verification error is the upper bound of errors under any attack, indicating that no attack can achieve more than a certain percentage error on the testing set within‖εx‖∞=0.5.If only want to get verification errors at a certain ‖ εx‖∞,just need to disable binary search via setting the maximum number of binary searches as 1 to be computed.To demonstrate the validity of the proposed ARVT with RIT for tree-based DSA models, we applied six state-of-the-art tree algorithms (DT, RF, ET, GBDT,XGBoost and AdaBoost) to compare the DSA accuracy, RIT, computational efficiency and verification error as Table 1.Among them,single DT is considered as the baseline, the number of ensemble tree is set as 200 and 1000.In order to guarantee the best DSA performance and fair comparison, all the tree-based DSA models have been well-trained and the hyper-parameters have been fine-tuned to avoid underfitting and overfitting.
It can be seen that ensemble tree-based models always have the better DSA accuracy than single DT.Although bagging ensemble methods have the better DSA accuracy than boosting ensemble methods, the RIT and verification errors of bagging ensemble methods are worse than boosting ensemble methods.It can proof that the accuracy index is not enough to represent the performance of the treebased DSA models.Besides, it is clear that the computational efficiency of ARVT strategy can achieve up to ~564X speedup compared with linear sum of several baseline.With the increasing of the number of ensemble trees, Table 1 shows the different degree of increase in RIT and speedup and decline in verification errors.
Fig.2.Illustration of ARVT-S.Combine the different boxes of different single trees, and then convert them into the graph layer form.
Conclusions: In this letter, an ARVT strategy is proposed to characterize the robustness of both single and ensemble tree-based DSA models against any adversarial attack, hence, to select the more trusted tree-based DSA models.Analysis results have demonstrated ARVT strategy can achieve up to ~500X speed up.To the best of our knowledge, similar works have not been reported in the literature,and the proposed ARVT can also provide the formal robustness verification for other safety-critical data-driven problems in power engineering.
Acknowledgments: This work was supported in part by the Internal Talent Award with Wallenberg-NTU Presidential Postdoctoral Fellowship 2022, the National Research Foundation, Singapore and DSO National Laboratories under the AI Singapore Program(AISG2-RP-2020-019), the Joint SDU-NTU Centre for AI Research(C-FAIR), the RIE 2020 Advanced Manufacturing and Engineering(AME) Programmatic Fund, Singapore (A20G8b0102), and NOE Tier 1 Projects (RG59/22 & RT9/22).
IEEE/CAA Journal of Automatica Sinica2024年3期