Dan Zhang, IEEE Senior Member, Gang Feng, IEEE Fellow, Yang Shi, IEEE Fellow, and Dipti Srinivasan, IEEE Fellow
Abstract—Multi-agent systems (MASs) are typically composed of multiple smart entities with independent sensing, communication, computing, and decision-making capabilities. Nowadays, MASs have a wide range of applications in smart grids, smart manufacturing, sensor networks, and intelligent transportation systems. Control of MASs is often coordinated through information interaction among agents, which is one of the most important factors affecting coordination and cooperation performance. However, unexpected physical faults and cyber attacks on a single agent may spread to other agents via information interaction very quickly, and thus could lead to severe degradation of the whole system performance and even destruction of MASs. This paper is concerned with the safety/security analysis and synthesis of MASs arising from physical faults and cyber attacks, and our goal is to present a comprehensive survey of recent results on fault estimation, detection, diagnosis, and fault-tolerant control of MASs, and on cyber attack detection and secure control of MASs subject to two typical cyber attacks. Finally, the paper concludes with some potential future research topics on the security issues of MASs.
WITH rapid development of perception, communication, and computation technologies, distributed cooperative control of multi-agent systems (MASs) has received great attention from scholars in different disciplines due to its wide applications in large-scale process industries, multi-robot systems, intelligent transportation systems, sensor networks, smart grids, and internet systems [1]–[5]. In the field of intelligent transportation systems, as shown in Fig. 1, the distributed control framework of networked autonomous vehicles can provide new solutions for the safety and efficiency of transportation systems, and help solve practical problems such as traffic accidents, road congestion, energy conservation, and environmental protection [6]. Compared with traditional single-agent systems, MASs are more scalable and upgradeable, and they improve task execution efficiency and robustness owing to their inherent ability to learn and make autonomous decisions cooperatively.
Fig. 1. The coordinated vehicles in transportation systems.
The analysis and synthesis of MASs have been extensively studied in a variety of disciplines including computer science, control engineering, electrical engineering, and civil engineering. Existing survey articles present discussions on communication mechanisms of MASs [7], [8], consensus protocols of MASs [9], and communication constraints of MASs [10]. Survey works on MASs in the context of computer science are also found in [11]–[14]. However, none of these articles concentrates on the security of MASs, even though such networked systems are vulnerable to physical faults and cyber attacks [15]. It has been recently revealed in [16] that a small fault or cyber attack on an agent can degrade the performance and even paralyze the whole system. What should we do when some agents misbehave: wait, abandon, or adjust [17]? In [18], the issues of access control and trust/reputation of MASs were addressed for the security of MASs. In [19], fault-tolerant control methods of MASs were surveyed, and the attention was focused on topology reconfiguration methods. Nevertheless, there has been a large volume of work on physical safety and cyber security analysis and synthesis in the literature recently, where the issues of fault estimation, detection and diagnosis, fault-tolerant control, attack detection, and secure consensus were investigated. These studies provide some systematic perspectives and methodologies for improving the security of MASs effectively.
In this paper, we provide a comprehensive overview of recent advances in the physical safety and cyber security issues of MASs, where actuator and sensor faults are discussed in the physical safety analysis part, and the denial-of-service (DoS) attack and the deception attack are addressed in the cyber security part. The paper is organized as follows. Section II introduces salient results on the fault estimation of MASs, along with some key analysis methods. Section III addresses fault detection and diagnosis of MASs. Section IV discusses fault-tolerant control of MASs. Section V is concerned with representative cyber attacks in MASs and corresponding attack detection schemes. Section VI presents some recent results on secure control of MASs. Finally, Section VII concludes the article with some potential future research topics. The structure of this paper is depicted in Fig. 2. The major differences between the relevant survey papers [18], [19] and our paper are summarized as follows:
1) Reference [18] is concerned with the access control and reputation of MASs. Instead, we focus on the cyber security and physical safety of MASs and deal with the safety and security control problem.
2) In [19], the fault-tolerant control methods of MASs are surveyed with attention focused on the topology reconfiguration methods. In contrast, we start with the physical threat and review recent results on fault estimation, fault detection, and fault-tolerant control of MASs. Then, we turn to the cyber threat issue and analyze recent advances on two typical attacks, the DoS attack and the deception attack.
3) A more systematic and broader overview on the safety and security of MASs is given, aiming to provide a comprehensive survey on this emerging and challenging research direction.
In the past decades, there has been an increasing demand for safety and reliability of MASs, as a single fault on a sensor or actuator could lead to significant performance degradation and even the failure of the whole system. In [20], the performance of a group of unmanned autonomous vehicles (UAVs) subject to different types of actuator faults was investigated, and it was shown that consensus can still be achieved if a partial loss of effectiveness occurs in one actuator of an agent, but the transient performance could be degraded dramatically. Consensus would fail to be achieved when actuators suffer a complete loss of effectiveness. A good fault estimation scheme is capable of providing timely and precise information on any fault within the system being monitored. An effective defense mechanism can then be triggered to eliminate the effect of the fault on the system. In this section, we address the fault estimation problem, but focus only on the model-based fault estimation of MASs. Readers are referred to [21]–[23] for model-free approaches. A detailed categorization of physical threats in the study of fault estimation of MASs is given in Table I.
TABLE I CATEGORIZATION OF PHYSICAL THREATS IN THE STUDY OF FAULT ESTIMATION OF MASS
Online fault estimation of MASs is challenging due to the complex interactions among agents. Consider a homogeneous MAS with N agents, in which each agent is modeled by a linear system with given system matrices. In [24], the following system was introduced:
It is worth pointing out that the local fault in the i-th agent cannot be estimated by its neighbors in the aforementioned results, as only the local state and fault were augmented for estimation. Recently, the distributed fault estimator design for a class of Lipschitz nonlinear MASs was investigated in [32], where a new augmented state vector including the local state and fault together with the neighboring states and faults was constructed. The observer designed therein is capable of providing a good estimate of faults in both the local agent and its neighbors. The common limitation is still the computational burden incurred when the augmentation technique is used.
It is worth pointing out that all the above studies [24]–[32], [34]–[36] focused on homogeneous MASs. In reality, most multi-agent systems have different agent dynamics, e.g., trucks, buses, and cars in transportation systems. Therefore, distributed fault estimation of heterogeneous MASs has received increasing attention. For a class of linear discrete-time heterogeneous MASs, a distributed l1-norm-based optimization method was introduced to estimate the state and fault simultaneously in [33]. It was shown that if and only if the number of faulty agents is smaller than half of the number of agents and the following optimization problem:
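The augmentation idea discussed above can be illustrated with a minimal sketch for a single scalar discrete-time agent. The agent model x(k+1) = a x(k) + b u(k) + f, the constant additive fault f, the deadbeat observer gain, and all names and numerical values below are illustrative assumptions; they are not the designs reported in [24]–[32].

```python
# Minimal sketch (illustrative assumptions throughout): augmented-state
# observer that estimates a constant additive fault f of one scalar agent
#   x(k+1) = a*x(k) + b*u(k) + f,   y(k) = x(k).
# Augmented state z = [x, f]; the gain L = [a + 1, 1] places both
# error-dynamics eigenvalues at zero (deadbeat), so the estimate settles
# two steps after a constant fault appears.

def simulate_fault_estimation(a=0.5, b=1.0, fault=0.3, fault_step=10, steps=30):
    l1, l2 = a + 1.0, 1.0          # deadbeat observer gain (assumed design)
    x, f = 1.0, 0.0                # true state and true fault
    x_hat, f_hat = 0.0, 0.0        # observer estimates
    history = []
    for k in range(steps):
        if k == fault_step:
            f = fault              # fault switches on and stays constant
        u = -0.2 * x               # arbitrary known input (hypothetical)
        y = x                      # measured output
        innovation = y - x_hat
        # Observer update: copy of the augmented model plus output injection.
        x_hat, f_hat = (a * x_hat + b * u + f_hat + l1 * innovation,
                        f_hat + l2 * innovation)
        x = a * x + b * u + f      # true agent dynamics
        history.append((k, f, f_hat))
    return history

history = simulate_fault_estimation()
final_k, final_f, final_f_hat = history[-1]
print(f"true fault = {final_f:.3f}, estimated fault = {final_f_hat:.3f}")
```

Even in this toy case the observer order doubles from one to two; augmenting with neighboring states and faults as well enlarges the observer further, which is the computational burden noted above.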
Compared with fault estimation, traditional fault detection and diagnosis is less demanding as it only seeks to trigger an alarm signal when a fault is detected in the system (and then isolate the fault). It has been extensively adopted in many real systems such as power systems [38], mechatronic systems [39], and chemical systems [40]. We now address the fault detection and diagnosis issue of MASs, and present some recent results in this area. A detailed categorization of physical threats in the study of fault detection and diagnosis of MASs is given in Table II.
TABLE II CATEGORIZATION OF PHYSICAL THREATS IN THE STUDY OF FAULT DETECTION AND DIAGNOSIS OF MASS
Distributed fault detection for a network of second-order linear MASs was studied in [41], where a bank of unknown input observers (UIOs) was designed to detect the fault by regarding the fault as an unknown input. The faulty agent was removed from the network when it was detected by comparing the residual evaluation function with a threshold. The approach presented in [41] was feasible only if a single additive fault was present. Based on the analysis results in [41], distributed fault detection of a networked dynamical system with multiple faults was investigated in [46], where the minimum amount of information required by an agent to detect the faults was revealed. A distributed unknown input observer was also designed in [42] for a class of discrete-time high-order systems. A reduced-order unknown input observer was also applied to high-order MASs in [43]. It must be pointed out that the matching condition is a direct restriction of those observers, e.g., rank(CE) = rank(E), where C is the output matrix, and E is the weighting matrix of the unknown disturbance.
In [47], distributed fault detection for general high-order linear MASs was discussed, where the relative output information was used to construct the observer. In [44], an interval observer was proposed for a class of discrete-time MASs such that the lower and upper bounds of the state estimate were obtained; see the following interval observer designed for the i-th agent:
Note that most of the above works address only fault diagnosis, including fault estimation, detection, and isolation of physical faults in MASs; the design of control protocols is yet to be further investigated. In the following section, the issue of fault-tolerant control is discussed.
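The detection logic described above (evaluate a residual, compare it with a threshold, and remove the faulty agent from the network) can be sketched as follows. The residuals are assumed to be already generated by an observer bank; the root-mean-square evaluation function, the threshold value, the three-agent graph, and all function names are hypothetical choices made for illustration and are not taken from [41].

```python
import math

def residual_evaluation(residuals):
    """Root-mean-square evaluation of a window of scalar residual samples."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def detect_faulty_neighbors(residual_windows, threshold):
    """Return the sorted neighbor ids whose evaluated residual exceeds the threshold."""
    return sorted(j for j, window in residual_windows.items()
                  if residual_evaluation(window) > threshold)

def remove_agents(adjacency, faulty):
    """Drop faulty agents (and every edge touching them) from the graph."""
    return {i: [j for j in nbrs if j not in faulty]
            for i, nbrs in adjacency.items() if i not in faulty}

# Hypothetical residuals produced by agent 1's observer bank for its neighbors.
residual_windows = {
    2: [0.01, -0.02, 0.015, -0.01],   # healthy neighbor: noise-level residual
    3: [0.40, 0.55, 0.60, 0.58],      # faulty neighbor: persistent offset
}
adjacency = {1: [2, 3], 2: [1, 3], 3: [1, 2]}

faulty = detect_faulty_neighbors(residual_windows, threshold=0.1)
healthy_graph = remove_agents(adjacency, set(faulty))
print(faulty, healthy_graph)
```

In practice the threshold trades off false alarms against missed detections, and removing an agent is only safe if the remaining graph stays connected; neither aspect is handled in this sketch.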
The main role of fault-tolerant control is to trigger an adjustment control mechanism to deal with the faults when they are detected. Sometimes, the fault-tolerant controller is designed on the basis of the precise fault estimation information. Recent studies on this topic are summarized in this section. A detailed categorization of physical threats in the study of fault-tolerant control of MASs is given in Table III.
TABLE III CATEGORIZATION OF PHYSICAL THREATS IN THE STUDY OF FAULT-TOLERANT CONTROL OF MASS
We have made a comprehensive survey of physical fault estimation, detection, isolation, and fault-tolerant control of MASs in the previous sections. All those results deal with the physical threats in MASs. With the recent development of information technology, the communication networks of MASs are exposed to the general public, with a great risk of being attacked by adversaries. In this article, we focus only on two typical attacks: the denial-of-service (DoS) attack and the deception attack. A DoS attacker can exhaust the network or system resources of the target agent, causing the service of MASs to be temporarily interrupted, stopped, or crashed. On the other hand, wrong decisions may be made when false data are injected into sensors or actuators. In this case, a man-made fault could occur. Attack detection for networked systems has received increasing attention in the past few years, as in [71]–[73]. In this section, we focus our attention on the attack detection problem of MASs. A detailed categorization of cyber threats in the study of cyber attack detection of MASs is given in Table IV.
TABLE IV CATEGORIZATION OF CYBER THREATS IN THE STUDY OF CYBER ATTACK DETECTION OF MASS
In [78], the DoS attack detection problem was considered for a network of vehicle systems, and an augmented system including the vehicle state (position, velocity) and controller state was proposed as follows:
where di(t) is the relative distance between vehicles, vi(t) is the velocity of the i-th vehicle, and ai(t) is the acceleration signal of the i-th vehicle. τ is the unknown delay introduced to describe the duration time that the communication network was occupied by an illegal user. In order to estimate τ, the following sliding mode observer was designed:
The secure consensus control problem of MASs has been investigated in parallel with the attack detection problem in previous years. The salient results are collected and presented in this section. A detailed categorization of cyber threats in the study of secure consensus of MASs is given in Table V.
TABLE V A CONCRETE CATEGORIZATION OF CYBER THREATS IN THE STUDY OF SECURE CONSENSUS OF MASS
Some other researchers studied the DoS effect on MASs from the Markovian jumping system point of view. For example, the secure consensus of linear MASs with random attack was studied in [90], where the attack was driven by a Markov process. A state feedback secure consensus protocol consisting of two controllers was proposed for different time intervals, where one controller was designed for the MAS subject to attacks and the other one was designed for the normal MAS without attacks, i.e.,
where λ(t) is a switching signal obeying the Markov transition process. α and β are the coupling strengths for the MASs without attack and with attack, respectively. K and H are the two controller gains that were determined based on the solution of an algebraic Riccati equation and an algebraic Riccati inequality, respectively, i.e., $K = R^{-1}B^{T}P$ and $H = T^{-1}B^{T}S$ with
where δ>0 is any given scalar, and P, R, S, and T are positive definite matrices.
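As a rough illustration of a consensus protocol that switches between two controllers according to a Markov chain, the sketch below simulates scalar single-integrator agents on an undirected ring. It does not reproduce the protocol or the Riccati conditions of [90]: the agent dynamics, the graph, the transition probabilities, the effective gains standing in for αK and βH, and the forward-Euler discretization are all assumptions made for illustration.

```python
import random

def simulate_switched_consensus(steps=1000, dt=0.05, seed=0):
    rng = random.Random(seed)
    # Hypothetical 4-agent undirected ring; scalar single-integrator agents.
    neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    x = [1.0, -2.0, 0.5, 3.0]
    # Effective gains standing in for alpha*K (mode 0, no attack) and
    # beta*H (mode 1, under attack); the values are assumed.
    gain = {0: 1.0, 1: 0.3}
    # Assumed transition probabilities transition[mode][next_mode].
    transition = {0: [0.9, 0.1], 1: [0.4, 0.6]}
    mode = 0
    for _ in range(steps):
        g = gain[mode]
        # Forward-Euler step of x_i' = -g * sum_j (x_i - x_j) over neighbors j.
        x = [xi - dt * g * sum(xi - x[j] for j in neighbors[i])
             for i, xi in enumerate(x)]
        # Sample the next mode of the two-state Markov chain.
        mode = 0 if rng.random() < transition[mode][0] else 1
    return x

final_states = simulate_switched_consensus()
spread = max(final_states) - min(final_states)
print(f"final disagreement: {spread:.2e}")
```

With these assumed values the disagreement contracts in both modes, so the agents agree regardless of the realized switching sequence; the attack mode only slows convergence. In the setting of [90] the gains come from the Riccati conditions above rather than being chosen by hand.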
A similar analysis method was later extended to the observer-based control protocol in [91]. However, the attack distribution information must be known a priori for consensus protocol design, which may not be an easy task because the adversaries would try to hide their attack strategies. On the other hand, the adversaries may inject false data into the packet to mislead the agent, as the price of launching the DoS attack may be very high. We now turn our attention to the case of the deception attack. The recent advances are summarized in the following.
The secure output consensus of heterogeneous MASs with aperiodic sampling and DoS attack was studied in our earlier work [92], where the input-hold mechanism was adopted when a DoS attack occurs. In our work [102], both the agent-satellite and agent-agent communication channels were broken when an attack occurs. By introducing a piecewise signal to describe the nonuniform sampling phenomenon, a stochastic variable to characterize the occurrence of the attack, and a time-delay term to model the attack duration, a stochastic switched time-delay system was derived that is capable of modeling MASs subject to nonuniform sampling and random DoS attack. With the help of Lyapunov stability theory, some sufficient conditions were proposed such that the output consensus error system was guaranteed to be exponentially stable in the mean-square sense and achieved a prescribed H∞ performance level. Some matrix manipulation techniques were also introduced to derive the controller gain matrices. Note that the secure control protocol design was based on the precise attack probability, which is a limitation in reality. Recently, a new switched system approach was proposed for the secure consensus of heterogeneous MASs with DoS attack in [93], where a piecewise switching signal indicating different attack durations was proposed to characterize the attack strategy variation, i.e., the local position signal can be modeled as
where d(k)∈{0,1,2,...,N} is a switching signal indicating the different attack durations. A similar modeling method was also adopted for the communication interaction of agent i and its neighbors. In this case, the attack probability is not involved in the system modeling and analysis. A major limitation is the assumption that the adversary can jam all the communication channels, while in reality the adversary may only be able to jam a few channels as the system size is usually large.
Recently, the Markovian jumping system approach was proposed to model the partially unknown and uncertain attack in secure consensus of heterogeneous MASs in [103], where the agents communicate with each other periodically when there is no attack. A nonuniform sampled-data system was introduced for the case when the attack occurs; the attack duration and probability are transformed into the number of sampling periods and the transition probability of the Markovian jumping system, respectively, by assuming that the attack strategy follows a Markov chain as in [90].
As for deception attacks, the research on secure consensus of heterogeneous MASs is yet to be reported. Compared with the results on secure consensus of homogeneous systems, the regulator equation [104] is usually necessary to derive the consensus protocols; see, e.g., [92], [93], [103].
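The role of the switching signal d(k) can be sketched with a simple hold mechanism: while the channel is jammed, the agent keeps using the last successfully transmitted sample, and d(k) counts how many consecutive sampling instants have been lost. The function name, the convention that the value in use equals the sample taken at instant k − d(k), and the requirement that the first sample gets through are assumptions for illustration, not the exact model of [93].

```python
def hold_under_dos(samples, attacked, max_duration):
    """Return (delays, received) under an input-hold mechanism.

    delays[k] plays the role of d(k): the number of consecutive sampling
    instants for which the channel has been jammed at instant k.
    received[k] is the value actually used by the agent, samples[k - d(k)].
    """
    if attacked[0]:
        raise ValueError("assumed: the first sample is transmitted successfully")
    delays, received = [], []
    d = 0
    for k, jammed in enumerate(attacked):
        d = d + 1 if jammed else 0        # reset once a packet gets through
        if d > max_duration:
            raise ValueError("attack lasts longer than the assumed bound N")
        delays.append(d)
        received.append(samples[k - d])   # last successfully received sample
    return delays, received

samples = [1.0, 2.0, 3.0, 4.0, 5.0]
attacked = [False, True, True, False, True]
delays, received = hold_under_dos(samples, attacked, max_duration=2)
print(delays, received)
```

Because d(k) takes values in a finite set, each value can be treated as one subsystem of a switched system, which is why no attack probability is needed in this kind of modeling.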
We have presented an overview of recent advances on the physical safety and cyber security issues of MASs in this paper. In particular, we have presented the results on physical fault estimation, detection and diagnosis, fault-tolerant control, cyber attack detection, and secure control under two typical kinds of attacks: the DoS attack and the deception attack. Although many significant results have been reported on the security issue of MASs from various perspectives, the increasingly complex security situation and higher security demand have brought many new challenges to the protection of MASs. Some potential research directions are recommended as follows.
In most existing works, usually only one attack phenomenon is studied, i.e., the DoS attack and the deception attack were usually investigated separately. A single attack behavior may be easier to detect and eliminate with a well-designed detection system. In order to avoid being detected, the adversary would try to launch more sophisticated attacks such as a combination of DoS attacks, deception attacks, replay attacks [105], and stealth attacks [106]. In this scenario, how can we design the secure consensus protocol? The main challenge is that it is usually hard to precisely capture the dynamic behavior of adversaries. Furthermore, with the rapid development of artificial intelligence, some well-designed attacks may have the learning ability to evolve and mutate to avoid being detected [107]. Modeling such a sophisticated attack is a difficult task [108]. This may be one of the potential reasons that very few works on secure consensus of heterogeneous MASs with deception attack have been reported, not to mention a more sophisticated attack. A possible method to deal with those sophisticated attacks is from the game-theory perspective, which has been shown to be effective in dealing with attacks in smart grids [109], [110]. Note that it is possible to design a perfect defense system only when the attacker's dynamic behavior can be predicted and modeled accurately. Thus, the security analysis of MASs subject to more sophisticated attacks deserves further research attention, as it is the first step in dealing with attacks.
A successfully working MAS relies on healthy components and a secure communication environment. However, most current attention has focused on either physical safety or cyber security individually, instead of both simultaneously. With the development of network and communication technologies, the boundaries between the physical world and the cyber world have been blurred. Furthermore, the adversaries may exploit security vulnerabilities to gain control over some of the sensors and actuators, and MASs may crash quickly when they are subject to physical and cyber threats at the same time [111]. Therefore, a collaborative defense strategy must be established at both the physical and cyber levels to better protect MASs, which could be a second potential research direction. However, it is a challenging task, because we have not fully understood the physical world, let alone the cyber world. A possible method to deal with unknown physical and cyber security threats is to use current advanced artificial intelligence technology by collecting a large amount of running data; see the recent survey papers [112], [113].
MASs are safety-critical systems that should be designed with time-efficient attack detection and defense. In other words, malicious attack signals should be detected in a timely manner. However, most current research has focused only on whether malicious attack signals can be detected or not, with very little attention being paid to how fast malicious signals can be detected. The survey paper [9] has presented a few interesting results on finite-time consensus of MASs, which may provide new insight for the development of time-efficient defense approaches for MASs with desirable detection speed and performance. To the best of our knowledge, the problem of finite-time malicious signal detection and secure consensus has not yet been solved, which could be a good research direction in this area.
In most existing works, only theoretical results are reported with some simulations. For example, a simulation study was performed for the protection of power systems in [114]. These results are still far from real applications. Designing a real testbed to verify the effectiveness of existing theoretical results is an urgent task, as this is the first step in developing practical techniques to protect actual systems. Note that some testbeds have been designed for industrial cyber-physical systems; see the water distribution testbed [115], the power generation station testbed [116], and the teleoperation system [96]. However, the development of system testbeds is still very limited, and the focus has been only on detecting the attack without designing any defense mechanism. It would also be a challenge to consider physical threats and cyber threats simultaneously in the design of testbeds and defense strategies, which could be a fourth potential research direction.
IEEE/CAA Journal of Automatica Sinica, 2021, Issue 2