Jun LU, Fei-Yue WANG, Qi DONG†‡, Qinglai WEI
1China Academy of Electronics and Information Technology, Beijing 100049, China
2School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
Multi-agent system gaming (MASG) is widely applied in military intelligence, information networks, unmanned systems, intelligent transportation, and smart grids, exhibiting systematic and organizational characteristics. It requires that the multi-agent system perceive and act in a complex dynamic environment while balancing individual interests against the maximization of group interests within the system. Open problems include complex system structure, an uncertain game environment, incomplete decision information, and unexplainable results. As a result, the study of multi-agent games has shifted from traditional simple games to games in high-dimensional, continuous, and complex environments, which prompts an urgent need for institutionalized and systematized gaming (InSys gaming). Against this background, several important tendencies have emerged in the development of InSys gaming for multi-agent systems:
Organized and systematic MASG has orderly and structured characteristics, so it is necessary to establish a system game model. To study political, military, economic, and other systemic confrontation gaming problems, the first step is to analyze the system's internal evolution characteristics and external interaction information. In addition, establishing the evolution model of InSys gaming and studying the elements, relationships, and criteria of game evolution help provide theoretical support for system design, decision-making planning, and other research in this field.
The current mainstream artificial intelligence learning methods each have application advantages in specific scenarios. In solving InSys gaming problems, we can combine the environmental representation ability of deep learning with the decision generation ability of reinforcement learning (RL). For example, by building a digital simulation training environment, intelligent decision algorithms and unsupervised training methods can be designed to generate a multi-agent system's collaborative decisions in a complex and unknown environment.
As the scale of multi-agent systems increases, the problems of node coupling, observation uncertainty, and interaction disorder faced by collaborative decision-making have become increasingly prominent, and the complexity of solving their systematic and organized game problems has grown significantly. A multi-agent hierarchical algorithm architecture, constructed through game task decomposition, long-term planning, and real-time action decision-making, can effectively reduce the complexity of the search process of a collaborative decision-making algorithm, and is a feasible approach to solving an organized and systematic game.
For data-driven methods, when the training data deviates from the actual scenario, the algorithm's performance degrades. Thus, it is necessary to study a robustness analysis framework for data-driven methods. For example, a robust algorithm model and a fine-tuning method based on actual data can be designed to reduce the performance loss of the trained algorithm. This strategy helps support the actual deployment of data-driven methods.
Game theory has become a basic analytical framework for solving problems in strategic politics, military confrontation, the market economy, and so on. The objects of analysis are characterized by complex systematization and organization, and have attracted great attention from both academia and industry. Modeling the organized and systematic game as a multi-agent system, combined with artificial intelligence methods to solve the game decision-making problem, provides a new approach for developing theories, methods, and technologies in this field.
In this context, the journal Frontiers of Information Technology & Electronic Engineering has organized a special feature on institutionalized and systematized gaming (InSys gaming) for multi-agent systems. This special feature covers multi-agent evolutionary games, unmanned aerial vehicle (UAV) formation control, autonomous multi-agent planning, collaborative control, swarm intelligence, and multi-agent RL framework design. After a rigorous review process, eight papers have been selected, including two perspective articles and six research articles.
Jun LU and his collaborators explored the existence and practice of games starting from the process of understanding gaming, elaborated the difficult problems in multi-intelligence games posed by a complex and changing environment, the dynamic heterogeneity of systems, and the limited computational and perceptual capabilities of a single individual, and proposed a theoretical framework of multi-intelligence, multi-agent evolutionary games. The application practice of multi-agent evolutionary games was introduced with the next-generation early warning and detection system as an example, which is essential for studying organized and systematic game behavior in a high-dimensional complex environment.
To provide a quick overview of multi-agent studies with a particular focus on agent collaboration and gaming, You HE and his collaborators reviewed multi-intelligence collaboration and gaming technologies from three perspectives: task challenges, technical directions, and application areas. Typical research problems and challenges in recent work on multi-agent systems were analyzed, some promising research directions on multi-agent collaboration and gaming tasks were discussed, and outlooks on the application directions in this field were given.
Multi-agent cooperative games provide an effective tool for studying multi-agent optimal control problems, which rely on solving the coupled Hamilton–Jacobi (HJ) equations. Hongyang LI and Qinglai WEI proposed a new optimal synchronization control method with input saturation to address the coupled HJ equations that limit the application of cooperative game theory to synchronization control problems. They transformed the optimal synchronization control problem into a multi-agent nonzero-sum game problem by introducing multi-agent theory, and then solved the Hamilton–Jacobi–Bellman (HJB) equation with a non-quadratic input energy term to achieve the Nash equilibrium. Meanwhile, they proposed a new model-free off-policy RL method that allows the iterative control law to converge to the Nash equilibrium without requiring system model information, providing methodological support for synchronization control of multi-agent systems with saturated inputs.
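As a generic sketch of the kind of coupled equations involved (the notation here is ours and illustrative, not necessarily the paper's), consider an N-player nonzero-sum differential game with dynamics shared by all players; input saturation is typically encoded through a non-quadratic energy term, which bends the optimal control into a bounded tanh form:

```latex
% Dynamics: \dot{x} = f(x) + \sum_{j=1}^{N} g_j(x) u_j
% Cost of player i: J_i = \int_0^\infty \bigl( Q_i(x) + \sum_j W_{ij}(u_j) \bigr)\,dt
% Non-quadratic energy term encoding the bound |u| \le \lambda:
%   W(u) = 2 \int_0^{u} \lambda \tanh^{-1}(v/\lambda)^{T} R \, dv
\begin{align}
0 &= Q_i(x) + \sum_{j=1}^{N} W_{ij}\bigl(u_j^{*}\bigr)
   + \nabla V_i^{T}\Bigl( f(x) + \sum_{j=1}^{N} g_j(x)\, u_j^{*} \Bigr),\\
u_i^{*} &= -\lambda \tanh\!\Bigl( \tfrac{1}{2\lambda}\, R_{ii}^{-1}\, g_i^{T}(x)\, \nabla V_i \Bigr).
\end{align}
```

The coupling arises because each value function $V_i$ depends on every other player's optimal input $u_j^{*}$, which is what makes the system of HJ equations hard to solve in closed form.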
Haibin DUAN and his collaborators investigated a distributed game strategy for UAV formations with external disturbances and obstacles. Their strategy involves a distributed model predictive control (MPC) framework and Lévy flight based pigeon-inspired optimization (LFPIO). First, they proposed a non-singular fast terminal sliding mode observer (NFTSMO) to estimate the influence of disturbances. Second, a distributed MPC framework was established, in which each UAV exchanges messages only with its neighbors. Moreover, the cost function of each UAV was designed, by which the UAV formation problem was transformed into a game problem. Finally, LFPIO was developed to solve for the Nash equilibrium. Numerical simulations were conducted, and the efficiency of LFPIO-based distributed MPC was verified through comparative experiments.
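The game-theoretic step here, where each agent minimizes its own cost given the others' decisions until no one can improve, can be illustrated with a toy iterated best-response loop. This is a minimal sketch with made-up quadratic costs, not the LFPIO method or the paper's cost functions:

```python
# Iterated best response for a toy two-agent quadratic game -- a minimal
# stand-in for the Nash-seeking step in a distributed MPC scheme.
# Costs, targets, and the coupling weight are illustrative only.

def best_response(target, coupling, other):
    # Minimizer of J(u) = (u - target)^2 + coupling * (u - other)^2,
    # obtained by setting dJ/du = 0 and solving for u.
    return (target + coupling * other) / (1.0 + coupling)

def solve_nash(targets, coupling=0.5, iters=100):
    # Jacobi-style iteration: both agents respond to the previous round.
    u = [0.0, 0.0]
    for _ in range(iters):
        u = [best_response(t, coupling, u[1 - i]) for i, t in enumerate(targets)]
    return u

u1, u2 = solve_nash([1.0, 3.0])
# Converges to the Nash equilibrium (1.5, 2.5) of this toy game.
```

At the fixed point neither agent can reduce its own cost by changing only its own decision, which is exactly the Nash condition; the iteration converges because each best response is a contraction for coupling < 1.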
Multi-agent RL is challenging in practice, partially because of the gap between simulated and real-world scenarios. Jian ZHAO and his collaborators derived a formal concept of a cooperative multi-agent RL system with unexpected crashes to address this problem. They designed a virtual coach-assisted multi-agent RL framework, which can simulate the phenomenon that agents in actual operation may "crash" unexpectedly during coordination, and provided a research framework for resolving the discrepancies between multi-agent system simulation and reality.
Xiwang DONG and his collaborators investigated the multi-agent differential game problem and its application in cooperative synchronization control. A systematic design and analysis method for multi-agent differential games was presented, and a data-driven approach based on RL technology was given. First, it was established that distributed controllers have difficulty reaching the global Nash equilibrium of differential games due to the coupling of networked interactions. Second, alternative local Nash solutions were derived by decomposing the game problem and defining the concept of best response. An off-policy RL algorithm using adjacent interaction data was then constructed to update the controller without a system model, demonstrating stability and robustness. A global Nash equilibrium was achieved by modifying the differential game configuration of the coupled exponential functions; at the same time, distributed cooperative control ensured stability, and an equivalent parallel RL method was developed. Simulation results illustrated the effectiveness of the learning process and the strength of synchronous control.
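In generic notation (ours, not necessarily the paper's), the best-response and local Nash ideas can be written as follows, with $\mathcal{N}_i$ denoting the neighbor set of agent $i$:

```latex
% Best response of agent i against its neighbors' fixed strategies,
% and the resulting local Nash condition (generic definitions).
\begin{align}
u_i^{*} &= \arg\min_{u_i}\; J_i\bigl(u_i,\, u_{\mathcal{N}_i}^{*}\bigr),\\
J_i\bigl(u_i^{*},\, u_{\mathcal{N}_i}^{*}\bigr) &\le
J_i\bigl(u_i,\, u_{\mathcal{N}_i}^{*}\bigr)
\quad \forall\, u_i,\quad i = 1,\dots,N.
\end{align}
```

The point of the local formulation is that each agent only needs its neighbors' data, whereas a global Nash equilibrium would require every agent to account for the whole networked coupling.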
Sliding mode control (SMC) has significant advantages in handling system uncertainty and external disturbances, especially for closed-loop systems. Ruizhuo SONG and his collaborators developed a new consensus control scheme for the finite-time leader-follower consensus of discrete second-order multi-agent systems under external disturbance constraints. The traditional sliding-mode reaching law was used to create the adaptive controller, which can effectively reduce chattering and maintain invariance to perturbations. At the same time, finite-time stability was proved using discrete Lyapunov functions, which provides theoretical support for solving the finite-time consensus problem of multi-agent systems.
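The discrete reaching-law idea, driving a sliding variable s toward a narrow quasi-sliding band rather than exactly to zero, can be sketched with Gao's classical discrete reaching law; this is an illustrative toy, not the paper's adaptive controller, and all parameters are made up:

```python
# Gao's discrete reaching law: s(k+1) = (1 - q*T)*s(k) - eps*T*sign(s(k)).
# q, eps, T, and the initial value s0 are illustrative only.

def simulate_reaching_law(s0=5.0, q=2.0, eps=0.1, T=0.05, steps=200):
    s = s0
    traj = [s]
    for _ in range(steps):
        sign = (s > 0) - (s < 0)            # sign(s) as -1, 0, or +1
        s = (1.0 - q * T) * s - eps * T * sign
        traj.append(s)
    return traj

traj = simulate_reaching_law()
# s decays geometrically, then chatters inside a band of width ~ eps*T
```

The residual band of width on the order of eps*T is the source of the chattering that adaptive reaching laws, such as the one in this paper, aim to suppress.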
The cooperative multi-agent planning problem is one of the representative tasks reflecting the coordination and cooperation ability of multi-agent systems. Weining LU and his collaborators combined a graph neural network (GNN) with a task-oriented knowledge fusion sampling method to address this problem. They proposed a new collaborative planning architecture that can build a general model for the collective planning of any number of agents, and designed a task-oriented sampling method that aggregates available knowledge from specific directions. This strategy provides a research framework for multi-agent game collaborative planning problems in unknown complex environments.
Overall, a broad spectrum of current research topics relevant to the theory and techniques of InSys gaming is covered in this special feature, from multi-agent system gaming theory and applications to InSys gaming methods and beyond. We hope that this collection of diverse but interconnected topics will benefit those interested in InSys gaming or related areas.
Finally, we would like to express our special gratitude to the authors and reviewers for their support and valuable contributions to this special feature, to the Editors-in-Chief, Profs. Yunhe PAN and Xicheng LU, and to the editorial staff.
Frontiers of Information Technology & Electronic Engineering, 2022, Issue 7