Orientation and Decision-Making for Soccer Based on Sports Analytics and AI: A Systematic Review

2024-01-27 06:49

IEEE/CAA Journal of Automatica Sinica, 2024, Issue 1

Zhiqiang Pu, Yi Pan, Shijie Wang, Boyin Liu, Min Chen, Hao Ma, and Yixiong Cui

Abstract: Owing to ever-growing soccer data collection approaches and advancing artificial intelligence (AI) methods, soccer analysis, evaluation, and decision-making have received increasing interest from not only the professional sports analytics realm but also the academic AI research community. AI brings game-changing approaches to soccer analytics, while soccer has been a typical benchmark for AI research. The combination has become an emerging topic. In this paper, soccer match analytics are treated as a complete observation-orientation-decision-action (OODA) loop. In addition, as in AI frameworks such as reinforcement learning, interacting with a virtual environment enables a model to evolve. Therefore, soccer analytics in both the real-world and virtual domains are discussed. At the intersection of the OODA loop and the real-virtual domains, available soccer data, including event and tracking data, and diverse orientation and decision-making models for both real-world and virtual soccer matches are comprehensively reviewed. Finally, some promising directions in this interdisciplinary area are pointed out. It is claimed that the paradigms of professional sports analytics and AI research could be combined. Moreover, it is quite promising to bridge the gap between the real and virtual domains for soccer match analysis and decision-making.

I. INTRODUCTION

SOCCER is arguably the most popular yet most challenging sport in the world. In recent years, automated soccer match analysis, evaluation, and decision-making have received increasing interest from not only the realm of professional sports analytics [1], [2], but also the academic artificial intelligence (AI) research community [3], [4]. More and more data collection devices have been applied to soccer training and broadcasting, such as video capturing systems, optical tracking systems, and wearable devices. In addition, the AI research community has made steady and significant progress. Successful applications of computer vision (CV) and natural language processing (NLP) technologies demonstrate the maturity of perception intelligence, especially with recent large-scale models [5]–[8]. Concurrently, the development of deep reinforcement learning (DRL), such as the Alpha-X series of DeepMind [9]–[11] and the series of OpenAI algorithms [12]–[14], has contributed greatly to decision-making intelligence. These theoretical developments have naturally spread to soccer analytics as one of their downstream domains.

The reasons that AI for soccer can lead to great achievements may be threefold. First, AI can indeed improve soccer analysis and evaluation performance in practice. In the last few years, new evaluation models such as expected goals (xG) [15]–[17], expected possession value (EPV) [18], and pitch control (PC) models [19] have been proposed, which can directly help soccer coaches obtain more critical latent information and improve player skills and team tactics. Second, abundant soccer data has been accumulated, which is the foundation of AI applicability. More importantly, soccer, like other sports, offers a highly standardized scenario; thus the collected data can also be highly standardized, which is beneficial for AI application. Finally, sports analytics has a long history in which earlier AI technologies such as CV and statistical learning methods have been popular in both commercial and academic realms. One reason AI technologies are popular among soccer professionals is that AI models only provide additional (although often more insightful) pieces of advice for the experts, which makes potential model inaccuracy more tolerable.

Meanwhile, for the AI research community, soccer also provides a representative benchmark. From a multi-agent decision-making perspective, soccer poses the following major challenges. i) Modelling challenge: Soccer features more on-pitch players (22), a longer match period (90 min for a regular match), and a larger pitch size compared with other sports. In addition, the match process is highly dynamic with diverse uncertainties. ii) Cooperation challenge: Goals are rare in soccer matches, which makes this a typical sparse reward problem. In addition, a goal scored or conceded can result from a long period of play involving many teammates and opponents, which poses a complicated credit assignment issue for effective cooperation. iii) Interpretability challenge: Unlike video game problems such as StarCraft [10] or Dota [12], the analysis and decision-making models for soccer should be more interpretable. The results should coincide with the inherent patterns of soccer and be easily understood by relevant practitioners, coaches, players, scouts, and even audiences. For these reasons, soccer has been taken as a typical challenging scenario whose milestone significance to the AI community is comparable to that of Go in 2016 [9]. Therefore, it is not surprising that research publications are increasing in not only professional sports conferences (e.g., MIT Sloan Sports Analytics Conference) but also representative AI academic conferences (e.g., ACM SIGKDD Conference on Knowledge Discovery and Data Mining). In 2019, Google released an RL training environment, i.e., Google Research Football (GRF) [4]. Since then, GRF has been a fundamental benchmark for multi-agent RL (MARL) research.

A soccer match is essentially a game of command and control (C2), i.e., controlling the right player so that they arrive in the right place at the right time with the right action. A general C2 process can be modeled by a typical observation-orientation-decision-action (OODA) loop model [20], which was originally created from observing and examining fighter pilots in aerial combat. In this paper, we adopt the OODA loop to describe the soccer match analysis, evaluation, and decision support process. Here observation means the collection of soccer data. Since abundant technologies for data collection, such as positional sensing and motion capture, are readily available and collection is often conducted by commercial companies, the observation stage is not extensively discussed in this paper. Orientation ranges from simple statistical analysis and big data mining to deep prediction, evaluation, and reasoning based on the current match status and the upcoming match tendency. Most current soccer analytics and AI models fall into this stage, which offers critical inputs for subsequent decision-making. Decision is the ultimate stage that soccer analytics aims at. In some cases, especially in a virtual match environment such as GRF, an algorithm can be designed to directly command and control a given agent (player); but in most practical cases, algorithm outputs are adopted as decision support for soccer analysts and coaches, or used to conduct step-by-step inference or counterfactual reasoning. Lastly, action concerns how to execute the decision commands. In practice, it is implemented through daily training and may not be the primary concern of a soccer analyst. In a virtual soccer environment, the action is often merged into the decision stage. Therefore, in this paper, we conduct a comprehensive literature review that mostly considers the orientation and decision stages from the perspective of the OODA loop.

Apart from the different problem stages in the OODA loop, another issue that has received far less discussion is the domain in which a soccer match takes place. Here the domain indicates whether the match discussed happens in the real world or in a virtual environment. This taxonomy may seem confusing because, without a doubt, most studies investigate real-world matches; that is, the data are collected from real matches and then orientation and decision support models are built on real-world data to characterize real-world players and teams. This paradigm dominates almost all sports analytics realms and most of the data science research for professional soccer analysis. However, as emerging AI technologies such as DRL have flourished in recent years, the mechanism by which an intelligent agent evolves through interaction with an elaborately designed virtual environment has drawn attention. In this new paradigm, an active agent collects data from virtual matches and learns different decision strategies autonomously. It opens a vast opportunity since otherwise costly data collection becomes quite convenient and the matches, players, and teams to be examined can be easily specified. This paradigm is especially popular in the emerging MARL research community, with the help of virtual environments such as GRF [4], [21], [22]. In 2020, Google worked with the world-famous Manchester City F.C. and presented a Kaggle soccer AI competition using GRF [23]. As an official from Manchester City F.C. pointed out, “GRF provides us with a new place to learn through simulation and offers us the capabilities to test tactical concepts and refine principles so that they are strong enough for a coach to stake their career on”. One could argue that virtual matches may differ from real-world matches and that the learned virtual strategies cannot be applied to real-world games. However, three facts suggest that the gap may not be as big as we assume. First, virtual engines are developing quickly and virtual environment modeling is becoming more and more accurate. One piece of evidence is that many professional soccer scouts have started to evaluate players with the aid of soccer game evaluation systems such as FIFA Online. Second, from a macroscopic perspective, the gap between real and virtual matches may be acceptable. Here the macroscopic perspective means the overall team formation, players’ trajectories, off-ball running rules, and cooperation towards a given tactic. These macroscopic elements are more robust and are only slightly affected by each player’s personal skills and uncertain momentary action decisions. Finally, Sim2Real has been an emerging topic and there are many methods to deal with the transfer of a virtual strategy to a real-world application [14], [24], [25]. For these reasons, it is valuable to discuss both real-world and virtual soccer matches.

Based on the OODA loop and the classification of real-world and virtual domains, in this paper, we conduct a systematic review of the intersection of these two taxonomies (see Fig. 1). For both real-world and virtual matches, data collection is the foundation. For real-world applications, a large volume of data has been collected by professional companies, although it may not be easily obtained by the public, especially fine-grained optical tracking and physical data. For virtual games, it is convenient to obtain the data by interacting with the virtual environment. The orientation issue has been extensively discussed for real-world games. Many physical rule-based models and statistical models based on big data mining have been proposed. However, algorithms may not be directly used for decision-making in the real world. A few attempts have been made, for example for penalty kick analysis [26], but most models are used for decision support. In contrast, for virtual soccer, orientation receives less attention and many algorithms work in an end-to-end mode, i.e., directly generating diverse decision strategies from raw state inputs with abundant rule-based and machine-learning methodologies.

Fig. 1. Overall skeleton with the intersection of the OODA loop and real-virtual domains.

To sum up, the objective of this paper is to conduct a comprehensive literature review of soccer analysis, evaluation, and decision-making. Combining conventional sports analytics and AI research is an emerging topic. A few earlier reviews are available for reference. For example, in [1], a systematic review is made of professional soccer tactical performance analysis from the domains of both sports science and computer science. However, it takes a big data mining perspective. In our paper, this kind of research falls into the orientation stage. Reference [2] comprehensively discusses how to evaluate the sports analytics models and indicators generated by machine learning methods. In addition, [3] discusses what AI can do for soccer and what soccer can do for AI. Approaches are categorized into three types: statistical learning, computer vision, and game theory. It is claimed that pertinent research problems lie in the intersection of the three areas. However, all these earlier discussions are only concerned with real-world matches. The virtual cases and the OODA loop framework are not covered. Therefore, the contributions of this paper are threefold.

1) Both professional sports analytics and AI research realms are covered. While these two separate areas have independently achieved considerable success, their inherent paradigms are different. We attempt to point out some promising directions to combine these paradigms.

2) An OODA loop concept is adopted to characterize the match analysis, evaluation, and decision support process. This taxonomy helps to better understand the inherent connections among different lines of research and to build a roadmap for the overall research framework.

3) Both real-world and virtual soccer matches are discussed in terms of data support, models, and applications. We claim that it is quite promising to bridge the gap between these two domains.

In the following, we first present the available data for real-world soccer and typical simulation environments for virtual usage. Orientation models follow, including both knowledge-based and data-driven models. Then decision-making strategies are reviewed. For the real world, some work evaluating specific state and action values for decision support is introduced. For virtual matches, abundant literature based on diverse virtual environments is included. Finally, further challenges and promising directions are discussed.

II. AVAILABLE DATA

A. Real-World Data

The following describes the data collected from the real world that is available for soccer research. It includes event data, tracking data, and other data (see Table I).

1) Event Data: Event data is a chronological record of match events. Each event comprises structured information about the event type (pass, shot, foul, tackle, etc.), timestamp, involved player, the position of the player, and other information (e.g., passing angle). The recording frequency of event data is not constant. On average, two consecutive records are a few seconds apart. Furthermore, there is no standard terminology or definition to describe events; it varies by data vendor. Some researchers, therefore, propose methods to describe event data by valuing the actions contained in events, regardless of their different formats [55], [56]. Mainstream event data vendors include Opta, WyScout, and StatsBomb. Some open-source datasets can be found at [27], [28]. Abundant research is based on these datasets as they provide a large volume of event data for almost 2000 matches.
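As a minimal sketch of the event record described above, the structure below stores the listed fields and computes the average gap between records. The field names, the `Event` class, and the sample values are hypothetical and do not follow any vendor's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One event record; field names are illustrative, not a vendor schema."""
    timestamp: float      # seconds from kick-off
    event_type: str       # e.g. "pass", "shot", "foul", "tackle"
    player_id: str
    x: float              # player position on the pitch, in metres
    y: float
    extra: dict = field(default_factory=dict)  # e.g. {"pass_angle": 0.35}


def mean_interval(events):
    """Average gap in seconds between consecutive events."""
    times = sorted(e.timestamp for e in events)
    if len(times) < 2:
        return 0.0
    return (times[-1] - times[0]) / (len(times) - 1)


# Made-up sample records for illustration only.
events = [
    Event(0.0, "pass", "p7", 52.5, 34.0, {"pass_angle": 0.35}),
    Event(2.5, "pass", "p10", 60.0, 30.0),
    Event(7.0, "shot", "p9", 95.0, 36.0),
]
```

Keeping vendor-specific attributes in a free-form `extra` dictionary is one way to cope with the lack of a standard event terminology noted above.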

Event data is mainly obtained by manual annotation. There are also automatic annotation methods based on machine learning. Some of them use broadcast video for event detection. A classic architecture for this approach segments the video into fragments (usually based on shots) and then classifies the fragments into events. In [29], the authors split a broadcast video into many shots based on its camera motion features (e.g., when a goal happens, there is always a replay), then extract features from the shots, and finally feed the features into a classifier to obtain event types. A similar architecture is also used in [30]. In addition to visual features, textual and audio features of the commentary in a broadcast video can also be useful [3]. Some other automatic annotation methods use multi-camera data. For example, [31] proposes a system for automatic event detection. The system first detects players and referees and then identifies their actions. Finally, a rule-based event recognizer maps actions to event types. It is worth noting that automatic annotation is convenient but usually limited in the types of events that can be detected.

2) Tracking Data: While event data only records key events during a match, tracking data densely records the positions of the players and the ball on the pitch at a fixed frame rate (e.g., 10 frames per second). However, fine-grained tracking data is not easily obtained because of its higher requirements on tracking equipment and algorithms. Main tracking data vendors include SkillCorner [32], Second Spectrum [33], Stats Perform [34], Metrica [35], and Signality [36]. Because tracking data is costly, the open-source datasets available for public research are very limited: SkillCorner has provided broadcast tracking data for nine matches [37], while Metrica has provided tracking data for three matches [38].

Methods for collecting tracking data fall into two categories: optical-based and wearable device-based. Optical-based methods usually require multiple cameras set up around the pitch. Each camera tracks the players and the ball. Then, the spatial positions of the targets are recovered from the geometric constraints among the cameras [39], [40], [57]. Broadcast video is another source for data collection. As a broadcast video usually captures only part of the pitch, only the objects in view can be precisely tracked [41]. For off-screen objects, a well-trained model is usually used to predict their trajectories [42]. In contrast to optical-based methods, wearable device-based methods collect data in a more straightforward way. Players wear locating devices so that the global positioning system [43], [44] or ultra-wideband-based tracking systems [45] can locate them and produce highly precise tracking data.

Tracking data contains an enormous amount of information. Besides positional information, one can obtain speed, acceleration, distance, and other sports performance information after simple processing. Moreover, tracking data inherently contains players’ strategies and team tactics, which challenges researchers to mine this high-level knowledge.
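The "simple processing" mentioned above can be sketched with finite differences over consecutive frames. This is an illustrative example only: the function name `kinematics`, the 10 frames-per-second default, and the sample trajectory are assumptions, and real pipelines usually smooth the positions first.

```python
import math


def kinematics(positions, fps=10.0):
    """positions: list of (x, y) in metres, sampled at a fixed frame rate.

    Returns (speeds in m/s, accelerations in m/s^2, total distance in m),
    all computed with first-order finite differences.
    """
    dt = 1.0 / fps
    steps = [math.dist(a, b) for a, b in zip(positions, positions[1:])]
    speeds = [d / dt for d in steps]
    accels = [(v2 - v1) / dt for v1, v2 in zip(speeds, speeds[1:])]
    return speeds, accels, sum(steps)


# Made-up trajectory of one player over four frames.
track = [(0.0, 0.0), (0.3, 0.4), (0.9, 1.2), (1.5, 2.0)]
speeds, accels, distance = kinematics(track)
```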

3) Other Data: In addition to event and tracking data, numerous other data facilitate soccer analysis, such as physical performance data (e.g., biometric data recorded by wearable devices [58] for evaluating a player’s overall ability [59]) and social media data (e.g., commentary, news, and tweets). We categorize these as other data. In [30], audio commentaries are utilized to assist in event detection. In [60], Twitter data during World Cup Soccer 2014 are used to analyze the sentiment of people throughout the world. Reference [61] uses both audio and video to improve the accuracy of action spotting and classification. In the future, with the rise of multi-modal large models (e.g., OPT [5], OFA [6], DALL-E [7], PaLI [8]), available data that cover text, audio, and image may unleash even more potential.

B. Virtual Data

The acquisition of soccer datasets in the real world may be hampered by various factors, such as the heterogeneity of data collection protocols, the cost and effort required for obtaining high-quality data, and the limitations in terms of the coverage and granularity of existing datasets. Developing a soccer simulation platform to promote the research of soccer analysis is a good way to remedy the above problems. Some famous soccer video games also help generate useful data.

Morra et al. [48] experiment with a much larger soccer event recognition (SoccER) dataset comprising 500 min of gameplay. This dataset consists of artificial (simulated) soccer matches obtained with the open-source Gameplay Football engine. This engine was further developed into an environment that supports RL training, i.e., the GRF environment [4], where agents are trained to play football in an advanced, physics-based 3D simulator. The players are controlled by AI algorithms and can perform 19 actions such as running, kicking, and tackling. This platform attracted the attention of Manchester City F.C., and a Kaggle AI soccer competition was held on it [23]. The competition aims to promote machine learning and other advanced technologies for analyzing and understanding the complex dynamics of soccer, with the goal of helping players and coaches optimize their training and performance. After the competition, abundant strategies from more than 1000 teams were collected and released as a rich simulated soccer dataset [49].

RoboCup is a long-running robotic soccer competition. It involves both competitions that use physical robots (humanoid and wheeled) and competitions in virtual simulation environments. The RoboCup Simulation League (RSL) focuses on promoting team strategy with diverse AI technologies. In the absence of real-world data, numerous academics have drawn many useful soccer analytics conclusions from the data generated by this simulation platform [50], [62]–[64]. Michael et al. [50] describe a large dataset from games of some of the top teams (from 2016 and 2017) in RSL (2D), where teams of 11 robots (agents) compete against each other. Abreu et al. [64] use RoboCup data to assess team performance. In [65], the authors present a benchmark dataset for evaluating ball detection algorithms derived from RSL. Besides RSL, OpenAI provides a physics-based soccer simulation environment called MuJoCo Soccer [51], which is implemented using the MuJoCo (Multi-Joint dynamics with Contact) physics engine, allowing for the simulation of complex and realistic physical interactions.

Additionally, there are many famous soccer video games available, both commercially and as open-source projects, such as Pro Evolution Soccer [52] and FIFA [53], [54]. In [52], player attributes in Pro Evolution Soccer 2018 are used to solve team composition problems. In [53], human experts’ behaviors are adopted to train the agents in a dynamic learning scenario of FIFA based on imitation learning. In [54], the authors adopt FIFA game data to help quantify soccer players’ market values for transfer negotiation support.

However, the value of soccer data gathered in a virtual environment depends primarily on the degree to which the strategies in virtual games resemble actual strategies. This is a largely unexplored area regarding Sim2Real issues, which will be discussed extensively later. Scott et al. [66] examine the play style characteristics of soccer RL agents and find that RL agents’ play styles become more similar to those of real-world players as the agents become more competitive. Therefore, it may be possible for an agent trained in a virtual environment to learn to play soccer with the aid of advancing AI techniques (such as RL), elaborately designed virtual environments, and well-defined application scenarios.

III. ORIENTATION MODELS

We define orientation models as those that aim to analyze and evaluate a match or the performance of a single player or the whole team, and to predict the upcoming tendency of a match. A larger pitch, a longer match period, complex uncertainties in match progress, and more players make modeling soccer matches much more difficult than modeling any other kind of sport. For match evaluation, in addition to winning percentages and scores, various indicators have been created to evaluate soccer matches from different aspects, such as expected goals, expected threats, and expected assists. Designing more indicators from novel aspects to evaluate players, teams, and tactics based on various types of soccer data is still a great challenge.

With the development of AI technology, the solutions to the above challenges are constantly evolving. On the one hand, more research on sports analysis helps directly utilize expert knowledge for soccer match modeling. On the other hand, the growth in soccer match data and the development of machine learning algorithms have improved the accuracy of match models built on a large volume of soccer data. In addition, the combination of expert knowledge and machine learning methods increases the credibility of the model for soccer coaches and players.

According to the different ways of solving the above challenges, orientation models can be divided into three categories: knowledge-based models, data-based models, and integrated knowledge-data-driven models. The methods based on different mechanisms discussed in this section, as well as their main areas of application, are listed in Table II.

A. Knowledge-Based Models

Due to the complexity of soccer matches, it is difficult to conduct an analysis of every detail. Most researchers focus on evaluating certain scenarios, for example, the spatial advantage of a given match status and key events during matches. Specifically, soccer is essentially a game of creating and leveraging proper pitch space. In addition, since passes are the most frequent events in a match, pass analysis is another hot topic for research. Here we give some representative models on spatial advantage evaluation and pass analysis.

1) Spatial Advantage Evaluation: Spatial advantage models aim to evaluate the value of on-ball and off-ball running actions and the athletic ability of players in soccer matches. A classic spatial dominance model divides the pitch into dominant regions based on the Voronoi diagram [91]. Specifically, the area that a specific player can reach before any other player is defined as the dominant region of this player. This model can be used to analyze the dynamic characteristics of offensive and defensive formations [92], to measure spatial interaction behavior [93], and as a feature of other data-based models [84]. However, this dominant region model still has several defects. On the one hand, the computational complexity becomes prohibitive when the kinematic model of the player is as complicated as that of a real player. Hence some researchers propose approximation algorithms to reduce computational complexity [94]. On the other hand, it does not take into account the different importance of different areas on the pitch. As an example of improvement, [95] gives higher weight to areas close to the goal.
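A discrete version of the dominant region idea can be sketched by assigning each grid cell to its nearest player. This sketch assumes all players move at the same speed from a standing start, so "reaches first" reduces to "is closest"; the function name, the 105 m x 68 m pitch, and the 1 m cell size are illustrative assumptions, not part of the cited models.

```python
import math


def dominant_regions(players, length=105.0, width=68.0, cell=1.0):
    """players: dict name -> (x, y) in metres.

    Returns dict name -> area (m^2) of the grid cells whose centre is closest
    to that player, i.e. a discretized Voronoi diagram of the pitch.
    """
    area = {name: 0.0 for name in players}
    nx, ny = int(length / cell), int(width / cell)
    for i in range(nx):
        for j in range(ny):
            centre = ((i + 0.5) * cell, (j + 0.5) * cell)
            # Ties go to the player listed first in the dict.
            owner = min(players, key=lambda n: math.dist(players[n], centre))
            area[owner] += cell * cell
    return area


# Two made-up players placed symmetrically about the halfway line.
players = {"home_1": (26.25, 34.0), "away_1": (78.75, 34.0)}
areas = dominant_regions(players)
```

Replacing the Euclidean distance with a time-to-reach function based on a kinematic model gives the more realistic, and more expensive, variant discussed above.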

Pitch control [19] is another classic spatial advantage model. First, it uses a multidimensional normal distribution to model the position of a player and then characterizes the player’s individual influence $f_i(p,t)$ as

$$f_i(p,t) = \frac{1}{\sqrt{(2\pi)^2 \det COV}} \exp\left(-\frac{1}{2}\,\big(p-\mu_i(s_i)\big)^{T}\, COV^{-1}\,\big(p-\mu_i(s_i)\big)\right)$$

where $i$ represents the player index, $p$ a given position on the pitch, $COV$ is the covariance matrix, $s_i$ the speed of player $i$, and $\mu_i(s_i)$ is the position mean value that can be calculated by assuming the player runs at the current speed $s_i$ for a given time period. Using a multidimensional normal distribution to model individual influence is consistent with intuition: i) Spreading out from the starting position along any direction, the spatial advantage gradually decreases; ii) The difference in the decay speed of individual influence in different directions is related to the player speed. Then the individual influence at the given location $p$ is normalized by its value at the player’s current location $p_i(t)$, giving the normalized influence $I_i(p,t)$

$$I_i(p,t) = \frac{f_i(p,t)}{f_i(p_i(t),t)}$$

Finally, the advantage of the home team at any point on the pitch, $PC(p,t)$, can be expressed as

$$PC(p,t) = \sigma\left(\sum_{i} I_i(p,t) - \sum_{j} I_j(p,t)\right)$$

where $i$ and $j$ index the players of the home team and away team, respectively, and $\sigma$ is the logistic function for normalization. Compared with the spatial advantage model based on the Voronoi diagram, pitch control gives a continuous advantage value. In other words, the former model only considers the player with the greatest influence on any point, while pitch control takes into account the individual influence of all players when calculating the spatial advantage value.
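The pitch control computation described above can be sketched as follows. This is a simplified illustration under stated assumptions: a single fixed 2x2 covariance matrix is shared by all players, the mean is the current position advanced by the velocity over a hypothetical `horizon` of 0.5 s, and all function names are illustrative rather than taken from [19].

```python
import math


def influence(p, mu, cov):
    """Bivariate normal density at p with mean mu and 2x2 covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    # Quadratic form (p - mu)^T cov^{-1} (p - mu) for a 2x2 matrix.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(det))


def normalized_influence(p, pos, vel, cov, horizon=0.5):
    """Influence at p divided by the influence at the player's own position."""
    mu = (pos[0] + vel[0] * horizon, pos[1] + vel[1] * horizon)
    return influence(p, mu, cov) / influence(pos, mu, cov)


def pitch_control(p, home, away, cov):
    """home/away: lists of (position, velocity). Returns a value in (0, 1);
    values above 0.5 favour the home team at point p."""
    total = sum(normalized_influence(p, pos, vel, cov) for pos, vel in home)
    total -= sum(normalized_influence(p, pos, vel, cov) for pos, vel in away)
    return 1.0 / (1.0 + math.exp(-total))


# Made-up example: one stationary player per team, isotropic covariance.
cov = ((25.0, 0.0), (0.0, 25.0))
home = [((40.0, 34.0), (0.0, 0.0))]
away = [((60.0, 34.0), (0.0, 0.0))]
pc_mid = pitch_control((50.0, 34.0), home, away, cov)
pc_home = pitch_control((40.0, 34.0), home, away, cov)
```

In this symmetric setup the midpoint is contested equally, while the point where the home player stands is clearly controlled by the home team.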

2) Pass Analysis: Similar to the idea of the spatial advantage model based on the Voronoi diagram, [96] proposes a model to predict whether a pass will be completed or not. For a specific pass direction, it models the motion of the ball and of the players separately. For a given time $t$ after the pass, the position of the ball and the reachable area of each player can be calculated, where the reachable area is the set of points the player can reach within time $t$. Then the first player whose reachable area intersects the trajectory of the ball is the predicted ball receiver. If the player belongs to the same team as the ball sender, the pass is completed; otherwise, it fails. Fig. 2 shows the prediction results of passing the ball in different directions. For the given state, the home team holds the ball. The red fan-shaped area in this figure indicates the directions of successful passes, while the blue area indicates that the pass will be intercepted.
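The receiver prediction described above can be sketched with a deliberately simple motion model. The assumptions here are not from [96]: the ball travels in a straight line at constant speed, every player's reachable area at time t is a disc of radius `player_speed * t`, and the passer is excluded from the `players` dictionary. All names and numeric defaults are hypothetical.

```python
import math


def predict_receiver(ball_pos, angle, players, ball_speed=15.0,
                     player_speed=7.0, dt=0.05, t_max=5.0):
    """players: dict name -> (team, (x, y)), excluding the passer.

    Steps the ball forward in time and returns (name, team) of the first
    player whose reachable disc contains the ball, or None if nobody does.
    """
    for k in range(1, int(t_max / dt) + 1):
        t = k * dt
        ball = (ball_pos[0] + ball_speed * t * math.cos(angle),
                ball_pos[1] + ball_speed * t * math.sin(angle))
        reachable = [(math.dist(pos, ball), name, team)
                     for name, (team, pos) in players.items()
                     if math.dist(pos, ball) <= player_speed * t]
        if reachable:
            _, name, team = min(reachable)  # closest player wins a tie in time
            return name, team
    return None


def pass_completed(passer_team, ball_pos, angle, players):
    """True if the predicted receiver is a teammate of the passer."""
    result = predict_receiver(ball_pos, angle, players)
    return result is not None and result[1] == passer_team


# Made-up state: one teammate straight ahead, one opponent on the diagonal.
players = {
    "home_9": ("home", (70.0, 34.0)),
    "away_4": ("away", (60.0, 44.0)),
}
forward_ok = pass_completed("home", (50.0, 34.0), 0.0, players)
diagonal_ok = pass_completed("home", (50.0, 34.0), math.pi / 4, players)
```

Sweeping `angle` over all directions and coloring each direction by the outcome would produce a fan-shaped picture of the kind the figure describes.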

Fig. 2. The results of passing the ball in different directions in the given state [96].

TABLE II CLASSIFICATION AND DESCRIPTION OF ORIENTATION MODELS

In order to make the predicted result of a pass more realistic, [97] predicts the outcome of the pass by introducing two concepts: the time to intercept and the time to control. Specifically, the time to intercept is used to determine whether a player can intercept the ball along its trajectory. Unlike the model proposed in [96], it does not simply check whether the shortest time for the player to reach a point on the ball trajectory is less than the time required by the ball, but introduces additional uncertainty through the logistic distribution to model the probability of interception $P_{int}(T)$

$$P_{int}(T) = \frac{1}{1 + \exp\left(-\dfrac{\pi}{\sqrt{3}\,\sigma}\,(T - t_{int})\right)}$$

where $t_{int}$ represents the minimum time required for the player to reach the point where the ball is at time $T$, and $\sigma$ is the temporal uncertainty. The time to control is introduced to capture the idea that the player intercepting the ball needs additional time to completely control it. During a time interval $t$, the probability that a player is able to control the ball, $P(t)$, is characterized as

$$P(t) = 1 - e^{-\lambda t}$$

where $\lambda$ is the control rate parameter. Finally, the probability that a player $j$ obtains possession of the ball over a complete pass is calculated by combining the time to intercept and the time to control.
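The two probabilities above can be sketched numerically. The logistic interception probability and the exponential control probability follow the description in the text; the way they are combined in `possession_probabilities` (a rate equation in which players compete for a ball that nobody controls yet) is an assumption about the model in [97], and the default values of `sigma` and `lam` are illustrative only.

```python
import math


def p_intercept(T, t_int, sigma=0.45):
    """Logistic CDF: probability that the player can reach the ball by time T."""
    scale = math.sqrt(3.0) * sigma / math.pi
    return 1.0 / (1.0 + math.exp(-(T - t_int) / scale))


def p_control(t, lam=4.3):
    """Probability of controlling the ball within a time window of length t."""
    return 1.0 - math.exp(-lam * t)


def possession_probabilities(t_ints, lams, t_max=10.0, dt=0.01, sigma=0.45):
    """Euler integration of the assumed rate equation
        dP_j/dT = (1 - sum_k P_k(T)) * P_int_j(T) * lam_j
    for players j with minimum arrival times t_ints and control rates lams."""
    probs = [0.0] * len(t_ints)
    for k in range(int(t_max / dt)):
        T = k * dt
        free = 1.0 - sum(probs)  # probability nobody controls the ball yet
        for j, (t_int, lam) in enumerate(zip(t_ints, lams)):
            probs[j] += free * p_intercept(T, t_int, sigma) * lam * dt
    return probs


# Made-up example: player 0 can arrive after 1 s, player 1 after 2 s.
probs = possession_probabilities([1.0, 2.0], [4.3, 4.3])
```

The earlier-arriving player collects most of the probability mass, but the later player keeps a nonzero share, which is the softening effect of the temporal uncertainty.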

Besides spatial advantage evaluation and pass analysis, knowledge-based models are widely applied to soccer analysis. Table II gives a comprehensive survey. For example, in [77] a player vector model is proposed to characterize the playing style of a player. Event stream data of all Premier League matches that Liverpool Football Club participated in from 2017 to 2019 are adopted. Finally, a vector of 18 dimensions is generated to evaluate the player performance.

From the representative models mentioned above, it can be seen that knowledge-based models are often heuristic with good interpretability. However, they also show weaknesses in their lack of full consideration of related influencing factors, which limits their accuracy and scope of application. Data-based models have the potential to remedy the above problems.

B. Data-Based Models

With the help of machine learning methods, soccer analysis has been further developed. Big data resources provide more novel application scenarios and smart research methods in the field of soccer. As mentioned in Section II, event and tracking data are the two types most commonly used in soccer analysis. Using different types of data, abundant models and analytical conclusions have been obtained. Therefore, this section categorizes models according to the type of data adopted. Although they use different types of data and methods, data-based studies can generally be divided into the following application scenarios: i) Specific indicator evaluation, such as expected goals and expected threats (xT); ii) Player performance evaluation, including evaluating individual passes, shots, and other behaviors of interest to researchers; iii) Team performance evaluation, such as pass networks and tactical patterns. Among them, methods based on event data focus more on the evaluation of player and team behavior. Methods based on tracking data focus on the analysis of the tactical style and the overall situation of a match. By combining event and tracking data, richer information can be obtained and more analytical conclusions can be drawn. Moreover, for a given scenario, research with different types of data may draw conclusions emphasizing different aspects. For example, when calculating xG, event data-based models often try to explain which key event affects the final goal, whereas models based on tracking data usually use the runs and occupied positions of all players to obtain xG. Therefore, this subsection presents different research aspects of soccer analysis categorized by the type of data.

1) Models Based on Event Data: This part provides a general review of the models based on event data according to different application scenarios. The xG quantifies the value of the scoring opportunities that a team or player creates. Early research divides the pitch into grids and uses statistical methods to calculate xG for each grid cell, i.e., by counting shooting and scoring events from event data [100]. Machine learning methods then replaced the traditional statistical methods for predicting the scoring probability of shot events [69], [76], [78], [101]. Whether they use decision trees or other classifiers, these methods take as inputs features constructed from a large amount of event data, such as the type of event, the distance and angle to the goal, and the position of the event.
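The grid-based statistical estimate described above can be illustrated with a minimal sketch. It assumes that each shot is given as an (x, y, is_goal) tuple in metres; the 105 m x 68 m pitch, the 12 x 8 grid, and the function name `grid_xg` are illustrative assumptions and are not taken from [100].

```python
from collections import defaultdict


def grid_xg(shots, pitch_length=105.0, pitch_width=68.0, nx=12, ny=8):
    """Estimate xG per grid cell as goals / shots observed in that cell.

    `shots` is an iterable of (x, y, is_goal) tuples in metres.
    Pitch size and grid resolution are illustrative assumptions.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [shots, goals]
    for x, y, is_goal in shots:
        # Clamp to the last cell so shots on the boundary stay on the grid.
        cx = min(int(x / pitch_length * nx), nx - 1)
        cy = min(int(y / pitch_width * ny), ny - 1)
        counts[(cx, cy)][0] += 1
        counts[(cx, cy)][1] += int(is_goal)
    return {cell: goals / n for cell, (n, goals) in counts.items()}


# Toy data: four shots close to goal (two scored), two from distance (none scored).
shots = [
    (100.0, 35.0, True), (101.0, 36.0, False),
    (99.0, 37.0, False), (100.5, 38.0, True),
    (80.0, 20.0, False), (81.0, 21.0, False),
]
xg = grid_xg(shots)
```

With such sparse counts the estimate is very noisy, which is one reason feature-based classifiers came to be preferred.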

Goals are sparse events. A complementary indicator that offers denser guidance is xT, which evaluates the current pitch state or tendency by quantifying the extent to which the team can create a threat to the opponent within the possession sequence. Reference [102] divides a match into segments of possession, in each of which one team has control of the ball until the possession ends with a shot or with the ball being intercepted by the opponent. The essence of xT is that players perform actions with the intention of moving the match into a state in which they are more likely to score, thus creating a threat to the opponents [103]. Based on the xT model, most research focuses on defensive action analysis during a match. Reference [79] utilizes a deep learning model to study the threat of passes and predicts the impact of specific defensive actions.
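The possession segmentation described above can be sketched as follows. The sketch assumes that events arrive in chronological order as (team, event_type) pairs; the event labels and the function name `split_possessions` are hypothetical and do not reproduce the procedure of [102].

```python
def split_possessions(events):
    """Split a chronological event list into possession segments.

    A possession ends after a shot, or when an event by the other team
    (for example an interception) shows that the ball has changed hands.
    """
    possessions, current = [], []
    for team, event_type in events:
        if current and team != current[0][0]:
            possessions.append(current)  # ball lost to the opponent
            current = []
        current.append((team, event_type))
        if event_type == "shot":
            possessions.append(current)  # possession finished by a shot
            current = []
    if current:
        possessions.append(current)
    return possessions


events = [
    ("home", "pass"), ("home", "pass"), ("home", "shot"),
    ("away", "pass"),
    ("home", "interception"), ("home", "pass"),
]
possessions = split_possessions(events)
```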

Although scoring goals is a direct way of evaluating a match, it is equally significant to create scoring chances for teammates. Therefore, many studies focus on evaluating assists or key passes before a shot, in order to determine whether a player's pass helps to score. The expected assist (xA) model measures the likelihood that a given pass will become a goal assist [104], [105]. An expected assist value is assigned to every completed pass. Evaluating the assist ability of different players helps credit creative players who make key passes. Further, xGChain and xGBuildup evaluate the pre-assist, pre-pre-assist, or pre-pre-pre-assist to highlight attacking contributions in long pass chains [106].

Based on the rich pass, dribbling, and other information in the event data, a large number of studies have been devoted to the evaluation and analysis of individual players' performance. For instance, [71] divides a soccer match into a series of possession sequences, then computes the difference between the values of each possession sequence before and after a pass to measure soccer players' on-ball contributions using play-by-play event data. Many studies use the same model to evaluate different types of actions, such as assigning different values to interceptions, shots, and other actions, so as to comprehensively evaluate players or a match from multiple aspects [55], [56], [75]. Some studies combine the evaluations of individual actions into an overall evaluation of players' performance in order to rank players in the match [73], [107]. In addition to the evaluation of individual players, the analysis of a team's strategy patterns is also significant for evaluating team performance [67], [68], [74]. Identifying players' specific behavior patterns during a match offers important insight for player scouting, player development monitoring, and match preparation [108]. Taking scoring as an evaluation index, some studies extract offensive actions from event data streams to find spatio-temporal patterns that characterize attacking tactics [72].
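The before-and-after valuation of passes described above can be sketched as follows. The zone-level possession values in `ZONE_VALUE`, the zone labels, and the function name `pass_contributions` are hypothetical placeholders; in [71] the possession value is learned from data rather than fixed by hand.

```python
# Hypothetical possession values (probability that the possession ends in a goal).
ZONE_VALUE = {"defence": 0.01, "midfield": 0.03, "final_third": 0.10}


def pass_contributions(passes, value_fn):
    """Credit each passer with value(after) - value(before), summed per player."""
    totals = {}
    for player, state_before, state_after in passes:
        delta = value_fn(state_after) - value_fn(state_before)
        totals[player] = totals.get(player, 0.0) + delta
    return totals


passes = [
    ("A", "defence", "midfield"),      # progressive pass: positive credit
    ("B", "midfield", "final_third"),  # progressive pass: positive credit
    ("A", "final_third", "midfield"),  # backward pass: negative credit
]
totals = pass_contributions(passes, ZONE_VALUE.get)
```

In this toy example player A ends with a negative total because the backward pass gives up more value than the earlier progressive pass gained.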

Although event datasets provide rich information on key events for soccer analysis, some inherent defects of event data limit the development of research methods and application scenarios. For instance, the lack of information on off-ball players is one of the main flaws of event data. In most cases, the running and cooperation of players without the ball play an important role in the tactics of the entire match. Also, although more than a thousand key events may be recorded in a 90-minute match, the intervals between events are long, so the data still cannot cover all the information required during the match.

2) Models Based on Tracking Data: The studies based on tracking data mainly focus on the analysis of the overall tactics of a match with fine-grained information. Reference [80] analyzes the spatio-temporal patterns of a short window before a shot and accurately estimates the likelihood of chances to extract strategic features. By constructing fine-grained features from tracking data, off-ball players' contributions can be fully considered. Based on this idea, some studies extend the application scenarios, such as quantifying increasing opportunities resulting from off-ball actions, and automating talent identification by finding the players across an entire league [82]. Although essential events such as passes are not explicitly recorded in the tracking data, some studies still start from the tracking data to identify features, evaluate pass patterns, and classify pass qualities [18], [83]. In addition to fine-grained evaluation, some studies try to obtain more macroscopic features from tracking data. For instance, [109] utilizes a formation descriptor to determine the identities of different teams from spatio-temporal player tracking data.

In recent years, more research has been performed to analyze teams' overall tactics. However, it is difficult to grasp complete match information accurately using fine-grained tracking data alone, whereas event data can help better characterize high-level tactical information. Therefore, models combining both event and tracking data are discussed next.

3) Models Combining Event and Tracking Data: Combining tracking data and event data is a promising development for soccer analysis. Making full use of key information in event data, supplemented by tracking data, enlarges the scope of match analysis. For example, taking full advantage of the event data and tracking data can lead to a more accurate evaluation of the quality of a goal-scoring opportunity, and it can also be applied to season analysis, match analysis, and player analysis [18]. For the analysis of chances to score goals, [88] characterizes the passing tactics of players by predicting whether players can intercept the ball in a given time to compute the probability of a successful pass along a ball trajectory. Some research focuses on teams' different styles of tactics, such as evaluating team defense from a comprehensive perspective related to team performance based on the prediction of ball recovery and situations being attacked [86]. A few studies start with specific tactics, such as pressing, to analyze defensive strategies. For example, [110] proposes a method to identify all pressing situations in a soccer match based on positional tracking and event data. Some studies use a novel approach to measure the effectiveness of styles of play and of team possessions [85], in order to make the analysis more intuitive and provide coaches with an enriched interpretation of specific match situations. Meanwhile, producing visually interpretable probability surfaces for potential passes is an effective way to represent the likelihood of a team scoring or conceding the next goal at any time instant [87].

4) Models Based on Other Data: As mentioned in Section II, apart from tracking and event data, other data such as heart rate and high-intensity running relate more to players' physical fitness and are used to analyze individual ability in sports performance evaluation. Because physical data record information about players at a microscopic level, related research focuses more on individual ability analysis, behavioral performance evaluation, and player value ranking. Many studies analyze the impact of technical and physical parameters, such as sprint and high-speed running distance and on-ball possession, to help soccer analysts and coaches develop suitable strategies for a match [89], [90]. Starting from different aspects of sports performance analysis, some works analyze the different training needs of individual players and customize training plans for them [111], [112]. In addition to players, a number of studies analyze the personal abilities of referees to help guide their development and training [113], [114].

C. Integrated Knowledge-Data-Driven Model

Although data-based models have achieved more detailed and fine-grained analysis results than knowledge-based models, machine learning methods often weaken the interpretability of models and thus hinder the application of data-based models. To solve this problem, integrated knowledge-data-driven models learn knowledge from data while maintaining the interpretability of the models by introducing knowledge.

EPV is often used as the expected outcome to represent the likelihood of a soccer possession ending in a goal scored or conceded. In [18], an EPV model proposed by analysts from Barcelona Football Club is constructed from prior knowledge, which decouples the advantage of soccer matches through

$$\mathrm{EPV} = E[X] = \sum_{i \in \{\rho, \varsigma, \delta\}} E[X \mid A = i]\, P(A = i)$$

where ρ, ς, and δ represent a pass, a shot, and a ball drive, respectively, and E is the mathematical expectation. With i = ρ, ς, δ, P(A = i) is the probability of the on-ball player performing action i and E[X | A = i] is the advantage generated by action i. By decomposing the overall EPV into sub-modules for individual actions, one can then use different machine learning algorithms to learn each decoupled sub-module. This allows EPV to evaluate the whole situation of a match and to explain whether a valuable situation is due to a good angle for a shot, a creative pass opportunity, or an unguarded dribbling opportunity.
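As a worked illustration of this decomposition, the sketch below combines per-action probabilities P(A = i) and per-action advantages E[X | A = i] by the law of total expectation. All numbers and the function name `epv` are hypothetical; in [18] each term is produced by a separately learned model.

```python
def epv(action_probs, action_values):
    """Combine P(A=i) and E[X | A=i] over the on-ball actions (total expectation)."""
    if abs(sum(action_probs.values()) - 1.0) > 1e-9:
        raise ValueError("action probabilities must sum to 1")
    return sum(action_probs[a] * action_values[a] for a in action_probs)


# Hypothetical outputs of the decoupled sub-models for one game state.
probs = {"pass": 0.6, "shot": 0.1, "drive": 0.3}      # P(A = i)
values = {"pass": 0.02, "shot": 0.15, "drive": 0.01}  # E[X | A = i]

total = epv(probs, values)
# The per-action terms show which option contributes most to the state value.
contributions = {a: probs[a] * values[a] for a in probs}
dominant = max(contributions, key=contributions.get)
```

Inspecting `contributions` is what gives the decomposition its interpretability: here the shot term is the largest, even though a shot is the least likely action.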

Similar to EPV, the model proposed in [98] trains three models in a data-based way:

1) xReceiver: Predicts the probability that a player becomes the receiver of a pass.

2) xPass: Predicts the probability of a successful pass.

3) xThreat: Predicts the probability of performing a shot after a player receives the ball.

Fig. 3. The values of xReceiver, xPass, and xThreat in the given state [98]. Red and blue dots represent home and away players, respectively. The direction and length of the arrow represent the movement direction and speed of the player, respectively.

As shown in Fig. 3, these models can be used to explain the specific tactics of soccer matches by examining the changes in these three indicators, such as a man-to-man defensive strategy, ball-to-man defensive strategy, and off-ball running. Similar to EPV, specific knowledge is introduced to guide the task definition of data-based models for better model interpretability and practicability.

Reference [99] adopts a different idea, that is, introducing knowledge after training a data-based model. In that work, a VAEP model is learned in a data-driven way for valuing individual actions. Then a concept of chemistry between players is constructed based on the VAEP model to represent the tacit understanding between players. The chemistry of two players in a season is calculated as

where JOI90(p, q) represents the chemistry of player p and player q in a season, and MINSm(p, q) is the length of time that both players are on the pitch. This model can also be used to predict the chemistry between two players who have not yet played together, and further to search for the best team composition from a set of players.

D. Opportunities and Limitations of Orientation Models

Based on the above discussion, orientation models have gone through several stages of development, that is, from simple statistical analysis to theoretical models based on expert knowledge, and then to data mining and machine learning models based on data-driven methods. Knowledge-based models are well interpretable but lack objective indicators to measure their credibility and accuracy, and they cannot be flexibly adapted to different types and styles of matches. Taking advantage of various kinds of soccer data, data-based models adopt data mining and machine learning technologies that enrich the methods and applications of soccer analysis, such as evaluating player performance and team tactics. However, mainstream data-based methods weaken the interpretability of models to varying degrees. From the perspective of soccer analysis experts, coaches, and players, advanced models together with explainable and evidence-based theories are more valuable for applications. Thus, integrated knowledge-data-driven models combine the advantages of the aforementioned two types of models and have recently become a research hotspot.

TABLE III FEATURES OF THE DRL BENCHMARKS TACKLED IN RECENT MILESTONES

IV.DECISION-MAKING STUDIES

A. Challenges for Soccer Decision-Making

Decision-making has been an emerging topic in AI research, especially in the RL community. Studies have achieved superhuman performance in many existing virtual environments, such as Go and StarCraft II. Compared with traditional real-time strategy games such as StarCraft II and poker games such as DouDizhu, decision-making for soccer matches poses huge challenges, as shown in Table III. Specifically, the key challenges of soccer can be described in the following three aspects, which make soccer a representative benchmark for decision-making research.

1)Multi-Agent Cooperation

Multi-agent learning poses significant challenges in the domain of reinforcement learning (RL), particularly for games involving a large number of agents that need to collaborate simultaneously. As shown in Table III, GRF stands out as a particularly complex domain, featuring 22 agents (players), with 11 agents needing to coordinate to achieve a common goal. In contrast, games such as Go, StarCraft, and DouDizhu involve fewer autonomous agents that need to cooperate as a team. These different levels of collaboration create unique challenges for multi-agent RL algorithms and necessitate innovative methods to train and control collaborative agents effectively.

2)Task Complexity

A soccer match is highly complex and poses substantial challenges for MARL. GRF has a continuous state space, with players able to occupy locations anywhere on the field, resulting in a state space larger than that of the other three games under consideration, as shown in Table III. GRF features 22 players on the pitch, each with 19 legal action choices per frame, and there are 3000 frames per game. Following the calculation method in [115] and [118], the action space size is therefore $19^{66000}$, i.e., approximately $10^{84398}$, which surpasses that of any other game. Moreover, the stochastic nature of the GRF environment implies that the same actions performed by agents under the same initial conditions can lead to different states, which further increases the environment's complexity. Finally, GRF is to some degree a virtual simplification of real soccer matches. Therefore, the complexity of a real soccer match is far greater than that of GRF.
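The action space figure quoted above follows from simple arithmetic: 22 players each choose one of 19 actions in each of 3000 frames, giving 22 × 3000 = 66000 choices per game. A short check, using only the numbers stated in the text (the variable names are illustrative):

```python
import math

players, actions_per_frame, frames = 22, 19, 3000

# One 19-way choice per player per frame.
decisions = players * frames  # 66000

# Work in log10, since 19**66000 is far too large to read directly.
log10_size = decisions * math.log10(actions_per_frame)
exponent = math.ceil(log10_size)  # order of magnitude, rounded up
```

The base-10 logarithm is about 84397.7, which rounds up to the exponent 84398 quoted in the text.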

3)Sim&Real Differences

As evidenced by Table III, GRF stands out from the other three games due to the presence of Sim&Real differences. Soccer is a game intrinsically rooted in the real world, where players employ a varied repertoire of strategies and tactics. Nonetheless, inherent differences between the simulation platform and the real world lead to mismatches between the strategies adopted in each domain, which hinders straightforward transfer between domains. At the same time, the strategies trained in GRF should not only aim to win the game, but also control agents so that they behave more like real players. For example, a typical situation observed in GRF is that one player dribbles directly to the opponent's goalkeeper and scores a goal without any pass or cooperation. This may lead to high performance, but it is not acceptable for a real match.

Based on the above typical challenges for decision-making in soccer, we introduce relevant research from the perspectives of both real-world and virtual environments.

B. Real World Decision-Making Models

Following the orientation models, decision-making studies for both real-world and virtual matches are discussed in this section. The literature is summarized in Table IV. In real-world scenarios, decision-making models are often developed for decision support, rather than for directly commanding the players on the pitch in real time. In this sense, the line between orientation and decision-making is quite vague. In this subsection, we restrict decision-making models to those: i) Relating more to a prescriptive analysis (i.e., action suggestion for a given situation) than to a descriptive one (e.g., xG to describe the potential scoring chances); ii) Relating more to the dynamic process of a match than to a static or post-match analysis. Since the Markov decision process (MDP) is a well-defined framework to describe dynamic processes, studies using MDPs to characterize decision-making models will be highlighted. Models are categorized into those for the individual decision-making of players and those for the collective decision-making of a team.

TABLE IV CLASSIFICATION AND DESCRIPTION OF DECISION-MAKING MODELS

1) Individual Decision-Making: Evaluation of individual decision-making performance is crucial for coaching teams. The most prevalent method for evaluating players has been to quantify the value of their actions, such as shots and passes [75], [138], to provide decision support. Some player-evaluation metrics (e.g., predicted goals) consider actions that have an immediate effect on scoring [71], [119], [139].

Modeling a soccer match as an MDP provides a simple and intuitive way to consider the progression of the match [75], allowing analysts to take into account the current state of the match, as well as the actions that have been taken and the outcomes that have occurred, and to use this information to make predictions about future events. An MDP can be used to track the performance of individual players or teams over time, and to identify trends or patterns in their play that could be useful for making strategic decisions. An MDP can be represented as a tuple (S, A, P, R, γ), where S is a set of states, A is a set of actions, P is a transition function that defines the probability of transitioning from one state to another based on the current state and the chosen action, R is a reward function that defines the reward or penalty associated with each transition, and γ is a discount factor that determines the relative importance of immediate versus future rewards.
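To make the tuple (S, A, P, R, γ) concrete, the sketch below defines a toy four-state soccer MDP and solves it with standard value iteration. The states, transition probabilities, and reward are invented for illustration and do not come from any of the cited studies; value iteration is used here only as a generic way of obtaining state values, not as the method of [75].

```python
def value_iteration(states, actions, P, R, gamma, tol=1e-10):
    """Return optimal state values V and a greedy policy for a finite MDP."""
    V = {s: 0.0 for s in states}

    def q(s, a):
        # Expected return of taking action a in state s, then following V.
        return sum(p * (R(s, a, s2) + gamma * V[s2]) for s2, p in P[(s, a)].items())

    while True:
        delta = 0.0
        for s in states:
            if not actions[s]:  # terminal state: value stays 0
                continue
            best = max(q(s, a) for a in actions[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(actions[s], key=lambda a: q(s, a)) for s in states if actions[s]}
    return V, policy


# Toy MDP (all numbers are assumptions): S, A, P, R, gamma.
states = ["midfield", "final_third", "goal", "lost"]
actions = {"midfield": ["pass", "shoot"], "final_third": ["pass", "shoot"],
           "goal": [], "lost": []}
P = {
    ("midfield", "pass"): {"final_third": 0.7, "lost": 0.3},
    ("midfield", "shoot"): {"goal": 0.02, "lost": 0.98},
    ("final_third", "pass"): {"midfield": 0.8, "lost": 0.2},
    ("final_third", "shoot"): {"goal": 0.2, "lost": 0.8},
}


def reward(s, a, s2):
    return 1.0 if s2 == "goal" else 0.0  # reward only for scoring


V, policy = value_iteration(states, actions, P, reward, gamma=0.9)
```

In this toy model the resulting policy passes forward from midfield and shoots in the final third, which is the kind of action suggestion a prescriptive analysis aims to provide.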

Fernández et al. [18] define an EPV model as the anticipated result of a soccer possession based on full-resolution spatiotemporal data, which expresses the likelihood of a possession ending in a goal for the attacking team (1) or a goal by the defending team (-1) after an immediate possession regain. They then built a deep neural network model [120] from the full-resolution spatiotemporal data to compute the EPV for all actions during a game. Their method relies on tracking data and assumes that all players are completely observable. However, tracking data recorded in a real match may be incomplete or noisy, which makes it difficult to accurately evaluate the value of different actions taken by players on the field. Many other approaches are developed based on event data and evaluate all on-ball actions of soccer players. Decroos et al. [55] introduce a VAEP model, for which they develop an entire language around event data called soccer player action description language (SPADL). Based on supervised learning, their model was trained to evaluate the influence of an event on the scoring or conceding probability. Liu et al. [75] apply the MDP framework to model the play dynamics of soccer, utilize a deep reinforcement learning model to learn an action-value Q-function, and propose two new soccer performance metrics based on the Q-function: the goal impact metric and the Q-value-above-average-replacement (QAAR). However, event data provide only partial observability of the match context, as they capture only the actions of players who are in possession of the ball at a given time. This can make it challenging for an MDP to capture the influence of off-ball actions on the outcome of a match.

With the aforementioned evaluation methods, it is possible to optimize player decisions by providing counterfactual analysis [2], [140]. Some specific areas where counterfactual analysis might be used in soccer include examining how a different player substitution or tactical change might have impacted the outcome of a match [140], or how a different on-field decision by a player or coach might have altered the course of the match [122], [123]. This analysis can also be used to evaluate the effectiveness of specific strategies or formations, and to identify areas for improvement in a team's tactics or player performance. Roy et al. [140] propose a framework to explore the effects of shooting from long distances, and reason about what would happen if a team increased or decreased their frequency of shooting from distance over the course of a season. The application of an evaluation model for counterfactual analysis based on real data requires high-quality and precise data in order to avoid obtaining incorrect results. In the future, the use of virtual platforms to explore counterfactual scenarios and apply the resulting insights to real-world soccer may bring significant changes to the field of soccer analysis.

2) Collective Decision-Making: In soccer analysis, formation is an important factor in determining a team's tactics [141]. The formation refers to the arrangement and positioning of the players on the pitch, and it can have a significant impact on a team's ability to attack, defend, and control possession of the ball. Different formations can be used to conduct different tactical styles and to exploit the strengths of different players.

In recent years, formation analysis in soccer has received considerable attention. Bialkowski et al. [109], [125], [126] propose a minimum entropy data partitioning method for detecting a team's formation by modeling the spatial positions of players on the field as Gaussian distributions. This approach updates the parameters of each player's position distribution function iteratively until convergence is achieved. Using this detection method, they found that teams exhibit different behavioral patterns at home and away matches.

Wei et al. [142] utilize team formation detection to identify the most commonly used offensive and defensive patterns in matches. However, these methods do not take into account the temporal and spatial characteristics of formations, making them unsuited to analyzing how formations change over time and space. Perl et al. [124], [143]–[145] develop a series of formation analysis tools that use tables to show the distribution of different formations during matches, and line graphs to illustrate the evolution of formations over time. Voronoi diagrams are used to show the control area of each team to evaluate the quality of the formation. These tools partially address the problem of analyzing temporal and spatial changes in formations. However, in line graphs, a team's formation is treated as a simple categorical variable, discarding its inherent spatial information. Furthermore, these tools do not support the study of the association between formation changes and multidimensional soccer data. Therefore, Wu et al. [127] propose a temporal and spatial visualization scheme that shows the evolution of formations in time and space, and develop a visual analytical system to support analysis. The above approaches either assume that team formation is consistent throughout a match or assign formations frame by frame. Kim et al. [128] propose a change-point detection framework that distinguishes tactically intended formation and role changes from temporary changes in soccer matches.
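The Voronoi-based control area mentioned above can be approximated on a grid: each cell is assigned to the team of the nearest player, and the share of cells per team is reported. This is a minimal sketch under assumed pitch dimensions (105 m x 68 m) and grid resolution; the function name `control_share` is hypothetical and the code is not the tool of [124], [143]–[145].

```python
def control_share(team_a, team_b, length=105.0, width=68.0, nx=100, ny=64):
    """Fraction of grid cells whose nearest player belongs to team A.

    This is a discrete approximation of the Voronoi control area.
    Cells equidistant from both teams count half for each team.
    """
    a_cells = 0.0
    for i in range(nx):
        for j in range(ny):
            cx = (i + 0.5) * length / nx  # cell centre
            cy = (j + 0.5) * width / ny
            da = min((cx - x) ** 2 + (cy - y) ** 2 for x, y in team_a)
            db = min((cx - x) ** 2 + (cy - y) ** 2 for x, y in team_b)
            if da < db:
                a_cells += 1.0
            elif da == db:
                a_cells += 0.5
    return a_cells / (nx * ny)


# One player per team, mirrored about the halfway line: control is split evenly.
even = control_share([(26.25, 34.0)], [(78.75, 34.0)])
# An extra team A player in the opposing half enlarges team A's control area.
more = control_share([(26.25, 34.0), (78.75, 10.0)], [(78.75, 34.0)])
```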

C. Decision-Making in Virtual Environment

An agent-based virtual environment for training and testing artificial soccer policies is important for addressing the limited availability of real matches. In addition, virtual environments provide more flexible mechanisms to assign and examine diverse factors that may affect a match. Therefore, utilizing virtual environments not only complements real games but also opens a vast opportunity for the exploration of new analysis metrics, underlying patterns, team tactics, etc. As a result, many researchers study soccer decision-making problems in virtual environments such as RoboCup [146], [147] and GRF [66], [129], [136], [137]. Unlike real-world models, which are mostly for decision support, decision policies in a virtual environment can directly command the virtual players (agents) in real time. RL and DRL in a multi-agent setting are widely used to learn the policy in this paradigm. In view of the specific problems and the methods used to tackle them, the models reviewed in this subsection are categorized as communication-based, sparse reward-based, and role-based.

1) Communication-Based Learning: Since soccer is a typical complex multiplayer sport, research on cooperation has always been a hot topic. The emergence of communication allows agents to interact with each other for cooperation. Outstanding teams learn effective communication strategies to share information wisely and reduce the cost of communication. This setting usually considers a set of cooperative agents in a partially observable environment where agents need to maximize their shared utility by communicating information.

In MARL, efficient policies require rational inferences about when to communicate, whom to communicate with, and how to communicate efficiently. In [21], a novel multi-agent graph-attention communication (MAGIC) algorithm is proposed, which uses a graph-attention communication protocol for sharing information. The method makes use of a Scheduler to help solve the problems of when to communicate and whom to communicate with, and a Message Processor using graph attention networks (GATs) with dynamic graphs to characterize the communication information. Dynamic and differentiable graphs are generated by the Scheduler, which consists of a graph attention encoder and a differentiable attention mechanism. The graphs are sent to the Message Processor, which enables the Scheduler and Message Processor to be trained end-to-end. The test results in the GRF environment show that, compared with normal benchmarks, enhancing the communication mechanism contributes to high winning rates.
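The core operation behind attention-based message passing can be sketched in a few lines. The sketch assumes that the communication graph is already given as a 0/1 row (`neighbours`), whereas in [21] it is produced by the learned Scheduler; the dot-product scoring, the function name `aggregate_messages`, and all vectors are illustrative assumptions rather than the MAGIC architecture.

```python
import math


def aggregate_messages(query, keys, messages, neighbours):
    """Attention-weighted sum of messages from the agents flagged in `neighbours`."""
    idx = [j for j, allowed in enumerate(neighbours) if allowed]
    dim = len(messages[0])
    if not idx:
        return [0.0] * dim  # no one to listen to
    # Dot-product attention scores between the receiver and each sender.
    scores = [sum(q * k for q, k in zip(query, keys[j])) for j in idx]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * messages[j][d] for w, j in zip(weights, idx)) for d in range(dim)]


# Agent 2 is masked out by the graph, so its message is ignored entirely.
out = aggregate_messages(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [1.0, 0.0], [5.0, 5.0]],
    messages=[[2.0], [4.0], [100.0]],
    neighbours=[1, 1, 0],
)
```

The mask answers the questions of when and with whom to communicate, while the softmax weights determine how much each admitted message counts.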

When communicating with their neighbors, the agents may send and receive redundant information, which dramatically degrades cooperation performance. To address this limitation, our previous work [22] proposes a cognition-driven multi-agent policy (CDMAP) learning framework. It includes a cognition difference network (CDN) based on a variational auto-encoder, a coupling cognition network (CCN), and a policy optimization network (PON). The concept of cognition difference is designed to prune redundant communications among agents. Based on the pruned topology, CCN captures the high-level hidden representations of the surroundings. The combination of several coupling graph attention layers lets the agents obtain a comprehensive understanding of the state from multiple representation spaces. Based on the captured hidden states, PON generates the final policies.

Based on social psychology, Mao et al. [129] leverage cognitive consistency, which helps keep human society in order, to promote players' policy learning. The method introduces neighborhood cognitive consistency (NCC) into multi-agent reinforcement learning. The NCC allows a player and its neighbors to have a consistent perception, which leads to better cooperation. Reference [130] proposes factorizing the joint team policy into a graph generator and a graph-based coordinated policy to enable coordinated behaviors among agents.

Communication-based learning is still an active area in MARL with many open questions, especially for applications in soccer. Lowe et al. [148] discuss some common pitfalls that arise when agents learn to communicate with others in multi-agent environments.

2) Sparse Reward-Based Learning: The success of reinforcement learning depends heavily on how well the reward signal frames the problem and guides the problem-solving process. However, in soccer scenarios, the agent is supplied with extremely sparse or even no rewards, resulting in learning failure and ineffective exploration of the environment. Therefore, solving the sparse reward problem in soccer is both crucial and difficult.

At present, the common way of solving the sparse reward problem is to design dense individual rewards that guide agents toward cooperation. However, most existing approaches use individual rewards in ways that do not promote teamwork and sometimes have the opposite effect.

In psychology, reward shaping is a method of training animals by reinforcing successive approximations of a desired complex behavior until the animal achieves it. Inspired by this way of guiding animal learning, Yang and Tang [131] solve the sparse reward problem by constructing a reward generator to generate inner rewards and guide the agents to learn control policies with deep neural networks. The proposed reward-shaping approach does not require specific domain knowledge, but rather enables the agents to instruct themselves by learning to generate inner rewards. In the sparse and very sparse reward settings of the GRF environment, agents are able to learn to play soccer effectively.

Inspired by the self-organization principle in zoology, Ma et al. [132] propose a self-supervised intrinsic reward mechanism, named expectation alignment (ELIGN). Similar to how animals collaborate in a decentralized manner with those in their vicinity, agents trained with expectation alignment learn behaviors that match their neighbors' expectations. This allows the agents to learn collaborative behaviors without any external reward or centralized training. Experimental results show that ELIGN works well in the Academy 3 vs 1 with Keeper competitive task of the GRF environment, while other more complex tasks remain to be solved.

In addition, Li et al. [133] propose a model-based algorithm that increases reward density by using model error as an extra reward. Wang et al. [134] propose a method of individual reward-assisted team policy learning (IRAT), which learns two policies for each agent from the dense individual reward and the sparse team reward, with discrepancy constraints to update the two policies mutually. To better utilize soccer knowledge for dense reward design, [149] transfers knowledge from the soccer evaluation task to the MARL task to solve the challenges of sparse reward and the contradiction between consistent cognition and policy diversity. The model with the transfer mechanism achieves perfect performance in the complex 11 vs 11 task of the GRF environment.
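The idea of using model error as an extra reward can be illustrated with a deliberately simple forward model. The sketch assumes a tabular running-mean predictor and a squared-error bonus scaled by a coefficient `beta`; the class `RunningMeanModel`, the function `shaped_reward`, and these design choices are hypothetical and are not the algorithm of [133].

```python
class RunningMeanModel:
    """Toy forward model: predicts the next observation for a (state, action)
    key as the running mean of the next observations seen so far."""

    def __init__(self, dim):
        self.dim = dim
        self.mean = {}
        self.count = {}

    def predict(self, key):
        return self.mean.get(key, [0.0] * self.dim)

    def update(self, key, next_obs):
        n = self.count.get(key, 0) + 1
        old = self.predict(key)
        self.mean[key] = [m + (x - m) / n for m, x in zip(old, next_obs)]
        self.count[key] = n


def shaped_reward(model, key, next_obs, env_reward, beta=0.1):
    """Environment reward plus a bonus proportional to the model's squared error."""
    pred = model.predict(key)
    error = sum((p - x) ** 2 for p, x in zip(pred, next_obs))
    model.update(key, next_obs)  # the model improves, so the bonus decays
    return env_reward + beta * error


model = RunningMeanModel(dim=2)
first = shaped_reward(model, ("s0", "pass"), [1.0, 0.0], env_reward=0.0)
second = shaped_reward(model, ("s0", "pass"), [1.0, 0.0], env_reward=0.0)
```

A transition the model has not yet learned earns a bonus, while a familiar one earns none, so the agent receives a dense learning signal even when the environment reward is zero.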

3) Role-Based Learning: To address the high demand for the scalability of multi-agent reinforcement learning algorithms, policy decentralization with shared parameters (PDSP) is widely used, where agents share their neural network weights. Parameter sharing significantly improves learning efficiency and accelerates training. However, such a parameter-sharing method has apparent drawbacks in complex tasks. In tasks such as soccer games, different players have different roles and need substantial exploration and different strategies. When parameters are shared, agents tend to acquire similar behaviors because they typically adopt similar actions under similar observations, preventing efficient exploration and the emergence of sophisticated cooperative policies. Therefore, many researchers focus on role-based learning.

Fig. 4. The structure of the knowledge-embedded policy framework.

In order to adaptively balance diversity promotion and parameter sharing, Li et al. [136] aim to introduce diversity in both the optimization and the representation of the MARL framework. Specifically, an information-theoretical regularization is proposed to maximize the mutual information between agents’ identities and their trajectories, encouraging extensive exploration and diverse individualized behaviors. In representation, agent-specific modules are incorporated in the shared neural network architecture and are regularized by the L1-norm to promote parameter sharing among agents while keeping the necessary diversity.
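The effect of an L1 penalty on agent-specific modules can be sketched as below. The additive composition of shared and agent-specific weights, the coefficient `lam`, and the function names are assumptions for illustration and do not reproduce the architecture of [136].

```python
def effective_weights(shared, specific):
    """Weights an agent actually uses: shared part plus its own offset."""
    return [s + d for s, d in zip(shared, specific)]

def l1_penalty(specific_by_agent):
    """Sum of absolute values of all agent-specific offsets."""
    return sum(abs(d) for specific in specific_by_agent for d in specific)

def regularized_loss(task_loss, specific_by_agent, lam=0.01):
    """Task loss plus an L1 term that pushes agent-specific offsets toward zero."""
    return task_loss + lam * l1_penalty(specific_by_agent)
```

An agent whose offsets are all zero relies entirely on the shared weights, so the penalty favors sharing by default and retains an agent-specific deviation only where it reduces the task loss enough to pay for itself.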

Furthermore, some studies utilize attention mechanisms to facilitate the role emergence of agents. Yang et al. [150] present a role-based attention model for reinforcement learning. The proposed model uses convolutional neural networks to generate soft attention maps, adding crucial role information in the task, forcing the agent to focus on important features and distinguish task-related information. Experimental results in the GRF environment show that role distinctions can help agents perform better in soccer games.

Li et al. [136] propose an information-theoretical regularization to maximize the mutual information between agents’ identities and their trajectories to encourage diverse individualized behavior. Xu et al. [137] propose consensus learning for cooperative multi-agent reinforcement learning, inspired by viewpoint invariance and contrastive learning. Following the principle of centralized training with decentralized execution, different agents can infer the same consensus in discrete spaces without communication. The one-hot consensus is inferred by all agents, which fosters cooperation among them.

D. An Example Combining Orientation and Decision-Making

Orientation models provide a deep assessment of the situation on the field. However, existing models are often used for post-match analysis and are not integrated with real-time decision-making processes. The emergence of reinforcement learning provides the possibility for orientation models to be applied in real-time decision-making. Existing reinforcement learning algorithms for soccer typically consider the problem from a data-driven perspective, ignoring the domain characteristics of the soccer scenario. Here, we provide an example of combining orientation models with decision-making models using GRF in a reinforcement learning framework.

Existing RL methods typically convert the pitch and player positions into two-dimensional matrix inputs, and then use convolutional networks to extract features. The advantage of this approach is that it can better extract spatial features of the situation on the pitch, but it also often leads to low learning efficiency since only the positions of the players are considered. Additionally, traditional convolutional networks are not sufficient to effectively capture complex spatial features in soccer scenarios, such as the deformation of formations. To address these issues, as shown in Fig. 4, we propose a knowledge-embedded learning framework in our previous work [151].
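The conversion of player positions into two-dimensional matrix inputs can be sketched as follows. The grid size, the coordinate ranges used as defaults, and the three-plane layout (left team, right team, ball) are assumptions for illustration rather than the exact observation format of any particular method.

```python
def rasterize(positions, width=12, height=8,
              x_range=(-1.0, 1.0), y_range=(-0.42, 0.42)):
    """Count entities per grid cell; coordinate ranges are assumed defaults."""
    grid = [[0.0] * width for _ in range(height)]
    for x, y in positions:
        col = int((x - x_range[0]) / (x_range[1] - x_range[0]) * width)
        row = int((y - y_range[0]) / (y_range[1] - y_range[0]) * height)
        # Clamp so that points on or beyond the far boundary stay inside the grid.
        col = min(max(col, 0), width - 1)
        row = min(max(row, 0), height - 1)
        grid[row][col] += 1.0
    return grid

def stack_planes(left_team, right_team, ball, **kwargs):
    """One plane per entity type, suitable as input to a convolutional network."""
    return [rasterize(left_team, **kwargs),
            rasterize(right_team, **kwargs),
            rasterize([ball], **kwargs)]
```

Such planes carry positions only, which is the limitation noted above: nothing in them encodes who controls which region of the pitch.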

In this framework, we design a novel pitch control model which quantitatively provides space influence values of a single player, the whole team, and the ball. Different from existing PC models, this approach additionally considers each player’s various capabilities, including flexibility, explosive force, and stamina. Based on this model, we construct a knowledge-embedded state representation which, compared to traditional two-dimensional inputs that only contain positions, embeds the spatial advantage of teams, players, and the ball on the soccer field. Furthermore, a deformable convolution network is constructed for state representation extraction, which is used to process the geometric transformation of the players’ positions and spatial influence values generated by the pitch control model. Based on this comprehensive state representation, a PPO-based reinforcement learning scheme is adopted to generate the final policy. The proposed framework is evaluated in the 11 vs 11 scenario with the self-play training method in the GRF environment. The experimental results show that the framework successfully promotes learning performance.
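A capability-aware pitch control value can be illustrated with a deliberately simplified sketch. This is not the model of [151]: a single `speed` parameter stands in for the capabilities listed above, and the exponential decay and logistic combination are assumptions chosen for brevity.

```python
import math

def player_influence(player_pos, cell, speed=1.0, decay=1.0):
    """Influence decays with the time the player needs to reach the cell."""
    time_to_reach = math.dist(player_pos, cell) / speed
    return math.exp(-decay * time_to_reach)

def team_control(cell, team_a, team_b):
    """Control value in (0, 1) for a cell; above 0.5 favors team A.

    team_a, team_b: lists of (position, speed) pairs (hypothetical format).
    """
    infl_a = sum(player_influence(p, cell, s) for p, s in team_a)
    infl_b = sum(player_influence(p, cell, s) for p, s in team_b)
    return 1.0 / (1.0 + math.exp(-(infl_a - infl_b)))
```

Evaluating `team_control` over every grid cell yields an extra input plane that encodes spatial advantage in addition to raw positions; a faster player shifts the boundary of controlled space in their team's favor even when distances are equal.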

V. MORE DISCUSSIONS ON SOME PROMISING DIRECTIONS

Here we enumerate some important issues that may be promising research directions for both professional soccer analytics and AI communities.

1) Soccer as a Benchmark: Soccer has been an ideal benchmark for AI research. It covers the complete OODA loop and involves representative multi-agent decision-making issues such as large state and action spaces, complicated team cooperation, and dynamic tactics evolution. For example, for large state and action space issues, hierarchical learning may be a choice instead of end-to-end learning. It may also be helpful to learn initially from an easy task and then learn from harder tasks in a curriculum learning mode [13]. For efficient team cooperation, multi-agent learning techniques such as credit assignment [22], communication learning [21], sparse reward [134], and role emergence [136] are available tools. Dynamic tactics evolution involves considerations of uncertain opponent modeling [152], varying task learning [153], and mechanisms for evolving strategies such as self-play and population-based learning [10].

2) Sim2Real Transfer: Learning in a virtual environment opens a vast opportunity for real-world soccer analysis. The practicability of the Sim2Real transfer is thus a promising research direction. As discussed in the Introduction, the gap between real-world and virtual matches may not be as huge as we assume. One of the reasons is that many methods to deal with the Sim2Real issue have been constantly proposed. These methods stem from different primitive studies. For example, from the transfer learning perspective, the model learned in a virtual environment can be adopted in a real-world match through feature transfer or instance transfer. From the perspective of improving model generalization and robustness, numerous domain randomization, domain adaptation, and domain generalization methods can be applied where the virtual and real matches are taken as different domains [154]. Counterfactual reasoning also uncovers vast possibilities based on real-world match data and extensive reasoning in a virtual environment [122].

3) Human-Oriented Factors: The human-oriented nature of soccer poses several extra challenges for sports analytics and AI research. The first is human interpretation, which means the model should be understandable to a human and act like a human. For the latter, a counterexample is an agent that learns excellent dribbling skills and dribbles the ball directly to the goal to score without any team cooperation. The second is human uncertainty. There are diverse uncertainties in soccer matches, mostly because the participants are human. Models should account for these subjective and uncertain factors. An alternative is to only consider macroscopic scenarios where the impacts of personal uncertainties are not that obvious. Another human-oriented factor is that the entities making the final decisions and executing actions are human. Therefore, many AI models are for decision support, e.g., analyzing the value of each state and action and suggesting some actions, but not directly generating continuous decision results.

4) Unlocking More Scenarios: Decision-supporting AI techniques, e.g., MARL, unlock numerous game-changing research and application scenarios. For example, just as xG augments actual goal scores, more indicators and AI models can be developed for a better understanding of soccer matches. Furthermore, with detailed tracking data, descriptive models can be developed for scenarios that are difficult to quantify, such as key passes and through balls. The number of key passes and through balls is often intuitively annotated by professional soccer analysts because their definitions are instinctive and often involve many players and large pitch areas. However, an AI model with a powerful feature representation ability may characterize these relatively high-level and abstract concepts. Then a prescriptive model can be developed to suggest specific players for making a key pass or through ball. Likewise, counterfactual models are useful for learning the defensive or offensive style of a given team. Then a champion model (to learn from a champion team) or a model to characterize the opponent of the next match can be developed.
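Among the directions above, the domain randomization idea mentioned under Sim2Real transfer lends itself to a short sketch. The parameter names and ranges below are hypothetical and do not correspond to actual settings of GRF or any other simulator; the point is only that each training episode draws a different variant of the virtual domain.

```python
import random

# Hypothetical simulator parameters and ranges, for illustration only.
HYPOTHETICAL_RANGES = {
    "ball_friction": (0.8, 1.2),
    "player_max_speed": (0.9, 1.1),
    "observation_noise": (0.0, 0.05),
}

def sample_domain(rng, ranges):
    """Draw one randomized simulator configuration for a training episode."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

def training_domains(n_episodes, seed=0, ranges=HYPOTHETICAL_RANGES):
    """Yield a fresh configuration per episode from a seeded generator."""
    rng = random.Random(seed)
    for _ in range(n_episodes):
        yield sample_domain(rng, ranges)
```

A policy trained across many such variants is encouraged not to depend on any single simulator setting, which is the intuition behind treating the virtual and real matches as different domains.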

VI. CONCLUSION

In this paper, we conduct a systematic review of soccer analysis, evaluation, and decision-making from both the sports analytics and AI research communities. Two intersecting taxonomies are highlighted, i.e., the OODA loop to characterize the complete match process and the virtual or real domain where the match happens. We first present the available data for real-world soccer and typical simulation environments for virtual usage. Typical orientation and decision-making models are subsequently reviewed. Finally, further challenges and promising directions are discussed. We claim that it is important to bridge the gap between real-world and virtual soccer matches. Thus, available data and orientation and decision-making models for both real-world and virtual soccer matches are summarized. It becomes clear that AI can bring game-changing technologies for soccer analytics and, meanwhile, soccer has been a representative benchmark for AI research where milestone issues such as Sim2Real transfer and human interpretation are promising research and application directions. This is an emerging topic and our discussion in this paper is a start in the interdisciplinary direction.
