Peng Wei, Wei Feng*, Yunfei Chen, Ning Ge, Wei Xiang
1 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
2 Department of Engineering, University of Durham, DH1 3LE, Durham, UK
3 School of Computing, Engineering and Mathematical Sciences, La Trobe University, Melbourne, VIC 3086, Australia
Abstract: Networked robots can perceive their surroundings, interact with each other or with humans, and make decisions to accomplish specified tasks in remote, hazardous, or complex environments. Satellite-unmanned aerial vehicle (UAV) networks can support such robots by providing on-demand communication services. However, under the traditional open-loop communication paradigm, network resources are usually divided into user-wise, mostly independent links, ignoring the task-level dependency of robot collaboration. Thus, it is imperative to develop a new communication paradigm that takes into account the high-level content and values behind the data, so as to facilitate multi-robot operation. Inspired by Wiener’s Cybernetics theory, this article explores a closed-loop communication paradigm for the robot-oriented satellite-UAV network. This paradigm handles group-wise structured links, so as to allocate resources in a task-oriented manner. It could also exploit the mobility of robots to liberate the network from full coverage, enabling new orchestration between network serving and positive mobility control of robots. Moreover, the integration of sensing, communications, computing, and control would enlarge the benefit of this new paradigm. We present a case study for joint mobile edge computing (MEC) offloading and mobility control of robots, and finally outline potential challenges and open issues.
Keywords: closed-loop communication; mobility control; satellite-UAV network; structured resource allocation
Robots are promising for accomplishing specified tasks in remote, hazardous, or depopulated areas, as shown in Figure 1. In these areas, a single robot may not be able to accomplish complex tasks, such as nuclear leakage emergency response [1], unknown environment exploration [2], and offshore maintenance/inspection [3]. Multiple robots are usually required, and they need a communication network to coordinate with each other [1, 4, 5]. However, these remote, hazardous, or depopulated areas usually lack terrestrial communication infrastructure. Simply extending the current fifth generation (5G) network to such areas, for example oceans and mountains, is not cost-effective. For example, the energy consumption of 5G cellular networks with full coverage would be quite high [6].
Figure 1. Application scenarios of networked robots in remote/hazardous/depopulated areas.
Satellite networks can provide wide-area coverage. The 3rd Generation Partnership Project (3GPP) standard [7] states that, under the assumption of 5 ms network latency, the end-to-end (E2E) latency between a user equipment (UE) and a ground station via a satellite link is up to 35 ms for low Earth orbit (LEO), 95 ms for medium Earth orbit (MEO), and 285 ms for geostationary Earth orbit (GEO). Furthermore, according to [8], the Inmarsat-5 GEO satellites can offer an upload speed of 5 Mbps. According to [9], the Starlink satellites deployed in LEO offer an average upload speed of 12.04 Mbps in the United States with an E2E latency of 40 ms. However, to support video applications for monitoring and controlling mobile robots, the work in [10] specifies a maximum allowed E2E latency of 10 ms with an uplink data rate greater than 10 Mbps. When a substantial number of mobile robots are activated, an uplink data rate of 10 Mbps per robot may not even be fully supported by LEO satellites. For example, for vehicular connectivity [7], satellite networks should support uplink data rates of at least 25 Mbps. In a nutshell, the high propagation latency and relatively low data rate of satellites make it difficult to meet the stringent requirements of robotic applications.
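The gap between the quoted satellite figures and the robotic requirement can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names are hypothetical, and treating the quoted Starlink uplink rate as a budget that is split equally among robots is a simplifying assumption, not something stated in [9] or [10].

```python
import math

# Requirement for video-based monitoring and control of mobile robots, as quoted from [10].
MAX_LATENCY_MS = 10.0
MIN_UPLINK_MBPS = 10.0

# (E2E latency in ms, uplink rate in Mbps) for Starlink, as quoted from [9].
STARLINK_LEO = (40.0, 12.04)


def meets_requirement(latency_ms, uplink_mbps):
    """Return (latency_ok, rate_ok) for one link against the requirement."""
    return latency_ms <= MAX_LATENCY_MS, uplink_mbps >= MIN_UPLINK_MBPS


def robots_supported(total_uplink_mbps, per_robot_mbps=MIN_UPLINK_MBPS):
    """Robots that fit if the uplink is split equally (an assumption)."""
    return math.floor(total_uplink_mbps / per_robot_mbps)


latency_ok, rate_ok = meets_requirement(*STARLINK_LEO)
print(latency_ok, rate_ok)                # latency check fails, rate check passes
print(robots_supported(STARLINK_LEO[1]))  # robots served at 10 Mbps each
```

Under these assumptions the rate is sufficient for a single robot only, and the latency bound is missed regardless of the number of robots.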
Unmanned aerial vehicle (UAV) networks present a promising solution for delivering on-demand communications in remote or hazardous areas, such as for timely disaster warnings and monitoring [11]. However, establishing and maintaining efficient communications among UAVs poses significant challenges [11]. Intermittent links and dynamic topology make it difficult to guarantee seamless handover and coverage. Additionally, the limited on-board energy results in a limited lifespan of UAV networks.
Owing to the wide-area coverage of satellites and the flexibility and strong line-of-sight links of UAVs [12, 13], a hybrid satellite-UAV network is a promising solution for providing on-demand services in remote or depopulated areas. In addition to this ground-to-space/air change, robots working in remote, hazardous, or complex environments also require paradigm shifts in wireless communications. For example, most current multi-robot control algorithms require precise synchronization [4], so a hard timing constraint is imposed on the maximum communication latency of robots. This requirement may not be met by current communication protocols, which are typically designed under the “best-effort” regime. Under the traditional open-loop communication paradigm, network resources are usually divided into user-wise, mostly independent links, ignoring the task-level dependency of robot collaboration. Thus, it is imperative to develop a new communication paradigm that takes into account the high-level content and values behind the data, so as to facilitate multi-robot operation. Towards this end, high-level feedback should be carefully integrated, as it is vital for enhancing the stability, accuracy, and precision of control. Furthermore, the convergence of sensing, communications, computing, and control should be considered.
Inspired by Wiener’s Cybernetics theory, this article explores a closed-loop communication paradigm for the robot-oriented satellite-UAV network. This paradigm handles group-wise structured links instead of traditional user-wise independent links, so as to allocate resources in a task-oriented manner. It could also exploit the mobility of robots to liberate the network from full coverage, enabling new orchestration between network serving and positive mobility control of robots. Moreover, the integration of sensing, communications, computing, and control would enlarge the benefit of this new paradigm. We present a case study for joint mobile edge computing (MEC) offloading and mobility control of robots, and finally outline potential challenges and open issues.
The remainder of this paper is organized as follows. In Section II, three paradigm shifts from conventional networks to robot-oriented networks are presented. In Section III, a case study on multi-robot collaboration in the hybrid satellite-UAV network with MEC is given. In Section IV, open issues are discussed. Finally, in Section V, conclusions are drawn.
Based on Shannon’s theory [14], conventional communications focus on reproducing messages from the transmitter at the receiver, often disregarding “high-level information feedback” from the receiver to the transmitter. As shown in Figure 2, this open-loop communication paradigm provides pipeline-like transmission services. However, in robotic applications, received messages are used for robot control after being reproduced. Consequently, feeding back the outcomes of robot control to the transmitter becomes pivotal, as it is an inherent requirement of control. In essence, successful robot control involves a seamless interplay of action and feedback, constituting a closed-loop process. This entails not only pipeline-like information transmission but also high-level information feedback. Inspired by Wiener’s Cybernetics theory [15], there is a need for a paradigm shift from open-loop to closed-loop communications. Figure 3 illustrates the Wiener’s Cybernetics-inspired paradigm.
Figure 2. Simple diagram of the conventional communications based on Shannon’s theory.
Figure 3. Simple diagram of the Wiener’s Cybernetics-inspired communications.
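The value of feeding control outcomes back to the sender can be illustrated with a toy example. The sketch below is not taken from the article: the scalar plant, the constant drift, and the proportional gain are all hypothetical choices made only to contrast a precomputed command sequence (open loop) with commands recomputed from the fed-back state (closed loop).

```python
def open_loop(target, steps, drift):
    """Apply a precomputed command sequence; the sender never sees the outcome."""
    x = 0.0
    command = target / steps  # planned without knowledge of the drift
    for _ in range(steps):
        x += command + drift
    return x


def closed_loop(target, steps, drift, gain=0.5):
    """Recompute each command from the fed-back state (proportional control)."""
    x = 0.0
    for _ in range(steps):
        command = gain * (target - x)  # uses feedback of the control outcome
        x += command + drift
    return x


TARGET, STEPS, DRIFT = 10.0, 20, -0.2
open_error = abs(TARGET - open_loop(TARGET, STEPS, DRIFT))
closed_error = abs(TARGET - closed_loop(TARGET, STEPS, DRIFT))
print(open_error, closed_error)  # closed-loop error is roughly ten times smaller
```

In this toy setting the unobserved drift accumulates in the open-loop case, whereas the feedback keeps the residual error bounded.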
In traditional communication networks, according to the 3GPP standard [16], independent links are allocated to mobile users based on their requests. As shown in Figure 4, communication resources are usually divided into mostly independent resource blocks and are scheduled to different mobile users through multiple access technologies, including orthogonal frequency division multiple access (OFDMA), time division multiple access (TDMA), and space division multiple access (SDMA). Messages are exchanged between a base station and a mobile user in the frequency division duplex (FDD) or time division duplex (TDD) mode. After communication resources are allocated, each mobile user is assigned a user-wise link. These links are independent of each other at the task level, because the scheduling and utilization of user-wise resources have nothing to do with the final goal of information transformation of different mobile users.
Figure 4. User-wise independent links in conventional communications.
In robotic applications, a single robot may not be sufficient to accomplish complex tasks in remote or hazardous areas, such as remote healthcare [1, 17] and unknown environment exploration [18]. It needs to be controlled in connection with humans or neighboring robots through satellite-UAV networks, and to be assisted by a large number of connected sensors. As a result, there is a significant difference between conventional and robot-oriented networks in terms of space-time-frequency communication resource allocation. To show the difference, an example of collaborative robots is illustrated in Figure 5. As can be seen from the figure, a task is accomplished through the closed-loop control of three robots with different functionalities. First, at time t1, Robot 1 observes the state of the task and sends the sensory data to a base station. Second, at time t2, the base station forwards the data to a computing unit in Robot 2 for information processing and situation analysis, and at time t3 it receives the corresponding computation results from Robot 2. Third, at time t4, the base station transmits the control command to Robot 3. After receiving and translating the control command, Robot 3 drives its motion control system to move its arm, wheels, or other components and execute task-specific actions. This procedure is repeated until the task is completed.
Figure 5. An example of collaborative robots.
To support the collaboration of robots, Figure 6 shows the process of task-oriented structured resource allocation. As can be seen from the figure, communication resources are allocated to each robot in different but dependent ways. The amount of sensory data is often much larger than that of computation results and control commands. Thus, a large bandwidth is allocated to the uplink from Robot 1 to the base station and the downlink from the base station to Robot 2. Meanwhile, a small bandwidth is granted to the uplink from Robot 2 to the base station and the downlink from the base station to Robot 3. Moreover, as robots in different positions execute their respective actions at different times, they are allocated different space-time resources. More importantly, as shown in Figure 6, the resource configuration and consumption in robot-oriented networks depend on the order in which collaborative robots execute a task. For instance, as the amount of sensory data increases, the uplink bandwidth allocated to Robot 1 is increased, leading to an increase in the downlink bandwidth allocated to Robot 2. When Robot 1 changes its position to observe the task state from different orientations, Robot 3 also changes its position to execute the corresponding movements. At this moment, the allocation of space-time communication resources for Robot 3 varies with that for Robot 1. Therefore, as shown in Figure 6, during robot collaboration, communication resource allocation and scheduling are coupled, achieving a final synergy in accomplishing the task.
Figure 6. Task-oriented structured resource allocation for collaborative robots.
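The coupling described for the three-robot example can be sketched in code. The following is a minimal illustration rather than the allocation scheme of the article: the payload-proportional rule, the names `Link`, `task_links`, and `allocate_group`, and the idea of one shared bandwidth budget per control cycle are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    payload_kb: float  # data carried over this hop in one control cycle


def task_links(sensory_kb, result_kb, command_kb):
    """The four hops of the Robot 1 -> Robot 2 -> Robot 3 loop, in task order."""
    return [
        Link("R1->BS uplink", sensory_kb),    # sensory data
        Link("BS->R2 downlink", sensory_kb),  # same sensory data, forwarded
        Link("R2->BS uplink", result_kb),     # computation results
        Link("BS->R3 downlink", command_kb),  # control command
    ]


def allocate_group(links, total_bw_mhz):
    """Split one budget across the whole group in proportion to payload."""
    total = sum(link.payload_kb for link in links)
    return {link.name: total_bw_mhz * link.payload_kb / total for link in links}


small = allocate_group(task_links(400, 10, 10), total_bw_mhz=10.0)
large = allocate_group(task_links(800, 10, 10), total_bw_mhz=10.0)
for name in small:
    print(f"{name}: {small[name]:.2f} MHz -> {large[name]:.2f} MHz")
```

Because the group, not the individual user, is the unit of allocation, increasing the sensory payload raises the Robot 1 uplink and the Robot 2 downlink together, while the result and command links shrink.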
Resource allocation and scheduling for a group of collaborative robots exhibit a strong “high-level task correlation” in the spatial, temporal, and frequency domains. Different from user-wise links in conventional networks for individual users, group-wise links with “high-level task correlation” are essential for enabling a team of collaborative robots to accomplish a specific task. These group-wise links become the minimum unit of resource allocation, as they contain a task-determined structure. We call this task-oriented structured resource allocation. It changes the basic object of resource allocation and inevitably brings new complexity, which has to be tackled with a systematic mindset.
In telephone networks, as shown in Figure 7, the human user first moves to a location equipped with a telephone and then picks up the handset to access the network. At this stage, users move for services. In contrast, a cellular network fully covers an area, within which users can move freely and access the network seamlessly. This architecture has liberated human users, but it puts pressure on the network: tremendous communication resources must be occupied to ensure full coverage [6].
Figure 7. Network coverage patterns and user mobility behaviors in telephone, cellular, and robot-oriented satellite-UAV networks.
Different from both telephone and cellular networks, satellite-UAV integrated networks could exploit the mobility of robots, leading to a cost-effective coverage regime. In general, due to the high mobility of satellites and UAVs, as well as complicated geographical environments, full coverage cannot always be guaranteed. Instead of pursuing full coverage, we leverage the controllable mobility of robots, navigating them to areas with better network coverage under their task constraints, just like guiding a human being to a location equipped with a telephone. As shown in Figure 7, before a robot enters an area without network coverage, its moving path can be preprogrammed by the network so that it moves from the uncovered area to a covered one. Following the concept of the radio map in [19], Figure 8 illustrates an example of radio map-guided robot movement through a satellite-UAV network in remote areas. In this case, the trajectory of a robot is planned in advance to avoid blind spots while accomplishing the assigned task.
Figure 8. An example of radio map-guided robot movement in remote areas.
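Radio map-guided movement can be sketched as path planning over a coverage grid. The example below is a simplified illustration under assumptions not made in the article: the radio map is a hypothetical binary grid (1 for covered, 0 for a blind spot), the robot moves between 4-connected cells, and breadth-first search stands in for whatever planner a real system would use.

```python
from collections import deque


def plan_path(radio_map, start, goal):
    """Shortest 4-connected path from start to goal using covered cells only."""
    rows, cols = len(radio_map), len(radio_map[0])
    if not radio_map[start[0]][start[1]] or not radio_map[goal[0]][goal[1]]:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and radio_map[nr][nc] and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable without crossing a blind spot


# Hypothetical radio map: the 2x2 block in the middle has no coverage.
RADIO_MAP = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
path = plan_path(RADIO_MAP, (0, 0), (3, 3))
print(path)
```

The planned route goes around the blind spot instead of through it. Task constraints such as deadlines or waypoints would enter as additional costs or constraints in a fuller treatment.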
In practice, robot control algorithms are mainly intended for robotic applications [4]. Moreover, robot control algorithms and network optimization algorithms come from different manufacturers and operators. As a result, it is challenging for networks to actively navigate robots. In the new network paradigm shown in Figures 7 and 8, incorporating intelligent technologies such as artificial intelligence (AI) and machine learning (ML) will be advantageous in integrating network optimization algorithms and robot control algorithms. Consequently, networks can interact with robots to offer high-quality services and control their mobility, no longer simply pursuing full coverage at high cost.
Human-centric communications have been fundamental from first generation (1G) to 5G networks. In 5G, there has been a rise in machine-centric communications, including ultra-reliable and low latency communications (URLLC) and massive machine type communications (mMTC) [20]. However, closed-loop control of robots, consisting of sensing, communications, computing, and control, remains an open problem. For instance, in [21], when a robotic arm is manipulated to execute a given action, translating the commands into robot movement introduces inherent latency. This control latency results in a reduced data rate and an underutilization of bandwidth in the closed-loop control, and thus in a reduction in network throughput. Furthermore, when the control latency is high, optimizing only the communication latency does not effectively reduce the end-to-end latency in robot control.
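A back-of-the-envelope model makes the effect of control latency concrete. The sketch below assumes a stop-and-wait style loop in which the link stays idle while a command is translated and executed. This assumption, the function name, and all numbers are hypothetical and are not taken from [21].

```python
def loop_metrics(payload_bits, link_rate_bps, control_latency_s):
    """Return (cycle time in s, effective rate in bit/s, link utilization)."""
    comm_time = payload_bits / link_rate_bps
    cycle = comm_time + control_latency_s
    return cycle, payload_bits / cycle, comm_time / cycle


base = loop_metrics(1e6, 10e6, 0.4)    # 0.1 s transmission + 0.4 s control
faster = loop_metrics(1e6, 20e6, 0.4)  # link rate doubled, control unchanged
print(base)    # cycle 0.5 s, about 2 Mbit/s effective, 20% utilization
print(faster)  # cycle 0.45 s: only a 10% shorter loop
```

In this example the link is busy for only a fifth of each cycle, and doubling the link rate shortens the loop by just 10%, which is why the communication stage cannot be optimized in isolation.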
Recently, two new machine-centric use cases, integrated artificial intelligence and communication (IAIC) and integrated sensing and communication (ISAC), have been proposed for 6G robotic applications [5]. Multi-robot collaboration entails efficient sensory data acquisition and sharing, logical reasoning, and deterministic execution of control instructions, so as to enable safe and reliable closed-loop control [22]. Thus, closed-loop control requires the convergence of sensing, communications, computing, and control. However, the relationship between sensing, communications, computing, and control within a closed loop has not been fully uncovered, and there is not even a unified metric to characterize the different stages. Entropy might be a tool for solving this problem, as Shannon entropy characterizes communications [14] and the intrinsic entropy rate characterizes control systems [23]. A concept of unified entropy could be envisioned, which links the sensing, communication, computation, and control stages through the elimination of uncertainty.
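For reference, the standard information-theoretic quantities behind the phrase "elimination of uncertainty" are recalled below. These are textbook definitions only. How they would extend to a unified metric across sensing, computing, and control is exactly the open question raised above and is not settled here.

```latex
% Shannon entropy of a discrete source X, and the uncertainty about X
% removed by observing Y (mutual information).
\begin{align}
  H(X)   &= -\sum_{x} p(x)\,\log_2 p(x), \\
  I(X;Y) &= H(X) - H(X \mid Y).
\end{align}
```

In this view, a stage of the loop is useful to the extent that its output $Y$ reduces the remaining uncertainty $H(X \mid Y)$ about the task-relevant state $X$.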
In summary, robotics is an interdisciplinary field of research, involving electrical engineering, computer science, mechanical engineering, and biology/cognitive science. A large amount of heterogeneous information for networked robots will be perceived, transferred, and processed. To measure this heterogeneous information in a unified way, multidisciplinary integration is required.
We present a case study to illustrate the benefit of the closed-loop paradigm, as shown in Figure 9. Multiple collaborative robots are supported by a hybrid satellite-UAV network with MEC. To complete a specific mission within a given time, efficient control of robot mobility, such as velocity control, is needed. During movement, mobile robots also need to periodically offload sensor data to MEC servers and receive computation results from them [24]. To minimize the average task completion time throughout the whole journey, a joint optimization problem of data offloading and velocity control is formulated as problem (1), with objective (1a) and constraints (1b)-(1f).
Figure 9. System model for collaborative robots in the hybrid satellite-UAV network with MEC.
Here, $T_n(t)$ represents the task completion time in the $t$-th time slot, and $L_n$ is the number of time slots in the $n$-th access point (AP) coverage region, which depends on the target velocity of the robot. In constraint (1b), the task completion time cannot exceed the maximum completion time $T_{n,\max}(t)$. Constraint (1c) ensures that the total moving time does not surpass the tolerable total moving time $T_{\mathrm{move}}$. In constraint (1d), only one MEC server is selected based on the binary offloading decision $\alpha(t)$, as specified in constraint (1e). Constraint (1f) defines the permissible range of the target velocity, which is bounded by the minimum velocity $v_{\min}$ and the maximum velocity $v_{\max}$.
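The descriptions of the objective and of constraints (1b)-(1f) suggest a problem of roughly the following shape. This is only a sketch consistent with those descriptions, not a reproduction of the original formulation. The symbols $N$ (number of AP coverage regions), $M$ (number of MEC servers), $v_n$ (target velocity in region $n$), $\alpha_m(t)$ (per-server form of the offloading decision), and $T^{\mathrm{mv}}_n(v_n)$ (moving time in region $n$) are assumptions introduced here, as is the exact normalization of the objective.

```latex
\begin{align}
  \min_{\{\alpha_m(t)\},\,\{v_n\}} \quad
    & \frac{1}{\sum_{n=1}^{N} L_n} \sum_{n=1}^{N} \sum_{t=1}^{L_n} T_n(t)
      \tag{1a} \\
  \text{s.t.} \quad
    & T_n(t) \le T_{n,\max}(t), \quad \forall n, t, \tag{1b} \\
    & \sum_{n=1}^{N} T^{\mathrm{mv}}_n(v_n) \le T_{\mathrm{move}}, \tag{1c} \\
    & \sum_{m=1}^{M} \alpha_m(t) = 1, \quad \forall t, \tag{1d} \\
    & \alpha_m(t) \in \{0, 1\}, \quad \forall m, t, \tag{1e} \\
    & v_{\min} \le v_n \le v_{\max}, \quad \forall n. \tag{1f}
\end{align}
```

Since $L_n$ depends on the velocity, the offloading decisions and the velocity are coupled through both the objective and constraint (1c), which is what makes the problem a joint one.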
To solve the joint optimization problem, a general framework of dual-agent Q-learning is devised, as shown in Figure 10. In the $n$-th AP coverage region, Agent 1 makes the offloading decision $\alpha(t)$. By accumulating the offloading reward and observing the channel state (extrinsic information), Agent 2 determines the target velocity. The channel state indicates whether wireless transmission is available in the $n$-th AP coverage region. Channel unavailability could be attributed to over-the-limit access to communication resources, severe channel fading, or damage to the AP.
Figure 10. Framework of dual-agent Q-learning.
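The interaction between the two agents can be sketched with tabular Q-learning. The code below is a toy illustration, not the algorithm of the case study: the state (channel availability only), the reward shapes, the three slots per region, the region length, and the names `QAgent` and `region_step` are all hypothetical. Reading the reported hyperparameter 0.1 as the learning rate is likewise an assumption.

```python
import random
from collections import defaultdict


class QAgent:
    """Tabular epsilon-greedy Q-learning agent."""

    def __init__(self, actions, lr=0.1, gamma=0.9, eps=0.05, seed=0):
        self.actions = list(actions)
        self.lr, self.gamma, self.eps = lr, gamma, eps
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def act(self, state):
        if self.rng.random() < self.eps:  # explore
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.lr * (td_target - self.q[(state, action)])


def region_step(offload_agent, velocity_agent, channel_ok, slot_times,
                distance_m=100.0, slots=3):
    """One AP coverage region of a made-up environment."""
    offload_reward = 0.0
    for _ in range(slots):
        choice = offload_agent.act(channel_ok)  # Agent 1: offloading decision
        reward = -slot_times[choice]            # shorter completion is better
        offload_agent.update(channel_ok, choice, reward, channel_ok)
        offload_reward += reward
    velocity = velocity_agent.act(channel_ok)   # Agent 2: target velocity
    # Agent 2 is rewarded by the accumulated offloading reward minus moving time.
    velocity_agent.update(channel_ok, velocity,
                          offload_reward - distance_m / velocity, channel_ok)
    return offload_reward, velocity


agent1 = QAgent(actions=[0, 1])       # two hypothetical MEC servers
agent2 = QAgent(actions=[5, 10, 20])  # candidate velocities in m/s
total, v = region_step(agent1, agent2, channel_ok=True,
                       slot_times={0: 2.0, 1: 1.0})
print(total, v)
```

The design point the sketch tries to capture is that Agent 2 does not see the offloading problem directly. It only receives the accumulated offloading reward and the channel state, which keeps each Q-table small.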
In our simulation, according to [12, 24], 20 APs and 20 MEC servers are deployed, where the coverage regions with only satellite communications are numbered {8, 9, ..., 13}. The computational frequencies of cellular and satellite-based MEC servers are drawn from the finite sets {10, 11, ..., 19} (GHz) and {50, 51, ..., 59} (GHz), respectively. The moving distance $c_n$ is randomly chosen from the set {100, 200, 300} (m) for a cellular AP and from the set {1000, 2000, 3000} (m) for the satellite. In the cellular network, the bandwidth is 10 MHz, the transmit power of the mobile robot is 0.2 W, the channel noise power is $2 \times 10^{-12}$ W, and the channel gain is $10^{-6}$. We assume an LEO satellite with a satellite-ground distance of 1000 km. The transmission rates of the robot-satellite and satellite-gateway links are 10 Mbps and 100 Mbps, respectively. The extra average migration delay between the cellular and satellite networks is 500 ms. In each offloading interval of 1 s, the generated data size is randomly selected from the set {100, 250, 400, 550, 700} (KB) with 800 CPU cycles/bit, and the computational capacity of the mobile robot is randomly chosen from the finite set {0.5, 0.6, ..., 1} (GHz). The mobile robot has an acceleration of 2 m/s$^2$, and its velocity is taken from the discrete set {5, 6, ..., 20} (m/s). In Q-learning, the hyperparameters are set as $\lambda = 0.1$, $\gamma = 0.9$, and $\epsilon = 0.05$ with a discount interval of $4 \times 10^{-6}$.
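A few quantities implied by these settings can be worked out directly. The sketch below is illustrative and rests on assumptions not stated in the text: the cellular uplink rate is modeled by the Shannon capacity formula, 1 KB is taken as 1000 bytes, and the satellite propagation delay is the one-way, straight-overhead case.

```python
import math

# Cellular uplink parameters from the simulation setup.
BANDWIDTH_HZ = 10e6
TX_POWER_W = 0.2
NOISE_POWER_W = 2e-12
CHANNEL_GAIN = 1e-6

snr = TX_POWER_W * CHANNEL_GAIN / NOISE_POWER_W       # received SNR (linear)
cellular_rate_bps = BANDWIDTH_HZ * math.log2(1 + snr)  # Shannon capacity (assumed model)

# One-way propagation delay to an LEO satellite 1000 km away.
SPEED_OF_LIGHT_M_S = 299_792_458
leo_delay_s = 1000e3 / SPEED_OF_LIGHT_M_S


def local_compute_time_s(data_kb, cpu_hz, cycles_per_bit=800):
    """Time to process one interval's data on the robot itself."""
    bits = data_kb * 1000 * 8  # assumes 1 KB = 1000 bytes
    return bits * cycles_per_bit / cpu_hz


smallest_task_s = local_compute_time_s(100, 0.5e9)

print(f"SNR: {snr:.0f}")
print(f"cellular rate: {cellular_rate_bps / 1e6:.1f} Mbps")
print(f"LEO one-way delay: {leo_delay_s * 1e3:.2f} ms")
print(f"local execution, 100 KB at 0.5 GHz: {smallest_task_s:.2f} s")
```

Under these assumptions, even the smallest data size takes 1.28 s to process on the slowest robot CPU, which is longer than the 1 s offloading interval and gives some intuition for why offloading matters in this setup.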
For a fair performance comparison, we include conventional offloading, local execution, and greedy schemes within each AP coverage region [24]. In both the conventional offloading and local execution schemes, the mobile robot maintains a constant velocity. Conversely, the greedy scheme employs local searching for velocity decision-making. This involves calculating rewards in each AP coverage region and subsequently searching for the maximal average reward across all AP coverage regions.
Figure 11a compares the proposed scheme with conventional schemes for different values of $N_{\mathrm{CH}}$. As can be seen from the figure, the proposed scheme has a shorter completion time than the conventional schemes. Figure 11b plots the task completion time against the size of the data generated by the robot. The data size per offloading interval is randomly selected from the set {100, 250, 400, 550, 700} (KB) + $\Delta D$, where the incremental parameter $\Delta D$ is chosen from {1, 3, 5, 7, 9} (MB). It is shown that, for different data sizes, the proposed scheme always has the shortest task completion time.
Figure 11. Comparison of the task completion time among conventional offloading, local execution, greedy, and the proposed schemes.
Figure 11c plots the task completion time against the moving time for different values of $N_{\mathrm{CH}}$. Since a constant velocity is assumed in conventional offloading and local execution, the effect of velocity control on the completion time cannot be clearly observed for these schemes. Although the greedy scheme has a shorter moving time than the proposed scheme, it has a longer offloading time. Benefiting from velocity control, the proposed scheme attains the shortest task completion time at a moderate moving time. Figure 11c also implies that the proposed scheme adapts well to harsh network environments, owing to its communication state-based velocity control.
The deployment of robots in dynamically changing environments is becoming increasingly prevalent. This trend necessitates the development of advanced network functions and protocols, leading to more intricate network designs. Addressing this complexity requires the augmentation of the network’s self-learning capabilities. State-of-the-art AI/ML techniques [25], such as meta-learning, might play a crucial role in achieving this enhancement.
Robots inherently offer weak security and often need to be controlled within private networks [26]. However, the proposed satellite-UAV network architecture operates in an open environment. To address security and privacy concerns in inter-robot information sharing against various cyberattacks [27], decentralized security methodologies, possibly rooted in blockchain, have to be leveraged. Novel machine-oriented authentication mechanisms should also be developed.
Based on Wiener’s Cybernetics, this article has developed a closed-loop communication paradigm that can enable robot-oriented satellite-UAV networks. We have particularly considered the “high-level task correlation” to facilitate multi-robot operation. The new paradigm could handle group-wise structured links instead of traditional user-wise independent links, so as to allocate resources in a task-oriented manner. It could also exploit the mobility of robots to liberate the network from costly full coverage. Moreover, the integration of sensing, communications, computing, and control would enlarge the benefit of this new paradigm. We have also presented a case study for joint MEC offloading and mobility control of robots to demonstrate the benefits of the new paradigm.
ACKNOWLEDGEMENT
This work was supported in part by the National Key Research and Development Program of China (Grant No. 2020YFA0711301); in part by the National Natural Science Foundation of China (Grant Nos. 62341110 and U22A2002); and in part by the Suzhou Science and Technology Project.