Intelligent Task Offloading and Collaborative Computation in Multi-UAV-Enabled Mobile Edge Computing

2022-04-20 05:57JingmingXiaPengWangBinLiZesongFei
China Communications 2022年4期

Jingming Xia,Peng Wang,Bin Li,Zesong Fei

1 Nanjing University of Information Science and Technology,Nanjing 210044,China

2 Beijing Institute of Technology,Beijing 100081,China

Abstract:This article establishes a three-tier mobile edge computing(MEC)network,which takes into account the cooperation between unmanned aerial vehicles(UAVs).In this MEC network,we aim to minimize the processing delay of tasks by jointly optimizing the deployment of UAVs and offoading decisions,while meeting the computing capacity constraint of UAVs.However,the resulting optimization problem is nonconvex,which cannot be solved by general optimization tools in an effective and effcient way.To this end,we propose a two-layer optimization algorithm to tackle the non-convexity of the problem by capitalizing on alternating optimization.In the upper level algorithm,we rely on differential evolution(DE)learning algorithm to solve the deployment of the UAVs.In the lower level algorithm,we exploit distributed deep neural network(DDNN)to generate offoading decisions.Numerical results demonstrate that the two-layer optimization algorithm can effectively obtain the near-optimal deployment of UAVs and offoading strategy with low complexity.

Keywords:mobile edge computing;multi-UAV;collaborative cloud and edge computing;deep neural network;differential evolution

I.INTRODUCTION

1.1 Backgrounds and Motivations

The popularity of intelligent devices and the development of specifc technologies,such as augmented reality and virtual reality,has brought about the massive growth of data and posed new challenges to mobile communication technology[1–3].In addition,the booming development of the Internet of Things(IoT)also makes this challenge more serious[4–7].How to deal with so much data quickly and effectively has become a signifcant problem,which is worthy of consideration.

Mobile edge computing(MEC)is a promising solution.In recent years,MEC has been receiving a lot of attention due to its various performance in the feld of communication.With the advantages of MEC,such as short communication delay,good confdentiality and good scalability,the problem of large amount of data computing can be effectively alleviated[8,9].Nonetheless,due to fxed base stations,MEC inevitably suffers from poor disaster response and infexible deployment.To tackle this issue,fexible location deployment of edge servers is essential[10,11].

With this regard,MEC assited by unmanned aerial vehicle(UAV)opens up new possibilities for solving the dilemma.UAV has recently attracted much attention in wireless communication due to the characteristics of high mobility,fexible deployment,and low cost[12–16].By taking advantage of these characteristics of UAV,edge servers built on UAV can well solve the above problems.Therefore,UAV-assisted MEC network is becoming an evident trend.In the early years,there has been a lot of research on single drone-assisted MEC networks.For instance,the authors of[17]proposed a two-tier structure,where a single UAV and ground terminal tackle computing tasks cooperatively.In[18],the authors investigated a new MEC network,which takes full advantage of the nature of the UAV.In particular,the single UAV played a role of edge computing node or relay.

However,due to the limited computing power and service scope of single UAV,the shortcomings of single-UAV-assisted MEC system gradually come to light.Based on this reality,more and more studies focus on the MEC network assisted by multiple UAVs.For example,in[19],the authors proposed an energy-effcient UAV-supported MEC architecture where multiple UAVs equipped with MEC computing platform are deployed as edge servers to provide computational assistance to users on the ground,and the weighted energy cost of the system was reduced by jointly optimizing the trajectory of UAVs and computing resources allocation.The authors in[20]presented a MEC-enabled UAV communication system,where a number of UAVs are served by terrestrial base stations equipped with computation resource in the nonorthogonal multiple access manner.Their main goal was to minimize the sum of transmission energy and computing energy of the system.

Although the cooperation of cloud and edge in UAVassisted MEC model has been studied in the literature,the cooperation between UAV edge nodes is still missing,to the best of our knowledge.Due to the limited computing power of a single UAV,it is not enough to provide services for a large volume of users by relying on a single UAV.Naturally,this inspired us to explore the cooperation between UAVs.To this end,this paper mainly studies the full cooperation problem in a multi-UAV-assisted MEC network.

1.2 Our Contributions

This paper presents a three-layer MEC network and studies how to minimize the total processing delay of hot spots(HSs)while satisfying the computing capacity constraint of UAVs by jointly optimizing UAVs deployment and offoading decisions.We stipulate that the UAV can act either as edge computing server or as a relay to offoad computing tasks to other UAVs or remote clouds.Although the existence of UAV-UAV direct communication may worsen the uplink performance of UAV[21],this infuence can be neglected.In this MEC model,we introduce a high-altitude server(HAS)as cloud server due to the fact that compared with traditional fxed base stations,HAS is fexible and cheaper to deploy.In addition,HAS is also able to cope with natural disasters such as earthquakes and tsunamis.Notably,we formalize the problem and fnd that is a nonlinear integer programming problem,which cannot be solved effectively by general optimization tools.If the traditional relaxation method and convex optimization technique are used,the complexity of the problem will be quite high.Similarly,if the enumeration method is used,the computational complexity is too high to be realistic.To solve this problem with low complexity,we propose a two-layer optimization algorithm consists of distributed deep neural network(DDNN)and differential evolution(DE).Compared with a single DNN,multiple DNNs use different training parameters,which results in huge differences in the output layer and accelerate the convergence speed of the algorithm.To sum up,the main contributions of this paper are summarized as follows:

·We present a novel three-layer MEC model,which taking account of the horizontal cooperation between UAVs.With the goal of minimizing the total processing delay of HSs,we investigate the deployment of UAVs and offoading decisions while satisfying the computing capacity constraint of UAVs.

·The formulated optimization problem is mathematically expressed as a nonlinear integer programming problem,whose complexity increases dramatically with the increase of the number of HSs and UAVs.In order to reduce computational complexity,we propose a two-layer optimization algorithm.In the upper layer algorithm,we rely on the DE to seek the optimal deployment of UAVs.In the lower layer algorithm,we exploit the DDNN to solve the problem of offoading decisions.

·We evaluate the performance of proposed algorithm in comparison with some baseline algorithms.The simulation results show that the proposed algorithm can effectively fnd the nearoptimal offoading decisions of HSs and appropriate locations for UAVs.Also,we investigate the effects of different parameters on the system performance.

The remainder of this article is organized as follows.Section II describes the related work.System model and the problem formulation are illustrated in Section III.In Section IV,the solution to this problem is demonstrated.The simulation results and analysis are shown in Section V.Finally,we draw the conclusion in Section VI.

II.RELATED WORK

In recent years,UAV-assisted MEC in the feld of wireless communication has received extensive attention and research.Thereinto,some studies use UAVs as edge computing nodes to provide computing services to users on the ground.For instance,the authors of[22]proposed a trajectory control algorithm based on multi-agent deep reinforcement learning,which is used to independently manage the trajectory of each UAV.Then,a low-complexity method was introduced to optimize the offoading decision of users after obtaining the fight trajectory of UAV.[23]focused on the application of UAV in the scenario where users’ mobility was considered and used the DDQN algorithm to plan the trajectory of UAV.In order to ensure the QoS of each mobile device,the UAV with limited energy dynamically planned its trajectory according to the position of the users.

UAVs can not only serve as edge processors for users,but also play a role of relay.More explicitly,the authors in[24]investigated a newfangled UAVsupported MEC system,where the UAV serves as a relay.Because of the limited computing power of IoT devices,the tasks can only be offoaded to the UAV or ground stations to calculate.The authors jointly optimized the location of UAVs,resource allocation and offoading decision to minimize the energy consumption.Similarly,the authors of[25]considered that tasks can be calculated locally,by UAVs or by the base station.The authors proposed a new optimization equation to minimize the total energy consumption by optimizing bit allocation,time slot scheduling,power allocation,and trajectory of UAV.In[26],considering the mobility of the client and the edge server,the authors proposed a scheduling scheme of the online edge server taking advantage of the high maneuverability of UAVs.

In addition,the role of UAV in global scheduling in MEC network can not be ignored.The UAV equipped with edge server is generally deployed between the client and the cloud,which enables the UAV to sense the state of the client and the cloud well,so as to realize the role of global scheduling.In[27],the authors used UAVs to establish the dynamic digital twin of vehicles and road side units to capture the time-varying resource supply and demand,so as to realize the unifed resource scheduling and allocation.Similarly to[28,29],the authors proposed two incentive mechanisms to optimize the global resource allocation and offoading strategy.

III.SYSTEM MODEL AND PROBLEM FORMULATION

3.1 Network Model

As shown in fgure 1,we consider a three-tier MEC network,which includesMHSs,NUAVs equipped with lightweight MEC platforms and a HAS,and the HSs and UAVs are indexed byM={1,...,M}andN={1,...,N},respectively.There are many IoT devices in each HS and the tasks of IoT devices converge to the center of the HS at the beginning.Note that each UAV performs two roles,one to handle tasks offoaded to the UAV and the other to offoad tasks to HAS.Therefore,each UAV can be considered as an edge node or a relay and the HAS has enough computing resources and power to provide computing services.

In the considered network,we import separately(xi,yi,0)and(0,0,H)to represent the central threedimensional coordinate of theith(∀i ∈M)HS and coordinate of the HAS,whereHis the altitude of the HAS.V=(v1,v2,...,vj,...,vn)is applied to represent the positions ofNUAVs,wherevjis the coordinate of thejth(∀j ∈N)UAV.vjis denoted as(Xj,Yj,h),wherehis the altitude of UAV.It is assumed that theith HS has a total computation taskwhereDirepresents the data size of taskUiandCidenotes the number of CPU cycles required to process taskUi.In addition,we introduceTQoSto represent the maximum latency of each computation task.

In consideration of the limited computing power of IoT devices,we do not consider the local computing of HS in this article,which implies that all tasks should be offoaded to the UAV or HAS.Therefore,we introduceA=(a1,...,ai,...,am)to represent the offoading decisions of HSs,whereaidenotes the offoading decision of computation taskUi.Whenai=0,the taskUiwill be offoaded to the UAV layer;otherwise,it will be offoaded to the HAS.We stipulate that each HS can only be associated with one UAV or the HAS at the same time.In addition,we consider that the computing capacity of the UAV is limited and represented bynmax,wherenmaxmeans the number of HSs that one UAV can serve at the same time.Therefore,these yield the following constraints

The notations used in this paper are summarized in Table 1.

Table 1.Notation setting.

Table 2.Simulation parameters.

3.2 Computation Model

In this article,the computing tasks can only be handled by UAVs or the HAS.By fully taking account of the cooperation between UAVs,for taskUi,there are three execution possibilities.According to different offoading schemes,the computation cost in terms of delay for HSs are different.

3.2.1 Closest UAV Execution

3.2.2 Adjacent UAV Execution

3.2.3 Cloud Execution

Whenai= 1,taskUiwill be frst offoaded to the closestjth UAV and then be offoaded to the HAS via this UAV.Since the computing power of the HAS is very strong,the computing delay of the remote cloud can be ignored.The distance between thejth UAV and HAS is

and the uplink transmission rate ofjth UAV to the HAS and corresponding transmission delay can be expressed as

whereB3is the bandwidth between the UAV and HAS.The latency for this execution mode is given by

3.3 Problem Formulation

In this article,the deployment of UAVs and offoading decisions are jointly optimized to minimize the total execution delay of HSs while taking account of the computational capability constraint of UAVs.Through the above analysis,the optimization problem can be formulated as follows:

where constraints(16e)and(16f)control the range of the deployment position of the UAV.(16g)makes sure that each task satisfed the constraints of QoS.

IV.PROPOSED METHOD

It can be easily observed that problem P1 is a mixed integer nonlinear programming problem due to the fact that the objective function and constraints contain binary variables and discrete random variables,which is NP-hard.It is diffcult to solve problem P1 by the traditional convex optimization method.However,we found that the constraints can be decomposed into two parts,one related to the offoading decisions and the other related to the deployment of UAVs.To make this problem tractable,we divide the problem into two subproblems.Specifcally,one subproblem optimizes the offoading strategy,and the other optimizes the deployment of UAVs.

Given the deployment of UAVsV,there are only the offoading constraints.Thus,problem P1 can be simplifed to

To fnd a satisfactory solution to the problem P2,we exploit the DDNN algorithm,which has been shown to be effective in solving this dichotomy problem.

“If you can’t allow your partner to have latitude26 in what he or she eats, then maybe your problem isn’t about food,” said Susan Jaffe, a psychiatrist in Manhattan

Similarly,given the offoading strategyA,there remain only the deployment constraints of the UAVs.Thus,problem P1 can be reformulated as

In order to solve problem P3,we resort to the DE algorithm.In the following,we will introduce the details of the proposed method.

4.1 DDNN Algorithm

4.1.1 DDNN Algorithm

The structure of the DDNN algorithm is shown in fgure 2,which includes offoading decision generation and offoading strategy update.

In the stage of offoading decision generation,we frst input the dataDinto the DDNN and we can getKgroups of loose offoading decisions throughKparallel DNNS.In our MEC network,because offoading decisions are binary variables,we need to map all the loose decisions obtained before to binary variables.Then we calculate the delay of each decision group according to the formula.Due to usingKparallel DNNs and random initialization,the generated offoading decisions must be different.Finally,we store the dataDand best decisionsy*determined by the delay result into a memory bank.The storage here is the labeling process of supervised learning.In the stage of offoading strategy update,a batch of samples are taken from the memory bank to train theKparallel DNNs.According to[32][33],using memory bank to train DNN can break the correlation between data and experience replay technology can better accelerate the convergence speed of DNN.The details of two stages will be described in the next two sections.

4.1.2 Offoading Decision Generation

For each input dataDk,we can obtain differentKgroups of outputs withKactors with one group per actor.The offoading actor is shown in fgure 3.Other actors generate actions as well.Assume that one set of outputs isxk,we have

whereyk ∈(0,1)is a loose set of offoading decisions.Then,we need to map the loose decisionsykto binary variables.Here,we partition decisionsykby using the threshold of 0.5 and the corresponding offoading decisions are obtained by

After obtaining offoading decision,we need to determine which group is the best.The offoading decision which is best can be obtained by following

The best offoading decisions obtained above are stored as labels in a limited memory.When the memory is full,the new data will replace the oldest data.Kparallel DNNs were trained through using the sample data in the memory bank.For iterationt,we select randomly several groups of data and decisions from the memory bank to update the weight and deviation of DDNN with the aim of minimizing the cross-entropy loss.The cross-entropy loss is calculated by the following equation

In this paper,we use the cross-entropy function as the loss function because it can effectively speed up the convergence of the algorithm compared to other loss functions such as mean square error in classifcation problem.

4.2 DE Algorithm

As shown in fgure 4,the position of each UAV is coded as an individual,and the whole population represents the deployment of UAVs.This encoding mechanism has the following advantages: 1)the length of each individual in the process of evolution is fxed;2)The length of each individual is equal to 2,which means that the deployment of the UAV is optimized in a very low-dimensional search space[34].The detailed steps are as follows:

4.2.1 Initialize the Population

In the initialization step,our aim is to get the initial deployment of the UAVs.We frst initialize the horizontal and vertical coordinates for the frst UAV.Then,we generate the second UAV while constraint(16d)is satisfed and the rest and so on.For thejth UAV,we initialize it using the following formula

whererandis a random number in(0,1),which helps generate multiple different coordinates and ensures that the coordinates remain within bounds.

4.2.2 Variation

In this step,our goal is to generate a temporary new individual of a random individual in the population.Two individuals of the parent generation are randomly selected to generate difference vector,and then another individual is selected to sum with the difference vector to generate a experimental individual.The mutation of DE is introduced as follows:

wherevtindicates the position of UAVs in thetth iteration.is the temporary position of thepith UAV with a serial number in the(t+1)th iteration,and its function is to prepare for the later cross.Fis the scaling factor,which mainly affects the global optimization ability of the algorithm.The smallerFis,the better the local searching ability of the algorithm is.The largerFis,the more possibility the algorithm can jump out of the local minimum point,but the convergence speed will be slow.In general,we set the value ofFto 0.7.

4.2.3 Cross

In this step,we aim to re-assign a random individual extracted from the previous step to determine a new population.The crossover operators of DE is introduced as follows:

wherejrandis an integer randomly selected between 1 and 2 to ensure that the positions of the same UAV with two different iterations differ in at least one latitude[34];andCRis the crossover control parameter.The larger the value ofCRis,the faster the convergence rate will be.However,this can easily lead to precocious phenomenon which will lead to poor learning results.Generally,we set the value ofCRto 0.4 by taking into account learning speed and learning results.

4.2.4 Selection

Algorithm 1.Main algorithm.Input: D: Data size Output: A: Offoading decisions;V: Deployment of UAVs 1: Initialize the location of HS and UAVs,the K DNNs and memory;Set iterations T and I;2: for t=1,2,3,...T do 3: Train DDNN via algorithm 2;4: for i=1,2,3,...I do 5: Randomly select one set of data;6: Adjust deployment of UAVs via algorithm 3;7: end for 8: end for

Algorit Input:hm 2.DDNN algorithm.D: Data size;V: Deployment of UAVs Output: A: Offoading decisions 1: Set iterations T and training interval σ;2: for t=1,2,3,...T do 3: Generate relaxed offoading action xt =ft(Dt),yt =sigmoid(xt);4: Map yt into binary action ˆy*t;5: Computer the best Q*(Dt,ˆy*t)according to ˆy*t;6: Choose best y*t according to y*t=argminQ*(Dt,ˆy*t);7: Update the memory by adding(Dt,yt);8: if t/σ =0 9:Randomly sample K batches of memory;10:Train the DNNs and update using the DAM algorithm;11: end if 12: end for

The goal of this step is to compare the populations from the two previous iterations and retain the best one.The DE algorithm adopts a greedy strategy at the stage of selection.If constraint(16d)is satisfed and the total delay calculated according to the new position is better,the location of the UAVs will be updated;otherwise,the location of the UAVs will not change.After a certain number of selections,better and better deployment of UAVs will be retained.The formula of selection is as follows:

Algorithm 3.DE algorithm.Input: Vt: Deployment of UAVs Output: Vt+1: Deployment of UAVs 1: Set iterations G;2: for g =1,2,3,...G do 3: Select a set of data Ut;4: Calculate Q(Vt);5: Randomly select three UAVs as vp1,vp2,vp3;6: G*pi,t+1 =vp1,t+F*(vp2,t-vp3,t)(p1/=p2/=p3/=pi);7: if constraints(8)is satisfed 8:if rand <CR or j =jrand 9:v*pi,t+1,i,j =G*pi,t+1,i,j;10:else 11:v*pi,t+1,i,j =vpi,t,i,j;12:end if 13: end if 14: Calculate Q(V*t+1);15: if Q(V*t+1)<Q(Vt)16: Vt+1 =V*t+1;17: else 18: Vt+1 =Vt;19: end if 20: end for

4.3 Algorithmic Complexity

V.SIMULATION RESULTS

In this section,we evaluate the performance of the proposed algorithm through simulations.For comparison,we consider the following four baseline schemes.

1.Full offloading:All computing tasks are offoaded to the cloud for processing.

2.Greed:Because edge computing can generally provide low computation delay,each HS will offoad its computation tasks to the associated UAV edge node.

3.Enumeration:This algorithm searches all possible decisions to fnd the optimal solution.Due to the high computational complexity of the algorithm,we evaluated its performance in only a small amount of data.

4.Deep Q-network:This algorithm uses Deep Qnetwork to learn the offoading actions of HSs.

In the simulation,the software environment is Python 3.7 with TensorFlow and MATLAB 2019a,and the hardware environment is a GPU-based server.The system has a HAS,fve UAVs equipped with edge computing platform and fve HSs.Besides,it is assumed that HAS has enough power to train proposed algorithm and the HSs are randomly deployed in a fxed area.Other corresponding parameters are shown in the table 2.

Figure 5 shows the ratio of the optimal solution obtained by the enumeration method to the suboptimal solution obtained by the proposed DDNN algorithm versus different training step.What we frst fnd is that the ratio goes up with the number of training step until it converges at a certain point.It’s worth noting that the higher the value of the ratio,the closer the solution of the proposed DDNN algorithm is to the optimal solution.Besides,we can also fnd that the convergence rate of 5 DNNs is signifcantly faster than that of 3 DNNs and 4 DNNs and the ratio of 5 DNNs is greater than 3 DNNs and 4 DNNs.The reason for this is that the more DNNs there are,the better the chance of generating a better set of offoading decisions.As a result,this paper sets the number of DNNs toK=5 because it has the best convergence performance and can obtain a better near optimal solution.

Figure 1.System model of three-tier mobile edge computing with multi-UAV and a single HAS.

Figure 2.The structure of DDNN algorithm.

Figure 3.An offloading actor.

Figure 4.Deployment coding mechanism of UAV.

Figure 5.The ratio versus different training steps.

Figure 6.The total delay versus different UAV’s transmission power.

Figure 7.The total delay versus different UAV’s calculation power.

Figure 8.The total delay for the five algorithms versus different sampling data.

Figure 9.The total delay versus different iterations.

Figure 6 illustrates how the total delay changes with the growth of transmission power of UAVs.It is obvious that the delay decreases as the transmission power of UAVs increases.However,it is interesting to note that the decline is obvious at the beginning and gradually slows down due to the fact that the ratio of transmission delay to total delay gets smaller and smaller with the increasing of transmission power of UAVs.

Figure 7 plots the relationship between the total delay and the calculation power of UAV.It can be clearly observed that the delay does not change at the beginning because the offoading decision obtained by the DDNN algorithm is that all tasks are offoaded to the cloud when the computing power of UAVs is small.However,when the computing power reaches a certain level,this has changed because the offoading decisions begin to include not only cloud computing,but also collaborative computing of UAVs.In addition,when the computation frequency is very large,the total delay almost shows a linear downward trend.The main reason for this is that the offoading decision obtained by the DDNN algorithm is that all tasks are offoaded to UAVs.Obviously,this is not the result we want.Thus,we set the UAV’s calculation frequency to 30*100 cycle/s.

Figure 10.The deployment of UAVs.

Figure 8 compares the total delay of HS for the fve algorithms versus different sampling data.We can fnd that the total delay of the proposed DDNN algorithm is always lower than that of all-cloud algorithm and greedy algorithm.The main reason for this is that the proposed DDNN algorithm provides a variety of offoading schemes for HS,which takes into account the collaboration between the edge nodes of UAVs and the cloud server.Also,it can be seen that the proposed DDNN algorithm has better simulation results than that of deep Q-network algorithm.Besides,compared with the enumeration method,the proposed algorithm still has some disadvantages in computing time delay,because the solution obtained by the enumeration method is the global optimal solution,while the solution obtained by the DDNN algorithm is the global near-optimal solution.However,the gap between proposed algorithm and the enumeration method is not very big,which indicates that proposed algorithm can effectively reduce the total delay of HSs.

Figure 9 shows the variation of the total delay with the number of main iterations.We can see that the total delay decreases with the number of iterations,but the decreasing speed gradually slows down.Especially,after the ffth iteration,the total delay even begins to fuctuate.The main reason for this is that the deployment of UAVs obtained by DE learning is close to the optimal solution to some extent,and it is diffcult to learn the optimal solution if we continue to learn.It can be observed that the delay result after optimization is nearly 5%less than that without optimization,which proves that the proposed algorithm can effectively obtain a reasonable offoading strategy and deployment of UAVs to some extent.

Figure 10 plots the deployment of UAVs in a randomized experiment.It can be seen that the fnal position is always aligned because it is benefcial to reduce transmission delays between the UAVs.In addition,we can also fnd that the UAVs will eventually be deployed between the HSs to ensure that each HS can be served by UAVs with a low delay.Combined with the conclusion in fgure 9,we can deduce that the fnal deployment is a near optimal solution.

VI.CONCLUSION

In this paper,we investigated a multi-UAV enabled three-tier MEC network,in which the deployment scheme of UAVs and offoading decisions of HSs are jointly designed to minimize the calculated delay of HSs’tasks under the condition of satisfying the computing power constraint of each UAV.To address this issue,we proposed a two-layer algorithm which consists of DE to solve the deployment scheme of UAVs and DDNN to effectively generate offoading decisions.A large number of numerical results were conducted to show that the proposed algorithm can accelerate the convergence speed and effectively reduce the total delay of HSs.

ACKNOWLEDGEMENT

This work was supported in part by National Natural Science Foundation of China(Grant No.62101277),in part by the Natural Science Foundation of Jiangsu Province(Grant No.BK20200822),in part by the Natural Science Foundation of Jiangsu Higher Education Institutions of China(Grant No.20KJB510036),and in part by the Guangxi Key Laboratory of Multimedia Communications and Network Technology(Grant No.KLF-2020-03).