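CAO Jiaxin, YANG Bo, ZHU Shanying
(Department of Automation, Shanghai Jiao Tong University, Shanghai 200240; Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240; Shanghai Engineering Research Center of Intelligent Control and Management, Shanghai 200240)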
Nomenclature
Abbreviations
DGs Distributed generations
DR Demand response
EMS Energy management system
HVAC Heating, ventilation and air-conditioning
PME Public monitoring entity
SE Stackelberg equilibrium
TATD Total average temperature deviation
Parameters and Constants
ηi Energy conversion coefficient of HVAC unit in nanogrid i (°F/kWh)
Γi Queue shift parameter related to indoor temperature in nanogrid i (°F)
γi Discomfort cost weighting coefficient for users in nanogrid i (¢/(°F)²)
θ Queue shift parameter related to battery energy (kWh)
εi HVAC inertial coefficient in nanogrid i
Cb Battery usage cost coefficient (¢/(kWh)²)
Emin/Emax Minimum/maximum allowable energy state of battery unit (kWh)
Rated power of HVAC unit in nanogrid i (kWh)
Maximum power injection into/exported from nanogrid i (kWh)
n Total number of nanogrids
Vi Weighting parameter for nanogrid i under the Lyapunov optimization framework
VP Weighting parameter for PME under the Lyapunov optimization framework
Sets and Indices
Ωng,i/ΩPME Feasible strategy set for nanogrid i/PME
k Index of the time slot (hour)
χk Substitute representation of decision set for PME
Variables
Bk State of virtual battery energy queue (kWh)
Basic load of nanogrid i (kWh)
Ek Energy state of battery unit (kWh)
Energy consumption of HVAC in nanogrid i (kWh)
Net energy generation in PME (kWh)
State of virtual temperature queue in nanogrid i (°F)
Buying price of the main grid (¢/kWh)
Selling price of the main grid (¢/kWh)
Buying price of the PME (¢/kWh)
Selling price of the PME (¢/kWh)
Power generation of small-scale uncontrollable DGs in nanogrid i (kWh)
Outdoor temperature in nanogrid i (°F)
Indoor temperature in nanogrid i (°F)
Optimum comfort temperature for users in nanogrid i (°F)
Power injected into/exported from nanogrid i (kWh)
yk Charging or discharging amount of battery unit in PME (kWh)
Recently, more and more distributed generations (DGs) have been integrated into power systems to reduce carbon emissions and long-distance transmission loss [1-2]. The microgrid/nanogrid has emerged as an effective energy unit as power systems shift from a traditional centralized mode to a distributed one, making the system more reliable, more economical and more efficient [3]. A nanogrid is a small version of a microgrid: a power distribution system for a single house or small building [4]. With intelligent communication and power electronics technologies, a nanogrid can realize two-way communication and energy flow, satisfying users' needs in a more flexible way. Unfortunately, intermittent renewable energy and dynamic energy requirements can lead to a mismatch between power supply and demand, which is detrimental to the efficiency of the connected nanogrids [5].
The existing approaches to maintaining the supply-demand balance are categorized into supply-side management (e.g., scheduling the output of dispatchable generators to optimize total generation costs and satisfy users' demand [6], or determining dynamic electricity transaction pricing [7]) and demand-side management/demand response (DR) [8]. With the emergence of the energy management system (EMS) and advanced metering infrastructure, smart appliances have been developed at the consumer side, such as the heating, ventilation and air-conditioning (HVAC) unit [9-11] and the battery storage system of an electric vehicle [12]. Their energy consumption can be optimized and adjusted to benefit from dynamic prices set by the external utility. This is the so-called price-based DR, which has been used in diverse settings to help maintain the supply-demand balance [13], lower carbon emissions [14] and reduce users' energy bills by shifting/shaving the energy demand from high-peak to off-peak periods [15].
Among these household appliances, HVAC units account for up to 60% of total energy consumption, and the elastic nature and thermal capacity of dwellings give HVAC units certain power storage characteristics. Such features bring challenges to the implementation of an effective DR, because the power demand of an HVAC unit is unknown and it introduces a correlation of indoor temperature over time (i.e., the time-coupling property). This has become a meaningful research subject. Some studies focus on solving such device energy scheduling problems by employing dynamic programming, Monte Carlo methods [16] and model predictive control [17]. For example, [18] provides a stochastic model predictive HVAC control scheme incorporating chance constraints to jointly optimize energy use and thermal comfort with effective utilization of renewables. These works can minimize the expected energy cost under the assumption that the future parameters can be predicted exactly or that the underlying stochastic process is known. However, they are difficult to adapt to scenarios with unmodeled uncertainties or changing probabilities. Other works have considered the long-term optimization problem for HVAC devices, without system parameter prediction, to reduce the variation of energy consumption [9] or to minimize the aggregate deviation between zone temperatures and their set points together with the total energy cost [19-20]. These related works usually focus on the cost optimization of one side (e.g., the customer side), while any information error on the other side will disturb the predetermined energy strategies and may even lead to a new imbalance between power supply and demand.
Alternatively, existing DR models covering both the supply side and the demand side use market bidding/auction [21] or game theory [22-24] to investigate the electricity trading behaviors of multiple players. Recently, the Stackelberg game has become a popular approach for handling sequential decision-making in two-stage problems for independent participants with different objectives by using a leader-follower structure [25]. Such an approach has been widely used to model the energy trading process between an end-user and an external utility, solving the problem of pricing and energy management in microgrids or similar systems [26]. For example, Maharjan et al. [27] have studied the complicated interactions between multiple utility companies and multiple users, aiming to maximize the payoffs of both sides in one slot. Likewise, a real-time price-based energy scheduling problem is formulated as a Stackelberg game model with the objective of balancing supply and demand as well as flattening the aggregated load; the pricing model is given directly as a function of marginal cost [28]. As an extension, [29] provides a hierarchical structure for a grid operator, multiple service providers and the corresponding customers, and proposes a two-loop Stackelberg game to help the operator obtain the required energy from the supply and demand sides at the lowest cost. These works focus on short-term objectives and may not guarantee the long-term interests of the overall system owing to the uncertainties related to random power generation, demand, etc. Consequently, several recent works have investigated stochastic dynamic decision processes within a game-theoretic framework to tackle these uncertainties in time-coupling problems [30]. In [31], the effects of storage units such as batteries on energy management are studied through the corresponding game models. The electricity cost minimization problem is formulated as a Markov decision process and then solved by stochastic dynamic programming. However, the solution may suffer from the curse of dimensionality when implemented in a large-scale user community. Besides these applications, the authors of [32] have studied a stochastic formulation of a game model with one leader and N followers under a real-time pricing demand response scheme, where a certain probability function of the energy load is adopted. A scenario-based stochastic energy management problem with bonus pricing optimization has also been proposed in [33] to maximize the matching level between users' load and forecasted power generation. In contrast, the authors of [34] have designed a special Stackelberg game model with a receding horizon control strategy to optimize the social benefit and minimize the devices' operation cost concurrently for networked distributed energy resources and customers during each sample time.
Note that the above game-based energy management problems over a long-term optimization period explicitly or implicitly require statistical information about future parameters, or need parameter forecasting, and usually ignore the two-way trade pattern. The energy entities in these works are assumed to play a single predefined role, either possessing abundant energy or lacking energy all the time. In fact, renewable generation is stochastic and users' demands are dynamic, so entities may switch back and forth between energy consumers and suppliers across time. This is indeed a two-way trade pattern. However, it is difficult to model and solve the corresponding bidirectional pricing problem between players with unfixed roles across time while accounting for residents' different comfort requirements. The challenges are mainly twofold. On the one hand, the decision-making is coupled among different players across time intervals. Specifically, as mentioned before, the power demand of HVAC units in a nanogrid is unknown. On the other hand, there are time-coupling constraints, and the future status of the system is usually unknown or difficult to obtain accurately.
In this work, to cope with the above issues, we investigate the bilevel energy management problem of two-way real-time pricing and DR over a long period for a public monitoring entity (PME) and nanogrids, each of which can be both a consumer and a supplier during different time slots. Unlike the nanogrids, which aim to minimize their total cost, the PME, which is able to coordinate the energy demand of nanogrids, aims to set electricity prices and optimize the trading profit. The main contributions of this paper are summarized as follows.
1) In the setting of a two-way trade pattern, we propose a new three-layer framework in which PME can trade energy with nanogrids and the main grid bidirectionally. We develop novel individual energy cost and trading profit functions for nanogrids and PME, taking into account the bidirectional real-time pricing, the random two-way power injection and the thermal discomfort cost of residents in nanogrids.
2) Considering the uncertainties in system status, the optimization problem is formulated over a long-term horizon, where the time-coupling constraints and the inter-constraint decision-making between nanogrids and PME (i.e., the coupled interaction in decision-making between the PME and the n nanogrids in the energy management problem, which is specified in (14) and Section 3) make the time-average expected model complicated. To make this model tractable, we introduce virtual queues and utilize the Lyapunov optimization approach to obtain a relaxed form. Rigorous analysis is provided to show that the solutions to the relaxed problem are still feasible for the original one. We point out that the proposed approach does not need prior knowledge of the system statistics.
3) The transaction interaction between PME and nanogrids, which can make decisions independently, is captured by a one-leader and multi-follower Stackelberg game framework. The existence and uniqueness of the Stackelberg equilibrium (SE) are proved theoretically. Moreover, we develop an energy management algorithm that finds the equilibrium iteratively with only a small amount of information exchanged between nanogrids and PME.
The rest of this paper is organized as follows. In Section 2, we present the system architecture and then formulate the optimization problem. The solution process for the bilevel energy management problem is developed in Section 3, where its performance is also analyzed. The devised optimization algorithm is shown in Section 4. The simulation results with practical data are provided in Section 5. Finally, conclusions are given in Section 6.
In this paper, we consider a residential power system consisting of nanogrids, a PME and the main grid, as shown in Fig. 1. Here, each nanogrid corresponds to one smart house equipped with small-scale uncontrollable DGs (e.g., roof-top photovoltaic systems or small wind turbines), electricity load and a house EMS. Each nanogrid consumer is assumed to have two kinds of electricity load: the critical basic electricity demand, which must be maintained under any circumstances and is treated as a random parameter, and the flexible electricity demand, which can be adjusted for the purpose of demand response. (In this paper, we focus on HVAC-like thermally elastic appliances that need to meet users' satisfaction, and model other appliances simply as an inelastic basic load.) Specifically, thermostatically controlled devices such as HVAC units, known for their fast response and thermal inertia, account for a large fraction of demand response programs. This kind of load can keep users' comfort level within an acceptable range even with curtailed consumption. Accordingly, in this work, HVAC units are considered adjustable loads owing to their high power consumption and elastic nature. The PME has its own generation units, local load and a storage device. As a regulator equipped with an EMS, PME can gather and receive data from the nanogrids and the main grid. Besides, PME is responsible for purchasing energy from nanogrids with a renewable power surplus and selling energy to nanogrids short of power. The residual unbalanced energy of PME, if any, can be offset by trading with the main grid in the spot balancing market.
Fig.1 Schematic of a residential power system
For convenience, we introduce the concept of net generation for PME. It is equal to the difference between the power output of generation units and the local load in the PME during slot k. (This paper considers a long-term horizon with a time-slotted model indexed by k = {0, 1, ...}. In addition, all power quantities, such as yk, are in the unit of energy per slot.) As for the storage battery in PME, the stored energy state is denoted by Ek. Assume that the storage battery unit is ideal with unit efficiency. Then we have the following battery dynamics:
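Under the notation assumed here (subscript k for the slot index; the numbering follows the later references to (1) and (2)), the ideal-battery model described in this paragraph can be sketched as follows. This is a reconstruction from the surrounding text, not necessarily the exact typesetting of the original.

```latex
\begin{align}
E_{k+1} &= E_k + y_k, \tag{1}\\
E_{\min} &\le E_k \le E_{\max}. \tag{2}
\end{align}
```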
where Emax is the maximum battery capacity, Emin is the minimum residual capacity to preserve battery life, and yk is the charged amount (if yk > 0) or discharged amount (if yk < 0) during slot k. Considering the finite maximum charge rate (ucmax) and discharge rate (udmax), yk should satisfy
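As a minimal sketch of how the rate limits and the capacity limits interact, the snippet below clips a requested charge amount so that both the assumed rate constraint, -udmax ≤ yk ≤ ucmax, and the capacity bounds hold after the update. The function names and the explicit form of the rate constraint are assumptions made for illustration.

```python
def clip_charge(y, e_now, e_min, e_max, uc_max, ud_max):
    """Clip a requested charge (y > 0) or discharge (y < 0) amount.

    Assumed constraints (illustrative):
      rate limits:     -ud_max <= y <= uc_max
      capacity limits:  e_min <= e_now + y <= e_max
    """
    lower = max(-ud_max, e_min - e_now)  # most negative feasible y
    upper = min(uc_max, e_max - e_now)   # most positive feasible y
    return min(max(y, lower), upper)


def battery_step(e_now, y):
    """Ideal battery with unit efficiency: next state = state + charge."""
    return e_now + y


# Example: battery at 8 kWh, capacity range [2, 10], rate limits of 3 kWh/slot.
y = clip_charge(5.0, e_now=8.0, e_min=2.0, e_max=10.0, uc_max=3.0, ud_max=3.0)
e_next = battery_step(8.0, y)
```

In this example the capacity bound is the binding one, so the request of 5 kWh is reduced to 2 kWh.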
Besides, in practice, the usage cost of the battery should be considered in view of its limited charging/discharging service life. During charging/discharging, conversion loss and energy leakage may occur, which are usually affected by factors such as the speed, amount and frequency of charging/discharging. Instead of modeling these factors accurately, an amortized cost function is adopted to model the effect of the charging/discharging process on the battery unit within one slot. In this function, Cb is a constant coefficient, and we denote by Cmax/Cmin the maximum/minimum first derivative of the cost function with respect to yk.
During each slot k, the basic load of nanogrid i (e.g., lighting, elevator) is unadjustable and should be satisfied first. Consider next the elastic heat load of the HVAC unit in nanogrid i. It is well known that this load is related to the indoor temperature under the heating mode of the HVAC unit [35] (the subsequent analysis can easily be adjusted to deal with the cooling mode, where the evolution function is revised by changing the last plus sign in (4) to a minus sign), satisfying
with the constraint
Here, (4) involves the outdoor temperature in slot k; εi ∈ (0,1) is the inertial coefficient; ηi is the energy conversion coefficient related to the heat-conversion efficiency and the thermal conductivity of nanogrid i; and (5) involves the lower and upper bounds of the comfort temperature for users in nanogrid i.
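To make the time coupling concrete, the following sketch simulates a first-order thermal model of the kind commonly used for HVAC loads. The exact expression of (4) is not reproduced here; the form below, with the heating term entering through a final plus sign, is an assumption consistent with the units of ηi (°F/kWh) and with the remark on the cooling mode. All names and numbers are illustrative.

```python
def next_indoor_temp(t_in, t_out, e_hvac, eps, eta, heating=True):
    """One-slot indoor temperature update (assumed first-order model).

    t_in, t_out : indoor / outdoor temperature in the current slot (deg F)
    e_hvac      : HVAC energy consumption in the current slot (kWh)
    eps         : inertial coefficient in (0, 1)
    eta         : energy conversion coefficient (deg F per kWh)
    """
    sign = 1.0 if heating else -1.0  # cooling flips the last sign
    return eps * t_in + (1.0 - eps) * (t_out + sign * eta * e_hvac)


def simulate(t0, outdoor, energy, eps, eta):
    """Roll the model forward; each decision affects all later slots."""
    temps = [t0]
    for t_out, e in zip(outdoor, energy):
        temps.append(next_indoor_temp(temps[-1], t_out, e, eps, eta))
    return temps


trajectory = simulate(70.0, outdoor=[50.0, 50.0, 50.0],
                      energy=[5.0, 0.0, 5.0], eps=0.9, eta=2.0)
```

Because each temperature depends on the previous one, skipping heating in one slot lowers the temperature in every subsequent slot, which is the coupling that the later sections handle with virtual queues.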
In this paper, the HVAC load consumption is assumed to be regulated continuously within a certain range, i.e.,
where the upper limit is the rated power of the HVAC unit. In particular, when HVAC units are directly controlled by on and off cycles, the power consumption takes only two values (off or rated power). This case, which involves a binary variable, can also be tackled by extending the Lyapunov approach proposed in this paper; see our previous work [36] for details.
Due to the intermittent and stochastic nature of renewable energy generation and the random power demand, nanogrids may have surplus energy during off-peak times or, conversely, lack energy during high-demand periods. Under this circumstance, each nanogrid can be both an energy supplier and a consumer across a long-term horizon. Thus a two-way trade pattern with corresponding bidirectional pricing is needed to keep power demand and supply balanced. The power injected into nanogrid i from PME could be positive or negative; a negative value means that power is exported from nanogrid i in slot k. Moreover, it satisfies
Generally, given the higher selling and lower buying prices of the main grid, nanogrids are stimulated to optimize their consumption and trade with the PME by purchasing energy at a lower price or selling their redundant energy at a higher price. In this paper, PME is in charge of providing supply-demand balance for nanogrids while procuring more revenue through wiser pricing and storage charging decisions. First, to enable this process, we assume without loss of generality that
where the quantities involved are the buying and selling prices of the main grid and of the PME in time slot k, respectively. (The assumed inequality between these prices is rational for a PME with limited storage capacity. Otherwise, nanogrids are inclined to buy energy from the main grid directly, and the residual energy of PME then has to be bought by the main grid at lower prices. Note that this setting also ensures that the determined selling price is less than the average selling price of PME.)
In this context, each nanogrid aims to minimize its average long-term individual cost by scheduling the HVAC energy consumption in each time slot. Considering the maintenance and operation costs of HVAC simultaneously would be more realistic in practice. As mentioned in [37], HVAC maintenance is usually carried out at regular intervals or when the equipment fails. Some studies adopt a lifetime maintenance cost that can be allocated to the annual or even daily operation cost. For example, [38] uses an amortized annual maintenance cost of HVAC. This amortized maintenance cost usually depends on the year and can be deemed constant within a certain operation horizon (e.g., one day). For this reason, the maintenance cost of HVAC is omitted in this paper. In addition, our work takes the electricity consumption cost and the accompanying virtual thermal discomfort as the operation cost, which depends on the bidirectional electricity prices, energy supply and temperature conditions. A more complicated case could include the startup and shutdown operation costs with the corresponding on-off control; a potential solution method can be found in our previous work [36] and is not elaborated here. To sum up, the individual cost of nanogrid i includes the bidirectional energy trading cost (involving the electricity consumption expense) and the thermal discomfort cost. (The operation and maintenance cost of renewable generators could also be included in the system; however, because it is negligible in order of magnitude [39], it is neglected here.) Recall that nanogrids dynamically switch roles between energy consumer and supplier, so the injected power may be positive or negative in response to the varying prices during different time slots. Consequently, the comprehensive cost incurred by nanogrid i under this two-way trade pattern takes the following form:
where the last term is the thermal discomfort cost, which is modeled by the Taguchi loss function with a quadratic form [40-41]; γi is the discomfort weighting coefficient; and the reference temperature in this term is the optimum comfort temperature for users in nanogrid i.
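The structure of the per-slot nanogrid cost described above can be illustrated with a short sketch. The exact expression of (10) is not reproduced here; the code assumes that imports are charged at the PME's selling price, exports are credited at the PME's buying price, and discomfort is quadratic in the deviation from the optimum comfort temperature. The power balance used for the injected energy (basic load plus HVAC consumption minus DG generation) and all names are illustrative assumptions.

```python
def injected_energy(basic_load, e_hvac, dg_generation):
    """Assumed power balance: positive means import from PME, negative means export."""
    return basic_load + e_hvac - dg_generation


def nanogrid_slot_cost(x, pme_sell_price, pme_buy_price, gamma, t_in, t_ref):
    """Two-way trading cost plus quadratic (Taguchi-type) discomfort cost.

    x              : energy injected into the nanogrid (kWh), may be negative
    pme_sell_price : price paid by the nanogrid when importing (cents/kWh)
    pme_buy_price  : price received by the nanogrid when exporting (cents/kWh)
    """
    trading = pme_sell_price * max(x, 0.0) - pme_buy_price * max(-x, 0.0)
    discomfort = gamma * (t_in - t_ref) ** 2
    return trading + discomfort


x = injected_energy(basic_load=2.0, e_hvac=3.0, dg_generation=1.0)
cost = nanogrid_slot_cost(x, pme_sell_price=10.0, pme_buy_price=6.0,
                          gamma=0.5, t_in=70.0, t_ref=72.0)
```

The max terms select one of the two prices depending on the sign of the injected energy, which is what allows a nanogrid to act as a consumer in one slot and a supplier in another.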
Now, as energy management is performed in each slot separately, the overall cost of nanogrid i can be assessed by minimizing the long-term value of (10). Nevertheless, real-time energy management has no knowledge of the future power generation, demand and temperature, which are required in minimizing the long-term value of (10). Consequently, the optimization problem P1 of nanogrid i is formulated as a long-term stochastic optimization problem as follows:
For PME, taking the two-way trade pattern and the battery usage cost into consideration, the net profit obtained during slot k is formulated as (12), where the first two terms represent the revenue procured by trading with all nanogrids; the third term is the aforementioned amortized battery usage cost; and the last two terms denote the cost incurred in offsetting the residual unbalanced energy of PME with the main grid at the main grid's buying and selling prices, which generally need to be forecast in an optimization problem with an infinite horizon.
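The five-term structure of the PME profit can be sketched as below. This is an illustration under stated assumptions rather than the paper's expression (12): the battery cost is taken as quadratic in yk with coefficient Cb (consistent with its unit, ¢/(kWh)²), the residual energy is taken as total injection plus battery charge minus net generation, and the two main-grid prices are given the hypothetical names `grid_purchase_price` (paid by PME when importing) and `grid_feed_in_price` (received by PME when exporting).

```python
def pme_slot_profit(x_list, y, net_generation, pme_sell_price, pme_buy_price,
                    grid_purchase_price, grid_feed_in_price, c_b):
    """Per-slot PME profit under the assumptions stated in the lead-in."""
    sold = sum(max(x, 0.0) for x in x_list)     # energy sold to nanogrids
    bought = sum(max(-x, 0.0) for x in x_list)  # energy bought from nanogrids
    trade_revenue = pme_sell_price * sold - pme_buy_price * bought

    battery_cost = c_b * y ** 2                 # assumed amortized usage cost

    # Positive residual: PME must import from the main grid; negative: export.
    residual = sum(x_list) + y - net_generation
    grid_cost = (grid_purchase_price * max(residual, 0.0)
                 - grid_feed_in_price * max(-residual, 0.0))

    return trade_revenue - battery_cost - grid_cost


profit = pme_slot_profit([3.0, -1.0], y=1.0, net_generation=2.0,
                         pme_sell_price=10.0, pme_buy_price=6.0,
                         grid_purchase_price=12.0, grid_feed_in_price=4.0,
                         c_b=0.5)
```

The sketch makes visible why the PME's pricing and charging decisions are coupled: both the prices (through the nanogrids' responses) and yk change the residual that must be settled with the main grid.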
Similarly, the objective of PME is to maximize the average long-term profit. The decision variables are the bidirectional prices and the battery charge (for brevity, this decision set is denoted as χk). Then we have the following problem P2 of PME:
Constraint (14) indicates the interaction relationship between the PME and the n nanogrids in the decision-making process. To be specific, the energy consumption is determined by each nanogrid and affected by the strategy set of PME.
In this paper, we aim at devising a two-way pricing and DR scheme to optimize the long-term profit of PME and the individual cost of each nanogrid while guaranteeing users' comfort level. Meanwhile, we expect to obtain the optimized result in a distributed way and without forecasting future time-varying prices, power generation, demand and outdoor temperatures.
In this section, to solve the price-based DR problems described in the previous section, we first introduce virtual queues and obtain a relaxed form with the Lyapunov optimization technique. Then we develop a Stackelberg game model G to analyze the interaction procedure between PME and nanogrids. After that, the feasibility of the proposed approach is demonstrated.
It is observed that, in problems P1 and P2, the indoor temperature (4) and the battery storage level (1) are both time-coupled, which means that earlier decisions influence the decisions in subsequent time slots. Such problems are usually solved by dynamic programming, which is computationally intensive in large-scale implementations. In addition, the future parameters (e.g., electricity prices, random power generation, load and outdoor temperatures) in the long-term optimization problems vary over time with unknown statistics, which is a barrier to accurate energy management and pricing.
In the following, we develop a method based on the Lyapunov optimization technique. Unlike dynamic programming, this method uses an alternative approach based on minimizing the drift of a Lyapunov function. This is done by defining an appropriate set of virtual queues. Subsequently, the drift-plus-penalty is obtained with the expectation over the system state, and the drift bound is minimized greedily [42]. After the conversion, the original time-average problems are transformed into real-time subproblems, which allow nanogrids and PME to interact dynamically without knowledge of the stochastic system dynamics and HVAC demand information. For clarity, the above problem formulation process is summarized in Fig. 2. It can be observed that P1 and P2, defined over a long-term optimization period, are finally converted into real-time online problems based on the Lyapunov optimization method. In practice, the time scale of the scheduling is one hour, which matches practical operation.
Fig.2 Problem formulation flow diagram
3.1.1 Virtual temperature queue design
Instead of handling the time-coupling constraint (4) directly, one way is to study its relaxed form, in which the average indoor temperature is bounded over time. A temperature queue with a shift parameter Γi is then used within the Lyapunov optimization framework [42, Sec. 4.4] to ensure that (5) is feasible all the time. The relaxed constraint reads
It is noted that (15) only ensures the average thermal comfort for nanogrid i. However, indoor temperatures at some time points might exceed the comfortable range. Thus, the indoor temperature in such a worst case should also be guaranteed.
For this purpose, we introduce a virtual temperature queue as
where Γi is a real constant. The intuition behind this design is that the thermal demand requests, shifted by the parameter Γi, are buffered in virtual queues when the actual backlog is nonempty. In this way, the virtual queue incurs a larger backlog if thermal loads have not been served for a long period of time. Theorem 3 in a later section and Appendix D of our arXiv version [43] prove that the system can be regulated so that both queues have finite bounds when Γi is within a certain range, and then the users' temperature comfort level can be satisfied. Besides, incorporating (16) into (4), we have the following dynamics:
3.1.2 Obtaining the drift-plus-penalty
First, in order to keep the above temperature queue stable, we define a Lyapunov function for nanogrid i. Subsequently, the one-slot conditional Lyapunov drift is given as
where the expectation is taken with respect to the random power generation, basic load, outdoor temperatures, optimum comfort temperature and the stochastic selection of the power consumption strategy. Then, to stabilize the queue and minimize the nanogrids' time-averaged comprehensive cost simultaneously, we design a drift-plus-penalty term Δv,i by adding a weighted cost function to the drift, as follows:
where the weighting parameter Vi is a constant that denotes the trade-off between the temperature queue stability and the decrease in the comprehensive energy cost of nanogrid i. When Vi = 0 is chosen, only the Lyapunov drift is minimized, which provides no guarantee on the resulting time-average comprehensive energy cost of nanogrid i. In contrast, with a properly designed Vi, it can be shown that whenever the HVAC unit consumes energy, the indoor temperature is always in a feasible region (see Theorem 3 and Appendix D in [43] for details).
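For orientation, the generic drift-plus-penalty construction of the Lyapunov optimization framework [42] takes the form below. The symbols H for the virtual temperature queue and C for the per-slot comprehensive cost are assumed here for illustration, as is the subscript placement of the slot index; the paper's own expressions (18) and (19) may differ in detail.

```latex
\begin{align}
L(H_{i,k}) &= \tfrac{1}{2} H_{i,k}^{2}, \\
\Delta_{i,k} &= \mathbb{E}\left[ L(H_{i,k+1}) - L(H_{i,k}) \,\middle|\, H_{i,k} \right], \\
\Delta_{v,i} &= \Delta_{i,k} + V_i \, \mathbb{E}\left[ C_{i,k} \,\middle|\, H_{i,k} \right].
\end{align}
```

Setting $V_i = 0$ leaves only the drift term, while a larger $V_i$ puts more weight on cost reduction at the price of a larger queue backlog.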
3.1.3 Minimizing the upper bound of drift-plus-penalty
It can be shown that the objective value of P1 is determined by the upper bound of the drift-plus-penalty term Δv,i [42, Sec. 4.5]. Squaring both sides of (17) and combining with (18), we derive that
After plugging (20) into (19), we obtain (21). By minimizing the upper bound of Δv,i shown on the right-hand side of (21), based on the theoretical framework of 'opportunistically minimizing an expectation' in [42, Sec. 1.8], we can obtain the following simplified problem P3 after several manipulations (refer to Appendix A in [43]).
In this way, we can decide the strategy at each slot k purely as a function of the current system state while guaranteeing the time-coupling constraint, which will be shown in Theorem 3. After obtaining the optimized power consumption from P3, the optimal injection power of nanogrid i is
The following theorem provides insight into the optimal value under different prices.
Theorem 1 The optimal consumption strategy of HVAC in nanogrid i is given by
The first two cases, with the explicit formulation in (25), are obtained by proof by contradiction, which is given in the first part of Appendix B in [43]. The results mean that when the buying price offered by PME exceeds a certain threshold, the nanogrid is willing to consume as little HVAC power as possible to maximize its profit. Conversely, when the selling price is low, the nanogrid tends to draw the maximum HVAC power from PME. Note that the implicit function f(χk) in the third case includes several different subcases, for which a precise closed-form expression is difficult to obtain directly. In addition, the values that f(χk) may take are discussed in the second part of Appendix B in [43] through the method of portrayal.
PME dynamically makes decisions to solve its long-term profit maximization problem (P2). Note that the battery constraints (1) and (2) introduce time coupling, which complicates the optimization problem. To avoid such coupling, a time-average expected constraint is considered, i.e.,
We can prove that (1) and (2) imply (26). Summing both sides of (1) over all time slots and taking the expectation yields
Then, dividing both sides by T and taking T → ∞, we have (26), since the initial storage state E0 and the storage capacity are both finite. After eliminating the dependency between storage energy states across time slots caused by the limited battery storage capacity, P2 can be solved by following the Lyapunov optimization framework in a similar way.
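The argument can be written out step by step as follows, assuming the ideal-battery dynamics $E_{k+1} = E_k + y_k$ with $E_{\min} \le E_k \le E_{\max}$ and a deterministic initial state $E_0$ (the subscript placement of the slot index is an assumption of this sketch).

```latex
\begin{align}
E_T - E_0 &= \sum_{k=0}^{T-1} y_k, \\
\frac{\mathbb{E}[E_T] - E_0}{T} &= \frac{1}{T} \sum_{k=0}^{T-1} \mathbb{E}[y_k], \\
\frac{E_{\min} - E_0}{T} \;\le\; \frac{1}{T} \sum_{k=0}^{T-1} \mathbb{E}[y_k] &\;\le\; \frac{E_{\max} - E_0}{T}, \\
\lim_{T \to \infty} \frac{1}{T} \sum_{k=0}^{T-1} \mathbb{E}[y_k] &= 0 .
\end{align}
```

Both bounds in the third line vanish as $T \to \infty$, so the time-average expected charge must be zero; the converse does not hold, which is why the relaxed problem needs a separate feasibility argument.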
First, we introduce a virtual battery energy queue Bk with Bk = Ek + θ, where the constant θ is the shift parameter, which will be specified in a later section. Besides, Bk is updated as
The constraint (26) can be transformed into a virtual queue stability constraint, as shown in [42, Chap. 2], to guarantee the feasibility of (2) even in the worst case.
Following that, the one-slot conditional Lyapunov drift is given by
By minimizing the upper bound of Δv,P, the profit of PME is greedily maximized and the queue Bk is stabilized. We prove in Theorem 4 that the time-coupling constraints (1) and (2) are satisfied under this operation. Finally, the original problem P2 can be converted into the following problem P4 over an individual time slot,
The solution analysis is deferred to Appendix C in [43], where two situations are discussed in detail. In addition, note that a closed-form expression cannot be obtained directly due to the implicit strategy function of the followers. Hence, we develop a best-response algorithm to derive the solution strategies of problems P3 and P4 iteratively, which is shown as Algorithm 1 in a later section.
After completing the above steps, we no longer need to consider the stochastic processes related to unknown factors such as the supply of distributed generation. We can decide the strategy based on the current state observed at each slot to achieve optimization over a long-term horizon without forecasting any system parameters, which makes the originally complicated energy management problems tractable. Specifically, in each slot k, the controller of the energy management system observes the current state of the distributed power generation and chooses the HVAC power demand from the decision space. This decision, together with the current ambient temperature, determines the vector of temperature queues/virtual queues. Inefficient energy management decisions incur a larger backlog in certain queues. These backlogs act as sufficient statistics on which to base the next energy management decision. According to Theorem 4.8 in [42], such an approach yields a performance within O(1/VP) of the optimum obtained with complete information. The advantage of this approach is that it uses the current states to stabilize the system and does not require a priori knowledge of random event probabilities.
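The per-slot decision rule described above can be illustrated with a toy drift-plus-penalty controller for a single HVAC unit. This is a simplified sketch, not problem P3: it minimizes the exact one-slot drift plus a weighted energy cost over a grid of candidate consumptions (the paper minimizes an upper bound and characterizes the solution in Theorem 1), and it assumes a shifted queue equal to the indoor temperature plus a constant `gamma_shift` evolving under a first-order thermal model. All names and values are hypothetical.

```python
def choose_hvac_energy(h, t_out, price, eps, eta, gamma_shift, v, e_max,
                       n_grid=101):
    """Pick e in [0, e_max] minimizing one-slot drift + v * energy cost.

    h           : current shifted temperature queue (indoor temp + gamma_shift)
    gamma_shift : queue shift constant (negative of a target temperature here)
    v           : weight trading off queue stability against cost
    """
    best_e, best_val = 0.0, float("inf")
    for j in range(n_grid):
        e = e_max * j / (n_grid - 1)
        # Assumed queue update implied by the first-order thermal model.
        h_next = eps * h + (1.0 - eps) * (t_out + eta * e + gamma_shift)
        drift = 0.5 * (h_next ** 2 - h ** 2)
        val = drift + v * price * e
        if val < best_val:
            best_e, best_val = e, val
    return best_e


# Only the current queue, outdoor temperature and price are observed.
common = dict(h=-10.0, t_out=50.0, price=1.0, eps=0.9, eta=2.0,
              gamma_shift=-72.0, e_max=10.0)
e_stability_only = choose_hvac_energy(v=0.0, **common)
e_cost_dominated = choose_hvac_energy(v=1e6, **common)
```

With v = 0 the controller heats at full power to pull the queue toward zero, while a very large v suppresses consumption entirely; intermediate values trade comfort against cost without using any forecast.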
Note that the bidirectional pricing scheme set by PME influences how nanogrids schedule their power consumption, which in turn affects the planning of the price mechanism through the total profit obtained by PME. Motivated by this observation, in this subsection, the coupled decision-making process between nanogrids and PME is captured by a one-leader and multi-follower Stackelberg game, where PME is modeled as the leader and nanogrids are modeled as followers according to their functionalities. In this game, followers decide their energy management actions from their feasible strategy sets in response to the bidirectional prices designed by the leader, so as to optimize their respective objectives presented in (22) and (31). Meanwhile, the leader is responsible for making a rational battery charging/discharging strategy and offsetting the unbalanced energy with the main grid. The proposed game is thus a bilevel optimization problem in which followers optimize their utilities at the lower level, while at the upper level the leader determines its strategy knowing the best demand responses of the followers.
It is observed that the problem of seeking the best strategies is equivalent to sequentially optimizing the utility functions of the nanogrids (followers) and the PME (leader) in a backward manner [44]. The outcome at the end of each sequence of the game, where neither PME nor the nanogrids can obtain more benefit by a unilateral change of strategy, is called the SE. Thus a set of strategies (χk,*, ek,*) constitutes an SE for the proposed Stackelberg game if it corresponds to a feasible solution of the following problem G,
It is pointed out that an equilibrium in pure strategies might not always exist in a noncooperative game. Therefore, we need to prove that there exists a unique SE for the proposed Stackelberg game. See Appendix C in [43] for the detailed proof.
Theorem 2 A unique SE exists for the proposed Stackelberg game if the following three conditions are met.
1) The strategy sets of the PME and the nanogrids are nonempty, compact and convex.
2) Once each nanogrid is notified of the strategy set of the PME, it has a unique best-response strategy.
3) The PME has only one optimal strategy given the identified best-response strategies of all nanogrids.
Theorem 2 guarantees that the proposed game can reach the equilibrium as soon as PME is able to find the unique optimal strategy while nanogrids select their optimal energy demand.
Although a unique SE exists theoretically, it is difficult to obtain an analytical solution directly for the complicated bilevel optimization problem. In this section, we develop an iterative energy management algorithm with the bidirectional pricing scheme to reach the SE in a distributed way.
The detailed procedure is shown in Algorithm 1, which is separated into two main parts executed respectively by the PME (steps 1-3, 8-11 and 13) and by each nanogrid (steps 4-7 and 12) in each slot. First, the PME arbitrarily initializes its strategy set, including the two-way prices and the battery charge/discharge amount, before the iteration. The iterative loop in steps 2-11 describes the interaction between the PME and the nanogrids. Within the m-th iteration, each nanogrid i receives the strategy set {χk,m} from the PME and determines its HVAC power consumption by minimizing P3 with nonlinear programming tools in step 5. Then, each nanogrid i calculates its injection power and uploads this value to the PME (step 6). After that, with the information collected from all nanogrids (1,...,n), the PME updates the bidirectional prices and the battery charging value based on the subgradient projection method in [45, Sec. 6.3] and [46] (the objective functions are all convex). In step 9, P+ is the projection operator, which maps the variables onto the feasible regions defined by constraints (3) and (9), and the update uses the subgradients of the PME's optimization function with respect to the two prices and yk during iteration m. We point out that, in Algorithm 1, the adjustment parameters for the two-way prices and the battery charging are determined by the constants δs,0, δs,1, δb,0, δb,1, δy,0 and δy,1. Under such a choice, the convergence of the algorithm can be guaranteed, as shown in [45, 47]. The algorithm proceeds to the next iteration until the distance between two consecutive iterates is smaller than a specified value ϱ. Finally, the nanogrids and the PME update their queue states for the optimization in the next time slot. A simple computational complexity analysis of the proposed algorithm is as follows: the complexity of the PME-side optimization problem is O(n) and that of each follower's problem is O(1), where n is the number of nanogrids.
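The leader-follower loop described above can be sketched on a toy problem. The sketch is not Algorithm 1: it uses a single scalar price instead of two-way prices and a battery variable, a hypothetical quadratic follower cost with a closed-form best response, a finite-difference gradient obtained by querying the followers, and an assumed diminishing step size of the form delta0 / (delta1 + m). It only illustrates the structure of the iteration: followers respond to the announced price, the leader takes a projected (sub)gradient step, and the loop stops when two consecutive iterates are closer than a tolerance.

```python
import numpy as np


def follower_best_response(price, demand, curvature, e_max):
    """Minimizer of price*e + 0.5*curvature*(e - demand)^2 over [0, e_max]."""
    return float(np.clip(demand - price / curvature, 0.0, e_max))


def leader_profit(price, cost, demands, curvatures, e_max):
    """Leader profit when every follower plays its best response."""
    total = sum(follower_best_response(price, d, a, e_max)
                for d, a in zip(demands, curvatures))
    return (price - cost) * total


def stackelberg_iteration(demands, curvatures, cost, p_bounds, e_max, p0,
                          delta0=0.5, delta1=5.0, tol=1e-6,
                          max_iter=1000, h=1e-4):
    """Projected gradient ascent on the leader's price (toy example)."""
    p = p0
    iterations = 0
    for m in range(max_iter):
        iterations = m + 1
        # Gradient estimated from the followers' aggregate responses.
        grad = (leader_profit(p + h, cost, demands, curvatures, e_max)
                - leader_profit(p - h, cost, demands, curvatures, e_max)) / (2 * h)
        step = delta0 / (delta1 + m)  # assumed diminishing step size
        # Projection onto the feasible price interval.
        p_new = float(np.clip(p + step * grad, p_bounds[0], p_bounds[1]))
        converged = abs(p_new - p) < tol
        p = p_new
        if converged:
            break
    responses = [follower_best_response(p, d, a, e_max)
                 for d, a in zip(demands, curvatures)]
    return p, responses, iterations


# Hypothetical data: three followers with unit curvature and a unit leader cost.
price, responses, n_iter = stackelberg_iteration(
    demands=[4.0, 5.0, 6.0], curvatures=[1.0, 1.0, 1.0],
    cost=1.0, p_bounds=(0.5, 6.0), e_max=10.0, p0=1.0)
```

For this toy data the leader's profit is (p - 1)(15 - 3p) in the interior region, whose maximizer is p = 3 with follower responses 1, 2 and 3, so the iterate can be checked against a closed-form equilibrium.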
The proposed algorithm is executed iteratively in the EMSs on the nanogrid and PME sides, so the equilibrium of the Stackelberg game is reached in a distributed way in the broader sense. The PME does not need to know the detailed power generation, demand, temperature or weighting parameter preferences of the nanogrids; it only requires the resulting injection power of each nanogrid. In this way, with less information exchange and only local computation resources, our algorithm can find optimal strategies independently, which helps preserve the users’ privacy. In more detail, the information interaction within the loop steps of Algorithm 1 is as follows. Before time slot k, the EMS of the PME receives the market prices (buying and selling) from the main grid. In each iteration, the EMS of the PME updates the pricing strategy set χk,m and sends it to the nanogrids for their power consumption updates. After receiving the two-way transaction prices from the PME, the EMS of each nanogrid reacts and selects its best-response strategy. On the other hand, when the algorithm is compared with a centralized method based on swarm optimization, our experience shows that the centralized one usually converges to the optimum value faster.
In this section, we present experimental results obtained by applying the proposed algorithm to the bilevel energy management problem. The simulation is performed in MATLAB on a desktop with an Intel Core i5-7200 CPU at 2.50 GHz and 8 GB of RAM.
In the simulation experiment, five nanogrids, a PME and a main grid are considered. Each nanogrid is equipped with basic loads, an HVAC unit and DGs (including rooftop solar photovoltaic panels and small wind turbines). For the renewable output of the DGs in the nanogrids, the data given in Fig.3(a) are generated with a typical wind turbine power curve in [48] and a photovoltaic generation model in [49], using the wind speed and solar radiation data from the websites [50] and [51]. The basic loads of the nanogrids shown in Fig.3(b) are obtained from [52]. The outdoor temperature data, shown in Fig.3(c), are collected from the online weather website [53]. The inertial coefficient εi is drawn at random from [0.93, 0.98] for the different HVAC systems in the nanogrids. The parameters Vi, Γi, VP and θ are chosen according to Theorems 3 and 4 so as to achieve the largest reduction in the nanogrids’ comprehensive energy cost and temperature queue backlog, with θ = θmin. Moreover, we assume that the net energy generation of the PME in each slot takes a value from [-15, 25] kW uniformly at random. For the selling price of the main grid, we use the data from [54], and the buying price is set to 3 ¢/kWh for simplicity. We set the battery cost parameter Cb = 0.01 ¢/(kWh)2 and adopt one hour as the algorithm control slot. Other main parameters are shown in Table 1.
Table 1 Simulation parameters
Fig.3 Experiment environment setup
5.2.1 Results of pricing and energy management
First, based on the algorithm described in Section 4, the iterative optimization processes are given in Fig.4. Starting from different initial values, the bidirectional prices, the battery charging amount of the PME and the HVAC power consumptions of the nanogrids converge to the equilibrium after about 35 iterations.
Fig.4 Iteration process
The optimized selling and buying prices of the PME are shown as the blue and green dashed lines in Fig.5(a), respectively. The selling prices of the PME are not higher than the selling prices of the main grid across the whole time horizon, and the purchasing prices of the PME are not lower than those of the main grid. Thus the nanogrids benefit from this trading pattern compared with trading with the main grid directly. At the same time, the PME also obtains more revenue because its purchasing prices are lower than the selling prices of the main grid. The optimal power consumptions of the HVAC units in the nanogrids are given in Fig.5(b). Specifically, when the selling (buying) prices of the PME become large, the HVAC power consumption is decreased to reduce the nanogrids’ energy purchasing cost (to increase the gain from power selling).
In addition, we check the optimized indoor temperature and battery energy level to validate Theorem 3 and Theorem 4. In Fig.5(c), a time-varying optimum comfort temperature is adopted and shown as the red solid line. The indoor temperatures of all nanogrids fall between the upper and lower bounds of the comfort temperature, which confirms that the desired temperature constraints are met by the proposed algorithm under the time-varying optimum comfort temperature. Likewise, Fig.5(d) shows that the battery energy level varies within [2, 16] kWh, which is consistent with Theorem 4.
Fig.5 Optimized results of pricing and energy management
5.2.2 Economic benefit evaluation
We further compare the economic performance of the proposed algorithm with that of other Cases:
1) Case 1 is similar to [55], which employs a fixed-point temperature control method to maintain the optimum indoor comfort temperature for residents in nanogrids.
2) Case 2, based on [56], also pursues the optimum temperature. The main difference between these two Cases is that the second adopts optimized real-time pricing while the first is based on the forecast of the balancing market prices.
3) Case 3, based on the game model proposed in [57], aims to minimize the energy cost at each time without taking future optimization into account.
4) Case 4 is the proposed algorithm in Section 4.
5) In Case 5, a modified algorithm is proposed with a social welfare scenario to optimize the aggregate cost of the PME and the nanogrids (the aggregate cost = nanogrids’ discomfort cost + nanogrids’ energy trading cost - trading profit of the PME). In this scenario, the HVAC power consumption and the battery charging amount are regulated concurrently under the premise that all the participants are cooperative (i.e., no pricing and charges for the PME and the nanogrids). The social welfare in time slot k is formulated accordingly. Thus the corresponding long-term social welfare optimization problem is given as follows:
The comparison results are given in Table 2. Comparing Case 1 with Case 2, we find that the algorithm with real-time pricing can increase the revenue of the PME and reduce the energy trading cost of users in the nanogrids. Case 3 can further reduce the aggregate cost by taking part in the game. However, its thermal discomfort cost is remarkably increased, by 38.157 cents. By optimizing the utility over a long-term horizon with two-way pricing, the discomfort cost of Case 4 is 85.77% lower than that of Case 3, and the aggregate cost of Case 4 is further reduced by 154.658 cents. Besides, compared with Case 2, the profit of the PME under the proposed algorithm is increased by 959.291 cents and the users’ energy trading cost is reduced by 641.683 cents. Furthermore, the aggregate costs of Case 4 and Case 5 are 76.29% and 82.8% lower than that of Case 2, respectively. To sum up, the last two Cases provide effective approaches to scheduling the consumption of HVAC units when users in nanogrids care about both thermal discomfort and aggregate cost.
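The bookkeeping behind such a comparison can be written down in a few lines. The sketch below uses the stated definition of the aggregate cost (discomfort cost plus energy trading cost minus the trading profit of the PME) and a relative-reduction helper of the kind used when one Case is compared with another. The function names and the numbers in the example are hypothetical and are not taken from Table 2.

```python
def aggregate_cost(discomfort_cost, trading_cost, pme_profit):
    """Aggregate cost = discomfort cost + energy trading cost - PME profit."""
    return discomfort_cost + trading_cost - pme_profit


def percent_reduction(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline


# Hypothetical figures in cents, for illustration only.
baseline_cost = aggregate_cost(discomfort_cost=20.0, trading_cost=300.0,
                               pme_profit=120.0)
improved_cost = aggregate_cost(discomfort_cost=10.0, trading_cost=200.0,
                               pme_profit=160.0)
reduction = percent_reduction(baseline_cost, improved_cost)
```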
Table 2 Numerical comparison results (unit: ¢)
5.2.3 The impact of comfort temperature range
The impact of a larger comfort temperature range is investigated by reducing the lower limit and raising the upper limit of the comfort temperature, individually. From Fig.6(a) and Fig.7(a), the discomfort cost rises with the decrease of the lower limit and with the increase of the upper limit. This demonstrates that a larger comfort temperature range leads to a higher thermal discomfort cost. Fig.6(b) and Fig.7(b) show that the aggregate cost of the proposed approach is larger than that of the modified social welfare scenario, owing to the selfishness of the players in the Stackelberg game. Comparing Fig.6(b) with Fig.7(b), we find that the aggregate cost falls as the lower limit decreases and rises as the upper limit increases. The intuition behind this result is that, when the upper limit increases, on one hand the discomfort cost increases. On the other hand, the indoor temperature tends to stay at a higher level than with a smaller upper limit, since the upper limit is the upper bound of the average indoor temperature. Consequently, the HVAC unit consumes more power in heating mode, which results in a higher energy cost. The optimized total power consumptions of the HVAC units in the nanogrids are provided in Fig.6(c) and Fig.7(c). The results verify that the HVAC power consumption falls with the decrease in the lower limit and rises with the increase in the upper limit.
Fig.7 The impact of the upper limit of comfort temperature
5.2.4 The economic profit of battery
The impact of the battery storage system in the proposed Stackelberg game is evaluated by comparing the trading profit of the PME with two other scenarios: i) a PME that has no energy storage device and does not participate in the Stackelberg game; ii) a PME that has no energy storage device but participates in the Stackelberg game. As shown in Fig.8, the profit of the PME under the proposed algorithm (the green solid line) is usually higher than in the other scenarios. On one hand, participating in the game brings a significant increase in the trading revenue of the PME. This is because the PME in the game can optimize its profit by selling a portion of its energy to the nanogrids at a price higher than the buying price of the main grid. In addition, the PME can procure a part of its energy from the nanogrids cheaply compared with the higher selling prices of the main grid. On the other hand, when the battery is discharged during the peak period (e.g., hours 17-20 in Fig.8), the profit of the PME under the proposed algorithm becomes higher than in the second scenario. This is because less electricity has to be purchased from the main grid thanks to the pre-stored energy.
Fig.8 Comparisons of trading profit of PME
5.2.5 The impact of discomfort weighting coefficient
Fig.9 illustrates the influence of the discomfort cost weighting coefficient γi on the performance of the proposed algorithm. The proposed algorithm obtains the minimum aggregate cost and the minimum nanogrids’ energy cost when γi lies in [0.007, 0.008]. Besides, the thermal discomfort cost increases nearly linearly with γi. The total average temperature deviation (TATD) from the optimum comfort temperature decreases as γi increases, and the rate of decrease slows down once γi reaches a certain value (i.e., about 0.01).
Fig.9 The impact of varying γi
5.2.6 The impact of HVAC inertial coefficient
The performance of the proposed algorithm under a varying inertial coefficient of the HVAC unit is shown in Fig.10. We find that a smaller nanogrids’ energy cost and a smaller aggregate cost can be obtained with a larger εi within a certain range. However, when εi exceeds 0.97, the nanogrids’ energy cost and the aggregate cost increase instead. The reason can be seen from the fourth subfigure. When εi is large enough, the weighting parameter Vi becomes smaller and the actual indoor temperature range becomes narrower. Recall that Vi represents a tradeoff between the decrease of the comprehensive energy cost and the stability of the indoor temperature queue. Therefore, when εi is larger than a certain value, the energy cost and the aggregate cost increase. In addition, under a narrow indoor temperature range, the TATD decreases as εi increases, as shown in the second subfigure of Fig.10.
Fig.10 The impact of varying εi
5.2.7 The impact of number of nanogrids
In this subsection, the impact of the number of nanogrids on the computational time is demonstrated. The number of nanogrids n is increased from 1 to 30. Fig.11 presents the average computational time for each optimization problem. The total computational time grows approximately polynomially with the number of follower players (nanogrids). Moreover, compared with the adopted 1-hour timescale (the strategy optimization is commonly required to be completed 15 min, i.e., 900 s, ahead), the computational time is suitably short, which supports the scalability of the algorithm to a larger number of followers. Meanwhile, Fig.11 also shows that the trading profit of the PME rises progressively with more nanogrids owing to the extended market share. In this case, the proposed algorithm is adequate in terms of time complexity and privacy preservation for the optimization of energy transactions.
Fig.11 Optimal results with different number of nanogrids
In this paper, to stimulate the consumption of renewable energy as well as long-term profits, a three-layer trading framework including the main grid, the PME and the nanogrids is devised, in which the energy transactions between different levels work both ways. A bidirectional pricing scheme and novel DR problems are proposed in order to jointly optimize the PME and the nanogrids with HVAC units over a long-term horizon. Considering the time-coupling properties of the temperature and battery queue constraints, we solve the time-averaged stochastic utility optimization problems by using the Lyapunov optimization technique. The trading interactions between the PME and the nanogrids are modeled by a Stackelberg game. The existence and uniqueness of the SE are analyzed and a sufficient condition is obtained. Furthermore, we develop an optimization algorithm which is guaranteed to reach the unique SE. The simulation results with an experimental dataset show that, compared with naive strategies, the proposed pricing scheme and energy management strategies improve the economic utility of both parties involved without affecting the satisfaction of residents.