DDoS Attack Detection Scheme Based on Entropy and PSO-BP Neural Network in SDN

2019-07-24 09:28 Zhenpeng Liu, Yupeng He, Wensheng Wang, Bin Zhang
China Communications, 2019, Issue 7

Zhenpeng Liu, Yupeng He, Wensheng Wang, Bin Zhang*

1 School of Cyberspace Security and Computer, Hebei University, Baoding 071002, China

2 Information Technology Center, Hebei University, Baoding 071002, China

3 School of Electronic Information Engineering, Hebei University, Baoding 071002, China

Abstract: SDN (Software Defined Network) has many security problems, and DDoS attacks are undoubtedly the most serious threat to networks with the SDN architecture. Detecting DDoS attacks accurately and effectively has always been both a difficulty and a focus of SDN security research. Based on the characteristics of SDN, a DDoS attack detection method combining generalized entropy and a PSO-BP neural network is proposed. Traffic is pre-detected by the generalized entropy method deployed on the switch, and the detection result is classified as normal or abnormal, which locates the switch that issued the abnormal alarm. The controller then uses the PSO-BP neural network to determine whether a DDoS attack is occurring by further extracting the flow features of the abnormal switch. Experiments show that, compared with other methods, the scheme maintains the detection accuracy while reducing the CPU load of the controller, and its detection capability is better.

Keywords: software-defined networking; distributed denial of service attacks; generalized information entropy; particle swarm optimization; back propagation neural network; attack detection

I. INTRODUCTION

Distributed Denial of Service (DDoS) attacks have always been one of the main threats to Internet security. After being taken over by the attacker, multiple hosts send a large number of attack packets to the victim host, consuming its resources so that it cannot provide services for legitimate users [1]. Because DDoS attacks are simple to launch, dangerous, and difficult to detect and defend against, they have always been a research hotspot in network security. Software-defined networking (SDN) is a new type of network architecture proposed by Stanford University [2]. Its biggest difference from traditional networks is the separation of the control layer and the data layer. The standard protocol between the data layer and the control layer is the OpenFlow protocol [3]. The data plane only forwards incoming data streams according to the flow table rules. When there is no matching flow entry, the data packet is sent to the controller for processing. Because of these characteristics, when a DDoS attack occurs, not only is the victim host harmed, but the number of flow entries in the switch also grows sharply and a large number of Packet_in messages are sent to the controller. In this process, the controller, the victim host and its connected switches are all greatly affected, so a DDoS attack is fatal to a network with the SDN architecture [4]. How to accurately and effectively detect DDoS attacks is one of the key and difficult issues of SDN network security.

Our contributions are summarized as follows.

1) A DDoS attack detection scheme based on the combination of generalized information entropy and a BP neural network in the SDN environment is proposed. The generalized entropy method deployed on the switch is used to pre-detect and locate the abnormal switch, and the controller then performs further detection to determine whether the abnormal switch is under attack. The controller CPU load is reduced while the detection effect is maintained.

2) The particle swarm optimization algorithm is used to optimize the parameters of the BP neural network and improve the detection ability.

3) Experiments are designed to verify the effectiveness of the proposed scheme.

II. RELATED WORK

Compared with traditional networks, there are fewer DDoS detection methods for SDN. For example, Braga et al. [5] and Wang et al. [6] extract 6-element traffic features and use the self-organizing map (SOM) method to classify the traffic. However, this solution does not analyze data specific to the SDN architecture and follows the features of traditional network traffic. Li et al. [7] use the DCNNDSAE deep learning method for detection; the detection effect is better, but the load on the controller is increased. Mousavi et al. [8] and Wang et al. [9] compute the Shannon entropy of the destination IP address, which enables simple and rapid anomaly detection, but the detected information is limited to a single attribute and prone to detection errors. Giotis et al. [10] combine OpenFlow and sFlow to lighten the burden on the controller and detect anomalies by entropy. Cui et al. [11] use a BP neural network to detect DDoS attacks; although the accuracy is relatively high, a BP neural network easily falls into a local optimum and converges slowly. Li et al. [12] use a support vector machine optimized by a genetic algorithm to detect DDoS attacks. Compared with the plain SVM method, the optimized model improves performance and efficiency.

The analysis of the above literature shows that detection schemes based on information entropy are simple and take up fewer resources, but the detection information is limited to a single attribute and the error rate is high. An entropy method deployed on the controller can only inspect Packet_in packets. Machine learning methods are highly accurate but complex, and they cannot locate the switch to which the victim host is connected. The controller needs to cyclically collect the flow table information of each switch, extract the features and then perform detection, so the controller resource occupancy is high. The period of flow table collection also affects the detection effect: a period that is too short increases the load on the controller, while a period that is too long means the attack is not detected in time.

Based on this, a scheme combining generalized information entropy and a PSO-BP neural network is proposed. The local network traffic of the switch is pre-detected by the generalized entropy method deployed in the switch, and the result is classified as normal or abnormal. An abnormal alarm indicates that the network under the switch may be under attack. The controller only needs to collect flow table information from the suspicious switch, extract the 6-element features, and use a BP neural network optimized by particle swarm optimization to determine whether an attack is occurring. The scheme thus combines information entropy with the PSO-BP neural network: the entropy method is simple and fast, which guarantees the detection speed and reduces the load on the controller during detection, while the neural network provides high accuracy.

III. SCHEME MODEL DESIGN

The scheme model is shown in figure 1. Entropy-based attack pre-detection is mainly performed in the OpenFlow switch, and further detection of abnormal conditions is performed in the controller. The controller contains an anomaly detection module and an attack defense module.

3.1 Entropy pre-detection

Fig. 1. DDoS attack detection scheme diagram.

Fig. 2. Flow entry format diagram.

In information theory, information entropy is a measure of the uncertainty of random variables. The greater the randomness of a variable, the greater the entropy value. In contrast, the higher the certainty of a variable, the smaller the entropy value [13]. In general, the opportunities for communication between normal hosts in the network are roughly equal. A DDoS attack is often a many-to-one attack: several zombie hosts send attack packets to the attacked destination host. A large number of packets with the same destination address reduces network randomness. The destination IP addresses of DDoS attack traffic are therefore more concentrated than those of normal traffic, and the entropy of the attack traffic is smaller than the entropy of the normal traffic.

The entropy pre-detection method is deployed on the switch. The flow entries in the flow table are the basis for packet forwarding. When a data packet matches a flow entry, Received_Packets in the flow entry increases accordingly. Because the SDN control layer is decoupled from the data layer, the switch does not know whether an IP address belongs to the local network under it. Therefore, CRP_local is added to the Counters of the flow entry as a copy of Received_Packets. It indicates whether a flow entry in a switch is one that accesses a host connected to that switch. The format of the flow entry is shown in figure 2.

For example, consider a simple SDN network topology h1-S1-S2-S3-h2, in which host h1 on switch S1 needs to communicate with host h2 on switch S3. If the data packet matches a flow entry, it is forwarded directly. When there is no matching flow entry in the switch at the first request, the packet is encapsulated into a Packet_in message and transmitted to the controller. The controller knows the topology of the entire network and sends a flow entry to each switch; the initial value of Received_Packets in the flow entry is 0. The initial CRP_local value of the flow entry sent to S3 is 0, because the destination IP address is the IP address of h2 and h2 is a host in the S3 network. The CRP_local value of the flow entries sent to S1 and S2 is -1, indicating that the destination host is not in the S1 or S2 network.

When calculating the entropy, only the flow entries with CRP_local ≠ -1 in the switch are counted. These flow entries represent the local network traffic information of a switch. The entropy detection method can be regarded as a separate module, and the switch forwards packets as usual, so the nature of OpenFlow-based SDN is not changed.

Received_Packets in a flow entry records the number of all matching packets since the flow entry was created. To get the number of packets in ΔT, use equation (1), and then update CRP_local by equation (2).

N_FEi(t) is the number of packets of the flow entry FE_i at time t, and V_FEi is the change in the number of packets.
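The per-window counting described above can be sketched as follows. This is a minimal illustration of one plausible reading of equations (1) and (2): the change is the cumulative counter minus the stored copy, and the copy is then refreshed. The `FlowEntry` class and the `packets_in_window` helper are hypothetical names, not part of OpenFlow or of the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    dst_ip: str
    received_packets: int = 0  # cumulative counter kept by the switch
    crp_local: int = -1        # -1 means the destination is not local to this switch


def packets_in_window(entry: FlowEntry) -> int:
    """Return the packets matched since the previous call and refresh CRP_local."""
    if entry.crp_local == -1:
        return 0  # non-local entries are excluded from the entropy statistics
    delta = entry.received_packets - entry.crp_local  # assumed form of equation (1)
    entry.crp_local = entry.received_packets          # assumed form of equation (2)
    return delta


local = FlowEntry("10.0.0.2", crp_local=0)     # destination attached to this switch
transit = FlowEntry("10.0.0.9", crp_local=-1)  # destination behind another switch

local.received_packets += 40
transit.received_packets += 25
first = packets_in_window(local)    # packets seen in the first window
local.received_packets += 15
second = packets_in_window(local)   # packets seen in the second window
ignored = packets_in_window(transit)
```

Keeping the previous counter value inside the entry means the switch needs no extra per-window state beyond the one added field.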

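Entropy detection based on the destination IP address is used. The Shannon entropy is defined as:

$$H(M) = -\sum_{i=1}^{n} p_i \log_2 p_i$$

M = {M_1, M_2, M_3, …, M_i, …, M_n} is a set with n destination IP addresses. $M_i = \sum_{l=1}^{L} V_{FE_l}$ represents the sum of the data packets of the L flow entries whose destination IP address is IP_i in ΔT, and $p_i = M_i / \sum_{k=1}^{n} M_k$ is the probability of destination IP_i appearing. The generalized information entropy is defined as:

$$H_\alpha(M) = \frac{1}{1-\alpha} \log_2 \left( \sum_{i=1}^{n} p_i^{\alpha} \right)$$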

When α → 1, the generalized entropy reduces to the Shannon entropy. When α > 1, high-probability events have a greater impact on the entropy value, and the degree of influence depends on α [14]. Using the generalized entropy makes it easier to select thresholds, distinguishes attacks from normal traffic more clearly and efficiently, and enhances the detection ability [15]. We therefore use the generalized entropy to detect DDoS attacks.
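To make the effect of α concrete, the following sketch computes both entropies over destination packet counts. It assumes the generalized entropy is the Rényi entropy of order α with base-2 logarithms, which is consistent with the limiting behaviour described above; the traffic counts are invented for illustration.

```python
import math
from typing import Dict, List


def destination_probabilities(packets_per_dst: Dict[str, int]) -> List[float]:
    """Turn per-destination packet counts into a probability distribution."""
    total = sum(packets_per_dst.values())
    if total == 0:
        raise ValueError("no packets observed in this window")
    return [count / total for count in packets_per_dst.values() if count > 0]


def shannon_entropy(probs: List[float]) -> float:
    return -sum(p * math.log2(p) for p in probs)


def generalized_entropy(probs: List[float], alpha: float) -> float:
    """Renyi entropy of order alpha (assumed form); alpha -> 1 gives Shannon entropy."""
    if abs(alpha - 1.0) < 1e-9:
        return shannon_entropy(probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


# Invented example: 20 equally visited hosts, then a flood towards one of them.
normal = {f"10.0.6.{i}": 100 for i in range(1, 21)}
attack = dict(normal)
attack["10.0.6.1"] += 2000

p_normal = destination_probabilities(normal)
p_attack = destination_probabilities(attack)
gap_shannon = shannon_entropy(p_normal) - shannon_entropy(p_attack)
gap_alpha8 = generalized_entropy(p_normal, 8) - generalized_entropy(p_attack, 8)
```

In this toy case the gap between normal and attack traffic is wider at α = 8 than for the Shannon entropy, which is the property that makes threshold selection easier.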

However, detection based on the destination address entropy easily misjudges a Flash Event as a DDoS attack. A Flash Event is normal traffic but also causes the entropy to decrease, so the entropy-based detection method is less accurate. In addition, the threshold needs to be recalculated for different networks. On the other hand, the method consumes few resources and is easy to use. Therefore, information entropy can be used to pre-detect the network status.

An abnormal alert from a switch indicates that its connected network may be under attack, and the anomaly detection module in the controller then performs further detection.

3.2 Anomaly detection module

3.2.1 Flow table collection module

This module mainly uses the OpenFlow protocol to collect the flow tables of OF switches. The flow tables are the basis for flow forwarding. In the absence of entropy-based pre-detection, the controller needs to collect the flow tables of all switches periodically, but when the entropy method is used as pre-detection, the controller only needs to collect the flow table information of the switch that raised an abnormal alarm for further detection.

3.2.2 Feature extraction module

To implement the neural network algorithm in this paper, we need to determine the features. DDoS attackers can use a variety of attack means and methods, but most attack traffic follows certain patterns, so flow features can be used for detection. For example, when a DDoS attack occurs, the main attack mode is source IP address spoofing, which makes the source IPs more dispersed, while the destination IP and destination port become more concentrated. Based on the characteristics of attack traffic [16] and of networks with the SDN architecture, we use 6-element features related to DDoS attacks:

(1) APF (Average packets per flow). Research shows that traffic in the attack state differs from normal traffic in the number of packets. Because DDoS attacks can use source IP spoofing to randomly generate fake IP addresses, they quickly generate a large number of flows, but the number of packets in each flow is small, generally only 1-3 packets. Each flow in normal traffic contains a large number of packets. Therefore, APF can be used to represent the flow characteristics. The formula is:

where FE_nums_j is the number of flow entries whose destination address is IP_j, and Packet_nums_i is the number of packets of the i-th flow entry.

(2) ABP (Average bytes per packet). As with APF, to improve the efficiency of DDoS attacks, the attacker keeps the number of bytes per packet in attack flows very low (for example, in a TCP flood attack, some packets of size 120 bytes are sent to the victim). The number of bytes per packet in normal traffic is much larger. Therefore, ABP is an important feature for detecting DDoS attacks and can be used to represent the flow characteristics. The formula is:

where FE_Bytes_i is the byte size of the i-th flow entry.

(3) PPF (Percentage of pair flows). Two flows A and B form a pair flow when the source IP of A is the destination IP of B, the destination IP of A is the source IP of B, and A and B use the same protocol. Normal traffic in the network is interactive. Because attack traffic forges IP addresses, the number of single flows increases. Therefore, PPF is used to represent the flow characteristics. The formula is:

where PF_nums_j is the number of pair flows among the flows whose destination address is IP_j.

(4) RFE (Rate of flow entries). When a DDoS attack occurs, a large amount of useless traffic is sent to the victim and the number of requests to the victim host increases, so the number of flow entries related to that host grows for a certain period of time. The growth rate of its flow entries is very pronounced. Therefore, the rate of flow entries is also an important feature of a DDoS attack. The formula is:

where FE_nums_j^t is the number of flow entries with destination IP_j at time t, and interval is the time interval.

(5) ESA (Entropy of source IP addresses). DDoS attacks generate a large number of forged source IP addresses. The source IP addresses are relatively dispersed and highly random, so the entropy of the source IPs of attack traffic is larger than that of normal traffic. ESA can be used as a feature to distinguish traffic states. The formula is:

where SIP = {SIP_1j, SIP_2j, ..., SIP_ij, ..., SIP_nj}, and SIP_ij is the number of flow entries from source address IP_i to destination IP_j.

(6) ADF (Average duration per flow). Each flow entry in an OpenFlow switch has two related timeout rules, the idle timeout and the hard timeout. The idle timeout is triggered when the flow rule has not been matched for a period of time. The hard timeout is triggered when the timer expires. Because an attacker randomly sends a large number of useless packets, most of the resulting flow rules become idle shortly afterwards, whereas the flow entry of a normal flow lasts for a long time. ADF can therefore be used as a feature of the traffic. The formula is:

where duration_i is the duration of the i-th flow entry whose destination address is IP_j.
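The six features can be computed from flow table records along the following lines. This is a sketch under stated assumptions, not the authors' implementation: `Flow` and `extract_features` are hypothetical names, PPF is taken as the fraction of flows to IP_j that have a matching reverse flow, RFE is taken as the change in the number of flow entries divided by the interval, and ESA uses base-2 Shannon entropy over source addresses.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Flow:
    src_ip: str
    dst_ip: str
    protocol: str
    packets: int
    byte_count: int
    duration: float  # seconds


def extract_features(flows: List[Flow], dst_ip: str,
                     prev_count: int, interval: float) -> Dict[str, float]:
    """Compute the 6-element feature vector for one destination address."""
    to_dst = [f for f in flows if f.dst_ip == dst_ip]
    n = len(to_dst)
    if n == 0:
        raise ValueError("no flow entries for this destination")
    packets = sum(f.packets for f in to_dst)
    # A flow is paired if the table also holds the reverse direction, same protocol.
    seen = {(f.src_ip, f.dst_ip, f.protocol) for f in flows}
    paired = sum(1 for f in to_dst if (f.dst_ip, f.src_ip, f.protocol) in seen)
    src_counts = Counter(f.src_ip for f in to_dst)
    esa = -sum((c / n) * math.log2(c / n) for c in src_counts.values())
    return {
        "APF": packets / n,
        "ABP": sum(f.byte_count for f in to_dst) / packets if packets else 0.0,
        "PPF": paired / n,                   # assumed definition
        "RFE": (n - prev_count) / interval,  # assumed definition
        "ESA": esa,
        "ADF": sum(f.duration for f in to_dst) / n,
    }


# Invented records: one interactive flow versus four spoofed one-way flows.
normal = [Flow("10.0.0.2", "10.0.0.1", "tcp", 200, 120000, 30.0),
          Flow("10.0.0.1", "10.0.0.2", "tcp", 180, 90000, 30.0)]
attack = [Flow(f"172.16.0.{i}", "10.0.0.1", "tcp", 2, 240, 1.0) for i in range(1, 5)]

normal_features = extract_features(normal, "10.0.0.1", prev_count=1, interval=3.0)
attack_features = extract_features(attack, "10.0.0.1", prev_count=0, interval=3.0)
```

In the invented data the attack shows fewer packets per flow, smaller packets, no pair flows, faster growth of flow entries, higher source entropy and shorter durations, matching the direction of each feature described in the list.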

3.2.3 Attack detection module

Because the features of normal traffic and attack traffic differ, attack detection can be regarded as a classification problem that determines whether the current network is normal. The flow table collection module collects the flow table information, and the feature extraction module extracts the 6 features. The attack detection module is then trained on the training samples. After training, the attack detection module can detect whether a DDoS attack is occurring.

The detection algorithm used is a BP neural network [17] optimized by particle swarm optimization. This scheme mainly studies the detection effect of combining the entropy method with a neural network method, and the BP neural network is the most widely used and most representative neural network. The vast majority of neural network models use the BP network and its variants, which are typical feedforward neural networks and reflect the most essential part of artificial neural networks. Therefore, the BP neural network is chosen as the detection algorithm in the attack detection module.

The BP (back propagation) neural network is a multi-layer feedforward neural network trained by the error back propagation algorithm. Its topology includes an input layer, a hidden layer and an output layer. The basic idea is gradient descent, which minimizes the sum of the squared errors between the actual output of the network and the expected output, and finally derives the mapping between input and output, as shown in figure 3.

This is a BP neural network consisting of n input layer nodes, l hidden layer nodes and m output layer nodes. The outputs of the hidden layer and the output layer are defined as:

where w_ij is the connection weight between the i-th input layer node and the j-th hidden layer node, w_jk is the connection weight between the j-th hidden layer node and the k-th output layer node, x is the input vector, a is the hidden layer threshold, b is the output layer threshold, and f is the hidden layer activation function.
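A forward pass using these symbols can be sketched as below. The sketch assumes a common textbook form in which the thresholds are subtracted from the weighted sums, f is the sigmoid function and the output layer is linear; the `forward` helper and the example weights are illustrative only.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def forward(x: List[float], w_ih: List[List[float]], a: List[float],
            w_ho: List[List[float]], b: List[float]) -> Tuple[List[float], List[float]]:
    """x: n inputs, w_ih[i][j]: input-to-hidden weights, a[j]: hidden thresholds,
    w_ho[j][k]: hidden-to-output weights, b[k]: output thresholds."""
    hidden = [sigmoid(sum(w_ih[i][j] * x[i] for i in range(len(x))) - a[j])
              for j in range(len(a))]
    output = [sum(hidden[j] * w_ho[j][k] for j in range(len(a))) - b[k]
              for k in range(len(b))]
    return hidden, output


# Tiny example with n = 2 inputs, l = 2 hidden nodes and m = 1 output node.
hidden, output = forward(
    x=[1.0, 0.0],
    w_ih=[[0.0, 1.0], [1.0, 0.0]],
    a=[0.0, 0.0],
    w_ho=[[2.0], [0.0]],
    b=[0.5],
)
```

Here the first hidden node receives a weighted sum of 0, so its activation is 0.5, and the single output is 0.5 * 2.0 - 0.5 = 0.5.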

The initial weights and thresholds of the BP neural network are random. When the error surface has multiple minima, the network easily falls into a local minimum and does not reach the global optimum. By exploiting the global search ability of the particle swarm optimization algorithm, the weights and thresholds of the BP neural network can be optimized to improve the convergence speed and enhance the accuracy and stability of the detection model.

In the particle swarm algorithm, in an S-dimensional search space, W_i = (W_i1, W_i2, ..., W_iS)^T represents the position of the i-th particle and encodes a potential solution of the problem. V_i = (V_i1, V_i2, ..., V_iS)^T represents the velocity of the i-th particle. P_i = (P_i1, P_i2, ..., P_iS)^T represents the individual extremum, and P_g = (P_g1, P_g2, ..., P_gS)^T represents the global extremum. In each iteration, the velocity and position of each particle are updated according to the individual extremum and the global extremum. The update formulas are:

Fig. 3. BP neural network structure.

where i = 1, 2, ..., n; d = 1, 2, ..., S; k is the current iteration number; w is the inertia weight; c1 and c2 are the learning factors; V_id is the d-th dimension of the velocity vector of particle i; W_id is the d-th dimension of the position vector of particle i; and r1 and r2 are random numbers.
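For reference, the velocity and position updates are commonly written in the standard PSO form below. This is the textbook formulation expressed in the symbols defined here, with the superscript k denoting the iteration; it is assumed rather than taken from the paper's own typeset equations.

```latex
V_{id}^{k+1} = w\,V_{id}^{k} + c_1 r_1 \left(P_{id}^{k} - W_{id}^{k}\right) + c_2 r_2 \left(P_{gd}^{k} - W_{id}^{k}\right)

W_{id}^{k+1} = W_{id}^{k} + V_{id}^{k+1}
```

The first term keeps part of the previous velocity, the second pulls the particle towards its own best position, and the third pulls it towards the best position found by the swarm.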

The steps of the BP neural network based on particle swarm optimization are as follows:

(1) Initialize the neural network and determine its topology. Calculate the number of weights and thresholds of the BP network, which is used as the real-coded dimension D, given by D = NL + LM + L + M, where N is the number of input layer nodes, L is the number of hidden layer nodes, and M is the number of output layer nodes.

Fig. 4. The process of PSO-BPNN.

(2) Initialize the population size, the learning factors c1 and c2, the inertia weight w, and the position and velocity of each particle, and calculate the fitness value of each particle. The fitness function is as follows:

where y is the actual output, o is the expected output, n is the number of training samples and m is the number of output nodes.

(3) Calculate the fitness value of each particle and compare the values to determine the group extremum P_g and the individual extremum P_i.

(4) Update the velocity and position of each particle in each iteration.

(5) Determine whether the fitness value no longer changes or the maximum number of iterations has been reached. If so, terminate the iteration and output the result; otherwise, return to step (3) to continue the search.

(6) Use the optimal particle as the initial weights and thresholds of the BP neural network, and then carry out training and detection. The process is shown in figure 4.

The trained PSO-BP neural network is then used to detect abnormal traffic and determine whether it is a DDoS attack.
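Steps (1) to (5) can be sketched in plain Python as follows. The sketch makes several assumptions that are not from the paper: toy layer sizes, a mean squared error fitness, fixed values for w, c1 and c2, termination by iteration count only, and an invented four-sample data set. Step (6), the subsequent BP training from the returned particle, is not shown.

```python
import math
import random

N, L, M = 2, 3, 1          # toy layer sizes (assumed)
D = N * L + L * M + L + M  # particle dimension from step (1)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))  # clamped for safety


def forward(p, x):
    """Decode particle p into weights and thresholds, then run one forward pass."""
    w_ih, w_ho = p[:N * L], p[N * L:N * L + L * M]
    a, b = p[N * L + L * M:N * L + L * M + L], p[-M:]
    hidden = [sigmoid(sum(w_ih[i * L + j] * x[i] for i in range(N)) - a[j])
              for j in range(L)]
    return [sum(hidden[j] * w_ho[j * M + k] for j in range(L)) - b[k]
            for k in range(M)]


def fitness(p, samples):
    """Mean squared error over all samples and outputs (assumed fitness form)."""
    return sum((y - o) ** 2 for x, target in samples
               for y, o in zip(forward(p, x), target)) / len(samples)


def pso(samples, swarm=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(swarm)]
    vel = [[0.0] * D for _ in range(swarm)]
    pbest = [p[:] for p in pos]                          # individual extrema
    pbest_fit = [fitness(p, samples) for p in pos]
    g = min(range(swarm), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]         # global extremum
    initial_fit = gbest_fit
    for _ in range(iters):
        for i in range(swarm):
            for d in range(D):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i], samples)
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit, initial_fit


samples = [([0.1, 0.2], [0.0]), ([0.9, 0.8], [1.0]),
           ([0.2, 0.1], [0.0]), ([0.8, 0.9], [1.0])]
best, best_fit, initial_fit = pso(samples)
```

The returned `best` vector would serve as the starting weights and thresholds for ordinary back propagation training, which is where the scheme expects the faster convergence to come from.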

3.3 Attack defense module

When the anomaly detection module detects a DDoS attack, an attack alert is issued. The attack defense module then sends new flow table rules to the OpenFlow switch, or changes existing ones, according to the characteristics of the detected attack flows. This prevents the attack flows from being forwarded. The module also sends instructions to the firewall to keep DDoS attack traffic from entering.

IV. EVALUATION

In this section, we evaluate and analyze the detection results of our scheme through experiments.

4.1 Experimental design

The Java-based Floodlight is used as the controller, and the Mininet simulation software is used to build the experimental topology. Open vSwitch, which supports the OpenFlow protocol, is used as the switch, and kernel virtualization provides the end hosts. The topology consists of 6 OVS switches, S1-S6, each connected to 20 hosts. The victim host H1 is in the network of S6, and the attack and normal traffic come from some of the hosts in the networks of S1-S5. Normal traffic is generated by the D-ITG tool, which randomly accesses each host in the S6 network to simulate a normal network distribution. The network contains approximately 85% TCP traffic, 10% UDP traffic, and 5% ICMP traffic [18]. Attack traffic is generated using hping3, a classic DDoS generation tool. The types of attacks it can launch include TCP SYN flood, UDP flood and ICMP flood.

4.2 Results and analysis

First, the entropy values of normal and attack traffic are calculated using Shannon entropy. The attack strength is defined as the ratio of the number of DDoS attack packets p_a to the number of all packets p_k. The average entropy values were obtained from 10 experiments each on normal traffic, traffic with 20% attack strength and traffic with 50% attack strength. The experiments show that the average entropy value of normal traffic is 4.312, and the average entropy values of traffic with 20% and 50% attack strength are 4.219 and 3.122 respectively. We then use the generalized entropy to calculate the entropy values of normal and attack traffic for different values of α. The results are shown in figure 5.

The experimental data show that the generalized entropy of the traffic decreases as α increases, and that the distance between normal traffic and attack traffic increases as α increases. The greater the attack strength, the larger the distance. Distance is the difference in entropy between normal traffic and attack traffic. The larger the distance, the greater the difference between normal and attack traffic, the clearer the distinction between abnormal and normal traffic, and the better the detection sensitivity and detection capability. The distance is shown in figure 6.

We select α = 8 because, as α increases further, the entropy changes less and the distance grows more slowly. The entropy value is normalized with formula (16).

The selection of the threshold is shown in Table I.

Fig. 5. The change of entropy with α.

Fig. 6. The distance change with α.

A value slightly smaller than the minimum entropy of 0.9263, namely 0.92, is chosen as the threshold, which helps reduce false alarms.

We collect flow table information and generate a total of 6000 data samples. The number of normal and attack samples is shown in Table II.

After the neural network has been trained on the training samples, the test set samples are used for testing. The test results are evaluated using three indicators.

Table I. Threshold selection.

Table II. Data samples.

Fig. 7. Test result.

The three indicators are the detection rate (DR), the false alarm rate (FR), and the accurate rate (AR). TP is the number of samples correctly marked as attacks, FP is the number of samples incorrectly marked as attacks, TN is the number of samples correctly marked as normal, and FN is the number of samples incorrectly marked as normal. The test was run twice, and the average results are shown in figure 7.
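The three indicators can be computed from the four counts as sketched below, assuming the usual definitions: DR is the share of attack samples that are flagged, FR is the share of normal samples that are flagged, and AR is the share of all samples classified correctly. The parameter names are descriptive stand-ins, and the counts in the example are invented, not the paper's results.

```python
from typing import Tuple


def detection_metrics(attacks_flagged: int, attacks_missed: int,
                      normal_flagged: int, normal_passed: int) -> Tuple[float, float, float]:
    """Return (DR, FR, AR) under the usual definitions (assumed here)."""
    dr = attacks_flagged / (attacks_flagged + attacks_missed)
    fr = normal_flagged / (normal_flagged + normal_passed)
    total = attacks_flagged + attacks_missed + normal_flagged + normal_passed
    ar = (attacks_flagged + normal_passed) / total
    return dr, fr, ar


# Invented counts for 1000 attack samples and 1000 normal samples.
dr, fr, ar = detection_metrics(attacks_flagged=970, attacks_missed=30,
                               normal_flagged=15, normal_passed=985)
```

With these invented counts, DR is 0.97, FR is 0.015 and AR is 0.9775.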

The experimental results are compared with those of the unoptimized BP neural network and the SVM (Support Vector Machine) method. The BP neural network optimized by the particle swarm algorithm has the highest detection rate, reaching 97.47%, followed by the BP neural network with 95.83% and the SVM with 95.4%. The false alarm rate of the PSO-BP neural network is 1.43%, the lowest among the three detection methods. PSO-BP also has the highest accurate rate, 98.02%, while the results of the BP neural network and the SVM are 96.88% and 96.43% respectively. The experiments show that the BP neural network optimized by the particle swarm algorithm has the best detection effect.

Over a 100-second run, traffic with 20% attack strength is injected into the S6 network from 30 s to 60 s. The run is repeated 30 times with ΔT = 3 s. The resulting entropy changes are shown in figure 8.

If only the entropy method is used, then when a flash event occurs and legitimate traffic accesses the same host, the entropy value also drops as in figure 8, and the normal traffic is misjudged as attack traffic. The entropy method can therefore only be used for pre-detection. When the traffic entropy value in ΔT is lower than 0.92, a DDoS attack is considered possible and an abnormal alarm is issued. In these experiments, with ΔT = 3 s, an abnormality is detected in the range 33 s-35 s and an abnormal alarm is issued.

When only the PSO-BP neural network method is used, the detection period is also set to 3 s and the attack flow is injected at 30 s. With the neural network method alone, the location of the victim switch cannot be determined. According to the mechanism of [5,7], the controller needs to cyclically inspect each switch. In this experiment there are 6 switches, and the attack is detected only when the switch connected to the zombie hosts or the switch connected to the victim host is inspected. Attacks were detected at 42 s on average in the experiment. As the number of switches in a complex network increases, the actual time to detect an attack becomes longer. Moreover, the controller is always in the detection state, whether the network is normal or under attack, which consumes a large share of the controller's CPU resources.

When our scheme is used, an abnormality is detected at 33 s-35 s, the controller locates the switch connected to the victim host for further detection, and an attack is detected in the range 35 s-37 s. Compared with using only the PSO-BP neural network method, the attack is detected 5 s-7 s earlier. Although this is slower than the entropy method alone, it is acceptable. The entropy method is deployed in the switch and does not consume controller resources. Only after an abnormality is detected does the controller extract the flow table information of the victim switch for further detection, so the resource occupancy of the controller is small.
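The control flow of the two-stage scheme can be sketched as follows: only switches whose normalized entropy falls below the threshold are polled and classified by the controller. `two_stage_detect`, `fake_collect` and `fake_classify` are hypothetical stand-ins for the switch alarm, the flow table collection and the trained PSO-BP classifier; the readings are invented.

```python
from typing import Callable, Dict, List

THRESHOLD = 0.92  # normalized-entropy threshold chosen in the text


def two_stage_detect(entropy_by_switch: Dict[str, float],
                     collect_features: Callable[[str], List[float]],
                     classify: Callable[[List[float]], bool]) -> List[str]:
    """Return the switches confirmed as attacked by the second stage."""
    confirmed = []
    for switch, entropy in entropy_by_switch.items():
        if entropy >= THRESHOLD:
            continue  # no alarm, so the controller does not poll this switch
        if classify(collect_features(switch)):
            confirmed.append(switch)
    return confirmed


polled: List[str] = []


def fake_collect(switch: str) -> List[float]:
    polled.append(switch)  # record which switches the controller had to query
    return [2.0, 120.0, 0.0, 1.3, 2.0, 1.0]  # invented 6-element feature vector


def fake_classify(features: List[float]) -> bool:
    return features[0] < 5  # stand-in rule, not the trained network


readings = {"S1": 0.97, "S2": 0.96, "S3": 0.98, "S4": 0.95, "S5": 0.97, "S6": 0.88}
attacked = two_stage_detect(readings, fake_collect, fake_classify)
```

In this example only S6 is polled, which illustrates why the controller load stays low when most switches report normal entropy, and a flash event that trips the threshold can still be rejected by the classifier.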

Finally, our scheme, the Shannon entropy method proposed by Mousavi et al. [8], the PSO-BP neural network method, and the SOM method proposed by Wang et al. [6] are each used to detect 1500 attacks and 1500 normal traffic accesses. The attack strength ranges between 10% and 100%, and the normal traffic contains 300 flash events. The accurate rate and the average controller CPU occupancy are analyzed. The results are shown in figure 9.

The experimental results show that the accurate rate of our scheme is similar to that of the PSO-BP neural network method and the SOM method, but its average CPU occupancy is 15% lower than SOM and 12% lower than PSO-BP. Compared with the entropy method, the average CPU occupancy is only 3% lower, but the accurate rate is much higher. The low accurate rate of the entropy method is due to the misjudgment of flash events. Our scheme thus achieves a good detection effect with a small load.

V. CONCLUSIONS

Fig. 8. The change in entropy of S6.

Fig. 9. Test results of each method.

In this paper, we proposed a DDoS attack detection method based on generalized entropy and a PSO-BP neural network in the SDN environment. Pre-detection is performed by the entropy method deployed on the switch, and the result is classified as normal or abnormal. For abnormal switches, the PSO-BPNN method is used to determine whether an attack is occurring. Compared with other methods, our scheme reduces the controller overhead, improves the accuracy, and detects attacks faster. The results demonstrate that this scheme has better overall detection performance. The research in this paper mainly addresses DDoS attack detection at the data layer of the SDN network and is not applicable to the detection of DDoS attacks against the controller. As the core of SDN, the controller is undoubtedly important to secure. The detection and prevention of blind DDoS attacks against controllers will be the focus of future work.

ACKNOWLEDGEMENTS

The authors would like to thank the anonymous reviewers for their detailed comments. This work was supported by the Hebei Province Innovation Capacity Improvement Program of China under Grant No. 179676278D, and the Ministry of Education Fund Project of China under Grant No. 2017A20004.