Security, Controllability, Manageability and Survivability in Trustworthy Networks

ZTE Communications, 2008, No. 1

Wang Sheng, Yu Hongfang, Xu Du

(School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 610054, China)

Abstract: The Internet plays an increasingly important role in everyone's life; however, the mismatch between the basic architectural ideas underlying the Internet and the emerging requirements placed on it is becoming more and more obvious. Although the Internet community has reached a consensus that the future network should be trustworthy, the concept of a "trustworthy network" and the ways leading to it are not yet clear. This article argues that security, controllability, manageability, and survivability should be the basic properties of a trustworthy network. The key ideas and techniques involved in these properties are studied, and recent developments and progress are surveyed. The technical trends and challenges are also briefly discussed. Network trustworthiness could and should eventually be achieved.

The Internet is no doubt one of the greatest wonders created by human beings. Sticking to its design philosophies of simplicity and openness, it has achieved great success, reflected in the fast growth of users and terminals, the rapid advancement of network technologies, and the many business benefits being realized. On the other hand, this rapid development has brought forward new demands that challenge the Internet itself. Following its initial success, the Internet has been found to have many problems (e.g., poor security, difficulty of control and management, and untimely response to failures and attacks) that need attention.

The problems of the Internet have led to a general recognition that future networks should have new features that make Internet access more convenient and safer, and that allow operators to detect various kinds of abnormalities in a timely manner and take proper actions. This kind of network is called a "trustworthy network", a name derived from "trustworthy computing".

Despite the common understanding in the network community that the future network should be trustworthy, researchers have not yet agreed on many fundamental issues, such as the definition of trustworthiness, methods for evaluating trustworthiness, and what efforts should be made to ensure the trustworthiness of the network. However, the vagueness of the definition of a trustworthy network has not kept researchers from studying related issues and technologies. In fact, some achievements and approaches related to trustworthy networks have already emerged.

This article focuses on these achievements and approaches rather than on giving a precise definition of the trustworthy network. Based on the points of view of some researchers [1], it discusses the trustworthy network in terms of three aspects, namely network security, controllability and manageability, and survivability, which are the basic properties of a trustworthy network and are closely associated with each other. The goal of achieving higher security places more stringent requirements on the control and management capability of the network. Moreover, the threats against the survivability of the network have expanded from accidental failures to various kinds of abnormalities, including man-made attacks.

1 Network Security

According to the Network Security Report for the First Half of 2007 [2], published by the National Computer Network Emergency Response Technical Team/Coordination Center of China (CNCERT/CC), the actual state of Internet security in China is far from satisfactory. Compared to the same period in 2006, there was a considerable increase in the number of various kinds of network security events. The numbers of phishing events and web page malicious code events handled by CNCERT/CC in the first half of 2007 were 14.6% and 12.5% higher than the totals for the whole of 2006, respectively. In Mainland China, the number of hosts infected by Trojans was far greater than in 2006, increasing 21-fold, and the number of tampered websites increased 4-fold compared to the same period in 2006. In other words, Chinese public networks are facing serious security threats, and users are likely to incur direct economic losses from network attacks aimed at making profits.

1.1 Main Technical Methods Used to Guarantee Network Security

Currently, the mechanisms used to guarantee network security mainly include network content security, network authentication and authorization, firewalls, virtual private networks, network intrusion detection, network vulnerability detection, secure access, secure isolation and exchange, security gateways, security monitoring and management, network security auditing, malicious code detection and prevention, junk mail processing, and emergency response [3].

Public Key Infrastructure (PKI) is an important technology for solving the trust and authorization problems in networks, such as the authenticity of identities, confidentiality of data, integrity of files, and non-repudiation of behaviors. However, PKI requires a central role component, the security control point, which is a weak point in such a system: once an attacker breaks through the security control point, all authentication measures fail.

The Intrusion Detection System (IDS), another technology used in network security, is designed to find intrusions and unauthorized behaviors in the network. It searches for suspect events by periodically examining audit information and monitoring network traffic. The Intrusion Prevention System (IPS), a combination of IDS and firewall, has since been developed; it greatly deepens the defense and better guarantees network security. The problem with IDS, however, is that its false-positive rate cannot meet the requirements of practical applications [4]. In recent years, universities and labs, including the University of California (UC) Davis, UC Berkeley, Carnegie Mellon and the Massachusetts Institute of Technology (MIT), have done much work in the field of intrusion detection and have achieved a great deal. However, as the network environment becomes increasingly complex and new security threats emerge continuously, intrusion detection technology still has far to go, and many important issues remain to be addressed.
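To make the two classic detection styles mentioned above concrete, the following minimal sketch (not any particular IDS product) combines signature matching over audit records with a simple traffic-rate threshold. The signature patterns, record fields and threshold value are illustrative assumptions.

```python
import re
from collections import Counter

# Illustrative signatures (regexes over audit/log lines); real IDS rule sets are far richer.
SIGNATURES = {
    "sql_injection": re.compile(r"(union\s+select|or\s+1=1)", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./\.\./"),
}

RATE_THRESHOLD = 100  # packets per source per observation window; an assumed tuning parameter

def match_signatures(log_line):
    """Misuse detection: return the names of all signatures that match one audit/log line."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(log_line)]

def detect_floods(packets):
    """Anomaly detection: flag sources whose packet count in one window exceeds the threshold."""
    counts = Counter(pkt["src"] for pkt in packets)
    return {src: n for src, n in counts.items() if n > RATE_THRESHOLD}

if __name__ == "__main__":
    logs = ["GET /index.html", "GET /login?user=admin' OR 1=1--"]
    for line in logs:
        hits = match_signatures(line)
        if hits:
            print("ALERT", hits, line)
    packets = [{"src": "10.0.0.5"}] * 150 + [{"src": "10.0.0.7"}] * 3
    print("Suspected flooding sources:", detect_floods(packets))
```

The sketch also hints at the false-positive problem discussed above: both the signatures and the threshold must be tuned, and either can flag legitimate traffic.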

1.2 Development Trend of Network Security Technologies

In the presence of a large-scale network security event, any isolated method would be useless. To achieve an ideal effect in the fight against Internet-wide attacks, all Internet users need to participate in the defense. This idea is proposed in [5], and research groups at UC Berkeley and Intel are also studying this issue. To achieve such wide participation of Internet users, many technical and non-technical difficulties have to be overcome, for instance, establishing trust relationships between users, protecting users' privacy, and developing Internet-wide distributed data processing techniques.

As [1] points out, the vulnerabilities of the Internet come from many sources and are present throughout its entire life cycle, from system design through implementation to management, so it is not advisable to take actions in an isolated or independent way. Network security will be regarded as an important criterion in the design and research of next-generation networks [6]. Many universities, such as Carnegie Mellon, Stanford, UC Berkeley, MIT and Princeton, and the research institutes of enterprises such as Microsoft, Cisco and Intel have been engaged in research in this challenging field.

2 Controllability and Manageability

The controllability and manageability of a network refer to the capability to effectively control and manage user behaviors, network states, and network resources.

This capability is indispensable not only for constructing secure networks, but also for the healthy development of future networks and for continuous technical innovation.

2.1 User Behavior

To be trustworthy and secure, the network must have the capability to control and manage user behaviors.

Current network security research focuses on defense. Many researchers have recognized [7] that defense and deterrence should be of equal importance in achieving network security. To make the network deterrent, the best method is to allow the traffic flows in the network to self-authenticate.

That is to say, a label is attached to each flow, or even each packet. This label uniquely identifies the computer sending the packet, and it cannot be tampered with (or any tampering can be detected), so it is non-repudiable. On the other hand, when the label is used in the network to identify the source, it must not invade the privacy of the user. A public key signature mechanism called group signature [8] can fulfill this requirement, and novel, effective user management and control methods can be built on it. Snoeren and others [7] suggested deploying an authentication service (using group signature techniques) at the edge of the network to verify the origin of each packet entering the network. A packet that cannot be authenticated (i.e., one that does not belong to any known group) is not permitted to enter the network; conversely, if a packet turns out to be malicious, its sender can be identified from its group signature. Once the heavy computational cost is addressed, this approach is no doubt of great significance for improving the current Internet, which is difficult to manage and control.
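The following is a minimal structural sketch of this edge-authentication idea, assuming hypothetical verify() and open() operations in place of a real group-signature scheme (a real scheme, such as the one cited as [8], performs these operations cryptographically): packets that verify against some known group are admitted anonymously, and only the group manager can later "open" a malicious packet to identify its sender.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    payload: bytes
    group_id: str
    signature: bytes

class GroupManager:
    def __init__(self, known_groups):
        self.known_groups = known_groups  # group_id -> membership knowledge (manager-only view)

    def verify(self, packet):
        # Stub: accept packets whose group is known; a real scheme verifies the signature
        # without learning which member produced it.
        return packet.group_id in self.known_groups

    def open(self, packet):
        # Stub: reveal the sender; a real scheme recovers the signer from the signature itself.
        return f"member-of-{packet.group_id}:{packet.signature.hex()[:8]}"

def edge_filter(packets, manager):
    """Edge authentication service: admit only packets that verify against a known group."""
    admitted, rejected = [], []
    for pkt in packets:
        (admitted if manager.verify(pkt) else rejected).append(pkt)
    return admitted, rejected

if __name__ == "__main__":
    mgr = GroupManager(known_groups={"campus-A"})
    pkts = [Packet(b"hello", "campus-A", b"\x01\x02\x03\x04"),
            Packet(b"attack", "unknown", b"\xff\xff")]
    ok, dropped = edge_filter(pkts, mgr)
    print(len(ok), "admitted,", len(dropped), "dropped")
    # Deterrence: if an admitted packet later proves malicious, the manager can open it.
    print("suspected sender:", mgr.open(ok[0]))
```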

Similar to the group signature approach, the Security Architecture of Enterprise Network (SANE) proposed by Casado et al. [9] also places a centralized control point, called the Domain Controller, at the edge of the network, particularly within enterprise networks. All communication between hosts within its jurisdiction must obtain permission from the Domain Controller. With this centralized approach, user behaviors can be controlled and managed to a satisfactory level, and security policies can be easily deployed in the enterprise network. However, applying this approach in public networks while ensuring both monitoring and scalability remains a great challenge.
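A rough sketch of this centralized idea (not the actual SANE implementation) is shown below: hosts must request a capability from the Domain Controller before switches will forward their traffic, and the controller grants capabilities only for flows allowed by central policy. The hosts, policy entries and capability format are illustrative assumptions.

```python
class DomainController:
    def __init__(self, policy):
        # policy: set of (src_host, dst_host, dst_port) tuples allowed to communicate
        self.policy = policy
        self._next_cap = 0
        self.capabilities = {}   # capability id -> flow tuple

    def request_capability(self, src, dst, port):
        flow = (src, dst, port)
        if flow not in self.policy:
            return None           # communication denied by central policy
        self._next_cap += 1
        self.capabilities[self._next_cap] = flow
        return self._next_cap

    def validate(self, cap_id, src, dst, port):
        return self.capabilities.get(cap_id) == (src, dst, port)

class Switch:
    """Switches forward only flows that carry a capability the controller validates."""
    def __init__(self, controller):
        self.controller = controller

    def forward(self, cap_id, src, dst, port):
        return self.controller.validate(cap_id, src, dst, port)

if __name__ == "__main__":
    dc = DomainController(policy={("hostA", "fileserver", 445)})
    sw = Switch(dc)
    cap = dc.request_capability("hostA", "fileserver", 445)
    print("allowed:", sw.forward(cap, "hostA", "fileserver", 445))
    print("denied:", dc.request_capability("hostB", "fileserver", 445))
```

The sketch also makes the scalability concern visible: every flow setup passes through one controller, which is workable inside an enterprise but hard to extend to public networks.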

In addition to network security, the demand for supporting mobile equipment also requires the future network to effectively monitor the location information of users and end equipment. This can be regarded as another requirement for controllability and manageability of user behaviors. It was realized very early on that it would be highly efficient if location information could be integrated into the routing design of wireless networks [10]. Recently, some researchers have suggested that geographic location information should be fully considered in the protocol layers [11] of the future network architecture. The benefits of doing so are obvious, but how to provide efficient, Internet-wide location services for the rapidly increasing number of mobile devices is still an open issue. Gruteser [11] proposes a multi-resolution location service scheme. This scheme, based on the addressing strategy of the Public Switched Telephone Network (PSTN) and making full use of the characteristics of a hierarchical network, is feasible to a certain extent.
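The sketch below illustrates, in a loose way, what a multi-resolution location service could look like: a mobile node registers its location at several levels of a hierarchical (PSTN-like) address space, and a lookup reveals only the coarsest resolution the querier is entitled to see. The hierarchy levels and the privacy rule are assumptions for illustration, not the scheme in [11].

```python
class LocationService:
    LEVELS = ("country", "region", "cell")   # coarse -> fine resolution

    def __init__(self):
        self.registry = {}  # node id -> dict(level -> value)

    def update(self, node_id, country, region, cell):
        """Mobile node (or its access network) refreshes its hierarchical location."""
        self.registry[node_id] = {"country": country, "region": region, "cell": cell}

    def lookup(self, node_id, max_resolution):
        """Return the node's location truncated at max_resolution (a simple privacy control)."""
        entry = self.registry.get(node_id)
        if entry is None:
            return None
        allowed = self.LEVELS[: self.LEVELS.index(max_resolution) + 1]
        return {level: entry[level] for level in allowed}

if __name__ == "__main__":
    ls = LocationService()
    ls.update("phone-42", country="CN", region="Sichuan", cell="cell-0713")
    print(ls.lookup("phone-42", max_resolution="region"))  # no cell-level detail revealed
    print(ls.lookup("phone-42", max_resolution="cell"))
```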

2.2 Network State

In addition to configuring the network, the most important function of network management is to perceive, in a timely manner, the various kinds of state information of a running network.

The purpose of this perception is to detect, locate, reason about, and diagnose all kinds of abnormalities in time, including failures, attacks and degradation of QoS, so that proper measures can be taken. In the current Internet, however, the control and management functions depend on the data plane; a complete, coordinated distributed control is not available; and most control and management functions are added at a later stage rather than in the early design of the network. As a result, it is difficult for the network to effectively collect state information, find and locate abnormalities, and respond in time. For the management and control system of the future network, researchers have conceived several solutions from different angles. For example, Greenberg et al. [12] emphasize the advantages of centralized control; Clark et al. [13] introduce the concept of a "knowledge plane" and argue the necessity of reasoning and diagnosis; Shenker et al. [14] and Barford et al. [15] try to answer which function modules are absolutely necessary; Complexity Oblivious Network Management (CONMan) [16] pays more attention to separating management and control functions from the data forwarding function; and Maestro [17], building on the achievements of active and programmable networks [18], tries to work out a uniform operation platform for network control and management.

Greenberg et al. [12] advocate redividing the functionality of current routers into four planes: Data, Dissemination, Discovery and Decision, hence the name 4D. The Discovery plane is responsible for identifying the state information of the network, which is carried by the Dissemination plane to the Decision plane. The Decision plane then computes the proper routing and network configuration based on the collected information and sends its decisions to the Data plane. The basic goal of the 4D architecture is to simplify complicated network management and to realize automatic discovery of network state by means of centralized management and the reorganization of critical function modules.
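A toy sketch of this division of labor is given below: link state reported by the Discovery plane and carried by the Dissemination plane is fed to a centralized Decision plane, which here simply runs Dijkstra and pushes forwarding entries back to the Data plane. The topology, the costs and the use of shortest paths are illustrative assumptions rather than the 4D specification.

```python
import heapq

def shortest_path_next_hops(links, source):
    """Decision plane: Dijkstra over discovered links; returns dest -> next hop as seen from `source`."""
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    dist, first_hop, heap = {source: 0}, {}, [(0, source, None)]
    while heap:
        d, node, hop = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue
        for nbr, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                first_hop[nbr] = hop if hop is not None else nbr
                heapq.heappush(heap, (nd, nbr, first_hop[nbr]))
    return first_hop

if __name__ == "__main__":
    discovered_links = [("A", "B", 1), ("B", "C", 1), ("A", "C", 5)]   # from the Discovery plane
    fib_for_A = shortest_path_next_hops(discovered_links, "A")          # Decision plane output
    print("FIB pushed to router A's Data plane:", fib_for_A)            # {'B': 'B', 'C': 'B'}
```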

Similarly, CONMan, another new architecture, adopts the idea of centralized control. It is designed to simplify configuration operations in the data plane. In CONMan, the data plane protocols are abstracted into a few functional components (e.g., pipe, switch and filter). All these abstract components provide open interfaces to the management plane; thus, the management plane can easily convert a high-level demand into a series of cascading configurations of these components. CONMan is partly inspired by the concept of the decision plane in the 4D architecture. Drawing on the physically separated management channel of Signaling System No. 7, it also extends the "management channel" concept of 4D, allowing the data and management channels to be logically separated even though they share the same physical links. The designers of CONMan argue that this separation is necessary to ensure that network behaviors can be effectively monitored by the network management and control system.
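The sketch below shows the flavor of this abstraction, under assumed component names and methods (not the CONMan specification): the data plane is exposed to the management plane only through a few abstract components with uniform configure() interfaces, so a high-level demand ("connect a department to a server but block telnet") becomes a short cascade of component configurations.

```python
class Component:
    def __init__(self, name):
        self.name, self.config = name, {}

    def configure(self, **settings):
        self.config.update(settings)
        print(f"configure {self.name}: {settings}")

class Pipe(Component): pass       # an abstract point-to-point channel
class Filter(Component): pass     # an abstract packet filter
class Switch(Component): pass     # an abstract forwarding element

def realize_high_level_goal(src, dst, blocked_ports):
    """Management plane: translate one high-level demand into cascaded component configurations."""
    pipe = Pipe(f"pipe:{src}->{dst}")
    filt = Filter(f"filter:{src}->{dst}")
    switch = Switch(f"switch:{dst}-edge")
    pipe.configure(endpoints=(src, dst))
    filt.configure(drop_dst_ports=sorted(blocked_ports))
    switch.configure(forward={dst: "port-1"})
    return [pipe, filt, switch]

if __name__ == "__main__":
    realize_high_level_goal("dept-A", "server-1", blocked_ports={23})
```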

The idea of Maestro is to modularize and generalize network management and control functions. In this scheme, a generalized operation platform is developed. Each network control and management function is implemented as an independent application on the platform, and information exchange and isolation between modules are handled by the platform. The goal of Maestro is to abstract the current functions of the network (e.g., packet forwarding and routing maintenance) into single-function modules. These modules are easier to maintain and less error-prone, and they can be easily customized and assembled for diverse applications.
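A loose sketch of this platform idea follows: single-function modules are registered as applications on a common platform, and all information exchange between them goes through the platform rather than through direct calls, so modules can be assembled per deployment. The module names and event types are invented for illustration and are not Maestro's actual API.

```python
class Platform:
    def __init__(self):
        self.subscribers = {}            # event type -> list of handler callables

    def register(self, event_type, handler):
        self.subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, data):
        # The platform mediates all exchanges, keeping modules isolated from each other.
        for handler in self.subscribers.get(event_type, []):
            handler(data)

def link_monitor_module(platform):
    # Would normally poll devices; here it just raises one synthetic event.
    platform.publish("link_down", {"link": ("R1", "R2")})

def rerouting_module(event):
    print("rerouting around failed link", event["link"])

def alarm_module(event):
    print("raising operator alarm for", event["link"])

if __name__ == "__main__":
    p = Platform()
    p.register("link_down", rerouting_module)   # modules are assembled per deployment
    p.register("link_down", alarm_module)
    link_monitor_module(p)
```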

The above solutions reflect three major trends in the development of network control and management: separation of the management plane from the data plane, centralized control and management, and modularization of functions to facilitate assembly.

2.3 Network Resource

Without effective management, network resources cannot be utilized in a coordinated way, and the development of technologies and architectures is thereby hindered. In recent years, various network virtualization schemes have been proposed to overcome a flaw of the current Internet, namely its inability to support new technologies.

The goal of network virtualization is to enable the future Internet, constructed with new network construction models, to support all kinds of new technologies and services, especially new networking technologies, and to allow different end-to-end networks to co-exist on a common platform. Turner et al. [19] and Feamster et al. [20] introduce two typical models. In the model introduced by Turner et al. [19], a new layer called the substrate is inserted between layer 2 and layer 3. This substrate layer manages and abstracts all resources of the layers below it and provides services to the upper layer (i.e., layer 3). The "Concurrent Architectures are Better than One" (CABO) model introduced by Feamster et al. [20] is similar. It tries to construct a virtual infrastructure that provides the resource management and isolation functions needed for all Internet Service Providers (ISPs) to construct their own networks, whose protocols, services, forwarding, signaling and routing can differ from each other.
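As a simplified illustration of the substrate idea in [19] and [20] (not either model's actual design), the sketch below has a common layer that owns the physical resources, here just link bandwidth, and hands out isolated slices to virtual networks; the capacities and slice sizes are made-up numbers.

```python
class Substrate:
    def __init__(self, link_capacity):
        self.capacity = dict(link_capacity)      # physical link -> total bandwidth (Mb/s)
        self.allocations = {}                    # virtual network name -> {link: bandwidth}

    def create_virtual_network(self, name, demands):
        """Admit a virtual network only if every requested slice still fits; otherwise reject."""
        for link, bw in demands.items():
            used = sum(a.get(link, 0) for a in self.allocations.values())
            if used + bw > self.capacity.get(link, 0):
                return False
        self.allocations[name] = dict(demands)
        return True

if __name__ == "__main__":
    sub = Substrate({("A", "B"): 1000, ("B", "C"): 1000})
    print(sub.create_virtual_network("ISP-1", {("A", "B"): 600}))   # True: slice admitted
    print(sub.create_virtual_network("ISP-2", {("A", "B"): 600}))   # False: isolation preserved
```

Each admitted virtual network is then free to run its own protocols on top of its slice, which is exactly the co-existence property the virtualization proposals aim for.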

The idea of network virtualization can be summarized as follows: since it is impossible to design a generic networking/forwarding strategy (as in the case of ATM) for all services and applications, including those that may emerge in the future, it is better to recognize this fact and design a common platform for managing the resources needed by all networking methods, allowing all protocols and forwarding technologies to co-exist on this platform.

From the perspectives of controllability and manageability, the network virtualization approach solves some problems, but it also brings new ones. If the substrate layer does well enough in resource abstraction and management, the isolation of upper-layer networks can be thoroughly achieved, and each network can be developed independently. These manageability advantages, however, depend on the substrate layer having near-perfect resource management capability, which is exactly the greatest challenge.

In this sense, virtualization need not be the only way to effectively manage and control resources in preparation for future technology development. In fact, a project funded by the Future Internet Network Design (FIND) program, called "Enabling Future Internet innovation through Transit wire" (eFIT) [21], and a project [22] funded by the Chinese 973 Program have adopted different approaches. A common point in the basic designs of the two projects is to separate the task of offering services from that of providing the connectivity that supports them; that is, to separate the control and management of users at the network edge from the management and use of resources in the core, with the two connected seamlessly by properly defined mapping services. As Massey et al. [21] point out, this separation-and-control approach can effectively support the development of future technologies, but it takes a different implementation path from network virtualization. This approach is significant for strengthening the controllability and manageability of the network.
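A minimal sketch of the separation idea, under assumed identifier and locator formats (not the eFIT or 973 project designs), is given below: user-facing edge identifiers are kept apart from core transit locators, and a mapping service glues the two together, so either side can evolve independently.

```python
class MappingService:
    def __init__(self):
        self.mappings = {}            # edge identifier -> core locator

    def register(self, edge_id, core_locator):
        self.mappings[edge_id] = core_locator

    def resolve(self, edge_id):
        return self.mappings.get(edge_id)

def edge_router_send(mapping_service, dst_edge_id, payload):
    """Edge encapsulates user traffic toward the core locator returned by the mapping service."""
    locator = mapping_service.resolve(dst_edge_id)
    if locator is None:
        return None                   # unknown destination identifier
    return {"outer_dst": locator, "inner_dst": dst_edge_id, "payload": payload}

if __name__ == "__main__":
    ms = MappingService()
    ms.register("user:alice@site-1", "core:10.1.0.0/16")
    print(edge_router_send(ms, "user:alice@site-1", b"hello"))
```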

3 Survivability

Survivability refers to the capability of a system to fulfill its mission, in a timely manner, in the presence of attacks, failures, or accidents [23]. Network survivability is guaranteed by specific protection and restoration mechanisms, which recover the affected services when network failures occur.

Currently, network security is increasingly categorized under the domain of survivability. During the 25th IEEE Real-Time Systems Symposium in 2004, the International Infrastructure Survivability Workshop was held, focusing on the survivability challenges faced by today's network systems; according to its proceedings, solutions to these challenges should take network load, attacks and failures into account. Yurcik et al. [24] introduce the term "Survivability over Security" (SoS): in their view, traditional security techniques protect individual components, while survivability encompasses the functionality of an entire system, so survivability is a higher-level goal than security.

To simplify the description, we define the traditional network survivability that targets random failures as narrow-sense survivability, and the survivability that also covers man-made attacks as broad-sense survivability.

3.1 Narrow-sense Survivability

Research on network survivability originally focused on transmission networks. With the development of network services, the survivability of IP networks has attracted increasing attention.

The capacity of a link in a transmission network (e.g., Synchronous Digital Hierarchy (SDH) or Wavelength Division Multiplexing (WDM)) is quite large, so the failure of a single component may incur greater losses than in other networks. Research on the survivability of transmission networks started as early as the 1970s, and a large number of papers on this issue have since been published [25]. These studies can be classified in the following ways:

(1) By topological structure: research on ring and mesh networks.

(2) By service model: research on dynamic and static survivability algorithms.

(3) By recovery mechanism: research on self-healing rings, 1+1 protection, shared protection, path protection/restoration, link protection/restoration, sub-path protection/restoration, and cycle covers.

(4) By failure scenario: research on single-link failures, multiple-link failures, node failures, and region failures.

In sum, the purpose of these studies is to improve network resource utilization and to find a good trade-off between resource utilization and recovery time.
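To make the path protection/restoration category concrete, the following sketch provisions a working path plus a link-disjoint backup in the simplest possible way: compute a shortest path, remove its links, and compute a second path on what remains. This two-step heuristic is not optimal (Suurballe-style algorithms find the jointly optimal pair), and the topology is invented for illustration.

```python
import heapq

def shortest_path(links, src, dst):
    """Unit-cost shortest path over an undirected link list; returns the node sequence or None."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    heap, seen = [(0, src, [src])], set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nbr in graph.get(node, []):
            if nbr not in seen:
                heapq.heappush(heap, (cost + 1, nbr, path + [nbr]))
    return None

def protected_pair(links, src, dst):
    """Working path plus a link-disjoint backup, found by removing the working path's links."""
    working = shortest_path(links, src, dst)
    if working is None:
        return None, None
    used = {frozenset(edge) for edge in zip(working, working[1:])}
    remaining = [l for l in links if frozenset(l) not in used]
    return working, shortest_path(remaining, src, dst)

if __name__ == "__main__":
    topology = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("B", "D")]
    print(protected_pair(topology, "A", "C"))   # (['A', 'B', 'C'], ['A', 'D', 'C'])
```

The extra capacity consumed by the backup path is exactly the resource-utilization cost that the cited studies try to reduce, for example through shared protection.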

Meanwhile, with the expansion of network services, higher requirements have been placed on the reliability and availability of IP networks. The traditional "best effort" service model is far from meeting these requirements. When a failure occurs, traditional IP networks recover from it through dynamic routing convergence, so the recovery is very slow, often taking from several seconds to several minutes.

Such recovery times are unacceptable in high-speed backbone networks. Therefore, several fast self-recovery mechanisms were proposed at the beginning of this century to improve the availability and reliability of IP networks [26]. These mechanisms fall into three categories: self-recovery by network-wide routing reconstruction, locally pre-configured fast rerouting, and Multi-Protocol Label Switching (MPLS)-based protection switching. The first category makes use of the inherent self-recovery capability of IP routing protocols: when a local failure occurs, the route is recomputed in the new network state to recover from the failure. The second category pre-computes several routes; once a failure is detected locally, the failed route is replaced by a pre-computed backup route for data transmission, and the failure is thereby recovered. Current research on local route recovery focuses on Fast Reroute (FRR) [27] and Multi-Topology Routing (MTR) [28]. The third category sets up backup paths in advance and reserves resources for each task, so its recovery is very fast; protection switching is well suited to recovering MPLS networks from failures quickly. By protection granularity, protection switching mechanisms can be divided into two kinds: end-to-end and local.
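The sketch below illustrates the second category with one common flavor of fast reroute, the loop-free alternate (LFA): besides the primary next hop toward destination D, a router S pre-computes a neighbor N satisfying dist(N, D) < dist(N, S) + dist(S, D), so that traffic can be switched to N the moment the primary link fails, without waiting for reconvergence. The topology and costs are illustrative assumptions.

```python
import heapq

def dijkstra(graph, src):
    """Shortest-path distances from src; graph maps node -> list of (neighbor, cost)."""
    dist, heap = {src: 0}, [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, c in graph.get(u, []):
            if d + c < dist.get(v, float("inf")):
                dist[v] = d + c
                heapq.heappush(heap, (d + c, v))
    return dist

def primary_and_lfa(graph, s, d):
    """Primary next hop toward d, plus one loop-free alternate neighbor if any exists."""
    dist = {node: dijkstra(graph, node) for node in graph}
    primary = min(graph[s], key=lambda vc: vc[1] + dist[vc[0]].get(d, float("inf")))[0]
    lfas = [n for n, _ in graph[s]
            if n != primary and dist[n].get(d, float("inf")) < dist[n][s] + dist[s][d]]
    return primary, (lfas[0] if lfas else None)

if __name__ == "__main__":
    graph = {
        "S": [("A", 1), ("B", 2)],
        "A": [("S", 1), ("D", 1)],
        "B": [("S", 2), ("D", 1)],
        "D": [("A", 1), ("B", 1)],
    }
    print(primary_and_lfa(graph, "S", "D"))   # primary via A, pre-installed backup via B
```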

3.2 Broad-sense Survivability

Two main problems have to be solved in research on broad-sense survivability. The first is quantitative evaluation, which involves establishing and developing proper failure model theories and quantitative evaluation methods for network vulnerability analysis and the description of attacks. The second is the mechanisms and policies used to guarantee survivability, where both fault tolerance and intrusion tolerance must be taken into account rather than fault tolerance only, and where single techniques for homogeneous network environments must evolve into hierarchical, coordinated technologies for heterogeneous networks.

(1)Quantitative Evaluation

Since it is impossible to construct a perfectly survivable network in reality, quantitative evaluation of network survivability is quite useful and valuable. With such evaluation, network vulnerabilities can be found and potential risks identified, so that proper improvements can be made.

Research on the quantitative evaluation of survivability is still in an exploratory stage. Current work largely draws on research results in dependability. Dependability research has developed over many years, and several modeling methods have been established for different applications (e.g., the Petri net [29] state-space model), which lays a foundation for quantifying survivability. However, dependability analysis usually assumes that failures are accidental events of software or hardware, while the analysis of broad-sense survivability has to account for intentional, man-made failures in addition to accidental ones. Man-made failures may look like random, unrelated events, but they are actually carefully planned and correlated with each other, so they are difficult to describe correctly with typical random models.
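A small numerical sketch can show why purely random failure models can misjudge survivability against attacks. The code below estimates the probability that two nodes stay connected (i) under independent random link failures and (ii) after an adversary removes the single highest-degree node; on a hub-centered topology the first measure looks comfortable while the second fails outright. The topology, failure probability and attacker model are illustrative assumptions, not an established evaluation method.

```python
import random

def connected(links, src, dst):
    """Simple reachability check over an undirected link list."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    stack, seen = [src], {src}
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        for nbr in graph.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return False

def random_failure_survivability(links, src, dst, p_fail=0.1, trials=10000):
    """Monte Carlo estimate of connectivity under independent random link failures."""
    ok = sum(connected([l for l in links if random.random() > p_fail], src, dst)
             for _ in range(trials))
    return ok / trials

def targeted_attack_survives(links, src, dst):
    """Does connectivity survive removal of the highest-degree (most attractive) node?"""
    degree = {}
    for a, b in links:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    victim = max(degree, key=degree.get)
    return connected([l for l in links if victim not in l], src, dst)

if __name__ == "__main__":
    links = [("S", "H"), ("H", "D"), ("S", "A"), ("A", "H")]   # hub-centered toy topology
    print("random-failure survivability:", random_failure_survivability(links, "S", "D"))
    print("survives targeted attack on hub:", targeted_attack_survives(links, "S", "D"))
```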

Currently, much research has been done on the quantitative analysis of survivability and on the theories and techniques of intrusion tolerance, intrusion detection and security models, but little on the effect of malicious attacks on network survivability. The world's main research organizations in the quantitative analysis of survivability include Virginia University, Arizona University, Carnegie Mellon University, and the Computer Emergency Response Team (CERT), each focusing on different aspects. For example, Virginia University and Arizona University focus their research on the quantification [30] and architecture [31] of system survivability. With the aid of graphs, Jha et al. [32] convert the survivability evaluation of a network system into the framework of a typical graph problem. These studies are still exploratory.

For the quantitative evaluation of network survivability, it is very important to establish a failure model theory for network vulnerability analysis and user behavior description, which is also the greatest challenge in quantitative evaluation. The characterization of network failures is critical to the design of a survivability scheme: only after the failure characteristics and models are accurately established can an optimal design be worked out. In 2004, UC Davis studied the characterization of failures in an IP backbone [33], but the results are not yet sufficient for setting up a suitable evaluation model.

(2)Survivability Technology and Policy

Current research on survivability mechanisms and algorithms focuses on given network failures (e.g., single or dual failures) or assumes that network failures are accidental; little work has been done on network recovery technologies against malicious attacks. An error occurs accidentally, but an attack is an action that intentionally exploits the vulnerabilities and defects of the system, making the number and scenarios of failures uncertain. Because of this difference between errors and attacks, the existing protection and recovery technologies, which are oriented toward random failures, cannot be directly used to solve the malicious attack problem, and the old survivability mechanisms and routing algorithms are no longer applicable. Therefore, solutions for the fault-tolerance and intrusion-tolerance problems in trustworthy networks still need to be worked out.

Broad-sense survivability aims to improve the overall survivability of the network and puts emphasis on hierarchical, multi-domain and multi-layer design. Puype et al. [34] discussed the survivability of multi-layer networks. With information exchanged between layers, the network flexibly decides when and where to take recovery actions, thus creating effective policies for inter-layer adjustment and coordinating the failure recovery mechanisms of different layers. Consequently, competition between these recovery mechanisms is avoided, and the overall survivability of the network is improved. Huang and Messier [35] studied network survivability in multi-domain environments and enumerated the problems and challenges of current multi-domain networks.

4 Conclusions

This article has discussed current research on critical technologies related to trustworthy networks from the perspectives of network security, controllability, manageability and survivability, and has analyzed development trends and directions. As can be seen, research on trustworthy networks is just beginning, and many problems remain to be solved, for example: integrating the isolated and independent security policies and technologies; designing the network architecture so that it offers built-in security protection and deterrence capabilities; ensuring a high level of control and management over user behaviors without invading users' privacy; finding a trade-off between centralized control and scalability; and quantitatively evaluating the network's error-tolerant, intrusion-tolerant and failure-tolerant capabilities. With technical advancement and the common efforts of the industry, these problems can be solved, and the goal of a trustworthy network can eventually be achieved.