Research on Interoperability in Spatial Catalogue Service between CS/W and THREDDS

2010-09-05 12:44
Journal of Yangtze River Scientific Research Institute, 2010, Issue 1


HU Cheng-fang1,2, DI Li-ping2, YANG Wen-li2
(1. Spatial Information Technology Research Institute, Yangtze River Scientific Research Institute, Wuhan 430010, China; 2. Center for Spatial Information Science and Systems, George Mason University, Greenbelt 20770, U.S.)

Abstract: Interoperability and sharing of GIS (geographic information system) information among organizations and communities have become more important because of the growing demands of GIS research and applications. Over the past several years, interoperability gaps have obstructed cross-protocol and cross-community data access within the Earth science community. One such gap lies between two protocol families developed within the geospatial and Earth science communities. Although data from both the geoscience community and the OGC (Open Geospatial Consortium) community are needed, there is no interoperability product between OGC CS/W (catalog service for web) and the THREDDS (thematic realtime environmental distributed data services) catalog system. The first problem is the heterogeneity of the information models of these two catalogue systems. This paper focuses on information model transformation. The ISO profile of CS/W and its interoperability with the THREDDS catalogue schema are discussed. The paper then describes the detailed strategy used for element mapping, illustrates how semantic, structural, and syntax heterogeneity are resolved, and gives a method to improve the flexibility of a CS/W server when it connects with CS/W clients that use different profiles.

Key words: CS/W; THREDDS; catalogue; metadata; interoperability

1 INTRODUCTION

In recent years, remote-sensing technology has advanced to such a degree that almost every aspect of the Earth system can be monitored. This unprecedented capability, however, also poses considerable challenges to the data and information systems that support Earth system research and applications, which typically require integrating multi-discipline, multi-mission data for analysis and decision-making. One of the biggest challenges is how to integrate the traditional mission-based data and information systems, which typically support only a single mission or a single discipline or community, so that they provide seamless cross-discipline, cross-system, and cross-community data and information support to Earth system research and applications.

This paper focuses on catalog service interoperability between the OGC community and the geoscience community. In addition, the problems of CS/W interoperability between different profiles are discussed.

The OGC is a non-profit, international, voluntary consensus standards organization that leads the development of standards for geospatial and location-based services. It has developed a set of web-based interoperability protocols that are widely accepted by the geospatial, land science, and applied Earth science communities, which typically use GIS tools for data analysis and decision support.

Meanwhile, in the atmosphere, ocean, and modeling science communities, data access protocols such as netCDF/http, OPeNDAP, and ADDE have been widely used. These protocols are called geoscience protocols, and the servers and clients that implement them are called geoscience servers and clients.

In both the OGC community and the geoscience community, catalog services are important for users to find the data and services they need. OGC has developed CS/W to support the registry and discovery of geospatial information. In the geoscience community, Unidata has developed the THREDDS data catalog system so that users can find data served with geoscience protocols. Although Earth system science, global change research, and applied geospatial research all need data from both communities for integrated analysis, modeling, and decision support, there is no interoperability product between OGC CS/W and the THREDDS catalog system. The interoperability gaps between the protocols make cross-protocol and cross-community data access extremely difficult. A gateway product is needed to connect THREDDS and CS/W, and resolving semantic heterogeneity is an important task in implementing it. This paper focuses on how to implement semantic interoperability between the OGC CS/W protocol and the THREDDS catalog system.

2 INFORMATION MODELS COMPARISON

2.1 Information Model of THREDDS Catalog

A THREDDS catalogue is built from a series of catalog XML files that comply with the THREDDS Dataset Inventory Catalog Specification Version 1.0. In the schema, the root element is “catalog”, and one or more “dataset” elements may occur under the root element. A “dataset” can be regarded as the component unit of the catalogue system; it may describe an independent dataset or a collection dataset. A collection dataset is a group that can itself include collections or independent datasets. Within an XML document, reference elements link one XML document to another, following the levels of the tree.

The base catalog elements include: catalog, service, dataset, access, and catalogRef.

The catalog element is the top-level element. It may contain zero or more service elements, followed by zero or more property elements, followed by one or more dataset or catalogRef elements.

A service element represents a data service. Its name must be unique among all service elements within the catalog.

An access element specifies how a dataset can be accessed through a data service.

A dataset element represents a named, logical set of data at a level of granularity appropriate for presentation to a user. A dataset is direct if it contains at least one access path; otherwise it is just a container for nested datasets, called a collection dataset.

A catalogRef element refers to another catalog that becomes a dataset inside this catalog. It is used to maintain catalogs separately and to break up large catalogs. The referenced catalog should not be read until the user explicitly requests it, so that very large dataset collections can be represented with catalogRef elements without large delays in presenting them to the user.
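The element roles described above can be illustrated with a minimal Python sketch. The sample catalog is hypothetical and simplified: real THREDDS catalogs declare an XML namespace and catalogRef uses XLink attributes, both of which are omitted here, and the function names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free catalog used only for illustration.
SAMPLE_CATALOG = """
<catalog name="Example catalog">
  <service name="odap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="Model output" ID="model">
    <dataset name="Run 2007-08-18" ID="model/run1">
      <access serviceName="odap" urlPath="model/run1.nc"/>
    </dataset>
  </dataset>
  <catalogRef name="Archive" href="archive/catalog.xml"/>
</catalog>
"""


def is_direct(dataset):
    """A dataset is direct if it has at least one access path of its own."""
    return dataset.find("access") is not None or dataset.get("urlPath") is not None


def summarize(catalog_xml):
    """Return (name, kind) pairs for every dataset and catalogRef, in document order."""
    root = ET.fromstring(catalog_xml)
    summary = []
    for element in root.iter():
        if element.tag == "dataset":
            kind = "direct" if is_direct(element) else "collection"
            summary.append((element.get("name"), kind))
        elif element.tag == "catalogRef":
            # The referenced catalog is recorded but deliberately not fetched.
            summary.append((element.get("name"), "catalogRef"))
    return summary


for name, kind in summarize(SAMPLE_CATALOG):
    print(kind, name)
```

The catalogRef entry is only listed, never followed, which mirrors the rule that a referenced catalog is read only when the user asks for it.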

2.2 Information Model of CS/W

In the GMU ISO 19115 profile implementation, we have implemented two versions of the specification: OGC 04-038r4 and OGC 07-045. The metadata information is based on ISO 19115:2003, Geographic information-Metadata (with ISO 19115:2003/Cor.1:2006, Geographic information-Metadata-Technical Corrigendum 1). OGC 07-045 supports XML encoding per ISO/TS 19139 (10/2005). This International Standard identifies the metadata required to describe digital geographic data. It is composed of one or more metadata sections (UML packages) containing one or more metadata entities (UML classes). ISO 19115:2003 specifies a general-purpose model for metadata descriptions.

Fig.1 gives a high-level overview of the basic classes of the information model. The classes belong to basic packages that are specified by ISO 19115:2003/Cor.1:2006. Metadata entity set information consists of the entity (UML class) MD_Metadata, which is mandatory. The MD_Metadata entity contains both mandatory and optional metadata elements (UML attributes).

Identification information contains information to uniquely identify the data. It includes information about the citation for the resource, an abstract, the purpose, credit, the status, and points of contact. The MD_Identification entity is mandatory. It contains mandatory, conditional, and optional elements. The MD_Identification entity may be specified (subclassed) as MD_DataIdentification when used to identify data.

2.3 Information model interoperability

From the analysis of the catalogue information models above, it can be concluded that a THREDDS catalogue XML file may contain several dataset elements at the same time, each with its own metadata description. The metadata of these datasets may be completely independent, or may be inherited from one dataset by another. If two datasets are independent, their metadata have no intersection. If two datasets are in a parent-child relation, the metadata of the child dataset consist of two parts: the metadata inherited from its parent, and its own. The first part does not appear within the scope of the child dataset element; it appears only in the parent element, flagged by the attribute “inherited” with the value “true”. Because of this structure, datasets at the same level of the catalogue tree that share a parent may carry some of the same descriptive metadata, while each still keeps metadata specific to itself.

Besides the “dataset” element, there is also a special element, “catalogRef”. A catalogRef element refers to another catalog that becomes a dataset inside this catalog. It is used to maintain catalogs separately and to break up large catalogs. The referenced catalog should not be read until the user explicitly requests it, so that very large dataset collections can be represented with catalogRef elements without large delays in presenting them to the user. The referenced catalog is not textually substituted into the containing catalog but remains a self-contained object. If we regard the THREDDS XML file as a dataset tree A and the dataset node that contains the catalogRef element as B, then the catalogRef element refers to another dataset tree C, and all datasets in tree C are descendants of node B.

The THREDDS client parses the hierarchy from the XPath of the elements in a catalogue XML file: one dataset element can include another, and the XML nesting presents their hierarchical structure. However, a CS/W profile cannot represent the hierarchy simply by following the XPath. In GMU CS/W, the major elements used to describe metadata are “MD_Metadata” and “DataGranule”, corresponding to ISO 19115. According to the CS/W profile specification, an “MD_Metadata” element cannot contain another “MD_Metadata” element; the value of MD_Metadata.hierarchyLevel.MD_ScopeCode@codeListValue that describes an “MD_Metadata” element is Dataset or Datasetcollection. A “DataGranule” element cannot include another “DataGranule” either. The CS/W client therefore cannot obtain the hierarchy directly from the XPath of the XML file. The CS/W profiles store the parent-child relation in the opposite direction. In THREDDS, all child datasets are listed under the parent dataset, except for datasets referred to by catalogRef, so the client can read the hierarchy from the XPath. CS/W instead uses a reference element that stores the parent ID to keep the hierarchy. In the ISO 19115 profile, this element is MD_Metadata.parentIdentifier.
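The inheritance rule for dataset metadata can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the sample catalog, its element values, and the function names are hypothetical, and the XML namespace of real THREDDS catalogs is omitted.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free sample: the parent carries inheritable metadata.
SAMPLE = """
<catalog>
  <dataset name="parent" ID="p">
    <metadata inherited="true">
      <authority>edu.example</authority>
      <dataType>Grid</dataType>
    </metadata>
    <dataset name="child-a" ID="p/a">
      <documentation>own description A</documentation>
    </dataset>
    <dataset name="child-b" ID="p/b">
      <dataType>Image</dataType>
    </dataset>
  </dataset>
</catalog>
"""

# Child elements that are structure rather than descriptive metadata.
STRUCTURAL = {"dataset", "metadata", "access", "catalogRef"}


def own_and_inheritable(dataset):
    """Split a dataset's metadata into its own part and the part passed to children."""
    own, inheritable = {}, {}
    for child in dataset:
        if child.tag == "metadata":
            target = inheritable if child.get("inherited") == "true" else own
            for item in child:
                target[item.tag] = (item.text or "").strip()
        elif child.tag not in STRUCTURAL:
            own[child.tag] = (child.text or "").strip()
    return own, inheritable


def resolve(dataset, inherited, out):
    """Record the full metadata of a dataset and recurse into nested datasets."""
    own, inheritable = own_and_inheritable(dataset)
    passed_down = {**inherited, **inheritable}
    # Own values override inherited values with the same name.
    out[dataset.get("ID")] = {**passed_down, **own}
    for child in dataset.findall("dataset"):
        resolve(child, passed_down, out)


def resolve_catalog(xml_text):
    root = ET.fromstring(xml_text)
    out = {}
    for top in root.findall("dataset"):
        resolve(top, {}, out)
    return out


full = resolve_catalog(SAMPLE)
print(full["p/a"])
print(full["p/b"])
```

Both children receive the inherited `authority`, while `child-b` keeps its own `dataType`, which matches the observation that sibling datasets share some metadata yet remain distinct.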

Fig.1 Metadata basic classes excerpted from ISO 19115:2003/Cor.1:2006

Based on the analysis above, some structural transformation is needed from THREDDS to CS/W. The hierarchy pointers stored in a THREDDS catalogue XML file run from parent to child. We parse them and reverse them so that they run from child to parent in CS/W. To best suit the CS/W search method, we also split the datasets in one THREDDS catalog XML file into several dataset units, each of which carries its full metadata description. Although this duplicates some storage in the CS/W database, when CS/W retrieves a dataset unit, for example an "MD_Metadata" record, it does not need a further search to obtain the metadata of its parent.
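A minimal Python sketch of this pointer reversal is given below. It assumes a hypothetical, namespace-free catalog and represents each CS/W unit as a plain dictionary; the key names follow the ISO element names mentioned in the text, but the record layout itself is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested catalog: hierarchy is expressed by XML nesting (parent to child).
SAMPLE = """
<catalog>
  <dataset name="Model output" ID="model">
    <dataset name="Run 1" ID="model/run1"/>
    <dataset name="Run 2" ID="model/run2"/>
  </dataset>
</catalog>
"""


def flatten(xml_text):
    """Turn nested datasets into flat units that point from child to parent."""
    root = ET.fromstring(xml_text)
    records = []

    def visit(dataset, parent_id):
        children = dataset.findall("dataset")
        records.append({
            "fileIdentifier": dataset.get("ID"),
            "title": dataset.get("name"),
            # The hierarchy is now kept as a reference to the parent.
            "parentIdentifier": parent_id,
            "hierarchyLevel": "datasetcollection" if children else "dataset",
        })
        for child in children:
            visit(child, dataset.get("ID"))

    for top in root.findall("dataset"):
        visit(top, None)
    return records


for record in flatten(SAMPLE):
    print(record)
```

Each unit is self-describing, so a lookup of `model/run1` needs no second query to learn which collection it belongs to.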

3 SEMANTIC MAPPING

The other important task in achieving interoperability is semantic interoperation between the THREDDS catalogue and CS/W. The element mapping runs from THREDDS to CS/W, and its purpose is to ensure that every element in THREDDS has a corresponding element in the CS/W profile. The THREDDS catalogue and ISO 19115 define metadata in distinct ways; they differ not only in the syntax of names but also in semantics and element structure. The following sections describe the mapping work for each of these aspects.

3.1 Syntax mapping

Some metadata elements in THREDDS and in the CS/W profile have the same meaning but are expressed with different syntax. In this case, one element can be mapped directly to the other. The schema of the ISO 19115 profile is based on the ISO/TS 19139 XML schemas (May 4, 2006).

For example, /catalog/dataset/@name can be mapped to /MD_Metadata/identificationInfo/MD_DataIdentification/citation/CI_Citation/title. The “name” in THREDDS and the “title” in the ISO profile denote the same thing: the name by which the cited resource is known. In the schemas, their types are “xsd:string” and “scXML:CharacterString”, so the value can be transferred without any problem.
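As a rough illustration of such a direct mapping, the Python sketch below copies the `name` attribute into a nested `title` element. It is a simplification under stated assumptions: namespaces and the CharacterString wrapper used by the ISO XML encoding are omitted, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Target path taken from the text, without namespaces.
ISO_TITLE_PATH = (
    "MD_Metadata/identificationInfo/MD_DataIdentification/"
    "citation/CI_Citation/title"
)


def build_path(path, text):
    """Create the chain of nested elements named in path and put text in the last one."""
    names = path.split("/")
    root = ET.Element(names[0])
    node = root
    for name in names[1:]:
        node = ET.SubElement(node, name)
    node.text = text
    return root


def map_name_to_title(dataset):
    """Map /catalog/dataset/@name to the ISO citation title."""
    return build_path(ISO_TITLE_PATH, dataset.get("name"))


dataset = ET.fromstring('<dataset name="Sea surface temperature"/>')
print(ET.tostring(map_name_to_title(dataset), encoding="unicode"))
```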

3.2 Element structure transformation

During mapping, structure mapping must be completed as well as syntax mapping. An example of structure mapping is described below.

In a THREDDS catalog, the attribute of an element may play an important role in conveying information. Different attribute values give the element different meanings. For example:

<date type="modified">2007-08-18 10:24:28z</date>

This means that the modified date of this dataset is 2007-08-18 10:24:28. The value of /catalog/dataset/date@type is restricted to the enumeration values “created”, “modified”, “valid”, “issued”, and “available”. These five values mean that the “date” element can represent five different kinds of date.

However, in ISO 19115 the “date” element and its “type” attribute cannot be translated into a corresponding structure. ISO 19115 does not use an attribute to restrict the date type; instead it uses two sibling elements, one holding the date and the other holding the type of the date. The result is shown as follows:

Table 1 CI_DateTypeCode code list value

The elements can be mapped as shown in Table 2. The THREDDS column holds the value of the “type” attribute of “date”. The ISO column holds the corresponding value, which is mapped into the ISO element /MD_Metadata/identificationInfo/MD_DataIdentification/citation/CI_Citation/date/CI_Date/dateType/CI_DateTypeCode/@codeListValue.

Table 2 Date type mapping

The attribute “type” offers five values. However, two of them, “valid” and “available”, cannot be mapped directly to ISO CI_DateTypeCode code list values. Fortunately, other elements can be used to cover them, as indicated in Table 3.

Table 3 Date type mapping
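The structural change described in this section, from an attribute on “date” to two sibling elements, can be sketched in Python as below. The pairing of THREDDS date types with CI_DateTypeCode values in `ASSUMED_DATE_TYPE_MAP` is an assumption made for illustration only and is not taken from the tables of this paper; namespaces are omitted and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: illustrative pairing only, not the content of Table 2.
# "creation", "publication" and "revision" are CI_DateTypeCode values in ISO 19115:2003.
ASSUMED_DATE_TYPE_MAP = {
    "created": "creation",
    "issued": "publication",
    "modified": "revision",
}


def to_ci_date(date_element):
    """Return a CI_Date element, or None when the type has no direct code list value."""
    code = ASSUMED_DATE_TYPE_MAP.get(date_element.get("type"))
    if code is None:
        # "valid" and "available" have to be carried by other ISO elements.
        return None
    ci_date = ET.Element("CI_Date")
    # First sibling: the date value itself.
    ET.SubElement(ci_date, "date").text = (date_element.text or "").strip()
    # Second sibling: the kind of date, expressed as a code list value.
    date_type = ET.SubElement(ci_date, "dateType")
    ET.SubElement(date_type, "CI_DateTypeCode", codeListValue=code)
    return ci_date


source = ET.fromstring('<date type="modified">2007-08-18 10:24:28z</date>')
print(ET.tostring(to_ci_date(source), encoding="unicode"))
```

Returning `None` for the unmapped types keeps the special handling of “valid” and “available” explicit instead of forcing them into a code list value that does not fit.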

4 INTEROPERABILITY BETWEEN CS/W SERVER AND CLIENT WITH DIFFERENT PROFILES

The OpenGIS Catalogue Services Interface Standard (CAT) supports the ability to publish and search collections of descriptive information (metadata) about geospatial data, services, and related resources. Providers of resources use catalogues to register metadata that conform to the provider’s choice of an information model; such models include descriptions of spatial references and thematic information. Client applications can then search for geospatial data and services in very efficient ways.

The CS/W specification specifies the interfaces, bindings, and a framework for defining application profiles required to publish and access digital catalogues of metadata for geospatial data, services, and related resource information. A profile is a set of one or more base standards and, where applicable, the identification of chosen clauses, classes, subsets, options, and parameters of those base standards that are necessary for accomplishing a particular function. Application profiles should be explicit about the selected query languages and any features peculiar to a scope of application.

OGC CS/W exposes the following methods:

(1) GetCapabilities: retrieve service metadata, including which operations are supported, a description of the catalog, contact information, etc.;

(2) DescribeRecord: learn about the catalog’s information model;

(3) GetDomain: dynamically retrieve information about the data range of a particular parameter;

(4) GetRecords: search the catalog, returning search results and associated metadata;

(5) GetRecordById: search the catalog, returning search results that match the given ID.
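As a small illustration of how a client might address these operations, the Python sketch below builds key-value-pair request URLs. The endpoint `http://example.org/csw` and the helper name are hypothetical, and the sketch assumes a server that accepts HTTP GET requests with the `service`, `version`, and `request` parameters of CS/W 2.0.2.

```python
from urllib.parse import urlencode

CSW_ENDPOINT = "http://example.org/csw"  # hypothetical endpoint


def csw_url(request, **params):
    """Build a key-value-pair URL for one CS/W operation."""
    query = {"service": "CSW", "version": "2.0.2", "request": request}
    query.update(params)
    return CSW_ENDPOINT + "?" + urlencode(query)


print(csw_url("GetCapabilities"))
print(csw_url("GetRecordById", id="model/run1"))
```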

The names of the above methods are the same in all CS/W application profiles; however, their contents vary between profiles. If the client sends a "GetCapabilities" or "DescribeRecord" request, the response describes the basic information of the CS/W server, including which profile the server uses. If the client sends a "GetRecords" or "GetRecordById" request, the content of the request is specified according to the profile in which the client expects to receive the result of the query. If the client and the server use different profiles, communication between them becomes a challenge. To improve the interoperability of the CS/W server, the server should support the requests and responses of most CS/W application profiles, so that CS/W clients based on different profiles can access it. For this purpose, we propose a CS/W profile adaptor placed in front of the core CS/W server.

The background core CS/W server accepts only one specified XML format for requests and responses, designed by the organization that runs the server. We call this format the core query language. The structure of the CS/W profile adaptor is as follows. A request is received from the CS/W client. After its profile is identified, the request is converted into the format of the core CS/W server according to fixed rules and then sent to the background core CS/W server. In the next step, the response is received from the background CS/W server, converted into the format that complies with the specification of the profile, and sent back to the client. Two key challenges must be resolved in this process: request mapping and response mapping.
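The request and response flow of the adaptor can be sketched in Python as below. This is a schematic under stated assumptions, not the GMU implementation: requests and responses are plain strings, and the profile table, the `apiso:Title` property name, and the echoing core server are hypothetical stand-ins.

```python
def make_adaptor(profiles, core_server):
    """Build a request handler that sits in front of the core server.

    profiles maps a profile name to a (detect, to_core, from_core) tuple of callables.
    """
    def handle(request):
        for name, (detect, to_core, from_core) in profiles.items():
            if detect(request):
                # Request mapping, core query, then response mapping.
                core_response = core_server(to_core(request))
                return from_core(core_response)
        raise ValueError("no registered profile recognizes this request")
    return handle


# Hypothetical demo: a single "iso" profile and a core server that echoes its input.
DEMO_PROFILES = {
    "iso": (
        lambda request: "apiso:" in request,
        lambda request: request.replace("apiso:Title", "title"),
        lambda response: response.replace("title", "apiso:Title"),
    ),
}


def demo_core_server(core_request):
    return "matched " + core_request


handle = make_adaptor(DEMO_PROFILES, demo_core_server)
print(handle("apiso:Title=ocean"))
```

Keeping the three callables per profile in one table means that support for another client profile is added by registering one more entry, without touching the core server.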

4.1 Request Mapping

The XML format of a “GetRecords” request sent to the core CS/W server should also strictly follow the CS/W specification; however, the values of PropertyName are changed to match the schema of the core CS/W server. The following is a section of such a “GetRecords” request that complies with the ISO application profile, and the corresponding example for the GMU CS/W core server is given as follows.

With the help of the configuration file, the request sent from the client can be changed to the uniform format that the core CS/W server accepts.
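A minimal Python sketch of this kind of request mapping is shown below. It rewrites only the PropertyName values of an OGC filter and leaves the rest of the request untouched. The mapping table stands in for the configuration file, and the property names `apiso:Title`, `core:title`, and the sample filter are hypothetical.

```python
import xml.etree.ElementTree as ET

OGC_NS = "http://www.opengis.net/ogc"

# Hypothetical configuration: profile property name -> core server property name.
PROPERTY_MAP = {"apiso:Title": "core:title", "apiso:Abstract": "core:abstract"}

# Hypothetical filter fragment of a GetRecords request.
SAMPLE_FILTER = """
<ogc:Filter xmlns:ogc="http://www.opengis.net/ogc">
  <ogc:PropertyIsEqualTo>
    <ogc:PropertyName>apiso:Title</ogc:PropertyName>
    <ogc:Literal>Sea surface temperature</ogc:Literal>
  </ogc:PropertyIsEqualTo>
</ogc:Filter>
"""


def rewrite_property_names(filter_xml, mapping):
    """Replace each PropertyName value that has an entry in mapping."""
    root = ET.fromstring(filter_xml)
    for prop in root.iter("{%s}PropertyName" % OGC_NS):
        name = (prop.text or "").strip()
        # Names without a mapping entry are passed through unchanged.
        prop.text = mapping.get(name, name)
    return root


rewritten = rewrite_property_names(SAMPLE_FILTER, PROPERTY_MAP)
print(ET.tostring(rewritten, encoding="unicode"))
```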

Fig.2 The structure of CS/W profile adaptor

4.2 Response Mapping

Many methods can be used for XML transformation, and XSLT is a good choice. XSL stands for Extensible Stylesheet Language and is a style sheet language for XML documents. XSLT is used to transform an XML document into another XML document or into another type of document. In the transformation process, XSLT uses XPath to define the parts of the source document that should match one or more predefined templates. When a match is found, XSLT transforms the matching part of the source document into the result document.

A section of the metadata transformation configuration file gives an example of the change from the core query schema to the ISO Application Profile. A transformation configuration file of this kind is needed for the schema of each profile to complete the mapping process.
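The idea of a configuration-driven response mapping can be sketched in Python as follows. Note that this sketch does not use XSLT: the Python standard library has no XSLT processor, so a dictionary of rename rules is used here as a simplified stand-in for the template matching that an XSLT stylesheet would perform. The core element names and the rule table are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: core element name -> element name expected by the client profile.
RENAME_RULES = {
    "record": "MD_Metadata",
    "id": "fileIdentifier",
    "parent": "parentIdentifier",
}


def transform(core_xml, rules):
    """Rename every element that has a rule; leave all other elements unchanged."""
    root = ET.fromstring(core_xml)
    for element in root.iter():
        element.tag = rules.get(element.tag, element.tag)
    return ET.tostring(root, encoding="unicode")


core_response = "<record><id>model/run1</id><parent>model</parent></record>"
print(transform(core_response, RENAME_RULES))
```

A real adaptor would keep one rule set per profile, in the same way that one transformation configuration file is needed per profile schema.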

5 CONCLUSIONS

The interoperable sharing of GIS information among organizations and communities has become important because of the growing demand of GIS research and applications. This paper addresses semantic interoperation between THREDDS catalogues and GMU CS/W by comparing the differences between their information models and discussing the challenges of information transformation. It also describes the detailed strategy used for element mapping. With this analysis, the information in THREDDS catalogues can be ingested into GMU CS/W, which implements interoperability between THREDDS and GMU CS/W. With the help of the profile adaptor, the CS/W server can be more flexible when connecting with CS/W clients. In experiments, the GMU CS/W server connected successfully with the ESRI CS/W client and the GI-Go CS/W client.

REFERENCES:

[1] VOGES U, SENKLER K. Document Number: 07-045. OpenGIS® Catalogue services specification 2.0.2 - ISO metadata application profile[S].

[2] International Organization for Standardization. ISO 19115:2003, Geographic information - metadata[S].

[3] International Organization for Standardization. 2003b ISO/WD 19115-2.2, Geographic information - metadata - part 2: extensions for imagery and gridded data[S].

[4] International Organization for Standardization. ISO/TC 211 N 1979. ISO 19115:2003 Cor.1, Geographic information - metadata - technical corrigendum 1[S].

[5] International Organization for Standardization. ISO/TC 211 N 2049, Text for ISO/TS 19139 geographic information - metadata - XML schema implementation[S].

[6] UNIDATA. Dataset inventory catalog specification version 1.0[EB/OL]. (2004-12-15)[2009-12-2] http://www.unidata.ucar.edu/projects/THREDDS/tech/catalog/InvCatalogSpec.html.

[7] WEI Ya-xing, DI Li-ping, ZHAO Bao-hua, et al. The design and implementation of a grid-enabled catalogue service[C]∥IEEE International Geoscience and Remote Sensing Symposium. Proceeding of International Geoscience and Remote Sensing Symposium (IGARSS), COEX, Seoul, Korea, July 25-29, 2005: 4224-4227.

[8] CHEN Ai-jun, DI Li-ping, WEI Ya-xing, et al. Grid computing enabled geospatial catalogue Web service[C]∥American Society for Photogrammetry & Remote Sensing. Proceedings of ASPRS 2005 Conference, Baltimore, Maryland, U.S., March 7-11, 2005.

(Edited by LIU Yun-fei, YI Xin-hua)

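Research on Interoperability of Spatial Catalogue Services Based on CS/W and THREDDS

HU Cheng-fang1,2, DI Li-ping2, YANG Wen-li2
(1. Institute of Spatial Information Technology Application, Yangtze River Scientific Research Institute, Wuhan 430010; 2. George Mason University, Greenbelt 20770)

With the continuing development of geographic information technology, information sharing and interoperability among geographic information agencies, organizations, and communities have become increasingly important. Over the past several years, Earth science information research has faced obstacles to cross-protocol and cross-organization data sharing. One such obstacle lies between two research domains: the geospatial information domain and the Earth science information domain. Both domains need to share information, and an important research direction is sharing and interoperability at the level of spatial catalogue services, that is, interoperability between OGC CS/W and THREDDS, the representative services of the two domains. This paper first compares the information model architectures of the two spatial catalogue services and then focuses on methods for transforming between their heterogeneous information models. It next analyzes the differences between the catalogue schemas of the ISO profile in OGC and of THREDDS. On this basis, it studies the element mapping between the metadata of the two systems and the methods for resolving semantic and syntactic heterogeneity. Finally, it discusses how different CS/W servers and clients can communicate with each other conveniently.

CS/W; THREDDS; catalogue; metadata; interoperability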

TP311 Document code:A

1001-5485(2010)01-0073-07

date:2009-07-06

This project was funded by the NASA ACCESS programme (NNX06AB49A)

HU Cheng-fang (1978-), female, Ph.D., graduated from Wuhan University, China, in 2008. She is presently a research fellow of Changjiang River Scientific Research Institute. Her main interest is sharing and interoperability of GIS. (Tel.) 027-82826895 (E-mail) hucf_2@163.com
