Introduction
Science, industry and administration increasingly require web-based geoinformation, including its storage, availability and processing. This trend will continue as the amount of available spatial data grows, driven by more detailed data acquisition techniques and by improvements in Web technologies. Data is collected through airborne laser scanning, detailed 3D models, billions of affordable and therefore ubiquitous geo-enabled sensors and devices such as GPS-equipped smartphones, and data acquisition by the masses (crowd-sourcing).
This collected data is stored in huge databases that are available on the Web through Web Services. For geospatial applications, such services are organized and maintained in Spatial Data Infrastructures (SDIs). A popular example of an SDI is the one defined by the Infrastructure for Spatial Information in the European Community (INSPIRE) directive.
Over the past years, Geographic Information Systems (GIS), the common tool to manage, process and visualize local geospatial data, have been replaced by applications that access functionality hosted in SDIs for data retrieval and data portrayal. However, the processing that generates information for decision making is still performed with GIS on local databases. Examples of geoprocessing tasks range from simple buffer calculations to complex routing functions or simulations. Geoprocessing is what turns data into information that supports decision making.
To realize web-based geoinformation, geoprocessing in SDIs is essential. It is also crucial in terms of practicability and performance, as it allows geoinformation to be accessed from anywhere and the processing effort to be scaled in a distributed way over the Web. In particular, the following aspects of geoprocessing in SDIs are relevant and are described in this article:
Interoperability and standardized Web Services for geoprocessing
Client applications for geoprocessing
Performance and scalability of distributed geoprocessing.
These aspects are described on the basis of open source software. One of the initiatives providing comprehensive open source solutions for geoprocessing in SDIs is the 52°North initiative. 52°North strives for innovation in open source software development through its six communities, which cover Sensor Web, Geostatistics, ILWIS (a fully fledged desktop GIS), security, semantics and Geoprocessing. 52°North was founded by organizations from academia and industry and is thereby able to turn innovation from academia into applications for practice. The previously identified aspects of geoprocessing in SDIs are described in the remainder of the article. Finally, a conclusion highlights the most relevant achievements and provides an outlook on future challenges.
Interoperability and Standardized Web Services for Geoprocessing
Enabling distributed geospatial information on the Web is one of the identified goals for SDIs and requires interoperable components located on the Web. Standardized interfaces are key to enabling interoperability. Standards for geospatial applications are defined by the Open Geospatial Consortium (OGC) and may later be adopted by the International Organization for Standardization (as happened with the Web Map Service).
Distributed information is achieved in a two-step approach: 1) distributed data access and 2) distributed geoprocessing (turning the distributed data into information). In OGC terms, distributed geospatial data access is realized, for instance, by the Web Feature Service (for vector data). Standardized geoprocessing functionality is available through the OGC Web Processing Service (WPS) interface (the current version of the OGC WPS interface specification is 1.0.0). It defines operations
1) to retrieve service metadata (GetCapabilities),
2) to retrieve process metadata (DescribeProcess) and
3) to perform the specific process (Execute).
All operations are accessible over an internet protocol (HTTP) using structured messages (XML). The WPS is not restricted to any type of data or process, so service providers can host any type of process, combined with the appropriate data format, behind the WPS interface. Typical data formats supported by WPS are GML, KML and Shapefile for vector data, and GeoTIFF for raster data.
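As a rough illustration of the first two operations, the following sketch builds HTTP GET URLs for GetCapabilities and DescribeProcess using the key-value-pair encoding of WPS 1.0.0. The endpoint URL and the process identifier are placeholders chosen for this example and do not refer to a real service.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
WPS_ENDPOINT = "http://example.org/wps/WebProcessingService"


def get_capabilities_url(endpoint):
    """Build a GetCapabilities request URL (service metadata)."""
    params = {"service": "WPS", "request": "GetCapabilities"}
    return endpoint + "?" + urlencode(params)


def describe_process_url(endpoint, identifier, version="1.0.0"):
    """Build a DescribeProcess request URL (metadata of one process)."""
    params = {
        "service": "WPS",
        "version": version,
        "request": "DescribeProcess",
        "identifier": identifier,
    }
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    print(get_capabilities_url(WPS_ENDPOINT))
    # "example.buffer" is an assumed process identifier.
    print(describe_process_url(WPS_ENDPOINT, "example.buffer"))
```

Sending either URL with any HTTP client would return an XML document; the third operation, Execute, is usually sent as an XML message via HTTP POST.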
To process distributed data, WPS can read remote data sources that are accessible via URLs, such as those provided by WFS or WCS. Because the data is passed by reference and fetched by the service itself, it does not have to travel through the client. This is essential for chaining processes into geoprocessing workflows and for enabling distributed information in SDIs. It also minimizes the amount of data transferred to remote clients, which need only the resulting information.
The 52°North WPS implementation is fully compliant with WPS version 1.0.0 and includes a large set of functionality (220+ functions). Service providers can also extend this functionality with customized functions. The 52°North WPS handles different data formats such as GML, KML, Shapefile and GeoTIFF, and the set of supported formats can be extended according to the needs of the service provider.
Because the 52°North WPS is OGC compliant, it is accessible by different types of client applications such as desktop GIS or browser-based applications. A selection of client applications and their capabilities is introduced in the following section.
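Passing data by reference can be sketched as follows. The example assembles a WPS 1.0.0 Execute message in which the vector input is given as a URL (wps:Reference) rather than embedded in the request, while a numeric parameter is sent inline. The process identifier, the input names and the WFS URL are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

WPS_NS = "http://www.opengis.net/wps/1.0.0"
OWS_NS = "http://www.opengis.net/ows/1.1"
XLINK_NS = "http://www.w3.org/1999/xlink"

ET.register_namespace("wps", WPS_NS)
ET.register_namespace("ows", OWS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_execute(process_id, data_input_id, data_url, literals):
    """Return an Execute request as an XML string.

    The complex input is referenced by URL, so the service fetches
    the data itself; literal inputs are embedded in the message.
    """
    root = ET.Element(f"{{{WPS_NS}}}Execute",
                      {"service": "WPS", "version": "1.0.0"})
    ET.SubElement(root, f"{{{OWS_NS}}}Identifier").text = process_id
    inputs = ET.SubElement(root, f"{{{WPS_NS}}}DataInputs")

    # Input passed by reference (e.g. a WFS GetFeature URL).
    ref_input = ET.SubElement(inputs, f"{{{WPS_NS}}}Input")
    ET.SubElement(ref_input, f"{{{OWS_NS}}}Identifier").text = data_input_id
    ET.SubElement(ref_input, f"{{{WPS_NS}}}Reference",
                  {f"{{{XLINK_NS}}}href": data_url})

    # Literal inputs such as a buffer width.
    for name, value in literals.items():
        lit_input = ET.SubElement(inputs, f"{{{WPS_NS}}}Input")
        ET.SubElement(lit_input, f"{{{OWS_NS}}}Identifier").text = name
        data = ET.SubElement(lit_input, f"{{{WPS_NS}}}Data")
        ET.SubElement(data, f"{{{WPS_NS}}}LiteralData").text = str(value)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # All names and URLs below are hypothetical.
    print(build_execute(
        "example.buffer",
        "data",
        "http://example.org/wfs?service=WFS&version=1.1.0"
        "&request=GetFeature&typeName=roads",
        {"width": 10},
    ))
```

The message itself stays small regardless of the size of the referenced dataset, which is the property that makes chaining of services practical.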
Client Applications for Geoprocessing
Client applications are crucial for interacting with the services hosted in an SDI. To interact with a specific service, the client application needs to be compliant with its interface. A compliant client can guide the user through a customized form or wizard and finally visualize the result in a map display. In the case of geoprocessing, the form lets the user select the desired process and configure its parameters.
The client application then internally manages the communication required to provide the functionality requested by the user. Several client applications are available for interacting with geoprocessing services (WPS); they differ in purpose and functionality (browser-based client vs. desktop client).
In the following we describe some client applications that are available as open source software from the 52°North initiative. An example of a browser-based client is the OpenLayers client, which makes it possible to embed mapping functionality into arbitrary websites. OpenLayers lets users interact with and visualize web maps (served by WMS) and vector data (served through WFS).
The 52°North WPS OpenLayers client allows users to customize the web-based data by invoking remote functionality hosted by WPS. In the example in FIGURE 1, the client application shows the form to configure a buffer process (right-hand side) and visualizes the result of the buffer on a base map. Because it is based on interoperable standards, the client can be configured to include any web map and any process available through WPS. Most likely those services are hosted in an SDI.
An example of a desktop client is uDig. It is suitable for accessing content hosted in SDIs, as it allows users to integrate different web services for web mapping and for accessing vector data. The 52°North WPS uDig client makes it possible to configure and integrate remote functionality hosted by WPS. Moreover, it can export successful process results to Google Earth via KML. A screenshot of Google Earth showing simplified roads obtained as a WPS result is given in Figure 2.
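How a client might derive such a form from process metadata can be sketched as follows. The XML below is a hand-written, heavily simplified DescribeProcess response for a hypothetical buffer process, not the output of a real service, and the field dictionary is an assumed internal structure rather than the data model of any of the clients described here.

```python
import xml.etree.ElementTree as ET

OWS = "{http://www.opengis.net/ows/1.1}"

# Simplified, hand-written sample response (hypothetical process).
SAMPLE = """<wps:ProcessDescriptions
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1"
    service="WPS" version="1.0.0">
  <ProcessDescription wps:processVersion="1.0">
    <ows:Identifier>example.buffer</ows:Identifier>
    <ows:Title>Buffer</ows:Title>
    <DataInputs>
      <Input minOccurs="1" maxOccurs="1">
        <ows:Identifier>data</ows:Identifier>
        <ows:Title>Input features</ows:Title>
        <ComplexData/>
      </Input>
      <Input minOccurs="1" maxOccurs="1">
        <ows:Identifier>width</ows:Identifier>
        <ows:Title>Buffer width</ows:Title>
        <LiteralData/>
      </Input>
    </DataInputs>
  </ProcessDescription>
</wps:ProcessDescriptions>"""


def form_fields(describe_process_xml):
    """List one form field per process input."""
    root = ET.fromstring(describe_process_xml)
    fields = []
    for inp in root.iter("Input"):
        if inp.find("LiteralData") is not None:
            kind = "literal"   # e.g. rendered as a text box
        elif inp.find("ComplexData") is not None:
            kind = "complex"   # e.g. rendered as a layer chooser
        else:
            kind = "other"
        fields.append({
            "name": inp.find(OWS + "Identifier").text,
            "label": inp.find(OWS + "Title").text,
            "kind": kind,
            "required": int(inp.get("minOccurs", "1")) > 0,
        })
    return fields


if __name__ == "__main__":
    for field in form_fields(SAMPLE):
        print(field)
```

A real client would additionally read the supported formats and allowed values of each input in order to validate what the user enters.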
Performance and Scalability of Distributed Geoprocessing
Sufficient performance and scalability of distributed geoprocessing in SDIs are key requirements according to, for instance, the guidelines of the INSPIRE Network Drafting Team. Both aspects can be addressed using Cloud Computing, one of the latest trends in the mainstream IT world. It provides data storage and application hosting on distributed third-party facilities in an on-demand way.
From a provider perspective, Cloud Computing enables companies to increase their hardware utilization rate significantly and allows external customers to use the company’s infrastructure on pay-per-use revenue models. From a client perspective, it enables the on-demand allocation of sufficient resources to solve complex computational problems or to scale all kinds of applications. Therefore, Cloud Computing is a valid approach to ensure performance and scalability for computing-intensive Geoprocessing applications (FIGURE 3).
Performance is addressed by dividing the processing task into smaller sub-tasks and sending these to the Cloud, which manages virtualized machines for serving the specific request. The 52°North WPS, for instance, is able to process huge datasets with certain kinds of distributable algorithms using a large number of virtualized machines at the same time.
The WPS takes the input data of a WPS request, splits it into smaller sub-problems and sends these to the Cloud for parallel execution. Once the resources in the Cloud have finished their computations, the outputs of all sub-problems are combined into a single output. Splitting process tasks and merging their results is not related to Cloud Computing per se, but Cloud Computing can be used to process the sub-problems on a group of virtualized machines. MapReduce describes this divide-and-conquer approach.
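The split, process and merge pattern can be sketched in a few lines. This is not the 52°North implementation: a local thread pool stands in for the virtualized machines in the Cloud, and the bounding box of a point set serves as an assumed example of a distributable algorithm whose partial results are easy to merge.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, n_chunks):
    """Split a list into at most n_chunks contiguous, non-empty parts."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(points):
    """Sub-task: bounding box (min_x, min_y, max_x, max_y) of one chunk."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def merge(boxes):
    """Combine the partial bounding boxes into a single result."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def bounding_box(points, workers=4):
    """Split the input, process the chunks in parallel, merge the outputs."""
    if not points:
        raise ValueError("points must not be empty")
    chunks = split(points, workers)
    # The thread pool is a local stand-in for remote worker machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    return merge(partial_results)


if __name__ == "__main__":
    sample = [(0, 0), (5, 2), (-1, 7), (3, -4), (2, 2)]
    print(bounding_box(sample))
```

Only algorithms whose partial results can be merged without loss fit this pattern; operations that depend on neighbouring features across chunk borders need a more careful split.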
Scalability is addressed by automatically starting and stopping virtualized machines in the Cloud. In the case of WPS, new virtualized instances of the WPS can be started in the Cloud depending on the number of incoming requests. A classic load balancer then distributes the incoming requests across the growing number of WPS instances. The 52°North WPS, for instance, was deployed as a proof of concept on the Google (Google AppEngine) and Amazon (Amazon Web Services) cloud infrastructures. The results of the scalability test in FIGURE 4 show that the cloud-enabled WPS scales better (response times are almost constant) than a non-cloud approach (i.e. WPS hosted on a single machine).
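The interplay of scaling and load balancing can be sketched as follows. The scaling rule (one instance per fixed number of pending requests, within fixed limits) and the instance names are assumptions for illustration; they do not describe the policy used in the proof-of-concept deployments.

```python
import itertools
import math


def instances_needed(pending_requests, per_instance_capacity,
                     min_instances=1, max_instances=20):
    """Number of WPS instances to run for the current load.

    Assumed rule: one instance per `per_instance_capacity` pending
    requests, clamped to the range [min_instances, max_instances].
    """
    wanted = math.ceil(pending_requests / per_instance_capacity)
    return max(min_instances, min(max_instances, wanted))


def round_robin(requests, instances):
    """Assign each request to the next instance in turn."""
    assignment = {name: [] for name in instances}
    for request, name in zip(requests, itertools.cycle(instances)):
        assignment[name].append(request)
    return assignment


if __name__ == "__main__":
    pending = 23
    count = instances_needed(pending, per_instance_capacity=5)
    names = [f"wps-{i + 1}" for i in range(count)]  # hypothetical names
    for name, assigned in round_robin(range(pending), names).items():
        print(name, len(assigned))
```

With such a rule the load per instance stays roughly constant as the number of requests grows, which is consistent with near-constant response times.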
Conclusion
Geoprocessing in SDIs is an important aspect of realizing web-based geoinformation. The OGC WPS interface specification allows users to generate geoinformation based on distributed data. Client applications such as OpenLayers or uDig allow users to seamlessly integrate geoprocesses (based on the WPS interface specification) and other distributed resources. To meet specific requirements regarding performance and scalability of distributed geoprocessing in SDIs, Cloud Computing has been identified as a suitable approach. Future applications need to demonstrate the benefit of the available tools. Additionally, further standardization of geoprocessing approaches in OGC and for Cloud Computing has to continue in order to ease the integration of geoprocessing in SDIs.
References
52°North Open Source initiative: www.52north.org
52°North Geoprocessing Community: www.52north.org/wps
Sensor web and Web-based geoprocessing and Simulation lab (SWSL): swsl.uni-muenster.de
OGC Web Processing Service interface specification: www.opengeospatial.org/standards/wps
uDig website: udig.refractions.net
OpenLayers: www.openlayers.org
Google AppEngine: code.google.com/appengine/
Amazon Web Services: aws.amazon.com/
INSPIRE: inspire.jrc.ec.europa.eu/
Point of Contact: Dr. Theodor Foerster, [email protected], Institute for Geoinformatics, University of Muenster.