Towards open and interoperable hydrological library

From Geoinformatics FCE CTU

Geoinformatics FCE CTU 2011

Robert Szczepanek
Division of Hydrology, Cracow University of Technology
Poland
Email: robert szczepanek.pl

Keywords: hydrological framework, Python, OpenMI, Water Framework Directive

Abstract

Hydrologists need a simple yet powerful, GIS-based, open source framework for their models. Such a framework should ensure long-term interoperability of models and modules, and high scalability. This can be achieved by implementing existing, tested and well-known standards. At the moment there are two interesting options: the Object Modeling System (OMS) and the Open Modelling Interface (OpenMI). OpenMI was developed within the Fifth European Framework Programme in response to the need for an integrated watershed management environment, as described in the Water Framework Directive. OpenMI 2.0 interfaces are available for the C# and Java programming languages. The OpenMI Association is now in the process of reaching an agreement with the OGC, so the spatial standards present in OpenMI 2.0 should be better implemented in the future. The OMS project is a pure Java, object-oriented modelling framework coordinated by the U.S. Department of Agriculture. The big advantage of OMS over OpenMI is its simplicity of implementation. On the other hand, OpenMI seems to be more powerful and better suited to hydrological models. Finally, OpenMI was selected as the interface standard for the proposed hydrological library.

Existing hydrological libraries (models) usually focus on just one GIS package (HydroFOSS on GRASS) or one operating system (HydroDesktop on Microsoft Windows). A new hydrological library should break those limitations. To make the implementation of hydrological models as easy as possible, the library should be based on a simple, high-level computer language. Low- and mid-level languages, like Java (SEXTANTE) or C (GRASS, SAGA), were excluded as too complicated for a regular hydrologist. Among popular high-level languages, Python seems to be a good choice. The leading GIS desktop applications, GRASS and QGIS, use Python as a second native language and provide a well documented API. This way, a Python-based hydrological library could be easily integrated with any GIS package supporting this language. As the OpenMI 2.0 standard supports interfaces only for Java and C#, there was a need to write a Python version of the OpenMI interface. The Python interface to the OpenMI standard presented in this paper is the first step towards an open and interoperable hydrological library. GIS-related issues of the latest OpenMI 2.0 standard are also outlined and discussed.

Introduction

Mathematical modelling in the hydrological sciences has made use of geospatial functions since the early 1990s [7]. Recent developments in remote sensing and automatic data acquisition technologies have led to an increase in available data. Analysis and processing of those data in distributed models is very difficult without access to geospatial systems. The HydroGIS conferences held in 1993 and 1996 in Vienna showed that there is great interest in the practical application of geographical information systems (GIS) in hydrology. At Cracow University of Technology (Poland), in the late 1980s we developed the GIS-based, distributed hydrological model WISTOO [15]. The model was successfully implemented in several Polish and a few European catchments. Unfortunately, due to its closed license and monolithic structure in the C language, the model was hard to maintain and develop. We faced the problem of how to build a new, modern set of tools for hydrological modelling.

There are two general approaches to linking hydrology and GIS. The first is based on developing hydrological functions within a GIS environment. Such models are easier to integrate and provide better interoperability. Examples of this approach can be found in GRASS and SAGA, both module-based systems. The alternative approach is a hydrological application with GIS capabilities. In such a case, building the geospatial capabilities of the system from scratch is very hard and time consuming. That is why it is much easier to build a hydrological system on top of existing applications (like QGIS) or libraries (like GDAL) than to add geospatial functionality to a hydrological system. In that sense, proper selection of the GIS foundation is a crucial step. Fortunately, there is a great diversity of GIS projects in the world of Free and Open Source Software for Geomatics (FOSS4G). On the other hand, a hydrological library should not be coupled with just one geospatial system.

The main problem in research studies is how to provide a hydrologist-friendly programming environment in which modules, once developed, can be reused and easily modified, not only by the author but also by other scientists. Existing solutions rely on advanced computer languages that are too complicated for a regular hydrologist. Instead of fighting with the code, one should focus on the problem to be solved. To reach this goal, an easy-to-use framework based on a high-level computer language is needed. It should come with good documentation, tutorials and case studies.

There are many great sources of code for such a base GIS library: GRASS, SAGA, QGIS, Thuban, Whitebox GAT, HidroSIG, SEXTANTE. There are ongoing projects to integrate some of these systems. Applications are already able to use the SEXTANTE library (see uDig, gvSIG, OpenJUMP) or GRASS modules (see QGIS). An interesting project, the QGIS Processing Framework, was started to build an environment for executing modules from external projects (SAGA, GRASS, OTB, OSSIM) within the QGIS application.

Figure 1: Hydrological library scheme

The main problem, however, is the lack of interoperability, together with redundancy. Redundancy of functions and modules can cause inconsistency, as implementation methods can differ between applications. It would probably be easier for the hydrological community to develop and maintain just one implementation. In fact, many elements of this puzzle already exist, but they are hard to merge. To make a hydrological researcher's life easier, what is needed is not another hydrological application but a library of interoperable functions for basic hydrological processes (fig.1). Taking pieces from such a library, any application can be built on top, and several alternative hypotheses or approaches can be tested easily. It should also be possible to use other, external models compatible with the selected interface, or even to use the hydrological library as a Web Processing Service (WPS) backend.

Modelling frameworks

In 2000, the European Union approved the European Water Framework Directive (WFD 2000/60/EC). The WFD became an important document for integrated river basin management and interdisciplinary studies, and not only on our continent. Many countries faced the problem of weak interoperability between existing information systems coming from different environmental domains. Additionally, transboundary issues and international cooperation became an important element of WFD reporting. The huge diversity of standards in Europe is a fact, so simple standardisation was not an acceptable solution. One potential option was to prepare a common, versatile interface on top of existing systems. Based on experience from previous projects, two groups initiated development of completely different modelling frameworks: the Object Modeling System (OMS) [3] and the Open Modelling Interface (OpenMI) [14]. The assumptions behind the two systems differed, so two different solutions were eventually elaborated. The first (OMS) is very simple and pragmatic, while the second (OpenMI) is more sophisticated and strictly oriented to WFD needs. The simpler framework allows only linear workflows of modules and models, while the second enables feedbacks, looping and much more.

OMS

The OMS framework has been developed jointly by the U.S. Geological Survey (USGS), the U.S. Department of Agriculture (USDA) and the Friedrich Schiller University of Jena, Germany [3]. The prototype version, OMS 1.0, was published in 2001, and the latest version, OMS 3.0, called the Next Generation Modeling Framework (NGMF), was released recently.

The Object Modeling System is a pure Java, lightweight, object-oriented modelling framework working in the NetBeans environment. Although developed in Java, OMS 3.0 provides interoperability with Fortran, C and C++. OMS 3.0 supports, among other things, geospatial integration, calibration tools and sensitivity analysis. In the newest version a minimally invasive approach was applied, and as a result no framework data types and no interfaces are provided. OMS 3.0 is multithreaded. Components are Plain Java Objects enriched with descriptive metadata by means of language annotations [4]. Annotations are used to specify resources in a class that relate to its use as a component. They allow Java programs to be extended with meta information that can be picked up from sources, classes, or at runtime.

OMS component metadata implement just three methods: Initialize, Execute and Finalize. Components are designed with a standard, well defined interface in mind. They are self-contained units from both the conceptual and the technical perspective. Components can be developed and tested individually [11].
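The three-method life cycle can be pictured with a small Python analogy. OMS itself is Java and marks these methods with annotations, so this is not OMS code: the class, its attribute names and the runoff-coefficient formula are hypothetical and only illustrate the Initialize, Execute, Finalize sequence.

```python
class RunoffCoefficientComponent:
    """Hypothetical component: runoff = coefficient * rainfall."""

    def __init__(self, coefficient):
        self.coefficient = coefficient
        self.rainfall = []   # input series [mm], set by the caller
        self.runoff = []     # output series [mm], filled by execute()
        self.initialized = False

    def initialize(self):
        # Validate parameters once, before any computation.
        if not 0.0 <= self.coefficient <= 1.0:
            raise ValueError("coefficient must lie between 0 and 1")
        self.initialized = True

    def execute(self):
        # The computational core of the component.
        if not self.initialized:
            raise RuntimeError("initialize() must be called first")
        self.runoff = [self.coefficient * p for p in self.rainfall]

    def finalize(self):
        # Release resources; here only the state flag is reset.
        self.initialized = False


component = RunoffCoefficientComponent(coefficient=0.5)
component.rainfall = [10.0, 0.0, 4.0]
component.initialize()
component.execute()
component.finalize()
print(component.runoff)
```

Because the component holds its own inputs, outputs and state, it can be tested in isolation, which is the property the text attributes to OMS components.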

Several environmental models have been implemented using OMS 3.0. One of them is the Precipitation Runoff Modeling System (PRMS/OMS). Interesting from the hydrological point of view is the jgrasstools project [6]. Based on the JGrass application, it is a fast growing and very well documented library of basic hydrological and geomorphological algorithms.


OpenMI

OpenMI was developed in 2001 within the Fifth European Framework Programme as part of the HarmonIT project, and later the OpenMI-LIFE project. Partners of the project were: Natural Environment Research Council (UK), DHI (DK), Deltares (NL), Wallingford Software (UK), National Technical University of Athens (EL), University of Thessaly (EL), Aquafin (BE), VMM – AK (BE), Flanders Hydraulics Research (BE) and Université de Liège (BE). From the beginning, OpenMI has been released under the Lesser General Public Licence (LGPL). It is an interface definition for the computational core of computer models in the water domain [13]. Model components that comply with this standard can, without any programming, exchange data at run-time [5]. The idea was to make it easy to combine programs from different providers, giving the modeller a free choice of the model best suited to particular needs.

There are two official OpenMI versions at the moment (1.4 and 2.0), but all of the current implementations are .NET based. Well-known OpenMI 1.4 compliant software includes InfoWorks, ISIS, SWAT, Sobek and MIKE.

The OpenMI standard is defined by a set of software interfaces that a compliant model or component must implement. These interfaces are available in both C# and Java. Version 2.0, released in November 2010, specifies the base interfaces (which define a component representing a model) and an extension supporting time- and space-dependent components (OpenMI.Standard2.TimeSpace) [14]. Future extensions could support, for example, Open Geospatial Consortium (OGC) standards.

To support the development of OpenMI compliant components, a Software Development Kit (SDK) has been provided for .NET developers [14]:

  • Backbone – a default implementation for the majority of the OpenMI.Standard2 interfaces,
  • DevelopmentSupport – some very generic support utilities,
  • Buffer – utilities for timestep buffering and time interpolation and extrapolation,
  • Spatial – utilities for spatial interpolation,
  • ModelWrapper – utilities to facilitate wrapping existing models.

OpenMI is a pull-based architecture consisting of linked components (source components and target components) that exchange memory-based data in a single-threaded architecture. OpenMI is not based on a framework; it only has linkable components. So in fact, neither of the presented frameworks is a framework sensu stricto.
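The pull-based idea can be illustrated with a minimal Python sketch. It is not the OpenMI API: the class names, the `get_values` method and the linear reservoir formula are assumptions chosen for illustration. The point is that the target component computes nothing until it is asked for a value, and then pulls the input it needs from its linked source, all in a single thread.

```python
class RainfallSource:
    """Hypothetical source component holding a rainfall series [mm]."""

    def __init__(self, series):
        self._series = list(series)

    def get_values(self, step):
        return self._series[step]


class LinearReservoir:
    """Hypothetical target component that pulls its inflow on demand."""

    def __init__(self, provider, k):
        self._provider = provider   # linked source component
        self._k = k                 # fraction of storage released per step
        self._storage = 0.0
        self._outflows = []         # one value per computed time step

    def get_values(self, step):
        # Advance only as far as the request requires, pulling input
        # from the linked provider for each missing time step.
        while len(self._outflows) <= step:
            inflow = self._provider.get_values(len(self._outflows))
            self._storage += inflow
            outflow = self._k * self._storage
            self._storage -= outflow
            self._outflows.append(outflow)
        return self._outflows[step]


rain = RainfallSource([10.0, 0.0, 0.0])
reservoir = LinearReservoir(rain, k=0.5)
print(reservoir.get_values(1))
```

A request for step 1 makes the reservoir compute steps 0 and 1 in order, so the chain of components is driven entirely by the consumer at the end of the chain.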

Use of the OpenMI namespace is mandatory for any OpenMI-compliant software component. The standard documentation [14][12] describes a list of interfaces that is as minimal and as complete as possible, in order to define exactly the data being exchanged. Every compliant component must have an associated registration file in XML format and implement the IBaseLinkableComponent interface.

Exchange items are used in the initial phase to define what a component can provide (as output) and what information it accepts (as input). In OpenMI 2.0, values can also be transformed as needed with the help of adapted outputs. OpenMI does not use standardized data dictionaries, except for SI units.

The life of a component is divided into five phases: initialization, configuration, preparation, execution and completion. Every phase is well described in the documentation.
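The five phases can be written down as a simple ordered state. The sketch below is hypothetical and not part of the OpenMI standard: the `Phase` enumeration only mirrors the phase names given above, and the rule that phases advance one at a time, with only execution allowed to repeat, is an assumption made for this illustration.

```python
from enum import Enum


class Phase(Enum):
    INITIALIZATION = 1
    CONFIGURATION = 2
    PREPARATION = 3
    EXECUTION = 4
    COMPLETION = 5


class LifecycleTracker:
    """Hypothetical helper that enforces the order of the five phases."""

    def __init__(self):
        self.phase = None  # no phase entered yet

    def enter(self, phase):
        current = 0 if self.phase is None else self.phase.value
        repeated_execution = (
            phase is Phase.EXECUTION and self.phase is Phase.EXECUTION
        )
        # Assumption: phases advance one at a time; only execution repeats.
        if phase.value != current + 1 and not repeated_execution:
            raise RuntimeError("cannot enter %s now" % phase.name)
        self.phase = phase


tracker = LifecycleTracker()
for phase in Phase:
    tracker.enter(phase)
print(tracker.phase.name)
```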

Figure 2: Interfaces related to spatial data in OpenMI 2.0

In August 2011, the OpenMI Association and the Open Geospatial Consortium (OGC) signed a memorandum of understanding to cooperate in standards development and the promotion of open standards related to computer modelling. At the moment, spatial operations are based on vector objects and no direct raster support is available (fig.2). Spatial elements are represented as points, lines, polylines, polygons or polyhedra in element sets, which contain information about the georeference system in the form of WKT [8]. It is possible to link 1D and 2D models or components and exchange data between them at run time. To support more advanced spatial operations, the authors of the standard prepared an SDK (Spatial extension) and sample code files.
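An element set of this kind can be sketched as a small Python data structure. The names below (`ElementType`, `ElementSet`, `add_element`, `MIN_VERTICES`) and the minimum vertex counts are assumptions made for illustration and do not reproduce the OpenMI interfaces; the coordinates are example values only.

```python
from dataclasses import dataclass, field
from enum import Enum


class ElementType(Enum):
    POINT = "point"
    LINE = "line"
    POLYLINE = "polyline"
    POLYGON = "polygon"
    POLYHEDRON = "polyhedron"


# Assumed minimum number of vertices needed to define each element type.
MIN_VERTICES = {
    ElementType.POINT: 1,
    ElementType.LINE: 2,
    ElementType.POLYLINE: 2,
    ElementType.POLYGON: 3,
    ElementType.POLYHEDRON: 4,
}

WGS84_WKT = (
    'GEOGCS["WGS 84",DATUM["WGS_1984",'
    'SPHEROID["WGS 84",6378137,298.257223563]],'
    'PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433]]'
)


@dataclass
class ElementSet:
    """Hypothetical vector element set with a WKT spatial reference."""

    element_type: ElementType
    spatial_reference_wkt: str
    elements: list = field(default_factory=list)

    def add_element(self, vertices):
        vertices = [tuple(v) for v in vertices]
        if len(vertices) < MIN_VERTICES[self.element_type]:
            raise ValueError("too few vertices for this element type")
        self.elements.append(vertices)


gauges = ElementSet(ElementType.POINT, WGS84_WKT)
gauges.add_element([(19.94, 50.06)])  # example longitude, latitude
print(len(gauges.elements))
```

Keeping the spatial reference on the set rather than on each element reflects the description above, in which the element set carries the georeference information.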

Open source hydrological library

There are many valuable hydrological libraries and modules that could be included in a hydrological library. The first practical limitation is platform dependency. Many existing hydrological systems focus on one environment only. HydroDesktop, developed by the Consortium of Universities for the Advancement of Hydrologic Sciences (CUAHSI), is a good example. It uses C#/.NET, which in practice limits its usage to the Windows operating system. That is why cross-platform languages like Java or Python are probably a better choice.

Many GIS applications have built-in functions used for hydrological purposes. There are also external spatial data analysis libraries like SEXTANTE [9][18], based on SAGA algorithms. This Java library contains more than three hundred algorithms for both raster and vector processing, and has bindings to GIS applications like gvSIG, uDig, OpenJUMP, Kosmo and ArcGIS [10]. There are also ongoing implementations of SEXTANTE as a backend for GeoServer and the 52North WPS Server [10]. The library includes 10 algorithms for basic hydrological analysis, 15 for indices and other hydrological parameters, and 11 algorithms for geomorphometry and terrain analysis. Some packages (e.g. GRASS and SEXTANTE) have graphical modellers for more user-friendly workflow design and implementation. For less experienced users, these replace the previously used scripting languages. The graphical modellers mentioned can model only one-directional, linear workflows.

A very interesting group of algorithms is formed by complete open source hydrological models. They are either stand-alone applications or are created within one of the popular GIS platforms. The most popular and best known hydrological models are:

  • HydroFOSS [2], GIPE, JGrass, TOPMODEL [1], ANSWERS [16] (built on GRASS project),
  • IHACRES (SAGA),
  • PIHM (QGIS),
  • HydroDesktop,
  • HidroSIG,
  • HydroPlatform (Thuban),
  • Kalypso.

The Kalypso model has recently become one of the most popular. Complete hydrological models are a minority among the available resources. Functions for geomorphological analysis based on Numerical Elevation Models can be found in almost every GIS package. Almost every GIS package can also be used for hydrological pre- or post-processing of spatial data. There are, however, packages containing much more advanced functionality oriented strictly to hydrological analysis:

  • r.watershed, r.drain, r.flow, r.stream.* (GRASS),
  • Terrain Analysis – Hydrology, Channels (SAGA),
  • SEXTANTE,
  • jgrasstools.

With so many options to choose from, the major problem becomes proper selection of a package. Direct comparison is not easy. Open source packages give access to the source code, but it is sometimes hard to analyse implementations of algorithms written in different languages by different programmers. So, are implementations of the same algorithm comparable? In the last two years a synergy effect can be observed among desktop GIS applications and libraries. Several FOSS4G projects cross-reference each other. The leader in this field is probably the QGIS project with its QGIS Processing Framework, which aims to develop a generic framework for integrating external packages. At the moment the QGIS project includes a part of the GRASS modules in its latest releases, SAGA modules are almost ready to work under QGIS, and Orfeo Toolbox and OSSIM modules are planned next.

The advantage of a library over an application is that a library must be interoperable to survive, so its implementation should be simpler. A well designed library should be interoperable not only in terms of the interface (hardcoding vs. OpenMI), but also in terms of the access method (direct access vs. PyWPS).


General assumptions

When building an open source hydrological library, the following goals should be reached:

  • free and open source license,
  • implementation of one of the popular modelling frameworks: easy but powerful, with likely long-term support,
  • reusable, small components,
  • use of already existing code and algorithms,
  • scalable architecture,
  • based on simple, easy to learn, cross-platform computer language,
  • compatible with popular GIS environments (both server and desktop),
  • GUI independent,
  • well documented.

Framework and computer language selection

Both presented frameworks (OpenMI and OMS) are modern, well designed and versatile. OpenMI has strong support from leading water resources firms in Europe, while OMS is more closely related to U.S. public administration. OMS is easier and faster to implement than OpenMI, and this simplicity is its big advantage. OpenMI, on the other hand, has recently paid much attention to the spatial aspects of modelling. OMS is Java based, while OpenMI provides both C# and Java interfaces. One of the arguments in favour of OpenMI is that many of the existing hydrological models listed by the Community Surface Dynamics Modeling System plan to implement this standard. Finally, the OpenMI standard was selected as the framework for the hydrological library under development.

Through the OpenMI interface, it will be possible to access models in the library from different consumers (fig.1). In the first stage, access from desktop applications like QGIS and GRASS will be tested. The second potential consumer is web services like WPS, installed as server applications. Finally, it will be possible to make use of any external OpenMI compliant model (component).

Existing GIS applications with hydrological functions use very different computer languages. The most popular are:

  • C/C++ – GRASS, QGIS, SAGA,
  • Java – uDig, SEXTANTE,
  • C# – HydroDesktop,
  • Python – Thuban.

There are, however, bindings to other languages, so for example QGIS plugins can easily be created in Python. As the hydrological library should be used mainly by non-professional programmers, the most popular but difficult languages, like C and Java, were excluded [17]. The perfect language should be easy to learn and to implement in. Among the world's top 10 languages, only Python is a high-level, cross-platform language with good support from GIS applications, free to use under an open license. According to the Tiobe portal, Python was the “Programming Language of the Year 2010”, with the highest rise in ratings. Two popular desktop GIS applications (QGIS and GRASS) use Python as a second development language, so there are at least two potential desktop GIS consumers. The only problem is that there are no OpenMI interface standards for Python. With OpenMI as the modelling framework and Python as the base language, the first step in implementing the library was the translation of the C# OpenMI specification to Python.

Implementation

Implementation started with an analysis of the OpenMI 2.0 documentation and examples. As in a typical open source project, the source files are publicly available on the internet. The project was named OpenHydrology, and most of its necessary infrastructure is hosted on the SourceForge portal (http://sourceforge.net/projects/openhydrology/).

After that, most of the OpenMI 2.0 standard specification was translated to Python. Although there are substantial differences between the source (C#, Java) and target (Python) languages, no major changes were made, in order to keep compatibility with the source, and all comments were copied from the source files. The first significant difference between the source and target languages relates to type definitions, as Python has dynamic typing. The second relates to the implementation of interfaces, and the last to naming conventions and namespace standards. Adjustments addressing these differences will be made later.
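One possible way to express a C# interface in Python is an abstract base class from the standard `abc` module. The sketch below is an illustration, not the project's actual translation: `IDescribable` with its `Caption` and `Description` properties is recalled here from the OpenMI 2.0 standard in simplified, read-only form, the C#-style property names are kept only to stay close to the source, and the `Catchment` class and its example values are hypothetical.

```python
from abc import ABC, abstractmethod


class IDescribable(ABC):
    """Python counterpart of a C# interface with two string properties."""

    @property
    @abstractmethod
    def Caption(self):
        """Short caption of the entity (a string in the C# original)."""

    @property
    @abstractmethod
    def Description(self):
        """Longer description of the entity."""


class Catchment(IDescribable):
    """Hypothetical class implementing the interface."""

    def __init__(self, caption, description):
        self._caption = caption
        self._description = description

    @property
    def Caption(self):
        return self._caption

    @property
    def Description(self):
        return self._description


class Incomplete(IDescribable):
    """Omits both properties, so it cannot be instantiated."""


catchment = Catchment("Raba", "Example catchment")
print(catchment.Caption)
try:
    Incomplete()
except TypeError as error:
    print("rejected:", error)
```

Dynamic typing means the string type of the properties is documented rather than declared, while the abstract base class restores the guarantee that every implementing class provides all interface members.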

At this first stage only the standard OpenMI 2.0 interface definitions are available, without a GUI or SDK. Sample implementations, documentation, a GUI for easy model access and a workflow modeller will be built in the future.

Discussion

At a time when new projects are launched every day, more and more initiatives tend to integrate efforts to make better use of the available human resources. The GDAL project is a good example of this process. In the pre-GDAL era, every GIS package used its own mechanism for data access. The situation with hydrological models is the same today: there are many redundant models, unable to exchange data with each other.

The hydrological library is based on a mature and tested modelling framework in order to provide good interoperability. OpenMI was my selection, although there are some threats related to this choice. One of them is the complexity of implementing the OpenMI 2.0 standard. Choosing Python as the base language was a relatively easy decision, due to its favourable status in the FOSS4G world. The problem is that in the hydrology domain it is not yet a very popular language.

Translation of the OpenMI 2.0 standard to Python interfaces was the first step needed to start development of the library. There are two potential options for the second step. The long-term goal is development of new and complete OpenMI components in Python (fig.3a). The faster, but rather short-term, option is wrapping existing hydrological C modules (fig.3c), for example from GRASS.
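The wrapping option can be sketched as follows. Everything in this example is hypothetical: `legacy_scs_runoff` is a pure Python stand-in for an existing compiled routine (it uses the well-known SCS curve number formula in millimetres), and `WrappedComponent` with its `set_inputs` and `update` methods is an illustrative shape, not an OpenMI interface. In practice the wrapped callable would be a binding to existing code, which is not shown here.

```python
def legacy_scs_runoff(rainfall_mm, curve_number):
    """Stand-in for an existing routine: SCS curve number runoff [mm]."""
    retention = 25400.0 / curve_number - 254.0
    initial_abstraction = 0.2 * retention
    if rainfall_mm <= initial_abstraction:
        return 0.0
    effective = rainfall_mm - initial_abstraction
    return effective ** 2 / (effective + retention)


class WrappedComponent:
    """Hypothetical wrapper exposing a legacy routine as a component."""

    def __init__(self, routine, **parameters):
        self._routine = routine
        self._parameters = parameters
        self._inputs = []
        self.outputs = []

    def set_inputs(self, values):
        self._inputs = list(values)

    def update(self):
        # Delegate every value to the wrapped routine; the routine
        # itself is not modified.
        self.outputs = [
            self._routine(value, **self._parameters)
            for value in self._inputs
        ]


component = WrappedComponent(legacy_scs_runoff, curve_number=100)
component.set_inputs([0.0, 10.0])
component.update()
print(component.outputs)
```

The wrapper treats the existing routine as a black box, which is why this route is faster to implement, but it also means the model is exposed as one unit rather than as separate elementary processes.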

Figure 3: Alternative locations of OpenMI interfaces

OpenMI interfaces can be made available at any level of the hydrological model. A finer decomposition into elementary physical processes, rather than treating the whole model as one unit, is preferable. This will give scalability and usability, but at the cost of a higher workload. I hope to attract open source developers to the project, as until now the OpenMI Association has concentrated mostly on C# and commercial applications. OpenMI 2.0 is a new standard and, according to information on the project's web page, no compliant software is available yet.

References

  1. Beven, K.J., Lamb, R., Quinn, P., Romanowicz, R., and Freer, J. (1995). TOPMODEL, in Computer Models of Watershed Hydrology, Singh V.P. (Ed.), Water Resources Publications, 627-668.
  2. Cannata, M. (2006). A GIS embedded approach for Free & Open Source Hydrological Modelling, PhD dissertation, Politecnico di Milano.
  3. David, O. and P. Krause (2002). Using the Object Modelling System for future proof of hydrological model development and application. Proceedings of the Second Federal Interagency Hydrologic Modeling Conference, Las Vegas, NV, July 28 - August 2, 621-626.
  4. David, O., Ascough II, J., Leavesley, G., and L. Ahuja (2010). Rethinking modeling framework design: Object Modeling System 3.0. IEMSS 2010 International Congress on Environmental Modeling and Software – Modeling for Environment’s Sake, Fifth Biennial Meeting, July 5-8, 2010, Ottawa, Canada; Swayne, Yang, Voinov, Rizzoli, and Filatova (Eds.)
  5. Gregersen, J.B., Gijsbers P.J.A., Westen S.J.P. (2007). “OpenMI: Open Modelling Interface”. Journal of Hydroinformatics 9(3), 175-191. Available at: http://www.iwaponline.com/jh/009/3/default.htm
  6. jgrasstools project, Available at: http://code.google.com/p/jgrasstools/
  7. Kopp, S.M. (1996). Linking GIS and hydrological models: where we have been, where we are going? Proceedings of HydroGIS 96: Application of Geographic Information Systems in Hydrology and Water Resources. IAHS Publ. no.235. 133-139.
  8. OGC (2002). The OpenGIS Abstract Specification Topic 2: Spatial referencing by Coordinates OGC 01-063r2, OpenGIS Consortium Inc.
  9. Olaya, V., SEXTANTE – Spatial Data Analysis Library, Available at: http://www.sextantegis.com/
  10. Olaya, V., Gimenez, J.C. (2011). SEXTANTE, a versatile open-source library for spatial data analysis.
  11. Object Modeling System (OMS) project, Available at: http://www.javaforge.com/project/oms
  12. The OpenMI Association (2010). OpenMI Standard 2 Reference for the OpenMI (Version 2.0). Part of the OpenMI Document Series
  13. The OpenMI Association (2010). Scope for the OpenMI (version 2.0). Part of the OpenMI report series.
  14. The OpenMI Association (2010) OpenMI Standard 2 Specification for the OpenMI (Version 2.0). Part of the OpenMI Document Series
  15. Ozga-Zielińska M., Gądek W., Książyński K., Nachlik E., Szczepanek R., (2002). Mathematical model of rainfall-runoff transformation - WISTOO, Mathematical Models of Large Watershed Hydrology, Ed. V.P.Singh, D.K.Frevert, Water Resources Publications, 811-860
  16. Rewerts, C.C., and Engel, B.A., (1993). ANSWERS on GRASS: Integration of a watershed simulation with a geographic information system. Abstracts, Proc., 8th Annu. GRASS GIS User’s Conf. and Exhibition, Conf. Agenda and Listing of Abstracts.
  17. Rey S.J., (2008). Show Me the Code: Spatial Analysis and Open Source, unpublished, Available at: http://geodacenter.asu.edu/2008_11
  18. Schröder D., Hildahb M. and David F., (2010), Evaluation of gvSIG and SEXTANTE Tools for Hydrological Analysis. 6th International gvSIG Conference. Available at: http://jornadas.gvsig.org/6as-jornadas-gvsig/descargas/articles