Complexity International      ISSN 1320-0682     
Volume 02 April 1995

Massively Parallel Simulations of Complex, Large Scale, High Resolution Ecosystem Models

Randy Gimblett, George Ball, Vince Lopes
Bernard Zeigler, Bill Sanders and Michael Marefat
School of Renewable Natural Resources
University of Arizona
Tucson, Arizona, USA 85718

Electrical and Computer Engineering Department
University of Arizona
Tucson, Arizona, USA 85718
Email: gimblett@nexus.srnr.arizona.edu

Abstract:

Recent interest in monitoring and predicting landscape and ecosystem changes for large, complex geographic regions has motivated a series of international initiatives stimulating research on simulation environments, knowledge representations, and modelling methods. Much effort has been focussed on acquiring and manipulating spatial data for a variety of purposes, and geographic information systems (GIS) have proved a very promising tool for supporting such environmental simulation. However, GIS alone cannot solve all the problems in simulation modelling; it must be augmented with high performance simulation. This paper reports on research that brings together scientists with a detailed understanding of the problems and opportunities in ecosystem modelling at the scales being considered and computer engineers with significant experience in developing computer platforms for knowledge representation and simulation modelling. The outcome of this research is a set of algorithms to support landscape level simulations on massively parallel processing platforms, implemented in the CM-5 programming system. The resulting environment will be accessible to ecosystem managers and scientists to support decision-making and research.

Introduction

Concern for the global environment has engendered much interest in monitoring and predicting landscape and ecosystem changes for large geographic regions. Modelling and simulation methodology can help to exploit the data gathered by satellite and land-based earth observation programs. High performance computing is clearly needed to handle the large volumes of archived and simulated data. This paper presents ongoing work that brings together GIS, cellular automaton (CA) methodology, and discrete event (DEVS), object-oriented, hierarchical models in an environment that links a high performance CM-5 parallel computer with a GIS database and a 3-dimensional landscape visualiser. The outcome of the research will be a set of algorithms to support landscape level simulations on massively parallel processing platforms. Implemented in the CM-5 programming system, the resulting environment will be accessible to ecosystem managers and scientists to support decision-making and research. Moreover, the same algorithms will be adaptable to other high performance platforms with cellular array-based architectures. The impact of the research will extend more widely than ecosystems, since many large scale systems require the same underlying modelling and simulation methodology for effective study.


The need for GIS

GIS technology provides powerful databases for storing and retrieving spatially referenced data. Spatial information is stored in many different themes representing quantitative, qualitative or logical information. These data can have different resolutions that range from detailed local information to small scale satellite imagery. GIS operators provide the means for manipulating and analysing layers of spatial information and for generating new layers. Since it allows distributed parameterisation, a GIS is useful for ecological models that need to explicitly incorporate the spatial structure and the variability of system behaviour [1]. A raster-based GIS represents spatial information as a grid of cells, each cell corresponding to a uniform parcel of the landscape. Cells are spatially located by row and column and the cell size depends on the resolution required.
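
The raster representation just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class name RasterLayer, its fields, and the overlay method are hypothetical and do not correspond to any particular GIS package. It shows how a cell is located by row and column, how the cell size fixes its map coordinates, and how a GIS operator generates a new layer from two existing ones.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RasterLayer:
    """One thematic layer of a raster GIS (hypothetical structure)."""
    name: str
    cell_size: float           # edge length of a square cell, in map units
    x_origin: float            # map x of the grid's upper-left corner
    y_origin: float            # map y of the grid's upper-left corner
    values: List[List[float]]  # values[row][col]

    def cell_centre(self, row: int, col: int):
        """Map coordinates of the centre of cell (row, col)."""
        x = self.x_origin + (col + 0.5) * self.cell_size
        y = self.y_origin - (row + 0.5) * self.cell_size  # rows run downwards
        return x, y

    def overlay(self, other: "RasterLayer", op) -> "RasterLayer":
        """Generate a new layer by combining two layers cell by cell."""
        combined = [[op(a, b) for a, b in zip(r1, r2)]
                    for r1, r2 in zip(self.values, other.values)]
        return RasterLayer(f"{self.name}+{other.name}", self.cell_size,
                           self.x_origin, self.y_origin, combined)
```

A finer resolution simply means a smaller cell_size and more rows and columns covering the same extent.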

Interest has been expressed in using GIS for simulation of spatial dynamic ecological processes [15]. However, GIS systems do not include procedures for handling time. They are designed to process entire arrays of data and cannot easily address varying localised operations across the spatial grid.

The task of simulation software is to provide the dynamic state projection necessary for understanding complex system behaviour and ultimately providing decision support for ecosystem management. However, state-of-the-art simulation software does not have adequate capability to access the huge volumes of data necessary for realistic landscape representation. Adequate simulation modelling of landscape ecosystems requires the integration of the two technologies, GIS and simulation, into a new single environment with both database and simulation strengths. Our recent research has demonstrated how a simulation engine and a GIS database can be interfaced to provide ecosystem simulations of fire spread and forest succession not possible before [17,18].


The need for discrete event cellular modelling

Application of high performance computing is not simply a matter of rewriting existing landscape simulation code for parallel execution. Landscape ecosystem models employ localised neighbourhood computations, such as diffusion processes that involve movement through space (for example, oil spills, seed dispersal, fire spread, insect infestation). These are usually represented as partial differential equations (PDEs) that have to be discretised in the form of finite differences or finite elements [8]. Such representations assume continuity of space and time and so are too fine-grained to enable feasible simulations of large regions. As an alternative, cellular automata (CA) have also been used for implementing spatial dynamic ecological models in cellular spaces. They can be seen as discrete models of spatio-temporal dynamics obeying local laws. Not only does the CA methodology incorporate discretised PDEs, it also, by employing more qualitative state representations, makes it easier to express the dynamics of interacting discrete units in space. Nevertheless, the coastal landscape model developed by Sklar et al. [15] makes clear the limitations of the CA framework. Its homogeneous cellular structure and synchronous time advancement are too rigid to easily accommodate the diversity of processes that interact at overall ecosystem scale. A generalisation of the CA methodology, namely discrete event object-oriented simulation of hierarchical, modular models, has been shown to offer the flexibility to include the full spectrum of process description levels ranging from individuals to aggregated populations [3]. As suggested earlier by Couclelis [5], the utility of linking the DEVS (Discrete Event System Specification) formalism for cellular spaces with GIS was recently demonstrated by Vasconcelos et al. [17,18].
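
To make the link between discretised PDEs and cellular automata concrete, the following Python sketch applies one explicit finite-difference diffusion step as a synchronous, homogeneous CA rule. It is an illustrative example only: the function name, the coefficient alpha, and the mirror (no-flux) boundary treatment are our assumptions and are not taken from the models cited above.

```python
def diffusion_step(grid, alpha=0.2):
    """One synchronous CA update: explicit finite-difference diffusion.

    grid  : list of rows of floats (e.g. seed or pollutant density per cell)
    alpha : D*dt/dx**2; should not exceed 0.25 for stability in 2-D
    Missing neighbours at the edge are mirrored (no-flux boundary).
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            centre = grid[r][c]
            north = grid[r - 1][c] if r > 0 else centre
            south = grid[r + 1][c] if r < rows - 1 else centre
            west = grid[r][c - 1] if c > 0 else centre
            east = grid[r][c + 1] if c < cols - 1 else centre
            # Same local law for every cell, applied at every time step.
            new[r][c] = centre + alpha * (north + south + east + west - 4 * centre)
    return new
```

Every cell executes the same rule at every step whether or not anything has changed in its neighbourhood, which is the rigidity that the discrete event generalisation is intended to relax.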


The need for massively parallel simulation: CM-5

The Connection Machine data parallel architecture is ideally suited to CA methodology, as has been demonstrated in several simulations at the physical level [6,12]. Compared with other array-based supercomputers suited to CA simulation, the latest version of the Connection Machine, the CM-5, has demonstrated significantly greater performance. The CM-5 with 1K processors has been benchmarked at 60 GFlops/s compared to the 1.6 GFlops/s realised by MasPar's MP-2216 with 16K processors [7]. Moreover, the CM-5 is easily accessible via the Internet at NSF supercomputer centres. However, little application of massively parallel computers to landscape ecosystem models has yet been reported [16]. Given the high performance requirements of landscape simulation, our goal was to demonstrate the applicability of the CM-5 to this important domain.

The CM-5 architecture comprises thousands of processing nodes co-ordinated by the Data and Control networks. However, rather than being restricted to synchronous SIMD control as in the earlier CM-2, processors are also free to engage in asynchronous MIMD-style computations. This freedom is best exploited when it is consistent with data parallel computations such as array element operations, replication and broadcasting of elements, and reduction to scalar values.


The need for interactive scientific visualisation

Although the CM-5 offers program-level visualisation, there is a strong need to construct model level interactive visualisation to interpret simulation results and to support scientific experimentation. For large scale ecosystem simulation, an interactive visualiser is the only reasonable method to view the large data sets, to interpret them from different perspectives, identify special areas of interest, recognise anomalies, and detect errors which may otherwise not be apparent.


Architecture of distributed simulation environment

The proposed architecture employs a simulation engine, realised in the CM-5, to execute the spatial model. The engine interacts directly with the GIS database on a Unix workstation and a visualiser on a Silicon Graphics IRIS workstation, both located at the research site. Rather than continually writing states of the model to the GIS, the engine writes partial states representing significant changes, conforming to the output requirements specified in the experimental frame [21]. Likewise, the GIS will only infrequently download spatially referenced parametric data determining model behaviour, as determined by the experimental frame input generation specification. This reduced interaction between engine and GIS makes it feasible to have the engine and GIS physically separated with Internet communication links. This has been sufficient for the initial research proposed here, which is intended to establish the logical connections between a massively parallel simulation engine and the GIS/visualiser. Of course, the existence of the high bandwidth CM5-I/O Subsystem and, eventually, the availability of information highway links between supercomputer centres and remote computers will enable greater interactivity between the engine and the GIS/visualiser components. Indeed, an exciting future possibility is that the visualiser will enable interactive on-line steering of the simulation.
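
The idea of writing only partial states can be illustrated with a small Python sketch. The function name, the dictionary representation of cell states, and the single numeric threshold are assumptions made for illustration; in the architecture described above, the criterion for a "significant" change is set by the experimental frame rather than by a fixed constant.

```python
def significant_changes(last_written, current, threshold):
    """Return {cell: value} for cells whose state has moved by more than
    `threshold` since it was last written to the GIS, and update the record
    of what has been written."""
    updates = {}
    for cell, value in current.items():
        if abs(value - last_written.get(cell, 0.0)) > threshold:
            updates[cell] = value
            last_written[cell] = value
    return updates
```

Only the returned dictionary would need to cross the link between engine and GIS, so the traffic scales with the number of cells that changed rather than with the size of the grid.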


The simulation process

To illustrate the simulation process, we consider a watershed with simplified hydrology. The watershed is modelled as a bounded 3-D grid, with each cell representing a cube whose dimensions are determined by the chosen resolution. A column of cells will typically start with bedrock at the bottom, move up through several layers of soil with differing hydrologic parameters, reaching a surface layer, and continuing on to several cells of air (to avoid unnecessary processor assignments, all air cells may be assigned to a single processor). Rain is represented as inputs to the top boundary layer of cells and will infiltrate downwards and sidewards to the watershed basin. Each cell is represented by a local component model and the cellular heterogeneity requires that each such model have parameters specifying its unique air, soil, or bedrock characteristics. As indicated, this information is stored in the GIS and downloaded into each model initially and at experimental frame-specified points in the simulation run.

The watershed model, as described, basically assumes the form of a cellular space fitting the CM-5 data parallel architecture. In the standard model-to-simulator mapping of this approach, each cell model is assigned to a processor (possibly virtual) with interprocessor communication being efficiently handled by the Data Network in a 3-dimensional nearest-neighbor grid.

In the case of modelling a watershed, the neighbor inputs are excess water amounts representing lateral or vertical flow. The local state transition is an integration step of the differential equation-based hydrologic model (as described next). We note that the MIMD capability of the CM-5 facilitates distinguishing between boundary and interior cells in the above cycle.
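
A deliberately simplified Python sketch of such a synchronous cycle is given below for a single vertical column of cells. It is not the hydrologic model of this paper: the "bucket" rule (a cell passes on only the water above its capacity), the function name, and the restriction to vertical flow are assumptions chosen to keep the example short.

```python
def column_step(storage, capacity, rain):
    """One synchronous cycle for a vertical column of cells (index 0 = surface).

    1. every cell computes its excess water from its current state;
    2. every cell receives its upper neighbour's excess (rain for the top cell);
    3. every cell updates its own storage.
    Returns (new_storage, drainage leaving the bottom cell).
    """
    excess = [max(0.0, s - c) for s, c in zip(storage, capacity)]
    inflow = [rain] + excess[:-1]
    new_storage = [s - e + i for s, e, i in zip(storage, excess, inflow)]
    return new_storage, excess[-1]
```

Note that neighbour values are exchanged on every cycle, even while a cell is still filling and has nothing to pass on.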

The novel contribution to HPCC infrastructure is our investigation of alternative simulation strategies working within the same DEVS cellular framework. These differ in their exploitation of discrete event cellular interactions to obtain both conceptual model understanding and execution speed-up advantages. The watershed model exhibits such event-like behaviour arising from water retention in the cells. Each cell can be viewed as a reservoir which does not transmit water until its capacity is reached. This means that intercell data exchange need only occur at discrete instants rather than at each cycle as above. This research therefore develops both an effective means of representing such continuous transition/discrete interaction behaviour and simulation engines that exploit these representations to achieve greater execution speed.

To sketch the basic idea for DEVS cellular space representation, each cell stores the water influx rates of its neighbours. These are assumed to remain constant until altered by respective neighbour cells at "significant" events in their state behaviour. In a simple reservoir model, these events are: outflow initiation (when a reservoir fills to capacity), and flow cessation (when the reservoir level recedes below capacity). Flux rates are formulated in water distribution rules as part of the overall model and must satisfy constraints such as mass conservation. Notice that interprocessor communication is necessary only in phase 2. To the extent that this phase is encountered relatively infrequently, the interprocess communication is significantly reduced. The CM-5's ability to support the sender-initiated communication of phase 2 is critical for viability of the approach.
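
The following Python sketch illustrates the reservoir idea under stated assumptions. The class ReservoirCell and its method names are hypothetical, the cell has no losses other than overflow, and flow cessation is triggered simply when the net influx drops to zero; the actual water distribution rules of the model are not reproduced here.

```python
import math


class ReservoirCell:
    """Hypothetical discrete-event reservoir cell.

    The cell stores the influx rate last announced by each neighbour and
    assumes it stays constant until that neighbour reports a new event.
    """

    def __init__(self, capacity, level=0.0):
        self.capacity = capacity
        self.level = level
        self.influx = {}         # neighbour id -> water influx rate
        self.overflowing = False

    def net_rate(self):
        return sum(self.influx.values())

    def time_to_next_event(self):
        """Time until outflow initiation or flow cessation."""
        rate = self.net_rate()
        if not self.overflowing and rate > 0:
            return (self.capacity - self.level) / rate
        if self.overflowing and rate <= 0:
            return 0.0           # inflow has stopped: outflow ceases now
        return math.inf          # nothing scheduled until a neighbour changes

    def advance(self, elapsed):
        """Continuous transition: integrate the constant rate over `elapsed`."""
        self.level = min(self.capacity, self.level + self.net_rate() * elapsed)

    def internal_event(self):
        """Discrete interaction: returns the outflow rate to announce."""
        self.overflowing = not self.overflowing
        return self.net_rate() if self.overflowing else 0.0

    def neighbour_event(self, neighbour, rate, elapsed):
        """A neighbour announced a new flux rate `elapsed` after our last update."""
        self.advance(elapsed)
        self.influx[neighbour] = rate
```

For example, a cell with capacity 10 holding 4 units that learns of a constant influx of 2 units per time unit schedules its next event 3 time units ahead, and sends nothing to its neighbours in the meantime.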


DEVS cellular space formulation of hydrologic model

Much effort has been directed at developing physically based distributed hydrologic models at the catchment scale [2]. However, Beven [4] notes that the development of distributed catchment modelling has been a slow, faltering process and that few studies have appeared on the application of catchment scale models to real world problems. To date, no modelling efforts have aimed at providing a spatially distributed visualisation of the processes and interactions controlling water movement at a catchment scale.

This research adapts existing hydrologic process models to handle spatial variation in a spatially referenced DEVS cellular space framework. The distributed catchment-scale hydrologic model comprises two major components: hillslope and channel. The hillslope component is made up of three storage systems: surface storage, soil storage, and ground water storage. Spatial distribution of catchment parameters, rainfall input and hydrologic response will be achieved in a 3-dimensional representation of the catchment. Only the primary components of the land phase of the hydrologic cycle will be modelled: evapotranspiration, infiltration, unsaturated and saturated subsurface flow. Mass balance equations will be combined with equations for the dynamics of the various inputs and outputs of the primary components of the land phase of the hydrologic cycle.


Demonstration study: Riparian ecosystem model family

To demonstrate the massively parallel large scale ecosystem simulation, we are currently developing a graded series of models representing a Riparian ecosystem with increasing complexity, starting with simple hydrologic functions. The series consists of five stages of complexity as follows:

Stage 1: Simple reservoir model with single soil type, simple infiltration, nearest lateral neighbours, no vegetation and single slope topography.
Stage 2: Moderate subsurface flow across the region under light to moderate rainfall.
Stage 3: Moderate to heavy rainfall developing high subsurface flow with non-uniform topography.
Stage 4: Inclusion of simple vegetation response component, evapotranspiration, root effects.
Stage 5: Increase of heterogeneity of system through multiple soil types, realistic topography.


Incremental evolutionary development

This series of models provides an incremental approach to developing and verifying the models, the simulation algorithms and the visualiser. As each stage is reached, results of the previous stage can be relied upon, and progress to the next stage can concentrate on the features being added. The theoretical grounds for discrete event and combined simulation have been well established [14,21]. A version of the architecture already exists [17,18] in which the simulation engine is realised as the DEVS-Scheme modelling and simulation environment. This engine is interfaced to the widely used Geographic Resources Analysis Support System (GRASS), both executing on a Sparc 2. Also, a version of DEVS-Scheme has been implemented in the *lisp language on the CM-2 [19]. This implementation demonstrated that a speed-up of more than two orders of magnitude could be obtained for models with suitable parallelism employing a 1K node CM-2 configuration. These systems not only establish proofs of concept but also facilitate the evolution of the proposed research. The watershed models are currently being developed and tested on the DEVS-GIS platform before being migrated to the new CM-5 environment. In parallel, we will design and implement the latter, based on the earlier DEVS-CM2 environment. For continuity, we will employ *lisp as the basic language for the CM-5 implementation. With such infrastructure available, work can proceed quickly to the essential issues of model representation and algorithm development.


Design of the visualiser

Output presentations of traditional hydrological models are inadequate for the proposed Riparian ecosystem demonstration. For example, a discharge hydrograph describes only the volume of flow at selected outlets and nothing about the contributions of various parts of the subwatershed. In addition, there is no indication of the level of the underlying water table, which directly influences the maintenance of the Riparian ecosystem. In contrast, spatial displays of water and vegetation are the minimum needed to understand the inter-relationships of these spatially referenced quantities. Beyond such capability, we propose to enable observers to view the watershed landscape from various spatial positions and orientations. This will allow domain experts to bring to bear their experience with real Riparian areas to identify special areas of interest and to recognise anomalies. The visualiser displays will play an important role in assessing the trade-off between execution speed-up and accuracy of output as a function of cell size (resolution).

Canonical geometric models of visualised entities (trees, soil, water, etc.) will be combined with landscape maps from the GIS and sent to the Display system, which will add the appearance qualities (illumination, shadow, colour, etc.) needed for realistic image-rendering. Geometry and appearance manipulation tools will take into account the view position, orientation and other parameters determined by the viewer. The graphics workstation (Silicon Graphics IRIS) is used since it implements many advanced rendering techniques and graphic primitives in fast special-purpose hardware. Perhaps the most exciting component of the visualiser is the "significant event detector" (SED). The SED is built to detect significant event changes in the hydrologic cycle. While traditional hydrologic modelling measurements and representations are static and taken at the watershed outlet, the SED can detect event changes that are important enough to be worth investigating dynamically. Windows pop up with varying types of representations, depending on the severity of the event change. The difference lies in the ability to model and evaluate subcomponents of the watershed as the models are running. Variables can be altered and simulations run again. This aids the researcher in studying spatial interactions of the system being modelled and identifies locations where there are anomalies in the data or where the researcher needs to collect more data.
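
One simple way such a detector could grade changes by severity is sketched below in Python. This is an assumption-laden illustration, not the SED itself: the function name detect_events, the snapshot dictionaries, and the threshold table are hypothetical, and the real detector may use quite different criteria.

```python
def detect_events(previous, current, thresholds):
    """Compare two snapshots of a cell variable (e.g. water table level).

    thresholds: list of (severity_label, minimum_change), ordered from most
    to least severe. Returns a list of (cell, change, severity_label) for
    every cell whose change reaches at least the lowest threshold.
    """
    events = []
    for cell, value in current.items():
        change = abs(value - previous.get(cell, value))
        for label, minimum in thresholds:
            if change >= minimum:
                events.append((cell, change, label))
                break            # report only the most severe matching level
    return events
```

A visualiser could then choose the type of pop-up representation from the severity label attached to each event.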


Conclusions

This paper has briefly presented the structure of the system we have been developing to create massively parallel simulations of complex, large scale, high resolution ecosystem models. The system has considerable potential for monitoring and predicting landscape and ecosystem changes for large geographic regions. By coupling high performance simulation and GIS, this research will also have an impact on many other large scale systems, such as advanced traffic management systems, global telecommunications systems, and computer-integrated manufacturing systems, all of which require the same underlying simulation environment for effective study.

In addition, this research characterises new forms of cellular DEVS-GIS models required for landscape simulation and applies concepts in distributed/parallel simulation theory to speed up the execution of such models. An environment linking the high performance CM-5 with a GRASS database and a 3-dimensional landscape visualiser will be accessible to ecosystem managers and scientists to support decision-making and research.

Finally, the outcome of the research will be a set of algorithms to support landscape level simulations on massively parallel processing platforms, implemented in the CM-5 programming system. Moreover, the same algorithms will be adaptable to other high performance platforms with cellular array-based architectures.


Acknowledgements

The authors would like to acknowledge the support of this research by HPCC/GCAC New Technologies Program, National Science Foundation contract # ASC-9318169.


References

1
Band L. E., Peterson D. L., Running S. W., Coughlan J., Lammers R., Dung J. & Nemani R. (1991), "Forest ecosystem processes at the watershed scale: Basis for distributed simulation", Ecological Modelling, 56, pp. 171-196.

2
Bathurst J. C. (1986), "Physically-based distributed modelling of an upland catchment using the Systeme Hydrologique Europeen", J. of Hydrology, 87, pp. 79-102.

3
Baveco J. M. & Lingeman R. (1992), "An object-oriented tool for individual-oriented simulation: Host-parasitoid system application", Ecological Modelling, 61, pp. 267-286.

4
Beven K. J. (1991), "Spatially distributed modeling: Conceptual approach to runoff prediction", in Recent Advances in the Modeling of Hydrologic Systems, P.E. O'Connel & D. S. Bowles (Eds.). NATO ASI Series, Kluwer Academic Publishers, pp. 373-387.

5
Couclelis H. (1985), "Cellular worlds: A framework for modeling micro-macro dynamics", Environment and Planning, A 17, pp. 585-596.

6
Dagum L. (1992), "Three-dimensional direct particle simulation on the connection machine", Journal of Thermophysics and Heat Transfer, 6(4), p. 637.

7
Dongarra J. (1993), "Performance of various computers using standard linear equation software", Technical Report CS-89-85, Univ. of Tennessee.

8
Fahrig L. (1988), "A general model of populations in patchy habitats", Applied Mathematics and Computation, 27(1).

9
Green D. G., Reichelt R. E., van der Laan J. & Macdonald B. W. (1989), "A genetic approach to landscape modelling", Proceedings of the Eighth Biennial Conference, Simulation Society of Australia, Canberra, Australia, pp. 342-347.

10
Knisel W. G. (Ed.) (1980), "CREAMS: A field-scale model for chemicals, runoff, and erosion from agricultural management systems", USDA, Conservation Research Report No. 26, Washington, D. C.

11
McLeod B. & Sanders W. H. (1993), "Performance evaluation N-processor time warp using stochastic activity networks", PMRL Technical Report 93-7, Department of Electrical and Computer Engineering, University of Arizona.

12
Melcuk A., Giles I., Roscoe C. & Gould H. (1991), "Molecular dynamic simulation of liquids on the connection machine", Computers in Physics, 5(2), p. 311.

13
Morris E. M. (1980), "Forecasting flood flows in grassy and forested basins using a deterministic distributed mathematical model", Proceedings of the IAHS Int. Symp. on Hydrological Forecasting, Pub. No. 129, pp. 247-255.

14
Praehofer H. (1993), "Distributed discrete event simulation", Technical Report TR-93, University of Linz.

15
Sklar F. H., Costanza R. & Day J. W. (1985), "Dynamic spatial simulation modeling of coastal wetland habitat succession", Ecological Modelling, 29, pp. 261-281.

16
Smith M. (1991), "Using massively-parallel supercomputers to model stochastic spatial predator-prey systems", Ecological Modelling, 58(1), p. 347.

17
Vasconcelos M. J. (1993), Modeling Spatial Dynamic Ecological Processes with DEVS-Scheme and Geographic Information Systems, Doctoral Thesis, School of Renewable Natural Resources, University of Arizona.

18
Vasconcelos M. J., Zeigler B. P. & Graham L. A. (1993), "Modeling multi-scale spatial ecological processes under the discrete event systems paradigm", Landscape Ecology, in press.

19
Wang Y. & Zeigler B. P. (1993), "DEVS based simulation on a SIMD massively parallel architecture", Journal of Discrete Event Dynamic Systems, in press.

20
Woolhiser D. A. (1973), "Hydrologic and watershed modeling - State-of-the-art", Trans. Am. Soc. of Ag. Engrs., pp. 553-559.

21
Zeigler B. P. (1984), Multifaceted Modeling and Discrete Event Simulation, London and Orlando, FL: Academic Press.

22
Zeigler B. P. (1989), "DEVS representation of dynamical systems: Event-based control for intelligent systems", Proceedings of the IEEE, 77(1), pp. 72-80.


