Foreword

The International Symposium on Spatial Data Handling (SDH) commenced in 1984 in Zurich, Switzerland, organized by the International Geographical Union Commission on Geographical Data Sensing and Processing, which was later succeeded by the Commission on Geographic Information Systems, the Study Group on Geographical Information Science and then the Commission on Geographical Information Science (http://www.hku.hk/cupem/igugisc/). Previous symposia have been held at the following locations:

1st - Zurich, 1984
2nd - Seattle, 1986
3rd - Sydney, 1988
4th - Zurich, 1990
5th - Charleston, 1992
6th - Edinburgh, 1994
7th - Delft, 1996
8th - Vancouver, 1998
9th - Beijing, 2000
10th - Ottawa, 2002

This book is the proceedings of the 11th International Symposium on Spatial Data Handling. The conference was held in Leicester, United Kingdom, on August 23rd to 25th 2004, as a satellite meeting to the Congress of the International Geographical Union in Glasgow. The International Symposium on Spatial Data Handling is a refereed conference. All the papers in this book were submitted as full papers and reviewed by at least two members of the Programme Committee. In all, 83 papers were submitted, and the 50 included here were all rated above average by the reviewers. The papers cover the span of Geographical Information Science topics that have always been the concern of the conference, from uncertainty (error, vagueness, ontology and semantics) to web issues, digital elevation models and urban infrastructure. I would venture to suggest that in the proceedings there is something for everyone who is doing research in the science of geographical information.

The Edinburgh and Delft proceedings were published as post-conference publications by Taylor and Francis in the Advances in GIS series, in two volumes edited by Tom Waugh and Richard Healey and then by Martien Molenaar and Menno-Jan Kraak, and the Ottawa proceedings were published by Springer-Verlag as Advances in Spatial Data Handling, edited by Dianne Richardson and Peter van Oosterom. Many important seminal papers and novel ideas have originated from this conference series. This publication is the second in the Springer-Verlag series, which members of the Commission hope will continue.
Acknowledgement

Any conference takes considerable organization, and this one especially. As the time came for submission of initial papers for review, I was confined to a hospital bed after emergency surgery, and the conference only happened because Jill, my wife and helper, and Beth, my daughter, kept my email free of submissions. I want to thank them, and Kate and Ian too, from the bottom of my heart for being there; the conference and these proceedings are dedicated to them. I should also thank those at the University of Leicester who were involved in putting on the conference: Kate Moore worked on the web site, and Dave Orme and Liz Cox organized the finances and accommodation. Nick Tate, Claire Jarvis and Andy Millington all helped with a variety of tasks. I should also thank the programme committee, who promptly completed the review process. The support and the questioning of Anthony Yeh are particularly appreciated. Finally, I would like to thank the originators of the SDH symposium series, especially Kurt Brassel, who organized the first, and Duane Marble, who edited it, for their vision in setting up a series which has maintained a consistent level of excellence in the reporting of the best scientific results in what we now know as Geographical Information Science.
Peter Fisher 2 June 2004
Table of Contents
Plenary of Submitted Papers

About Invalid, Valid and Clean Polygons ................................................ 1
Peter van Oosterom, Wilko Quak and Theo Tijssen

3D Geographic Visualization: The Marine GIS ....................................... 17
Chris Gold, Michael Chau, Marcin Dzieszko and Rafel Goralski

Local Knowledge Doesn't Grow on Trees: Community-Integrated Geographic Information Systems and Rural Community Self-Definition ................... 29
Gregory Elmes, Michael Dougherty, Hallie Challig, Wilbert Karigomba, Brent McCusker and Daniel Weiner

Web GIS

A Flexible Competitive Neural Network for Eliciting User's Preferences in Web Urban Spaces .................................................................................. 41
Yanwu Yang and Christophe Claramunt

Combining Heterogeneous Spatial Data from Distributed Sources ........ 59
M. Howard Williams and Omar Dreza

Security for GIS N-tier Architecture ....................................................... 71
Michael Govorov, Youry Khmelevsky, Vasiliy Ustimenko, and Alexei Khorev

Progressive Transmission of Vector Data Based on Changes Accumulation Model .................................................................................. 85
Tinghua Ai, Zhilin Li and Yaolin Liu

Elevation modelling

An Efficient Natural Neighbour Interpolation Algorithm for Geoscientific Modelling ........................................................................... 97
Hugo Ledoux and Christopher Gold

Evaluating Methods for Interpolating Continuous Surfaces from Irregular Data: a Case Study ................................................................. 109
M. Hugentobler, R.S. Purves and B. Schneider

Contour Smoothing Based on Weighted Smoothing Splines ................ 125
Leonor Maria Oliveira Malva

Flooding Triangulated Terrain ............................................................... 137
Yuanxin Liu and Jack Snoeyink

Vagueness and Interpolation

Vague Topological Predicates for Crisp Regions through Metric Refinements ........................................................................................... 149
Markus Schneider

Fuzzy Modeling of Sparse Data ............................................................. 163
Angelo Marcello Anile and Salvatore Spinella

Handling Spatial Data Uncertainty Using a Fuzzy Geostatistical Approach for Modelling Methane Emissions at the Island of Java ....... 173
Alfred Stein and Mamta Verma
Temporal

A Visualization Environment for the Space-Time-Cube ...................... 189
Menno-Jan Kraak and Alexandra Koussoulakou

Finding REMO - Detecting Relative Motion Patterns in Geospatial Lifelines .................................................................................................. 201
Patrick Laube, Marc van Kreveld and Stephan Imfeld

Indexing

Spatial Hoarding: A Hoarding Strategy for Location-Dependent Systems .................................................................................................. 217
Karim Zerioh, Omar El Beqqali and Robert Laurini

Distributed Ranking Methods for Geographic Information Retrieval .................................................................................................. 231
Marc van Kreveld, Iris Reinbacher, Avi Arampatzis and Roelof van Zwol

Representing Topological Relationships between Complex Regions by F-Histograms ................................................................................... 245
Lukasz Wawrzyniak, Pascal Matsakis and Dennis Nikitenko

The Po-tree, a Real-time Spatiotemporal Data Indexing Structure ...... 259
Guillaume Noël, Sylvie Servigne and Robert Laurini

Uncertainty

Empirical Study on Location Indeterminacy of Localities ................... 271
Sungsoon Hwang and Jean-Claude Thill
Registration of Remote Sensing Image with Measurement Errors and Error Propagation ........................................................................... 285
Yong Ge, Yee Leung, Jianghong Ma and Jinfeng Wang

Double Vagueness: Effect of Scale on the Modelling of Fuzzy Spatial Objects ....................................................................................... 299
Tao Cheng, Peter Fisher and Zhilin Li

Area, Perimeter and Shape of Fuzzy Geographical Entities ................. 315
Cidália Costa Fonte and Weldon A. Lodwick

Generalisation

Why and How Evaluating Generalised Data? ....................................... 327
Sylvain Bard, Anne Ruas

Road Network Generalization Based on Connection Analysis ............ 343
Qingnian Zhang

Continuous Generalization for Visualization on Small Mobile Devices ................................................................................................... 355
Monika Sester and Claus Brenner

Shape-Aware Line Generalisation With Weighted Effective Area ....... 369
Sheng Zhou and Christopher B. Jones

Spatial Relationships

Introducing a Reasoning System Based on Ternary Projective Relations ................................................................................................. 381
Roland Billen and Eliseo Clementini

A Discrete Model for Topological Relationships between Uncertain Spatial Objects ....................................................................................... 395
Erlend Tøssebro and Mads Nygård

Modeling Topological Properties of a Raster Region for Spatial Optimization ........................................................................................... 407
Takeshi Shirabe

Sandbox Geography – To learn from children the form of spatial concepts ................................................................................................... 421
Florian A. Twaroch and Andrew U. Frank

Urban Infrastructure

Street Centreline Generation with an Approximated Area Voronoi Diagram .................................................................................................. 435
Steven A. Roberts, G. Brent Hall and Barry Boots

Determining Optimal Critical Junctions for Real-time Traffic Monitoring for Transport GIS ................................................................ 447
Yang Yue and Anthony G. O. Yeh

Collaborative Decision Support for Spatial Planning and Asset Management: IIUM Total Spatial Information System ........................ 459
Alias Abdullah, Muhammad Faris Abdullah and Muhammad Nur Azraei Shahbudin

Navigation

Automatic Generation and Application of Landmarks in Navigation Data Sets ............................................................................. 469
Birgit Elias and Claus Brenner

Towards a Classification of Route Selection Criteria for Route Planning Tools ....................................................................................... 481
Hartwig Hochmair

An Algorithm for Icon Labelling on a Real-Time Map ........................ 493
Lars Harrie, Hanna Stigmar, Tommi Koivula and Lassi Lehto

Working with Elevation

Semantically Correct 2.5D GIS Data – the Integration of a DTM and Topographic Vector Data ............................................................... 509
Andreas Koch and Christian Heipke

Generalization of integrated terrain elevation and 2D object models ..................................................................................................... 527
J.E. Stoter, F. Penninga and P.J.M. van Oosterom

An Image Analysis and Photogrammetric Engineering Integrated Shadow Detection Model ....................................................................... 547
Yan Li, Peng Gong and Tadashi Sasagawa

Semantics and Ontologies

Understanding Taxonomies of Ecosystems: a Case Study ................... 559
Alexandre Sorokine and Thomas Bittner

Comparing and Combining Different Expert Relations of How Land Cover Ontologies Relate .............................................................. 573
Alexis Comber, Peter Fisher and Richard Wadsworth

Representing, Manipulating and Reasoning with Geographic Semantics within a Knowledge Framework .......................................... 585
James O'Brien and Mark Gahegan

Data Quality and Metadata

A Framework for Conceptual Modeling of Geographic Data Quality ..................................................................................................... 605
Anders Friis-Christensen, Jesper V. Christensen and Christian S. Jensen
Consistency Assessment Between Multiple Representations of Geographical Databases: a Specification-Based Approach ................... 617
David Sheeren, Sébastien Mustière and Jean-Daniel Zucker

Integrating Structured Descriptions of Processes in Geographical Metadata ................................................................................................. 629
Bénédicte Bucher

Spatial Statistics

Toward Comparing Maps as Spatial Processes .................................... 641
Ferko Csillag and Barry Boots

Integrating computational and visual analysis for the exploration of health statistics .................................................................................. 653
Etien L. Koua and Menno-Jan Kraak

Using Spatially Adaptive Filters to Map Late Stage Colorectal Cancer Incidence in Iowa ....................................................................... 665
Chetan Tiwari and Gerard Rushton
Author Index

Abdullah, Alias Abdullah, Muhammad F. Ai, Tinghua Anile, A. Marcello Arampatzis, Avi Bard, Sylvain Billen, Roland Bittner, Thomas Boots, Barry Brenner, Claus Bucher, Bénédicte Challig, Hallie Chau, Michael Cheng, Tao Christensen, Jesper V. Claramunt, Christophe Clementini, Eliseo Comber, Alexis Csillag, Ferko Dougherty, Michael Dreza, Omar Dzieszko, Marcin El Beqqali, Omar Elias, Birgit Elmes, Gregory Fisher, Peter Fonte, Cidália Costa Frank, Andrew U. Friis-Christensen, Anders Gahegan, Mark Ge, Yong Gold, Christopher Gong, Peng Goralski, Rafel Govorov, Michael Hall, G. Brent Harrie, Lars Heipke, Christian
Hochmair, Hartwig Hugentobler, Marco Hwang, Sungsoon Imfeld, Stephan Jensen, Christian S. Jones, Christopher B. Karigomba, Wilbert Khmelevsky, Youry Khorev, Alexei Koch, Andreas Koivula, Tommi Koua, Etien L. Koussoulakou, Alexandra Kraak, Menno-Jan Laube, Patrick Laurini, Robert Ledoux, Hugo Lehto, Lassi Leung, Yee Li, Zhilin Li, Yan Liu, Yaolin Liu, Yuanxin Lodwick, Weldon A. Ma, Jianghong Malva, Leonor M.O. Matsakis, Pascal McCusker, Brent Mustière, Sébastien Nikitenko, Dennis Noël, Guillaume Nygård, Mads O'Brien, James Penninga, F. Purves, Ross Quak, Wilko Reinbacher, Iris
Roberts, Steven A. Ruas, Anne Rushton, Gerard Sasagawa, Tadashi Schneider, B. Schneider, Markus Servigne, Sylvie Sester, Monika Shahbudin, Muhammad Sheeren, David Shirabe, Takeshi Snoeyink, Jack Sorokine, Alexandre Spinella, Salvatore Stein, Alfred Stigmar, Hanna Stoter, J.E. Thill, Jean-Claude Tijssen, Theo Tiwari, Chetan Tøssebro, Erlend
Twaroch, Florian A. Ustimenko, Vasiliy van Kreveld, Marc van Oosterom, Peter van Zwol, Roelof Verma, Mamta Wadsworth, Richard Wang, Jinfeng Wawrzyniak, Lukasz Weiner, Daniel Williams, M. Howard Yang, Yanwu Yeh, Anthony G. O. Yue, Yang Zerioh, Karim Zhang, Qingnian Zhou, Sheng Zucker, Jean-Daniel
Programme Committee

Chair: Peter Fisher

Dave Abel, Michael Barnsley, Eliseo Clementini, Leila De Floriani, Geoffrey Edwards, Pip Forer, Andrew Frank, Randolph Franklin, Chris Gold, Mike Goodchild, Francis Harvey, Robert Jeansoulin, Chris Jones, Brian Klinkenberg, Menno-Jan Kraak, Robert Laurini, William Mackaness, Andrew Millington, Martien Molenaar, Ferjan Ormeling, Henk Ottens, Donna Peuquet, Anne Ruas, Tapani Sarjakoski, Monika Sester, Marc van Kreveld, Peter van Oosterom, Rob Weibel, Stephan Winter, Mike Worboys, Anthony Yeh
Local Organising Committee

Chair: Peter Fisher

Mark Gillings, Claire Jarvis, Andrew Millington, Kate Moore, David Orme, Sanjay Rana, Kevin Tansley, Nicholas Tate
About Invalid, Valid and Clean Polygons

Peter van Oosterom, Wilko Quak and Theo Tijssen
Delft University of Technology, OTB, section GIS technology, Jaffalaan 9, 2628 BX Delft, The Netherlands
Abstract

Spatial models are often based on polygons, both in 2D and 3D. Many Geo-ICT products support spatial data types such as the polygon, based on the OpenGIS 'Simple Features Specification'. OpenGIS and ISO have agreed to harmonize their specifications and standards. In this paper we discuss the aspects of these standards relevant to polygons and compare several implementations. A quite exhaustive set of test polygons (with holes) has been developed. The test results reveal significant differences between the implementations, which cause interoperability problems. Part of these differences can be explained by different interpretations (definitions) in the OpenGIS and ISO standards, which do not share an identical polygon definition. Another part of these differences is due to typical implementation issues, such as alternative methods for handling tolerances. Based on these experiences we propose an unambiguous definition for polygons, which makes the polygon again the stable foundation it is supposed to be in spatial modelling and analysis. Valid polygons are well defined, but as they may still cause problems during data transfer, the concept of (valid) clean polygons is also defined.
1 Introduction

Within our Geo-Database Management Centre (GDMC), we investigate different Geo-ICT products, such as Geo-DBMSs, GIS packages and 'geo' middleware solutions. During our tests and benchmarks, we noticed subtle but fundamental differences in the way polygons are treated (even in the 2D situation and using only straight lines). The consequences can be quite unpleasant. For example, a different number of objects are selected when
the same query is executed on the same data set in different environments. Another consequence is that data may be lost when transferring it from one system to another, as polygons valid in one environment may not be accepted in the other environment. It all seems so simple: everyone working with geo-information knows what a polygon is, an area bounded by straight-line segments (and possibly having some holes). A dictionary definition of a polygon: a figure (usually a plane, rectilinear figure) having many, i.e. (usually) more than four, angles (and sides) (Oxford 1973). A polygon is the foundational geometric data type of many spatial data models, such as those used for topographic data, cadastral data and soil data, to name just a few. So, why have the main Geo-ICT vendors not been able to implement the same polygons? The answer is that in reality the situation is not as simple as it may seem at first sight. The two main difficulties, which potentially cause differences between the systems, are:

1. Is the outer boundary allowed to interact with itself, and possibly also with the inner boundaries, and if so, under what conditions?
2. The computer is a finite digital machine and therefore coordinates may sometimes differ a little from the (real) mathematical value. Therefore tolerance values (epsilons) are needed when validating polygons.
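A minimal sketch of the second difficulty (our own Python illustration, not taken from the paper): a coordinate value that is exact on paper need not be exactly representable in a finite binary machine, so equality tests on coordinates must be performed against a tolerance rather than exactly (see also Goldberg 1991).

# Illustration only: decimal coordinates drift slightly in binary floating point.
x = 0.1 + 0.2
print(x)                     # 0.30000000000000004, not 0.3
print(x == 0.3)              # False: an exact comparison fails
eps = 1e-9                   # tolerance value (epsilon)
print(abs(x - 0.3) < eps)    # True: a comparison within tolerance succeeds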
Fig. 1. Real world examples: topographic data (left: outer ring touches itself in one point, middle: two inner rings (holes) both touch the outer ring, right: two inner rings touch each other)
The interaction between the outer and possibly the inner boundaries of a single polygon is related to the topological analysis of the situation. This is an abstract issue, without implementation difficulties, such as tolerance values. So, one might expect that the main ‘geometry’ standards of OpenGIS and ISO will provide a clear answer to this. A basic concept is that of a straight-line segment, which is defined by its begin and end point. Polygon input could be specified as an unordered, unstructured set of straightline segments. The following issues have to be addressed before it can be
decided whether the set represents a valid or invalid polygon (also have a look at the figures in section 3):

1. Which line segments, and in which order, are connected to each other?
2. Is there one, or are there more than one, connected set of straight-line segments?
3. Are all connected sets of straight-line segments closed, that is, do they form boundary rings and is every node (vertex) in the ring associated with at least 2 segments?
4. In case a node is used in 4, 6, 8, etc. segments of one ring, is the ring then disconnected into respectively 2, 3, 4, etc. 'separate' rings (both choices may be considered 'valid'; anyhow, the situation can happen in reality irrespective of how this should be modeled, see fig. 1)?
5. Are there any crossing segments (this would not be allowed for a valid polygon)?
6. Is there one ring (the outer ring) which encloses an area that 'contains' all other (inner) rings?
7. Are there no nested inner rings (these would result in disconnected areas)?
8. Are there any touching rings? This is related to question 4, but another situation occurs when one ring touches another ring with one of its nodes in the interior of a straight-line segment.
9. Are the rings, after construction from the line segments, properly oriented, that is, counterclockwise for the outer boundary and clockwise for the inner boundaries? (This defines a normal vector for the area that points upward, a usual convention inherited from the computer graphics world: only areas with a normal vector in the direction of the viewer are visible.) Note that this means that the polygon area will always be on the left-hand side of the polygon boundaries.

Some of the questions may be combined in one test in an actual implementation in order to decide if the polygon is valid. In case a polygon is invalid, it may be completely rejected (with an error message) or it may be 'accepted' (with a warning) by the system, but then operations on such a polygon may often not be guaranteed to work correctly. In this paper it is assumed that during data transfer between different systems enough characters or bytes are used, in the case of respectively ASCII (such as GML of OpenGIS and ISO TC211) or binary data formats, to avoid unwanted change of coordinates (due to rounding/conversion), and that the sending and receiving system have similar capabilities for representing coordinates, e.g. integers (4 bytes) or floating point numbers (4, 8, or 16 bytes) (IEEE 1985). In reality this can also be a non-trivial issue in which errors might be introduced. A worst case scenario would be transferring the same data several times between two systems (without editing) and every time
the coordinates drift further away due to rounding/conversions. By using enough characters (bytes) this should be avoided. In reality, polygons are not specified as a set of unconnected and unordered straight-line segments. One reason for this is that every coordinate would then be specified at least twice, and this would be quite redundant, with all associated problems such as possible errors and increased storage requirements. Therefore, most systems require the user to specify the polygon as a set of ordered and oriented boundary rings (so every coordinate is stated only once). The advantage is also that many of the tasks listed above are already solved through the syntax of the polygon. But the user can still make errors, e.g. switch the outer and inner boundary, or specify rings with erroneous orientation. So, in order to be sure that the polygon is valid, most things have to be checked anyhow.
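One of those checks, ring orientation, can be performed with the classical signed-area (shoelace) formula. The sketch below is our own Python illustration (not code from any of the systems discussed later): a positive signed area means a counterclockwise ring, a negative one means clockwise.

# Illustration only: ring orientation via the signed (shoelace) area.
# Positive -> counterclockwise (expected for outer rings);
# negative -> clockwise (expected for inner rings / holes).

def signed_area(ring):
    # ring: list of (x, y) vertices; the first vertex need not be repeated at the end
    area2 = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        area2 += x1 * y2 - x2 * y1
    return area2 / 2.0

outer = [(0, 0), (10, 0), (10, 10), (0, 10)]   # counterclockwise
hole = [(2, 2), (2, 4), (4, 4), (4, 2)]        # clockwise
print(signed_area(outer) > 0)   # True: valid orientation for an outer ring
print(signed_area(hole) < 0)    # True: valid orientation for an inner ring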
2 Polygon definitions

In this section we review a number of polygon definitions. First, we have a look at the definition of a (simple) polygon as used within computational geometry. Then the ISO and the OpenGIS polygon definitions are discussed. Then we present our definition, which tries to fill the blank spots of the mentioned definitions and, in case of inconsistencies between the standards, makes a decision based on a well defined (and straightforward) set of rules. Finally, the concept of clean (and robust) polygons is introduced.

2.1 Computational geometry

From the computational geometry textbook of Preparata and Shamos (1985, p. 18): 'a polygon is defined by a finite set of segments such that every segment extreme is shared by exactly two edges and no subset of edges has the same property.' This excludes situations with dangling segments, but also excludes two disjoint regions (which could be called a multi-polygon), a polygon with a hole, or a polygon in which the boundary touches itself in one point (an extreme shared by 4, 6, 8, ... edges). However, it does not exclude a self-intersecting polygon, that is, two edges which intersect. Therefore the following definition is also given: 'A polygon is simple if there is no pair of nonconsecutive edges sharing a point. A simple polygon partitions the plane into two disjoint regions, the interior (bounded) and the exterior (unbounded).' Besides self-intersecting polygons, this also disallows polygons with (partially) overlapping edges. Finally, the following
interesting remark is made: 'in common parlance, the term polygon is frequently used to denote the union of the boundary and the interior.' This is certainly true in the GIS context, which implies that actually the simple polygon definition is intended, as otherwise the interior would not be defined. One drawback of this definition is that it disallows polygons with holes, which are quite frequent in the GIS context.

2.2 ISO definition

The ISO standard 19107 'Geographic information — Spatial schema' (ISO 2003) has the following polygon definition: 'A GM_Polygon is a surface patch that is defined by a set of boundary curves (most likely GM_CurveSegments) and an underlying surface to which these curves adhere. The default is that the curves are coplanar and the polygon uses planar interpolation in its interior.' It then continues by describing the two important attributes, the exterior and the interior: 'The attribute "exterior" describes the "largest boundary" of the surface patch. The GM_GenericCurves that constitute the exterior and interior boundaries of this GM_Polygon shall be oriented in a manner consistent with the upNormal of the this.spanningSurface.' and 'The attribute "interior" describes all but the exterior boundary of the surface patch.' Note that in this context the words exterior and interior refer to the rings defining respectively the outer and inner boundaries of a polygon (and not to the exterior area and interior area of the polygon with holes). It is a bit dangerous to quote from the ISO standard without the full context (and therefore the exact meaning of primitives such as GM_CurveSegments and GM_GenericCurves).

The GM_Polygon is a specialization of the more generic GM_SurfacePatch, which has the following ISO definition: 'GM_SurfacePatch defines a homogeneous portion of a GM_Surface. The multiplicity of the association "Segmentation" specifies that each GM_SurfacePatch shall be in one and only one GM_Surface.' The ISO definition for the GM_Surface is: 'GM_Surface, a subclass of GM_Primitive, is the basis for 2-dimensional geometry. Unorientable surfaces such as the Möbius band are not allowed. The orientation of a surface chooses an "up" direction through the choice of the upward normal, which, if the surface is not a cycle, is the side of the surface from which the exterior boundary appears counterclockwise. Reversal of the surface orientation reverses the curve orientation of each boundary component, and interchanges the conceptual "up" and "down" direction of the surface. If the surface is the boundary of a solid, the "up" direction is outward. For closed surfaces, which have no boundary, the up direction is
that of the surface patches, which must be consistent with one another. Its included GM_SurfacePatches describe the interior structure of a GM_Surface.' So, this is not the simple definition of a polygon one might expect. Further, it is not directly obvious whether the outer boundary is allowed to touch itself or whether it is allowed to touch the inner boundaries and, if so, under what conditions this would be allowed. One thing is very clear: there is just one outer boundary and there can be zero or more inner boundaries. This means that a 'polygon' with two outer boundaries, defining potentially disconnected areas, is certainly invalid. Also, the ISO standard is very explicit about the orientation of the outer and inner boundaries (in 2D, looking from above: counterclockwise and clockwise for respectively the outer and inner boundaries).

2.3 OpenGIS definition

The ISO definition of a polygon is at the abstract (mathematical) level and part of the whole complex of related geometry definitions. The definition has to be translated to the implementation level, and this is what is done by the OpenGIS Simple Feature Specification (SFS) for SQL (OGC 1999). The OpenGIS definition is based on the ISO definition, so it can be expected that there will (hopefully) be some resemblance: 'A Polygon is a planar Surface, defined by 1 exterior boundary and 0 or more interior boundaries. Each interior boundary defines a hole in the Polygon. The assertions for polygons (the rules that define valid polygons) are:

1. Polygons are topologically closed.
2. The boundary of a Polygon consists of a set of LinearRings that make up its exterior and interior boundaries.
3. No two rings in the boundary cross; the rings in the boundary of a Polygon may intersect at a Point but only as a tangent:
   ∀ P ∈ Polygon, ∀ c1, c2 ∈ P.Boundary(), c1 ≠ c2, ∀ p, q ∈ Point, p, q ∈ c1, p ≠ q, [p ∈ c2 ⇒ q ∉ c2]
4. A Polygon may not have cut lines, spikes or punctures:
   ∀ P ∈ Polygon, P = Closure(Interior(P))
5. The Interior of every Polygon is a connected point set.
6. The Exterior of a Polygon with 1 or more holes is not connected. Each hole defines a connected component of the Exterior.

In the above assertions, Interior, Closure and Exterior have the standard topological definitions. The combination of 1 and 3 make a Polygon a Regular Closed point set.'
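As a present-day illustration of the assertions just quoted (our addition, not part of the original tests): the Java Topology Suite mentioned later in this paper, and its C++ port GEOS as wrapped by the Python library Shapely, implement these SFS rules. A short sketch, with coordinates of our own invention:

# Illustration only: SFS polygon validity as implemented by JTS/GEOS,
# accessed here through the Shapely library.
from shapely.geometry import Polygon
from shapely.validation import explain_validity

shell = [(0, 0), (10, 0), (10, 10), (0, 10)]

# An inner ring touching the outer ring in exactly one point (a tangent):
# allowed by assertion 3, so the polygon is valid.
touching = Polygon(shell, [[(0, 5), (5, 3), (5, 7)]])
print(touching.is_valid)            # True

# An inner ring that crosses the outer ring violates assertion 3.
crossing = Polygon(shell, [[(-2, 5), (5, 3), (5, 7)]])
print(crossing.is_valid)            # False
print(explain_validity(crossing))   # prints a brief reason for the invalidity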
Similar to the ISO standard, in the OpenGIS SFS specification a polygon is also a specialization of the more generic surface type, which can exist in 3D space. However, the only instantiable subclass of Surface defined in the OpenGIS SFS specification, Polygon, is a simple Surface that is planar. As might be expected from an implementation specification, a number of things become clearer. According to condition 3, rings may touch each other in at most one point. Further, condition 5 makes clear that the interior of a polygon must be a connected set (and a configuration of inner rings which somehow subdivides the interior of a polygon into disconnected parts is not allowed). Finally, an interesting point is raised in condition 4: cut lines or spikes are not allowed. All fine from a mathematical point of view, but when is a 'sharp part' of the boundary considered a spike? Must the interior angle at that point be exactly 0, or is some kind of tolerance involved? The same is true for testing if a ring touches itself or if two rings touch each other (in a node-node situation or a node-segment situation). Note that the OpenGIS specification does not say anything concerning the orientation of the polygon rings.

2.4 An enhanced 'polygon with holes' definition

Our definition of a valid polygon with holes: 'A polygon is defined by straight-line segments, all organized in rings, representing at least one outer (oriented counterclockwise) and zero or more inner boundaries (oriented clockwise; also see sections 2.1-2.3 for the concepts used). This implies that all nodes are connected to at least two line segments and no dangling line segments are allowed. Rings are not allowed to cross, but it is allowed that rings touch or (even partially) overlap themselves or each other, as long as any point inside or on the boundary of the polygon can be reached through the interior of the polygon from any other point inside the polygon, that is, it defines one connected area. As indicated above, some conditions (e.g. "ring touches other ring") require a tolerance value in their evaluation and therefore this is the last part of the definition.'

One could consider not using a tolerance value and only looking at exact values of the coordinates and the straight lines defined by them. Imagine a situation in which a point of an inner ring is very close to the outer ring; see for example cases 4, 31 and 32 in the figure of section 3. The situation in reality may have been that the point is supposed to be located on the ring (case 4). However, due to the finite number of available digits in a computer, it may not be possible to represent that exact location (Goldberg 1991, Güting 1993), but a close location is chosen (cases 31 and 32). It is arbitrary if this point would be on the one or the other side of the ring. Not
considering tolerances would mean that this situation would be classified either as crossing rings (not allowed) or as two disjoint rings (which is not the case, as they are supposed to touch each other). Either way, the polygon (ring and validity) situation is not correctly assessed. This is one of the reasons why many systems use some kind of tolerance value. The problem is how to specify the manner in which the tolerance value is applied when validating the polygon (this part is missing in our definition above). Another example which illustrates this problem is case 30 (see the figure in section 3): one option for tolerance processing could be to remove the 'internal spike', and the result would be a valid polygon. However, an alternative approach may be to widen the gap between the two end nodes, and in this situation the result is an invalid polygon as the 'internal spike' intersects one of the other edges. It may be difficult to formalize unambiguous epsilon processing rules as a part of the validation process. Another strange aspect of our definition of valid polygons is that a 'spike to the outside' (case 11 in the figure of section 3) results in an invalid polygon, as it is not possible from a point in the middle of this spike to reach all other points of the polygon via the interior, while at the same time a 'spike to the inside' (case 12 in the figure of section 3) is considered a valid polygon, as it is possible to reach from any point of the polygon (also from the middle of the spike) any other point via the interior of the polygon. Something similar to the 'spikes' occurs with 'bridges': while internal bridges are valid (cases 7 and 8), the external bridges (cases 15 and 16) are invalid. In the situations of both spikes and bridges, the difference between internal and external could be considered 'asymmetrical'.

2.5 Valid and clean polygons

The validation process (according to our definition of valid polygons as described above) would become much simpler if it can be assumed that no point lies within epsilon tolerance of any other point or edge (which it does not define itself); such a polygon will be called a (valid) clean polygon. In cases 4 (31 and 32) this implies that the segment on which the point of the other ring is supposed to be located should be split into two parts, with the point as the best possible representation within the computer. By enforcing this way of modelling, the polygon validity assessment may be executed without tolerances. Another advantage of clean polygons as described above is that they will not have any spikes (neither to the inside nor to the outside), as the two end nodes of the spike lie too close together (or are even equal). Similarly, the internal and external bridges should be removed. Further, repeated points are removed from the representation.
Before transferring polygon data, the sending system should therefore first do the epsilon tolerance processing. After that, the sender can be sure that the receiving system will correctly receive the polygons (assuming that coordinates are not changed by more than epsilon during transfer). The largest distance over which a coordinate can be moved while the result is still a valid polygon is called the robustness of the polygon representation by Thompson (2003). In case 4, without an additional node, the robustness would be equal to 0, as an infinitely small change (to the outside) of the node of the inner ring on the edge of the outer ring would make this polygon invalid (not considering epsilon tolerance). However, adding an explicit shared node in both inner and outer ring as a result of the epsilon tolerance processing increases the robustness of this representation (of the 'same' polygon) to at least the value of epsilon. In fact the robustness is even larger, as it is possible to change every (shared) node by more than epsilon (the size of epsilon can be observed from the open circle in the drawing of cases 31 and 32). The robustness of a polygon can be computed by finding the smallest distance between a node and an edge (not defined by that node). The smallest distance can be reached either somewhere in the middle or near one of the end points of the involved edge. A brute force algorithm would require O(n²), while a smarter (computational geometry) algorithm could probably compute this in O(n log n), where n is the number of nodes (or edges) in the polygon. The concept of robustness has some resemblance to the 'indiscernibility' relation between two representations as introduced by Worboys (1998).
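A minimal sketch of the brute-force O(n²) robustness computation just described (our own Python, assuming the rings are given as lists of coordinate pairs; not the authors' implementation):

# Illustration only: robustness = smallest distance between any node and any
# edge that is not defined by that node.
import math

def point_segment_distance(p, a, b):
    # Distance from point p to segment a-b (the closest point may be an endpoint).
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg2 = dx * dx + dy * dy
    if seg2 == 0.0:                      # degenerate edge
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / seg2
    t = max(0.0, min(1.0, t))            # clamp the projection onto the segment
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy)

def robustness(rings):
    # rings: list of rings, each a list of (x, y) nodes (closing edge is implicit).
    nodes = [p for ring in rings for p in ring]
    edges = [(ring[i], ring[(i + 1) % len(ring)])
             for ring in rings for i in range(len(ring))]
    return min(point_segment_distance(p, a, b)
               for p in nodes for (a, b) in edges if p != a and p != b)

# A square outer ring with a hole whose node at (0, 5) lies on the outer ring
# (compare case 4 without an additional node): the robustness is 0.
print(robustness([[(0, 0), (10, 0), (10, 10), (0, 10)],
                  [(0, 5), (5, 3), (5, 7)]]))   # 0.0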
3 Testing Geo-ICT systems

In this section we first specify a list of representative test polygons. This list is supposed to be exhaustive for all possible valid and invalid types of polygons. Next, we use this test set in combination with four different geo-DBMSs and compare the outcome to the OpenGIS, ISO and our own definition of valid polygons.

3.1 Polygon examples

Figure 2 shows an overview of our test polygons. In these images, small filled circles represent nodes. Sometimes, two of these circles are drawn very close to each other; this actually means that the nodes are exactly the same. The same is true for two very close lines, which actually means (partly) overlapping segments. Some figures contain empty circles, which
indicate tolerance values (the assumed tolerance value is 4000, relative to the coordinates used in our test polygons). In case such a situation occurs, the system may decide to 'correct' the polygon, within the tolerance value distances, resulting in a new polygon representation. The resulting new representation may be valid or invalid, depending on the configuration. Note that all polygon outer rings are defined counterclockwise and all inner rings are defined clockwise. More test variants can be imagined when reversing the orientation (only done for test 1). Further, a ring touching itself can be modelled as one ring or as separate rings. The separate ring option is chosen, with the exception of example 4, where both cases are tested: 4a, the 'separate rings' option (without an explicit node where the rings touch), and 4b, the 'single ring' option (self touching). Even a third variant would be possible: two 'separate rings' with explicit nodes where these rings touch (not tested). Also in this case more test variants can be imagined. In addition to our presented set of test cases, it may be possible to think of other test cases. It is important to collect these test cases in order to evaluate the completeness (and correctness) of the related standards. To name just a few additional, untested, cases (but many more will exist):

- polygons with the inner and outer ring switched (first inner ring specified, then outer ring), but both with the proper orientation
- the same as above, but now also with the orientation (clockwise/counterclockwise) reversed
- two exactly the same points on a straight line (similar to case 26, but now with repeated points)
- the same as above, but now with the two points repeated on a true 'corner' of the polygon
- a line segment of an inner ring within tolerance of a line segment of the outer ring (but on the inside), similar to case 9 but with a tolerance value
- the same as above, but now with the line segment on the outside
- two outer rings, with partly overlapping edges or touching in a point
Fig. 2. Overview of the polygons used in the test
Table 1. Results of validating the polygons; no code means the polygon is considered valid (BS=boundary self-intersects, CR=crossing rings, EN=edge not connected to interior, FI=floating inner ring, NA=no area, NC=not closed, NH=not one homogeneous portion, NO=not orientable, NS=no surface, Rn=rule n (n=1,3,4,5), RC=ring crosses ring, RO=rings overlap, RT=rings touch, SR=self-crossing ring, TE=two exterior rings, TS=two separate areas, WO=wrong orientation). For each test case id (1a-37) the table lists the expected result according to the ISO, OpenGIS and our own definitions, together with the response of each of the four tested systems; the body of the table is not reproduced here.
3.2 System tests

We tested a set of about 40 polygons in different spatial databases: Oracle (2001), Informix (2000), PostGIS (Ramsey 2001, PostgreSQL 2001), and ArcSDE binary (ESRI). Of these implementations, Informix and PostGIS use the OpenGIS specification for polygons. Oracle Spatial defines a polygon as: 'Polygons are composed of connected line strings that form a closed ring and the area of the polygon is implied.' The OpenGIS Well Known Text (WKT) format was used when trying to insert the records in the different systems, with the exception of ArcSDE (see below). Because Oracle does not support this format, we integrated the Java Topology Suite (1.3) into the Oracle server to implement the conversion function. Below is an example of the exact insert statement for a correct polygon with one hole (case 2), shown here in its Oracle form (the statements for the other systems accepting WKT are analogous):

insert into test_polygon values ('2', GeomFromText(
  'polygon( (33300 19200, 19200 30000, 8300 15000, 20000 4200, 33300 19200),
            (25000 13300, 17500 13300, 17500 19200, 25000 19200, 25000 13300))'));
Oracle has a separate validation function in which a parameter for the tolerance can be specified (the example below shows our tolerance value of 4000):

select id, sdo_geom.validate_geometry_with_context(geom, 4000) as valid
from test_polygon
where sdo_geom.validate_geometry_with_context(geom, 4000) <> 'TRUE';
It was not possible to load the WKT format into ArcSDE (without writing a program using ESRI's ArcSDE API). As we wanted to be sure to input the same polygon definitions with standard tools of the vendor, we used the following procedure to load the data into ArcSDE:

1. WKT was converted by hand to 'mif' ('mid') format: in this manner exactly the same coordinates, ordering and rings were specified (including the repetition of the first and last point).
2. ArcTools 8.3 was used to convert the mif/mid format to ESRI shape files, a binary format. This conversion has some coordinate manipulation as a side effect: e.g. outer rings are now always ordered clockwise, a repeated last point is omitted, and rings defining a region are by definition closed. This makes it impossible to test case 10.
3. Finally, the polygons are loaded into ArcSDE binary with the following ArcSDE command:

   create -l testpoly,geom -f testpoly.shp -e a+ -k SDEBINARY \
     -9 10000 -a all -u username
Table 1 gives an overview of the different responses by the four systems. It also contains the expected results according to the ISO and OpenGIS definitions and our own definition. Inserting the test cases in the four different DBMSs leads to the following general observations. Oracle Spatial (version 9.2.0.3.0) provides a tolerance parameter that is used with many of the operations; in this test a tolerance of 4000 was used, and experiments with different tolerance values yielded the same results. If Informix (Spatial DataBlade Release 8.11.UC1, Build 225) finds a polygon with erroneous orientation, it reverses the polygon internally without warning. PostGIS 0.6.2 (on PostgreSQL 7.1.3) only supports GeomFromText (and not PolyFromText), and the geometries cannot be validated.
4 Conclusion

As noticed in our experience when benchmarking and testing Geo-ICT products, the consistent use of a polygon definition is not yet a reality. This is true both for the standards (specifications) and for the implementations in products. Based on both the ISO 19107 standard and the OpenGIS SFS for SQL implementation specification, it may sometimes be very difficult or impossible to determine whether a polygon is valid or not. Also, according to our evaluation of the set of test cases, the results for (in)valid polygons are not always harmonized between ISO and OpenGIS. Further, neither the ISO nor the OpenGIS definition covers the important aspect (when implementing polygons) of the tolerance value. Therefore our own improved definition of a polygon (with holes) was given. This was refined by the definition of a (valid) clean polygon, which is suitable for data transfer (and easy validation). Part of the polygon validation may already be embedded in the syntax of the 'polygon input' (string), and certain validation tasks are implicit (e.g. the system does not have to assemble the rings from the individual straight
line segments, as the rings are specified). One could wonder whether the orientation of the rings in a polygon should be a strict requirement, as the intended polygon area is clear. In case the outer polygon ring in the syntax is not determined by the ordering of the rings (outer ring first), but purely by the orientation of the rings (the outer ring could be any one in the list of rings), then proper orientation is useful, as it can be used to detect inner and outer rings. But even this is not strictly needed, as one could also determine via the geometric configuration (by computing) which ring should be considered the outer ring. Besides the theory, our tests with four different Geo-DBMSs (Oracle, Informix, PostGIS, and ArcSDE binary) and one geo middleware product (LaserScan Radius Topology, not reported here) revealed that significant differences in polygon validation also exist in practice. It needs no further explanation that this will cause serious problems during data transfer, including loss of data. We urge standardization organizations and Geo-ICT vendors to address this problem and consider our proposed definition.

Until now, only the validation of (input) polygons has been discussed, but what happens with these (in)valid polygons during operations? For example, the intersection of two valid polygons may result in disconnected areas (which is not a valid polygon). How is the area or perimeter of a polygon computed in case it is specified in longitude and latitude (on a curved surface such as a sphere, ellipsoid or geoid)? What will be the resulting units, and how does the tolerance value influence this result?

In this paper only simple polygons with holes on a flat surface were discussed. However, as already indicated in the previous paragraph (curved surfaces), more complex situations can occur in the world of Geo-ICT products (and standards):

- multi-polygons (that is, two or more outer boundaries which are not connected to each other)
- boundaries that also include non-linear edges (e.g. circular arcs)
- polygons in 3D space, but limited to a flat surface
- polygons in 3D space, but limited to polyhedral surfaces (piecewise flat)
- polygons in 3D on non-flat surfaces, especially an Earth ellipsoid (or geoid)

For all these situations unambiguous and complete definitions, including the tolerance aspect, must be available. Test cases should be defined and subsequently the products should be evaluated with these test cases. And after finishing with polygons, we should continue with polyhedrons (Arens et al. 2003).
Acknowledgements

We would like to thank the Topographic Service and the Dutch Cadastre for providing us with the test data sets (since January 2004, the Topographic Service has been part of the Cadastre). We are further grateful to the vendors and developers of the Geo-ICT products mentioned in this paper (Oracle, Informix, PostgreSQL/PostGIS, ArcSDE) for making their products available for our research. Finally, we would like to thank the anonymous reviewers of this paper for their constructive remarks.
References

Arens C, Stoter JE, van Oosterom PJM (2003) Modelling 3D spatial objects in a Geo-DBMS using a 3D primitive. Proceedings 6th AGILE, Lyon, France.
Goldberg D (1991) What Every Computer Scientist Should Know About Floating-Point Arithmetic. ACM Computing Surveys, Vol. 23: 5-48.
Güting RH and Schneider M (1993) Realms: A foundation for spatial data types in database systems. In D. J. Abel and B. C. Ooi, editors, Proceedings of the 3rd International Symposium on Large Spatial Databases (SSD), volume 692 of Lecture Notes in Computer Science, pages 14-35. Springer-Verlag.
IEEE (1985) American National Standard -- IEEE Standard for Binary Floating Point Arithmetic. ANSI/IEEE 754-1985. New York: American National Standards Institute, Inc.
Informix (2000) Informix Spatial DataBlade Module User's Guide. December 2000. Part no. 000-6868.
ISO (2003) ISO/TC 211/WG 2, ISO/CD 19107, Geographic information — Spatial schema, 2003.
OGC (1999) Open GIS Consortium, Inc., OpenGIS Simple Features Specification For SQL, Revision 1.1, OpenGIS Project Document 99-049, 5 May 1999.
Oracle (2001) Oracle Spatial User's Guide and Reference. Oracle Corporation, Redwood City, CA, USA, June 2001. Release 9.0.1, Part No. A8805-01.
Oxford (1973) The Shorter Oxford English Dictionary.
PostgreSQL (2001) The PostgreSQL Global Development Group. PostgreSQL 7.1.3 Documentation.
Preparata FP and Shamos MI (1985) Computational Geometry, an Introduction. Springer-Verlag, New York Berlin Heidelberg Tokyo.
Ramsey P (2001) PostGIS Manual (version 0.6.2). Refractions Research Inc.
Thompson R (2003) PhD research proposal 'Towards a Rigorous Logic for Spatial Data Representation'. Department of Geographical Sciences and Planning, The University of Queensland, Australia, November 2003.
Worboys MF (1998) Some Algebraic and Logical Foundations for Spatial Imprecision. In Goodchild M and Jeansoulin R (eds), Data Quality in Geographic Information: from error to uncertainty, Hermes.
3D Geographic Visualization: The Marine GIS

Chris Gold 1, Michael Chau 2, Marcin Dzieszko 1 and Rafel Goralski 1

1 Department of Land Surveying and Geo-Informatics, Hong Kong Polytechnic University, Hong Kong, [email protected]
2 Hong Kong Marine Department, Hong Kong
Abstract

The objective of GIS and Spatial Data Handling is to view, query and manipulate a computer simulation of the real world. While we traditionally work with two-dimensional static maps, modern technology allows us to work with a three-dimensional dynamic environment. We have developed a generic graphics component which provides many of the tools necessary for developing 3D dynamic geographical applications. Our example application is a 3D "Pilot Book", which is used to provide navigation assistance to ships entering Hong Kong harbour. We show some of the "Marine GIS" results, and mention several other applications.
1 Introduction: The Real and Simulated World

Traditional GIS is two-dimensional and static. The new game technology is 3D and dynamic. We attempt here to develop a "games engine" for 3D dynamic GIS, and to evaluate the process with several applications – primarily a "Marine GIS", where objects and viewpoints move, and a realistic simulation of the real-world view should be an advantage in navigation and training. While computer game developers have spent considerable time in developing imaginary worlds, and the ways in which players may interact in a "natural" fashion, graphics software and hardware developers have provided most of the necessary tools. However, there has been limited examination of the potential of these technologies for "Real World" spatial data handling. This paper attempts to address these questions.
For “GIS”, we presume that our data represents some sampling of the real world. This real world data maps directly to the simulated world (objects and coordinates) of the computer representation that we wish to work with. The only difference between this activity and some kinds of computer game playing is the meaning inherent in our data, and the set of operations that we perform on it. Our perception or mental ability to manipulate and imagine the simulated world that we work with is usually limited by our concept of permissible interactions. Historically, we started with paper and map overlays, and this led to a particular set of techniques that were implemented on a computer. Game development is capable of freeing us from many of these constraints. We would like to present an approach to manipulation of our simulated world that attempts to be independent of the usual constraints of static maps – a static viewpoint, a static set of objects and relationships, the inability to reach in and make changes, the inability to make interactive queries, the inability to change the perceived mood of our world (darkness, fog, etc.) We are not alone in doing this – the computer games industry has put much effort into such increased “realism”. The intention of this work is to bring these concepts, and the improved algorithms and hardware that they have inspired, into the context of geographic space, where the objects and locations correspond in some way to the physical world. Again, we are not completely original here either – terrain fly-throughs and airport simulations are widely used. Nevertheless, we think that the overall structure of our interactive interface between the user and the simulated world contains components that have not previously been combined in a geographic application. We will first describe the motivation and imagined actions, then the consequent “games engine” design (which differs from pure games). This leads to the implementation issues of our “GeoScene” engine, Marine GIS and other applications, and our conclusions about this exercise. In the “God Game” (Greeley 1986) the author imagines the computer user creating an interactive game, with individuals who then come alive “inside the screen” – and he is thereafter responsible for their well-being. Our objective is to create a general-purpose interface that allows us to manage, not an imaginary world, but a simulation of some aspect of our current one. Thus we need to: 1) create a simulated world and populate it with geometric objects, lights and cameras (observers); 2) manipulate the relations between observers and objects; 3) query and modify objects’ properties, geometry or location; and 4) automate changes to objects (e.g. location). While not a complete list, these were our initial objectives, motivated by the desire to define a general-purpose simulation and visualization tool for geographic problems.
2 Computer Graphics Background: Viewing Objects One concept in the development of computer graphics (e.g. Foley et al. 1990) is the ability to concatenate simple transformations (rotations, translation (movement) and scaling) by the use of homogeneous coordinates and individual transformations expressed as 4x4 matrices for 3D worlds. Blinn (1977) showed that these techniques allowed the concatenation of transformation matrices to give a single matrix expressing the current affine transformation of any geometric object. More recent graphics hardware and software (e.g. OpenGL: Woo 1999) adds the ability to stack previous active transformation matrices, allowing one to revert to a previous coordinate system during model building – e.g. the sun, with an orbiting planet, itself with an orbiting moon. A further development was the “Scene Graph” (Strauss and Carey 1992; also Rohlf and Helman 1994) which took this hierarchical description of the coordinate systems of each object and built a tree structure that could be traversed from the root, calculating the updated transformation matrix of each object in turn, and sending the transformed coordinates to the graphics output system. While other operations may be built into the scene graph, its greatest value is in the representation of a hierarchy of coordinate systems. These coordinate systems are applied to all graphic objects: to geometric objects, such as planets, or to cameras and lights, which may therefore be associated with any geometric object in the simulated world. Such a system allows the population of the simulated world with available graphic objects, including geometric objects, lights and cameras (windows, or observers). An object is taken from storage, using its own particular coordinate system, and then placed within the world using the necessary translation and rotation. If it was created at a different scale, or in different units, then an initial matrix is given expressing this in terms of the target world coordinates. Geometric objects may be isolated objects built with a CAD type modelling system or they may be terrain meshes or grids – which may require some initial transformation to give the desired vertical exaggeration. In most cases world viewing is achieved by traversing the complete tree, and drawing each object in turn, after determining the appropriate camera transformation for each window. Usually an initial default camera, and default lighting, is applied so that the model may be seen! We have implemented a general-purpose, scene graph based viewer for our particular needs, called “GeoScene”. The heart of GeoScene is the Graphic Object Tree, or scene graph. This manages the spatial (coordinate) relationships between graphic objects.
20 Chris Gold, Michael Chau, Marcin Dzieszko and Rafel Goralski
These graphic objects may be drawable (such as houses, boats and triangulated surfaces) or non-drawable (cameras and lights). The basis of the tree is that objects can be arranged in a hierarchy, with geometric transformations expressing the position and orientation of an object with respect to its parent object – for example the position of a wheel on a car, a light in a lighthouse, or a camera on a boat. The efficiency of this method comes from the concatenation of transformation matrices using Blinn's homogeneous coordinates. Redrawing the whole simulated world involves starting at the root of the tree, incorporating the transformation matrix, and drawing the object at that node (if any). This is repeated down the whole tree. Prior to this the camera position and orientation must be calculated, again by running down the tree and calculating the cumulative transformation matrix until the selected camera is reached. This could be repeated for several cameras in several windows. This process must be repeated after every event that requires a modified view. These events could be generated by window resizing or redrawing by the system, by user actions with the keyboard or mouse, or by automated operations, such as ship movements.
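A minimal sketch of this idea (our own Python illustration with made-up names, not actual GeoScene code): each node stores a local 4x4 transform relative to its parent, and drawing traverses the tree while concatenating the matrices, so anything attached to a moving object moves with it.

# Illustration only: a scene-graph node with hierarchical 4x4 transforms.
import numpy as np

def translation(tx, ty, tz):
    m = np.eye(4)
    m[:3, 3] = (tx, ty, tz)
    return m

class Node:
    def __init__(self, name, local=None, drawable=True):
        self.name = name
        self.local = np.eye(4) if local is None else local  # transform relative to parent
        self.drawable = drawable
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def draw(self, parent_world=None):
        base = np.eye(4) if parent_world is None else parent_world
        world = base @ self.local                  # concatenate transformation matrices
        if self.drawable:
            print(self.name, "at", world[:3, 3])   # stand-in for sending geometry to the display
        for child in self.children:
            child.draw(world)

# A boat placed in the world, with a mast light and a camera attached to it:
# changing only the boat's matrix moves the light and the camera as well.
world = Node("world", drawable=False)
boat = world.add(Node("boat", translation(100.0, 50.0, 0.0)))
light = boat.add(Node("mast light", translation(0.0, 0.0, 15.0)))
camera = boat.add(Node("camera", translation(0.0, -30.0, 10.0), drawable=False))
world.draw()   # prints the boat at (100, 50, 0) and the mast light at (100, 50, 15)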
3 Action and Interaction: System Design The system described so far creates and views the world from different perspectives. It is designed for full 3D surface modelling, where the surface is defined by a collection of unrelated triangles. While it is often desirable for other operations to preserve the topological connections between the components (vertices, edges, faces) of an object, this is not necessary for basic visualization. User actions with the mouse or keyboard can have different meanings in different applications, or even in different modes of operation. This means that a “Manipulator” component needs to be written or modified for each application. We have developed a fairly consistent and effective mapping between wheel-mouse operations and navigation within the 3D world (where a camera moves and gives the observer’s view as a result of his operations, or else when the camera is located on a moving boat). These and other actions would be modified as necessary for the application or operation mode. The two main modes are when user gestures move the observer’s position (a left mouse movement means “Go left”) or when the gesture means “Move the selected object left”, in which case the observer appears to move right. Selection is performed by GeoScene, which calls the Manipulator to select the action, and the scene is redrawn. We have been quite successful with mapping the wheel of a wheel mouse to depth
in the screen for many applications. In other cases, object selection is followed by a query of its properties from the database. These actions may be automated by using the “Animator” component, running in a separate thread so as to preserve the timing. This operates on objects that need to change over time – either a value must change (e.g. for a lighthouse), or else a transformation matrix must be updated for object movement. Examples are the movement of a boat, the rotation of a lighthouse beam or the movement of a camera in a fly-through. In the real world, objects may not occupy the same location at the same time. There is no built-in prohibition of this within the usual graphics engine, nor in GeoScene. Particular applications (e.g. the Marine GIS) implement their own data structures for collision detection and topology if necessary. Modification of an object, as done in CAD systems, requires the selection of a portion of the object (usually a vertex, edge or face), and in b-rep systems these are part of a topological model of the object surface (Mantyla 1988). These operations are specific to the application using GeoScene.
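A hedged sketch of the kind of mode-dependent gesture mapping a Manipulator component might perform is given below; the mode names, gesture vocabulary and data structures are assumptions made for illustration and do not reproduce the actual GeoScene/Manipulator API.

```python
# Hedged sketch of a mode-dependent gesture mapping such as a "Manipulator"
# performs; the mode names, gesture vocabulary and dictionaries below are
# assumptions for illustration, not the actual GeoScene/Manipulator API.
class Manipulator:
    def __init__(self, mode="navigate"):
        # "navigate": gestures move the observer ("go left");
        # "edit": gestures move the selected object (observer appears to move right)
        self.mode = mode

    def handle_gesture(self, gesture, camera, selected=None):
        dx = {"left": -1.0, "right": 1.0}.get(gesture, 0.0)
        if self.mode == "navigate":
            camera["x"] += dx
        elif self.mode == "edit" and selected is not None:
            selected["x"] += dx

    def handle_wheel(self, ticks, camera):
        camera["depth"] += ticks          # wheel mapped to depth in the screen

camera, buoy = {"x": 0.0, "depth": 0.0}, {"x": 5.0}
m = Manipulator("navigate")
m.handle_gesture("left", camera)                     # the observer goes left
m.mode = "edit"
m.handle_gesture("left", camera, selected=buoy)      # the selected object goes left
m.handle_wheel(3, camera)                            # move deeper into the scene
print(camera, buoy)
```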
4 A Pilot Application – the “Pilot Book” Perhaps the ultimate example of a graphics-free description of a simulated Real World is the Pilot Book, prepared according to international hydrographical standards to define navigation procedures for manoeuvring in major ports (UK Hydrographic Office 2001). It is entirely text-based, and includes descriptions of shipping channels, anchorages, obstacles, buoys, lighthouses and other aids to navigation. While a local pilot would be familiar with much of it, a foreign navigator would have to study it carefully before arrival. In many places the approach would vary depending upon the state of the tides and currents. It was suggested that a 3D visualization would be an advantage in planning the harbour entry. While it might be possible to add some features to existing software, it appeared more appropriate to develop our own 3D framework. Ford (2002) demonstrated the idea of 3D navigational charts. This was a hybrid of different geo-data sources such as satellite pictures, paper chart capture and triangular irregular network data visualized in 3D. The project concluded that 3D visualization of chart data had the potential to be an information decision support tool for reducing vessel navigational risks. Our intention was to adapt IHO S-57 Standard Electronic Navigation Charts (International Hydrographic Bureau 2000, 2001) for 3D visualization. The study area was Hong Kong’s East Lamma Channel.
Fig. 1: MGIS Model - Graphic User Interface
5 The Marine GIS Our own work, based on the ongoing development of GeoScene, was to take the virtual world manipulation tools already developed, and add those features that are specific to marine navigation. As GeoScene contains no topology or collision detection mechanism, in the Marine GIS application we use the kinetic Voronoi diagram (Roos 1993) as a collision detection mechanism in two dimensions on the sea surface, so that ships may detect potential collisions with the shoreline and with each other. Shoreline points are calculated from the intersection of the triangulated terrain with the sea surface, which may be changed at any time. In addition, marine features identified in the IHO S57 standards were incorporated. Fig. 1 illustrates the system. On top of GeoScene, a general “S57Object” class was created, and sub-classes created for each defined S57 object. These included navigational buoys, navigational lights, soundings, depth contours, anchorage areas, pilot boarding stations, radio calling-in points, mooring buoys, traffic separation scheme lanes, traffic
scheme boundaries, traffic separation lines, precautionary areas, fairways, restricted areas, wrecks, and underwater rocks. See Figs. 2-7 for examples.
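As an illustration of how such a class family might be organised (the actual Marine GIS class definitions are not reproduced here, and all names below are assumptions), a base S57Object with a few sub-classes could look as follows:

```python
# Illustrative sketch only: one way a general S57Object class and a few
# sub-classes could be laid out; the names and attributes are assumptions,
# not the Marine GIS source code.
class S57Object:
    """Base class for IHO S-57 features placed in the simulated world."""
    def __init__(self, object_id, lon, lat, **attributes):
        self.object_id = object_id
        self.position = (lon, lat)
        self.attributes = attributes

    def describe(self):
        return f"{type(self).__name__} {self.object_id} at {self.position}"

class NavigationalBuoy(S57Object):
    def __init__(self, object_id, lon, lat, colour, shape, **attrs):
        super().__init__(object_id, lon, lat, **attrs)
        self.colour, self.shape = colour, shape

class NavigationalLight(S57Object):
    def __init__(self, object_id, lon, lat, character, range_nm, **attrs):
        super().__init__(object_id, lon, lat, **attrs)
        self.character, self.range_nm = character, range_nm

class Sounding(S57Object):
    def __init__(self, object_id, lon, lat, depth_m, **attrs):
        super().__init__(object_id, lon, lat, **attrs)
        self.depth_m = depth_m

buoy = NavigationalBuoy("B-012", 114.10, 22.20, colour="red", shape="conical")
light = NavigationalLight("L-003", 114.12, 22.18, character="Fl(2) 10s", range_nm=12)
print(buoy.describe(), "|", light.describe())
```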
Fig. 2: Visualization of Navigational Buoy and Lights
Fig. 3: Visualization of Safety contour with vertical extension
Other objects include ship models, sea area labels, 3DS models, range rings and bearing lines, and oil spill trajectory simulation results. Various query functions were implemented, allowing for example the tabulation of all buoys in an area or the selection of a particular buoy to determine its details. Selecting “Focus” for any buoy in the table moves the window viewpoint to that buoy.
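A small, purely illustrative sketch of the query-and-focus behaviour described above follows; the Marine GIS exposes no Python API that we are aware of, so the data layout and function names are assumptions.

```python
# Purely illustrative query-and-focus helpers in the spirit of the functions
# described above; the data layout and names are assumptions, not the
# application's actual interface.
buoys = [
    {"id": "B-012", "lon": 114.10, "lat": 22.20, "colour": "red"},
    {"id": "B-047", "lon": 114.15, "lat": 22.24, "colour": "green"},
]

def buoys_in_area(features, lon_min, lon_max, lat_min, lat_max):
    """Tabulate all buoys whose position falls inside a bounding box."""
    return [f for f in features
            if lon_min <= f["lon"] <= lon_max and lat_min <= f["lat"] <= lat_max]

def focus_on(camera, feature, height=200.0):
    """Move the window viewpoint to hover above the selected feature."""
    camera.update({"lon": feature["lon"], "lat": feature["lat"], "height": height})

camera = {"lon": 0.0, "lat": 0.0, "height": 1000.0}
table = buoys_in_area(buoys, 114.0, 114.2, 22.1, 22.3)
focus_on(camera, table[0])               # "Focus" on the first buoy in the table
print([b["id"] for b in table], camera)
```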
Fig. 4: Visualization of Sea Area Label Using 3D fonts
Safety contours may be displayed along the fairways, and a 3D curtain display emphasizes the safe channel. Fog and night settings may be specified, to indicate the visibility of various lights and buoys under those conditions. Safety contours and control markers may appear illuminated if desired, to aid in the navigation. The result is a functional 3D Chart capable of giving a realistic view of the navigation hazards and regulations.
Fig. 5: Visualization of Oil Trajectory Record
6 Other Applications While Marine GIS is the most developed application using GeoScene, a variety of other uses have been developed. Our work on terrain modeling
Fig. 6: Lists of Navigational Buoys
Fig. 7: MGIS Model – Scene of Navigational Mode. When animation mode is activated, the viewpoint follows the movement of the ship model. The movement of the vessel can be controlled using mouse clicks.
uses the same interface for 3D visualization, and runoff modeling (using finite difference methods over Voronoi cells rather than a grid) also requires GeoScene for 3D visualization (Fig. 8). Applications are under development for interactive landscape modification, using a “knife” to sculpt the triangulated terrain model.
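The runoff component is only summarised above; the following is a strongly simplified sketch of explicit finite-difference water routing over Voronoi cells – our own toy formulation, not the authors' model – in which water moves between neighbouring cells in proportion to the head difference, weighted by shared edge length and cell spacing.

```python
# Strongly simplified sketch (our own toy formulation, not the authors' model)
# of explicit finite-difference runoff routing over Voronoi cells.
def runoff_step(cells, dt=0.1, k=1.0):
    """One explicit time step. 'cells' maps id -> {z, w, area, neighbours},
    where neighbours is a list of (other_id, shared_edge_length, distance)."""
    flux = {cid: 0.0 for cid in cells}                 # net volume rate per cell
    for cid, c in cells.items():
        head_c = c["z"] + c["w"]
        for nid, edge_len, dist in c["neighbours"]:
            head_n = cells[nid]["z"] + cells[nid]["w"]
            if head_c > head_n:                        # water only flows downhill
                q = k * edge_len * (head_c - head_n) / dist
                q = min(q, c["w"] * c["area"] / dt)    # crude limiter: cannot over-drain
                flux[cid] -= q
                flux[nid] += q
    for cid, c in cells.items():
        c["w"] += dt * flux[cid] / c["area"]           # update water depth

# two Voronoi cells sharing one edge: the higher, wetter cell drains into the lower one
cells = {
    0: {"z": 10.0, "w": 0.5, "area": 4.0, "neighbours": [(1, 2.0, 3.0)]},
    1: {"z": 9.0,  "w": 0.0, "area": 5.0, "neighbours": [(0, 2.0, 3.0)]},
}
for _ in range(10):
    runoff_step(cells)
print(round(cells[0]["w"], 3), round(cells[1]["w"], 3))
```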
Fig. 8: Surface runoff modelling based on Voronoi cells.
Recent work on the development of new 3D spatial data structures also uses GeoScene as a visualization tool, and a preliminary demo to show an underground piping layout has been prepared – requiring only a few lines of code to be added to the GeoScene units to perform simple visualization. We believe that an available, operational toolkit for the creation, visualization, query, animation and modification of simulated worlds is of great potential benefit in the geosciences. In our simulation of the real world, we no longer need to think of 2D static visualization, because the tools exist to provide more realistic results. There should be no real reason to work in two dimensions, without time, unless it is clearly sufficient for the application.
7 Conclusions Since the objective of this exercise is one of improved visualization in a simulated real or modified world, it is difficult to evaluate the results directly. Some of our experiences, however, can be summarized. We started our work with extensive discussions and human simulations of the modes of operation: given the wheel mouse as our hardware limit,
we developed gestures that were universally accepted by (young) game players and (old) professors as a mechanism for manipulating the hierarchical observer/world relationships. Game experts and non-experts could learn to manoeuvre in five minutes. This also led to rediscovering the scene graph as a hierarchy of transformations, allowing any levels of cartographic rescaling and object/sub-object repositioning – including locating observers on any desired object. This was the framework of GeoScene. We also found that intuitive interaction depended on observer/world scale: gesturing to the left if the observer was climbing a cliff implied that the actor moved left, but if the observer was holding an object (e.g. in a CAD modelling system) that implied the object moved left, giving a relative observer movement to the right. These are usually separate applications, but our global geographic intentions included landscape modelling as well as viewing. Thus different operating modes were necessary to avoid operator disorientation. This formed the basis of the Manipulator module, where the same gesture had to be mapped to different actions depending on the intention.
Fig. 9: Prototype “Pipes” application.
Relative success with the early marine applications emphasized the generality of the viewing/manipulation system, and forced a redesign separating GeoScene from the specific application. Other applications included 3D terrain and flow modelling projects, and 3D volumetric modelling of oceanographic data, which were standardized on GeoScene, and some surprising offshoots: a trial sub-road utility modelling system was developed in one day, for example, to indicate potential collisions between various pipes, cables and manholes (Fig. 9). Space precludes more illustrations.
Finally, the Marine GIS was enhanced by the inclusion of real features and marine aids, and by improved (perhaps new) 3D cartographic symbolism. Interest is being expressed by the HK and UK governments, and private industry. Our preliminary conclusions are that many applications exist for 3D dynamic modeling of real world situations, if simple enough tools exist. We hope that our work is a contribution to this.
Acknowledgements We thank the Hong Kong Research Grants Council for supporting this research (project PolyU 5068/00E).
References
Blinn JF (1977) A homogeneous formulation for lines in 3-space. Computer Graphics 11:2, pp 237-241
Foley JD, van Dam A, Feiner SK and Hughes JF (1990) Computer graphics, principles and practice, second edition. Addison-Wesley, Reading, Massachusetts
Ford SF (2002) The first three-dimensional nautical chart. In: Wright D (ed) Undersea with GIS. ESRI Press, pp 117-138
Greeley A (1986) God game. Warner Books, New York
International Hydrographic Bureau (2000) IHO transfer standard for digital hydrographic data, edition 3.0. Special publication No. 57
International Hydrographic Bureau (2001) Regulations of the IHO for international (INT) charts and chart specifications of the IHO
Mantyla M (1988) An introduction to solid modelling. Computer Science Press, College Park, MD
Rohlf J and Helman J (1994) IRIS Performer: a high performance multiprocessing toolkit for real-time 3D graphics. Proceedings of SIGGRAPH 94, pp 381-395
Roos T (1993) Voronoi diagrams over dynamic scenes. Discrete Applied Mathematics 43:3, pp 243-259
Strauss PS and Carey R (1992) An object-oriented 3D graphics toolkit. In: Catmull EE (ed) Computer Graphics (SIGGRAPH '92 Proceedings), pp 341-349
The United Kingdom Hydrographic Office (2001) Admiralty sailing directions – China Sea Pilot volume I, fifth edition. United Kingdom National Hydrographer, Taunton
Woo M (1999) OpenGL(R) programming guide: the official guide to learning OpenGL, version 1.2. Addison-Wesley Pub Co
Local Knowledge Doesn’t Grow on Trees: Community-Integrated Geographic Information Systems and Rural Community Self-Definition Gregory Elmes1, Michael Dougherty2, Hallie Challig1, Wilbert Karigomba1, Brent McCusker1, and Daniel Weiner1. 1 Department of Geology and Geography, PO Box 6300, West Virginia University, Morgantown, WV 26506-6300, USA. 2 WVU Extension Service, PO Box 6108, Morgantown, WV 26506-6108 USA.
Abstract The Appalachian-Southern Africa Research and Development Collaboratory (ASARD) seeks to explore the integration of community decision-making with GIS across cultures. Combining geospatial data with local knowledge and the active participation of the community creates a Community-Integrated Geographic Information System (CIGIS) representing and valuing themes related to community and economic development. The intent is to integrate traditional GIS with the decision-making regime of local people and authorities to assist them in making informed choices and to increase local participation in land use planning, especially within economically disadvantaged communities. Keywords GIS and Society, Participatory GIS, Community-Integrated GIS, Local Knowledge.
1 Introduction This paper addresses how a community defines and characterizes itself and how such information might be included within a GIS to assist the decision process associated with local development projects. The first question is simple: Where does geographical information about a community reside? Is it in the property books? In the maps and charts? The legal deeds? The land surveys? Or does it reside with the people? Clearly it is all of these
and more, but the local geographical knowledge of residents is often omitted or is the last to be considered in development planning. For development planning in West Virginia, geographical information about communities frequently is reduced to the standard layers of digital geospatial data. GIS typically include five basic components: people, data, procedures, hardware, and software, which are designed to analyze and display information associated with parcels of land, places or communities. Conventional GIS handle information only from formal sources. Thus GIS is seen to represent the world through an “objective” lens by presenting official geospatial data and statistics. This function is important and meaningful, but it is incomplete. Descriptions and understanding of the community by the people of that community are too often missing. Local knowledge can be conveyed along with the standard GIS layers, however, and incorporating it enhances the ability of the system to serve as a more effective platform for communication and debate. No longer does the GIS report only on how experts and outsiders define the community; it may also include what those who live in the community and make the landscape alive “know” and even “feel” about that land. A Community-Integrated Geographic Information System (CIGIS) tries to accomplish this task by assimilating self-definitions and human experience of a place. A CIGIS is a hybrid of the formal and the familiar. The statistical data are important, but so are the informal perceptions, images, and stories of the community’s people. All information is necessary for a fuller understanding of place. This paper examines the concepts of CIGIS in a case study of the Scotts Run Community in Monongalia County, West Virginia (USA), one of three West Virginia University sites for the Appalachian-Southern Africa Research and Development Collaboratory (ASARD). The evolving process of CIGIS has guided the initial fieldwork over the last two years and continues to be used to modify the research plan. The initial sections of this paper discuss the derivation of CIGIS concepts within the objectives of ASARD. After a foundation for discussion has been established, the development of the Scotts Run project will be examined in detail and the results to date critiqued. 2 Community-Integrated GIS Advances in GIS from its rudimentary origins to its present state have reflected advances in the associated technology. Significantly for this research, qualitative information can now be incorporated into a system that heretofore has been dominated by quantitative spatial data. Successful
integration of public input and qualitative data has resulted in participatory GIS (Craig et al. 2002). Some participatory GIS undertakings have sought to improve access to or understanding of data. A joint mapping venture helped provide a better understanding of the Herbert River catchment area in northeastern Queensland (Walker et al. 1998). In another setting, the ability of GIS to examine various scenarios and outcomes was the focal point of public visioning sessions used to help develop plans for the Lower West Side of Buffalo, N.Y. (Krygier 1998). The inclusion of audio and video recordings along with geovisualization in traditional GIS has been advocated to improve public participation in design and planning efforts (Howard 1998). The NCGIA Research Initiative 19 provided several roots for CIGIS (Weiner et al. 1996). The report “GIS and Society: The social implications of how people, space, and environment are represented in GIS” proposed new methods for including and representing qualitative spatial data in GIS. In 2001, the European Science Foundation and National Science Foundation sponsored a workshop on access and participatory approaches in using geographic information, documenting progress since the NCGIA initiative (Onsrud and Craglia 2003). This focus on fuzzy, ambiguous and intangible spatial data was expanded during studies of the contentious issues of land reform in South Africa (Weiner and Harris 2003). CIGIS extends the capacity of the “expert” technology of GIS to include people and communities normally on the periphery with respect to politics and spatial decision-making, by incorporating local knowledge in a variety of forms – such as mental maps, images, and multimedia presentations – alongside the customary geometric and attribute data. As a result, CIGIS can pose questions that local community participants consider important and broaden access to digital spatial technology in the self-determination of their future. Local knowledge differentiates and values aspects of the landscape that are deemed socially important. It helps a community develop its own definition based upon its own knowledge, alongside the more formal information generally available to official bodies.
3 CIGIS and ASARD The creation of ASARD in 1999 expanded the effort to broaden the participatory scope of GIS activities. The project involves West Virginia University, the University of Pretoria in South Africa, and the Catholic University of Mozambique. The international research collaboration connects
Appalachia with Southern Africa, seemingly distinct regions, which, in spite of their evident differences, share some common problems, including development at the rural-urban interface, limited access to social and economic resources, external ownership or control of the land, and a lack of influence by local residents on most development-related decisions. Thus, each site aims to use CIGIS to investigate spatial aspects of local and regional land-use decisions, and patterns of uneven development (For more information, see: A-SARD Website undated; Objectives). 3.1 ASARD in Scotts Run In West Virginia, the focus of the work has been the Scotts Run watershed of northern Monongalia County. The study area defined as the ‘Scotts Run Transect’ extends west from the Monongahela River to Cass and New Hill settlements, and north to the Pennsylvania state line (see http://www.up.ac.za/academic/centre-environmentalstudies/Asard/mapsUWV.htm for a study area map). The collaboratory’s principal effort has been an examination of land use and natural resources issues in the context of social, and particularly, physical development. Local knowledge of the particulars of land, property ownership, tenure, traditional land uses and accessibility characteristics has been acquired and used in the construction of a CIGIS, which includes a strong (re)emphasis on the human component of GIS. Residents are integrated as local experts; ideally they contribute data and shape the objectives of inquiry and analysis. Founded as a mining camp area across the Monongahela River from Morgantown WV, Scotts Run dates from the early 20th century (Ross 1994). While coal was king, the mining community became racially and ethnically diverse as the demand for labor outstripped the local supply. The community has been exceptionally impoverished from the 1920’s onwards, though it has had some spells of relative prosperity, contingent on the state of the national economy and the demand for coal. As part of a social movement, the Methodist Church opened a Settlement House in 1922 that, even today, is much in use in the provision of basic necessities. One year later, the Presbyterian Church established a mission in the area, commonly referred to as “The Shack”, which continues to serve the needs of the poor and immigrant mining population (Lewis 2002). These centers provided foci from which the research team was able to introduce itself into the community. CIGIS studies in the area began by orientation, familiarization tours around the vicinity, and windshield surveys. Team members walked and
drove through the communities, visited the few remaining business establishments, and spoke with people on the streets. During these visits, an informal local leader was identified – Mr. Al Anderson, the chair of the Osage Public Service District (PSD) and sometime seeker of public office. As a shoe repair shop operator, a gospel singer, and a youth club counselor, Mr. Anderson was well connected and respected in the community and became the primary contact for the West Virginia ASARD Team. During several visits, Mr. Anderson has provided the team with local insight, copies of the local newsletter, The Compass, and connections to other residents. His role as champion for CIGIS has become apparent. Discussions with Mr. Anderson and others have led to the identification of sources and collection of local knowledge. A contentious plan for sewer installation, the first in the district, will have profound effects on the social and environmental characteristics of the area. The $8.2 million project began in November 2003. Service is to be established along the main roads within one year, on the secondary roads within five years, and will serve approximately 1,000 households when completed (Henline 2003). From the researchers’ viewpoint in the University community across the river, sewerage service seemed not only essential, but uncontroversial. The plain need for the sewer system was reinforced as team members were shown, and mapped, the unofficial discharge points of household “sewers” directly into Scotts Run – which can be a dry stream bed in the summer time. Yet within the Scotts Run community, this seemingly indispensable sewer project has caused deep concern and even opposition. Mr. Anderson helped organize the initial public meeting in June 2001, where the team learned of the economic difficulties many residents face with the cost of sewer connections and regular sewerage charges. Physical displacement is also a possibility. In some localities, the majority of residents are renters whose future tenure has been destabilized by the sewer project. County tax records in the CIGIS confirm the extremely high proportion of absentee landlords. The concentration of land ownership means that a few families stand to gain through the infrastructure improvements. Land-holding corporations own coal-producing lands that are unlikely to be mined – a large proportion of the area – which therefore represent developable land. Further development pressures are being brought to bear by the construction of a new bridge and multi-lane highway access to Morgantown, the county’s largest urbanized area. The residents of Scotts Run fear that the land owners are in a position to control the development that will follow the sewer. A GIS was constructed with available geospatial data. Existing data included USGS 1:24000 DLG framework layers, 30 meter DEM, 1:24000 digital orthophotographs (1997), and various SPOT and Landsat-TM images classified for land cover. Detailed sewer plans were obtained from the
PSD. Unfortunately the Monongalia County tax assessor’s office still produces manual parcel (cadastral) maps. The WV GIS Technical Center had created digital versions of tax maps for a mineral lands assessment project in 1999, but these had no formal ground controls and were extremely difficult to georegister with other layers. The lack of accurate digital cadastral data, and a similar lack of current land use data, indicated which spatial data would begin to be acquired. Eventually aerial photographs were retrieved from state and university archives for three intervals: 1930, 1957, and 1970. These photographs have been rectified, georegistered and mosaiced. A time-series of land cover change has been created, but much work remains in establishing a satisfactory and consistent classification of land use. Large scale maps of the Scotts Run Transect have been presented to Mr. Anderson for use in his leadership role with the Osage PSD on the area sewer project. In-depth interviews were conducted for a random sample of 57 households in the study area in June and July 2002. Survey respondents were almost equally divided between newcomers and long-time residents. Most respondents revealed that there was no single shared issue. They indicated that they were near coal mines, that houses in the area were run down, and that flooding was a recurrent problem. Despite half reporting income levels of $20,000 or less, a large majority of respondents said they did not receive any financial assistance. Incomes were much below the median household income for the state or nation ($29,696 and $41,994 respectively, U.S. Census Bureau, 2003). More than 10 percent of the respondents were African-American, higher than the overall minority population for the county or the state (7.8 percent and 5.0 percent respectively, U.S. Census Bureau, 2003). Additionally, about 64 percent of respondents owned their residence, which, although high for an impoverished area, is below the national and state averages (66.2 percent and 75.2 percent respectively, U.S. Census Bureau, 2003), reinforcing observations made to team members during meetings and informal discussions related to property ownership and the eventual financial beneficiaries of the sewer project in the area. Progress on the project continues to focus on completing the critical layers and attributes of the conventional GIS database. The 2000 Census of Population and Housing data has been loaded. Ancillary information includes hydrologic, water quality, and mining information, as well as locationally-referenced multimedia presentations. The survey results and narratives are also incorporated into the database to build a geographic representation of place in the community. Maintaining individual confidentiality of the survey records is proving to be a challenge. Appropriate geographic masking techniques are required. Finally, and most importantly, a further series of public meetings is necessary to
engage the population more deeply in the design, refine the initial feedback, and extend the use of the CIGIS. While we have gathered initial quantitative and qualitative information, it remains necessary to find acceptable means through which the citizens at large can access and handle the spatial data for their own ends.
4 CIGIS and Development Issues Up to now the CIGIS work of the ASARD project has helped enhance the development debate in Scotts Run in several ways. First, it has provided a complete, large-scale, updateable, seamless digital map of the area, which can be reproduced for customized needs. The Osage PSD now has an alternative source of information and does not have to rely on that of government institutions, developers, and land owners who might place their own interests above those of the community and its residents. The PSD did not have to use its limited resources to acquire this information. Second, the project continues to provide background information, such as census, land tenure and land cover data to the group. This information helps augment the base knowledge of the residents. Furthermore, the GIS products have the ability to bring people together to discuss matters related to their community and their own vision of its development. This capacity to stir community involvement will be augmented through a new series of public meetings. The project should be able to build on the momentum of interest generated by the sewer project to become more broadly focused on community development issues. These include the identification and spatial delineation by the community of characteristics that would enhance or detract from it, the types of development that would be beneficial or detrimental to the area, and how acceptable development could be tailored to minimize costs and maximize benefits to community residents. From the point of view of incorporating and sharing local knowledge, the team recognizes that there remains much to be done. While it has been relatively easy to assimilate a range of qualitative information through ‘hot links’, multimedia, and the like, the power of updating and maintaining the data currently resides with the researchers and not within the community. Several technical options for distributed GIS operation are now available through Internet mapping, such as Map Notes in ESRI’s ARCIMS ™. But access to the Internet in Scott’s Run is constrained to low bandwidth telephone modems in schools and community buildings, such as the Settlement House and the Shack. CIGIS must place acceptable means of access
and control in the hands of residents. Even tools such as ArcReader™ and similar freeware viewers are unlikely to overcome the lack of access to, and relative unfamiliarity with, computers and software. Since economic and socio-cultural conditions are unlikely to support the mass adoption of broadband Internet access in the area, solutions are more likely to be found in increased social interaction and low end technology.
5 Project Transferability and Related Lessons So long as the data and technical requirements are met, there is a high degree of project transferability. In other words, sufficient digital geospatial data to create a standard GIS, along with the hardware, software, and technical expertise have to exist in a community. More critical is the ability to gain entrance into a community and to be able to engage public opinion and sentiment. University researchers are always “others” especially where social, cultural, and economic characteristics mark us as different. A further intangible element is the level of desire of the local population to exercise greater control over the future of their community. Several lessons have been learned during this process on ways to improve the use of CIGIS. The first and perhaps the most important lesson is that there needs to be a project plan in place before starting. Clearly, the research cannot be so academic in its objectives that the results will be of no practical value to the community. A team cannot go into a community and say “We are from the university and we are here to help you.” That does not mean the plan cannot change. In the Scotts Run study for example, health issues have emerged as less central than originally anticipated, principally because the residents revealed more urgent priorities. Meanwhile, land tenure and access to natural resources, such as game lands, have proved to be more important than originally anticipated A second lesson is that there should be a key informant or champion who provides a liaison for the research team. This person should have the respect and the trust of the community and be able to lend credibility through that reputation to the CIGIS. In the Scotts Run study, the key informant was identified very early on and has helped frame the local situation for the team and assist in involving the community in the process. A third lesson is that there must be others besides the key informant working to ensure project success. As a major participant the project may overemphasize their personal perspective, unintentionally or otherwise. Key informants may quickly become dominant actors and shield the research team from other candidate participants. To ensure balance it is necessary to
involve additional members of the community. With respect to the Scotts Run study, the team placed too much initial reliance on a single contact. Efforts to identify secondary contacts have so far met with limited success. A fourth lesson is that contact, once made, must be persistent. Not communicating on a frequent basis with the community members can have a devastating impact on momentum and trustworthiness. If the team is not visible in the community on a consistent basis, people may believe that it is there only for its own purposes. In Scotts Run, contact between the team and the community occurred more frequently at the beginning of the project, a natural outcome of the need for initial links. The need to add members to the research team and bring them “up to speed” has led to some unforeseen delays and sporadic communication. A fifth lesson is that community members should be engaged in multiple ways as the local knowledge is being developed. Just as people learn in different ways, they communicate in different ways. Not everyone feels comfortable speaking in public, talking to strangers, being videotaped or photographed, and writing responses to questions. In Scotts Run there have been public meetings, informal conversations and discussions, photography, and sketch mapping, as well as the formal survey instrument, as varied means of gaining information and insight on the area.
6 Conclusions This project shows the potential of using CIGIS in community development. The Scotts Run CIGIS can take the next step as it adds the dimension of control, incorporating perceptions and local spatial knowledge into a database and using them as a tool for the community to define and examine itself and its values. So far, the Scotts Run undertaking has succeeded to the extent of bringing a variety of geographically-referenced information to a greater number of people. It has also shown that it has the ability to create a powerful tool for examining the community and to represent what residents think about the potential changes brought about by development. But the project is a long way from becoming a CIGIS. Technological barriers exist that present hurdles to improved community participation and self-management. Even after two years, the research team is not yet fully integrated within the community, and the community does not yet have sufficient access to suitable computer technology to activate their agenda independently. Basic questions regarding the spatial nature and representation of specific local knowledge remain to be investigated. Two further studies of the nature,
acquisition and inclusion of local knowledge are underway. It is anticipated that these additional studies will yield the additional information necessary to enable CIGIS to help residents direct development activities and opportunities in their community.
References
A-SARD Web Site (Undated) A-SARD research at a glance. www.up.ac.za/academic/centre-environmental-studies/Asard/research.htm accessed May 15, 2004
Craig WJ, Harris TM, Weiner D eds (2002) Community participation and geographic information systems. Taylor and Francis, London
Dougherty M, Weiner D (2001) Applications of community integrated geographic information systems in Appalachia: a case study. Presented at the Appalachian Studies Association Annual Conference, March 30-April 1, 2001, Snowshoe, West Virginia
Haywood I, Cornelius S, Carver S (1998) An introduction to geographic information systems. Pearson Education Inc, New York
Henline J (2003) Scotts Run residents look forward to sewer service: project needs 20 more takers for green light. The Dominion Post June 25, 2003. Website: www.dominionpost.com/a/news/2003/06/25/ak/ accessed July 8, 2003
Howard D (1998) Geographic information technologies and community planning: spatial empowerment and public participation. Presented at the project Varenius specialist meeting, Santa Barbara, California, October 1998. www.ncgia.ucsb.edu/varenius/ppgis/papers/howard.html accessed May 15, 2004
Krygier JB (1998) The praxis of public participation GIS and visualization. Presented at the project Varenius specialist meeting, Santa Barbara, California, October 1998. www.ncgia.ucsb.edu/varenius/ppgis/papers/krygier.html accessed July 1, 2003
Lewis AL (2002) Scotts Run: An introduction. Scott’s Run writing heritage project website. www.as.wvu.edu/~srsh/lewis_2.html accessed July 8, 2003
Mark DM, Chrisman N, Frank AU, McHaffie PH, Pickles J (1997) The GIS history project. Presented at UCGIS summer assembly, Bar Harbor, Maine, June 1997. http://www.geog.buffalo.edu/ncgia/gishist/bar_harbor.html accessed May 15, 2004
Miller HJ (in press) What about people in geographic information science? In: Fisher P, Unwin D (eds) Re-Presenting Geographic Information Systems. John Wiley, London
Onsrud HJ, Craglia M, eds (2003) Introduction to the second special issue on access and participatory approaches in using geographic information. URISA Journal 15, 1. http://www.urisa.org/Journal/APANo2/onsrud.pdf accessed May 15, 2004
Ross P (1994) The Scotts Run coalfield from the great war to the great depression: a study in overdevelopment. West Virginia History 54: 21-42. www.wvculture.org/history/journal_wvh/wvh53-3.html accessed July 8, 2003
US Census Bureau (2003) West Virginia: state and county quick facts. quickfacts.census.gov/qfd/states/54000.html and quickfacts.census.gov/qfd/states/54/54061.html accessed July 8, 2003
Walker DH, Johnson AKL, Cottrell A, O’Brien A, Cowell SG, Puller D (1998) GIS through community-based collaborative joint venture: an examination of impacts in rural Australia. Presented at the project Varenius specialist meeting, Santa Barbara, California, October 1998. www.ncgia.ucsb.edu/varenius/ppgis/papers/walker_d/walker.html accessed July 1, 2003
Weiner D, Harris TM (2003) Community-Integrated GIS for land reform in South Africa. URISA Journal (on-line). www.urisa.org/Journal/accepted/2PPGIS/weiner/community_integrated_gis_for_land_reform.htm accessed May 15, 2004
Weiner D, Harris TM, Burkhart PK, Pickles J (1996) Local knowledge, multiple realities, and the production of geographic information: a case study of the Kanawha valley, West Virginia. NCGIA Initiative 19 paper. www.geo.wvu.edu/i19/research/local.htm accessed July 1, 2003
A Flexible Competitive Neural Network for Eliciting User’s Preferences in Web Urban Spaces Yanwu Yang and Christophe Claramunt Naval Academy Research Institute BP 600, 29240, Brest Naval, France. Email: {yang, claramunt}@ecole-navale.fr
Abstract: Preference elicitation is a non-deterministic process that involves many intuitive and not well-defined criteria that are difficult to model. This paper introduces a novel approach that combines image schemata, affordance concepts and a neural network for the elicitation of user’s preferences within a web urban space. The selection parts of the neural network algorithms are achieved by a web-based interface that exhibits image schemata of some places of interest. A neural network is encoded and decoded using a combination of semantic and spatial criteria. The semantic descriptions of the places of interest are defined by degrees of membership to predefined classes. The spatial component considers contextual distances between places and reference locations. Reference locations are possible locations from where the user can act in the city. The decoding part of the neural network algorithms ranks and evaluates reference locations according to user’s preferences. The approach is illustrated by a web-based interface applied to the city of Kyoto. Keywords: image schemata, preference elicitation, competitive neural network, web GIS
1 Introduction The World Wide Web (web) is rapidly becoming a popular information space for storing, exchanging, searching and mining multi-dimensional information. In the last few years, much research has been oriented towards the development of search engines (Kleinberg 1999), the analysis of web communities (Greco et al. 2001), and the statistical analysis of web content, structure and usage (Madria et al.
1999, Tezuka et al. 2001). Nowadays, the web constitutes a large repository of information where the range and level of services offered to the user community is expected to increase and be enriched dramatically in the next few years. This generates many research and technical challenges for information engineering science. One of the open issues to explore is the development of unsupervised mechanisms that facilitate manipulation and analysis of information over the web. This implies approximating user’s preferences and intentions in order to guide and constrain information retrieval processes. Identifying user’s preferences over a given domain of knowledge requires either observing user’s choices or directly interacting with the user with pre-defined questions. A key issue in preference elicitation is the problem of creating a valid approximation of user’s intentions with a few information inputs. The measurement process consists in the transformation of user’s intentions towards a classifier or regression model that ranks different alternatives. Several knowledge-based algorithms have been developed for preference elicitation, from pairwise-based comparisons to value functions. An early example is the pairwise comparison algorithm, applied on the basis of ratio-scale measurements that evaluate alternative performances (Saaty 1980). Artificial neural networks approximate people’s preferences under certainty or uncertainty conditions using several attributes as an input and a mapping towards an evaluation function (Shavlik and Towell 1989, Haddawy et al. 2003). Fuzzy majority is a soft computing concept that provides an ordered weighted aggregation where a consensus is obtained by identifying the majority of user’s preferences (Kacprzyk 1986, Chiclana et al. 1998). Preference elicitation is already used in the e-commerce evaluation of client profiles and habits (Riecken 2000, Schafer et al. 1999), in flight selection using value functions (Linden et al. 1997), and in apartment finding using a combination of value functions and user feedback (Shearin and Lieberman 2001). This paper introduces a novel integrated approach, supported by a flexible and interactive web GIS interface, which applies image schemata and neural networks to preference elicitation in a web GIS environment. We introduce a prototype whose objective is to provide a step towards the development of assisted systems that help users to plan actions in urban spaces, and where the domain knowledge involved is particularly diverse and stochastic. This should apply to the planning of tourism activities where one of the complex problems faced is the lack of understanding of the way tourists choose and arrange their activities in the city (Brown and Chalmers 2003).
The web-based prototype developed so far is a flexible interface that encodes user’s preferences in the selection of places of interest in a web urban space, and ranks the places that best fit those preferences according to different criteria and functions. Informally, a web urban space can be defined as a set of image schemata, spatial and semantic information related to a given city, and presented to the user using a web interface. The web-based interface provides the interaction level where user’s preferences are encoded using an image schemata selection of the places that present an interest for the user. Those places are classified using fuzzy quantifiers according to predefined degrees of membership to some classes of interest. For example, a temple surrounded by a garden is likely to have high degrees of membership to the classes garden and temple, relatively high to a class museum and low to a class urban. The second parameter considered in the places ranking process is given by an aggregated evaluation of the proximity of those places to some reference locations. The computational part of the prototype is supported by a competitive back propagation neural network. First, the encoding part of the neural network returns the best location, according to the elicitation of the user’s preferences, from where she/he would like to plan her/his actions in the city. Without loss of generality, those reference locations are represented by a set of hotels distributed in the city. Secondly, the decoding and ranking part of the neural network returns the places of reference ranked as a function of their semantic and spatial proximity to the user’s preferences. The prototype is applied to the city of Kyoto, a rich historical and cultural environment that provides a high degree of diversity in terms of places of interest. The remainder of the paper is organised as follows. Section 2 introduces the modelling principles of our approach, and the concepts of places, image schemata and affordances. Section 3 develops the principles of the neural network algorithm and describes the different algorithms implemented so far. Section 4 presents the case study and the Kyoto finder prototype. Finally, Section 5 concludes the paper.
2 Modelling principles We consider the case where the user has little knowledge of a given city. Places are presented to the user using image schemata in order to approximate her/his range of interests. Image schemata are recurring imaginative patterns that help humans to comprehend and
structure their experience while moving and acting in their environment (Johnson 1987). They are closely related to the concept of affordance, which qualifies the possible actions that objects offer (Gibson 1979). Image schemata and affordance have already been applied to the design of spatial interfaces to favour interaction between people and real-world objects (Kuhn 1996). We apply those two concepts to the selection of the places that are of interest for the user, assuming that those image schemata and affordances relate to the opportunities and actions she/he would like to take and expect in the city. Places are represented as modelling objects classified semantically and located in space. Let us consider a set of places X={x1, x2, …, xp}. A place xi is described by a pair of coordinates in a two dimensional space, and symbolised by an image schemata that acts as a visual label associated to it. The memberships of a place xi with respect to some thematic classes C1, C2, …, Ck are given by the values x_i^1, x_i^2, …, x_i^k that denote some fuzzy quantifiers, with 1 ≤ i ≤ p. The value x_i^h denotes the degree of membership of xi to the class Ch and it is
bounded by the unit interval [0,1], with 1 ≤ h ≤ k. A value x_i^h that tends to 0 (resp. 1) denotes a low (resp. high) degree of membership to the class Ch. A place xi can belong to several classes C1, C2, …, Ck at different degrees, and the sum of the membership values x_i^1, x_i^2, …, x_i^k can be higher than 1. This latter property reflects the fact that some classes are semantically close, i.e. they are not semantically independent. This is exemplified by the fact that a place xi with a high degree of membership x_i^h to a class Ch is likely to also have high membership values with respect to the classes that are semantically close to Ch. Let us consider some places of interest in the example of the city of Kyoto. We classify them according to a set of classes {C1, C2, C3, C4} with C1=’Museum’, C2=’Temple’, C3=’Garden’, C4=’Urban’. The image schemata presented in Figure 1 illustrates the example of the Toji Temple in Kyoto, labelled as x1. This photograph exhibits a view of the temple surrounded by a park. This can be intuitively interpreted by a relatively high membership to the classes C1, C2 and C3 (one can remark a semantic dependence between the classes C1 and C2), and a low membership to the class C4. Those degrees of membership are approximated by fuzzy qualifiers that are predefined within the web prototype (Figure 1).
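A small worked illustration of the membership representation just described, with invented numeric values for two Kyoto places, is given below; it shows that the membership values of one place may sum to more than 1 when classes are semantically close.

```python
# A small worked illustration (our own, with invented values) of the fuzzy
# membership representation described above, for the classes Museum, Temple,
# Garden and Urban.
CLASSES = ["Museum", "Temple", "Garden", "Urban"]

places = {
    # x1: Toji Temple - high membership to Temple and Garden, fairly high to
    # the semantically close class Museum, low to Urban.
    "Toji Temple":   {"Museum": 0.7, "Temple": 0.9, "Garden": 0.8, "Urban": 0.2},
    "Kyoto Station": {"Museum": 0.1, "Temple": 0.0, "Garden": 0.1, "Urban": 0.9},
}

for name, memberships in places.items():
    vector = [memberships[c] for c in CLASSES]        # (x_i^1, ..., x_i^k)
    # the sum may exceed 1 because the classes are not semantically independent
    print(name, vector, "sum =", round(sum(vector), 2))
```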
3 Back propagation neural network 3.1 Contextual distances and contextual proximities We assume no prior knowledge of the city presented by the web interface, neither experiential nor survey knowledge1. Although those places are geo-referenced, this information is not presented to the user, in order not to interfere with the approximation of her/his preferences. The web interface encompasses information either explicitly (image schemata of the representative places) or implicitly (location of the places and the reference locations in the city, proximity between them). The proximity between two locations in an urban space is usually approximated as an inverse of the distance factor. We retain a contextual modelling of the distance and proximity between two places. This reflects the fact, observed in qualitative studies, that the distance from a region A to a distant region B should be magnified when the number of regions near A increases, and vice versa (Worboys 1995). The relativised distance introduced by Worboys normalises the conventional Euclidean distance between a region A and a region B by a dividing factor that gives a form of contextual value to that measure. This dividing factor is the average of the Euclidean distance between the region A and all
1 Experiential knowledge is derived from direct navigation experience, while survey knowledge reflects geographical properties of the environment (Thorndyke and Hayes-Roth 1980).
the regions considered as part of the environment. We generalise the relativised distance to a form of contextual distance.
Contextual distance: The contextual distance between a place xi of X={x1, x2, …, xp} and a reference location yj of Y={y1, y2, …, yq} is given by

D(x_i, y_j) = \frac{d(x_i, y_j)}{\bar{d}(x, y)}    (1)

where d(x_i, y_j) stands for the Euclidean distance between xi and yj, and \bar{d}(x, y) for the average distance between the places of X and the reference locations of Y (the definition above gives a form of generalisation of Worboys’s definition of relativised distance, as the dividing factor is here the average of all distances between the regions of one set with respect to the regions of a second set). We also slightly modify the definition of the relativised proximity introduced by Worboys in the same work (1995) by adding a square factor to the contextual distance in the denominator, in order to maximize contextual proximities for small distances (vs. minimizing contextual proximities for large distances), and to extend the amplitude of values within the unit interval. The contextual proximity is defined as follows.
Contextual proximity: The contextual proximity between a place xi of X={x1, x2, …, xp} and a reference location yj of Y={y1, y2, …, yq} is given by

P(x_i, y_j) = \frac{1}{1 + D(x_i, y_j)^2} = \frac{\bar{d}(x, y)^2}{\bar{d}(x, y)^2 + d(x_i, y_j)^2}    (2)
where P(xi, yj) is bounded by the unit interval [0,1]: the higher P(xi, yj), the closer xi is to yj; the lower P(xi, yj), the more distant xi is from yj. 3.2 Neural network principles The web interface provides several flexible algorithms where the input is given by a set of places. Those algorithms encapsulate different forms of semantic and spatial associations between those places and some reference locations. An algorithm output returns the location which is the most centrally located with respect to those
places, and according to those associations. Those algorithms are implemented using a back propagation neural network that complies relatively well with the constraints of our application: unsupervised neural network, no input/output data samples and maximum flexibility with no training during the neural network processing (cf. Freeman and Skapura, 1991 for a survey on neural network algorithms). The back propagation neural network is bi-directional and competitive as the best match is selected. This gives a form of “winner takes all” algorithm. The computation is unsupervised, and the complexity of the neural network is minimal.
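Before turning to the network itself, the contextual distance (1) and contextual proximity (2) of Section 3.1 can be sketched directly; the snippet below is our own illustration, assuming plain Euclidean coordinates for the places and reference locations.

```python
# Our own illustration of the contextual distance (1) and contextual
# proximity (2) of Section 3.1, assuming plain Euclidean coordinates for the
# places and reference locations.
import math

def euclid(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def mean_distance(places, refs):
    """Average Euclidean distance between every place and every reference location."""
    return sum(euclid(x, y) for x in places for y in refs) / (len(places) * len(refs))

def contextual_distance(x, y, d_bar):
    return euclid(x, y) / d_bar                                   # equation (1)

def contextual_proximity(x, y, d_bar):
    return 1.0 / (1.0 + contextual_distance(x, y, d_bar) ** 2)    # equation (2)

places = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]                     # X layer
refs = [(1.0, 1.0), (5.0, 4.0)]                                   # Y layer (e.g. hotels)
d_bar = mean_distance(places, refs)
for j, y in enumerate(refs):
    print("reference", j,
          [round(contextual_proximity(x, y, d_bar), 3) for x in places])
```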
Fig. 2. Neural network principles (the X layer of places and the Y layer of reference locations, linked by the wi,j associations, over the base map)
We initialise the bi-directional competitive neural network using two layers X and Y, where X={x1, x2, …, xp} denotes the set of places and Y={y1, y2, …, yq} the set of reference locations (no semantic criteria are attached to those reference locations, but they can be added to the neural network with some minor adaptations). The neural network has p vectors in the X layer and q vectors in the Y layer (Figure 2). We define a weight matrix W where wi,j reflects the strength of the association between xi and yj, for i=1, …, p and j=1, …, q. Matrix values are initialised during the encoding process, which depends on the algorithm chosen. We give the user the opportunity to choose between several algorithms in order to explore different output alternatives and evaluate the one that is the most appropriate to her/his intentions.
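The following sketch – ours, not the authors' implementation – shows how the weight matrix W and the competitive propagation described in Section 3.3 might be realised for the purely spatial case B, with the winning reference location selected by the largest input value.

```python
# A hedged sketch (not the authors' implementation) of the weight matrix W
# and of the competitive propagation of Section 3.3 for the purely spatial
# case B: the reference location with the largest input value wins.
import math

def proximity(x, y, d_bar):
    d = math.hypot(x[0] - y[0], x[1] - y[1]) / d_bar          # contextual distance
    return 1.0 / (1.0 + d * d)                                # contextual proximity

def propagate(x_input, y_input, w):
    """input(y_j) = y_j * sum_i x_i * w_ij  (cases A and B of the paper)."""
    p, q = len(w), len(w[0])
    return [y_input[j] * sum(x_input[i] * w[i][j] for i in range(p))
            for j in range(q)]

places = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]                 # X layer
refs = [(1.0, 1.0), (5.0, 4.0)]                               # Y layer (e.g. hotels)
d_bar = (sum(math.hypot(x[0] - y[0], x[1] - y[1]) for x in places for y in refs)
         / (len(places) * len(refs)))
w = [[proximity(x, y, d_bar) for y in refs] for x in places]  # weight matrix W (p x q)

x_input = [1, 0, 1]            # the user selected the first and third places
y_input = [1, 1]               # both reference locations are candidates
inputs = propagate(x_input, y_input, w)
winner = max(range(len(inputs)), key=lambda j: inputs[j])     # "winner takes all"
print("inputs:", [round(v, 3) for v in inputs], "-> winner:", winner)
```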
3.3 Neural network encoding The propagation rules of the neural network are based on several semantic and spatial criteria that are described in the cases introduced below. First, propagation ensures selection of the reference location that best fits the user’s preferences according to some spatial and semantic criteria (neural network encoding). Secondly, back propagation to the layer of places ranks the places with respect to the selected reference location (neural network back propagation). Let us first describe the encoding process. An input vector x ∈ {0,1}^p is applied to the layer X and propagated from the layer X to the layer Y. This input vector describes the places of interest considered in the neural network (i.e. xi = 1 if the place is considered in the computation, xi = 0 otherwise). Similarly, an input vector y ∈ {0,1}^q is applied to the layer Y, where yj = 1 if the reference location is considered in the computation, yj = 0 otherwise. The encoding processes provide a diversity of mechanisms in the elicitation of user’s preferences by allowing either
- explicit choice of the user’s places of interest and return of the most centrally located reference location, amongst the ones selected by the user, according to either spatial (cases A and B) or spatial and semantic metrics (case C),
- derivation of the user’s class preferences, elicited from the places selected by the user, and return of the most centrally located reference location, amongst the ones selected by the user, according to some spatial and semantic metrics (case D), or
- explicit definition of the user’s class preferences and return of the best reference location, amongst the ones selected by the user, according to some spatial and semantic metrics (case E).
The information input by the user is kept minimal in all cases: selection of places of interest and selection of reference locations of interest. The criteria used for the propagation algorithm and the input values derived for each reference location in the layer Y are as follows (variables used several times in the formulae are described once). Case A: Contextual distance Based on the contextual distance, the algorithm returns the most centrally located reference location given a set of places. The input vector reflects the places that are selected by a user, those places
corresponding to her/his place preferences. The algorithm below introduces the propagation part and the encoding of the algorithm:

input(y_j) = y_j \sum_{i=1}^{p} x_i D(x_i, y_j),   with w_{i,j} = D(x_i, y_j)        (3)
where x_i = 1 if the place is selected in the input vector, x_i = 0 otherwise; y_j = 1 if the reference location is selected in the input vector, y_j = 0 otherwise; D(x_i, y_j) denotes the contextual distance between the place x_i and the reference location y_j; p denotes the number of places in the layer X and q the number of reference locations in the layer Y.

Case B: Contextual proximity

This algorithm models the strength of the association between the places and the reference locations using contextual proximities (one can remark that case B is negatively correlated with case A):
input(y_j) = y_j \sum_{i=1}^{p} x_i P(x_i, y_j),   with w_{i,j} = P(x_i, y_j)        (4)
where P(x_i, y_j) is the contextual proximity between the place x_i and the reference location y_j.

Case C: Contextual proximity + degrees of membership

The algorithm below finds the most centrally located reference location based on two criteria: contextual proximity and the overall interest of the places considered. This case takes into account both spatial (i.e. proximity) and semantic criteria (i.e. the overall degree of membership to the classes given) to compute the strength of the association between a place and a reference location. Degrees of membership (x_i^1, x_i^2, …, x_i^k) weight the significance of a given place x_i with respect to the classes C_1, C_2, …, C_k. High values of class membership increase the contribution of a place to the input values on the Y layer, and its input(x_i) in return. The propagation part of the algorithm is as follows:
input(y_j) = y_j \sum_{i=1}^{p} x_i P(x_i, y_j) \sum_{h=1}^{k} x_i^h,   with w_{i,j} = P(x_i, y_j) \sum_{h=1}^{k} x_i^h        (5)
where x_i^h stands for the degree of membership of x_i with respect to C_h, and k denotes the number of semantic classes.

Case D: Contextual proximity + degrees of membership + class preferences

The propagation part of the algorithm adds another semantic criterion to the algorithm presented in case C: class preferences, a high-level semantic factor derived from the places selected. Formally, for a given class C_h, its degree of preference p_h with respect to an input vector x ∈ {0,1}^p is evaluated by

p_h = \sum_{i=1}^{p} x_i x_i^h / ( \sum_{i=1}^{p} x_i \sum_{j=1}^{k} x_i^j )        (6)
Those degrees of preference form a class preference pattern g = (p_1, p_2, …, p_k) with respect to the classes C_1, C_2, …, C_k. Unlike the previous cases, all places in X are considered as part of the input vector at the initialisation of the neural network; the places selected by the user are taken into account only to derive her/his class preferences. Input values in the layer of reference locations are derived as follows:

input(y_j) = y_j \sum_{i=1}^{p} P(x_i, y_j) \sum_{h=1}^{k} x_i^h p_h,   with w_{i,j} = P(x_i, y_j) \sum_{h=1}^{k} x_i^h p_h        (7)
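To make the Case D computation concrete, here is a minimal Python sketch of equations (6) and (7); it is an illustrative re-implementation under assumed inputs (random membership and proximity values), not the prototype code:

    import numpy as np

    p, q, k = 4, 3, 2                      # places, reference locations, semantic classes
    rng = np.random.default_rng(0)
    membership = rng.random((p, k))        # x_i^h: membership of place i in class h
    proximity = rng.random((p, q))         # stand-in for P(x_i, y_j)
    x = np.array([1, 0, 1, 1])             # places selected by the user
    y = np.ones(q, dtype=int)              # reference locations considered

    # Eq. (6): class preferences p_h derived from the selected places.
    pref = (x @ membership) / (x @ membership.sum(axis=1))

    # Eq. (7): Case D propagation; all places contribute, preferences weight the classes.
    input_y = y * (proximity * (membership @ pref)[:, None]).sum(axis=0)
    winner = int(np.argmax(input_y))       # best-fitting reference location (cases B-E)
    print(pref, input_y, winner)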
Case E: Contextual proximity + degrees of membership + user-defined class preferences

The approach is close to case D, with the difference that the class preferences are user-defined. Input values in the layer of reference locations are calculated as follows:

input(y_j) = y_j \sum_{i=1}^{p} P(x_i, y_j) \sum_{h=1}^{k} x_i^h p_h^u,   with w_{i,j} = P(x_i, y_j) \sum_{h=1}^{k} x_i^h p_h^u        (8)
where p_h^u denotes a user-defined class preference for class C_h, that is, an integer value given by the user at the interface level. This case gives a high degree of flexibility to the user; it constitutes a form of unsupervised neural network. The reference location y_j with the highest input(y_j) value is the one that is the nearest to the places that are of interest, whether or not those are the ones given by the input vector.

3.4 Neural network decoding and back propagation

The back propagation algorithms are applied to all cases. The basic principle of the decoding part of the neural network is to rank the places of the X layer with respect to the "winning" reference location selected in the layer Y. Output values are determined as follows.

Case A:

y_j(t+1) = 1 if input(y_j) < input(y_j') for all j ≠ j', and y_j(t+1) = 0 otherwise, for j = 1, 2, …, q        (9)

Cases B, C, D and E:

y_j(t+1) = 1 if input(y_j) > input(y_j') for all j ≠ j', and y_j(t+1) = 0 otherwise, for j = 1, 2, …, q        (10)

The patterns y_j(t+1) produced on the Y layer are back propagated to the X layer, thus giving the following input values on the X layer. The consistency of the algorithm is ensured as the decoding is made with the function used in the selective process in the X layer. The place with the best fit is the x_i where x_i(t+2) ≥ x_i'(t+2), with the exception of Case A where the place selected is the x_i where x_i(t+2) ≤ x_i'(t+2), for all i ≠ i'. The other places are ranked according to their input(x_i) values (ranked by increasing values for cases B, C, D and
E, by decreasing values for case A). Input values for the X layer are given by:

Case A:  input(x_i) = x_i \sum_{j=1}^{q} y_j(t+1) D(x_i, y_j)        (11)

Case B:  input(x_i) = x_i \sum_{j=1}^{q} y_j(t+1) P(x_i, y_j)        (12)

Case C:  input(x_i) = x_i \sum_{j=1}^{q} y_j(t+1) P(x_i, y_j) \sum_{h=1}^{k} x_i^h        (13)

Case D:  input(x_i) = \sum_{j=1}^{q} y_j(t+1) P(x_i, y_j) \sum_{h=1}^{k} x_i^h p_h        (14)

Case E:  input(x_i) = \sum_{j=1}^{q} y_j(t+1) P(x_i, y_j) \sum_{h=1}^{k} x_i^h p_h^u        (15)

with x_i(t+2) = input(x_i), where x_i = 1 if the place is selected in the input vector, x_i = 0 otherwise; the y_j(t+1) values are those given by equations (9) and (10) above; and q denotes the number of reference locations in the layer Y. The input functions above support a wide diversity of semantic and spatial criteria. This ensures flexibility in the elicitation process and maximisation of opportunities despite the fact that the user's data inputs are kept minimal.
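A matching sketch of the decoding step for Case D (equations (10) and (14)); the input values are assumed here rather than carried over from an actual encoding run, so this is again illustrative rather than the prototype code:

    import numpy as np

    # Quantities assumed from a Case D encoding step: proximity, membership,
    # class preferences pref and the input values on the Y layer.
    rng = np.random.default_rng(0)
    p, q, k = 4, 3, 2
    membership = rng.random((p, k))
    proximity = rng.random((p, q))
    pref = np.array([0.6, 0.4])
    input_y = rng.random(q)

    # Eq. (10): winner-takes-all on the Y layer (Eq. (9) would use argmin for Case A).
    y_next = np.zeros(q)
    y_next[np.argmax(input_y)] = 1.0

    # Eq. (14), Case D: back propagation of the winner to the X layer.
    input_x = (y_next * proximity * (membership @ pref)[:, None]).sum(axis=1)

    # Rank the places by their input(x_i) values; for cases B-E the best fit is the largest.
    ranking = np.argsort(-input_x)
    print(ranking)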
4 Prototype development

We developed a web-based Java prototype that provides an experimental validation of the neural network encoding and decoding algorithms. The prototype implements all algorithms (cases A and B are merged into an algorithm AB in the interface as they give similar results) plus a variation of algorithm A based on the absolute distance (denoted as algorithm A0 in the interface). The web prototype is applied, as an illustrative example, to the city of Kyoto, an urban context that possesses a high diversity of places. The interface developed so far encodes two main levels of information inputs: places and reference locations. Several places of diverse
interest in the city of Kyoto have been pre-selected to give a large range of preference opportunities to the user. Those places are referenced by image schemata, encoded using fuzzy qualifiers according to predefined semantic classes (urban, temple, garden and museum), and geo-referenced. Reference locations are represented by a list of geo-referenced hotels. Figure 3 presents the overall interface of the Kyoto finder. To the left are the image schemata of the places in the city of Kyoto offered for selection, and to the top-right the list of seven hotels offered for selection; to the bottom-right is the functional and interaction part of the interface. The algorithm proposed by default is Case D, that is, the one based on an implicit elicitation of the user's class preferences.
Fig. 3. Kyoto finder interface

The encoding and decoding parts of the algorithms are encapsulated within the web interface. The interface provides selective access to those algorithms by making a distinction between default options and advanced search facilities. The algorithm applied by default is the one given by case D, where the selection of some image schemata is used to derive the user's preferences (Figure 3). Advanced search facilities offer five algorithms to the user (namely A0, AB, C, D and E, as illustrated in Figure 4). Figure 4 illustrates a case where the application of the algorithms gives different location/hotel winners (with the exception of algorithms AB and C, which give a similar result). Class preferences are explicitly valued by the user when case E is chosen (index values are illustrated in the right-middle of the interface presented in Figure 4).
Fig. 4. Algorithm output examples

After the user's choice of an algorithm, the user's class preferences are derived. In the example of algorithm D illustrated in Figure 4, the ordered list of class preferences is Temple with p_2 = 0.31, Garden with p_3 = 0.27, Museum with p_1 = 0.25, and Urban with p_4 = 0.16. When triggered, the neural network calculates the input values in the Y layer and selects the hotel that best fits the user's preference patterns. Figure 5 summarises the results for the previous example (from left to right: hotels selected by the user, input values in the layer Y, normalised values in the layer Y). The winning reference location (the Ana Hotel) is then propagated back to the X layer, where places of interest are ranked according to the algorithm's value function.
Fig. 5. Place result examples
Finally, the results of the encoding and decoding processes can be displayed to the user on the base map of the city of Kyoto. Figure 6 presents the map display of the previous example processed using algorithm D. The winning hotel (i.e. the Ana Hotel) is the best reference location with respect to the user's class preference pattern. The Ana Hotel is denoted by the central square symbol in Figure 6; the best places selected are the circles connected by a line to that hotel. Each place can be selected by pointing in the interface in order to display the image schemata associated with it; the other squares are the other hotels, while the isolated circles are the initial selection of the user.
Fig. 6. Map results

The objective of the Kyoto finder is to act as an exploratory and illustrative solution towards the development of a web environment that supports elicitation of the user's preferences in a web urban space. The algorithms presented offer several flexible solutions to the ranking of some reference locations with respect to places of interest in a given city. The semantic and spatial criteria can be complemented by additional semantic and spatial parameters, although a desired constraint is to keep the user's input minimal. A second constraint we impose on the encoding and decoding processes is to keep an acceptable level of complexity in order to guarantee a straightforward comprehension of the algorithm results. The outputs given by the system are suggestions offered to the user. Those should allow her/him to explore interactively the different options suggested and
to further explore the web information space to complete the findings of the neural network.
5 Conclusions

The research presented in this paper introduces a novel approach that combines image schemata and affordance concepts for preference elicitation in a web-based urban environment. The computation of the user's preferences is supported by a competitive back propagation neural network that triggers an encoding and decoding process returning the reference location that best fits the user's preferences. Several algorithms provide the encoding and the decoding part of the system. Those algorithms integrate semantic and spatial criteria using a few information inputs. The approach is illustrated by a web-based prototype applied to the city of Kyoto. The modelling and computational principles of our approach are general enough to be extended to different spatially-related application contexts where one of the constraints is the elicitation of user's preferences. The development realised so far confirms that the web is a valuable computational environment to explore and approximate user's preferences. It allows interaction and multiple explorations of and with multi-modal data: visual, semantic, textual and cartographical. The fact that the web is part of a large data repository allows further exploration in the information space. This prototype opens several avenues for further research. The directions still to explore are the integration of training and reinforcement learning in the neural network, the implementation of validation procedures, and multi-user collaboration in the preference elicitation process.
References

Brown B and Chalmers M (2003) Tourism and mobile technology. In: Proceedings of the 8th European Conference on Computer Supported Cooperative Work, 14th-18th September, Helsinki, Finland, Kluwer Academic Publishers.
Chiclana F, Herrera F and Herrera-Viedma E (1998) Integrating three representation models in multipurpose decision making based on preference relations. Fuzzy Sets and Systems, 97: 33-48.
Freeman JA and Skapura DM (1991) Neural Networks: Algorithms, Applications and Programming Techniques, Addison-Wesley, MA.
Gibson J (1979) The Ecological Approach to Visual Perception. Houghton Mifflin Company, Boston.
Greco G, Greco S and Zumpano E (2001) A probabilistic approach for distillation and ranking of web pages. World Wide Web: Internet and Information Systems, 4(3): 189-208.
Haddawy P, Ha V, Restificar A, Geisler B and Miyamoto J (2003) Preference elicitation via theory refinement. Journal of Machine Learning Research, 4: 317-337.
Johnson M (1987) The Body in the Mind: The Bodily Basis of Meaning, Imagination, and Reason. The University of Chicago Press, Chicago.
Kacprzyk J (1986) Group decision making with a fuzzy linguistic majority. Fuzzy Sets and Systems, 18: 105-118.
Kleinberg JM (1999) Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5): 604-632.
Kuhn W (1996) Handling Data Spatially: Spatializing User Interfaces. In: Kraak MJ and Molenaar M (Eds.), SDH'96, Advances in GIS Research II, Proceedings. 2, International Geographical Union, Delft, pp 13B.1-13B.23.
Linden G, Hanks S and Lesh N (1997) Interactive assessment of user preference models: The automated travel assistant. User Modeling, June.
Madria SK, Bhowmick SS, Ng WK and Lim EP (1999) Research issues in web data mining. In: Proceedings of the 1st International Conference on Data Warehousing and Knowledge Discovery, pp. 303-312.
Riecken D (2000) Personalized views of personalization. Communications of the ACM, 43(8): 26-29.
Saaty TL (1980) The Analytic Hierarchy Process, McGraw-Hill, New York.
Schafer JB, Konstan J and Riedl J (1999) Recommender systems in e-commerce. In: Proceedings of the ACM Conference on Electronic Commerce, pp. 158-166.
Shavlik J and Towell G (1989) An approach to combining explanation-based and neural learning algorithms, Connection Science, 1(3): 233-255.
Shearin S and Lieberman H (2001) Intelligent profiling by example. In: Proceedings of the International Conference on Intelligent User Interfaces (IUI 2001), Santa Fe, NM, pp. 145-152.
Tezuka T, Lee R, Takakura H and Kambayashi Y (2001) Web-based inference rules for processing conceptual geographical relationships. In: Proceedings of the 1st IEEE International Web GIS Workshop, pp. 14-24.
Thorndyke PW and Hayes-Roth B (1980) Differences in Spatial Knowledge Acquired from Maps and Navigation, Technical Report N-1595-ONR, The Office of Naval Research, Santa Monica, CA.
Worboys M (1996) Metrics and topologies for geographic space. In: Kraak MJ and Molenaar M (Eds.), SDH'96, Advances in GIS Research II, Proceedings. 2, International Geographical Union, Delft, pp. 365-375.
Combining Heterogeneous Spatial Data From Distributed Sources

M. Howard Williams and Omar Dreza

School of Math & Comp. Sc., Heriot-Watt Univ., Riccarton, Edinburgh, EH14 4AS UK,
Abstract

The general problem of retrieval and integration of data from a set of distributed heterogeneous data sources in response to a query has a number of facets. These include the breakdown of a query into appropriate subqueries that can be applied to different data sources as well as the integration of the partial results obtained to produce the overall result. The latter process is non-trivial and particularly dependent on the semantics of the data. This paper discusses an architecture developed to enable a user to query spatial data from a collection of distributed heterogeneous data sources. This has been implemented using GML to facilitate data integration. The system is currently being used to study the handling of positional uncertainty in spatial data in such a system.

Keywords: Spatial databases, XQL, GML, distributed databases, heterogeneous data.
1 Introduction

The problem of retrieving and integrating information from a set of heterogeneous data sources involves a number of complex operations (El-Khatib et al. 2002). These include the breakdown of a query into appropriate sub-queries that can be applied to different data sources and the integration of the partial results obtained to produce the overall result (MacKinnon et al. 1998). Initially the problem was more difficult to deal with, but progress in communication and database technologies has facilitated solutions. The present situation is characterized by a growing number of
applications that require access to data from a set of heterogeneous distributed databases (Elmagarmid and Pu 1990). This need for access to distributed data is certainly true of spatial data, where there has been growing interest in the retrieval and integration of data from distributed sources. One development in the USA is the National Spatial Data Infrastructure (NSDI), which was developed under the coordination of the Federal Geographic Data Committee (FGDC 2003). This defined the technologies, policies, and people necessary to promote sharing of geospatial data throughout all levels of government. It provides a base or structure of practices and relationships among data producers and users that facilitates data sharing and use. It is a set of actions and new ways of accessing, sharing and using geographic data that enables far more comprehensive analysis of data to help decision-makers. The implementation of Internet GIS requires not only network infrastructures to distribute geospatial information, but also software architectures to provide interactive GIS functions and applications. This paper discusses an architecture developed to demonstrate the integration of heterogeneous spatial data from distributed data sources to support particular decisions.
2 Query process

In order to provide a user-friendly interface that is easily understandable to users, one may develop graphical or natural language interfaces to such a system. A query at this level may involve data from different data sources. Thus the first task is to map a query from this general form into subqueries addressed to specific data sources and expressed in an appropriate query language. For example, a query such as "What roads may be affected in region (xmin, ymin, xmax, ymax) if the level of lake x rises by 3 metres?" may require three different sets of data - a map of the lake, the road network and the DTM data sources. Thus this general query must be translated into three specific sub-queries, each formulated in the language that is understandable by the related database manager. In order to implement this, each data provider needs to develop a wrapper (Zaslavsky et al. 2000) whose purpose is to translate a query from the server language to that of the data source and transform the results provided by the data source to a common format understandable by the server. As this part of the process is not the main focus of our work, a simple approach has been adopted for our implementation.
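The role of such a wrapper can be illustrated with a minimal Python sketch; the class, method and field names are hypothetical and the query syntax is simplified, so this is not the system's actual wrapper interface:

    # Illustrative wrapper: translates a server sub-query into the provider's own
    # language and converts native results into a common format for the server.
    class MapInfoWrapper:
        def translate(self, server_query: dict) -> str:
            # e.g. turn a generic request into an SQL-like statement for this source.
            return 'SELECT * FROM %s WHERE name = "%s"' % (
                server_query["layer"], server_query["name"])

        def to_common_format(self, native_rows):
            # Convert native rows into simple dictionaries the server can merge.
            return [{"id": rid, "geometry": geom} for rid, geom in native_rows]

    wrapper = MapInfoWrapper()
    print(wrapper.translate({"layer": "LAKE", "name": "Loch X"}))
    print(wrapper.to_common_format([("L1", [(328500.0, 653100.0)])]))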
3 Data integration process

Because of the autonomy of the data providers, the data generated from the three data sources mentioned in the previous section will in general have different syntax (i.e. different schemas), and different semantics. In order to integrate them it is necessary to convert them to a common format. For this purpose GML (GML 2001) was chosen. GML has the following properties:
- It is an XML encoding of geography;
- It enables the GI community to leverage the world of XML technology;
- It supports vector mapping in standard Web applications;
- It provides complex features, feature collections and feature associations.
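A wrapper's conversion of a provider's native result set into such a common XML format can be sketched as follows; the element names are simplified stand-ins and do not follow the GML 2.0 schema exactly:

    import xml.etree.ElementTree as ET

    def rows_to_xml(feature_type, rows):
        """Convert (id, coordinate list) rows into a simplified GML-like document."""
        root = ET.Element("FeatureCollection")
        for fid, coords in rows:
            feature = ET.SubElement(root, feature_type, attrib={"fid": str(fid)})
            geom = ET.SubElement(feature, "LineString")
            coord_el = ET.SubElement(geom, "coordinates")
            coord_el.text = " ".join("%.4f,%.4f" % (x, y) for x, y in coords)
        return ET.tostring(root, encoding="unicode")

    print(rows_to_xml("Road",
                      [("Road1", [(329167.858, 655677.3579), (329176.0383, 655571.1892)])]))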
4 Architecture

The architecture described here has been developed and implemented to support access to a distributed collection of heterogeneous spatial data sources. The basic idea behind this architecture is that the user should be able to pose a query which requires data from different sources and that the system will fetch the data and combine it to produce an answer without the user being aware of the different data sources involved. The architecture is based on a client/server approach, with three levels - the client level, the server level, and the data provider level. A system based on this architecture has been developed using Java.

4.1 The Client User Interface

Currently the system uses a very simple Client User Interface (CUI) with a set of templates for fixed scenarios. Since the main focus of our work is on the handling of positional uncertainty this provides a simple and easy to use interface although limited for general use. The CUI is designed to allow the user to specify the keys of his/her query such as the region name, coordinates, etc. For a scenario such as flooding, the user can specify the affected entities (roads, land parcels, buildings, etc), the entity causing the flooding, etc. For a scenario such as utilities maintenance, the user can specify the utilities that may be affected
by the maintenance (water pipe, gas pipe, etc.), the maintaining utility, and key parameters (such as digging width, measurement unit, etc.). Fig. 1 shows an example of the CUI for a flooding scenario query. The CUI presents the template to the user and obtains from the user the values of the parameters. These are then used to generate a query that is sent to the Server. The client also has a browser component for viewing the results retrieved by the Server. As shown in fig. 5 this browser contains GIS tools to help the user to zoom, pan, and select specific features. It also displays the scale of the viewed results and can send the results to a printer to produce hardcopy.
Fig. 1. Client User Interface
4.2 The Server

The Server works as an information mediation service where user queries against multiple heterogeneous data sources are communicated via middleware (mediator) which is responsible for query rewriting, dispatching query fragments to individual sources, and assembling individual query results into a composite query response (Levy et al. 1996; Papakonstantinou et al. 1996; Tomasic et al. 1998; Wiederhold 1992). The proposed architecture of the server contains a number of processes used to analyse the query sent by the client as illustrated in Fig. 2. The server has three main purposes. The first is to break down the query passed
from the client into a simple set of sub-queries, which can be understood by the data provider taking account of the purpose of the client query. This is performed by the KB component that generates the separate subqueries required to satisfy the general query. The second purpose is to determine appropriate data sets, which can be retrieved from the data providers and which will satisfy the user query parameters (scale, locations). Finally, the server must integrate the spatial results sent back to the server from the data providers. Once again the KB component is used to perform this integration. This task will be further illustrated in section 5. The server metadata consists of summary extracts derived from the provider metadata. It contains information about the data sets from all data providers known to the server. This information includes data provider URL address, the scale of the data set, the spatial location covered by the data set, etc., and is used to determine whether a data set is relevant to a specific sub-query and, if it is, how to perform a connection to the data provider. The following example shows part of the server metadata that is retrieved for the lake flood scenario.
Fig. 2. Server Architecture (client part, results integration process, KB, search process, server metadata, spatial analysis control, data provider connector, data provider part)
The KB_XML file contains a set of knowledge relating to the specific data sets that can be used. Each set contains a spatial query related to a particular problem and the optimum scale that can be used in this particular scenario. The KB component takes the parameters of the query provided by the client (e.g. affected entities, etc.) and converts them to an XML query, which is used to fetch the KB_XML file. This query is developed using the XQL standard. More information on this and the XCool library can be found in (Robie 1999). The following is an example of the results retrieved from the KB_XML file with a flooding scenario XML query:

LAKE
25000
<Spatial_Query> SELECT * FROM LAKE WHERE LAKE.name = X
DTM

ROAD
25000
<Spatial_Query> SELECT * FROM ROAD WHERE ROAD.name = X
DTM

25000
<Spatial_Query> SELECT * FROM REGION WHERE REGION.x >= Xmin And REGION.x <= Xmax And REGION.y >= Ymin And REGION.y <= Ymax
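The way such KB entries drive sub-query generation can be illustrated with the following sketch; the field names are hypothetical (the original KB_XML tag names are not shown above), and only the dataset names, the scale and the spatial queries are taken from the fragment:

    # Illustrative, in-memory stand-in for the KB_XML content of the flooding scenario.
    KB = {
        "flooding": [
            {"dataset": "LAKE", "scale": 25000,
             "spatial_query": "SELECT * FROM LAKE WHERE LAKE.name = :name", "needs": ["DTM"]},
            {"dataset": "ROAD", "scale": 25000,
             "spatial_query": "SELECT * FROM ROAD WHERE ROAD.name = :name", "needs": ["DTM"]},
            {"dataset": "REGION", "scale": 25000,
             "spatial_query": ("SELECT * FROM REGION WHERE REGION.x >= :xmin AND REGION.x <= :xmax "
                               "AND REGION.y >= :ymin AND REGION.y <= :ymax")},
        ]
    }

    def sub_queries_for(scenario):
        # Return the (dataset, scale, query) triples the server must dispatch.
        return [(e["dataset"], e["scale"], e["spatial_query"]) for e in KB[scenario]]

    for dataset, scale, query in sub_queries_for("flooding"):
        print(dataset, scale, query)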
The result integration process is responsible for integrating the data views sent to it by the various data providers, corresponding to sub-queries based on the user query, so as to satisfy the client query. This integration is based on a KB which contains procedures for integrating data for specific scenarios. For example, for the scenario used to calculate the effect of a flood on a lake, the lake has to be overlaid with the DTM data. Knowing the height of the lake, we can generate the contour of the flood boundaries; then, by overlaying the generated contour and the roads, the affected roads can be determined.

4.3 Data provider

The main tasks of the data provider part of the architecture are to answer the server query and produce results for the server spatial query. When the server requires a data provider, it connects to the data provider process using
RMI. Each data provider, as shown in Fig. 3, contains two major components: the wrapper and the metadata. The wrapper contains three processes. The first controls the retrieval of the data set and is used to check the location of the data set by querying the data provider metadata. The following is an example of the part of the XML file which relates to the DTM data provider metadata:

DTM
<SCALE>1:50,000
ID_points2
H:\www\DTMdataProvider\DTM\
Edinburgh
3
3
Fig. 3. Data provider
The second process is responsible for applying the spatial query to the spatial database. For example, if the spatial database used is MapInfo, the retrieval process is written in the MapBasic language provided by MapInfo. This could easily be extended to handle other spatial databases. The third process retrieves the results generated by the query. It converts the TAB file generated by the spatial query to a GML file or a text file (in the case of DTM data). This process is implemented using the MITAB library (MITAB 2002). Fig. 4 shows part of the GML file representing the retrieved road network data.

Fig. 4. Part of GML file representing road network data (feature "Road1" in Edinburgh, given as a SpatialReferenceSystem header and a list of coordinate pairs).

Finally, the uncertainty associated with the exact locations of features is
determined and added to the results.
5 Using the approach

The approach described here has been implemented in a system developed to support access to a distributed collection of heterogeneous spatial data sources. To illustrate the operation of the system developed, consider the flooding scenario with a set of data sources. The results of a specific query of this type are shown in Fig. 5.
Fig. 5. Final results
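The integration performed for this scenario (overlaying the lake on the DTM, deriving the flood extent from the raised water level, and intersecting it with the roads, as described in section 4.2) can be sketched as follows; the data structures and geometric tests are simplified placeholders rather than the system's KB procedures:

    # Toy flood-scenario integration on a gridded DTM (cells keyed by (row, col)).
    def flood_boundary(lake_cells, dtm, rise_m):
        """Cells whose elevation is below the raised lake level form the flood extent."""
        lake_level = max(dtm[cell] for cell in lake_cells)
        return {cell for cell, z in dtm.items() if z <= lake_level + rise_m}

    def affected_roads(roads, flood_extent):
        """A road is affected if any of its cells falls inside the flood extent."""
        return [name for name, cells in roads.items() if any(c in flood_extent for c in cells)]

    dtm = {(0, 0): 10.0, (0, 1): 11.0, (1, 0): 12.5, (1, 1): 14.0}
    lake = {(0, 0)}
    roads = {"Road1": [(0, 1), (1, 1)], "Road2": [(1, 0)]}
    print(affected_roads(roads, flood_boundary(lake, dtm, rise_m=3.0)))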
6 Conclusion

This paper is concerned with the problem of combining spatial data from heterogeneous sources. An architecture that has been developed to handle queries involving multiple spatial data sources has been extended to incorporate different forms of uncertainty in the data. A prototype has been implemented to realise this architecture. It was initially based on the use of XML as an encoding standard for the metadata and the KB, and to represent the data transferred in response to requests between different levels of the architecture. The system uses XQL as the query language to retrieve information that is encoded in XML, such as the metadata and the KB. GML has proved to be an excellent notation for representing vector GIS data: it is simple and readable and based on XML Schema. However, there are some limitations in handling large DTM files. Using the GML schema, the data and its geometry can be stored in a single file, which makes it easier to represent data transferred between different spatial data sources.
Acknowledgement

The authors gratefully acknowledge the support of the Biruni Remote Sensing Center, which is providing funding for O. Dreza to conduct this research toward a PhD.
References

El-Khatib HT, Williams MH, MacKinnon LM, Marwick DH (2002) Using a distributed approach to retrieve and integrate information from heterogeneous distributed databases. Computer Journal, 45(4):381-394.
Elmagarmid AK, Pu C (1990) Introduction: special issue on heterogeneous databases (guest editors), ACM Computing Surveys, 22(3):175-178.
FGDC (2003) FGDC, USGS, 590 National Center, Reston, VA 20192. Updated: Thursday, 27-Mar-2003. http://www.fgdc.gov/nsdi/nsdi.html
GML (2001) Geography Markup Language (GML 2.0), OpenGIS Implementation Specification, OGC Document Number: 01-029. http://opengis.net/gml/01-029/GML2.html
Levy A, Rajaraman A, Ordille J (1996) Querying Heterogeneous Information Sources Using Sources Descriptions. Proceedings of the 22nd International Conference on VLDB, pp. 251-262.
MacKinnon LM, Marwick DH, Williams MH (1998) A model for query decomposition and answer construction in heterogeneous distributed database systems. Journal of Intelligent Information Systems, 11: 69-87.
MITAB (2002) MapInfo .TAB and .MIF/.MID Read/Write Library, http://pages.infinit.net/danmo/e00/index-mitab.html.
Papakonstantinou Y, Abiteboul S, Garcia-Molina H (1996) Object Fusion in Mediator Systems. Proceedings of the 22nd International Conference on VLDB, pp. 413-424.
Robie J (1999) XCOOL WWW document, http://xcool.sourceforge.net.
Tomasic A, Raschid L, Valduriez P (1998) Scaling Access to Heterogeneous Data Sources with DISCO. IEEE Transactions on Knowledge and Data Engineering, 10(5):808-823.
Wiederhold G (1992) Mediators in the Architecture of Future Information Systems. IEEE Computer, 25(3):38-49.
Zaslavsky I, Marciano R, Gupta A, Baru C (2000) XML-based Spatial Data Mediation Infrastructure for Global Interoperability, 4th Global Spatial Data Infrastructure Conference, Cape Town, South Africa. http://www.npaci.edu/DICE/Pubs/.
Security for GIS N-tier Architecture

Michael Govorov, Youry Khmelevsky, Vasiliy Ustimenko, and Alexei Khorev

1 GIS Unit, Department of Geography, the University of the South Pacific, PO Box 1168, Suva, Fiji Islands, [email protected]; 2 Computing Science Department, the University College of the Cariboo, 900 McGill Road, Kamloops, BC, Canada, [email protected]; 3 Department of Mathematics and Computer Science, The University of the South Pacific, PO Box 1168, Suva, Fiji Islands, [email protected]; 4 Institute of Computational Technologies, SBRAS, 6 Ac. Lavrentjev Ave., Novosibirsk, 630090, Russia, [email protected]
Abstract

Security is an important topic in Information Systems and their applications, especially within the Internet environment. Security for geospatial data is a relatively unexplored topic in Geographical Information Systems (GIS). This paper analyzes security solutions for Geographical Information Storage Systems (GISS) within an n-tier GIS architecture. The first section outlines the application of the main categories of database security to the management of spatial data. These categories are then analyzed from the point of view of application within GIS. A File System within Database (FSDB) with traditional and new encryption algorithms is proposed as a new GISS solution. An FSDB provides safer and more secure storage for spatial files and supports a centralized authentication and access control mechanism in a legacy DBMS. Cryptographic solutions, a topic of central importance to many aspects of network security, are discussed in detail. This part of the paper describes the implementation of several traditional and new symmetric, fast and nonlinear encryption algorithms with fixed and flexible key sizes.
1 N-tier Distributive GIS Architecture

Two major recent tendencies in the development of GIS technology are relevant to security:
1. The first is the adaptation of IT technology, such as n-tier software architecture. Existing GIS solutions started the transition to the Web-distributed and open n-tier architecture a few years ago, but in most existing GIS applications the map server still provides only cartographic rendering and simple spatial data analysis on the client and back-end tiers. Current Web Map Servers are a simplification of a fully functional application server in the middle of the 3-tier industry-standard architecture.
2. The second tendency is the GISS transition from file-based spatial data warehouses to fully functional spatial database solutions employing a DBMS as the storage system within a single-server or distributed environment. The advantages of such a transition are well known to the IT industry. In the global geo-network, large amounts of data are still stored in spatial warehouses as flat files (e.g. in .shp, .tab, .dxf, .img formats), which have single-user access, large file sizes and no transaction-based processing.
Fig. 1. The Feasible GIS n-tier Architecture
The purpose of this article is to analyze security solutions for spatial data management within a GIS n-tier architecture. This section outlines the feasible GIS n-tier architecture and the role of the GISS in storing GIS spatial data. The feasible GIS n-tier architecture is shown in Fig. 1. GIS functionality, data, and metadata can be assigned to various tiers (sometimes called layers) along a network and can be found on the server
side, in one or more intermediate middleware layers, on the back-end, or on the client side. All three tiers can be independently configured to meet the users' requirements and scaled to meet future requirements. The feasible architecture includes a client tier in which user services reside. The client tier is represented by a Web browser or wireless device (thin client), or by a Web browser with Java applets or ActiveX components or a Java application (thick client) [9]. The middle tier is divided into two or more subsystems (layers) with different functions and security features, including SSL encryption, authentication, user validation, a single-sign-on server, and digital signatures. GIS Web services perform specific GIS functions and spatial queries, and can be integrated as part of the middle-tier application server [1]. Spatial components have capabilities for accessing and bundling maps and data into the appropriate format before sending the data back to a client. These components support different functionalities: generating image maps and streaming vector spatial data to the client; returning attribute data for spatial and tabular queries; executing geo-coding and routing functions; extracting and returning spatial data in an appropriate format; searching a spatial metadata repository for documents related to spatial data and services; and running spatial and cartographic generalization techniques. The Data Management Layer (GISS) controls database storage and retrieval. Data access logic describes transactions with a database; data access is normally performed as a functionality of the business logic. Since many spatial data are still stored in file format, their management may be significantly improved by storing the data within a database system. Critical security communication channels for information flows within a classical Application Server are between: a Web browser and a Web server; the Web server and the business logic layer (in the thin and medium client configurations); and the business logic layer and the back-end tier. Attention should also be focused on secure communication between all other distributed components of the middle tier. The first question is how to secure the flowing information; the second, how to maintain access control. Because of the connectionless nature of the Web, security issues relate not only to initial access, but to re-access also. For the case of the thick client, these two problems reduce to how to secure communication between the thick client and the business logic layer.
2 Security Controls within n-tier GIS Architecture

One of the primary reasons for deploying an n-tier system within the Internet environment is security improvement. Thus, application logic in the middle tier can provide a layer of isolation for sensitive data maintained in the spatial database. For GIS applications, the middle tier in an n-tier system can focus on pre-presentation processing and cartographic presentation of spatial data to the user, allowing the back-end tier to focus on management and heavy processing of spatial data. However, n-tier architectures increase the complexity of practical security deployment compared with a 2-tier Client/Server architecture. For a GIS n-tier architecture, a general security framework should address the same requirements as for legacy n-tier systems, which include authentication, authorization, identification, integrity, confidentiality, auditing, non-repudiation, credential mapping, and availability [4, 15]. There are some specifics of spatial data management which concern protecting the confidentiality and integrity of data while in transit over the Internet and when it is stored on internal servers and in databases. This section outlines the general security framework for a GIS Web-based n-tier architecture; in the next sections, solutions for confidentiality protection of spatial data in storage are discussed. A firewall is basically the first choice of defense within a GIS Web-based n-tier architecture. One device or application can use more than one basic firewall mechanism, such as stateful packet filtering, circuit-level gateways, and proxy servers and application gateways. Many configurations are possible for the placement of firewalls, and several layers of firewalls can be added for security [10]. The ideal solution is to provide buffers of protection between the Internet, the GIS Application Server and the spatial database [12]. Most existing Web Map Servers use low-level authentication, which supports minimal security and is based on a password. Cryptographic authentication in the form of digital certificates must be used for stronger authentication. Authentication protection can be implemented within the Web Server, the JSP, servlet or ASP connector, the business logic layer and the back-end tier. The next line of defense in the GIS Application Server is proper access control to business logic components and back-end resources. Authorization services determine what resources and files a user or application has access to. There are at least three main access control models which can be used - mandatory, discretionary and role-and-policy based authorization schemes [5].
If the subsystems of an n-tier architecture have different security infrastructures, they may need to convey authorization information dynamically by propagating it along with an identity. A GIS Application Server can dynamically update users and roles by leveraging an external, centralized security database or service, via an LDAP server. Access control within the spatial database is usually enforced by determining whether a specific user has access to a specific table or file, but not to specific data within the table or file. Such a situation can be interesting for accessing a certain level of multi-detailed representation of spatial features from a spatial multi-scale database. If there is a need to enforce entity-level access control for data within tables, one has to rely on database views, or program the access logic into stored procedures or applications. If access logic is programmed into applications, then these applications must be rewritten if security policies change. Another important feature of GIS n-tier architecture security is protection of GIS data and service confidentiality in exchanges between clients, the middle tier and the back-end tier, and in spatial storage. Encryption is the standard mechanism for these purposes and can be used within the GIS n-tier architecture for different purposes of protection. The first purpose of such protection is encryption of a user's identity for authentication and authorization services. In a typical case this relies on the transport layer for security via the SSL protocol, which also provides data integrity and strong authentication of both clients and servers. Second, encryption can be used for the protection of spatial data in transit; the next section of the article gives an overview of this security aspect. Third, cryptography can be used to encrypt sensitive data stored on DSS, including caches.
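As a minimal illustration of such entity-level control (the roles and detail levels are hypothetical and not tied to any particular product), a check restricting which representation levels of a multi-scale database a user may read could look like this:

    # Hypothetical role-based restriction on readable representation levels.
    ROLE_MAX_DETAIL = {"public": 1, "planner": 2, "administrator": 3}

    def readable_levels(role, feature_levels):
        """Return only the multi-scale representation levels the role may access."""
        max_detail = ROLE_MAX_DETAIL.get(role, 0)
        return [lvl for lvl in feature_levels if lvl["detail"] <= max_detail]

    levels = [{"detail": 1, "name": "generalised"}, {"detail": 3, "name": "full detail"}]
    print(readable_levels("public", levels))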
3 Web Services' Security of Spatial Message Protection

A GIS Web service is a software component that can provide spatial data and geographic information system functionality via the Internet to GIS and custom Web applications. GIS Web services perform real-time processing on the computers where they are located and return the results to the local application over the Internet. The protocols which form the basis of the Web service architecture include SOAP, WSDL, and UDDI. The current SOAP security model relies on the transport layer for security and on recently emerged security specifications that provide message-level security working end-to-end through intermediaries [14].
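The idea of message-level protection can be illustrated by selectively encrypting elements of an XML payload so that intermediaries can still read the rest; in the following sketch the cipher is a deliberately insecure placeholder and the output does not follow the XML Encryption (XMLENC) format:

    import base64
    import xml.etree.ElementTree as ET

    def encrypt_bytes(data: bytes) -> bytes:
        # Placeholder cipher (NOT secure): stands in for a real symmetric algorithm.
        return base64.b64encode(data[::-1])

    def encrypt_element(doc: str, tag: str) -> str:
        """Replace the content of the selected element with an encrypted blob."""
        root = ET.fromstring(doc)
        for el in root.iter(tag):
            blob = encrypt_bytes(ET.tostring(el))
            el.clear()
            el.tag = "EncryptedData"          # simplified marker, not the XMLENC element
            el.text = blob.decode("ascii")
        return ET.tostring(root, encoding="unicode")

    doc = "<Road><name>Road1</name><coordinates>329167.8,655677.3</coordinates></Road>"
    print(encrypt_element(doc, "coordinates"))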
XML-based security schemes for Web services include XML Digital Signature, XML Encryption, XML Key Management Specification, Extensible Access Control Markup Language (XACML), Security Assertion Markup Language (SAML), Web Services Security, and the ebXML Message Service. The XML Signature (XMLSIG) specification, in conjunction with security tokens, supports multiple signers for a single XML document, proving the origin of data and protecting against tampering during transit and storage. The XML Encryption (XMLENC) specification supports the ability to encrypt all or portions of an XML document, providing confidentiality. SAML specifies the XML format for asserting authentication, authorization, and attributes for an entity. XACML, out of the OASIS group, specifies how authorization information can be represented in an XML format. OpenGIS specifications include the Web Map Service, Web Feature Service, Web Coverage Service, and Catalog Service/Web profile. The SOAP message security approaches can be applied for the protection of GIS Web services. Thus, GIS applications which use XML (GML, ArcXML) for Web services can use XML digital signatures for verification of the origins of messages. An important advantage of encrypting spatial data (for large data streaming) with the emerging XMLENC is the ability to encrypt part(s) of an XML document while leaving other parts open.

3.1 Internet File System (IFS) and Encryption Security Solutions for Spatial Warehouses

Volumes of spatial information stored in files are growing at explosive rates. According to some sources, the volume of such file storage doubles every year [7]. At the same time, many new formats are used to store spatial and non-spatial data within files. GIS users and distributed applications demand to store, manage and retrieve information in a safe and secure manner, and should have a universal secure access mechanism to the spatial file database. An RDBMS is (or should be) a core system in any organization, with powerful mechanisms to store different types of information with different access rights and sophisticated security mechanisms. Every year new products emerge on the market which raise the possibility of utilizing a legacy RDBMS for unusual purposes, but the idea behind these products is similar: to have only one universal system for information storage, processing and retrieval within an organization.
3.1.1 File System within RDBMS Instance as Storage for GIS Data Files

A File System within Database (FSDB), a relatively new idea, can help solve the above-mentioned problem effectively. An FSDB raises the possibility for any file to be created, reviewed, corrected, approved, and finally published into the DBMS with appropriate access restrictions for user groups or individual users. The files can be versioned, checked in and checked out, and synchronized with local copies [11]. At the same time an FSDB can be replicated by the standard replication procedures of any sophisticated modern DBMS. The protocol servers that are included, for example, with Oracle IFS allow the FSDB to support all common industry-standard protocols through the Internet or an application server and within the enterprise network [11]. An FSDB can provide a multi-level security model to ensure the privacy and integrity of documents in a number of different ways, such as: leveraging the security provided by the DBMS; user authentication; access rights definition; access control at the file, version and folder level; support for Internet security standards; and anti-virus protection [11]. An FSDB secures GIS files by storing them in a DBMS. The FSDB uses an authentication mechanism to control access to the DBMS or FSDB repository, regardless of the protocol or tool being used to access a file. The newest versions of FSDB have more sophisticated authentication mechanisms, such as SSO servers, Internet Directory and LDAP servers. Oracle IFS was used to test protection of spatial data files while in storage and during on-going processing [8]. Users can use their desktop GIS and any other applications while the spatial data is stored and managed by the database, thereby leveraging the reliability, scalability and availability that come with the database, and at the same time having the familiarity and ease of a standard file system. Oracle IFS stores spatial data files in the form of Large Objects (LOBs) inside the database, which lets GIS users store very large files. LOBs are designed for fast access and optimized storage of large binary content. Fig. 2 shows the authentication and authorization processes between an external desktop GIS application and IFS storage. Obviously an FSDB, while providing great possibilities for the security and management of spatial data files, also prompts several concerns: Will the transition of spatial data files from a standard OS file system (e.g. NTFS or UFS) to an FSDB affect the performance of input, retrieval and updating of spatial data? Will the size of the spatial storage be increased?
Fig. 2. IFS Security Model
Performance results (time differences) for inputting, retrieving and updating GIS data files in desktop GIS software such as MapInfo and ArcView from Oracle IFS 9i are shown in Fig. 3. Different sizes of vector GIS files were used for the study. The large pool size buffer, cache size and process components of IFS and the Oracle 9i Application Server were optimized to achieve the best performance of IFS.
Fig. 3. “IFS – NTFS” Time Differences (in seconds)
Negative results are obtained for the processing of small-size files using the Oracle Buffer Cache. All other results give a difference of about 1-2 seconds for processing data files with sizes up to 100 MB when using IFS storage compared to the native OS file system. The study of the changes in spatial data file sizes, compared with the amount of space that they take up on the NTFS and IFS drives, shows that the
Oracle IFS tablespace is increased in size by about 12% only. That difference can be reduced by changing the database storage parameters for IFS. The results of the IFS performance investigation show that this approach is acceptable for data processing within a GISS. Within this approach to spatial file storage, the following authentication and authorization levels can be used to secure spatial data files: the OS level (share permissions and folder permissions) and the IFS level. Permissions remain the same regardless of the protocol (e.g. HTTP, FTP, SMB) being used to access the content stored in the IFS repository.

3.2 Conventional Encryption for GIS Data Protection in Storage

It is noteworthy that the IFS within the DBMS is capable enough to provide sufficient security for spatial files. If necessary, encryption can be employed to provide additional security for confidential and sensitive GIS information. Oracle Advanced Security of the Oracle 9iAS supports industry-standard encryption algorithms including RSA's RC4, DES and 3DES, and can be used for spatial data encryption [6]. Custom external encryption algorithms can be integrated into that security schema as well. Data encryption can significantly degrade system performance and application response time. For performance testing, the Oracle 9i DBMS_OBFUSCATION_TOOLKIT was investigated (see Figure 4). Different key lengths give different time results; e.g. the difference in time between 16 and 24 byte keys is about 10-20%, but the time difference between 24 and 32 byte keys is only about 5%. The average speed of 3DES encryption is about 2.5 sec per megabyte, or about 1 hour to encrypt or decrypt 1 GB of spatial data on a workstation (1.6 GHz Intel processor running Windows). Using special multiprocessor UNIX servers, the encryption/decryption can be reduced to 10-20 minutes, or at best to several minutes, which is applicable to a real environment where decryption/encryption of spatial data needs to be performed only once per session. To keep encrypted GIS data files in IFS, Oracle's standard encryption and newly developed encryption algorithms were analyzed and investigated for performance. To provide encryption or decryption of sensitive application data, decryption procedures can be activated by database triggers for authenticated users (during log in). On log off, the user again fires a trigger that executes a procedure to encrypt all the modified files, or to replace the decrypted files in the IFS LOB objects with the already encrypted copies kept in temporary storage. If the connection to the database is lost by accident, changes to files should be committed or rolled back by the DBMS and the modified data encrypted back into the permanent LOB objects.
Decryption and encryption of spatial data files will slow down user interaction with the system. These delays would occur on two occasions: when the user logs in and logs out, or when there is a session failure.

3.2.1 New Encryption Algorithm for GIS Data Protection in Storage

Special approaches were developed to use encryption for large files in Oracle. To encrypt LOB data objects, the procedure splits the data into smaller binary chunks, then encrypts them and appends them back to the LOB object. Once the encrypted spatial data files have been allocated into LOB segments, they can be decrypted by chunks and written back to the BLOB object. For read-only spatial data files, an additional LOB object, once encrypted, should always be kept; this saves time in the encryption procedure during log off, as the decrypted spatial data files are simply replaced by the read-only encrypted spatial data files in the main permanent storage. The algorithm for binary and text file encryption proposed by V. Ustimenko [13] is more robust compared to DES and 3DES and has strong resistance to attacks when the adversary has the image data and the ciphertext. This algorithm can be applied to encrypt spatial raster and vector data types, which are commonly used in GIS. The encryption algorithm is based on a combinatorial algorithm of walking on a graph of high girth. The general idea was to treat vertices of a graph as messages and arcs of a certain length as encryption tools. The encryption algorithm has linear complexity and uses a nonlinear function for encryption, thus it resists different types of attacks by an adversary. The quality of such an encryption in the case of graphs of high girth is good, as measured by comparing the probability of guessing the message (vertex) at random with the probability of breaking the key, i.e. guessing the encoding arc. In fact the quality is good for graphs which are close to the Erdos bound defined by the Even Cycle Theorem [2, 3]. In the case of algebraically defined graphs with special colorings of vertices, there is a uniform way to match arcs with strings in some alphabet. Among them can be found ''linguistic graphs'' whose vertices (messages) and arcs (encoding tools) can both be naturally identified with vectors over GF(q), and the neighbors of a vertex are defined by a system of linear equations. The encryption algorithm is a private key cryptosystem which uses a password to encrypt the plain text and produces a cipher text.
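The chunk-wise treatment of large files described above can be sketched as follows; the cipher is a deliberately insecure placeholder, and the sketch illustrates only the split-encrypt-append pattern, not the authors' graph-based algorithm or the Oracle LOB interface:

    import io

    CHUNK = 64 * 1024   # 64 KB chunks, an assumed size

    def encrypt_chunk(data: bytes, key: bytes) -> bytes:
        # Placeholder cipher (NOT secure): XOR with a repeating key.
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def encrypt_stream(src, dst, key: bytes) -> None:
        """Split the source into chunks, encrypt each, and append to the destination."""
        while True:
            chunk = src.read(CHUNK)
            if not chunk:
                break
            dst.write(encrypt_chunk(chunk, key))

    src, dst = io.BytesIO(b"spatial data " * 10000), io.BytesIO()
    encrypt_stream(src, dst, key=b"secret")
    print(dst.getbuffer().nbytes)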
The developed prototype model allows testing the resistance of the algorithm to attacks of different types, and the initial results from such tests are encouraging. For the case p=127 (the size of the ASCII alphabet minus the "delete" character), some values of t(k,l) [the time needed to encrypt (or, by symmetry, decrypt) a file of size k kilobytes with a password of length l (key space roughly 2^(7l))], processed on an Intel Pentium 1.6 GHz workstation (Oracle 9i DBMS Server, PL/SQL programming language), are shown in Table 1. The results presented in Table 1 indicate that the encryption/decryption time is linearly correlated with the file size. Roughly, it takes about 60 seconds to encrypt a 51 KB file with a 16-byte password using PL/SQL functions, and about 17 minutes for 1 MB. If a more powerful 2-4 processor workstation is used and the encryption/decryption functions are rewritten in C++ or macro assembler, the encryption time will be further decreased by several dozen times; e.g. for a 100 MB file it can reach 20-30 minutes of encryption/decryption time, which is usable in a practical implementation. Taking into consideration that 10-20 processor systems are a practical industrial server solution (expected to be common in the near future), the GISS encryption/decryption time can be reduced to less than 5 minutes.

Table 1. Processing time t(k,l) for encryption/decryption by the New Algorithm as compared with RC4 (tested file sizes: 7.6, 51.5, 96.6, 305.0 and 397.0 KB)
Currently, program code and encryption algorithm optimization are under investigation by the authors and will be the subject of our future publications.
4 Conclusion

N-tier architectures and Web Services are making the application layer more complex and fragmented. The solution for protection lies in applying a security framework to all subsystems and components of the n-tier system. This framework has to comply with the industry security requirements of the major application development models. GIS data management and mapping services are primary considerations when developing GIS n-tier architectures. There are several reasons for supporting n-tier architectures for spatial applications; major reasons include providing user access to data resources and GIS services through the Web while at the same time providing better data and service protection. A framework of standard security mechanisms can be used to improve security at critical points of the spatial information flows within the GIS Application Server. Security solutions for distributed GIS systems can be approached in ways similar to e-commerce applications, but can be specific to spatial data security management as it relates to spatial data types, the large size of binary files and presentation logic. Often, file servers are used to store GIS data. A file system within a database instance provides safer and more secure storage for spatial files, with a centralized authentication and access control mechanism in a legacy DBMS. By using additional encryption, an FSDB is able to guarantee that access control is enforced in a consistent manner, regardless of which protocol or tool is being used to access the repository. Our encryption model would provide a secure working environment for GIS clients to store and transfer spatial data over the network. For this purpose we utilize existing and new fast nonlinear encryption algorithms with flexible key sizes based on the graph-theoretical approach.
Progressive Transmission of Vector Data Based on Changes Accumulation Model Tinghua Ai1,2, Zhilin Li 2, and Yaolin Liu1 1 School of Resource and Environment Sciences, Wuhan University, China 2 Department of Land Surveying and Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, China2
Abstract The progressive transmission of map data over the World Wide Web provides users with a self-adaptive strategy to access remote data. It not only speeds up the web transfer but also offers an efficient navigation guide for information acquisition. The key technologies in this transmission are the efficient multiple representation of spatial data and its pre-organization on the server site. This paper offers a new model for the multiple representation of vector data, called the changes accumulation model, which considers the spatial representation from one scale to another as an accumulation of a set of changes. The difference between two consecutive representations is recorded in a linear order, and through the gradual addition or subtraction of "change patches" the progressive transmission is realized. As an example, the progressive transmission of area features based on this model is investigated in the project. The model is built upon the hierarchical decomposition of a polygon into a series of convex hulls or bounding rectangles, and the progressive transmission is accomplished through the combination of the decomposed elements. Keywords: progressive transmission, map generalization, polygon decomposition, convex hull, web GIS, changes accumulation model.
1 Introduction The advent of Internet presents two challenges to cartography. One is the resulting new space to be mapped, namely cyberspace or virtual world
(Taylor 1997, Jiang and Ormeling 1997). The other is the advancement of mapping technology into the web environment, including the presentation of map data on the web, remote access to map data, on-line map-making with data from different web sites, and so on. The former challenge, resulting from modern visualization, leads cartography to develop new basic concepts. The latter, resulting from mapping technology, on the one hand provides new opportunities and methods for the representation of spatial phenomena and, on the other hand, raises new issues on how to work with the web. When downloading map data from a web site, the user usually demands a fast response; studies suggest that users become impatient after waiting for 3 minutes. There are two problems to be solved: (1) quickly finding the location of the requested map data with a search engine, and (2) quickly downloading the map data under interactive control. The first problem is related to the efficiency of the specific map search engine in processing metadata and will not be discussed in this paper. The second problem can be partially solved by improving the hardware and web infrastructure, such as the use of broadband. It is also related to the organization of data on the server and to the transmission approaches across the web. In this domain, the progressive transmission of map data from coarse to fine details becomes a desirable approach. Organized in a sequence of significance, the map data is transferred and visualized on the client site step by step with increasing detail. Once the user finds that the accumulated data meets his requirements, he can interrupt the transmission at any time. It is a self-adaptive transmission procedure in which the user and the system can communicate interactively. As the complete data set on the server usually contains much more detail than required by users, the interruption can save much time for some users. The progressive transmission not only speeds up the web transfer but also corresponds to the principles of spatial cognition. From the viewpoint of information acquisition, the progressive process behaves as an efficient navigation guide. The progressive transmission of raster data and DEM/TIN has been successfully implemented in web transfer (Rauschenbach and Schumann 1999, Srinivas et al 1999). But for vector map data, it still remains a challenge. One of the reasons could be that the multi-scale representation of vector data is much more difficult than that of raster or DEM data. It is not easy to find a proper strategy to hierarchically compress vector data, similar to the quadtree for the approximate representation of raster data at different resolutions. Indeed, the progressive transmission of vector data has become a hot issue. Bertolotto and Egenhofer (1999, 2001) first presented the concept of progressive transmission of vector map data and provided a formalism model
based on a distributed architecture. Buttenfield (2001) investigated the requirements of progressive transmission and, based on the modified strip tree (Ballard 1981), developed a model for line transmission. Han and Tao (2003) designed a server-client scheme for progressive transmission. Progressive transmission is an application of the multiple representation of spatial data in a web transfer environment, associated with map generalization. It can be regarded as the inverse process of map generalization at small scale intervals. The key is to pre-organize generalized data on the server site in a linear order with progressively increasing detail. In this study, a model to represent multiple levels of detail based on changes accumulation is presented, which considers the spatial representation from one scale to another as an accumulation of a set of changes. The rest of the paper is organized as follows: Section 2 explains the changes accumulation model for the purpose of progressive transmission of vector map data. An application of this model is offered in Section 3, which deals with the transmission of area features. An algorithm for the hierarchical decomposition of a polygon is investigated and the progressive transmission is realized through the accumulation of certain separated elements. Section 4 presents some conclusions.
2 The changes accumulation model
The key question in progressive transmission is how to pre-organize the vector data in a hierarchical structure on the server site, with the assistance of map generalization technology. In this section, a model for the organization of vector data for progressive transmission is described, which is called the changes accumulation model and takes the level of geometric detail into consideration.
2.1 Model description
Multi-scale representation of vector map data can be achieved by two methods: (1) storing representation versions based on the generalized results, and (2) deriving multi-scale representations from the initial data via on-line generalization. Our model belongs to the former, but the recorded data is the changes between two consecutive representations rather than the complete version of each representation. Let the initial representation state be S0 and the change between Si and Si+1 be ΔSi. Then the ith representation can be derived as the accumulation of the series of changes, i.e.
Si = Si-1 + ΔSi-1 = S0 + ΔS0 + ΔS1 + … + ΔSi-1
The representation {Si} corresponds to the series of data in order of increasing detail. Each state Si is a function of spatial scale, and the change ΔSi can be regarded as the differential of the representation Si with respect to the scale variable c. On the server site, only the data S0 and {ΔSi} are recorded. The initial representation S0 is the background data that meets the basic demands of all potential users and is determined by the purpose of use and the semantics. The set {ΔSi} is determined by spatial scale. The resolution with which {ΔSi} is decomposed is determined by the granularity of the progressive transmission, which affects the number of set elements. The order of the elements in the set {ΔSi} is determined by the transmission sequence, based on a linear index. Generally speaking, the data volume required to store {ΔSi} is much smaller than that required to store {Si}.
2.2 Three basic operations
A new state is derived through the integration of changes, which can be regarded as set operations. In geometric terms, the transformation from the original state (large scale) to the target state (small scale) is rather complex and needs generalization operators and/or algorithms. But from the point of view of change decomposition, the changes between two consecutive representations can be classified into certain categories. For example, the decomposed results of different geometric operations can usually be regarded as segments or bends for line objects and patches or simple convex polygons for area objects. According to their role, the operations on changes can be distinguished into three categories. There are two roles that change parts can play in building the representation. The first plays a positive role in the representation so as to make the foreground object, i.e. to act as a component part of the detailed representation. The other plays a negative role in the representation so as to make the background object, i.e. to act as the complement of the detailed representation. Correspondingly, three operations can be distinguished, namely addition for the former, subtraction for the latter, and replacement for a change step in which both roles occur. In geometric terms, the addition operation increases the length of a line or the area of a polygon, while the subtraction operation does the reverse. Addition assigns an additional part to the representation and subtraction removes a part from the representation. From the
viewpoint of set operations, in the changes accumulation model these three operations can be realized by union, difference and their combination, respectively.
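To make the model concrete, the sketch below shows how a client might rebuild a state Si from the background data S0 and an ordered list of change patches. It is only an illustration under stated assumptions: the geometry type is assumed to provide union() and difference() methods (Shapely polygons would qualify), and the class and function names are ours, not part of the paper.

```python
# Minimal sketch of the changes accumulation model on the client side.
# Assumes a geometry type with union()/difference() methods (e.g. Shapely);
# Change and accumulate() are illustrative names, not from the paper.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Change:
    sign: int      # +1: addition (foreground part), -1: subtraction (background part)
    patch: object  # the "change patch" geometry transmitted at this step

def accumulate(s0, changes: List[Change], stop_at: Optional[int] = None):
    """Return S_i = S0 + ΔS0 + ... + ΔS_{i-1}, applying the changes in order.

    The user may interrupt the transmission at any time; this is modelled by
    applying only the first `stop_at` changes.
    """
    state = s0
    for change in changes[:stop_at]:
        if change.sign > 0:
            state = state.union(change.patch)       # addition operation
        else:
            state = state.difference(change.patch)  # subtraction operation
    return state
```

A replacement step can be expressed in the same framework as a difference followed by a union.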
3 An example: polygon hierarchical decomposition and progressive transmission
In this section an application example of the progressive transmission of vector data based on the changes accumulation model is presented. The studied object is an area feature. We try to accomplish the transmission of area features from coarse to fine with increasing detail. On the server site the data is organized in a hierarchical structure (i.e. a tree) with levels of detail based on the hierarchical decomposition of the polygon. Two kinds of area features will be investigated, i.e. natural area features such as lakes, and artificial area features such as buildings. In this approach, the "change" appears as the decomposed element, either a convex hull or a bounding rectangle.
3.1 Hierarchical decomposition
We try to decompose the polygon object into a set of basic parts (the changes in the changes accumulation model) and to represent the area feature at a certain scale through a certain combination of these parts. The transformation from small scale to large scale is realized by the addition, subtraction and replacement of some "changes". In computational geometry, a polygon can be decomposed into elements with or without overlaps. The result of the former is called a cover and that of the latter a partition (Culberson and Reckhow, 1994). The decomposed elements usually include rectangles, simple convex polygons, triangles, grids and others. In this study the polygon decomposition belongs to the cover type and the basic decomposed element is the convex hull or the bounding rectangle. For the approximation of a polygon in GIS, we usually use the MBR or the convex hull to represent its coverage. This enveloping representation will include some background areas (the "pocket" polygons in the concave parts of the original polygon), leading to an increase in area. The convex degree is defined as the ratio of the area of the polygon to the area of its convex hull (or bounding rectangle). This measure is used to describe the approximation accuracy: the larger the convex degree (closer to 1.0), the higher the accuracy of the approximation. Here we examine the included "pocket" polygons and eliminate them by their respective approximations. But this elimination will remove more area
than it should, so next we add the approximations of the second-level "pocket" polygons. This decomposition continues, with the addition and subtraction of the approximations in turn, until the "pocket" polygon is small enough or can not be decomposed further. This is the basic idea of our method for decomposing a polygon. Here we call the studied area feature the object polygon (abbreviated OP) and the approximation of a polygon the approximation polygon (abbreviated AP). Geometrically, the AP can be an MBR or a convex hull; their different usages will be discussed later. The decomposition result is stored in a hierarchical tree, the H-tree, whose nodes represent APs. Each node of the H-tree records its hierarchical level. The algorithm is described as follows:
1) Construct the AP of the OP and initiate the H-tree with this AP as the root at level 0;
2) Extract the "pocket" polygons through the overlap computation of the AP and the OP, giving the result R = {p0, p1, …, pn}, and sort the element polygons by decreasing area;
3) Push every "pocket" polygon pi in R which satisfies the further decomposition condition onto stack P;
4) If the stack P is empty, stop; otherwise pop one polygon pi from stack P and do the following steps:
4.1) Construct the AP of pi and add it to the H-tree with a level one greater than that of pi;
4.2) Extract the "pocket" polygons through the overlap of the AP and pi and sort the elements by decreasing area;
4.3) Push the "pocket" polygons satisfying the further decomposition condition onto stack P;
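A simplified sketch of this decomposition is given below, using convex hulls as the APs. It should be read as an illustration under assumptions rather than the authors' implementation: Shapely is assumed for the geometric operations (convex_hull, difference, area), recursion replaces the explicit stack, only the convex-hull variant is shown, and names such as HNode, decompose and the stop thresholds are ours.

```python
# Illustrative H-tree decomposition of a polygon by recursive convex-hull
# approximation.  Shapely is assumed; class/function names and thresholds are
# illustrative choices, not taken from the paper.
from shapely.geometry import MultiPolygon

class HNode:
    def __init__(self, ap, level):
        self.ap = ap          # approximation polygon (convex hull) of this node
        self.level = level    # 0 for the root, +1 for each "pocket" generation
        self.children = []

def pockets(ap, poly):
    """The 'pocket' polygons: parts of the AP not covered by the polygon."""
    diff = ap.difference(poly)
    parts = list(diff.geoms) if isinstance(diff, MultiPolygon) else [diff]
    parts = [p for p in parts if not p.is_empty]
    return sorted(parts, key=lambda p: p.area, reverse=True)  # area decrement

def decompose(poly, min_area=1.0, min_convex_degree=0.9, level=0):
    """Build the H-tree of `poly`; pockets that are small or nearly convex stop the recursion."""
    node = HNode(poly.convex_hull, level)
    for pocket in pockets(node.ap, poly):
        if pocket.area < min_area:
            continue                                  # stop condition (2): small enough
        convex_degree = pocket.area / pocket.convex_hull.area
        if convex_degree > min_convex_degree:
            continue                                  # stop condition (1): nearly convex
        node.children.append(decompose(pocket, min_area, min_convex_degree, level + 1))
    return node
```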
The decomposition result is the final H-tree whose nodes represent the approximations of the polygon. Such an approximation has two candidates in terms of geometric shape: (1) the convex hull, and (2) the minimum bounding rectangle. They are applied, respectively, to natural features such as lakes and land-use parcels with irregular boundaries, and to artificial features such as buildings with orthogonal boundaries. The convex hull of a polygon can be constructed by many algorithms, the best of which reach a computational complexity of O(n lg n). For the construction of the MBR of a building polygon, the bounding rectangles in all edge orientations are generated and the smallest one is selected. As the number of edges of a building polygon is limited, the generation of the rectangle candidates does not cost much time. The conditions to stop further decomposition in the above algorithm include: (1) the convex degree is larger than a tolerance, say 0.9, and (2) the
area of the polygon is small enough. The sort operation on the "pocket" polygons guarantees that the "child nodes" under a given "parent" are ordered by decreasing area (from left to right), and correspondingly that the later progressive transmission follows an order of decreasing significance. Figure 1 illustrates a polygon and the corresponding decomposition result, the H-tree structure. In this case we take the convex hull as the approximation element.
Fig. 1. The illustration of a polygon and the corresponding H-tree by the decomposition of convex hull.
The H-tree decomposition has the following properties:
1. The level depth is associated with the complexity of the polygon boundary;
2. A sub-tree obtained by cutting some "descendant branches" represents an approximation of the polygon at a different resolution, see Figure 2;
3. The hierarchical decomposition is convergent, with the final complete decomposition representing the real polygon;
4. The size of the decomposed element (corresponding to a "node" in the H-tree) is associated with visual identification. A "leaf node" is able to represent the SVO, the smallest visible object (Li and Openshaw, 1992);
5. The increase in data volume is small. Compared with the full representation, the number of overlapping APs increases greatly, but the number of added points is small.
Fig. 2. Extracting different sub-trees gives different approximations of the polygon.
3.2 Linear index construction
After the decomposition of the polygon into parts, the next step is to decide the sequence in which these parts are organized. The H-tree structure has to be converted to a linear structure in the order of detail increment. In the H-tree, each "node" (a convex hull or MBR) has an operation sign depending on whether its level number is even or odd, computed as (-1)^n, where n is the level number. A "node" with a plus sign makes a positive contribution to the polygon representation and one with a minus sign a negative contribution. Based on the hierarchical tree, the polygon can be represented as the integration of all nodes (a node corresponds to an AP) using the following expression:
Poly = Σ(i=1..m) (-1)^ni · hi,  where ni is the level number of "node" hi
The items in the above expression with a plus sign correspond to additions in the later progressive transmission, and those with a minus sign to subtractions. The order of item integration can follow two schemes: (1) based on the decomposition level number (within one decomposition level, the "nodes" hi are already sorted by area), and (2) based on the area size of the corresponding AP. On the basis of the level order, the polygon in Figure 1 can be represented as: h0 - h1 - h2 - h3 + h11 + h12 + h13 + h14 + h31 - h121
The level-based linear index implies a vertical scan of the H-tree from top to bottom. At the same level, the geometry of the "nodes" may differ considerably in size. According to this linear index, a small "node" with a priority level (a low level number) will be transmitted first. Obviously this sequence is not reasonable and goes against the coarse-to-fine principle. Compared with the area of the full representation of the polygon, the transmitted area jumps from larger to smaller and back again in successive steps. The second linear index, which is based on the order of area size, disregards the decomposition level. It will let the
Fig. 3. An example of a building polygon and its 21 decomposed MBRs.
larger and more important APs be transmitted first, respecting the habits of visual cognition (from large characteristics to small details). In this study, the linear index based on area size is therefore used as the organization strategy for progressive transmission. For the building shown on the left side of Figure 3, 21 MBRs at 5 decomposition levels are generated. A comparison of the transmission based on the level linear index and on the area-size linear index is shown in Figure 4 and Figure 5. From the two figures, it can be seen that the transmission based on the level linear index shows steep changes in area. Visually, the transmission based on area size reflects well the process from coarse to fine, with the large component parts appearing first and the small details later.
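The area-size linear index can be sketched as a simple flattening of the H-tree, as below. The sketch assumes the HNode structure from the decomposition sketch above (nodes with ap, level and children attributes); the function name is ours.

```python
# Flatten an H-tree into a transmission sequence ordered by decreasing AP area,
# keeping the (-1)^level sign of each node.  Assumes HNode-like nodes with
# `ap`, `level` and `children` attributes; linear_index_by_area is our name.
def linear_index_by_area(root):
    """Return a list of (sign, ap) pairs, large parts first."""
    items, stack = [], [root]
    while stack:
        node = stack.pop()
        sign = 1 if node.level % 2 == 0 else -1   # (-1)^level
        items.append((sign, node.ap))
        stack.extend(node.children)
    items.sort(key=lambda item: item[1].area, reverse=True)
    return items
```

On the client site, each (sign, ap) pair is then applied as a union (plus) or difference (minus), exactly as in the accumulate() sketch of Section 2.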
Fig. 4. The progressive transmission of MBRs based on the order of decomposition level.
Fig. 5. The progressive transmission of MBRs based on the order of area size.
3.3 Application in progressive transmission
Based on the hierarchical decomposition of the polygon with the MBR or the convex hull, the changes accumulation model has been built. The decomposition level determines the operations in the model, e.g. addition and subtraction. This model covers a wide scale range in multi-scale representation. In a real application of progressive transmission, we need to determine the scale range for the data transmission. We may not always begin with the tree root (the complete approximation of the object polygon, its convex hull or MBR). This means that, on the server site, we need to combine part of the changes to generate the initial data packet and transfer it in the first step, and then let the user decide when to stop the transmission. The size of the MBR or convex hull is related to recognition ability. Given a scale, we can extract a sub-expression of the linear index expression. It implies that this method
can contribute to application fields other than progressive transmission, such as spatial queries on multi-scale representations.
Fig. 6. An example of lake polygon and the decomposed convex hulls.
Considering the geometric differences among area features, in this method we distinguish two kinds of area features, namely irregular polygons and orthogonal polygons, based on the same decomposition idea. Two algorithms have been realized in our study. Figure 5 shows the experimental result for a building feature and Figure 7 that for a lake polygon with a complex boundary, digitized from a 1:10000 topographic map. In Figure 6 the lake polygon has been decomposed into 214 convex hulls (in the experiment we adopt the termination condition that the polygon is exactly convex). At some deep levels, the decomposed result appears as very small patches with few points. In the progressive transmission, when arriving at step 50, the representation of the accumulated convex hulls is very close to that of the real polygon (the full representation). The area is also close to that of the full representation. On the client site, the data reorganization can be accomplished by geometric computation of polygon overlap to obtain the normal polygon representation. An AP element with a plus sign corresponds to the union operation and one with a minus sign to the difference operation.
Fig. 7. The progressive transmission of details based on H-tree decomposition.
4. Conclusion
The multi-scale representation and hierarchical organization of vector data is the key to progressive transmission. Inspired by the idea of video
data compression, in which only the changed content rather than the full frame image is recorded, we present the changes accumulation model. Indeed, the progressive transmission can be regarded as a mapping of the data representation from spatial scale to temporal scale. The data details separated on the basis of spatial scale are transmitted over a time range. Each snapshot in the time domain corresponds to one representation at a certain spatial scale. A representation at a higher spatial resolution is obtained after waiting for a longer time. Technically, progressive transmission is associated with map generalization. If generalization could output data dynamically within a wide scale range rather than at one scale point, the resulting series of data would be well suited for progressive transmission. Unfortunately, most existing generalization algorithms can only derive new data at one "scale point" rather than over a "scale interval". This paper presents a model of data organization based on changes accumulation and tries to unify the representation through change data, regardless of what generalization operations are executed and how. As generalization just outputs data at one scale point, the changes between representations can be extracted by geometric comparison of consecutive versions to obtain the differences. Based on set operations, three operations, i.e. addition, subtraction, and replacement, are defined. The changes accumulation model is to some degree a bridge between off-line generalization on the server site and on-line generalization on the client site. Compared with a multi-version representation, the data volume of the changes accumulation model is reduced. For the purpose of approximation from coarse to fine, we present a hierarchical decomposition based on convex hull and MBR construction for polygon multi-representation. The decomposed result is easy to store in the changes accumulation model. The linear index based on the area size of the basic elements not only guarantees transmission from coarse to fine, but can also be related to recognition resolution according to the definition of the SVO. The progressive transmission using this method is efficient in time cost and can be realized on line, because the data is simply displayed on the client without additional computation.
Acknowledgements This work is supported by the National Science Foundation, China under the grant number 40101023, and Hong Kong Research Grant Council under the grant number PolyU 5073/01E.
References Ai T and Oosterom P van (2002) GAP-tree Extensions Based on Skeletons. In: Richardson D and Oosterom P van(eds) Advances in Spatial Data Handling, Springer-Verlag, Berlin, pp501-514. Ballard D (1981) Strip Trees: A Hierarchical Representation for Curves. Communication of the Association for Computing Machinery, vol. 14: 310-321. Bertolotto M and Egenhofer M (2001) Progressive Transmission of Vector Map Data over the World Wide Web. GeoInformatica, 5 (4): 345-373. Bertolotto M and Egenhofer M (1999) Progressive Vector Transmission. Proceedings, 7th International Symposium on Advances in Geographic Information Systems, Kansas City, MO: 152-157. Buttenfield B P (2002) Transmitting Vector Geospatial Data across the Internet, In: Egenhofer M J and Mark D M (eds) Proceedings GIScience 2002. Berlin: Springer Verlag, Lecture Notes in Computer Science, No 2478: 51-64. Buttenfield B P (1999) Sharing Vector Geospatial Data on the Internet. Proceedings, 18th Conference of the International Cartographic Association, August 1999, Ottawa, Canada, Section 5: 35-44. Culberson J C and Reckhow R A (1994) Covering polygons is hard. Journal of Algorithms, 17(1): 2-44. Han H, Tao V and Wu H (2003) Progressive Vector Data Transmission, Proceedings of 6th AGILE, Lyon, France. Jiang B and Ormeling F J (1997) Cybermap: the map for cyberspace. Cartographic Journal, 34 (2):111-116. Li Z and Openshaw S (1992) Algorithms for automated line generalization based on a natural principle of objective generalization. International Journal of Geographical Information Systems. 6 (5): 373-389. Muller J C and Wang Z (1992) Area-path Generalization: A Competitive Approach, The Cartographic Journal. 29(2): 137-144. Oosterom P Van (1994) Reactive Data Structure for Geographic Information Systems. Oxford University Press, Oxford Rauschenbach U and Schumann H (1999) Demand-driven Image Transmission with Levels of Detail and Regions of Interest. Computers & Graphics, 23(6): 857-866 . Srinivas B S R, Ladner M and Azizoglu (1999) Progressive Transmission of Images using MAP Detection over Channels with Memory. IEEE Transactions on Image Processing, 8(4): 462-475. Taylor D (1997) Maps and Mapping in the Information Era. In: Ottoson L (eds) Proceedings of the 18th ICA/ACI International Cartographic Conference, Stockholm, Sweden, June 1997 Gävle, pp 23-27.
An Efficient Natural Neighbour Interpolation Algorithm for Geoscientific Modelling∗ Hugo Ledoux and Christopher Gold Department of Land Surveying and Geo-Informatics The Hong Kong Polytechnic University, Hong Kong [email protected] — [email protected]
Abstract Although the properties of natural neighbour interpolation and its usefulness with scattered and irregularly spaced data are well-known, its implementation is still a problem in practice, especially in three and higher dimensions. We present in this paper an algorithm to implement the method in two and three dimensions, but it can be generalized to higher dimensions. Our algorithm, which uses the concept of flipping in a triangulation, has the same time complexity as the insertion of a single point in a Voronoi diagram or a Delaunay triangulation.
1 Introduction Datasets collected to study the Earth usually come in the form of two- or three-dimensional scattered points to which attributes are attached. Unlike datasets from fields such as mechanical engineering or medicine, geoscientific data often have a highly irregular distribution. For example, bathymetric data are collected at a high sampling rate along each ship’s track, but there can be a very long distance between two ships’ tracks. Also, geologic and oceanographic data respectively are gathered from boreholes and water columns; data are therefore usually abundant vertically but sparse horizontally. In order to model, visualize and better understand these datasets, interpolation is performed to estimate the value of an attribute at unsampled locations. The abnormal distribution of a dataset causes many problems for interpolation methods, especially for traditional weighted average methods in which distances are used to select neighbours and to assign weights. Such methods have problems because they do not consider the configuration of the data. ∗
This research is supported by the Hong Kong’s Research Grants Council (project PolyU 5068/00E).
It has been shown that natural neighbour interpolation (Sibson, 1980, 1981) avoids most of the problems of conventional methods and therefore performs well for irregularly distributed data (Gold, 1989; Sambridge et al., 1995; Watson and Phillip, 1987). This is a weighted average technique based on the Voronoi diagram (VD) for both selecting the set of neighbours of the interpolation point x and determining the weight of each. The neighbours used in an estimation are selected using the adjacency relationships of the VD, which results in the selection of neighbours that both surround and are close to x. The weight of each neighbour is based on the volume (throughout this paper, 'volume' is used to define area in 2D, volume in 3D and hyper volume in higher dimensions) that the Voronoi cell of x 'steals' from the Voronoi cell of the neighbours in the absence of x. The method, which has many useful properties valid in any dimensions, is further defined in Sect. 2. Although the concepts behind natural neighbour interpolation are simple and easy to understand, its implementation is far from being straightforward, especially in higher dimensions. The main reasons are that the method requires the computation of two Voronoi diagrams—one with and one without the interpolation point—and also the computation of volumes of Voronoi cells. This involves algorithms for both constructing a VD—or its geometric dual the Delaunay triangulation (DT)—and deleting a point from it. By comparison, conventional interpolation methods are relatively easy to implement; this is probably why they can be found in most geographical information systems (GIS) and geoscientific modelling packages. Surprisingly, although many authors present the properties and advantages of the method, few discuss details concerning its implementation. The two-dimensional case is relatively easy to implement as efficient algorithms for constructing a VD/DT (Fortune, 1987; Guibas and Stolfi, 1985; Watson, 1981) and deleting a point from it (Devillers, 2002; Mostafavi et al., 2003) exist. Watson (1992) also presents an algorithm that mimics the insertion of x, and thus deletion algorithms are not required. The stolen area is obtained by ordering the natural neighbours around x and decomposing the area into triangles. In three and higher dimensions, things get more complicated because the algorithms for constructing and modifying a VD/DT are still not well-known. There exist algorithms to construct a VD/DT (Edelsbrunner and Shah, 1996; Watson, 1981), but deletion algorithms are still a problem—only theoretical solutions exist (Devillers, 2002; Shewchuk, 2000). Sambridge et al. (1995) describe three-dimensional methods to compute a VD, insert a new point in it and compute volumes of Voronoi polyhedra, but they do not explain how the interpolation point can be deleted. Owen (1992) also proposes a sub-optimal solution in which, before inserting the interpolation point x, he simply saves the portion of the DT that will be modified and replaces it once the estimation has been computed. The stolen volumes are calculated in only one operation, but that requires algorithms for intersecting planes in three-dimensional space. The idea of mimicking the insertion algorithm of
Watson (1992) has also been generalized to three dimensions by Boissonnat and Cazals (2002) and to arbitrary dimensions by Watson (2001). To calculate the stolen volumes, both algorithms use somewhat complicated methods to order the vertices surrounding x and then decompose the volume into simplices (tetrahedra in three dimensions). The time complexity of these two algorithms is the same as the one to insert one point in a VD/DT. We present in this paper a simple natural neighbour interpolation algorithm valid in two and three dimensions, but the method generalizes to higher dimensions. Our algorithm works directly on the Delaunay triangulation and uses the concept of flipping in a triangulation, as explained in Sect. 3, for both inserting new points in a DT and deleting them. The Voronoi cells are extracted from the DT and their volumes are calculated by decomposing them into simplices; we show in Sect. 4 how this step can be optimised. The algorithm is efficient (its time complexity is the same as the one for inserting a single point in a VD/DT) and we believe it to be considerably simpler to implement than other known methods, as only an incremental insertion algorithm based on flips, with some minor modifications, is needed.
2 Natural Neighbour Interpolation The idea of a natural neighbour is closely related to the concepts of the Voronoi diagram and the Delaunay triangulation. Let S be a set of n points in ddimensional space. The Voronoi cell of a point p ∈ S, defined Vp , is the set of points x that are closer to p than to any other point in S. The union of the Voronoi cells of all generating points p in S form the Voronoi diagram (VD) of S. The geometric dual of VD(S), the Delaunay triangulation DT(S), partitions the same space into simplices—a simplex represents the simplest element in a given space, e.g. a triangle in 2D and a tetrahedron in 3D— whose circumspheres do not contain any other points in S. The vertices of the simplices are the points generating each Voronoi cell. Fig. 1(a) shows the VD and the DT in 2D, and Fig. 1(b) a Voronoi cell in three dimensions. The VD and the DT represent the same thing: a DT can always be extracted from a VD, and vice-versa. The natural neighbours of a point p are the points in S sharing a Delaunay edge with p, or, in the dual, the ones whose Voronoi cell is contiguous to Vp . For example, in Fig. 1(a), p has seven natural neighbours. 2.1 Natural Neighbour Coordinates The concept of natural neighbours can also be applied to a point x that is not present in S. In that case, the natural neighbours of x are the points in S whose Voronoi cell would be modified if the point x were inserted in VD(S). The insertion of x creates a new Voronoi cell Vx that ‘steals’ volume from the Voronoi cells of its ‘would be’ natural neighbours, as shown in Fig. 2(a). This idea forms the basis of natural neighbour coordinates (Sibson, 1980, 1981),
which define quantitatively the amount Vx steals from each of its natural neighbours. Let D be the VD(S), and D+ = D ∪ {x}. The Voronoi cell of a point p in D is defined by Vp, and Vp+ is its cell in D+. The natural neighbour coordinate of x with respect to a point pi is
wi(x) = Vol(Vpi ∩ Vx+) / Vol(Vx+)    (1)
where Vol(Vpi) represents the volume of Vpi. For any x, the value of wi(x) will always be between 0 and 1: 0 when pi is not a natural neighbour of x, and 1 when x is exactly at the same location as pi. A further important consideration is that the sum of the volumes stolen from each of the k natural neighbours is equal to Vol(Vx+). Therefore, the higher the value of wi(x) is, the stronger is the 'influence' of pi on x. The natural neighbour coordinates are influenced by both the distance from x to pi and the spatial distribution of the pi around x.
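As a concrete, if naive, illustration of Eq. (1), the sketch below computes the natural neighbour coordinates in 2D by building the Voronoi diagram twice, with and without x. This is the straightforward approach mentioned in the introduction, not the flip-based algorithm of this paper. SciPy and Shapely are assumed; cells touching the convex hull are unbounded and are skipped here, so x must lie well inside the data (or the data must first be padded with distant dummy points).

```python
# Naive 2D natural neighbour coordinates via two Voronoi diagrams (SciPy + Shapely).
import numpy as np
from scipy.spatial import Voronoi
from shapely.geometry import MultiPoint

def cell_areas(points):
    """Area of each bounded Voronoi cell (NaN for unbounded cells)."""
    vor = Voronoi(points)
    areas = np.full(len(points), np.nan)
    for i, region_index in enumerate(vor.point_region):
        region = vor.regions[region_index]
        if len(region) == 0 or -1 in region:
            continue  # unbounded cell on the convex hull
        # Voronoi cells are convex, so the hull of the cell vertices is the cell.
        areas[i] = MultiPoint(vor.vertices[region]).convex_hull.area
    return areas

def natural_neighbour_coordinates(points, x):
    """w_i(x) = Vol(V_pi ∩ V_x+) / Vol(V_x+), from the diagrams with and without x."""
    before = cell_areas(points)
    after = cell_areas(np.vstack([points, x]))
    stolen = np.clip(before - after[:-1], 0.0, None)  # area each p_i loses to x
    return stolen / after[-1]                          # divide by Vol(V_x+)
```

The weighted average of the attribute values with these weights then gives Sibson's estimate, introduced next.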
Fig. 1. Voronoi diagram: (a) Voronoi diagram and Delaunay triangulation (dashed lines) in 2D; (b) a Voronoi cell in 3D with its dual Delaunay edges joining the generator to its natural neighbours.
Fig. 2. Two VD are required for the natural neighbour interpolation: (a) natural neighbour coordinates of x in 2D; (b) 2D DT with and without x.
2.2 Natural Neighbour Interpolation
Based on the natural neighbour coordinates, Robin Sibson developed a weighted average interpolation technique that he named natural neighbour interpolation (Sibson, 1980, 1981). The points used to estimate the value of an attribute at location x are the natural neighbours of x, and the weight of each neighbour is equal to the natural neighbour coordinate of x with respect to this neighbour. If we consider that each data point in S has an attribute ai (a scalar value), the natural neighbour interpolation is
f(x) = Σ(i=1..k) wi(x) ai    (2)
where f (x) is the interpolated function value at the location x. The resulting method is exact (f (x) honours each data point), and f (x) is smooth and continuous everywhere except at the data points. To obtain a continuous function everywhere, that is a function whose derivative is not discontinuous at the data points, Sibson uses the weights defined in Eq. 1 in a quadratic equation where the gradient at x is considered. To our knowledge, this method has not been used with success with real data and therefore we do not use it. Other ways to remove the discontinuities at the data points have been proposed: Watson (1992) explains different methods to estimate the gradient at x and how to incorporate it in Eq. 2; and Gold (1989) proposes to modify the weight of each pi with a simple hermitian polynomial so that, as x approaches pi , the derivative of f (x) approaches 0. Modifying Eq. 2 to obtain a continuous function can yield very good results in some cases, but with some datasets the resulting surface can contain unwanted effects. Different datasets require different methods and parameters, and, for this reason, modifications should be applied with great care. 2.3 Comparison with Other Methods With traditional weighted average interpolation methods, for example distancebased methods, all the neighbours within a certain distance from the interpolation location x are considered and the weight of each neighbour is inversely proportional to its distance to x. These methods can be used with a certain success when the data are uniformly distributed, but it is difficult to obtain a continuous surface when the distribution of the data is anisotropic or when there is variation in the data density. Finding the appropriate distance to select neighbours is difficult and requires a priori knowledge of a dataset. Natural neighbour interpolation, by contrast, is not affected by these issues because the selection of the neighbours is based on the configuration of the data. Another popular interpolation method, especially in the GIS community, is the triangle-based method in which the estimate is obtained by linear interpo-
lation within each triangle, assuming a triangulation of the data points is available. The generalization of this method to higher dimensions is straightforward: linear interpolation is performed within each simplex of a d-dimensional triangulation. In 2D, when this method is used with a Delaunay triangulation, satisfactory results can be obtained because the Delaunay criterion maximizes the minimum angle of each triangle, i.e. it creates triangles that are as equilateral as possible. This method however creates discontinuities in the surface along the triangle edges and, if there is anisotropy in the data distribution, the three neighbours selected will not necessarily be the three closest data points. These problems are amplified in higher dimensions because, for example, the max-min angle property of a DT does not generalize to three dimensions. A 3D DT can contain some tetrahedra, called slivers, whose four vertices are almost coplanar; interpolation within such tetrahedra does not yield good results. The presence of slivers in a DT does not however affect natural neighbour interpolation because the Voronoi cells of points forming a sliver will still be 'well-shaped' (relatively spherical).
3 Delaunay Triangulation, Duality and Flips
In order to construct and modify a Voronoi diagram, it is actually easier to first construct the Delaunay triangulation and extract the VD afterwards. Managing only simplices is simpler than managing arbitrary polytopes: the number of vertices and neighbours of each simplex is known and constant, which facilitates the algorithms and simplifies the data structures. Extracting the VD from a DT in 2D is straightforward, while in 3D it requires more work. In two dimensions the dual of a triangle is a point (the centre of the circumcircle of the triangle) and the dual of a Delaunay edge is a bisector edge. In three dimensions the dual of a tetrahedron is a point (the centre of the circumsphere of the tetrahedron) and the dual of a Delaunay edge is a Voronoi face (a convex polygon formed by the centres of the circumspheres of every tetrahedron incident to the edge). In short, to get the Voronoi cell of a given point p in a 3D DT, we must first identify all the edges that have p as a vertex and then extract the dual of each (a face). The result will be a convex polyhedron formed by convex faces, as shown in Fig. 1(b). We discuss in this section the main operations required for the construction of a DT and for implementing the natural neighbour interpolation algorithm described in Sect. 4. Among all the possible algorithms to construct a VD/DT, we chose an incremental insertion algorithm because it permits us to first construct a DT and then modify it locally when a point is added or deleted. Other potential solutions, for example divide-and-conquer algorithms or the construction of the convex hull in (d + 1) dimensions, might be useful for the initial construction, but local modifications are either slow and complicated, or simply impossible.
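The 2D half of this duality is easy to make concrete: the Voronoi vertex dual to a Delaunay triangle is simply the circumcentre of that triangle. The small helper below, whose name is ours, computes it with plain NumPy.

```python
# Dual of a Delaunay triangle in 2D: the centre of its circumcircle.
import numpy as np

def circumcentre(a, b, c):
    """Circumcentre of triangle abc in 2D (the dual Voronoi vertex)."""
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return np.array([ux, uy])

# The Voronoi cell of a data point p is then bounded by the circumcentres of the
# Delaunay triangles incident to p, connected in order around p.
```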
3.1 Flipping
A flip is a local topological operation that modifies the configuration of adjacent simplices in a triangulation. Consider the set S = {a, b, c, d} of points in the plane forming a quadrilateral, as shown in Fig. 3(a). There exist exactly two ways to triangulate S: the first one contains the triangles abc and bcd; and the second one contains the triangles abd and acd. Only the first triangulation of S is Delaunay because d is outside the circumcircle of abc. A flip22 is the operation that transforms the first triangulation into the second, or vice-versa. It should be noticed that when S does not form a quadrilateral, as shown in Fig. 3(b), there is only one way to triangulate S: with three triangles all incident to d. A flip13 refers to the operation of inserting d inside the triangle abc and splitting it into three triangles; and a flip31 is the inverse operation that is needed for deleting d. The notation for the flips refers to the numbers of simplices before and after the flip.
Fig. 3. Two-dimensional flips.
The concept of flipping generalizes to three and higher dimensions (Lawson, 1986). The flips to insert and delete a point generalize easily to three dimensions and become respectively flip14 and flip41, as shown in Fig. 4(b). The generalization of the flip22 in three dimensions is somewhat more complicated. Consider a set S = {a, b, c, d, e} of points in R3, as shown in Fig. 4(a). There are two ways to triangulate S: either with two or three tetrahedra. In the first case, the two tetrahedra share a face, and in the latter case the three
Fig. 4. Three-dimensional flips.
4 A Flip-Based Natural Neighbour Interpolation Algorithm Our algorithm to implement natural neighbour interpolation performs all the operations directly on the Delaunay triangulation (with flips) and the Voronoi cells are extracted when needed. We use a very simple idea that consists of inserting the interpolation point x in the DT, calculating the volume of the Voronoi cell of each natural neighbour of x, then removing x and recalculating the volumes to obtain the stolen volumes. Two modifications are applied to speed up the algorithm. The first one concerns the deletion of x from the DT. We show in Sect. 3 that every flip has an ‘inverse’, e.g. in 2D, a flip13 followed by a flip31 does not change the triangulation; in 3D, a flip23 creates a new tetrahedron that can then be removed with a flip32. Therefore, if x was added to the triangulation with a sequence l of flips, simply performing the inverse flips of l in reverse order will delete x. The second modification concerns how the overlap between Voronoi cells with and without the presence of x is calculated. We show that only some faces of a Voronoi cell (in the following, a Voronoi face is a (d − 1)-face forming the boundary of the cell, e.g. in 2D
it is a line and in 3D it is a polygon) are needed to obtain the overlapping volume. Given a set of points S in d dimensions, consider interpolating at the location x. Let T be the DT(S) and pi the natural neighbours of x once it is inserted in DT(S). The simplex τ that contains x is known. Our algorithm proceeds as follows:
1. x is inserted in T, thus getting T+ = T ∪ {x}, by using flips, and the sequence l of flips performed is stored in a simple list.
2. the volume of Vx+ is calculated, as well as the volumes of each Vpi+.
3. l is traversed in reverse order and the inverse flip is performed each time. This deletes x from T+.
4. the volumes of Vpi are re-calculated to obtain the natural neighbour coordinates of x with respect to all the pi; and Eq. 2 is finally calculated.
To remember the order of flips in two dimensions, a simple list containing the order in which the pi became natural neighbours of x is kept. The flip13 adds three pi, and each subsequent flip22 adds one new pi. In 3D, only a flip23 adds a new pi to x; a flip32 only changes the configuration of the tetrahedra around x. We store a list of edges that will be used to identify what flip was performed during the insertion of x. In the case of a flip23, we simply store the edge xpi that is created by the flip. A flip32 deletes a tetrahedron and modifies the configuration of the two others such that, after the flip, they are both incident to x and share a common face abx. We store the edge ab of this face. Therefore, in two dimensions, to delete x we take one pi, find the two triangles incident to the edge xpi and perform a flip22. When x has only three natural neighbours, a flip31 deletes x completely from T+. In 3D, if the current edge is xpi, a flip32 is used on the three tetrahedra incident to xpi; and if ab is the current edge, then a flip23 is performed on the two tetrahedra sharing the face abx.
4.1 Volume of a Voronoi Cell
The volume of a d-dimensional Voronoi cell is computed by decomposing it into d-simplices and summing their volumes. The volume of a d-simplex τ is easily computed:
Vol(τ) = (1/d!) |det [ v0 ⋯ vd ; 1 ⋯ 1 ]|    (3)
where vi is a d-dimensional vector representing the coordinates of a vertex, the last row of the matrix consists of ones, and det is the determinant of the matrix. Triangulating a 2D Voronoi cell is easily performed: since the polygon is convex a fan-shaped triangulation can be done. In 3D, the polyhedron is triangulated by first fan-shaped triangulating each of its Voronoi faces, and then the tetrahedra are formed by the triangles and the generator of the cell. In order to implement natural neighbour interpolation, we do not need to know the volume of the Voronoi cells of the pi in T and T+, but only the
difference between the two volumes. As shown in Fig. 2(a), some parts of a Voronoi cell will not be affected by the insertion of x in T, and computing them twice to subtract afterwards is computationally expensive and useless. Notice that the insertion or deletion of x in a DT modifies only locally the triangulation—only simplices inside a defined polytope (defined by the pi in Fig. 2(b)) are modified. Each pi has many edges incident to it, but only the edges inside the polytope are modified. Therefore, to optimise this step of the algorithm, we process only the Voronoi faces that are dual to the Delaunay edges joining two natural neighbours of x. In T+, the Voronoi face dual to the edge xpi must also be computed. Only the complete volume of the Voronoi cell of x in T+ needs to be known. The Voronoi cells of the points in S forming the convex hull of S are unbounded. That causes problems when a natural neighbour of x is one of these points because the volume of its Voronoi cell, or parts of it, must be computed. The simplest solution consists of bounding S with an artificial (d + 1)-simplex big enough to contain all the points.
4.2 Theoretical Performances
By using a flipping algorithm to insert x in a d-dimensional DT T, each flip performed removes one and only one conflicting simplex from T. For example, in 3D, the first flip14 deletes the tetrahedron containing x and adds four new tetrahedra to T+; then each subsequent flip23 or flip32 deletes only one tetrahedron that was present in T before the insertion of x. Once a simplex is deleted after a flip, it is never re-introduced in T+. The work needed to insert x in T is therefore proportional to r, the number of simplices in T that conflict with x. As already mentioned, each 2D flip adds a new natural neighbour to x. The number of flips needed to insert x is therefore proportional to the degree of x (the number of incident edges) after its insertion. Without any assumptions on the distribution of the data, the average degree of a vertex in a 2D DT is 6; which means an average of four flips are needed to insert x (one flip13 plus three flip22). This is not the case in 3D (a flip32 does not add a new natural neighbour to x) and it is therefore more complicated to give a value to r. We can nevertheless affirm that the value of r will be somewhere between the number of edges and the number of tetrahedra incident to x in T+; these two values are respectively around 15.5 and 27.1 when points are distributed according to a Poisson distribution (Okabe et al., 1992). Because a flip involves a predefined number of adjacent simplices, we assume it is performed in constant time. As a result, if x conflicts with r simplices in T then O(r) time is needed to insert it. Deleting x from T+ also requires r flips; but this step is done even faster than the insertion because operations to test if a simplex is Delaunay are not needed, nor are tests to determine what flip to perform. The volume of each Voronoi cell is computed only partly, and this operation is assumed to be done in constant time. In the natural neighbour interpolation algorithm, if k is the
degree of x in a d-dimensional DT, then the volume of k Voronoi cells must be partly computed twice: with and without x in T . As a conclusion, our natural neighbour interpolation algorithm has a time complexity of O(r), which is the same as an algorithm to insert a single point in a Delaunay triangulation. However, the algorithm is obviously slower by a certain factor since x must be deleted and parts of the volumes of the Voronoi cells of its natural neighbours must be computed.
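For readers who prefer code to prose, the outline below restates the four steps of Sect. 4 as a sketch. The triangulation object and its methods (insert_with_flips, natural_neighbours, undo_flips, voronoi_cell_volume) are hypothetical placeholders for an incremental flip-based DT implementation and are not part of any particular library; only simplex_volume, which evaluates Eq. (3) with NumPy, is concrete.

```python
# Schematic restatement of the flip-based natural neighbour interpolation.
# The `dt` object and its methods are hypothetical placeholders; only
# simplex_volume (Eq. 3) is concrete, runnable code.
import math
import numpy as np

def simplex_volume(vertices):
    """Volume of a d-simplex given its d+1 vertices, via Eq. (3)."""
    v = np.asarray(vertices, dtype=float)          # shape (d+1, d)
    d = v.shape[1]
    m = np.vstack([v.T, np.ones(d + 1)])           # columns are (v_i, 1)
    return abs(np.linalg.det(m)) / math.factorial(d)

def natural_neighbour_interpolate(dt, x, attribute):
    flips = dt.insert_with_flips(x)                 # step 1: insert x, log the flips
    neighbours = dt.natural_neighbours(x)
    vol_x = dt.voronoi_cell_volume(x)               # step 2: volumes with x present
    vol_with_x = {p: dt.voronoi_cell_volume(p) for p in neighbours}
    dt.undo_flips(flips)                            # step 3: inverse flips, in reverse order
    vol_without_x = {p: dt.voronoi_cell_volume(p) for p in neighbours}  # step 4
    estimate = 0.0
    for p in neighbours:
        w = (vol_without_x[p] - vol_with_x[p]) / vol_x   # natural neighbour coordinate
        estimate += w * attribute[p]                     # Eq. (2)
    return estimate
```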
5 Conclusions Many new technologies to collect information about the Earth have been developed in recent years and, as a result, more data are available. These data are usually referenced in two- and three-dimensional space, but socalled four-dimensional datasets—that is three spatial dimensions plus a time dimension—are also collected. The GIS, with its powerful integration and spatial analysis tools, seems the perfect platform to manage these data. It started thirty years ago as a static mapping tool, has recently evolved to three dimensions (Raper, 1989) and is slowly evolving to higher dimensions (Mason et al., 1994; Raper, 2000). Interpolation is an important operation in a GIS. It is crucial in the visualisation process (generation of surfaces or contours), for the conversion of data from one format to another, to identify bad samples in a dataset or simply to have a better understanding of a dataset. Traditional interpolation methods, although relatively easy to implement, do not yield good results, especially when used with datasets having a highly irregular distribution. In two dimensions, these methods have shortcomings that create discontinuities in the surface and these shortcomings are amplified in higher dimensions. The method detailed in this paper, natural neighbour interpolation, although more complicated to implement, performs well with irregularly distributed data and is valid in any dimensions. We have presented a simple, yet efficient, algorithm that is valid in two, three and higher dimensions. We say ‘simple’ because only an incremental algorithm based on flips, with the minor modifications described, is required to implement our algorithm. We have already implemented the algorithm in two and three dimensions and we hope our method will make it possible for the GIS community to take advantage of natural neighbour interpolation for modelling geoscientific data.
References

Boissonnat JD, Cazals F (2002) Smooth surface reconstruction via natural neighbour interpolation of distance functions. Computational Geometry, 22:185–203.
Devillers O (2002) On Deletion in Delaunay Triangulations. International Journal of Computational Geometry and Applications, 12(3):193–205.
Devillers O, Pion S, Teillaud M (2002) Walking in a triangulation. International Journal of Foundations of Computer Science, 13(2):181–199.
Edelsbrunner H, Shah N (1996) Incremental Topological Flipping Works for Regular Triangulations. Algorithmica, 15:223–241.
Fortune S (1987) A Sweepline algorithm for Voronoi diagrams. Algorithmica, 2:153–174.
Gold CM (1989) Surface Interpolation, spatial adjacency and GIS. In J Raper, editor, Three Dimensional Applications in Geographic Information Systems, pages 21–35. Taylor & Francis.
Guibas LJ, Stolfi J (1985) Primitives for the Manipulation of General Subdivisions and the Computation of Voronoi Diagrams. ACM Transactions on Graphics, 4:74–123.
Lawson CL (1986) Properties of n-dimensional triangulations. Computer Aided Geometric Design, 3:231–246.
Mason NC, O'Conaill MA, Bell SBM (1994) Handling four-dimensional georeferenced data in environmental GIS. International Journal of Geographic Information Systems, 8(2):191–215.
Mostafavi MA, Gold CM, Dakowicz M (2003) Delete and insert operations in Voronoi/Delaunay methods and applications. Computers & Geosciences, 29(4):523–530.
Okabe A, Boots B, Sugihara K, Chiu SN (1992) Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. John Wiley and Sons.
Owen SJ (1992) An Implementation of Natural Neighbor Interpolation in Three Dimensions. Master's thesis, Brigham Young University, Provo, UT, USA.
Raper J, editor (1989) Three Dimensional Applications in Geographic Information Systems. Taylor & Francis, London.
Raper J (2000) Multidimensional Geographic Information Science. Taylor & Francis.
Sambridge M, Braun J, McQueen H (1995) Geophysical parameterization and interpolation of irregular data using natural neighbours. Geophysical Journal International, 122:837–857.
Shewchuk JR (2000) Sweep algorithms for constructing higher-dimensional constrained Delaunay triangulations. In Proc. 16th Annual Symp. Computational Geometry, pages 350–359. ACM Press, Hong Kong.
Sibson R (1980) A vector identity for the Dirichlet tesselation. In Mathematical Proceedings of the Cambridge Philosophical Society, 87, pages 151–155.
Sibson R (1981) A brief description of natural neighbour interpolation. In V Barnett, editor, Interpreting Multivariate Data, pages 21–36. Wiley, New York, USA.
Watson DF (1981) Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. The Computer Journal, 24(2):167–172.
Watson DF (1992) Contouring: A Guide to the Analysis and Display of Spatial Data. Pergamon Press, Oxford, UK.
Watson DF (2001) Compound Signed Decomposition, The Core of Natural Neighbor Interpolation in n-Dimensional Space. http://www.iamg.org/naturalneighbour.html.
Watson DF, Phillip G (1987) Neighborhood-Based Interpolation. Geobyte, 2(2):12–16.
Evaluating Methods for Interpolating Continuous Surfaces from Irregular Data: a Case Study
M. Hugentobler1, R.S. Purves1 and B. Schneider2
1 GIS Division, Department of Geography, University of Zürich, Winterthurerstr. 190, Zürich, 8057, Switzerland [email protected]
2 Department of Geosciences, University of Basel, Bernoullistr. 32, Basel, 4056, Switzerland
Abstract An artificial and ‘real’ set of test data are modelled as continuous surfaces by linear interpolators and three different cubic interpolators. Values derived from these surfaces, of both elevation and slope, are compared with analytical values for the artificial surface and a set of independently surveyed values for the real surface. The differences between interpolators are shown with a variety of measures, including visual inspection, global statistics and spatial variation, and the utility of cubic interpolators for representing curved areas of surfaces demonstrated.
1 Introduction

Terrain models and their derivatives are used in a wide range of applications as 'off the shelf' products. However, Schneider (2001b) points out how the representation of surface continuity in many applications is both implicit and contradictory for different products of the terrain model. The use of continuous representations of terrain, which are argued to better represent the real nature of terrain surfaces, is suggested as an important area of research. Furthermore, it is stated that the nature of a representation should also be application specific with, for example, a surface from which avalanche starting zones are to be derived implying a different set of constraints than those required to interpolate temperature. In the latter case it is sufficient to know only elevation at a point, whereas in the former
derivatives of the terrain surface such as gradient, aspect and curvature (Maggioni and Gruber, 2003) are all required.

Models of terrain can generally be categorised as either regular or irregular tessellations of point data, sometimes with additional ancillary information representing structural features such as breaks in slope or drainage divides (Weibel and Heller, 1991). Regular tessellations have dominated within the modelling and spatial analysis communities, despite the oft-cited advantages of irregular tessellations (e.g. Weibel and Heller, 1991). Within the domain of regular tessellations, geomorphologists and GIScientists have combined to examine the robustness of descriptive indices of topography, such as slope and aspect (e.g. Evans, 1980; Skidmore 1989; Corripio, 2003; Schmidt et al., 2003), hypsometry, and hydrological catchment areas (Walker and Willgoose, 1999; Gallant and Wilson, 2000), all derived using a range of data models, resolutions and algorithms. Irregular tessellations are commonly based upon a triangular irregular network (TIN), which is itself derived from point or line data (e.g. Peucker et al., 1978). Surfaces interpolated from irregular tessellations may or may not fulfil basic conditions of continuity, with Schneider (2001b) describing a family of techniques derived from Computer-Aided Graphics Design (CAGD) which may be used to attempt to fulfil continuity constraints. Hugentobler (2002) describes one of these techniques, triangular Coons patches, which are applied to the problem of representing a continuous surface. In comparison to regular tessellations, little work has been carried out to assess the implications of differing surface representations and their resulting products with irregular tessellations, as opposed to comparisons between regular and irregular tessellations (though the work of Kumler (1994) is an exception to this observation).

In this paper we introduce a case study where a suite of interpolation methods is applied to a TIN and the resulting elevations and first order surface derivatives are compared in order to assess the properties of different techniques. The paper first lists a range of methods for comparing the properties of these representations, mostly derived from the literature on regular tessellations. A methodology for carrying out such comparisons on irregularly tessellated points is then presented and a subset of the resulting values are discussed. The implications of the case study for interpolation of TINs for differing applications are then examined. Finally, some recommendations for further work in this area are made.
1.1 Techniques for evaluating terrain models

A number of techniques are described in the literature for evaluating terrain models. Perhaps the most straightforward, but nonetheless a very powerful, technique is the use of visual inspection. For instance, Wood and Fisher (1993) cite the use of a range of mapping techniques including 2D raster rendering, pseudo-3D projection, aspatial representations and a range of shaded relief, slope and aspect maps. In computer-aided graphic design (CAGD), reflection lines are often used to detect small irregularities in surfaces (Farin, 1997). Mitas and Mitasova (1999) and Schneider (1998) detect artefacts in continuous surfaces through the use of shaded pseudo-3D projection. Visual inspection has the great advantage that patterns can straightforwardly be identified that are difficult or impossible to identify through summary statistics (Wood and Fisher, 1993). On the other hand, as Schneider (2001b) points out, a surface may be visually pleasing whilst not being a realistic representation of terrain. Indeed, two different representations of the same surface may both appear realistic whilst giving different impressions of the same surface. Thus, visual inspection is probably best employed in searching for grave artefacts in the surface.

The use of artificial surfaces generated from a mathematical function with an analytical solution allows quantitative evaluation of interpolators at any point on a continuous surface. This technique has been used by a number of authors, including Corripio (2003) to compare a number of algorithms for calculating slope and Schmidt et al. (2003) to examine calculations of curvature. Analytical surfaces provide a means for rapidly collecting many points and knowing true values of elevation and derivatives at any point. However, the ability of a function with an analytical solution to have the same or similar properties to a real terrain is unclear.

Perhaps the most convincing method of assessing the quality of a terrain model is the collection of independent data from the same terrain, with precision and accuracy at least as high as that assumed for the model. Skidmore (1989) and Wise (1997) attempt to collect independent data from topographic maps for comparison with gridded elevation models. Bolstad and Stowe (1994) used GPS measurements to evaluate elevation model quality. Other work has used field measurements of elevation, slope and profile curvature for quality assessment (Giles and Franklin, 1996). However, little work appears to exist using field data to evaluate the quality of continuous models of terrain derived from irregular data. In considering the validation of interpolated terrain models with field data, several important considerations must be taken into account:
- All applications use a terrain model at some implicit scale. However, this scale-specific surface does not exist in reality and therefore cannot be measured (Schneider, 2001a).
- Factors other than the interpolation influence the errors in the resulting terrain model:
  - the discretisation of the continuous terrain surface;
  - the choice of triangular tessellation; and
  - errors in the base data used for surface derivation and comparison.

In order to assess interpolation methods the uncertainty introduced through discretisation, tessellation and base data must be small in comparison to the uncertainties resulting from the interpolation methods themselves.

2 Methodology

2.1 Overview

The aim of this study was to generate a comprehensive set of methods to compare a number of different interpolators of irregular point datasets. To achieve this, all of the methods reviewed above were used, namely the generation of an artificial surface with analytical solutions, the collection of high precision field data within a test area and the use of a variety of statistical and visual methods to compare different interpolators. In this section the techniques used to generate these comparisons are described.

2.2 Test surfaces

2.2.1 Artificial surface

An artificial surface as described in (1) was created and is shown in Figure 1.
The function was evaluated with values of x and y between 100 and 300. 162 datapoints were selected randomly, since the surface varies
smoothly, in order to build a tessellation which was triangulated using a Delaunay triangulation for the different interpolation methods described in 2.4.
Fig. 1. 3D view of the artificial surface described in Equation 1
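Equation (1) itself is not reproduced in this extract, so the following set-up sketch uses a placeholder smooth function purely to illustrate the sampling and triangulation just described. The function, the random seed and the use of numpy/scipy are assumptions; only the evaluation domain of x and y between 100 and 300 and the 162 data points come from the text.

# Set-up sketch for Sec. 2.2.1: sample a smooth analytic surface at randomly
# placed points and Delaunay-triangulate them in the plane. The surface f is
# a placeholder; the paper's Equation (1) is not reproduced here.
import numpy as np
from scipy.spatial import Delaunay

def f(x, y):                        # placeholder smooth "analytical" surface
    return 50.0 * np.sin(x / 40.0) * np.cos(y / 60.0)

rng = np.random.default_rng(0)
xy = rng.uniform(100.0, 300.0, size=(162, 2))   # 162 random data points
z = f(xy[:, 0], xy[:, 1])                       # exact elevations at samples
tin = Delaunay(xy)                              # 2D Delaunay tessellation

print(tin.simplices.shape)   # (n_triangles, 3): vertex indices per triangle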
2.2.2 Field test area The test area for the study was a square of approximately 200m x 200m near Menzingen in the Canton of Zug in Switzerland. The landscape itself is one formed by glacial deposition containing farmland bisected by a road, a small hill and an area of flat land (Figure 2). A survey of the area was carried out with a geodimeter to produce an independent dataset (called hereinafter the control dataset, consisting of 263 points) with the intention of comparison with a photogrammetric dataset collected by the Canton of Zug. However, this field data and the photogrammetric data were found to have a consistent offset of around 30cm. Since the origin of this offset was unclear, it was decided to resample the field area at approximately the same locations as the photogrammetric dataset (including breaklines along either side of the road) to produce an independent dataset (called hereinafter the triangulation dataset, consisting of 230 points) where the same survey techniques had been used. A constrained Delaunay triangulation was used to interpolate the triangulation dataset following the method of de Floriani and Puppo (1992).
To examine the sensitivity of the results to the points contained within the triangulation dataset a randomly selected set of points were swapped between the triangulation and control datasets to generate different surfaces. Finally, three profiles were measured at regular intervals from the hill to the plain for comparison with interpolated surfaces. One of these profiles is represented schematically in Figure 2 by a dashed line.
Fig. 2. The field area viewed from the north-east, showing hill and ‘road’ (running across the middle of the image and then behind bushes), and the plain (in foreground). Dashed line represents a profile schematically.
In order to avoid the problems discussed in section 1.2 with comparison of field data and surfaces, the following were considered when collecting field data:
- The two datasets (triangulation and control) must contain features of approximately the same scales. Since the data were collected by the same teams with the same purpose in mind, this problem was minimized.
- The tessellation used must portray the terrain well, with no edges crossing valleys or ridges and no long thin triangles.
- The elevation values of the two datasets have to be of similar accuracy and precision.

The problems encountered in attempting to use the photogrammetric dataset illustrated these issues well.
2.4 Interpolation methods

Four different interpolators were used to create continuous elevation surfaces using the generated triangulations.

2.4.1 Linear interpolation

Linear interpolation is the simplest interpolation scheme for TINs and also the most widely used. The surface within each triangle is a plane passing through the three vertices. Each facet has one value for gradient and one for aspect, while the curvature is zero everywhere. These properties make it relatively straightforward to derive more complex properties such as viewsheds and catchment areas.

2.4.2 Triangular Coons patch

The triangular Coons patch (Barnhill and Gregory, 1975; Hugentobler, 2002) is a method using combinations of ruled surfaces to interpolate to a triangular network of boundary curves and cross-boundary derivatives. In this paper a cubic version was used, which interpolates to the position and the first order cross-derivatives of the boundary curves. Therefore, G1-continuous surfaces, which can be considered synonymous with first-order continuity (Farin, 1997), can be generated. The cross-derivatives between the data points have been interpolated linearly along each triangle edge.

2.4.3 Clough-Tocher Bezier splines

Clough-Tocher Bezier splines (Farin, 1997) are surfaces specified by means of control points. These control points attract the surface and allow the shape of the surface to be controlled in an intuitive way. To achieve G1-continuity, each triangle of the TIN has to be split into three subtriangles and a cubic Bezier triangle has to be specified for each. The shape of the surface is not fully determined by the condition of G1-continuity; a further assumption as to how the cross-derivatives along the edges of the macrotriangles behave has to be made. Again, linear interpolation of the cross-derivatives has been used.

2.4.4 Smoothed version of the Clough-Tocher spline

A smoothed version of the Clough-Tocher spline (Farin, 1985) was also utilised. With a Lagrange minimisation, the behaviour of the cross-derivatives along the triangle edges is constrained such that the deviation of the control points from a G2 transition between two triangles is minimised.
G1-continuity between the triangles has also been specified as a constraint for the minimisation.

2.5 Comparison techniques

2.5.1 Visual inspection

A shaded 3D projection was prepared for each of the four interpolators from the same location (with the view of the test area covering the road and the hill, where it was considered likely that artefacts may occur). These images were visually inspected to find irregularities and artefacts of the interpolation and tessellation.

2.5.2 Artificial surfaces

Values of elevation and derivatives can be calculated for any point on the surface. Elevation, and the magnitude of the first derivative (gradient), were compared for a total of around 25 thousand points and a number of summary statistics and relationships derived. The following selection is presented in this paper:
- Mean unsigned deviations of 'real' elevation values from the four interpolation methods.
- Correlations of signed deviations of interpolators with respect to the total curvature (Gallant and Wilson, 2000) at a point were compared.
- Graphs of slope values derived from linear and Coons interpolation with those derived from the artificial function were produced.

2.5.3 Comparison of triangulation and control datasets

Global statistics were calculated for the deviation of elevation at 263 points measured in the control dataset from the triangulation dataset. The stability of these results was measured by recalculating these global statistics with 80 points randomly swapped between the two datasets (and four resulting new interpolated surfaces generated). To examine the variation in sensitivity of the interpolators to the nature of the surface being modelled, points on the surface were classified as lying on the hill, road (breakline) or plain. Global statistics were calculated and compared for these subsets of points.

A second set of comparisons mapped the spatial variation of the signed deviations from the surface in order to see whether deviations of the interpolated surface from the control dataset were spatially autocorrelated with particular areas of the surface. Finally, the three profiles measured were
graphed against a two dimensional profile extracted along each of these profiles to examine in which areas the interpolated surfaces showed the greatest agreement and deviation from the measured points.
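As a concrete illustration of the linear interpolator of Section 2.4.1 and the global statistics of Sections 2.5.2 and 2.5.3, the sketch below evaluates a piecewise-planar TIN surface at check points via barycentric coordinates and reports the mean unsigned, maximum and standard deviations. The analytic surface, the "control" points and the use of numpy/scipy are stand-ins and assumptions, not the study's data.

# Linear interpolation on a Delaunay TIN (Sec. 2.4.1) and the global
# deviation statistics of Secs. 2.5.2-2.5.3. The surface and the "control"
# points are synthetic stand-ins.
import numpy as np
from scipy.spatial import Delaunay

def linear_tin_interpolate(tin, z, pts):
    """Evaluate the piecewise-planar TIN surface at pts (m x 2 array)."""
    simp = tin.find_simplex(pts)
    if np.any(simp < 0):
        raise ValueError("point outside the triangulation")
    T = tin.transform[simp]                      # affine maps to barycentric
    b = np.einsum('ijk,ik->ij', T[:, :2, :], pts - T[:, 2, :])
    bary = np.c_[b, 1.0 - b.sum(axis=1)]         # three barycentric coords
    return np.einsum('ij,ij->i', bary, z[tin.simplices[simp]])

def deviation_stats(true_z, interp_z):
    d = interp_z - true_z
    return {'mean unsigned': np.mean(np.abs(d)),
            'maximum': np.max(np.abs(d)),
            'standard deviation': np.std(d)}

if __name__ == "__main__":
    f = lambda x, y: 50.0 * np.sin(x / 40.0) * np.cos(y / 60.0)  # placeholder
    rng = np.random.default_rng(0)
    xy = rng.uniform(100.0, 300.0, (162, 2))     # "triangulation" points
    tin, z = Delaunay(xy), f(xy[:, 0], xy[:, 1])
    ctrl = rng.uniform(100.0, 300.0, (263, 2))   # synthetic "control" points
    ctrl = ctrl[tin.find_simplex(ctrl) >= 0]     # keep points inside the TIN
    est = linear_tin_interpolate(tin, z, ctrl)
    print(deviation_stats(f(ctrl[:, 0], ctrl[:, 1]), est))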
3 Results

In this section the results are presented for the comparisons described above. Figures 3 and 4 show an overview of the surface and a detail of the surface as generated from the four interpolators and represented as a shaded 3D surface.
Fig. 3. Overview of the surface generated by a linear interpolator and viewed from the north east
Fig. 4. Detail of the surface showing the road and hill, viewed from the north-east for (a) a linear interpolator, (b) triangular Coons patches, (c) Clough-Tocher Bezier splines and (d) smoothed Clough-Tocher splines.
Table 1 shows the unsigned deviation of the interpolated values from the artificial surface together with the maximum and standard deviation of these values. The correlation of the signed deviation of the surface with curvature was found to be 0.64 for the linear surface, whereas it was only 0.08 for the triangular Coons patch and Clough-Tocher Bezier splines, and 0.07 for the smoothed Clough-Tocher splines. Figure 5 shows the relationship between the analytically derived slope and slope calculated from linearly interpolated triangles and Coons patches respectively. The standard deviation of slope from the analytically derived value was 5.25° for the linearly interpolated surface and 3.93° for the Coons patches.

                     Linear    Coons    Clough-Tocher   Smoothed Clough-Tocher
Mean deviation       2.61 m    0.87 m   0.83 m          0.83 m
Maximum deviation    10.90 m   7.16 m   7.17 m          7.04 m
Standard deviation   2.12 m    1 m      1 m             0.99 m

Table 1. Deviations of the interpolated elevation values from the artificial surface for around 25,000 points
Fig. 5. Relationship between slope derived from the linear interpolator (a) and Coons patches (b) with ‘real’ values of slope calculated analytically
Table 2 shows global statistics for the comparison of the elevation of the interpolated surfaces from the triangulation dataset in comparison with the control dataset, along with the results obtained when 80 points were randomly swapped between datasets.

Table 2. Deviations of the elevation of control data points from the interpolated surfaces derived from the triangulation dataset, with bracketed values representing results when 80 data points were randomly swapped between datasets. 263 points were used in the evaluation.
Statistics were also calculated for four sub-divisions of the test area (hill, road and two parts of the plain). These statistics showed that the linear interpolator’s performance was similar to that of the other interpolators on the plain, and slightly worse than the cubic interpolators in other areas (as was the case in the global statistics). Since the global statistics were generally similar for the three cubic interpolators, spatial comparisons are only presented for comparisons between surfaces interpolated linearly and using Coons patches. Figures 6a and 6b illustrate the results obtained by mapping the variation in deviation of the control dataset from the interpolated triangulation dataset for two interpolators – the linear and Coons patch. In Figure 6c the interpolator with the smaller deviation from the triangulated interpolated surface (linear or Coons) is mapped along with the unsigned magnitude of the deviation.
Fig. 6. Deviation of the elevation values derived from the control dataset with respect to the interpolated surfaces from the triangulated datasets ((a) linear interpolation and (b) Coons patches). The hill is the triangular area in the bottom centre of the picture, with the plain to the top right. The image is oriented with north to the top. (c) shows which interpolator (linear or Coons) is closer to the surface.
In Figure 7 a comparison of values measured along a 2D profile, as indicated in Figure 2, with two interpolated surfaces running from the top of the hill down onto the plain is shown. The deviations are relatively small, and so are shown magnified by a factor of ten.
Fig. 7. Deviation of a profile from interpolated surface for linear and triangular Coons patches. Deviations are small so are shown magnified by a factor of ten.
4 Discussion

The 3D images of the test area show, not surprisingly, the most apparent artefacts for the linear interpolator. The linear facets do not represent the hill well, since the breaklines they introduce are clearly derived from the tessellation rather than the terrain. All four images show artefacts of the triangular tessellation, with the smoothed Clough-Tocher most effectively removing these artefacts.

The interpolated artificial surface shows the greatest deviation from the analytical values for the linear interpolator (Table 1). The deviations of the three cubic interpolators are all of similar magnitude. Deviations of the linear interpolator from the surface also showed a strong tendency to correlate with convexities and concavities in the surface, illustrating well how this representation fails to deal with these important features in the terrain.

Figure 5 shows how the deviation of interpolated values of gradient varies with the value of gradient calculated directly from the analytical surface. Smaller values of slope show higher deviations, and the linear interpolator showed greater mean deviation from slope than the other interpolators, though in all cases the deviations were relatively small. In a landscape where absolute values of slope are both small and important (i.e. variation
in elevation is small), then the choice of interpolator is much more important than for a steeper landscape, such as the hill in the test area used here.

The global statistics demonstrate consistent results between the three cubic interpolators, with the performance of the linear interpolator being slightly worse. When the landscape is subdivided, all four interpolators performed equally well on the planar areas, where curvature did not exist at the scale of measurement (e.g. on the plain). This is consistent with the result obtained by the correlation of the linear interpolator with the convexities and concavities.

Figures 6a and 6b, mapping the signed deviation of the two interpolators, show an overall similar pattern. However, the hill area (the triangle formed in the lower part of the figure) shows more positive deviation for the Coons patches, and the magnitude of negative deviations is smaller than for the linear interpolation. This result again illustrates the efficacy of Coons patches in representing curved surfaces, with the variation in curvature for each triangle concentrated along its edges. Figure 6c further shows the better performance of the Coons patches on the hill, whilst the two interpolators perform similarly on the plain.

Figure 7 illustrates that both interpolators lie relatively close to the measured surface. However, they also show high-frequency correlated variation, which would result in uncertainty in derived values of slope and curvature. These variations are most likely artefacts of the underlying tessellation.
5 Conclusions and further work

In this paper a range of techniques has been used to compare four different interpolators on 'real' and artificial test datasets. In general the cubic interpolators performed better than the linear one. However, little quantitative difference was found between the three cubic interpolators, although the smoothed Clough-Tocher reduced prominent artefacts when visualised. On surfaces where curvature is an important property, a linear interpolation is not adequate, and in order to allow local inflexion points at least a third order interpolator is required. In turn, curvature is defined by the data resolution, and careful thought must be given to the implicit scale of the features being modelled in making a choice of interpolator. This study has indicated the value of a range of techniques for comparing irregular tessellations, and further work will investigate the influence of interpolator-induced variations in these derived primary properties of topography on compound indices (Moore et al., 1993) such as
stream power, with the aim of specifying useful interpolators to users for differing applications.
Acknowledgements The Canton of Zug and the local farmers are thanked for the provision of data and permission to work on our test area. Alastair Edwardes is thanked for his assistance in the collection of field data in sometimes inclement conditions. This research was funded by the Swiss National Science Fund (Project No 59578).
References

Barnhill RE, Gregory JA (1975) Compatible smooth interpolation in triangles. Journal of Approximation Theory 15: 214-225
Bolstad P, Stowe T (1994) An evaluation of DEM accuracy: elevation, slope and aspect. Photogrammetric Engineering and Remote Sensing 60: 1327–1332
Corripio JG (2003) Vectorial algebra algorithms for calculating terrain parameters from DEMs and solar radiation modelling in mountainous terrain. IJGIS 17: 1–23
Evans IS (1980) An integrated system of terrain analysis for slope mapping. Zeitschrift für Geomorphologie 36: 274-295
Farin G (1985) A modified Clough-Tocher interpolant. Computer Aided Geometric Design 2: 19–27
Farin G (1997) Curves and surfaces for CAGD. A practical guide (Academic Press)
de Floriani L, Puppo E (1992) An online algorithm for constrained Delaunay triangulation. Graphical Models and Image Processing 54: 290-300
Gallant JC, Wilson JP (2000) Primary Topographic Attributes. In Terrain Analysis: Principles and Applications edited by Wilson, J.P. and Gallant, J.C. (Wiley): 51-85
Giles P, Franklin S (1996) Comparison of derivative topographic surfaces of a DEM generated from stereoscopic spot images with field measurements. Photogrammetric Engineering and Remote Sensing 62: 1165–1171
Hugentobler M (2002) Interpolation of continuous surfaces for terrain surfaces with Coons patches. In Proceedings of GISRUK 2002 (Sheffield, UK): 13-15
Kumler M (1994) An intensive comparison of TINs and DEMs. Cartographica (Monograph 45), 31: 2
Maggioni M, Gruber U (2003) The influence of topographic parameters on avalanche release dimension and frequency. Cold Regions Science and Technology 37: 407-419
Mitas L, Mitasova H (1999) Spatial interpolation. In Geographical Information Systems edited by P. Longley, M.F. Goodchild, D.J. Maguire, and D.W. Rhind (Longman): 481–492
Moore ID, Grayson RB, Landson AR (1993) Digital terrain modelling: A review of hydrological, geomorphological and biological applications. In Terrain Analysis and Distributed Modelling in Hydrology edited by Beven, K.J. and Moore, I.D. (Wiley): 7-34
Peucker TK, Fowler RJ, Little JJ, Mark DM (1978) The Triangulated Irregular Network. Proceedings of the American Society of Photogrammetry: Digital Terrain Models (DTM) Symposium, St. Louis, Missouri, May 9-11, 1978: 516-540
Schneider B (1998) Geomorphologisch plausible Rekonstruktion der digitalen Repräsentation von Geländeoberflächen aus Höhenliniendaten. PhD thesis, University of Zurich
Schneider B (2001a) On the uncertainty of local shape of lines and surfaces. Cartography and Geographic Information Science 28: 237–247
Schneider B (2001b) Phenomenon-based specification of the digital representation of terrain surfaces. Transactions in GIS 5: 39–52
Schmidt J, Evans IS, Brinkmann J (2003) Comparison of polynomial models for land surface curvature calculation. IJGIS 17: 797–814
Skidmore A (1989) A comparison of techniques for calculating gradient and aspect from a gridded digital elevation model. IJGIS 3: 323–334
Walker JP, Willgoose GR (1999) On the effect of digital terrain model accuracy on hydrology and geomorphology. Water Resources Research 35: 2259-2268
Weibel R, Heller M (1991) Digital terrain modeling. In GIS: Principles and Applications edited by Maguire, D.J., Goodchild, M.F. and Rhind, D.W. (Wiley, New York): 269-297
Wise S (1997) The effect of GIS interpolation errors on the use of digital elevation models in geomorphology. In Landform monitoring, modelling and analysis edited by S. Lane, K. Richards, and J. Chandler (Wiley): 139–164
Wood J, Fisher P (1993) Assessing interpolation accuracy in elevation models. IEEE Computer Graphics & Applications: 48–56
Contour Smoothing Based on Weighted Smoothing Splines Leonor Maria Oliveira Malva Departamento de Matemática, F.C.T.U.C., Apartado 3008, 3001 454 Coimbra, Portugal, email: [email protected]
Abstract

Here we present a contour-smoothing algorithm based on weighted smoothing splines for contour extraction from a triangular irregular network (TIN) structure based on sides. Weighted smoothing splines are one-variable functions designed for approximating oscillatory data. Here some of their properties are derived for a small space of functions, working with few knots and special boundary conditions. However, in order to apply these properties to a two-variable application such as contour smoothing, local reference frames for direct and inverse transformation are required. The advantage of using weighted smoothing splines, as compared to pure geometric constructions such as the approximation by parabolic arcs or other types of spline function, is that these functions adjust better to the data and avoid the usual oscillations of spline functions. We note that Bezier and B-spline techniques result in convenient, alternative representations of the same spline curves. While these techniques could be adapted to the weighted smoothing spline context, there is no advantage, as our approach is simple enough.
1 Introduction

By a triangular irregular network we mean a triangulation of the convex hull of scattered data in space. In this prismatic surface, each triangle is represented by a plane with the following equation,

f(x, y) = b_i z_i + b_j z_j + b_k z_k    (1.1)
where b_i, b_j, b_k are the barycentric coordinates. The interpolation function is a continuous but non-differentiable (non-smooth) surface. This means that the contours of the reconstructed surface are polygonal lines, parallel in the interior of each triangle and forming sharp connections between adjacent triangles. Following Christensen (2001), these will be called raw contours. In order to produce smooth contours from this kind of data it is usual to apply smoothing procedures such as B-splines or Bézier curves.

1.1 Data structure

In TIN models, data can be stored in different ways, namely in structures of triangles, sides, nodes or combinations of these. The side structure is more suitable for contour extraction, while the structure based on triangles is more suited to the computation of slope, aspect and volume. Therefore, we will describe a structure based on sides. Usually a structure based on sides (see figure 1) has three tables: a table of points, composed of four columns (point number, x coordinate, y coordinate, z coordinate); a table of sides, composed of five columns (side number, first node, second node, left triangle, right triangle); and a table of triangles, composed of four columns (triangle number, side one, side two, side three).
Fig. 1. Side structure
To implement our algorithm it is necessary to include another table composed of three columns: (triangle number, x barycentre coordinate, y barycentre coordinate). This table allows the computation of the medians of each triangle. For example, in figure 2 the medians P_1C_1, P_2C_1 and P_4C_1 are computed from the barycentre C_1 and from the vertices P_1, P_2, P_4.
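A minimal sketch of this side-based structure, including the extra barycentre table, is given below using plain Python containers; the field and function names are illustrative and not taken from the paper.

# The tables of the side-based structure described above, sketched with
# plain Python containers. Field names are illustrative, not the paper's.
from dataclasses import dataclass

@dataclass
class Point:                 # table of points: coordinates of one node
    x: float
    y: float
    z: float

@dataclass
class Side:                  # table of sides: end nodes plus adjacent triangles
    n1: int
    n2: int
    left_tri: int            # 0 means "no triangle" (outside the TIN)
    right_tri: int

@dataclass
class Triangle:              # table of triangles: its three sides
    s1: int
    s2: int
    s3: int

def barycentre(points, sides, triangles, tri_id):
    """Entry of the extra table: (x, y) barycentre of triangle tri_id."""
    t = triangles[tri_id]
    nodes = {n for s in (t.s1, t.s2, t.s3)
             for n in (sides[s].n1, sides[s].n2)}          # the 3 vertices
    return (sum(points[n].x for n in nodes) / 3.0,
            sum(points[n].y for n in nodes) / 3.0)

if __name__ == "__main__":
    points = {1: Point(0, 0, 10), 2: Point(4, 0, 12), 3: Point(0, 3, 11)}
    sides = {1: Side(1, 2, 1, 0), 2: Side(2, 3, 1, 0), 3: Side(3, 1, 1, 0)}
    triangles = {1: Triangle(1, 2, 3)}
    print(barycentre(points, sides, triangles, 1))          # (1.33..., 1.0)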
1.2 Extracting linear contours

Let us define the elevation of the lowest contour. First it is necessary to verify whether this elevation lies between the elevations of the nodes of the first side. If it does, we compute the intersection of the contour with the side; otherwise, the search proceeds through the sides of the triangulation until an intersection point is reached. At this point we know the initial side and choose the left or the right triangle for a particular contour. Then we search for a new side that belongs to the same triangle or to the adjacent triangle. This can be computed easily because, in the table of sides, each side carries the indication of the triangle on its left and on its right. The procedure continues until we reach the first side again (for closed contours) or the boundary (zero triangle) for open contours. If the boundary is reached, the search proceeds again from the initial side, this time into the triangle (left or right) not chosen initially; in this way we compute a second part of the contour. In the case of large files it is likely that a contour has several sections. Throughout the procedure, every triangle found is placed in a vector of used triangles.
Fig. 2. The geometry of the procedure
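The side-to-side walk of Section 1.2 can be sketched as follows, assuming the table layout above (each side stores its two nodes and its left and right triangles, with 0 meaning "outside"). Multi-section contours, the vector of used triangles and degenerate cases (a contour passing exactly through a node) are omitted; the data at the bottom are a made-up two-triangle example.

# Sketch of the side-to-side walk of Sec. 1.2: intersect the contour level
# with a starting side, then repeatedly cross into the adjacent triangle
# until the start side or the boundary (triangle 0) is reached.
# Side format: (node1, node2, left_triangle, right_triangle).

def crosses(points, side, level):
    z1, z2 = points[side[0]][2], points[side[1]][2]
    return min(z1, z2) < level <= max(z1, z2)

def intersection(points, side, level):
    (x1, y1, z1), (x2, y2, z2) = points[side[0]], points[side[1]]
    t = (level - z1) / (z2 - z1)               # linear interpolation on edge
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

def trace(points, sides, triangles, start_side, level):
    """Trace one contour section; assumes start_side crosses the level."""
    poly = [intersection(points, sides[start_side][:2], level)]
    tri = sides[start_side][2] or sides[start_side][3]   # left, else right
    prev = start_side
    while tri:                                           # 0 means boundary
        nxt = next(s for s in triangles[tri]
                   if s != prev and crosses(points, sides[s][:2], level))
        poly.append(intersection(points, sides[nxt][:2], level))
        if nxt == start_side:                            # closed contour
            break
        # step into the triangle on the other side of the crossed side
        tri = sides[nxt][3] if sides[nxt][2] == tri else sides[nxt][2]
        prev = nxt
    return poly

if __name__ == "__main__":
    # two triangles sharing side 2: nodes 1-2-3 and 2-4-3 (z on the nodes)
    points = {1: (0, 0, 0), 2: (2, 0, 4), 3: (1, 2, 2), 4: (3, 2, 6)}
    sides = {1: (1, 2, 1, 0), 2: (2, 3, 2, 1), 3: (3, 1, 1, 0),
             4: (2, 4, 2, 0), 5: (4, 3, 2, 0)}
    triangles = {1: (1, 2, 3), 2: (2, 4, 5)}
    print(trace(points, sides, triangles, 1, 3.0))
    # [(1.5, 0.0), (1.5, 1.0), (1.5, 2.0)]: the raw contour at level 3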
1.3 Christensen procedure

Our algorithm is in part based on the procedure of Christensen, which smooths portions of raw contours included between the medians of adjacent triangles and the vertices of the same contour on the common edge of those triangles. As an illustration, take the adjacent triangles [P_1, P_2, P_4] and [P_2, P_3, P_4] (see figure 2); these are cut by a raw contour [V_1 V_2 V_3] forming a sharp angle at V_2.
The smoothing procedure consists of substituting the raw contour between H_1 and H_2, the intersection points of the contour with the medians of the triangles, by a parabolic arc tangent to the raw contour at H_1 and H_2. However, this is done by a pure geometric procedure. Such a solution is illustrated in figure 3 and is, as Christensen notes, closer to the raw contours than Bezier curves or B-splines, because the latter put the contours at the wrong elevation due to their oscillatory behaviour.
Fig. 3. Interpolation of smooth contours using a parabola
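One standard way of realising such a tangent parabolic arc is the quadratic Bezier with control points H_1, V_2, H_2: it is a parabola, tangent to the raw-contour segment H_1V_2 at H_1 and to V_2H_2 at H_2. Whether this coincides in every detail with Christensen's own construction is not claimed here; the sketch below merely satisfies the same tangency conditions, and the sample coordinates are made up.

# A parabolic arc tangent to the raw contour at H1 and H2: the quadratic
# Bezier with control points H1, V2, H2. It is a parabola, tangent to the
# segment H1V2 at H1 and to V2H2 at H2.

def parabolic_arc(h1, v2, h2, n=16):
    """Return n+1 points of the quadratic Bezier B(u), u in [0, 1]."""
    pts = []
    for i in range(n + 1):
        u = i / n
        x = (1 - u)**2 * h1[0] + 2*u*(1 - u) * v2[0] + u**2 * h2[0]
        y = (1 - u)**2 * h1[1] + 2*u*(1 - u) * v2[1] + u**2 * h2[1]
        pts.append((x, y))
    return pts

# Example: H1 and H2 lie on the medians, V2 is the sharp raw-contour vertex.
print(parabolic_arc((0.0, 0.0), (1.0, 1.0), (2.0, 0.2), n=4))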
2 Weighted smoothing splines

The weighted splines were introduced in independent works by Salkauskas (1984) and Cinquin (1981) for one-variable interpolation of oscillatory data. However, this is not the only application of such functions, as we will see later. These functions are chosen from

H^2[a,b] = { f : f' is absolutely continuous on [a,b], and f'' ∈ L^2[a,b] }    (2.1)

in such a way as to interpolate and minimize the semi-norm

J(f) = ∫_a^b w(t) [f''(t)]^2 dt,   w(t) ≥ 0, w ≢ 0, t ∈ [a,b]    (2.2)

A classical smoothing spline (Wahba 1990) does not have to interpolate and minimizes

Q(f) = ∫_a^b [f''(t)]^2 dt + α Σ_{i=1}^{n} [z_i − b_i]^2,   t ∈ [a,b]    (2.3)

The optimum is a certain cubic spline. As in J, a weight function can be included for additional shape control. Here, as in Malva and Salkauskas (2000), we need only the simplest setting of these splines: we restrict f to a small space of functions and work with few knots and special boundary conditions.
These are C^1 functions, which means that the first derivative is continuous but not necessarily the second. Such functions can be written as a linear combination of the Hermite cardinal functions φ_i, ψ_i, i = 1, 2, 3, for ordinate and derivative interpolation at the knots t_1 < t_2 < t_3. Then a piecewise cubic function s interpolating ordinate b_1 and slope m_1 at t_1, ordinate b and slope m at t_2, and ordinate b_3 and slope m_3 at t_3, can be written as

s = b_1 φ_1 + m_1 ψ_1 + b φ_2 + m ψ_2 + b_3 φ_3 + m_3 ψ_3    (2.4)

on [t_1, t_3].

Proposition 2.1: For any nonnegative weight function w, which is not identically zero and is constant on the intervals [t_1, t_2), [t_2, t_3), and for any α > 0, there is a unique σ in the space of C^1 piecewise cubics with knots t_1 < t_2 < t_3, interpolating ordinate b_1 and slope m_1 at t_1 and ordinate b_3 and slope m_3 at t_3, which minimizes

Q(s) = ∫_{t_1}^{t_3} w(t) [s''(t)]^2 dt + α [z − s(t_2)]^2    (2.5)

for any constant z. Furthermore,

lim_{α→∞} σ(t_2) = z    (2.6)

Proof. The proof follows the proof of Proposition 5.1 in Malva and Salkauskas (2000), with the coefficients A, B, C, D and E adapted accordingly. □

Now we can make special choices of ordinates and slopes and end up with the following cases.

Case one: a piecewise cubic function s interpolating zero ordinate and slope m_1 at t_1, ordinate b and slope m at t_2, and zero ordinate and slope m_3 at t_3, can be written as

s = m_1 ψ_1 + b φ_2 + m ψ_2 + m_3 ψ_3    (2.9)

on [t_1, t_3] (see figure 4, case 1).

Case two: a piecewise cubic function s interpolating zero ordinate and slope m_1 at t_1, ordinate b and slope m at t_2, and ordinate b_3 and slope m_3 at t_3, can be written as

s = m_1 ψ_1 + b φ_2 + m ψ_2 + b_3 φ_3 + m_3 ψ_3    (2.10)

on [t_1, t_3] (see figure 4, case 2).
(2.10)
2.1 Local reference frames

Let [H_1 V_2 H_2] be a portion of a raw contour to be smoothed. This polyline can have any orientation in the (x, y) coordinate system of the map. To apply the preceding theory it is necessary to have a local reference frame in which the points H_1, V_2, H_2 correspond to a partition t_1 < t_2 < t_3. We identify three different situations.

Case 1: This corresponds to figure 5, where the local s axis has the orientation from H_1 to H_2. The st-coordinates are related to the general ones by

t = (x − x_H1) cos θ + (y − y_H1) sin θ
s = −(x − x_H1) sin θ + (y − y_H1) cos θ    (2.11)

In this case the points H_1, V_2, H_2 have abscissas t_1, t_2, t_3, with t_2 − t_1 = h and t_3 − t_2 = k.

Fig. 5. Local frame of reference, case 1

In this frame of reference the ordinate at H_1 and at H_2 is zero. The values of the slope of H_1V_2 at t_1 and of the slope of V_2H_2 at t_3 can be used to construct a weighted smoothing spline tangent to H_1V_2 at t_1 and tangent to H_2V_2 at t_3. The parameter α and the ordinate z define the degree of interpolation at V_2. Once the weighted smoothing spline is computed, one must apply the inverse transformation

x = t cos θ − s sin θ + x_H1
y = t sin θ + s cos θ + y_H1    (2.12)
to represent the computed function in the actual position.

Case 2: In some cases choosing a local frame as in case 1 does not lead to a partition t_1 < t_2 < t_3; instead we get h > 0 and k < 0 (as can be seen in figure 6). In such cases we choose a new local frame with the s' axis connecting H_1 with the point P, which lies in the direction of H_2V_2 at distance |H_2V_2| from V_2. In this frame we have in particular that h = k. There are in this case two direct transformations, involving the angles θ and β, needed to arrange the reference frame so that Proposition 2.1 can be applied, and correspondingly two inverse transformations to put the computed function back into xy-coordinates. The values of m_1, m_3 are computed in the same way in the coordinates (t', s'), and the value b_3 is the ordinate of H_2 on the same axes.
Fig. 6. Local frame of reference, case 2
Case 3: In the last case, in the local frame we get h < 0 and k > 0 (see figure 7). Therefore we choose a new frame of reference centred on H_1, with the direction t' perpendicular to H_2P, where P lies in the direction of H_1V_2, at the distance |H_1V_2| from V_2. The rotation angles are computed in a similar way to the previous cases.
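Before turning to the weight function, the direct and inverse local-frame transformations can be sketched as a rotation by θ about H_1, following (2.11) and (2.12) as reconstructed above. The exact sign and axis conventions of the original figures cannot be verified from this extract, so the code should be read as an assumption-laden illustration; the sample coordinates are made up.

# Direct and inverse local-frame transformations, following (2.11)-(2.12) as
# reconstructed above: a rotation by theta about H1 = (xh, yh).
from math import cos, sin, atan2

def to_local(x, y, xh, yh, theta):
    t = (x - xh) * cos(theta) + (y - yh) * sin(theta)
    s = -(x - xh) * sin(theta) + (y - yh) * cos(theta)
    return t, s

def to_map(t, s, xh, yh, theta):
    x = t * cos(theta) - s * sin(theta) + xh
    y = t * sin(theta) + s * cos(theta) + yh
    return x, y

# theta: orientation of the H1 -> H2 direction, as in case 1 of the text.
h1, v2, h2 = (2.0, 1.0), (3.0, 3.0), (5.0, 2.0)
theta = atan2(h2[1] - h1[1], h2[0] - h1[0])
print(to_local(*v2, *h1, theta))                         # local coords of V2
print(to_map(*to_local(*h2, *h1, theta), *h1, theta))    # round-trips to H2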
Fig. 7. Local reference frame, case 3

2.2 The weight function

The motivation underlying the choice of the weight function is to allow variations in the data to be followed by the adjusted spline function. The information available in this case consists of the nodes {t_i, i = 1, 2, 3} and of the corresponding ordinates {s_i, i = 1, 2, 3}, and its variability can be assessed by the slope between adjacent segments. In the tests we choose the weight function introduced by Salkauskas (1984),

w(t) = {1 + [(s_i − s_{i−1}) / h_i]^2}^(−3),   t ∈ [t_{i−1}, t_i),  i = 2, 3    (2.13)

The choice was made to make ∫_a^b w(t) [s''(t)]^2 dt resemble an L^2-norm of the curvature of s.

2.3 Smoothing with weighted smoothing splines

Figure 8 is an application of Proposition 2.1. The non-interpolation case corresponds to α = 0, and the interpolating spline corresponds to α → ∞. We can also observe an intermediate spline corresponding to an arbitrary value of 0 < α < ∞.
Fig. 8. Smoothing weighted spline, case 1
In figure 9 we can see the difference between the parabola obtained from the Christensen procedure and the weighted spline from our procedure. As can be observed, the weighted spline is closer to the raw contour than the parabola.
Fig. 9. Approximations to the raw contour
3 Applications
3.1 Generation of contours from spot heights

In order to apply the theory presented above to the generation of contours from spot heights, let us make a Delaunay triangulation of the spot heights included in figure 11. This is shown by the dashed lines in figure 10 a) and b). Figure 10 b) corresponds to the application of the results of Proposition 2.1 with α = 0 to the raw contours of figure 10 a). We must point out that, in a similar way to the procedure of Christensen, where the parabola is approximated by a polyline, in this application the weighted smoothing splines are approximated by sets of polylines connecting points of the spline at a regular spacing of the parameter t.
Fig. 10. Contours from spot heights
3.2 Generation of intermediate contours

Let us take a portion of a ten-metre-equidistance contour map of a region of 2.0 × 2.0 km in Coimbra, in the centre of Portugal (see figure 11).
Fig. 11. Original contour map
Fig. 12. Weighted spline contour map
Since the vertices lie on contour lines of the map, any contour triangulation will produce raw contours passing through the vertices of the triangulation. That fact prevents the reconstruction of the original contours either by the procedure of Christensen or by the application of our procedure. However the same triangulation can be used to compute intermediate contours as can be seen on figure 12. This results from a Delaunay triangulation of all the points in the contours and the application of the weighted
spline for the intermediate contours. The original contours are hatched to allow comparison with the original picture.
4 Conclusions

The Christensen procedure was designed to smooth raw contours in such a way that the resulting contour is closer to the raw one than approximations with B-splines or Bezier curves. However, it is a purely geometric procedure. With the application of weighted smoothing splines we get contours that, depending on the value of α, can be closer to the raw contours than those of the Christensen procedure, and at the same time we have the advantage of producing C^1 curves. The previous sections show that this can be applied to contour extraction from spot heights or to the generation of intermediate contours.
References

Cinquin P (1981) Splines Unidimensionnelles Sous Tension et Bidimensionnelles Paramétrées: Deux Applications Médicales. Thèse, Université de Saint-Étienne
Christensen AHJ (2001) Contour Smoothing by an Eclectic Procedure. Photogrammetric Engineering & Remote Sensing 67(4): 511-517
Malva L, Salkauskas K (2000) Enforced Drainage Terrain Models Using Minimum Norm Networks and Smoothing Splines. Rocky Mountain Journal of Mathematics 30(3): 1075-1109
Rahman AA (1994) Design and evaluation of TIN interpolation algorithms. EGIS Foundation
Salkauskas K (1984) C^1 splines for interpolation of rapidly varying data. Rocky Mountain Journal of Mathematics 14(1): 239-250
Wahba G (1990) Spline models for observational data. SIAM Stud. Appl. Math. 59
Flooding Triangulated Terrain*
Yuanxin Liu and Jack Snoeyink
Department of Computer Science, University of North Carolina, Chapel Hill, USA
{liuy,snoeyink}@cs.unc.edu
* Research partially supported by NSF grant 9988742.
Abstract We extend pit filling and basin hierarchy computation to TIN terrain models. These operations are relatively easy to implement in drainage computations based on networks (e.g., raster D8 or Voronoi dual) but robustness issues make them difficult to implement in an otherwise appealing model of water flow on a continuous surface such as a TIN. We suggest a consistent solution of the robustness issues, then augment the basin hierarchy graph with different functions for how basins fill and spill to simplify the watershed graph to the essentials. Our solutions can be tuned by choosing a small number of intuitive parameters to suit applications that require a data-dependent selection of basin hierarchies.
1 Introduction and Previous Work

Without a doubt, the computation of drainage characteristics, including basin boundaries, is one of the successes of GIS analysis. Digital data and GIS are widely employed to partition terrain into hydrological units, such as watersheds and basins. The US proposal for extending the hydrologic unit mapping from a 4-level to a 6-level hierarchy [5] includes discussion of the role of DEMs in what was formerly a manual cartographic process. A GIS can give preliminary answers to analysis questions such as how much rainfall runoff a downstream point can receive. Detailed hydrologic analysis will apply numerical computation to hillslope patches created to have uniform slope, soil, and/or vegetation characteristics.

The raster DEM is the most common terrain and hydrology model in GIS, because regular grid representations of terrain are most amenable to computation. Many algorithms have been proposed for pit-filling [8,9,10],
barrier-breaching [16], flow direction assignment [2,15], basin delineation [12,14,23,24], and other steps of hydrologic modeling [4,13].

Another common terrain model is the TIN (Triangulated Irregular Network), which forms a continuous surface from triangles whose vertices are irregularly-sampled elevation points. A TIN is more complex to store and manipulate because it must explicitly store the irregular topology. The debate between using grid or TIN as the DEM model has been a long-standing one in GIS [11]. Often-mentioned advantages for a TIN include:
- A grid stores the terrain at uniform resolution, while a TIN can potentially store large flat regions as single polygons.
- To construct a grid from irregularly spaced data or multiple source data often requires interpolation and possible loss of information.
- Grid squares do not give a continuous surface model, so geometric operations and measurements on a surface must be approximated on grid cells.
- Water flow on a grid is commonly constrained to eight (or even four) neighbor directions for ease of computation, which can produce visible artifacts.
Our work focuses on basin computations on a TIN, including filling spurious pits and simplifying the basin hierarchy. We provide a framework for basin computation that can support several natural variations efficiently and robustly. We partition the computation into several steps:
1. Assign low directions to triangles. For this we use steepest descent directions on the triangles, but other choices are possible, especially in flat portions of terrain. Triangles can be subdivided or otherwise parameterized to allow more complex specification of flow directions.
2. Trace the flow path network. A key contribution of this work is to keep explicit track of the order in which paths cross triangle edges and triangles, and thus avoid computational failures due to degenerate configurations and floating point inaccuracies, which otherwise make this a difficult task.
3. Compute basins, spill points, and an initial basin hierarchy.
4. Compute basin characteristics, such as spill times, by propagation through the basin hierarchy. With consistent flow paths, these become simple problems on graphs. The initial basin hierarchy contains many spurious pits, but its structure can help us decide which are spurious and which are significant.
5. Simplify the basin hierarchy and, if desired, the terrain to match. We give examples of hierarchies from basin spill times computed from projected area and volume for basins and the sub-basins that spill into them. Natural variants for computing basin spill times can be suited to different types of terrain.
Assignment of flow directions (step 1) and computation of basin spill times (step 4) can be carried out in several ways, a few of which we demonstrate. Our framework calls for simple output of these variants (an assignment of direction vectors to triangles, or event times to basin hierarchy edges) that allows experimentation with algorithm or terrain-dependent policies, while the bulk of the geometric computation remains fixed.
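As an illustration of step 1, the steepest descent ("low") direction of a TIN triangle is simply minus the gradient of the plane through its three vertices. The sketch below computes it with numpy and is not the authors' code; flat or vertical triangles, where the direction is undefined, are exactly the situations in which the alternative policies mentioned above would take over.

# Step 1 sketch: the steepest-descent direction of a TIN triangle is minus
# the gradient of the plane through its three vertices.
import numpy as np

def low_direction(p0, p1, p2, eps=1e-12):
    """p0, p1, p2: (x, y, z) vertices. Returns a unit 2D downhill vector."""
    u, v = np.asarray(p1) - np.asarray(p0), np.asarray(p2) - np.asarray(p0)
    n = np.cross(u, v)                       # plane normal (a, b, c)
    if abs(n[2]) < eps:                      # vertical triangle: undefined
        return np.zeros(2)
    grad = -n[:2] / n[2]                     # gradient of z(x, y) on the plane
    g = np.linalg.norm(grad)
    return np.zeros(2) if g < eps else -grad / g   # flat triangle: zero vector

print(low_direction((0, 0, 0), (1, 0, 1), (0, 1, 0)))   # downhill = (-1, 0)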
The basin hierarchy has been studied in grid-based terrain by Band et al. [12]. An explicit representation of the basin hierarchy during pit-filling provides structural information that can help us decide which pits are spurious. Moreover, it simplifies the handling of pits—e.g., we need not assign flow directions at spill points, as many pit-filling algorithms on a grid must do, because we just send overflow into the basin we spill into. Some other researchers have used a TIN as the terrain model for flow computation, but directed flow only along triangle edges, using either the triangles [20,21] or the dual Voronoi cells [18,22] as the hydrological unit. Thus, they still use a discrete network flow model, much like the raster; they simply substitute an irregular network for the regular grid. The flow model that we use has an underlying continuous surface, discretized along flow paths, rather than by a limited set of directions (D8 or edges).

Two sections give definitions and sketches of the algorithms in our work: Sec. 2 reviews previous results [17,25] that we use directly and Sec. 3 describes the basin filling model. Sec. 4 details our implementation, which includes the computational steps above. Sec. 5 describes our experiments, and Sec. 6 discusses possible future work.
2 TIN-based hydrology model We use a TIN-based hydrology model proposed by Yu et al. [25] that extends the accurate definitions of Frank et al. [6]. We briefly review relevant definitions, properties, and algorithms. A terrain is mathematically modeled as the graph of a real-valued function f over a bounded region of the plane. This definition does not tell us how to store f or, given a set of noisy or irregular data samples, how such a function f might be computed. Still, this general definition of terrain allows us to specify water flow precisely: the trickle path of a point is the steepest descent path that ends either at a local minimum (pit) or the boundary of the terrain (Figure 1c). The watershed of a point p is the set of all points whose trickle paths contain p. The watercourse network is all points whose watersheds have nonzero area. Catchments (strips draining into a portion of the watercourse network) and basins (watersheds of pits) can also be defined. Since a TIN is a piecewise linear surface, the only segments on a watercourse network are local channels (Figure 1a) and segments of steepest descent paths on triangle faces traced from saddle vertices of the TIN. With this observation, the watercourse network is straightforward to compute: take the set of all segments that belong to the watercourse network and join
them at intersection points. It can be easily shown that this watercourse network can be characterized as follows:
- The watercourse network in a TIN is a collection of disjoint (graph-theoretic) trees rooted at pits, whose leaves are local channels.
Fig. 1 The thick edges in a) and b) are a local channel and a local ridge. c) shows a trickle path. d) and e) show segments of a watershed graph (dotted) before and after being joined in the neighborhood of a vertex.
McAllister [17] shows how to compute a basin graph: an embedded planar graph for the entire TIN whose faces are polygons that define basins and whose edges separate pairs of adjacent basin faces. We first compute a watershed graph, which is essentially a basin graph with extra interior segments. We then delete the interior segments to obtain the basin graph. The computation of the watershed graph consists of two steps:
1. Collect the local ridges (Figure 1b) of the TIN, and the steepest descent paths traced backward from each saddle point of the TIN.
2. Connect the open line segments from step 1 to form an embedded planar graph consistent with the TIN hydrology model.
Step 1 is simple, and we note the similarity between the watershed graph and the watercourse network, whose segments are local channels and steepest descent paths. Intuitively, the segments for the watershed graph form ridges that potentially separate basins. Step 2 is complicated, and we examine the details here. There are two cases: we can connect the upper end point of a steepest descent path to a local ridge segment, or connect two segments ending at a TIN vertex. The first case is general, and easy to handle. The second is degenerate, and requires care. Figure 1d shows the neighborhood of a vertex, with the open segments shortened by an infinitesimal length. If we simply join these segments by extending them to the vertex, the trickle paths through the vertex will be cut off, dividing a basin face of the watershed graph. Instead, we must join the segments so they do not "collide" with the trickle paths through the vertex, as in Figure 1e. These joining operations in a vertex neighborhood involve only graph operations on data structures.

A key problem with many hydrology models, including this one, is selecting appropriate scale. How can we consistently extract hydrological
objects at a desired scale—say, the watershed of a large river—no matter how detailed our terrain model is? With a TIN, two triangles define a segment of the watercourse network regardless of whether they correspond to the banks of a river or sides of a ditch. This problem is exacerbated by data resolutions ranging from 100-meter DEMs to sub-meter LIDAR. Another key problem of this model is its assumption that water has no volume. In heavy rainfall, depressions such as storm ponds in urban areas, become inundated, changing the drainage characteristics of the terrain. The good news is that the work on TIN-based hydrology models allows for an extension that meets the challenge of these key problems.
3 The basin filling model Suppose that we change the model so that water has volume and can accumulate in basins. Then we must allow a basin to fill until it “spills” into a neighboring basin. Consider the moment the surface of the accumulated water in a basin A reaches the lowest point, a, on its boundary—the spill point. Let basin B be the basin that a drains into. Then the two basins can be merged into a single one: the boundary between A and B is deleted, and the pit of the new basin is the pit of B. This filling model naturally defines a sequence of merges forming a basin hierarchy tree: leaves are the initial basins, and the root is the merge of all basins. It also defines a sequence of deformation operations on the terrain, each replacing a filled basin by the flat surfaces of its water body. Although trickle paths on flat surfaces are not well defined, since steepest descent is not unique, the trickle path of point p in a filled basin A starts at p and exits at the spill point of A. Therefore, on a “flooded” terrain, a trickle of water can still be directed from its origin to a pit. To incrementally compute the basin hierarchy tree, we can compute the “spill time” of each basin, then repeatedly take the basin with the earliest spill time, merge it with a neighboring basin and compute the new spill time of the merged basin. This “flooding simulation” algorithm responds to events rather than simulating flow at regular time steps. Each iteration processes one event when a topological change of the hierarchy tree and terrain occurs. The algorithm calculates volumes and surface areas—which can be done accurately up to round-off errors—and updates data structures that store local geometry and connectivity as described in the next section. The spill times merely define an ordering of the basins; other definitions of spill times can be substituted. We describe experiments in Section 5.
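To make the event loop concrete, here is a minimal Python sketch of the flooding simulation, assuming the per-basin geometry has already been reduced to a spill time and a spill target; the data layout, the callback, and the function names are our own illustrative assumptions, not the authors' implementation.

```python
import heapq

def flood(spill_into, spill_time, merged_spill_time):
    """Event-driven basin merging.
    spill_into[b]     : neighbouring basin that basin b spills into
    spill_time[b]     : current spill time of basin b (ids must be orderable)
    merged_spill_time : callback(a, b) -> (new spill time, new spill target)
                        for the basin obtained by merging a into b
    Returns the basin hierarchy as a list of (merged basin, surviving basin, time)."""
    parent = {b: b for b in spill_time}          # union-find over basins
    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    heap = [(t, b) for b, t in spill_time.items()]
    heapq.heapify(heap)
    hierarchy = []
    while heap:
        t, a = heapq.heappop(heap)
        if find(a) != a or spill_time[a] != t:
            continue                             # stale event: a was merged or re-keyed
        b = find(spill_into[a])
        if b == a:
            break                                # everything has merged into one basin
        parent[a] = b                            # basin a spills into and joins b
        spill_time[b], spill_into[b] = merged_spill_time(a, b)
        heapq.heappush(heap, (spill_time[b], b))
        hierarchy.append((a, b, t))
    return hierarchy

# Toy example: three basins, the callback simply delays the merged basin's spill.
times = {"a": 1.0, "b": 2.0, "c": 3.0}
into  = {"a": "b", "b": "c", "c": "b"}
print(flood(into, times, lambda x, y: (times[y] + 1.0, into[y])))
```

Each processed event corresponds to one edge of the basin hierarchy tree and one deformation of the terrain; the callback is where a concrete spill-time definition (Section 5) plugs in.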
4 Data structures and implementation We have implemented the basin filling computations in C++. Our program takes a set of points in 3D as input and processes them as follows: 1. Create a TIN by computing the Delaunay triangulation of the input points (using only the x and y coordinates). 2. Compute the watershed graph of the TIN, and delete the internal edges of the graph to create a basin graph. 3. Run the flooding computation until one basin is left. At each step, a filled basin is found that spills into a neighbor, and the corresponding faces in the basin graph are merged. The terrain surface is modified to reflect this by simply deleting the set of triangles that are below the water surface and recording the water surface height with the partially immersed triangles. 4. Simplify the basin hierarchy by removing edges, merging pairs of basins, and displaying the corresponding basin graph.
The two most important objects in the program, the TIN and the watershed graph, can each be stored as planar subdivisions using data structures such as Quadedge [7]. However, a number of options should be considered to reduce the space and speed up the computation. These are particularly important due to the size of the data sets we would like to handle. The watershed graph shares many edge segments with the TIN—in particular, all the local ridges of a TIN are part of the watershed graph. Therefore, instead of duplicating the connectivity information in the TIN, we simply store it in a simple graph data structure with pointers to the segments and vertices of the TIN. An advantage of using a simple graph data structure is that vertex and edge insertion and deletion operations have lower overhead than the same operations on a subdivision. Note, however, that the watershed graph is more than a subset of the TIN. As discussed earlier, the watershed graph in an infinitesimal neighborhood of a saddle point can be very different from the TIN. If we wish to implement sophisticated subdivision operations such as the merge-face operations for flooding, we want to have the basin graph stored as a subdivision independent of the TIN. Fortunately, the watershed graphs have a large number of internal edges, which implies that the basin graphs are much smaller. We create the basin graph subdivision after we have deleted all the internal edges from the watershed graph. The flooding computation need not know the connectivity between triangles in the TIN. It is enough to keep, for each basin, the set of triangles in it, ordered by the height so that the triangles within a height interval can be quickly located. Other options that trade between computation time, space, and ease of implementation are yet to be explored. E.g., computing the watershed is currently the memory bottle-neck of our program; pointers between the
TIN elements and the simple watershed graph data structure still take too much space. These pointers provide topological information through the TIN that represent the planar embedding of the watershed graph. We can try to reconstruct the basin graph subdivision using only the coordinate information, but numerical errors and degeneracy make this a challenge. 4.1 Robustness issues in implementation Geometric algorithms that manipulate topological data structures are harder to implement robustly than algorithms that manipulate rasters. The culprits are round-off error and geometric “degeneracies” in a problem. Both have been extensively studied in the field of computational geometry [1,3,19]. Although these results are often quite technical, the most important techniques are not hard to understand. Bit complexity analysis often involves only back-of-the-envelope calculations, and degeneracies can be eliminated by implementing a small set of policies that conceptually perturb the input. We look at these two techniques in more detail to robustly implement the algorithms for this flooding model. 4.1.1 Numerical issues Geometric algorithms derive spatial relationships from computations on coordinates, but a computer uses only a limited number of bits to represent a number. If algorithm correctness depends on ideal geometry, round-off error results in at best an approximation and, at worst, a crash. Three numerical computations are involved in our algorithm: 1. Computing the steepest descent direction for each triangle. 2. Testing whether water on some triangle flows into a triangle edge or away from it. This will classify whether an edge is a local channel, a ridge, or neither. 3. Tracing the steepest descent path backward from a vertex to a local ridge.
For 1), observe that if we replace “steepest descent” by “unique descent direction,” in the definition of trickle path, the definitions for watersheds are still consistent. So, if we assign each triangle some descent direction that closely approximates the steepest descent, we have a TIN whose drainage characteristics are acceptable as a close approximation of the original. We store the steepest descent vector as single-precision, though the exact vector requires double-precision. We simply round off the directional vector so that the result gives a descent direction. For 2), testing the flow direction on a triangle edge is an orientation test, which is of algebraic degree two. Since double precision is supported in hardware, an exact implementation of this test is straightforward.
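For the edge-flow test in 2), a degree-two predicate of this kind can be written as a standard 2D orientation determinant; the sketch below is our own generic Python illustration rather than the authors' exact formulation.

```python
def orient2d(p, q, r):
    """Sign of the z-component of (q - p) x (r - p):
    > 0 if p, q, r make a left turn, < 0 for a right turn, 0 if collinear.
    The determinant is degree two in the coordinates, which is why the text notes
    that, for suitably limited-precision inputs, it can be evaluated exactly with
    hardware double precision."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

# Example: does the direction d, anchored at a, point to the left of edge (a, b)?
a, b, d = (0.0, 0.0), (1.0, 0.0), (0.25, 0.5)
print(orient2d(a, b, (a[0] + d[0], a[1] + d[1])) > 0)   # True: d points left of a->b
```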
For 3), we have the most problematic numeric computation involved in the algorithm. Each time we cross a triangle as we back-trace a steepest descent path, we must compute the intersection of the steepest ascent path with an edge of the triangle, which quadruples the precision for exact computation. If k intersections are computed, we go from b bits to a worst case of (3k+1)b bits to represent the last point of the path exactly. We use single precision floating point instead, with the unfortunate consequence that not only does the path's position in space become inexact, we can no longer guarantee watershed graph properties such as that each face contains exactly one pit. This approximation can be acceptable as long as small errors do not cause catastrophic changes. The main issue is that two inexact paths might cross each other, contradicting the assumption that the steepest descent path is unique. We can try two ways to handle this: 1. We can compute the watershed graph without regard to whether the steepest ascent paths cross each other. Then, the nodes of the computed watershed graph that are supposed to be on the steepest ascent paths cannot be embedded right away. We must first repair steepest ascent paths so they no longer intersect. 2. We can incrementally maintain the invariant that no steepest ascent paths cross by asserting that the order of the entry points of the steepest ascent paths into a polygon must be the same as the order of the exit points. In the data structure, we maintain a list of all steepest ascent flows across each triangle edge.
We have chosen 2) in our implementation. When we back-trace a steepest descent path through a triangle, we first compute the position of the exit point by numeric computation. If this position fails to satisfy the invariant, we assign a position in the valid range that the path is allowed to exit. In our experiments, we have found that the number of faces that do not have exactly one pit is small and the deviation is never more than one. 4.1.2 Resolving degeneracies Geometric algorithms often assume that the inputs are in general position. For example, no three points lie on the same line. Subsets of the input that violate the assumption are called degeneracies [1] because an infinitesimal random perturbation of the input will eliminate them. To actually handle the degeneracies in an implementation, one can either directly handle all the “exceptions” introduced, or create policies that treat the degeneracies consistent with some infinitesimal perturbation. In McAllister [17], the first option is taken, while we have done the latter. We list below the general position assumptions made in our algorithm and how the corresponding degeneracies are handled with perturbations. 1. No two points have the same height. In the degenerate case, when two z-values are equal, compare their x-values; and if their x-values are equal, compare their y-values. This policy is consistent with infinitesimally rotating the xy-plane.
2. The steepest descent direction in a triangle is not in the same direction as any of the edges of the triangle. We need this to test whether water on a triangle flows into a triangle edge or away. In the degenerate case, we choose “away.” This is consistent with rotating the steepest descent direction infinitesimally. 3. The steepest ascent path does not go through a triangle vertex. In the degenerate case, we choose one of the two edges adjacent to the vertex as the next edge. This is consistent with perturbing the x- and y-coordinates of the vertex infinitesimally. Note that this perturbation precludes the possibility that an area can drain into another area through a single trickle path.
When implementing a perturbation policy, we must reason backwards about what the policy implies for how the input is perturbed and convince ourselves that the perturbations in different operations do not contradict each other. If this is not done, the behavior of the program can be unpredictable.
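As an illustration of policy 1 above, a symbolic perturbation of heights can be implemented as a plain lexicographic comparison; the sketch below is our own minimal Python rendering of that policy, not the authors' code.

```python
def higher(p, q):
    """Symbolic-perturbation comparison of two TIN vertices p = (x, y, z), q = (x, y, z).
    Ties in height are broken by x, then by y, so no two distinct vertices ever
    compare as equal in height (policy 1, consistent with an infinitesimal
    rotation of the xy-plane)."""
    return (p[2], p[0], p[1]) > (q[2], q[0], q[1])

# Example: two vertices at the same height are still strictly ordered.
print(higher((1.0, 2.0, 5.0), (0.0, 3.0, 5.0)))   # True: equal z, larger x wins
```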
5 Experiments We have tested our program against digital elevation data from various sources. The graphical output of the program is shown in Figure 2. Basin boundaries are drawn as outlines over the hill-shaded terrains. Black regions are flooded regions of the terrain. We have used several different basin filling strategies. Each strategy effectively defines a different basin spill time. We have experimented with these four strategies: 1. Uniform precipitation model: compute a basin's filling time by dividing the volume of a basin by (projected) area. Once a basin spills into another, the areas and volumes add. This is consistent with the simplistic, but physical, assumption that all rainfall turns into surface flow. 2. The basins are filled in order of increasing volume. 3. The basins are filled in order of increasing area. 4. For each basin, we compute two numbers: the number of basins that have spilled into it and its filling time under the uniform precipitation model. Two basins are compared by lexicographic ordering of their number pairs.
These filling strategies—some corresponding to precipitation models—demonstrate the flexibility of the flooding model and the program. The program stores enough topological and geometrical information so that the numbers used for the ordering can be easily computed. Figure 2 shows the output using strategies 1 and 2 (please refer to our website, http://www.cs.unc.edu/~liuy/flooding/, for more results); two terrain models were used as input. The terrain model for the first column of pictures was produced from a 1-degree quad of USGS DEM data from Grand Junction, Colorado. The second terrain model was produced from unfiltered LIDAR data from Baisman Run, Maryland.
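The four orderings above can be expressed as interchangeable key functions over a small per-basin summary; this Python sketch (our own illustration, with assumed field names) shows how the same event loop can be reused with different spill-time definitions.

```python
from dataclasses import dataclass

@dataclass
class BasinStats:
    volume: float         # water volume below the spill point
    area: float           # projected (xy) area
    spills_received: int  # number of basins that have spilled into this one

def key_uniform_precipitation(b):   # strategy 1: time to fill under uniform rain
    return b.volume / b.area

def key_volume(b):                  # strategy 2: smallest volume first
    return b.volume

def key_area(b):                    # strategy 3: smallest area first
    return b.area

def key_spills_then_fill_time(b):   # strategy 4: lexicographic pair
    return (b.spills_received, b.volume / b.area)

# Example: order three basins under strategy 1
basins = [BasinStats(4.0, 2.0, 0), BasinStats(1.0, 5.0, 2), BasinStats(3.0, 1.0, 0)]
print(sorted(range(3), key=lambda i: key_uniform_precipitation(basins[i])))  # [1, 0, 2]
```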
Fig. 2 Basin graphs from two terrain models using two filling strategies. Each basin is delineated. The number of basins is also shown. These particular filling strategies were motivated by observations about our initial results from a uniform precipitation model. For example, in the second row of Figure 2, we see that large rivers merge before small basins are filled. This is unfortunate, though unsurprising with this model, since a large basin with a low spill point holds a relatively small amount of water compared to the amount of water it receives through precipitation
over a large area. Our other filling strategies somewhat alleviate this problem. The improvement is especially obvious with LIDAR data that contains elevation points from vegetation that often produce “wells” on the terrain that are hard to fill with a uniform precipitation model.
6 Conclusions and Future Work We have shown how to produce basin hierarchies on a TIN with the basin filling model. We have implemented our algorithms robustly and given sample outputs. Currently, the largest TINs we can handle are about one million data points, so we would like to use more out-of-core algorithms and data structures to handle tens of millions of points. We would like to compare our results against basin hierarchies produced with grid-based algorithms, and to include more physically-based parameters into our implementation, such as absorption rate determined by land cover type. We would also like to incorporate the basin graph into TIN simplification so that the basin graph of the simplified terrain is close to a basin graph at some appropriate level of the basin hierarchy of the original terrain, where closeness is defined both topologically and geometrically.
References [1] P. Alliez, O. Devillers, and J. Snoeyink. Removing degeneracies by perturbing the problem or perturbing the world. Reliable Computing, 6:61-79, 2000. [2] T-Y Chou, W-T Lin, C-Y Lin, W-C Chou, and P-H Huang. Application of the PROMETHEE technique to determine depression outlet location and flow direction in DEM. J Hydrol, 287(1-4):49-61, 2004. [3] H. Edelsbrunner and E.P. Mücke. Simulation of simplicity: A technique to cope with degenerate cases in geometric algorithms. ACM TOG, 9(1):66–104, 1990. [4] J. Fairfield and P. Leymarie. Drainage networks from grid digital elevation models. Water Resour. Res., 27:709–717, 1991. [5] Federal standards for delineation of hydrologic unit boundaries. Version 1.0. http://www.ftw.nrcs.usda.gov/HUC/HU_standards_v1_030102.doc, Mar. 2001. [6] A.U. Frank, B. Palmer, and V.B. Robinson. Formal methods for the accurate definition of some fundamental terms in physical geography. In Proc. 2nd Intl. SDH, pages 585–599, 1986. [7] Leonidas J. Guibas and J. Stolfi. Primitives for the manipulation of general subdivisions and the computation of Voronoi diagrams. ACM TOG, 4(2):74–123, 1985. [8] M.F. Hutchinson. Calculation of hydrologically sound digital elevation models. In Proc. 3rd Int. SDH, pages 117–133, 1988.
[9] S.K. Jenson and J.O. Domingue. Extracting topographic structure from digital elevation data for geographic information system analysis. Photogrammetric Engineering and Remote Sensing, 54(11):1593–1600, Nov. 1988. [10] S.K. Jenson and C.M. Trautwein. Methods and applications in surface depression analysis. In Proc. AUTO-CARTO 8, pages 137–144, 1987. [11] M.P. Kumler. An intensive comparison of triangulated irregular networks (TINs) and digital elevation models (DEMs). Cartog., 31(2), 1994. Monograph 45. [12] D.S. Mackay and L.E. Band. Extraction and representation of nested catchment areas from digital elevation models in lake-dominated topography. Water Resources Research, 34(4):897–902, 1998. [13] David R. Maidment. GIS and hydrologic modeling – an assessment of progress. In Proc. Third Conf. GIS and Environmental Modeling, Santa Fe, NM, 1996. NCGIA http://www.sbg.ac.at/geo/idrisi/gis_environmental_modeling/ sf_papers/maidment_david/maidment.html.
[14] D. Marks, J. Dozier, and J. Frew. Automated basin delineation from digital elevation data. Geo-Processing, 2:299–311, 1984. [15] L.W. Martz and J. Garbrecht. The treatment of flat areas and closed depressions in automated drainage analysis of raster digital elevation models. Hydrological Processes, 12:843–855, 1998. [16] L.W. Martz and J. Garbrecht. An outlet breaching algorithm for the treatment of closed depressions in a raster DEM. Computers and Geosciences, 25, 1999. [17] M. McAllister. The Computational Geometry of Hydrology Data in Geographic Information System. Ph.D. thesis, UBC CS, Vancouver, 1999. [18] O.L. Palacios-Velez and B. Cuevas-Renaud. Automated river-course, ridge and basin delineation from digital elevation data. J Hydrol, 86:299–314, 1986. [19] J. Shewchuk. Adaptive precision floating point arithmetic and fast robust geometric predicates. Discrete & Comp. Geom. 18:305-363, 1997. [20] A.T. Silfer, G.J. Kinn, and J.M. Hassett. A geographic information system utilizing the triangulated irregular network as a basis for hydrologic modeling. In Proc. Auto-Carto 8, pages 129–136, 1987. [21] D.M. Theobald and M.F. Goodchild. Artifacts of TIN-based surface flow modeling. In Proc. GIS/LIS’90, pages 955–964, 1990. [22] G.E. Tucker, S.T. Lancaster, N.M. Gasparini, R.L. Bras, S.M. Rybarczyk. An object-oriented framework for distributed hydrologic and geomorphic modeling using triangulated irregular networks. Comp Geosc, 27(8):959–973, 2001. [23] K. Verdin and S. Jenson. Development of continental scale DEMs and extraction of hydrographic features. In Proc. 3rd Conf. GIS and Env. Model., Santa Fe, 1996. http://edcdaac.usgs.gov/gtopo30/papers/santafe3.html. [24] J.V. Vogt, R. Colombo, and F. Bertolo. Deriving drainage networks and catchment boundaries: a new methodology combining digital elevation data and environmental characteristics. Geomorph., 53(3-4):281–298, July 2003. [25] S. Yu, M. van Kreveld, and J. Snoeyink. Drainage queries in TINs: from local to global and back again. In Proc. 7th SDH, pages 13A.1–13A.14, 1996.
Vague Topological Predicates for Crisp Regions through Metric Refinements Markus Schneider University of Florida Department of Computer and Information Science and Engineering Gainesville, FL 32611, USA [email protected]
Abstract Topological relationships between spatial objects have been a focus of research on spatial data handling and reasoning for a long time. Especially as predicates they support the design of suitable query languages for data retrieval and analysis in spatial databases and geographical information systems. Whereas research on this topic has always been dominated by qualitative methods and by an emphasis of a strict separation of topological and metric, that is, quantitative, properties, this paper investigates their possible coexistence and cooperation. Metric details can be exploited to refine topological relationships and to make important semantic distinctions that enhance the expressiveness of spatial query languages. The metric refinements introduced in this paper have the feature of being topologically invariant under affine transformations. Since the combination of a topological predicate with a metric refinement leads to a single unified quantitative measure, this measure has to be interpreted and mapped to a lexical item. This leads to vague topological predicates, and we demonstrate how these predicates can be integrated into a spatial query language. Keywords. Vague topological relationship, metric refinement, quantitative refinement, 9-intersection model, lexical item, spatial data type, spatial query language
1 Introduction In recent years, the exploration of topological relationships between objects in space has turned out to be a multi-disciplinary research issue involving disciplines like spatial databases, geographical information systems, CAD/CAM
This work was partially supported by the National Science Foundation under grant number NSF-CAREER-IIS-0347574.
systems, image databases, spatial analysis, computer vision, artificial intelligence, linguistics, cognitive science, psychology, and robotics. From a database perspective, their development has been motivated by the necessity of formally defined topological predicates as filter conditions for spatial selections and spatial joins in spatial query languages, both at the user definition level for reasons of conceptual clarity and at the query processing level for reasons of efficiency. Topological relationships like overlap, inside, or meet describe purely qualitative properties that characterize the relative positions of spatial objects to each other and that are preserved (topologically invariant) under continuous transformations such as translation, rotation, and scaling. They deliberately exclude any consideration of metric, that is, quantitative, measures and are associated with notions like adjacency, coincidence, connectivity, inclusion, and continuity. Some well known, formal, and especially computational models for topological relationships have already been proposed for spatial objects, for example for regions. They permit answers to queries like “Are regions A and B disjoint?” or “Do regions A and B overlap?”. Unfortunately, these purely qualitative approaches (topology per se) are sometimes insufficient to express the full essence of spatial relations, since they do not capture all details to make important semantic distinctions. This is motivated in Figure 1 for the topological relationship overlap. Obviously, for all three configurations the predicate overlap(A, B) yields true. But there is no way to express the fact that in the left configuration regions A and B hardly overlap, that the middle configuration represents a typical overlap, and that in the right configuration regions A and B predominantly overlap. In these statements and the corresponding resulting queries, the degree of overlapping between two spatial objects is of decisive importance. The crucial aspect is that this degree is a relative metric, and thus quantitative, feature which is topologically invariant under affine transformations. This leads to metrically refined topological relationships having a vague or blurred nature. Transferring this observation to concrete applications, we can consider polluted areas, for example. Here it is frequently not only interesting to know the fact that areas are polluted but also to which degree they are polluted. If two land parcels are adjacent, then often not only this fact is interesting but also the degree of their adjacency. Section 2 discusses some relevant related work about topological relationships. Our design is based on the 9-intersection model, an approach that uses point set topology to define a classification of binary topological relationships in a purely qualitative manner. The goals of this paper are then pursued in the following sections. In Section 3, we explore metrically refined topological relationships on spatial regions with precisely determined boundaries (so-called crisp regions) and show how qualitative descriptions (topological properties) can be combined with quantitative aspects (relative metric properties) into a single unified quantitative measure between 0 and 1. This leads us to vague topological predicates. In Section 4, we demonstrate how the obtained
Fig. 1. Topological relationship overlap(A, B) with different degrees of overlapping.
quantitative measures can be mapped to lexical items corresponding to natural language terms like “a little bit inside” or “mostly overlap”. This introduces a kind of vagueness or indeterminacy into user queries which is an inherent feature of human thinking, arguing, and reasoning. Section 5 deals with the integration of these indeterminate predicates into an SQL-like query language. Finally, Section 6 draws some conclusions.
2 Related Work An important approach for characterizing topological relationships rests on the so-called 9-intersection model (Egenhofer et al. 1989). This model allows one to derive a complete collection of mutually exclusive topological relationships for each combination of spatial types. The model is based on the nine possible intersections of boundary (∂A), interior (A◦), and exterior (A−) of a spatial object A with the corresponding components of another object B. Each intersection is tested with regard to the topologically invariant criteria of emptiness and non-emptiness. 2⁹ = 512 different configurations are possible from which only a certain subset makes sense depending on the definition and combination of spatial objects just considered. For each combination of spatial types this means that each of its predicates p can be associated with a unique boolean intersection matrix BI p (Table 1) so that all predicates are mutually exclusive and complete with regard to the topologically invariant criteria of emptiness and non-emptiness.
BI p (A, B) =
( ∂A ∩ ∂B ≠ ∅   ∂A ∩ B◦ ≠ ∅   ∂A ∩ B− ≠ ∅ )
( A◦ ∩ ∂B ≠ ∅   A◦ ∩ B◦ ≠ ∅   A◦ ∩ B− ≠ ∅ )
( A− ∩ ∂B ≠ ∅   A− ∩ B◦ ≠ ∅   A− ∩ B− ≠ ∅ )
Table 1. The boolean 9-intersection matrix. Each matrix entry is a 1 (true) or 0 (false).
Topological relationships were first explored for simple regions (Clementini et al. 1993, Cui et al. 1993, Egenhofer et al. 1989). A simple region is a bounded, regular closed set homeomorphic (that is, topologically
equivalent) to a two-dimensional closed disc¹ in IR2. Regularity of a closed point set eliminates geometric anomalies possibly arising from dangling points, dangling lines, cuts, and punctures in such a point set (Behr & Schneider 2001). From an application point of view, this means that a simple region has a connected interior, a connected boundary, and a single connected exterior. Hence, it does not consist of several components, and it does not have holes. For two simple regions eight meaningful configurations have been identified which lead to the well known eight topological predicates of the set T = {disjoint, meet, overlap, equal, inside, contains, covers, coveredBy}. In a vector notation from left to right and from top to bottom, their (well known) intersection matrices are: BI disjoint (A, B) = (0, 0, 1, 0, 0, 1, 1, 1, 1), BI meet (A, B) = (1, 0, 1, 0, 0, 1, 1, 1, 1), BI overlap (A, B) = (1, 1, 1, 1, 1, 1, 1, 1, 1), BI equal (A, B) = (1, 0, 0, 0, 1, 0, 0, 0, 1), BI inside (A, B) = (0, 1, 0, 0, 1, 0, 1, 1, 1), BI contains (A, B) = (0, 0, 1, 1, 1, 1, 0, 0, 1), BI covers (A, B) = (1, 0, 1, 1, 1, 1, 0, 0, 1), BI coveredBy (A, B) = (1, 1, 0, 0, 1, 0, 1, 1, 1). For reasons of simplicity and clear presentation, in this paper, we will confine ourselves to metric refinements of topological relationships for simple regions. An extension to general, complex regions (Schneider 1997), that is, regions possibly consisting of several area-disjoint components and possibly having area-disjoint holes, is straightforward. For this purpose, metric refinements have to be applied to the 33 generalized topological predicates between two complex regions (Behr & Schneider 2001). Approaches dealing with metric refinements of spatial relationships on crisp spatial objects are rare. In (Hernandez et al. 1995) metric refinements of distance relationships are introduced to characterize indeterminate terms like very close, close, far, and very far. In (Peuquet & Xiang 1987, Goyal & Egenhofer 2004) directional relationships like north, north-west, or south-east are metrically refined. Two papers deal at least partially with metric refinements of topological relationships. In (Vazirgiannis 2000) refinements which are similar to our directed topological relationships are proposed. Metric details, which are similar to our metric refinements, are used in (Egenhofer & Shariff 1998) to refine natural-language topological relationships between a simple line and a simple region and between two simple lines. There are a number of differences to our approach. First, we deal with two regions. Second, they do not interpret the entries of a 9-intersection matrix for a given topological predicate as optimum values, as we do in Section 3.2. Third, our set of refinements is systematically developed and complete but not ad hoc (see Section 3.1). Fourth, their refinements are not combined with the 9-intersection matrix into a so-called similarity matrix, as in our case (see Section 3.2). Fifth, they do not employ our concept of applicability degree (see Section 3.2).
¹ D(x, ε) denotes a two-dimensional closed disc with center x ∈ IR2 and radius ε ∈ IR+ iff D(x, ε) = {y ∈ IR2 | d(x, y) ≤ ε} where d is a metric on IR2.
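The eight matrices listed above can serve directly as a lookup table. The following Python sketch (our own illustration, not part of the paper) classifies two simple regions by deriving the boolean 9-intersection matrix from Shapely's DE-9IM relate string; the index remapping from DE-9IM order to the (boundary, interior, exterior) row order used here reflects our understanding of Shapely's conventions.

```python
from shapely.geometry import Polygon

# The eight BI_p vectors from the text, in row-major (boundary, interior, exterior) order.
PREDICATES = {
    (0, 0, 1, 0, 0, 1, 1, 1, 1): "disjoint", (1, 0, 1, 0, 0, 1, 1, 1, 1): "meet",
    (1, 1, 1, 1, 1, 1, 1, 1, 1): "overlap",  (1, 0, 0, 0, 1, 0, 0, 0, 1): "equal",
    (0, 1, 0, 0, 1, 0, 1, 1, 1): "inside",   (0, 0, 1, 1, 1, 1, 0, 0, 1): "contains",
    (1, 0, 1, 1, 1, 1, 0, 0, 1): "covers",   (1, 1, 0, 0, 1, 0, 1, 1, 1): "coveredBy",
}

# DE-9IM strings are ordered II, IB, IE, BI, BB, BE, EI, EB, EE for (A, B);
# the matrix here orders A's components as boundary, interior, exterior.
DE9IM_TO_BI = (4, 3, 5, 1, 0, 2, 7, 6, 8)

def classify(a, b):
    de9im = a.relate(b)                   # e.g. '2FFF1FFF2' for two equal regions
    bi = tuple(0 if de9im[i] == 'F' else 1 for i in DE9IM_TO_BI)
    return PREDICATES.get(bi, "unknown")

A = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
B = Polygon([(2, 2), (6, 2), (6, 6), (2, 6)])
print(classify(A, B))   # overlap
```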
Two completely different approaches to modeling indeterminate topological predicates rest on the concept of so-called fuzzy topological predicates (Schneider 2001a, Schneider 2001b), which are defined on complex fuzzy regions (Schneider 1999). That is, in contrast to the assumption in this paper, these predicates operate on (complex) regions whose extent cannot be precisely determined or is not precisely known.
3 Metric Refinements of Topological Relationships Topological relationships are designed as binary predicates yielding a boolean and thus strict decision whether such a relationship holds for two spatial objects or not. Metric details based on the geometric properties of the two spatial objects can be used to relax this strictness. As we will see in Sections 4 and 5, they enable us to describe vague nuances of topological relationships, as they often and typically occur in natural language expressions, and hence they allow us to refine topological relationships in an indeterminate manner. Queries like “Which are the land parcels that are hardly adjacent to parcel X?” or “Which landscape areas are mostly contaminated (overlapped) with toxic substances?” can then be posed and answered. To describe metrical details, we use relative area and length measures provided by the operand objects (in our case two simple regions). These measures are normalized values with respect to the areas of interiors and lengths of boundaries of two simple regions. Consequently, they are scale-independent and topologically invariant. 3.1 Refinement Ratio Factors We now introduce six refinement ratio factors which are illustrated in Figure 2. For the definition of all factors we assume two simple regions A and B. The common area ratio

CA(A, B) = area(A◦ ∩ B◦) / area(A◦ ∪ B◦)

specifies the degree to which regions A and B share their areas. Obviously, CA(A, B) = 0, if A◦ ∩ B◦ = ∅ (that is, A and B are disjoint or they meet), and CA(A, B) = 1, if A◦ = B◦ (that is, A and B are equal). Like all the other factors to be presented, the common area factor is independent of scaling, translation, and rotation, and hence constant. This factor is also symmetric, that is, CA(A, B) = CA(B, A). The outer area ratio

OA(A, B) = area(A◦ ∩ B−) / area(A◦)

computes the ratio of that portion of A's area that is not shared with B. Here, OA(A, B) = 0, if A◦ = B◦, and OA(A, B) = 1, if A◦ ∩ B− = A◦, that is, A and B are disjoint or they meet. Obviously, the outer area ratio is not symmetric, that is, OA(A, B) ≠ OA(B, A). The exterior area ratio

EA(A, B) = area(A− ∩ B−) / area(A− ∪ B−)
calculates the ratio between the area of the common exterior of A and B on the one hand and the area of the application reference system, where all our regions are located, minus the area of the intersection of A and B on the other hand. The reference system is usually called the Universe of Discourse (UoD). We assume that our UoD is bounded and thus not equal to but a proper subset of the Euclidean plane. This is not a restriction, since all spaces that can be dealt with in a computer are bounded. If A and B meet and their union is the UoD, EA(A, B) = 0. If A = B, EA(A, B) = 1. The exterior area ratio is symmetric, that is, EA(A, B) = EA(B, A). The inner boundary splitting ratio

IBS(A, B) = length(∂A ∩ B◦) / length(∂A)

determines the degree to which A's boundary is split by B. If A and B are disjoint or meet, then IBS(A, B) = 0. If A is inside or coveredBy B, then IBS(A, B) = 1. The inner boundary splitting ratio is not symmetric, that is, IBS(A, B) ≠ IBS(B, A). The outer boundary splitting ratio

OBS(A, B) = length(∂A ∩ B−) / length(∂A)

yields the degree to which A's boundary lies outside of B. If A is inside or coveredBy B, then OBS(A, B) = 0. If A and B are disjoint or meet, then OBS(A, B) = 1. The outer boundary splitting ratio is not symmetric, that is, OBS(A, B) ≠ OBS(B, A). The common boundary splitting ratio

CBS(A, B) = length(∂A ∩ ∂B) / length(∂A ∪ ∂B)

calculates the degree to which regions A and B share their boundaries. Obviously, CBS(A, B) = 0, if ∂A ∩ ∂B = ∅, and CBS(A, B) = 1, if ∂A ∩ ∂B = ∂A, which means that A and B are equal. The common boundary splitting ratio is also symmetric, that is, CBS(A, B) = CBS(B, A).
Fig. 2. Refinement ratio factors: CA(A, B), OA(A, B), EA(A, B), IBS(A, B), OBS(A, B), CBS(A, B).
Fig. 3. Problem configuration for the common boundary splitting ratio (a), enlarged region Aout and reduced region Ain for region A (b), and 0-dimensional (c) and 1-dimensional (d) boundary intersections with their corresponding boundary areas.
The common boundary splitting ratio is especially important for computing the degree of meeting of two regions. A problem arises with this factor, if the common boundary parts do not have a linear structure but consist of a finite set of points. Figure 3a shows such a meeting situation. The calculation of CBS(A, B) leads to 0, because due to regularization ∂A ∩ ∂B = ∅ and the length is thus 0. Hence, common single points are not taken into account by this factor, which should be done for correctly evaluating (the degree of) a meeting situation. To solve this problem, for each simple region A we introduce two additional simple regions Aout and Ain which are slightly enlarged and reduced, respectively, by scale factors 1 + ε and 1 − ε, respectively, with ε > 0 (Figure 3b). We then consider ∆A = Aout − Ain as the extended boundary of A and redefine the common boundary splitting ratio as

CBS(A, B) = area(∆A ∩ ∆B) / area(∆A ∪ ∆B)
In Figures 3c and d, the dark shaded regions show the extended boundaries of A and B. The diagonally hatched regions correspond to the boundary intersection of A and B. The refinement ratio factors have, of course, not been defined arbitrarily. They have been specified in a way so that each intersection occurring in a matrix entry in BI p is contained as an argument of the area or length function of the numerator of a refinement ratio factor. As an example, consider the intersection ∂A ∩ B ◦ included in the inequality of the first row and second column of BI p (A, B). This intersection reappears as argument of the length function of the numerator of IBS (A, B). The intersection A◦ ∩ ∂B (second row, first column of BI p (A, B)) is captured by IBS (B, A). The purpose of the denominator of a refinement ratio factor then is to make the factor a relative and topologically invariant measure.
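To make the six factors concrete, here is a Python sketch (our own, not from the paper) that evaluates them with the Shapely library; passing in an explicit bounded Universe of Discourse, approximating the extended boundary ∆A with a small buffer ε, and intersecting boundaries with the closed regions rather than their open interiors are all our own simplifying assumptions.

```python
from shapely.geometry import Polygon

def refinement_factors(A, B, uod, eps=0.01):
    """Six refinement ratio factors for simple regions A, B inside the bounded UoD."""
    Ae, Be = uod.difference(A), uod.difference(B)       # exteriors within the UoD
    CA  = A.intersection(B).area / A.union(B).area
    OA  = A.difference(B).area / A.area                 # area(A° ∩ B−) / area(A°)
    EA  = Ae.intersection(Be).area / Ae.union(Be).area
    # Intersecting ∂A with the closed region B approximates ∂A ∩ B°; the two
    # differ only where the boundaries share 1-dimensional pieces.
    IBS = A.boundary.intersection(B).length / A.boundary.length
    OBS = A.boundary.difference(B).length / A.boundary.length
    dA  = A.buffer(eps).difference(A.buffer(-eps))      # extended boundary ∆A
    dB  = B.buffer(eps).difference(B.buffer(-eps))
    CBS = dA.intersection(dB).area / dA.union(dB).area
    return dict(CA=CA, OA=OA, EA=EA, IBS=IBS, OBS=OBS, CBS=CBS)

uod = Polygon([(-10, -10), (10, -10), (10, 10), (-10, 10)])
A = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
B = Polygon([(2, 2), (6, 2), (6, 6), (2, 6)])
print(refinement_factors(A, B, uod))
```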
3.2 Evaluation of the Applicability of a Topological Relationship In this subsection we show how the concept of metric refinement can be used to assess the applicability of a topological relationship for a given spatial configuration with a single, numerical value. This then leads us to the concept of a vague topological relationship. The Similarity Matrix The boolean intersection matrix BI p (A, B) contains nine strict, binary intersection tests leading either to true (1) or false (0). We now replace each intersection test (inequality) by the corresponding refinement ratio factor. This leads us to the real-valued similarity matrix RS p (A, B) (Table 2) which we now employ in order to represent and estimate a topological relationship between two simple regions. Each matrix entry of RS p (A, B) represents a value between 0 and 1 and is interpreted as the degree to which the corresponding intersection in BI p (A, B) holds. That is, the statement about the existence of an intersection is replaced by a statement about the degree of an intersection.

RS p (A, B) =
( CBS(A, B)   IBS(A, B)   OBS(A, B) )
( IBS(B, A)   CA(A, B)    OA(A, B)  )
( OBS(B, A)   OA(B, A)    EA(A, B)  )

where, written out as the ratios of Section 3.1, IBS(A, B) = length(∂A ∩ B◦)/length(∂A), OBS(A, B) = length(∂A ∩ B−)/length(∂A), CA(A, B) = area(A◦ ∩ B◦)/area(A◦ ∪ B◦), OA(A, B) = area(A◦ ∩ B−)/area(A◦), OA(B, A) = area(A− ∩ B◦)/area(B◦), and EA(A, B) = area(A− ∩ B−)/area(A− ∪ B−).
Table 2. The real-valued similarity matrix. Each matrix entry is computed as a value between 0 and 1.
Seen from this perspective, each matrix entry 0 or 1 of BI p can be interpreted in a new, different way, namely as the “optimum”, “best possible”, or sometimes “asymptotic” degree to which the corresponding intersection occurring as part of an intersection test in BI p holds. On the other hand, this is not necessarily obvious. Hence, for each predicate p, Table 3 contains an analysis of the suitability of a matrix entry in BI p for our interpretation. The left column contains a list of the topological predicates. The first row uses shortcuts to represent the nine intersections. For example, ∂◦ means ∂A ∩ B◦ (≠ ∅), and ◦∂ means A◦ ∩ ∂B (≠ ∅). An entry “+” in the table indicates that the respective 0 or 1 in BI p is the optimum, perfect, and adopted value to fulfil predicate p. For example,
Table 3. Suitability of a matrix entry in BI p for interpreting it as the degree to which the respective intersection holds.
if for disjoint the intersection of the boundary of A and the interior of B is empty (matrix entry 0), this is the optimum that can be reached for the inner boundary splitting ratio IBS (A, B). If for covers the intersection of the boundary of A and the exterior of B is non-empty (matrix entry 1), this is the optimum that can be reached for the outer boundary splitting ratio OBS (A, B). This situation implies that the boundary of B touches the boundary of A only in single points and not in curves. An entry “(+)” expresses that the respective 0 or 1 in BI p is an asymptotic value for predicate p. That is, this value can be approached in an arbitrarily precise way but in the end it cannot and may not be reached. For example, for meet the common boundary splitting ratio CBS (A, B) can be arbitrarily near to 1. But it cannot become equal to one, because then the relationship equal would hold. For our later computation, this is no problem. We simply assume the respective asymptotic 0’s and 1’s as optimum values. Computing the Degree of Applicability Whereas exactly one topological relationship applies with the boolean 9intersection matrix, we will show now that the similarity matrix enables us to assess all topological relationships but with different degrees of applicability. For that purpose, we test for the similarity of RS p with BI p . Since we can interpret the matrix entries of BI p for a predicate p as the ideal values, the evaluation of the similarity between RS p and BI p can be achieved by measuring for each matrix entry RS p(x,y) (A, B) with x, y ∈ {∂,◦ ,− } the deviation from the corresponding matrix entry BI p(x,y) (A, B). The idea is then to condense all nine deviation values to a single value by taking the average of the sum of all deviations. We call the resulting value the applicability degree of a topological relationship p with respect to two simple regions A and B. The applicability degree is computed by a function µ taking two simple regions and the name of a topological predicate p as operands and yielding a real value between 0 and 1 as a result:
µ(A, B, p) = (1/9) · Σ_{x ∈ {∂, ◦, −}} Σ_{y ∈ {∂, ◦, −}} [ RS p(x,y) (A, B) if BI p(x,y) (A, B), else 1 − RS p(x,y) (A, B) ]
What we have gained is the relaxation of the strictness of a topological predicate p : region × region → {0, 1} (region shall be the type for simple regions) to an applicability degree function µ : region × region × T → [0, 1] (remember that T is the set of all topological predicates). The applicability degree µ(A, B, p) gives us the extent to which predicate p holds for two simple regions A and B. We abbreviate µ(A, B, p) by the vague topological predicate value pv (A, B) with pv : region × region → [0, 1]. The term pv indicates the association to predicate p. Whereas the topological predicate p maps to the set {0, 1} and thus results in a strict and abrupt decision, the vague topological predicate pv maps to the closed interval [0, 1] and hence permits a smooth evaluation. Codomain [0, 1] can be regarded as the data type for vague booleans.
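A direct transcription of µ might look as follows; this is our own Python sketch, with both matrices represented as flat 9-tuples in the row-major order used above.

```python
def applicability(bi, rs):
    """Applicability degree µ for one predicate p.
    bi: the boolean 9-intersection matrix BI_p as a 9-tuple of 0/1 (row-major)
    rs: the similarity matrix RS_p(A, B) as a 9-tuple of values in [0, 1]"""
    return sum(r if b else 1.0 - r for b, r in zip(bi, rs)) / 9.0

def vague_predicates(rs, predicates):
    """Evaluate every vague predicate p_v(A, B) = µ(A, B, p) for a dict mapping
    predicate names to their BI_p tuples (e.g. the eight matrices of Section 2)."""
    return {name: applicability(bi, rs) for name, bi in predicates.items()}

# Example with an arbitrary similarity matrix:
BI = {"overlap": (1, 1, 1, 1, 1, 1, 1, 1, 1), "equal": (1, 0, 0, 0, 1, 0, 0, 0, 1)}
RS = (0.05, 0.6, 0.4, 0.55, 0.7, 0.2, 0.5, 0.15, 0.9)
print(vague_predicates(RS, BI))
```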
4 Mapping Quantitative Measures to Qualitative, Lexical Items The fact that the applicability degree yielded by a vague topological predicate is a computationally determined quantification between 0 and 1, that is, a vague boolean, impedes a direct integration into a query language. First, it is not very comfortable and user-friendly to use such a numeric value in a query. Second, spatial selections and spatial joins are not able to cope with vague predicates and expect strict and exact predicates as filter conditions that yield true or false. As a solution which maintains this requirement, we propose to embed adequate qualitative linguistic descriptions of nuances of topological relationships as appropriate interpretations of the applicability degrees into a spatial query language. Notice that the linguistic descriptions given in the following are arbitrary and exchangeable, since it is beyond the scope of this paper to discuss linguistic reasons how to associate a meaning to a given applicability degree. In particular, we think that the users themselves should be responsible for specifying a list of appropriate linguistic terms and for associating an applicability degree with each of them. This gives them greatest flexibility for querying. For example, depending on the applicability degree yielded by the predicate inside v , the user could distinguish between not inside, a little bit inside, somewhat inside, slightly inside, quite inside, mostly inside, nearly completely inside, and completely inside. These user-defined, vague linguistic terms can then be incorporated into spatial queries together with the topological predicates they modify. We call these terms vague quantifiers, because their semantics lies between the universal quantifier for all and the existential quantifier there exists.
Fig. 4. Membership functions for vague quantifiers.
We know that a vague topological predicate pv is defined as pv : region × region → [0, 1]. The idea is now to represent each vague quantifier γ ∈ Γ = {not, a little bit, somewhat, slightly, quite, mostly, nearly completely, completely, . . . } by an appropriate membership function µγ : [0, 1] → [0, 1]. Let A, B ∈ region, and let γ pv be a quantified vague predicate (like somewhat inside with γ = somewhat and pv = inside v ). Then we can define: γ pv (A, B) = true
:⇔   (µγ ◦ pv)(A, B) = 1
That is, only for those values of pv (A, B) for which µγ yields 1, the predicate γ pv is true. A membership function that fulfils this quite strict condition is, for instance, the partition of [0, 1] into n ≤ |Γ | disjoint or adjacent intervals completely covering [0, 1] and the assignment of each interval to a vague quantifier. If an interval [a, b] is assigned to a vague quantifier γ, the intended meaning is that µγ (pv (A, B)) = 1, if a ≤ pv (A, B) ≤ b, and 0 otherwise. For example, the user could select the intervals [0.0, 0.02] for not, [0.02, 0.05] for a little bit, [0.05, 0.2] for somewhat, [0.2, 0.5] for slightly, [0.5, 0.8] for quite, [0.8, 0.95] for mostly, [0.95, 0.98] for nearly completely, and [0.98, 1.00] for completely. Alternative membership functions are shown in Figure 4. While we can always find a fitting vague quantifier for the partition due to the complete coverage of the interval [0, 1], this is not necessarily the case here. Each vague quantifier is associated with a vague number having a trapezoidal-shaped or triangular-shaped membership function. The transition between two consecutive vague quantifiers is smooth and here modeled by linear functions. Within a vague transition area, µγ yields a value less than 1 which makes the predicate γ pv false. Examples in Figure 4 can be found at 0.2, 0.5, or 0.8. Each vague number associated with a vague quantifier can be represented as a quadruple (a, b, c, d) where the membership function starts at (a, 0), linearly increases up to (b, 1), remains constant up to (c, 1), and linearly decreases up to (d, 0). Figure 4 assigns (0.0, 0.0, 0.0, 0.02) to not, (0.01, 0.02, 0.03, 0.08) to a little bit, (0.03, 0.08, 0.15, 0.25) to somewhat, (0.15, 0.25, 0.45, 0.55) to slightly, (0.45, 0.55, 0.75, 0.85) to quite, (0.75, 0.85, 0.92, 0.96) to mostly, (0.92, 0.96, 0.97, 0.99) to nearly completely, and (0.97, 1.0, 1.0, 1.0) to completely.
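The trapezoidal membership functions can be coded up directly from the quadruples listed above; the Python sketch below is our own illustration and also includes both the strict test (µγ = 1) and the relaxed test (µγ > 0) discussed next.

```python
QUANTIFIERS = {                      # (a, b, c, d): 0 at a, 1 on [b, c], 0 again at d
    "not":               (0.00, 0.00, 0.00, 0.02),
    "a little bit":      (0.01, 0.02, 0.03, 0.08),
    "somewhat":          (0.03, 0.08, 0.15, 0.25),
    "slightly":          (0.15, 0.25, 0.45, 0.55),
    "quite":             (0.45, 0.55, 0.75, 0.85),
    "mostly":            (0.75, 0.85, 0.92, 0.96),
    "nearly completely": (0.92, 0.96, 0.97, 0.99),
    "completely":        (0.97, 1.00, 1.00, 1.00),
}

def membership(gamma, x):
    """Trapezoidal membership value µ_γ(x) for an applicability degree x in [0, 1]."""
    a, b, c, d = QUANTIFIERS[gamma]
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def holds(gamma, degree, strict=True):
    """Strict semantics: γ p_v is true iff µ_γ(p_v(A, B)) = 1;
    relaxed semantics: true iff µ_γ(p_v(A, B)) > 0."""
    m = membership(gamma, degree)
    return m == 1.0 if strict else m > 0.0

print(holds("quite", 0.62), holds("mostly", 0.80, strict=False))   # True True
```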
So far, the predicate γ pv is only true if µγ yields 1. We can relax this strict condition by defining: γ pv (A, B) = true :⇔ (µγ ◦ pv)(A, B) > 0
In a spatial database system this gives us the chance also to take the transition zones into account and to let them make the predicate γ pv true. When evaluating a spatial selection or join in a spatial database system on the basis of a vague topological predicate, we can even set up a weighted ranking of database objects satisfying the predicate γ pv at all and being ordered by descending membership value 1 ≥ µγ (x) > 0 for some value x ∈ [0, 1]. A special, optional vague quantifier, denoted by at all, represents the existential quantifier and checks whether a predicate pv can be fulfilled to any extent. An example query is: “Do regions A and B (at all ) overlap?” With this quantifier we can determine whether µγ (x) > 0 for some value x ∈ [0, 1].
5 Querying In this section we briefly demonstrate with a few example queries how spatial data types and quantified vague topological predicates can be integrated into an SQL-like spatial query language. It is not our objective to give a full description of a specific language. We assume a relational data model where tables may contain regions as attribute values in the same way as integers or strings. What we need first are mechanisms to declare and to activate user-defined vague quantifiers. These mechanisms should allow the user to specify trapezoidal-shaped and triangular-shaped membership functions as well as partitions. In general, this means to define a (possibly overlapping) classification, which for our example in Section 4 could be expressed by the user in the following way (using the membership function quadruples of Figure 4):

create classification fq (not               (0.00, 0.00, 0.00, 0.02),
                          a little bit      (0.01, 0.02, 0.03, 0.08),
                          somewhat          (0.03, 0.08, 0.15, 0.25),
                          slightly          (0.15, 0.25, 0.45, 0.55),
                          quite             (0.45, 0.55, 0.75, 0.85),
                          mostly            (0.75, 0.85, 0.92, 0.96),
                          nearly completely (0.92, 0.96, 0.97, 0.99),
                          completely        (0.97, 1.00, 1.00, 1.00))
Such a classification could then be activated by

set classification fq

We assume that we have a relation pollution, which stores among other things the geometry of polluted zones as regions, and a relation areas, which keeps
information about the use of land areas and which stores their spatial extent as regions. A query could be to find out all inhabited areas where people are rather endangered by pollution. This can be formulated in an SQL-like style as (we here use infix notation for the predicates):

select areas.name
from pollution, areas
where areas.use = inhabited and
      pollution.region quite overlaps areas.region

This query and the following two ones represent vague spatial joins. Another query asks for those inhabited areas lying almost entirely in polluted areas:

select areas.name
from pollution, areas
where areas.use = inhabited and
      areas.region nearly completely inside pollution.region

Assume that we are given living spaces of different animal species in a relation animals and that their indeterminate extent is represented as a vague region. Then we can search for pairs of species which share a common living space to some degree:

select A.name, B.name
from animals A, animals B
where A.region at all overlaps B.region

As a last example, we can ask for animals that usually live on land and seldom enter the water or for species that never leave their land area (the built-in aggregation function sum is applied to a set of vague regions and aggregates this set by repeated application of vague geometric union):

select name
from animals
where (select sum(region) from areas) nearly completely covers
      or completely covers region
6 Conclusions In this paper we have presented a simple but expressive and effective concept showing how metric details can be leveraged to make important semantic distinctions of topological relationships on simple regions. The resulting vague topological predicates are often more adequate for expressing a spatial situation than their coarse, strict counterparts, because they are multi-faceted and
much nearer to human thinking and questioning. Consequently, they allow a much more natural formulation of spatial queries than we can find in current spatial query languages. We are currently working on a prototype implementation for demonstrating the concepts presented in this paper and validating their relevance to practice. In the future we plan to extend the concept of metric refinement to complex regions. We will also investigate metric refinements between two complex line objects and between a complex line object and a complex region object.
References Behr, T. & Schneider, M. (2001), Topological Relationships of Complex Points and Complex Regions, in ‘Int. Conf. on Conceptual Modeling’, pp. 56–69. Clementini, E., Di Felice, P. & Oosterom, P. (1993), A Small Set of Formal Topological Relationships Suitable for End-User Interaction, in ‘3rd Int. Symp. on Advances in Spatial Databases’, LNCS 692, pp. 277–295. Cui, Z., Cohn, A. G. & Randell, D. A. (1993), Qualitative and Topological Relationships, in ‘3rd Int. Symp. on Advances in Spatial Databases’, LNCS 692, pp. 296–315. Egenhofer, M. J., Frank, A. & Jackson, J. P. (1989), A Topological Data Model for Spatial Databases, in ‘1st Int. Symp. on the Design and Implementation of Large Spatial Databases’, LNCS 409, Springer-Verlag, pp. 271–286. Egenhofer, M. J. & Shariff, A. R. (1998), ‘Metric Details for Natural-Language Spatial Relations’, ACM Transactions on Information Systems 16(4), 295–321. Goyal, R. & Egenhofer, M. (2004), ‘Cardinal Directions between Extended Spatial Objects’, IEEE Trans. on Knowledge and Data Engineering . In press. Hernandez, D., Clementini, E. C. & Di Felice, P. (1995), Qualitative Distances, in ‘2nd Int. Conf. on Spatial Information Theory’, LNCS 988, Springer-Verlag, pp. 45–57. Peuquet, D. J. & Xiang, Z. C. (1987), ‘An Algorithm to Determine the Directional Relationship between Arbitrarily-Shaped Polygons in the Plane’, Pattern Recognition 20(1), 65–74. Schneider, M. (1997), Spatial Data Types for Database Systems - Finite Resolution Geometry for Geographic Information Systems, Vol. LNCS 1288, SpringerVerlag, Berlin Heidelberg. Schneider, M. (1999), Uncertainty Management for Spatial Data in Databases: Fuzzy Spatial Data Types, in ‘6th Int. Symp. on Advances in Spatial Databases’, LNCS 1651, Springer-Verlag, pp. 330–351. Schneider, M. (2001a), A Design of Topological Predicates for Complex Crisp and Fuzzy Regions, in ‘Int. Conf. on Conceptual Modeling’, pp. 103–116. Schneider, M. (2001b), Fuzzy Topological Predicates, Their Properties, and Their Integration into Query Languages, in ‘ACM Symp. on Geographic Information Systems’, pp. 9–14. Vazirgiannis, M. (2000), Uncertainty Handling in Spatial Relationships, in ‘ACM Symp. for Applied Computing’.
Fuzzy Modeling of Sparse Data Angelo Marcello Anile¹ and Salvatore Spinella²
¹ Dipartimento di Matematica ed Informatica, Università di Catania, Viale Andrea Doria 6, 90125 Catania, Italy, [email protected]
² Dipartimento di Linguistica, Università della Calabria, Ponte Bucci Cubo 17B, 87036 Arcavacata di Rende, Italy, [email protected]
1 Introduction Geographical data concerning environment pollution consist of a large set of temporal measurements (representing, e.g. hourly measurements for one year) at a few scattered spacial sites. In this case the temporal data at a given site must be summarized in some form in order to employ it as input to build a spatial model. Summarizing the temporal data (data reduction) will necessarily introduce some form of uncertainty which must be taken into account. Statistical methods reduce the data to some moments of the distribution function as means and standard deviations, but these procedures rely on statistical assumptions on the distribution function, which are hard to verify in practice. In the general case, without any special assumption on the distribution function, statistical reduction can grossly misrepresent the data distribution. An alternative way is to represent the data with fuzzy numbers, which has the advantage of keeping the full data content (conservatism) and also of leading to computationally efficient approaches. This method has been employed for ocean floor geographical data by [Patrikalakis et al 1995] (in the interval case) and [Anile 2000] (for fuzzy numbers) and to environmental pollution data by [Anile and Spinella 2004]. Once the temporal data at the given sites have been summarized with fuzzy
164 Angelo Marcello Anile and Salvatore Spinella numbers then it is possible to resort to fuzzy interpolation techniques in order to build a mathematically smooth deterministic surface model representing the spacial distribution of the quantity of interest. An alternative approach would be to employ fuzzy kriging which will build a stochastic model. However our aim is to construct a smooth deterministic model, because this could be used for simulation purposes. We shall use fuzzy kriging only to estimate the missing information which is required just outside the domain boundary, as we shall see, in order to build a consistent deterministic model.
2 Fuzzy representation 2.1 Modeling observations Let O be a sequence of n observational data in a domain X ⊆ R2 in the form O = {(x1 , y1 , Z1 ), . . . , (xi , yi , Zi ), . . . , (xn , yn , Zn )}
(1)
with (xi , yi ) ∈ X
Zi = {zi,1 . . . zi,mi }
(2)
where zi,j ∈ R represents the j-th observation at the point (xi , yi ). By introducing a fuzzy approach [Anile 2000][Lodwick and Santos 2002] that represents the datum Zi with an appropriately constructed fuzzy number it is possible to preserve both the data distribution and their quality. Here fuzzy numbers [Kauffman and Gupta 1991] are defined as maps that associate to each presumption level α ∈ [0, 1] a real interval Aα such that α > α ⇒ Aα ⊆ Aα
(3)
and the latter property is formally called convex hull property. By utilizing one of several methods for constructing fuzzy sets membership functions [Gallo et al. 1999] from Zi one can represent the n observational data as FO = {(x1 , y1 , z1 ), . . . , (xi , yi , zi ), . . . , (xn , yn , zn )} (4) where zi ∈ F (R) is the fuzzy number representing the observations at the point (xi , yi ). For computational purposes fuzzy numbers are represented in terms of a finite discretization of α-levels, which is a natural generalization of intervals and a library for arithmetic operations on them has been implemented in [Anile et al. 1995].
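To make the α-level representation concrete, the following sketch (not part of the original paper; the class and variable names are invented) stores a fuzzy number as a finite list of α-levels, checks the convex hull property (3), and adds two fuzzy numbers level by level, in the spirit of the interval arithmetic library cited above.

```python
from dataclasses import dataclass

@dataclass
class FuzzyNumber:
    """A fuzzy number stored as a finite discretization of alpha-levels.

    levels: list of (alpha, lower, upper), one interval per presumption level.
    """
    levels: list

    def __post_init__(self):
        # Sort by increasing alpha and verify property (3): higher alpha -> narrower interval.
        self.levels = sorted(self.levels)
        for (a0, lo0, hi0), (a1, lo1, hi1) in zip(self.levels, self.levels[1:]):
            if not (lo0 <= lo1 and hi1 <= hi0):
                raise ValueError(f"convex hull property violated between alpha={a0} and alpha={a1}")

    def cut(self, alpha):
        """Return the interval of the largest stored alpha-level not exceeding `alpha`."""
        chosen = self.levels[0][1:]
        for a, lo, hi in self.levels:
            if a <= alpha:
                chosen = (lo, hi)
        return chosen

    def __add__(self, other):
        # Interval arithmetic level by level (assumes both numbers share the same alpha grid).
        return FuzzyNumber([(a, lo1 + lo2, hi1 + hi2)
                            for (a, lo1, hi1), (_, lo2, hi2) in zip(self.levels, other.levels)])

# Example: a noisy temporal record summarized at three presumption levels (invented values).
z = FuzzyNumber([(0.0, 1.0, 9.0), (0.5, 2.5, 6.0), (1.0, 4.0, 4.5)])
print(z.cut(0.5))          # (2.5, 6.0)
print((z + z).cut(1.0))    # (8.0, 9.0)
```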
2.2 Fuzzy B-splines

We introduce fuzzy B-splines as follows [Anile 2000]:

Definition 1 (Fuzzy B-spline). A fuzzy B-spline F(t) relative to the knot sequence (t_0, t_1, \ldots, t_m), m = k + 2(h-1), is a function of the kind F(t): R → F(R) defined as

F(t) = \sum_{i=0}^{k+h-1} F_i \, B_{i,h}(t)    (5)

where the control coefficients F_i are fuzzy numbers and B_{i,h}(t) are real B-spline basis functions [DeBoor 1972].

Notice that Definition 1 is consistent with the previous definitions; more precisely, for any t, F(t) is a fuzzy number, i.e. it verifies the convex hull property (3): α' > α ⇒ F(t)_{α'} ⊆ F(t)_α, because the B-spline basis functions are non-negative. The generalization in 2D of a fuzzy B-spline relative to a rectangular grid of M × N knots is

f(u,v) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} F_{i,j} \, B_{i,h}(u) B_{j,h}(v)    (6)

with the same properties as described above. Similar considerations in the more general framework of fuzzy interpolation can be found in [Lodwick and Santos 2002]. Here we proceed with the construction of a fuzzy B-spline approximation following the approach already expounded in detail in [Anile and Spinella 2004].

2.3 Constructing fuzzy B-spline surfaces

Let us consider a sequence of fuzzy numbers representing the observations (4). If a fuzzy B-spline F(u,v) on a rectangular grid G ⊇ X of M × N knots approximates FO (4) then

[\tilde{z}_i]_\alpha \subseteq [F(x_i, y_i)]_\alpha, \quad i = 1 \ldots n, \quad \forall \alpha \in [0,1]    (7)

and furthermore one must also have
\int_G \sum_{i=0}^{M-1}\sum_{j=0}^{N-1} ([F_{i,j}^u]_\alpha - [F_{i,j}^l]_\alpha) \, B_{i,h}(u) B_{j,h}(v) \, du \, dv \;\le\; \int_G \sum_{i=0}^{M-1}\sum_{j=0}^{N-1} ([Y_{i,j}^u]_\alpha - [Y_{i,j}^l]_\alpha) \, B_{i,h}(u) B_{j,h}(v) \, du \, dv,
\quad \forall \{Y_{i,j}\}_{i=0 \ldots M-1,\, j=0 \ldots N-1} \in F(R), \quad \forall \alpha \in [0,1]    (8)

where [F^l]_α and [F^u]_α indicate respectively the lower and upper bound of the interval representing the fuzzy number α-level. More precisely, for each presumption α-level, the volume encompassed by the upper and lower surfaces of the fuzzy B-spline is the smallest, which corresponds to minimizing the uncertainty propagation. These definitions are the generalization to the fuzzy case of the corresponding interval ones [Patrikalakis et al 1995]. Notice that the integral on a rectangular domain of a real B-spline is a linear expression:

\int_D \sum_{i=0}^{M-1}\sum_{j=0}^{N-1} P_{i,j} \, B_{i,h}(u) B_{j,h}(v) \, du \, dv \;=\; \sum_{i=0}^{M-1}\sum_{j=0}^{N-1} P_{i,j} \, \frac{(t_{i+h} - t_i)(s_{j+h} - s_j)}{h^2}    (9)

where obviously \{(t_i, s_j)\}_{i=0 \ldots M+h-1,\, j=0 \ldots N+h-1} are the grid knots. Therefore, given the set FO of observations in (4) and a finite number P+1 of presumption levels α_0 > α_1 > \ldots > α_P, the construction of a fuzzy B-spline requires the solution of the following constrained optimization problem:

\min \; \sum_{k=0}^{P}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1} ([F_{i,j}^u]_{\alpha_k} - [F_{i,j}^l]_{\alpha_k}) \, \frac{(t_{i+h} - t_i)(s_{j+h} - s_j)}{h^2}

subject to

[F_{i,j}^l]_{\alpha_0} \le [F_{i,j}^l]_{\alpha_1} \le \ldots \le [F_{i,j}^l]_{\alpha_P} \le [F_{i,j}^u]_{\alpha_P} \le \ldots \le [F_{i,j}^u]_{\alpha_1} \le [F_{i,j}^u]_{\alpha_0}, \quad i = 0 \ldots M-1, \; j = 0 \ldots N-1

\sum_{i=0}^{M-1}\sum_{j=0}^{N-1} [F_{i,j}^u]_{\alpha_k} B_{i,h}(x_r) B_{j,h}(y_r) \ge [\tilde{z}_r^u]_{\alpha_k}, \quad r = 1, \ldots, n, \; k = 0, \ldots, P

\sum_{i=0}^{M-1}\sum_{j=0}^{N-1} [F_{i,j}^l]_{\alpha_k} B_{i,h}(x_r) B_{j,h}(y_r) \le [\tilde{z}_r^l]_{\alpha_k}, \quad r = 1, \ldots, n, \; k = 0, \ldots, P    (10)

For problem (10) one notices that the objective function minimizes the uncertainty of the representation.
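Because the objective and all constraints in (10) are linear in the α-level bounds of the control coefficients, the construction can be posed as a linear program. The sketch below is illustrative only: it is not the authors' code, the toy data are invented, and for brevity it treats the one-dimensional analogue of (10) (a fuzzy B-spline curve rather than the tensor-product surface); assembling the surface case adds a second control-coefficient index but is otherwise identical.

```python
import numpy as np
from scipy.optimize import linprog

def bspline_basis(i, h, t, x):
    """Cox-de Boor recursion: value at x of the i-th B-spline of order h on the knot sequence t."""
    if h == 1:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = 0.0 if t[i + h - 1] == t[i] else \
        (x - t[i]) / (t[i + h - 1] - t[i]) * bspline_basis(i, h - 1, t, x)
    right = 0.0 if t[i + h] == t[i + 1] else \
        (t[i + h] - x) / (t[i + h] - t[i + 1]) * bspline_basis(i + 1, h - 1, t, x)
    return left + right

# Toy fuzzy observations at four sites, given as (lower, upper) bounds per alpha-level (invented values).
h = 3                                        # B-spline order
knots = np.array([0, 0, 0, 1, 2, 3, 3, 3], dtype=float)
n_coef = len(knots) - h                      # number of fuzzy control coefficients
alphas = [0.0, 0.5, 1.0]                     # presumption levels, lowest to highest
xs = np.array([0.3, 1.1, 1.9, 2.7])          # observation sites
z_lo = {0.0: [1.0, 2.0, 1.5, 0.5], 0.5: [1.2, 2.2, 1.7, 0.7], 1.0: [1.4, 2.4, 1.9, 0.9]}
z_hi = {0.0: [3.0, 4.0, 3.5, 2.5], 0.5: [2.8, 3.8, 3.3, 2.3], 1.0: [2.6, 3.6, 3.1, 2.1]}

B = np.array([[bspline_basis(i, h, knots, x) for i in range(n_coef)] for x in xs])
w = np.array([(knots[i + h] - knots[i]) / h for i in range(n_coef)])   # integral of each basis, cf. (9)

def idx(level, i, upper):
    """Position of the lower/upper bound of coefficient i at alpha-level `level` in the LP variable vector."""
    return (2 * level + int(upper)) * n_coef + i

n_var = 2 * len(alphas) * n_coef
c = np.zeros(n_var)
A_ub, b_ub = [], []
for k, a in enumerate(alphas):
    # Objective: total width of the fuzzy B-spline summed over alpha-levels (the objective of (10)).
    c[[idx(k, i, True) for i in range(n_coef)]] += w
    c[[idx(k, i, False) for i in range(n_coef)]] -= w
    for r in range(len(xs)):
        row = np.zeros(n_var)                # upper bound of F must dominate the upper bound of the datum
        row[[idx(k, i, True) for i in range(n_coef)]] = -B[r]
        A_ub.append(row)
        b_ub.append(-z_hi[a][r])
        row = np.zeros(n_var)                # lower bound of F must stay below the lower bound of the datum
        row[[idx(k, i, False) for i in range(n_coef)]] = B[r]
        A_ub.append(row)
        b_ub.append(z_lo[a][r])
for i in range(n_coef):                      # nestedness of the alpha-levels of every control coefficient
    for k in range(len(alphas) - 1):
        row = np.zeros(n_var); row[idx(k, i, False)] = 1.0; row[idx(k + 1, i, False)] = -1.0
        A_ub.append(row); b_ub.append(0.0)
        row = np.zeros(n_var); row[idx(k + 1, i, True)] = 1.0; row[idx(k, i, True)] = -1.0
        A_ub.append(row); b_ub.append(0.0)
    row = np.zeros(n_var); row[idx(len(alphas) - 1, i, False)] = 1.0; row[idx(len(alphas) - 1, i, True)] = -1.0
    A_ub.append(row); b_ub.append(0.0)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(None, None))
print("optimal total width:", round(res.fun, 3))
```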
3 Fuzzy Kriging and boundary condition

In the previous section we expounded the method for constructing a fuzzy surface approximating, in a well-defined way, the fuzzy numbers representing the data sets. The quality of this approximation will deteriorate the farther one is from the sites of the data sets. Therefore one expects the constructed approximation not to be very satisfactory at the border of the domain within which the data sites are comprised. To remedy such drawbacks one has to introduce further information regarding the decay of the quantity of interest away from the domain. A simple approach would be to assume that the quantity of interest decays to zero outside the boundary, but this is hardly justifiable. A better treatment is to construct fictitious observation data just outside the boundary by utilizing statistical kriging. The latter approach is more realistic because, in some sense, it amounts to an extrapolation driven by the data. For ordinary spatially distributed data, under a stationarity hypothesis on the distribution, kriging combines the available data through weights in order to construct an unbiased estimator with minimum variance. This approach is extended likewise to the fuzzy case [Diamond 1989]. Given N spatially distributed fuzzy data \{(x_1, y_1, \tilde{z}_1), \ldots, (x_i, y_i, \tilde{z}_i), \ldots, (x_n, y_n, \tilde{z}_n)\}, one looks for an estimator constructed as a linear combination with weights λ_i to evaluate the distribution at the point (x, y):

\tilde{Z}^* = \sum_{i=1}^{N} \lambda_i \tilde{z}_i    (11)

Notice that the above interprets the fuzzy data \tilde{z}_i and the estimator \tilde{Z}^* as random fuzzy numbers. In order to construct this estimator the following hypotheses must be satisfied:

(K1) \; E(\tilde{Z}^*) = E(\tilde{Z}) = E(\tilde{Z}(x+h, y+k)) = r
(K2) \; E\, d(\tilde{Z}^*, \tilde{Z})^2 \text{ must be minimized}
(K3) \; \lambda_i \ge 0, \quad i = 1 \ldots N    (12)

The first condition (K1) implies \sum_{i=1}^{N} \lambda_i = 1, while the last one (K3) guarantees that the estimator stays in the cone of random fuzzy numbers generated by the data. It can be proved that

E\, d(\tilde{Z}^*, \tilde{Z})^2 = \sum_{i,j=1}^{N} \lambda_i \lambda_j \sum_\alpha C_\alpha(x_i, y_i, x_j, y_j) - 2 \sum_{i=1}^{N} \lambda_i \sum_\alpha C_\alpha(x_i, y_i, x, y) + \sum_\alpha C_\alpha(x, y, x, y)    (13)

where C_α is a positive definite function that represents a covariance. The minimization of this function, together with the hypotheses (K1) and (K3), leads to a constrained minimization problem. It is solvable by formulating everything in terms of Kuhn-Tucker conditions, through the following theorem.

Theorem 1. Let \tilde{Z}^* be an estimator for \tilde{Z} of the form \tilde{Z}^* = \sum_{i=1}^{N} \lambda_i \tilde{z}_i. Suppose the conditions (K1) and (K3) are satisfied and the matrix defined by \Gamma_{ij} = \sum_\alpha C_\alpha(x_i, y_i, x_j, y_j), i, j = 1 \ldots N, is strictly positive definite. Then there exists a unique linear unbiased estimator \tilde{Z}^* satisfying the (K2) condition. Moreover the weights satisfy the following system:

\sum_{i=1}^{N} \Gamma_{ij} \lambda_i - L_j - \mu = \sum_\alpha C_\alpha(x_j, y_j, x, y)
\sum_{i=1}^{N} \lambda_i = 1
\sum_{i=1}^{N} L_i \lambda_i = 0
L_i, \lambda_i \ge 0, \quad i = 1 \ldots N    (14)

The residual is

\sigma^2 = \sum_\alpha C_\alpha(x, y, x, y) + \mu - \sum_{i=1}^{N} \lambda_i \sum_\alpha C_\alpha(x_i, y_i, x, y)    (15)

The above problem defines the weights λ_i and it can be solved using a method that handles the constraints in (14), such as the Active Set Method [Fletcher 1987].
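As an illustration of the constrained minimization behind Theorem 1, the sketch below solves for the weights λ_i with a general-purpose solver instead of the Active Set Method cited in the text; the covariance model, the sites and the values are invented, and the code is not from the paper.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data sites and target location (none of these values come from the paper).
sites = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.5, 1.5]])
target = np.array([0.6, 0.4])
alphas = [0.0, 0.5, 1.0]

def cov(p, q, alpha, c0=1.0, rng=1.0):
    """A simple isotropic exponential covariance per alpha-level (an assumption, not the paper's choice)."""
    return (c0 * (1.0 - 0.3 * alpha)) * np.exp(-np.linalg.norm(p - q) / rng)

# Gamma_ij = sum_alpha C_alpha(x_i, x_j);  c_i = sum_alpha C_alpha(x_i, target)  (cf. eqs. (13)-(14)).
n = len(sites)
Gamma = np.array([[sum(cov(sites[i], sites[j], a) for a in alphas) for j in range(n)] for i in range(n)])
c = np.array([sum(cov(sites[i], target, a) for a in alphas) for i in range(n)])
c00 = sum(cov(target, target, a) for a in alphas)

# Minimize E d(Z*, Z)^2 = lam' Gamma lam - 2 lam' c + C(target, target), with lam >= 0 and sum(lam) = 1.
objective = lambda lam: lam @ Gamma @ lam - 2.0 * lam @ c + c00
res = minimize(objective, x0=np.full(n, 1.0 / n), method="SLSQP",
               bounds=[(0.0, None)] * n,
               constraints=[{"type": "eq", "fun": lambda lam: lam.sum() - 1.0}])
lam = res.x
print("weights:", np.round(lam, 3), " minimized E d^2:", round(float(objective(lam)), 4))

# The fuzzy estimate is then the weighted combination of the fuzzy data, alpha-level by alpha-level:
# since the weights are non-negative, lower and upper bounds can be combined independently.
```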
4 Fuzzy B-spline Interrogation

4.1 Inequality relationship between fuzzy numbers

In order to represent the inequality relationship it is convenient to introduce the definition of overtaking between fuzzy numbers. The concept of overtaking between fuzzy numbers was introduced in [Anile and Spinella 2004]. We start with overtaking between intervals.

Definition 2 (Overtaking between intervals). The overtaking of interval A with respect to interval B is the real function σ: I(R)² → R defined as:

\sigma(A, B) = \begin{cases} 0 & A^u \le B^l \\ \dfrac{A^u - B^l}{\mathrm{width}(A)} & A^u > B^l \wedge A^l \le B^l \\ 1 & A^l > B^l \end{cases}    (16)

where width(A) is the width of interval A.

Fig. 1. Interval A overtakes B by 0, B overtakes C by 3/5 and finally D overtakes C by 1.

Figure 1 clarifies the above definition. The overtaking of A with respect to B is 0, of B with respect to C is 3/5, while of D with respect to C is 1. From this definition one can define the δ-overtaking operator as:

Definition 3 (δ-overtaking operator between intervals). Given two intervals A, B and a real number δ ∈ [0,1], then A overtakes B by δ if σ(A,B) ≥ δ, i.e.

A \ge_\delta B \iff \sigma(A, B) \ge \delta    (17)

Likewise we can define the overtaking between fuzzy numbers as follows.

Definition 4 (Overtaking between fuzzy numbers). One defines the overtaking of the fuzzy number \tilde{A} with respect to the fuzzy number \tilde{B} as the real function σ: F(R)² → R defined as

\sigma(\tilde{A}, \tilde{B}) = \int_0^1 \sigma([\tilde{A}]_\alpha, [\tilde{B}]_\alpha) \, w(\alpha) \, d\alpha    (18)

where w: [0,1] → R is an integrable weight function.

Fig. 2. The fuzzy number A overtakes B by 0, B overtakes C by about 0.61, finally D overtakes C by 1.

Figure 2 clarifies the above definition. The overtaking of A with respect to B is 0, of B with respect to C is about 0.61, while of D with respect to C is 1. Likewise one can define the operator of δ-overtaking between fuzzy numbers as follows.
Definition 5 (δ-overtaking operator between fuzzy numbers). Given \tilde{A}, \tilde{B} ∈ F(R) and a real number δ ∈ [0,1], then \tilde{A} overtakes \tilde{B} by δ if \sigma(\tilde{A}, \tilde{B}) \ge \delta, i.e.

\tilde{A} \ge_\delta \tilde{B} \iff \sigma(\tilde{A}, \tilde{B}) \ge \delta    (19)
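The overtaking measures of Definitions 2-5 are straightforward to compute on α-level representations. The following sketch is purely illustrative: it assumes a constant weight w(α) ≡ 1 and approximates the integral in (18) with a trapezoidal quadrature; the triangular fuzzy numbers are invented.

```python
import numpy as np

def interval_overtaking(a, b):
    """Overtaking sigma(A, B) of interval A = (a_lo, a_hi) with respect to B = (b_lo, b_hi), eq. (16)."""
    a_lo, a_hi = a
    b_lo, b_hi = b
    if a_hi <= b_lo:
        return 0.0
    if a_lo > b_lo:
        return 1.0
    return (a_hi - b_lo) / (a_hi - a_lo)          # width(A) = a_hi - a_lo

def fuzzy_overtaking(A, B, weight=lambda alpha: 1.0, n=101):
    """Overtaking of fuzzy number A with respect to B, eq. (18), by numerical integration over alpha.

    A and B are callables mapping an alpha-level to an interval (lo, hi)."""
    alphas = np.linspace(0.0, 1.0, n)
    values = [interval_overtaking(A(a), B(a)) * weight(a) for a in alphas]
    return np.trapz(values, alphas)

# Example with two triangular fuzzy numbers (hypothetical values).
tri = lambda lo, mid, hi: (lambda a: (lo + a * (mid - lo), hi - a * (hi - mid)))
A = tri(2.0, 4.0, 6.0)
B = tri(3.0, 5.0, 7.0)
print(round(fuzzy_overtaking(A, B), 3))   # degree to which A overtakes B
print(fuzzy_overtaking(B, A) >= 0.5)      # delta-overtaking test, eq. (19), with delta = 0.5
```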
4.2 Query by sampling

By utilizing the definitions of overtaking previously given it is possible to phrase the fuzzy or interval B-spline surface interrogation as a global search in the domain G where the fuzzy or interval B-spline surface is defined. For instance, interrogating the fuzzy B-spline surface in order to find the regions X_δ ⊆ G where F(x,y) overtakes \tilde{z} by δ amounts to

X_\delta = \{(x, y) \mid F(x, y) \ge_\delta \tilde{z}\} = \{(x, y) \mid \sigma(F(x, y), \tilde{z}) \ge \delta\}    (20)

Let us consider now a global search algorithm that divides the given domain G into rectangles R_p and samples at (x_p, y_p) ∈ R_p, considering as objective function to maximize the overtaking s_p = \sigma(F(x_p, y_p), \tilde{z}). By iterating until a stop criterion governed by a parameter ε is satisfied, one can then divide G into the two sets

\tilde{X}_{s_p \ge \delta} = \bigcup_{s_p \ge \delta} R_p, \qquad \tilde{X}_{s_p < \delta} = G \setminus \tilde{X}_{s_p \ge \delta}    (21)

and consider the first set as an approximate solution (within δ accuracy) of the query:

X_\delta \approx \tilde{X}_{s_p \ge \delta}    (22)
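A minimal version of the query by sampling could look as follows. It is a sketch under simplifying assumptions: a uniform grid of rectangles whose size is controlled by the stop parameter ε, rather than an adaptive subdivision guided by s_p, and a synthetic stand-in for σ(F(x,y), \tilde{z}).

```python
import numpy as np

def query_region(surface_overtaking, domain, delta, eps):
    """Approximate X_delta = {(x, y): sigma(F(x, y), z~) >= delta} as a union of grid cells, cf. (20)-(22).

    The domain G is divided into rectangles R_p of side ~eps, each sampled at its centre (x_p, y_p);
    cells with s_p = sigma(F(x_p, y_p), z~) >= delta form the approximate answer set."""
    xmin, xmax, ymin, ymax = domain
    nx = max(1, int(np.ceil((xmax - xmin) / eps)))
    ny = max(1, int(np.ceil((ymax - ymin) / eps)))
    xs = np.linspace(xmin, xmax, nx + 1)
    ys = np.linspace(ymin, ymax, ny + 1)
    accepted = []
    for i in range(nx):
        for j in range(ny):
            xc, yc = 0.5 * (xs[i] + xs[i + 1]), 0.5 * (ys[j] + ys[j + 1])
            if surface_overtaking(xc, yc) >= delta:
                accepted.append((xs[i], xs[i + 1], ys[j], ys[j + 1]))
    return accepted

# Toy stand-in for sigma(F(x, y), z~): a smooth bump, purely for illustration.
bump = lambda x, y: np.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2) / 0.05)
cells = query_region(bump, (0.0, 1.0, 0.0, 1.0), delta=0.5, eps=0.05)
print(len(cells), "cells approximate the region where F overtakes the threshold by at least delta")
```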
5 Fuzzification and Map Construction

The construction of a fuzzy map consists of the following steps.

Fuzzification of data: the fuzzy number \tilde{Z} has been constructed preserving the convex hull property (3), as a map (α-level, interval) which associates to each α-level the smallest median interval containing a fraction (1 - α) of the set of data (see the sketch after this list).

Kriging: outside the region of measured data, "virtual" observations are estimated by kriging in order to complete the information needed for the approximation procedure.

Approximation: a smooth and deterministic model is fitted by a fuzzy B-spline.
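The sketch below illustrates the fuzzification step. It is not the authors' code and uses a simplified reading of the construction: for each α-level it keeps the central fraction (1 - α) of the sorted observations around the median, rather than searching for the literal smallest such interval.

```python
import numpy as np

def fuzzify(samples, alphas=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Summarize the measurements at one site as a fuzzy number given by nested alpha-level intervals.

    For each presumption level alpha the interval keeps the central fraction (1 - alpha) of the sorted
    observations around the median, so the intervals are nested by construction and satisfy (3)."""
    data = np.sort(np.asarray(samples, dtype=float))
    m = len(data)
    levels = []
    for alpha in sorted(alphas):
        drop = min(int(np.floor(alpha * m / 2.0)), (m - 1) // 2)   # observations trimmed from each tail
        levels.append((alpha, data[drop], data[m - 1 - drop]))
    return levels

# Example: a year's worth of hourly CO-like readings at one sensor (synthetic values).
readings = np.random.default_rng(0).gamma(shape=2.0, scale=1.5, size=500)
for alpha, lo, hi in fuzzify(readings):
    print(f"alpha = {alpha:4.2f}   interval = [{lo:5.2f}, {hi:5.2f}]")
```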
For the kriging step the following correlation function was chosen:

C(h) = C_0 \, e^{-h/\alpha}    (23)
where C_0 is the variance of the fuzzy distribution. This is a simple assumption for an isotropic distribution of the data. Moreover, notice that the choice of an anisotropic correlation function for the kriging procedure could be used to take into account more complex characteristics of the region, such as orography and microclimate. The data for CO referring to the city of Catania in the year 2001 have been represented by 17 fuzzy numbers through the α-level construction previously introduced. Each fuzzy number represents the measurements of a sensor placed in the city. Eight kriged points were added at the vertices and along the sides of the map in order to provide an estimate at the boundary of the map. Then we chose a regular 6 × 8 grid and constructed the fuzzy B-spline surface of order 4 by solving problem (10). Finally the fuzzy map was interrogated at increasing levels of CO. Figure 3 shows on the z axis the level of CO pollution exceeded in the mapped urban area (in the sense of (19)).
6 Conclusion

In this case the fuzzy kriging procedure has been used only to supply missing information just outside the boundary of the region of interest. However, one could envisage a hybrid method where the kriging procedure would create a regularly spaced set of fictitious data observations in the form of fuzzy numbers, which afterwards would be approximated by a fuzzy B-spline in order to construct a viable smooth deterministic model.
Acknowledgements We thank Dr. C. Oliveri of the Catania city environmental office for providing the data set.
References

[Anile et al. 1995] A. M. Anile, S. Deodato, G. Privitera, Implementing Fuzzy Arithmetic, Fuzzy Sets and Systems, 72, p. 239, 1995.
[Anile 2000] A. M. Anile, B. Falcidieno, G. Gallo, M. Spagnuolo, S. Spinello, Modeling uncertain data with fuzzy B-spline, Fuzzy Sets and Systems 113 (2000) 397-410.
Fig. 3. The z axis represents the level of pollution defuzzified by overtaking
[Anile and Spinella 2004] A. M. Anile, S. Spinella, Modeling Uncertain Sparse Data with Fuzzy B-spline, accepted for publication in Reliable Computing.
[DeBoor 1972] C. DeBoor, On Calculating with B-splines, J. Approx. Theory 6 (1972) 50-62.
[Diamond 1989] P. Diamond, Fuzzy Kriging, Fuzzy Sets and Systems, 33 (1989) 315-332.
[Fletcher 1987] R. Fletcher, Practical Methods of Optimization, Wiley, 1987.
[Gallo et al. 1999] G. Gallo, I. Perfilieva, M. Spagnuolo, S. Spinello, Geographical Data Analysis via Mountain Function, International Journal of Intelligent Systems, 14, p. 359-373, 1999.
[Kauffman and Gupta 1991] A. Kauffman and M. M. Gupta, Introduction to Fuzzy Arithmetic: Theory and Applications, Van Nostrand Reinhold, New York, 1991.
[Lodwick and Santos 2002] W. A. Lodwick and Jorge Santos, Constructing Consistent Fuzzy Surface From Fuzzy Data, Fuzzy Sets and Systems, 135, p. 259-277, 2002.
[Patrikalakis et al 1995] N. M. Patrikalakis, C. Chryssostomidis, S. T. Tuohy, J. G. Bellingham, J. J. Leonard, J. W. Bales, B. A. Moran, J. W. Yoon, Virtual environment for ocean exploration and visualization, Proc. Computer Graphics Technology for Exploration of the Sea, CES'95, Rostock, May 1995.
Handling Spatial Data Uncertainty Using a Fuzzy Geostatistical Approach for Modelling Methane Emissions at the Island of Java Alfred Stein and Mamta Verma International Institute for Geo-Information Science and Earth Observation (ITC), PO Box 6, 7500 AA Enschede, The Netherlands (email: [email protected]); Indian Institute of Remote Sensing (IIRS), Dehradun, India (email: [email protected])
Abstract Handling uncertain spatial data and modelling of spatial data quality and data uncertainty are currently major challenges in GIS. Geodata usage is growing, for example in agricultural and environmental models. If the data are of a low quality, then model results will be poor as well. An important issue to address is the accuracy of GIS applications for model output. Spatial data uncertainty models, therefore, are necessary to quantify the reliability of model results. In this study we use a combination of fuzzy methods within geostatistical modelling for this purpose. The main motivation is to jointly handle uncertain spatial and model information. Fuzzy set theory is used to model imprecise variogram parameters. Kriging predictions and kriging variances are calculated as fuzzy numbers, characterized by their membership functions. Interval width of predictions measures the effect of variogram uncertainty. The methodology is applied on methane (CH4) emissions at the Island of Java. Kriging standard deviations ranged from 12 to 26.45, as compared to ordinary kriging standard deviations, ranging from 12 to 33.11. Hence fuzzy kriging is considered as an interesting method for modeling and displaying the quality of spatial attributes when using deterministic models in a GIS. Keywords: spatial data uncertainty, fuzzy variogram, fuzzy kriging, methane, java
1 Introduction Handling spatial data quality and uncertainty is currently a major challenge in GIS. Studies on spatial data quality provide information on the fitness-for-use of a spatial database. These describe why, when and how data are created, how accurate the data are and how well they correspond to the physical meaning they represent. These further describe the purpose and usage of the data. Data quality specifically aims at lineage, positional accuracy, attribute accuracy, logical consistency and completeness of the data (Guptill and Morrison, 1996; Goodchild and Jeansoulin, 1998). Concern about uncertainty in spatial data is not new. For example, data errors and uncertainties in GISs were identified in the research agenda of the National Center for Geographic Information and Analysis (NCGIA) since 1989 as one of the most important impediments to the successful implementation of GISs (NCGIA 1989). Agricultural and environmental models are increasingly applied in a GIS context. If the data are of a low quality, then model results will be poor as well. If these models are used for decision support then the data need to be as accurate as possible. A powerful capability of GIS, particularly in earth and environmental sciences, is its capacity to derive new attributes from attributes already present in a GIS database (Burrough, 2001). Spatial data uncertainty is a particular aspect of spatial data quality that focuses on positional and the attribute uncertainty. It describes it in terms of relative precision, measurement precision and modelling accuracy. No layer in a GIS is truly free from error, such as statistical variation. When maps from a GIS database are used as input into a GIS operation, then errors in the input will propagate to the output of the operation. One way of handling uncertainty is with the spatial data uncertainty models that use the error propagation odels (Heuvelink, 1998). Error propagation can only be used if the input errors to the analysis are available, which in practice often will only only crudely and incompletely be the case. In addition, when no record is kept of the accuracy of intermediate results, it becomes extremely difficult to evaluate the accuracy of the final result. Also, no professional GIS currently in use can present the user with information about the confidence limits that should be associated with the results of an analysis (Heuvelink, 1998). Fuzzy theory and applications have found widespread use in the past to overcome vagueness in data, such as incomplete definitions and concepts, approximate measurements and vague reasoning (Bezdek, 1981). In this sense, fuzzy theory is a useful addition to statistical methods, in particular
since it has been possible to handle fuzziness in a quantitative way, using membership functions. An interesting development concerns fuzzy kriging. Fuzzy kriging includes data and information of restricted quality into an interpolation procedure by considering fuzziness of variogram parameters. Bardossy et al. (1990a,b) calculated kriged values and estimation variances as fuzzy numbers, characterized by their membership functions. Membership functions create an uncertainty measure, which depends both on homogeneity and configuration of the data. Transition of laboratory data to the field situation, followed by extraction of useful information from field analysis, requires a quantitative approach to integrate and propagate the uncertainty derived from physical and temporal heterogeneity between defined and open systems. Geostatistics allows us to quantify the uncertainty associated with this transition (Chilès and Delfiner, 1999). The combination of geostatistical modelling and fuzzy set methods allows one to handle these different types of uncertain information (Zadeh, 1965; Klir and Folger, 1988; Bardossy et al., 1990a). Diamond and Kloeden (1989) have used fuzzy valued random functions in kriging. In the calculation of the reliability of estimation, variogram models play a critical role. In this study fuzzy set theory is used to model imprecise variogram parameters in relation to an agricultural model. The objective of this study is to combine fuzzy methods with geostatistics of environmental modelling data. In particular, we study handling of different types of uncertain information. The study is illustrated with modelling of methane (CH4) emissions in rice paddies at the isle of Java.
2 Materials and Methods Fuzzy kriging includes fuzzy methods within a geostatistical modelling to handle the spatial data uncertainty (Bardossy et al., 1990a). The main motivation is to jointly handle different types of uncertain information like uncertainty in the variogram parameters and uncertainty in the variogram model. Fuzzy set theory is used to model imprecise variogram parameters, yielding fuzzy variograms. These are used in turn in fuzzy kriging. Both predictions and estimation variances are obtained as fuzzy numbers and are characterized by their membership functions. The interval width of the kriged values quantifies the effect of variogram uncertainty. This measure of uncertainty both follows from variogram uncertainty and depends on the actual measurement values.
2.1 Preliminaries

We consider spatial data z(x_1),…,z(x_n), collected at supports V with central points x_1,…,x_n within a space X. These data can be either measurements or modelled results using collected data. The support can be either a point or an area with a positive size |V|. Spatial variability is in this study described by the variogram. For each pair of points z(x_i), z(x_i + h) separated by the distance h, the sum of squared pair differences yields the estimated variogram

\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_i (z(x_i) - z(x_i + h))^2

where N(h) equals the number of point pairs at separation distance h and summation is done for i = 1 to N(h). Use of approximate distances h_j ensures that sufficient pairs of points (N(h) > 30) are obtained for each distance class. Through the pairs (h_j, \hat{\gamma}(h_j)) a parametric model is fitted. Common models are the exponential model γ_e(h) = θ_1 + θ_2 (1 - exp{-h/θ_3}) and the spherical model γ_s(h) = θ_1 + θ_2 (3/2 · h/θ_3 - 1/2 (h/θ_3)³). These models apply for h > 0, whereas γ_e(0) = γ_s(0) = 0 and γ_s(h) = θ_1 + θ_2 for h > θ_3. Both models depend upon a vector Θ of k = 3 parameters. In this study this will be assumed to belong to a fuzzy set {Θ}, in contrast to a crisp set as commonly in ordinary kriging. The set {Θ} is assumed to be compact.

Kriging is equivalent to predicting values at an unvisited location. Let the vector γ contain evaluations of the variogram from observation locations to the prediction location, and let the matrix Γ contain those among the observation locations. The kriging equation equals

\hat{t} = \hat{\beta} + \gamma' \Gamma^{-1}(z - \hat{\beta} 1_n) = K[V, \Theta, \{x_i\}_{i=1,\ldots,n}, \{z(x_i)\}_{i=1,\ldots,n}]    (1)

where 1_n is the vector of n elements, all equal to 1, z = (z(x_1),…,z(x_n))' is the column vector containing the n observations, and \hat{\beta} = G\, 1_n' \Gamma^{-1} z is the spatial mean, with G = (1_n' \Gamma^{-1} 1_n)^{-1}. The symbol K is used to emphasize that kriging is an operator on the support, the parameters, the configuration of observation points and the observations themselves. As \hat{\beta} is linear in the observations z(x_i), the kriging predictor is also linear in the observations. In addition, the minimum prediction error variance, the so-called kriging variance, can be expressed as an operator U on the support set V, the set Θ of variogram parameters and the configuration set of observation locations x_i, i = 1,…,n, but not on the observations themselves, as
Var(\hat{t} - z(x_0)) = \gamma' \Gamma^{-1} \gamma - x_a' G \, x_a = U[V, \Theta, \{x_i\}_{i=1,\ldots,n}]    (2)

where x_a = 1 - 1_n' \Gamma^{-1} \gamma. The kriging variance does not contain the observations, but relies on C, being the configuration of data points and prediction location, and on the variogram. Both the data and the variogram are commonly assumed to be without errors.

2.2 Fuzzy kriging

Fuzzy kriging is an extension of equations (1) and (2) as it takes uncertainty in the variogram parameters into account. To define fuzzy kriging, we need the extension principle. Consider the set of fuzzy sets on Θ, denoted as F(Θ)^p. A function g: Θ^p → T is extended to a function \hat{g}: F(Θ)^p → F(T) by defining for all fuzzy sets Θ_1,…,Θ_p from F(Θ)^p a set B = \hat{g}(Θ_1, \ldots, Θ_p) with

\mu_B(t) = \sup_{\theta_1, \ldots, \theta_p:\ t = g(\theta_1, \ldots, \theta_p)} \min\{\mu_{\Theta_1}(\theta_1), \ldots, \mu_{\Theta_p}(\theta_p)\}

for all t ∈ T. Using the extension principle for the kriging operator K, the membership value for any real number t resulting from the kriging equation (1) is:

\mu_K(t) = \begin{cases} \sup\{\mu_\Theta(\theta);\ t = K[V, \theta, \{x_i\}_{i=1,\ldots,n}, \{z(x_i)\}_{i=1,\ldots,n}]\} \\ 0 \ \text{if the above set is empty} \end{cases}    (3)

Here, \mu_\Theta(\theta) is the membership function for the fuzzy subset of variogram parameters. This equation defines a membership function on the set of real numbers. The fuzzy number corresponding to this membership function is the fuzzy kriging prediction. Use of the extension principle with operator U results in an estimation variance expressed as a fuzzy number. Similarly, the membership value of any real number s² for the estimation variance is defined as:

\mu_U(s^2) = \begin{cases} \sup\{\mu_\Theta(\theta);\ s^2 = U[V, \theta, \{x_i\}_{i=1,\ldots,n}]\} \\ 0 \ \text{if the above set is empty} \end{cases}    (4)
The above fuzzy set is a fuzzy number. In this way the uncertainty (imprecision) of the variogram model is transferred to the kriging estimate and estimation variance.

2.3 Implementation

At each location, fuzzy kriging results in fuzzy numbers for the kriging estimate and estimation variance. However, the definition of the membership functions (eqs. (3) and (4)) as suprema is not convenient for computational purposes, because it would require for each possible real number t an optimization algorithm to retrieve the membership value µ. To simplify calculations, instead of assigning a membership value to every specific value t, a value t can be assigned to each specific membership value µ. If the fuzzy set Θ is connected, this can be done by using the fact that level sets of fuzzy numbers are intervals. For fuzzy kriging, the endpoints of these intervals can be determined with the help of two optimization problems; namely, for any selected membership level 0 < t < 1 find:

\min_\theta \; K[V, \theta, \{x_i\}_{i=1,\ldots,n}, \{z(x_i)\}_{i=1,\ldots,n}]

subject to the constraint \mu_\Theta(\theta) \ge t, and find:

\max_\theta \; K[V, \theta, \{x_i\}_{i=1,\ldots,n}, \{z(x_i)\}_{i=1,\ldots,n}]

subject to the same constraint. Results of these optimizations provide the endpoints of intervals R_t. These optimizations for a selected finite set of t are sufficient because, by virtue of convexity, they provide greater and lesser bounds for membership of intermediate values. The particular shape of the membership function can be taken into account by repeatedly changing the support of fuzziness for each prediction, yielding a full membership function in the end. The same is done to calculate fuzzy estimation variances.
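To illustrate how this α-cut optimization can be carried out, the sketch below computes, for a few membership levels, the interval of ordinary kriging predictions obtainable as the exponential variogram parameters range over the corresponding cut of triangular fuzzy numbers. All data values, parameter triples and the choice of optimizer are invented for illustration; this is not the implementation used in the study.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical observations (locations in km, values in g m-2 season-1); not the Java data set.
x = np.array([[0.0, 0.0], [30.0, 5.0], [12.0, 40.0], [45.0, 35.0], [25.0, 20.0]])
z = np.array([35.0, 52.0, 41.0, 60.0, 47.0])
x0 = np.array([20.0, 25.0])                       # prediction location

def gamma_exp(h, theta):
    """Exponential variogram with theta = (nugget, sill, range); gamma(0) = 0."""
    nugget, sill, rnge = theta
    return np.where(h > 0.0, nugget + sill * (1.0 - np.exp(-h / rnge)), 0.0)

def kriging_predict(theta):
    """Prediction at x0 written as in eq. (1): beta + gamma' Gamma^-1 (z - beta 1)."""
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    Gamma = gamma_exp(d, theta)
    g = gamma_exp(np.linalg.norm(x - x0, axis=-1), theta)
    ones = np.ones(len(z))
    Gi_ones = np.linalg.solve(Gamma, ones)
    beta = (Gi_ones @ z) / (Gi_ones @ ones)       # generalized least squares mean, beta-hat
    return beta + g @ np.linalg.solve(Gamma, z - beta * ones)

# Triangular fuzzy variogram parameters (left, modal, right); illustrative numbers only.
fuzzy_theta = np.array([[30.0, 37.7, 45.0],       # nugget
                        [270.0, 296.6, 320.0],    # sill
                        [300.0, 350.0, 400.0]])   # range

def alpha_cut(level):
    """Interval of each parameter whose triangular membership is at least `level`."""
    left, mode, right = fuzzy_theta.T
    return list(zip(left + level * (mode - left), right - level * (right - mode)))

for level in (0.25, 0.5, 0.75):
    bounds = alpha_cut(level)
    start = fuzzy_theta[:, 1]
    lo = minimize(kriging_predict, start, bounds=bounds, method="L-BFGS-B").fun
    hi = -minimize(lambda th: -kriging_predict(th), start, bounds=bounds, method="L-BFGS-B").fun
    print(f"membership level {level:.2f}: prediction interval [{lo:.2f}, {hi:.2f}]")
```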
3 Study area and data used The methodology is applied on a dataset from the Island of Java (fig. 1), containing modelled methane data (Van Bodegom et al., 2001, 2002). The isle of Java is approximately 280 km wide and more than 1000 km long. It
is the most densely populated island of Indonesia. The island is traversed from east to west by a volcanic mountain chain, but more than 60% of the 130,000 km2 island has been cultivated. In general the soil is physically and chemically fertile and alluvial. The samples used in this study were collected within the rice growing areas of Java. Methane emission is measured at a plot scale of 1 m2 using a closed chamber technique. Soil samples are analyzed by the Center for Soil and Agro-climate Research (CSAR) in Bogor, Indonesia. Soil organic carbon was estimated from soil profile data in the world inventory of soil emission potentials database, which was directly available at CSAR. Methane (CH4) is an important greenhouse gas and rice paddy fields are among the most important sources of atmospheric methane; these methane emissions account for 15-20% of the radiative forcing added to the atmosphere (Houghton et al, 1996). Precise estimates of global methane emissions from rice paddies are, however, not available and depend on the approaches, techniques and databases used. One principal cause for uncertainties in global estimates results from the large, intrinsic spatial and temporal variability in methane emissions. Methane emissions from rice fields are strongly influenced by the presence of the rice plants. Methane emissions are higher with rice plants than without (Holzapfel-Pschorn and Seiler, 1986), and methane emissions are dominated by plant-mediated emissions (Schütz et al., 1989). The rice crop influences the processes underlying methane emissions via its roots.
Figure 1: Map of Java Island showing the distribution of the modelled methane emissions throughout the paddy fields.
4 Results
4.1 Descriptive statistics

Statistics of the variables of the data set are shown in Table 1. Mean (44.8) and median (47.8) are relatively close to each other, and the standard deviation (26.6) is approximately equal to half the value of the mean. Such a coefficient of variation shows that the distribution may be skewed, in this case to the right. In fact, high CH4 emission values (maximum value equals 192.3) occur.

                              minimum   maximum   mean   median   variance
x (m)                          149066   1085455
y (m)                           71832    351632
Methane (g m-2 season-1)         2.22       192   47.8     44.8        707

Table 1: Descriptive statistics for the study area.
Empirical variograms were constructed and both an exponential and a spherical model were fitted. The exponential model gave the best fit, yielding parameter values equal to \hat{\Theta} = (\hat{\theta}_1, \hat{\theta}_2, \hat{\theta}_3)' = (37.7, 296.6, 350.0)', where \hat{\theta}_1 corresponds with the nugget, \hat{\theta}_2 with the sill and \hat{\theta}_3 with the range. From the variogram fit we notice that the estimated parameters are relatively uncertain, and we interpreted these as fuzzy numbers. In the subsequent analysis we used a fixed value for the nugget, we allowed a spread of 10 km in the range parameter and a spread of 25 g2 m-4 season-2 in the sill, leading to a nugget \hat{\theta}_1 equal to (350.0), a fuzzy value for \hat{\theta}_2 equal to (27.7, 37.7, 47.7) and a fuzzy value for \hat{\theta}_3 equal to (271.6, 296.6, 321.6). As concerns notation, fuzzy numbers y are denoted as triples (y_1, y_2, y_3), where the membership function µ(x) equals 1 at y_2, and equals 0 for x < y_1 as well as for x > y_3. A triangular membership function is applied, i.e.

\mu(x) = \begin{cases} 0 & \text{if } x < y_1 \\ \dfrac{x - y_1}{y_2 - y_1} & \text{if } y_1 \le x < y_2 \\ \dfrac{y_3 - x}{y_3 - y_2} & \text{if } y_2 \le x < y_3 \\ 0 & \text{if } y_3 \le x \end{cases}    (5)
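A direct transcription of the triangular membership function (5), shown here purely for illustration with the fuzzy sill triple quoted above:

```python
import numpy as np

def triangular_membership(x, y1, y2, y3):
    """Membership function of the triangular fuzzy number (y1, y2, y3), eq. (5); works on scalars or arrays."""
    x = np.asarray(x, dtype=float)
    rising = (x - y1) / (y2 - y1)
    falling = (y3 - x) / (y3 - y2)
    return np.clip(np.minimum(rising, falling), 0.0, 1.0)

# Membership of a few candidate sill values in the fuzzy number (271.6, 296.6, 321.6).
print(triangular_membership([260.0, 271.6, 296.6, 310.0, 330.0], 271.6, 296.6, 321.6))
```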
4.2 Fuzzy kriging

Fuzzy kriging is applied in two steps. First, a fuzzy prediction is made at an arbitrary point, in this case the point with coordinates (664,200). Fuzzy variograms were applied (fig. 2). Both the fuzzy prediction and the fuzzy standard deviation (fig. 3) were calculated on the basis of occurring values from the fuzzy variograms, using equations (1) and (2). The kriging prediction shows a clear maximum well within the space governed by the uncertainties in the parameters θ_2 and θ_3, i.e. with estimated values equal to \hat{\theta}_2 = 321.6 and \hat{\theta}_3 = 46.286. The minimum value, and both the minimum and maximum values for the kriging standard deviation, occur at the edges of the θ_2 × θ_3 space.
Figure 2: Fuzzy variograms
Next, maps were created showing the fuzzy kriged prediction and the fuzzy kriging standard deviation (figs. 4, 5). The kriged values remain relatively stable under the fuzziness in the variogram values, as the three maps showing the endpoints of the fuzzy intervals are almost identical. The kriging standard deviation, however, is more sensitive to changes in the variogram parameters. We notice that the lowest map, showing the left points of the fuzzy intervals, has a much lighter color than the upper map, showing the right points of the fuzzy intervals. This yields an average (fuzzy) value on the map equal to (23.20,24.16,24.34).
By means of fuzzy variograms it was possible, in the case of methane, to obtain kriging standard deviations in the range of 12 to 26.45, as compared with the ordinary kriging standard deviations of 12 to 33.11. Fuzzy kriging was also performed for carbon and iron. In the case of carbon the fuzzy kriging standard deviations were found in the range of 0.33 to 0.37, as compared to the ordinary kriging standard deviations of 0.4 to 0.69. In the case of iron the fuzzy kriging standard deviations were found in the range of 0.44 to 0.5, as compared to the ordinary kriging standard deviations of 0.65 to 1.02. These results show that by means of fuzzy variograms it was possible to obtain lower kriging standard deviations than with ordinary kriging based on a crisp variogram.
Figure 3: Fuzzy kriging prediction (left) and fuzzy kriging standard deviation at the single location with coordinates (664,200).
5 Discussion

Fuzzy set theory can be used to account for the imprecise knowledge of variogram parameters in kriging. This type of problem can also be solved using a Bayesian or a probabilistic approach. Each approach has its advantages and limitations. The problem with a Bayesian approach is that a prior distribution has to be selected. Also, a Bayesian approach requires extensive calculations. The fuzzy set approach has a similar difficulty in the selection of the membership functions, but only simple dependence-independence assumptions are necessary, and computations are relatively
simple. In this research the approach of fuzzy methods within geostatistical modeling has been proposed to handle the uncertainty.
Figure 4: Fuzzy kriging of CH4 values using a fuzzy variogram - maximum value (top), predicted value (middle) and minimum value (bottom).
Several advantages exist in using fuzzy input data. An example is the possibility to incorporate expert knowledge, to be used for the definition of fuzzy numbers, at places where exact measurements are rare in order to reduce the kriging variance. As the kriging variance decreases, fuzziness emerges in the results. Such a result presents more information, because the vague information is taken into account that could not be used by conventional methods.
Figure 5: Fuzzy kriging standard deviations for the CH4 predictions using a fuzzy variogram - maximum value (top), predicted value (middle) and minimum value (bottom).
Data often incorporate fuzziness, such as measurement tolerances, that can be expressed as fuzzy numbers. It can be of great advantage to know the result tolerances when the results have to be judged. There is necessarily some uncertainty about the attribute value at an unsampled location. The traditional approach for modeling local uncertainty consists of computing a kriging estimate and associated error variance. A more rigorous approach is to assess first the uncertainty about the unknown, then deduce an optimal estimate, e.g. using an indicator approach
that provides not only an estimate but also the probability to exceed critical values, such as regulatory thresholds in soil pollution or criteria for soil quality. One vision for uncertainty management in GIS is the application of 'intelligent' systems. Burrough (1991) suggests that such systems could help decision makers evaluate the consequences of employing different combinations of data, technology, processes and products, to gain an estimate of the uncertainty expected in their analyses before they start. Elmes and Cai (1992) investigated the use of a data quality module in a decision support system to advise on management of forest pest infestations. Agumya and Hunter (1996) believe the uncertainty debate must now advance from its present emphasis on the effect of uncertainty in the information, to considering the effect of uncertainty on the decisions. Just as uncertainty is an important consideration in spatial decision-making, in many different applications the temporal aspects of spatial data are often a vital component. To improve the quality of decisions in such scenarios, information systems need to provide better support for spatial and temporal data handling. This part of intelligent systems is still lacking, though. The approach proposed above should be of direct use for decision making. An important element in this regard is to have a good interface to represent the outputs of the analysis in a proper form so that they can be used by the decision makers.
6 Conclusions

Fuzzy kriging provides interpolation results in the form of fuzzy numbers. As such it contributes to further modelling attribute uncertainty in spatial data. In our study, it was applied to the modelling of methane emissions at the Isle of Java, where fuzziness in the variogram parameters could be incorporated. For these data, the exponential model provides the best fit for the variogram of methane, giving parameter values for nugget, sill and range. Fuzzy kriging provides an interesting way of both calculating and displaying the interpolation results as maps. Validation of the results shows that ordinary kriging gives kriging standard deviations of 12 to 33.11, whereas fuzzy variograms reduced these to kriging standard deviations of 12 to 26.45.
Acknowledgement We are thankful to Prof. Peter van Bodegom for providing the data set for this work, also to Dr. Theo Bouloucous, from ITC, The Netherlands, Dr. P.S.Roy and Mr. P.L.N.Raju, from IIRS, India, for their support during the research work.
References Agumya, A. and Hunter, G.J., 1996, Assessing Fitness for Use of Spatial Information: Information Utilisation and Decision Uncertainty. Proceedings of the GIS/LIS '96 Conference, Denier, Colorado, pp. 349-360 Bardossy, A., Bogardi, I. and Kelly, W.E., 1990a, Kriging with Imprecise (Fuzzy) Variograms I: Theory. Mathematical Geology 22, 63-79 Bardossy, A., Bogardi, I., and Kelly, W.E., 1990b, Kriging with Imprecise (Fuzzy) Variograms II: Application, Mathematical Geology 22, 81-94 Bezdek, J.C., 1981, Pattern recognition with fuzzy objective function algorithms. Plenum Press, New York. Burrough P.A., 1986. Principles of geographical information systems for land resources assessment. Clarendon press, Oxford. Burrough, P.A., 1991, The Development of Intelligent Geographical Information Systems. Proceedings of the 2nd European Conference on GIS (EGIS '91), Brussels, Belgium, vol. 1, pp. 165-174 Burrough, P.A., 2001, GIS and geostatistics: essential partners for spatial analysis. Environmental and ecological statistics 8, 361-378 Chilès, J.P., and Delfiner, P. 1999. Geostatistics: modelling spatial uncertainty. John Wiley & Sons, New York. Diamond, P., and Kloeden, P. 1989. Characterization of compact subsets of fuzzy sets. Fuzzy Sets and Systems 29, 341-348 Elmes, G.A. and Cai, C., 1992, Data Quality Issues in User Interface Design for a Knowledge-Based Decision Support System. Proceedings of the 5th International Symposium on Spatial Data Handling, Charleston, South Carolina, vol. 1, pp. 303-312 Goodchild, M. and Jeansoulin, R. 1998. Data quality in geographic information. Hermes, Paris. Guptill S.C. and Morrison, J.L. (1995). Elements of Spatial Data Quality. Elsevier Science Ltd, Exeter, UK. Heuvelink, G.M.H., 1998. Error Propagation in Environmental modeling with GIS. Taylor Francis, London Holzapfel-Pschorn, A. and Seiler, W. 1986. Methane emission during a cultivation period from an Italian rice paddy. Journal of Geophysical Research 91, 11804-14
Houghton, J.T., Meira Filho, L.G., Calander, B.A., Harris, N., Kattenberg, A. and Marskell, K. 1996. Climate change 1995. The science of climate change. Cambridge University Press, Cambridge. Klir G.J. and Folger, T.A., 1988. Fuzzy sets, uncertainty and Information. Prentice Hall, New Jersey. NCGIA (1989) The research plan of the National Center for Geographic Information and Analysis. International Journal of Geographical Information Systems 3 117-136 Schütz, H., Seiler, W. and Conrad, R., 1990. Influence of soil temperature on methane emission from rice paddy fields. Biogeochemistry 11, 77-95 Stein, A. and Van der Meer, F. 2001. Statistical sensing of the environment, International Journal of Applied Earth Observation and Geoinformation 3, 111113 Van Bodegom, P.M.; R. Wassmann; and T.M. Metra-Corton, 2001, A processbased model for methane emission predictions from flooded rice paddies, Global biogeochemical cycles 15, 247-264 Van Bodegom, P.M, Verburg, P.H., and Denier van der Gon, H.A.C. 2002. Upscaling methane emissions from rice paddies: problems and possibilities, Global biogeochemical cycles 16, 1-20 Zadeh.L., 1965, Fuzzy Sets. Information Control 8, 338-353
A Visualization Environment for the Space-Time-Cube

Menno-Jan Kraak¹ and Alexandra Koussoulakou²

¹ ITC, Department of Geo-Information Processing, PO Box 6, 7500 AA Enschede, The Netherlands, [email protected]
² Aristotle University of Thessaloniki, Department of Cadastre, Photogrammetry and Cartography, 541 24 Thessaloniki, Greece, [email protected]
(for extra colour illustrations: http://www.itc.nl/personal/kraak/sdh04 )
Abstract At the end of the sixties Hägerstrand introduced a space-time model which included features such as the Space-Time-Path and the Space-Time-Prism. From a visualization perspective the Space-Time-Cube was the most prominent element in Hägerstrand's approach. However, when the concept was introduced the options to create the graphics were limited to manual methods and the user could only experience the single view created by the draftsperson. Today's software has options to automatically create the cube and its contents from a database. Data acquisition of space-time paths for both individuals and groups is also made easier using GPS. The user's viewing environment is, by default, interactive and allows one to view the cube from any direction. In this paper the visualization environment is proposed in a geovisualization context. Keywords: Space-Time-Cube, Geovisualization, Time
1 Introduction Our dynamic geo-community currently witnesses a trend which demonstrates an increased need for personal geo-data. This need for geo-data is based on a strong demand driven data paradigm. Individuals move around assisted by the latest technology which requires data that fits personal needs. They want to know where they are, how they get to a destination and what to expect there, and when to arrive. The elementary questions
linked to geospatial data such as 'Where?', 'What?' and 'When?' become even more relevant. This is further stimulated by technological developments around mobile phones, personal digital assistants, and global positioning devices. Next to the demand for location-based services one can witness an increase in the use of, for instance, mobile GIS equipment in fieldwork situations (Pundt, 2002; Wintges, 2003). The above technology also proves to be a rich new data source (Mountain, 2004), since these devices can easily collect data. Interest in exploratory and analytical tools to process and understand these (aggregate) data streams is increasing. Geographers see new opportunities to study human behaviour and this explains the revival of interest in Hägerstrand's time geography (Miller, 2002). The time-geography approach has been hampered by the difficulty of obtaining abundant data and of finding methods and techniques to process the data. This seems no longer the case, but it might have serious implications for privacy which should not be taken for granted (Monmonier, 2002). From a visualization perspective the Space-Time-Cube is the most prominent element in Hägerstrand's approach. In its basic appearance the cube has on its base a representation of the geography (along the x- and y-axis), while the cube's height represents time (z-axis). A typical Space-Time-Cube could contain the space-time-paths of, for instance, individuals or objects. However, when the concept was introduced the options to create the graphics were limited to manual methods and the user could only experience the single view created by the draftsperson. A different view on the cube would mean going through a laborious drawing exercise again. Today software exists that allows the cube and its contents to be created automatically from a database. Developments in geovisualization allow one to link the (different) cube views to other alternative graphics. Based on the latest developments in geovisualization, this paper presents an extended interactive and dynamic visualization environment, in which the user has full flexibility to view, manipulate and query the data in a Space-Time-Cube, while linked to other views on the data. The aim is to have a graphic environment that allows for creativity via an alternative perspective on the data, to spark the mind with new ideas and to solve particular geo-problems. The proposed visualization environment is illustrated by two applications: sports and archaeology. The first is relatively straightforward when it comes to the application of the Space-Time-Cube and the second demonstrates new opportunities for the cube because of the visualization possibilities.
2 Hägerstrand’s Time Geography Time-geography studies the space-time behaviour of human individuals. In their daily life each individual follows a trajectory through space and time. Hägerstand’s time geography sees both space and time as inseparable, and this becomes clear if one studies the graphic representation of his ideas, the Space-Time-Cube as displayed in this papers figures. Two of the cubes axes represent space, and the third axe represents time. This allows the display of trajectories, better known as Space-Time-Paths (STP). These paths are influenced by constraints. One can distinguish between capability constraints (for instance mode of transport and need for sleep), coupling constraints (for instance being at work or at the sports club), and authority constraints (for instance accessibility of buildings or parks in space and time). On an aggregate level time geography can also deal with trends in society. The vertical lines indicate a stay at the particular location which are called ‘stations’. The time people meet at a station creates so-called ‘bundles’. The near horizontal lines indicate movements. The Space-TimePath can be projected on the map, resulting in the path’s footprint. Another important time-geography concept is the notion of the Space-TimeCube. In the cube it occupies the volume in space and time a person can reach in a particular time-interval starting and returning to the same location (for instance: where can you get during lunch time). The widest extent is called the Potential Path Space (PPS) and its footprint is called Potential Path Area (PPA). In the diagram it is represented by a circle assuming it is possible to reach every location at the edge of the circle. In reality the physical environment (being urban or rural) will not always allow this due to the nature of for instance the road pattern or traffic intensity. During the seventies and eighties Hägerstrand’s time-geography has been elaborated upon by his Lund group (Hägerstrand, 1982; Lenntorp, 1976). It has been commented and critiqued by for instance Pred (1977), who in his paper gives a good analysis of the theory, and summarizes it as “the time geography framework is at one and the same time disarmingly simple in composition and ambitious in design.” As one of the great benefits of the approach he notes that is takes away the (geographers) over emphasize on space and includes time. Recently Miller (2002) in a homage to Hägerstrand worded this as a shifting attention from a ‘place-based perspective’ (our traditional GIS) to a more people based-perspective (timegeography). The simplicity is only partly through, because if one considers the Space-Time-Cube from a visual perspective it will be obvious that an interactive viewing environment is not easy to find. Probably that is one of the reason the cube has been used only sporadically since its conceptuali-
Although applications in most traditional human-geography domains have been described, only during the last decade have we witnessed an increased interest in the Space-Time-Cube. Miller (1991; 1999) applied its principles in trying to establish accessibility measures in an urban environment. Kwan (1998; 1999) has used it to study accessibility differences among genders and different ethnic groups. She also made a start integrating cyberspace into the cube. Forer has developed an interesting data structure based on taxels ('time volumes') to incorporate in the cube to represent the Space-Time-Prism (Forer, 1998; Forer and Huisman, 1998). Hedley et al. (1999) created an application in a GIS environment for radiological hazard exposure. Improved data gathering techniques have given the interest in time geography and the Space-Time-Cube a new impulse. Andrienko et al. (2003) put the cube in the perspective of exploratory spatio-temporal visualization. Recent examples are described by Mountain and his colleagues (Dykes and Mountain, 2003; Mountain and Raper, 2001), who discuss data collection techniques based on mobile phones, GPS and location-based services and suggest visual analytical methods to deal with the data gathered. The Space-Time-Path, or geospatial lifelines as they are called by Hornsby & Egenhofer (2002), is an object of study in the framework of moving objects. From a visualization point of view the graphics created in most applications are often of an ad-hoc nature. This paper tries to look systematically at what is possible today. It should be obvious that the Space-Time-Cube is not the solution for the visualization of all spatio-temporal data one can think of. The next two sections will put the cube in a more generic geovisualization and temporal visualization perspective.
3 Space-Time-Cube in a Geovisualization Context During Hägerstrand’s active career - the period before Geographic Information Systems - paper maps and statistics were probably the most prominent tools for researchers to study their geospatial data. To work with those data, analytical and map use techniques were developed, among them the concepts of time-geography. Many of these ideas can still be found in the commands of many GIS packages. Today GIS offers researchers access to large and powerful sets of computerized tools such as spreadsheets, databases and graphic tools to support their investigations. The user can interact with the map and the data behind it, an option which can be extended via links to other data accessible via the web. This capability adds a differ-
ent perspective to the map, as they become interactive tools for exploring the nature of the geospatial data at hand. The map should be seen as an interface to geospatial data that can support information access and exploratory activities, while it retains its traditional role as a presentation device. There is also a clear need for this capability since the magnitude and complexity of the available geospatial data pose a challenge as to how the data can be transformed into information and ultimately into knowledge. Geovisualization integrates approaches from scientific visualization, (exploratory) cartography, image analysis, information visualization, exploratory data analysis (EDA) and GIS to provide theory, methods and tools for the visual exploration, analysis, synthesis and presentation of geospatial data (MacEachren and Kraak, 2001). Via the use of computersupported, interactive, visual representations of (geospatial) data one can strengthen understanding of the data at hand. The visualizations should lead to insight that ultimately helps decision making. In this process maps and other graphics are used to stimulate (visual) thinking about geospatial patterns, relationships and trends, generate hypotheses, develop problem solutions and ultimately construct knowledge. One important approach here is to view geospatial data sets in a number of alternative ways, e.g., using multiple representations without constraints set by traditional techniques or rules. This should avoid the trap described by Finke (1992) who claims that “most researchers tend to rely on wellworn procedures and paradigms...” while they should realize that “…creative discoveries, in both art and science, often occur in unusual situations, where one is forced to think unconventionally.” This is well described by Keller and Keller (1992), who in their approach to the visualization process suggest removing mental roadblocks and taking some distance from the discipline in order to reduce the effects of traditional constraints. Why not choose the STC as an alternative mapping method to visualize temporal data? However, to be effective the alternative view should preferably be presented in combination with familiar views to avoid one gets lost. This implies a working environment that consist of multiple linked views to ensure that an action in one view is also immediate visible in all other views. Currently this approach receives much attention from both a technical as well as a usability perspective (Convertino et al., 2003; Roberts, 2003). The above trend in mapping is strongly influenced by developments in other disciplines. In the 1990s, the field of scientific visualization gave the word “visualization” an enhanced meaning (McCormick et al., 1987). This development linked visualization to more specific ways in which modern computer technology can facilitate the process of “making data visible” in real time in order to strengthen knowledge. The relations between the
fields of cartography and GIS, on the one hand, and scientific visualization on the other, have been discussed in depth (Hearnshaw and Unwin, 1994; MacEachren and Taylor, 1994). Next to scientific visualization, which deals mainly with medical imaging, process model visualization and molecular chemistry, another branch of visualization that influenced mapping can be recognized. This is called information visualization and focuses on visualization of non-numerical information (Card et al., 1999). Of course recent trends around GIScience play a role as well (Duckham et al., 2003). In the recent book 'Exploring Geovisualization' (Dykes et al., 2004) many current research problems are described. From the map perspective, it is required that in this context cartographic design and research pay attention to the human-computer interaction of the interfaces, and revive the attention for the usability of the products, especially since neither many of the alternative views nor their experimental environments have really been tested on their efficiency or effectiveness. Additionally, one has to work on representation issues and the integration of geocomputing in the visualization process to be able to realize the alternatives.
4 Space-Time-Cube and its Visualization Environment

Taking the above discussion into account it is assumed that a better (visual) exploration and understanding of temporal events taking place in our geo-world requires the integration of geovisualization with time-geography's Space-Time-Cube. Prominent keywords are interaction, dynamics and alternative views, which each have their impact on the viewing environment proposed. Interaction is needed because the three-dimensional cube has to be manipulated in space to find the best possible view, and it should be possible to query the cube's content. Time, always present in the Space-Time-Cube, automatically introduces dynamics. The alternative graphics appear outside the cube, are linked to it, and should stimulate thinking, new insights and explanations. This section will systematically discuss the functionality required in such an environment. One can distinguish functions linked to a basic display environment, those linked to data display and manipulation in the cube, and those related to linked views. The functions are all based on the question: what tasks are expected to be executed when working with a Space-Time-Cube?
Fig. 1. The Space-Time-Cube's basic viewing environment. The working view shows the details of the data to be studied. The 2D and 3D views assist the user in orientation and navigation. The attribute view offers the user options to define which variable will be displayed in the cube. The data displayed in the figure represent Napoleon's 1812 march into Russia.
The viewing environment has a main or working view, as can be seen in Figure 1. Additionally, three other optional views are shown. These views help with orientation and navigation in the cube, as well as with the selection of the data displayed. The 2D view shows the whole map of the study area with a rectangle that matches the base map displayed in the working view's cube. Similarly, the cube in the 3D view corresponds to the content of the working view's Space-Time-Cube. The situation in Figure 1 results after zooming in on a particular area. The views are all linked, and moving, for instance, the rectangle in the 2D view will also result in a different location on the overview cube in the 3D view and a different content of the working view. The attribute view shows the variables available for display, and allows the user to link those variables to the Space-Time-Cube's display variables. These include a base map, variables to be displayed along the cube's axes (x, y and time) and variables linked to the Space-Time-Path – its colour and width. The user can drag any of the available variables onto the display variables. However, a few limitations exist. Since we deal with a Space-Time-Cube, a spatial (x or y) and a time component must always be present. It is possible, though, that different time variables exist, of which one should be selected by the user. For instance, time could be given
in years, but also according to particular historical events, like the reign of an administration. The base map is displayed optionally. The x or y axis could also be used to display another variable. In the case of the Napoleon data displayed in Figure 1, the y axis could be interchanged with the number of troops, as suggested by Roth et al. (1997) in their discussion of the SAGE software. A Space-Time-Path will be automatically displayed as soon as the x, y and time variables have been assigned in the attribute view. As with most software dealing with maps, it is possible to switch information layers on and off. In the cube a footprint of the Space-Time-Path is displayed for reference and measurements. Any of the cube's axes can be dragged into the cube to measure time or location. Drop lines can be projected onto the axes for the same purpose. Since we deal with the display of the third dimension, the option to manipulate the cube in 3D space is a basic requirement. Rotating the cube independently around any of its axes is possible. Also introduced is a spinning option that allows one to let the cube automatically rotate around a defined axis in order to get an overview of the data displayed. Since users will be curious, one should be able to query the Space-Time-Path. In this example the system responds with an overview of available data on the segment selected. Via the attribute view the user can change the variable attached to the axes or to the Space-Time-Path on the fly. For selection purposes it is also possible to use multiple slider planes along the axes to highlight an area of the cube.
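The assignment of data variables to the cube's display variables, and the constraint that a spatial and a time component must always be present, could be modelled along the following lines. This is only a hypothetical sketch: the class, the role names and the example variables are ours, and the closing assignment merely mirrors the variant discussed above (y interchanged with the number of troops).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Display variables of the Space-Time-Cube discussed in the text:
# two spatial axes, a time axis, and the path's graphic variables.
DISPLAY_ROLES = ("x", "y", "time", "path_colour", "path_width")


@dataclass
class AttributeView:
    """Hypothetical model of the attribute view: the available data
    variables and their assignment to the cube's display variables."""
    available: List[str]
    assignment: Dict[str, Optional[str]] = field(
        default_factory=lambda: {role: None for role in DISPLAY_ROLES})

    def assign(self, role: str, variable: str) -> None:
        """Drag a data variable onto a display variable."""
        if role not in DISPLAY_ROLES:
            raise ValueError(f"unknown display variable: {role}")
        if variable not in self.available:
            raise ValueError(f"unknown data variable: {variable}")
        self.assignment[role] = variable

    def is_valid(self) -> bool:
        """A Space-Time-Path can only be drawn if a spatial (x or y)
        and a time component have been assigned."""
        spatial = self.assignment["x"] or self.assignment["y"]
        return bool(spatial and self.assignment["time"])


# Example (assumed variable names): y interchanged with troop numbers.
view = AttributeView(available=["lon", "lat", "year", "troops"])
view.assign("x", "lon")
view.assign("y", "troops")
view.assign("time", "year")
assert view.is_valid()
```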
5 Applications and Extended Functionality
5.1 Sports
Sport and time-geography have met before, for instance to analyse rugby matches (Moore et al., 2003). The Space-Time-Cube is most suitable for the display and analysis of the paths of (multiple) individuals, groups or other objects moving through space (Figure 2). However, other possibilities exist, and it could also be used for real-time monitoring. In an experiment the Space-Time-Cube is used to visualize a running event, with additional views linked to the cube environment. The working view and the 2D view show a path based on GPS data acquired during a fitness run. Highlighted in both views is a section of the run for which detailed information is shown in the linked views. These are a GPS data file showing the track history and a video displaying the path's environment. A dot in the
working view and the 2D view indicates the position of the runner represented by the current video frame and the highlighted record in the GPS log. For analytical purposes one could add the paths of other, similar runs and compare the running results in relation to the geography. To expand this geographic analysis one could also use a digital terrain model as base map and see how the terrain might influence running speed. This example shows how multimedia elements can be linked to the cube. However, one can easily think of other linked views as well. With many available variables these could, when relevant, be displayed in a parallel coordinate plot, which could be used as an alternative variable view and allow for linking the variables with the cube's axes or the Space-Time-Path.
5.2 Archaeology
Another discipline that might benefit from the Space-Time-Cube approach is archaeology. Complex relations between artefacts excavated at particular times, stemming from different historical periods and found at different locations, could be made visible in the cube. The location of a find will be represented by a vertical line that has a colour/thickness at the relevant time period, possibly indicating uncertainty. Since most finds will be point locations, paths are not necessarily available in the archaeological version of the cube. However, apart from the excavation data (the “archaeological” part), the interpretation of the excavation (the “history” part) can provide spatio-temporal distributions of derived information, which do not refer to point locations only. Examples are: uses of space (“land use”), locations with concentrations of specific materials (e.g. pottery, gold, stones etc.), borders of a city through time, distribution of settlements/excavations in an area, and links to other geographical locations at a smaller map scale. Moreover, the possibility to play with the different time scales related to history, geology, archaeology etc. makes it possible, for instance, to discover the spread of a civilisation or the influence of a quarry on the spread of particular artefacts. It could assist in predicting where interesting locations could be found. In Figure 2 the Space-Time-Cube is used to visualize the spatio-temporal location of archaeological excavations (corresponding to various settlements) in an area. Of importance here is the location of the settlements rather than the path they follow in time (this is because there still remain settlements to be discovered). Nevertheless, the STP of all existing settlements might offer interesting patterns to archaeologists with respect to movements through time. An additional functionality requirement for archaeology has to do with the kind of use/uses of space in the various
spots. Also in this application the cube environment should not be used stand-alone, but in active connection to other views with relevant textual, tabular and image data.
Fig. 2. STC applications: left, multiple runners in an orienteering race; right, archaeological excavations, where the stations indicate the duration of the existence of settlements
6 Problems and Prospects
This paper has discussed the option to visually deal with the concepts of Hägerstrand's Space-Time-Cube based on the opportunities offered by geovisualization. The functionality of a Space-Time-Cube's viewing environment with its multiple linked views has been described. However, many questions remain and are currently the subject of further research. Among those questions are 'How many linked views can the user handle?', 'Can the user understand the cube when multiple Space-Time-Paths are displayed?' and 'What should the interface look like?' All these questions deal with usability aspects of the cube's viewing environment, and the authors are currently working on a usability set-up to answer them.
References
Andrienko, N., Andrienko, G.L. and Gatalsky, P., 2003. Exploratory spatio-temporal visualization: an analytical review. Journal of Visual Languages and Computing, 14: 503-541.
Card, S.K., MacKinlay, J.D. and Shneiderman, B., 1999. Readings in information visualization: using vision to think. Morgan Kaufmann, San Francisco.
Convertino, G., Chen, J., Yost, B., Ryu, Y.-S. and North, C., 2003. Exploring context switching and cognition in dual-view coordinated visualizations. In: J. Roberts (Editor), International conference on coordinated & multiple views in exploratory visualization. IEEE Computer Society, London, pp. 55-62.
Duckham, M., Goodchild, M. and Worboys, M. (Editors), 2003. Foundations of geographic information science. Taylor & Francis, London.
Dykes, J., MacEachren, A.M. and Kraak, M.J. (Editors), 2004. Exploring geovisualization. Elsevier, Amsterdam.
Dykes, J.A. and Mountain, D.M., 2003. Seeking structure in records of spatio-temporal behaviour: visualization issues, efforts and applications. Computational Statistics & Data Analysis (Data Viz II), 43(4): 581-603.
Finke, R.A., Ward, T.B. and Smith, S.M., 1992. Creative Cognition: Theory, Research, and Applications. The MIT Press, Cambridge, Mass., 205 pp.
Forer, P., 1998. Geometric approaches to the nexus of time, space, and microprocess: implementing a practical model for mundane socio-spatial systems. In: M.J. Egenhofer and R.G. Golledge (Editors), Spatial and temporal reasoning in geographic information systems. Spatial Information Systems. Oxford University Press, Oxford.
Forer, P. and Huisman, 1998. Computational agents and urban life spaces: a preliminary realisation of the time-geography of student lifestyles. Third International Conference on GeoComputation, Bristol.
Hägerstrand, T., 1982. Diorama, path and project. Tijdschrift voor Economische en Sociale Geografie, 73: 323-339.
Hearnshaw, H.M. and Unwin, D.J. (Editors), 1994. Visualization in Geographical Information Systems. J. Wiley and Sons, London.
Hedley, N.R., Drew, C.H. and Lee, A., 1999. Hagerstrand Revisited: Interactive Space-Time Visualizations of Complex Spatial Data. Informatica: International Journal of Computing and Informatics, 23(2): 155-168.
Hornsby, K. and Egenhofer, M.J., 2002. Modeling Moving Objects over Multiple Granularities. Annals of Mathematics and Artificial Intelligence, 36(1-2): 177-194.
Keller, P.R. and Keller, M.M., 1992. Visual cues, practical data visualization. IEEE Press, Piscataway.
Kwan, M.-P., 1998. Space-time and integral measures of individual accessibility: A comparative analysis using a point-based framework. Geographical Analysis, 30(3): 191-216.
Kwan, M.-P., 1999. Gender, the home-work link, and space-time patterns of nonemployment activities. Economic Geography, 75(4): 370-394.
Lenntorp, B., 1976. Paths in space-time environments: a time geographic study of movement possibilities of individuals. Lund Studies in Geography B: Human Geography.
MacEachren, A.M. and Kraak, M.J., 2001. Research challenges in geovisualization. Cartography and Geographic Information Systems, 28(1): 3-12.
MacEachren, A.M. and Taylor, D.R.F. (Editors), 1994. Visualization in Modern Cartography. Pergamon Press, London.
McCormick, B., DeFanti, T.A. and Brown, M.D., 1987. Visualization in Scientific Computing. Computer Graphics, 21(6).
Miller, H.J., 1991. Modelling accessibility using space-time prism concepts within geographical information systems. International Journal of Geographical Information Systems, 5(3): 287-301.
Miller, H.J., 1999. Measuring space-time accessibility benefits within transportation networks: basic theory and computational procedures. Geographical Analysis, 31(2): 187-212.
Miller, H.J., 2002. What about people in geographic information science? In: D. Unwin (Editor), Re-Presenting Geographic Information Systems. Wiley.
Monmonier, M., 2002. Spying with maps. University of Chicago Press, Chicago.
Moore, A.B., Wigham, P., Holt, A., Aldridge, C. and Hodge, K., 2003. A Time Geography Approach to the Visualisation of Sport. Geocomputation 2003, Southampton.
Mountain, D., 2004. Visualizing, querying and summarizing individual spatio-temporal behaviour. In: J. Dykes, A.M. MacEachren and M.J. Kraak (Editors), Exploring Geovisualization. Elsevier, London, pp. 000-000.
Mountain, D.M. and Raper, J.F., 2001. Modelling human spatio-temporal behaviour: a challenge for location-based services. GeoComputation, Brisbane.
Pred, A., 1977. The choreography of existence: Comments on Hagerstrand's time-geography and its usefulness. Economic Geography, 53: 207-221.
Pundt, H., 2002. Field data collection with mobile GIS: Dependencies between semantics and data quality. GeoInformatica, 6(4): 363-380.
Roberts, J., 2003. International conference on coordinated & multiple views in exploratory visualization. IEEE Computer Society.
Roth, S.F., Chuah, M.C., Kerpedjiev, S., Kolojejchick, J.A. and Lucas, P., 1997. Towards an Information Visualization Workspace: Combining Multiple Means of Expression. Human-Computer Interaction Journal, 12(1 & 2): 131-185.
Wintges, T., 2003. Geodata communication on personal digital assistants (PDA). In: M.P. Peterson (Editor), Maps and the Internet. Elsevier, Amsterdam.
Finding REMO - Detecting Relative Motion Patterns in Geospatial Lifelines
Patrick Laube¹, Marc van Kreveld², and Stephan Imfeld¹
¹ Department of Geography, University of Zurich, Winterthurerstrasse 190, CH–8057 Zurich, Switzerland, {plaube,imfeld}@geo.unizh.ch
² Department of Computer Science, Utrecht University, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands, [email protected]
Abstract Technological advances in position aware devices increase the availability of tracking data of everyday objects such as animals, vehicles, people or football players. We propose a geographic data mining approach to detect generic aggregation patterns such as flocking behaviour and convergence in geospatial lifeline data. Our approach considers the object's motion properties in an analytical space as well as spatial constraints of the object's lifelines in geographic space. We discuss the geometric properties of the formalised patterns with respect to their efficient computation. Keywords: Convergence, cluster detection, motion, moving point objects, pattern matching, proximity
1 Introduction Moving Point Objects (MPOs) are a frequent representation for a wide and diverse range of phenomena: for example animals in habitat and migration studies (e.g. Ganskopp 2001; Sibbald et al. 2001), vehicles in fleet management (e.g. Miller and Wu 2000), agents simulating people for modelling crowd behaviour (e.g. Batty et al. 2003) and even tracked soccer players on a football pitch (e.g. Iwase and Saito 2002). All those MPOs share motions that can be represented as geospatial lifelines: a series of observations consisting of a triple of id, location and time (Hornsby and Egenhofer 2002).
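As a minimal illustration, such a lifeline can be represented directly as the triples just defined. The sketch below is our own and serves only to fix ideas for the following sections; the names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Fix:
    """One observation of a Moving Point Object: (id, location, time)."""
    mpo_id: str
    x: float
    y: float
    t: int          # discrete time step


# A geospatial lifeline is the time-ordered list of fixes of one MPO;
# a data set maps each MPO id to its lifeline.
Lifeline = List[Fix]
LifelineSet = Dict[str, Lifeline]

deer_o1: Lifeline = [Fix("O1", 0.0, 0.0, 1), Fix("O1", 1.0, 1.0, 2)]
```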
Gathering tracking data of individuals became much easier because of substantial technological advances in position-aware devices such as GPS receivers, navigation systems and mobile phones. The increasing number of such devices will lead to a wealth of data on space-time trajectories documenting the space-time behaviour of animals, vehicles and people for off-line analysis. These collections of geospatial lifelines present a rich environment to analyse individual behaviour. (Geographic) data mining may detect patterns and rules to gather basic knowledge of dynamic processes or to design location based services (LBS) to simplify individual mobility (Mountain and Raper 2001; Smyth 2001; Miller 2003). Knowledge discovery in databases (KDD) and data mining are responses to the huge data volumes in operational and scientific databases. Where traditional analytical and query techniques fail, data mining attempts to distill data into information and KDD turns information into knowledge about the monitored world. The central belief in KDD is that information is hidden in very large databases in the form of interesting patterns (Miller and Han 2001). This statement is true for the spatio-temporal analysis of geospatial lifelines and thus is a key motivator for the presented research. Motion patterns help to answer the following type of questions.
- Can we identify an alpha animal in the tracking data of GPS-collared wolves?
- How can we quantify evidence of 'swarm intelligence' in gigabytes of log-files from agent-based models?
- How can we identify which football team played the more catching lines of defense in the lifelines of 22 players sampled every second?
The long tradition of data mining in the spatio-temporal domain is well documented (for an overview see Roddick et al. (2001)). The Geographic Information Science (GISc) community has recognized the potential of Geographic Information Systems (GIS) to 'capture, represent, analyse and explore spatio-temporal data, potentially leading to unexpected new knowledge about interactions between people, technologies and urban infrastructures' (Miller 2003). Unfortunately, most commercial GIS are based on a static place-based perspective and are still notoriously weak in providing tools for handling the temporal dimensions of geographic information (Mark 2003). Miller postulates expanding GIS from the place-based perspective to encompass a people-based perspective. He identifies the development of a formal representation theory for dynamic spatial objects and of new spatio-temporal data mining and exploratory visualization techniques as key research issues for GISc.
In this paper work is presented which extends a concept developed to analyse relative motion patterns for groups of MPOs (Laube and Imfeld 2002) so that the objects' absolute locations are also analysed. The work allows the identification of generic formalised motion patterns in tracking data and the extraction of instances of these formalised patterns. The significance of these patterns is discussed.
2 Aggregation in Space and Time
Following Waldo Tobler's first law of geography, near things are more related than distant things (Tobler 1970). Tobler's law is often referred to as being the core of spatial autocorrelation (Miller 2004). Nearness as a concept can be extended to include both space and time. Thus, analysing geospatial lifelines, we are interested in objects near in space-time. Objects that are near at certain times might be related. Although correlation is not causality, it provides evidence of causality that can (and should) be assessed in the light of theory and/or other evidence. Since this paper focuses on the formal and geometrical definition and the algorithmic detection of motion patterns, we use geometric proximity in Euclidean space to avoid the vague term nearness. To analyse geospatial lifelines, this could mean that MPOs moving within a certain range influence each other. For example, an alpha wolf leads its pack by being seen or heard, thus all wolves have to be located within the range of vision or earshot respectively. Analysing geospatial lifelines, we are interested in first identifying motion patterns of individuals moving in proximity. Second, we want to know how, when and where sets of MPOs aggregate, converge and build clusters respectively. Investigating aggregation of point data in space and time is not new. Most approaches focus on detecting localized clustering at certain time slices (e.g. Openshaw 1994; Openshaw et al. 1999). This concept of spatial clusters is static, rooted in the time-sliced static map representation of the world. With a true spatio-temporal view of the world, aggregation must be perceived as the momentary process of convergence, with the final static cluster as its possible result. The opposite of convergence, divergence, is equally interesting. Its possible result, some form of dispersal, is much less obvious and thus much harder to perceive and describe. A cluster is not the compulsory outcome of a convergence process and vice versa. A set of MPOs can very well be converging for a long time without building a cluster. The 22 players of a football match may converge during an attack without ever forming a detectable cluster on the
pitch. In reverse, MPOs moving around in a circle may build a wonderful cluster but never be converging. In addition the process of convergence and the final cluster are in many cases sequential. Consider the lifelines of a swarm of bees. At sunset the bees move back to the hive from the surrounding meadows, showing a strong convergence pattern without building a spatial cluster. In the hive the bees wiggle around in a very dense cluster, but do not converge anymore. In short, even though convergence and clustering are often spatially and/or temporally tied up, there need not be a detectable relation in an individual data frame under investigation.
3 The Basic REMO–Analysis Concept The basic idea of the analysis concept is to compare the motion attributes of point objects over space and time, and thus to relate one object's motion to the motion of all others (Laube and Imfeld 2002). Suitable geospatial lifeline data consist of a set of MPOs each featuring a list of fixes. The REMO concept (RElative MOtion) is based on two key features: First, a transformation of the lifeline data to a REMO matrix featuring motion attributes (i.e. speed, change of speed or motion azimuth). Second, matching of formalized patterns on the matrix (Fig. 1). Two simple examples illustrate the above definitions: Let the geospatial lifelines in Fig. 1a be the tracks of four GPS-collared deer. The deer O1 moving with a constant motion azimuth of 45° during an interval t2 to t5, i.e. four discrete time steps of length t, is showing constance. In contrast, four deer performing a motion azimuth of 45° at the same time t4 show concurrence.
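A minimal sketch of the first REMO step, deriving a matrix of discretised motion azimuths from lifelines sampled at regular intervals, is given below. It assumes fixes stored as (time step, x, y) tuples per object and 45° azimuth classes as in Fig. 1; all names are ours, not part of the original implementation.

```python
import math
from typing import Dict, List, Optional, Tuple

Fix = Tuple[int, float, float]          # (time step, x, y) -- assumed layout


def azimuth_class(dx: float, dy: float, class_width: int = 45) -> Optional[int]:
    """Motion azimuth in degrees (0 = north, clockwise), discretised into
    classes of `class_width` degrees; None if the object did not move."""
    if dx == 0 and dy == 0:
        return None
    az = math.degrees(math.atan2(dx, dy)) % 360.0
    return int(round(az / class_width) * class_width) % 360


def remo_matrix(lifelines: Dict[str, List[Fix]]) -> Dict[str, Dict[int, Optional[int]]]:
    """REMO matrix: for every MPO and every time step, the azimuth class
    of the motion leading into that step."""
    matrix: Dict[str, Dict[int, Optional[int]]] = {}
    for mpo, fixes in lifelines.items():
        fixes = sorted(fixes)
        row: Dict[int, Optional[int]] = {}
        for (t0, x0, y0), (t1, x1, y1) in zip(fixes, fixes[1:]):
            row[t1] = azimuth_class(x1 - x0, y1 - y0)
        matrix[mpo] = row
    return matrix


# Deer O1 of Fig. 1: a constant 45 degree heading from t2 to t5.
o1 = [(1, 0, 0), (2, 1, 1), (3, 2, 2), (4, 3, 3), (5, 4, 4)]
print(remo_matrix({"O1": o1})["O1"])   # {2: 45, 3: 45, 4: 45, 5: 45}
```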
Fig. 1. The geospatial lifelines of four MPOs (a) are used to derive the motion azimuth at regular intervals (b). In the REMO analysis matrix (c) generic motion patterns are matched (d).
The REMO concept allows construction of a wide variety of motion patterns. See the following three basic examples:
- Constance: a sequence of equal motion attributes for r consecutive time steps (e.g. deer O1 with motion azimuth 45° from t2 to t5).
- Concurrence: an incident of n MPOs showing the same motion attribute value at time t (e.g. deer O1, O2, O3 and O4 with motion azimuth 45° at t4).
- Trend-setter: one trend-setting MPO anticipates the motion of n others. Thus, a trend-setter consists of a constance linked to a concurrence (e.g. deer O1 anticipates at t2 the motion azimuth 45° that is reproduced by all other MPOs at time t4).
For simplicity we focus in the remainder of this paper on the motion attribute azimuth, even though most facets of the REMO concept are equally valid for speed or change of speed.
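A hedged sketch of how the first two primitives could be matched on a REMO matrix follows. Here the matrix is simply one row of azimuth classes per MPO, with None marking missing values; the example values are merely consistent with the deer example above (O1 constant at 45° from t2 to t5, all four deer at 45° at t4), not the exact figure, and the trend-setter, being a constance linked to a concurrence, is omitted. All names are ours.

```python
from typing import List, Optional, Tuple

Row = List[Optional[int]]   # azimuth class per time step for one MPO


def constance(row: Row, r: int) -> List[Tuple[int, int]]:
    """Index intervals (start, end) of at least r consecutive equal values."""
    runs, start = [], 0
    for t in range(1, len(row) + 1):
        if t == len(row) or row[t] != row[start] or row[start] is None:
            if t - start >= r and row[start] is not None:
                runs.append((start, t - 1))
            start = t
    return runs


def concurrence(matrix: List[Row], t: int, n: int) -> Optional[int]:
    """Azimuth class shown by at least n MPOs at time step t, if any."""
    values = [row[t] for row in matrix if row[t] is not None]
    for v in set(values):
        if values.count(v) >= n:
            return v
    return None


# Illustrative matrix for four deer (time steps t1..t5 -> indices 0..4).
m = [[None, 45, 45, 45, 45],
     [90, 315, 90, 45, 45],
     [None, 45, 0, 45, 45],
     [90, 315, 315, 45, 45]]
print(constance(m[0], 4))    # [(1, 4)]  -> O1, t2..t5
print(concurrence(m, 3, 4))  # 45        -> all four deer at t4
```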
4 Spatially Constrained REMO patterns
The construction of the REMO matrix is an essential reduction of the information space. However, it should be noted that this step factors out the absolute locations of the fixes of the MPOs. The following two examples illustrate generic motion patterns where the absolute locations must be considered.
- Three geese all heading north-west at the same time – one over London, one over Cardiff and one over Glasgow – are unlikely to be influenced by each other. In contrast, three geese moving north-west in the same gaggle are probably influenced. Thus, for flocking behaviours the spatial proximity of the MPOs has to be considered.
- Three geese all heading for Leicester at the same time – one starting over London, one over Cardiff and one over Glasgow – show three different motion azimuths, not building any pattern in the REMO matrix. Thus, convergence can only be detected considering the absolute locations of the MPOs.
The basic REMO concept must be extended to detect such spatially constrained REMO patterns. In Section 4.1 spatial proximity is integrated and in Section 4.2 an approach is presented to detect convergence in relative motion. Section 4.3 evaluates algorithmic issues of the proposed approaches.
4.1 Relative Motion with Spatial Proximity
Many sheep moving in a similar way is not enough to define a flocking pattern. We additionally expect all the sheep of a flock to graze on the same hillside. Formalised as a generic motion pattern, we expect the MPOs of a flock to be in spatial proximity. To test the proximity of m MPOs building a pattern at a certain time, we can compute the spatial proximity of the m MPOs' fixes in that time frame. Following Tobler's first law, proximity among MPOs can be considered as impact ranges, or the other way around: a spatio-temporally clustered set of MPOs is evidence to suggest an interrelation among the involved MPOs. The meaning of a spatial constraint in a motion pattern is different if we consider the geospatial lifeline of a single MPO. The consecutive observations (fixes) of a single sheep building a lifeline can be tested for proximity. Thus the proximity measure constrains the spatial extent of a single object's motion pattern. A constance for a GPS-collared sheep may only be meaningful if it spans a certain distance, excluding pseudo-motion caused by inaccurate fix measurements. Different geometrical and topological measures could be used to constrain motion patterns spatially. The REMO analysis concept focuses on the following open list of geometric proximity measures.
- A first geometric constraint is the mean distance to the mean or median center (length of the star plot).
- Another approach to indicate the spatial proximity of points uses the Delaunay diagram, applied for cluster detection in 2-D point sets (e.g. Estivill-Castro and Lee 2002) or for the visualisation of habitat-use intensity of animals (e.g. Casaer et al. 1999). According to the cluster detection approach, two points belong to the same cluster if they are connected by a small enough Delaunay edge. Thus, adapted to the REMO concept, a second distance proximity measure is to limit the average length of the Delaunay edges of a point group forming a REMO pattern.
- Proximity measures can have the form of bounding boxes, circles or ellipses (Fig. 2). The simplest way of indicating an impact range would be to specify a maximal bounding box that encloses all fixes relevant to the pattern. Circular criteria can require enclosing all relevant fixes within a radius r, or include the constraint to be spanned around the mean or median center of the fixes. Ellipses are used to control the directional elongation of the point cluster (major axis a, minor axis b).
- Another areal proximity measure for a set of fixes is the indication of a maximal border length of the convex hull.
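Two of the simpler measures in this list, the mean distance to the mean centre and a circular constraint spanned around the mean centre, can be computed directly from the fixes. The sketch below uses a circle around the mean centre as a simpler stand-in for a smallest enclosing circle; the function names are ours.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_center(points: List[Point]) -> Point:
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def mean_distance_to_center(points: List[Point]) -> float:
    """Mean distance to the mean centre ('length of the star plot')."""
    cx, cy = mean_center(points)
    return sum(math.hypot(x - cx, y - cy) for x, y in points) / len(points)


def within_circle_around_center(points: List[Point], r: float) -> bool:
    """Spatial constraint S: all fixes lie within radius r of the mean centre
    (an assumed simplification of the smallest enclosing circle)."""
    cx, cy = mean_center(points)
    return all(math.hypot(x - cx, y - cy) <= r for x, y in points)


flock_fixes = [(1.0, 1.0), (1.5, 0.8), (0.9, 1.4), (1.2, 1.1)]
print(mean_distance_to_center(flock_fixes))
print(within_circle_around_center(flock_fixes, r=0.5))   # True
```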
Using these spatial constraints, the list of basic motion patterns introduced in Section 3 can be amended with the spatially constrained REMO patterns (Fig. 2).
- Track: consists of the REMO pattern constance and the attachment of a spatial constraint. Definition: constance + spatial constraint S.
- Flock: consists of the REMO pattern concurrence and the attachment of a spatial constraint. Definition: concurrence + spatial constraint S.
- Leadership: consists of the REMO pattern trend-setter and the attachment of a spatial constraint. For example, the followers must lie within the range (x, y) when they join the motion of the trend-setter. Definition: trend-setter + spatial constraint S.
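Under the same simplifications, a flock at a single time step could then be tested by combining the concurrence test with a circular proximity constraint. The following self-contained sketch is only meant to illustrate the definition; the names are ours and a mean-centre circle again replaces the smallest enclosing circle.

```python
import math
from typing import List, Optional, Tuple


def flock(azimuth_at_t: List[Optional[int]],
          fixes_at_t: List[Tuple[float, float]],
          n: int, r: float) -> Optional[List[int]]:
    """Flock at one time step = concurrence + spatial constraint S.
    azimuth_at_t[i] is MPO i's azimuth class (None = no motion) and
    fixes_at_t[i] its (x, y) fix. Returns the indices of at least n MPOs
    sharing an azimuth class whose fixes all lie within radius r of their
    mean centre, or None if no such group exists."""
    for v in {a for a in azimuth_at_t if a is not None}:
        members = [i for i, a in enumerate(azimuth_at_t) if a == v]
        if len(members) < n:
            continue
        xs = [fixes_at_t[i][0] for i in members]
        ys = [fixes_at_t[i][1] for i in members]
        cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
        if all(math.hypot(x - cx, y - cy) <= r for x, y in zip(xs, ys)):
            return members
    return None


print(flock([45, 45, 45, 90], [(0, 0), (1, 0), (0, 1), (50, 50)], n=3, r=1.0))
# [0, 1, 2]
```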
Fig. 2. The figure illustrates the constraints of the patterns track, flock and leadership in the analysis space (the REMO matrix) and in the geographic space. Fixes matched in the analysis space are represented as solid forms, fixes not matched as empty forms. Some possible spatial constraints are represented as ranges with dashed lines. Whereas in situations (a) the spatial constraints on the absolute positions of the fixes are fulfilled, they are not in situations (b): for track the last fix lies beyond the range; for flock and leadership the square object lies outside the range.
4.2 Convergence
At once self-evident and fascinating are groups of MPOs aggregating and disaggregating in space and time. An example is wild animals suddenly heading in a synchronised fashion for a mating place. Wildlife biologists could be interested in the who, when and where of this motion pattern. Who is joining this spatio-temporal trend? Who is not? When does the process start, when does it end? Where lies the mating place, what spatial extent or form does it have? A second example comes from the analysis of crowd behaviour. Can we identify points of interest attracting people only at certain times, events of interest rather than points of interest, losing their attractiveness after a while? To answer such questions we propose the spatial REMO pattern convergence. The phenomenon of aggregation has a spatial and a spatio-temporal form. An example may help to illustrate the difference. Let A be a set of n antelopes. A wildlife biologist may be interested in identifying sets of antelopes heading for some location at a certain time. The time would indicate the beginning of the mating season, the selected set of m MPOs the ready-to-mate individuals, and the spot might be the mating area. This is a convergence pattern. It is primarily spatial; that is, the MPOs head for an area but may reach it at different times. On the other hand, the wildlife biologist and the antelopes may share the vital interest to identify MPOs that head for some location and actually meet there at some time, extrapolating their current motion. Thus, the pattern encounter includes considerations about speed, excluding MPOs heading for a meeting range but not arriving there at a particular time with the others.
- Convergence: heading for R. Set of m MPOs at interval i with motion azimuth vectors intersecting within a range R of radius r.
- Encounter: extrapolated meeting within R. Set of m MPOs at interval i with motion azimuth vectors intersecting within a range R of radius r and actually meeting within R when extrapolating the current motion.
Fig. 3. Geometric detection of convergence. Let S be a set of 4 MPOs with 7 fixes from t0 to t6. The illustration shows a convergence pattern found with the parameters of 4 MPOs and the temporal interval t1 to t3. The darkest polygon denotes an area which all 4 direction vectors pass at a distance closer than r. The pattern convergence is found if such a polygon exists. Please note that the MPOs do not build a cluster but nevertheless show a convergence pattern.
The convergence pattern is illustrated in Figure 3. Let S be a set of MPOs with n fixes from t0 to tn-1. For every MPO and for every interval of length i, an azimuth vector fitted to its fixes within i represents the current motion. The azimuth vector can be seen as a half-line projected in the direction of motion. The convergence is matched if there is at any time a circle of radius r that intersects the directed half-lines fitted, for each MPO, to the fixes within i. For the encounter pattern it must additionally be tested whether the objects actually meet in the future. The opposites of the above described patterns are termed divergence and breakup. The latter term integrates a spatial divergence pattern with the temporal constraint of a preceding meeting in a range R. The graphical representation of the divergence pattern is highly similar to Fig. 3. The only
difference lies in the construction of the strips, heading backwards instead of forwards, relative to the direction of motion.
4.3 Algorithms and Implementation Issues
In this section we develop algorithms to detect the motion patterns introduced above and analyse their efficiency. The basic motion patterns in the REMO concept are relatively easy to determine in linear time. The addition of positions requires more complex techniques to obtain efficient algorithms. We analyse the efficiency of pattern discovery for track, flock, leadership, convergence, and encounter in this section. We let the range be a circle of given radius R. Let n denote the number of MPOs in the data set, and t the number of time steps. The total input size is proportional to nt, so a linear time algorithm requires O(nt) time. We let m denote the number of MPOs that must be involved in a pattern to make it interesting. Finally, we assume that the length of a time interval is fixed and given. The addition of geographic position to the REMO framework requires the addition of geographic tests or the use of geometric algorithms. The track pattern can simply be tested by checking each basic constance pattern found for each MPO. If the constance pattern also satisfies the range condition, a track pattern is found. The test takes constant additional time per pattern, and hence the detection of track patterns takes O(nt) time overall. Efficient detection of the flock pattern is more challenging. We first separate the input data by equal time and equal motion direction, so that we get a set of n' ≤ n points with the same direction and at the same time. The 8t point sets in which patterns are sought have total size O(nt). To discover whether a subset of size at least m of the n' points lie close together, within a circle of radius R, we use higher-order Voronoi diagrams. The m-th order Voronoi diagram is the subdivision of the plane into cells, such that for any point inside a cell, some subset of m points are the closest among all the points. The number of cells is O(m(n'-m)) (Aurenhammer 1991), and the smallest enclosing circle of each subset of m points can be determined in O(m) time (de Berg et al. 2000, Sect. 4.7). If the smallest enclosing circle has radius at most R, we have discovered a pattern. The sum of the n' values over all 8t point sets is O(nt), so the total time needed to detect these patterns is O(ntm² + nt log n). This includes the time to compute the m-th order Voronoi diagram (Ramos 1999). Leadership pattern detection can be seen as an extension of flock pattern detection. The additional condition is that one of the MPOs shows
constance over the previous time steps. Leadership detection also requires O(ntm² + nt log n) time. For the convergence pattern, consider a particular time interval. The n MPOs give rise to n azimuth vectors, which we can see as directed half-lines. To test whether at least m MPOs out of n converge, we compute the arrangement formed by the thickened half-lines, which are half-strips of width 2r. For every cell in the arrangement we determine how many thickened half-lines contribute, which can be done by traversing the arrangement once and maintaining a counter that shows in how many half-strips the current cell is. If a cell is contained in at least m half-strips, it constitutes a pattern. Computing the arrangement of n half-strips and setting the counters can be done in O(n²) time in total; the algorithm is very similar to computing levels in arrangements (de Berg et al. 2000, Chap. 8). Since we consider t different time intervals, the total running time becomes O(n²t). The encounter pattern is the most complex one to compute. The reason is that extrapolated meeting times must also match, which adds a dimension to the space in which geometric algorithms are needed. We lift the problem into 3-D space, where the third dimension is time. The MPOs become half-lines that go upward from a common horizontal plane representing the beginning of the time interval; the slope of the half-lines will now be the speed. The geometric problem to be solved is finding horizontal circles of radius R that are crossed by at least m half-lines, which can be solved in O(n⁴) time with a simple algorithm. For all time intervals of a given length, the algorithm needs O(n⁴t) time.
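As a rough illustration only, the convergence test of Section 4.2 can also be approximated by brute force: take the pairwise intersections of the azimuth rays as candidate points and count how many rays pass within distance r of each candidate. This is emphatically not the arrangement-based O(n²) algorithm described above; it needs roughly cubic time, can miss some configurations (e.g. near-parallel rays), and all names are ours.

```python
import math
from typing import List, Optional, Tuple

Ray = Tuple[float, float, float, float]   # origin (ox, oy), unit direction (dx, dy)


def dist_point_ray(px: float, py: float, ray: Ray) -> float:
    """Distance from point (px, py) to the directed half-line `ray`."""
    ox, oy, dx, dy = ray
    t = max(0.0, (px - ox) * dx + (py - oy) * dy)   # projection, clamped to the ray
    return math.hypot(px - (ox + t * dx), py - (oy + t * dy))


def ray_intersection(a: Ray, b: Ray) -> Optional[Tuple[float, float]]:
    """Intersection of two rays, or None if (nearly) parallel or behind an origin."""
    ax, ay, adx, ady = a
    bx, by, bdx, bdy = b
    denom = adx * bdy - ady * bdx
    if abs(denom) < 1e-12:
        return None
    t = ((bx - ax) * bdy - (by - ay) * bdx) / denom
    s = ((bx - ax) * ady - (by - ay) * adx) / denom
    if t < 0 or s < 0:
        return None
    return (ax + t * adx, ay + t * ady)


def convergence(rays: List[Ray], m: int, r: float) -> bool:
    """Do at least m azimuth vectors pass within distance r of a common point?
    Brute force over pairwise ray intersections as candidate points."""
    candidates = [p for i, a in enumerate(rays)
                  for b in rays[i + 1:]
                  if (p := ray_intersection(a, b)) is not None]
    return any(sum(dist_point_ray(px, py, ray) <= r for ray in rays) >= m
               for (px, py) in candidates)


# Three MPOs heading for a common area, a fourth heading away.
rays = [(0.0, 0.0, 0.707, 0.707), (10.0, 0.0, -0.707, 0.707),
        (5.0, -5.0, 0.0, 1.0), (0.0, 10.0, -1.0, 0.0)]
print(convergence(rays, m=3, r=1.0))   # True
```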
5 Discussion
The REMO approach has been designed to analyse motion based on geospatial lifelines. Since motion is expressed by a change in location over time, the REMO patterns intrinsically span space and time. Our approach thus overcomes the limitation of only either detecting spatial clusters on snapshots or highlighting temporal trends in attributes of spatial units. It allows pattern detection in space-time. REMO patterns rely solely on point observations and are thus expressible for any objects that can be represented as points and leave a track in a Euclidean space. Having translated the expected behaviours into REMO patterns, the detection process runs unsupervised, listing every pattern occurrence. The introduced patterns can be detected within reasonable time. Many simple patterns can be detected in close to linear time if the size of the subset m that constitutes a pattern is a constant, which is natural in
many situations. The encounter pattern is quite expensive to compute, but since we focus on off-line analysis, we can still deal with data sets consisting of several hundreds of MPOs. Note that the dependency on the number of time steps is always linear for fixed-length time intervals. The most promising way to obtain more efficient algorithms is by using approximation algorithms, which can save orders of magnitude by stating the computational problem slightly less strictly (Bern and Eppstein 1997). In short, the REMO concept can cope with the emerging data volumes of tracking data. Syntactic pattern recognition adopts a hierarchical perspective where complex patterns are viewed as being composed of simple primitives and grammatical rules (Jain et al. 2000). Sections 3 and 4 introduced a subset of possible pattern primitives of the REMO analysis concept. Using the primitives and a pattern description formalism, almost arbitrary motion patterns can be described and detected. Due to this hierarchical design the concept easily adapts to the special requirements of various application fields. Thus, the approach is flexible and universal, suited for various lifelines, such as those of animals, vehicles, people, agents or even soccer players. The detection of patterns of higher complexity requires more sophisticated and flexible pattern matching algorithms than the ones available today. The potential users of the REMO method know the phenomenon they investigate and the data describing it. Hence, in contrast to traditional data mining assuming no prior knowledge, the users come up with expectations about conceivable motion patterns and are able to assess the results of the pattern matching process. Therein lies a downside of the REMO pattern detection approach: it requires relatively sophisticated knowledge about the patterns to be searched for. For instance, the setting of an appropriate impact range for a flock pattern is highly dependent on the investigated process and thus dependent on the user. In general the parametrisation of the spatial constraints influences the number of patterns detected. Further research is needed to see whether autocalibration of pattern detection will be possible within the REMO concept. Even though the REMO analysis concept assumes users specifying the patterns they are interested in, the pattern extent can also be viewed as an analysis parameter of the data mining approach. One reason to do so is to detect scale effects lurking in different granularities of geospatial lifeline data. The number of matched patterns may be highly dependent on the spatial, temporal and attributal granularity of the pattern matching process. For example, the classification of motion azimuth in only the two classes east and west reveals a lot of presumably meaningless constance patterns. In contrast, the probability of finding constance patterns with 360 azimuth classes is much smaller. Or take the selection of the impact range r for the flock pattern in sheep as another example. By testing the length of the
impact range r against the number of matched patterns, one could search for a critical maximal impact range within a flock of sheep. Future research will address numerical experiments with various data to investigate such relations. A critical issue in detecting convergence is fitting the direction vector to a set of fixes. Only slight changes in its azimuth may have huge effects on the overlapping regions. A straightforward solution approach to this problem is to smooth the lifelines and then fit the azimuth vector to a segment of the smoothed lifeline. The paper illustrates the REMO concept referring to ideal geospatial lifeline data. In reality lifeline data are often imprecise and uncertain. Sudden gaps in lifelines, irregular fixing intervals or positional uncertainty of fixes require sophisticated interpolation and uncertainty considerations on the implementation side (e.g. Pfoser and Jensen 1999).
6 Conclusions
With the technology-driven shift in GIScience from the static map view of the world to a dynamic process view, cluster detection on snapshots is insufficient. What we need are new methods that can detect convergence processes as well as static clusters, especially if these two aspects of space-time aggregation are separated. We propose a generic, understandable and extendable approach for data mining in geospatial lifelines. Our approach integrates individuals as well as groups of MPOs. It also integrates parameters describing the motion as well as the footprints of the MPOs in space-time.
Acknowledgements
The ideas for this work prospered in the creative ambience of Dagstuhl Seminar No. 03401 on 'Computational Cartography and Spatial Modelling'. The authors would like to acknowledge invaluable input from Ross Purves, University of Zurich.
References
Aurenhammer F (1991) Voronoi diagrams: A survey of a fundamental geometric data structure. ACM Comput. Surv. 23(3):345-405
Batty M, Desyllas J, Duxbury E (2003) The discrete dynamics of small-scale spatial events: agent-based models of mobility in carnivals and street parades. Int. J. Geographical Information Systems 17(7):673-697
Bern M, Eppstein D (1997) Approximation algorithms for geometric problems. In: Hochbaum DS (ed) Approximation Algorithms for NP-Hard Problems, PWS Publishing Company, Boston, MA, pp 296-345
Casaer J, Hermy M, Coppin P, Verhagen R (1999) Analysing space use patterns by Thiessen polygon and triangulated irregular network interpolation: a non-parametric method for processing telemetric animal fixes. Int. J. Geographical Information Systems 13(5):499-511
de Berg M, van Kreveld M, Overmars M, Schwarzkopf O (2000) Computational Geometry - Algorithms and Applications. Springer, Berlin, 2nd edition
Estivill-Castro V, Lee I (2002) Multi-level clustering and its visualization for exploratory data analysis. GeoInformatica 6(2):123-152
Ganskopp D (2001) Manipulating cattle distribution with salt and water in large arid-land pastures: a GPS/GIS assessment. Applied Animal Behaviour Science 73(4):251-262
Hornsby K, Egenhofer M (2002) Modeling moving objects over multiple granularities. Annals of Mathematics and Artificial Intelligence 36(1-2):177-194
Iwase S, Saito H (2002) Tracking soccer player using multiple views. In: IAPR Workshop on Machine Vision Applications, MVA Proceedings, pp 102-105
Jain A, Duin R, Mao J (2000) Statistical pattern recognition: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(1):4-37
Laube P, Imfeld S (2002) Analyzing relative motion within groups of trackable moving point objects. In: Egenhofer M, Mark D (eds), Geographic Information Science, Second International Conference, GIScience 2002, Boulder, CO, USA, September 2002, LNCS 2478, Springer, Berlin, pp 132-144
Mark D (2003) Geographic information science: Defining the field. In: Duckham M, Goodchild M, Worboys M (eds), Foundations of Geographic Information Science, chap. 1, Taylor and Francis, London New York, pp 3-18
Miller H (2003) What about people in geographic information science? Computers, Environment and Urban Systems 27(5):447-453
Miller H (2004) Tobler's first law and spatial analysis. In preparation
Miller H, Han J (2001) Geographic data mining and knowledge discovery: An overview. In: Miller H, Han J (eds) Geographic data mining and knowledge discovery, Taylor and Francis, London New York, pp 3-32
Miller H, Wu Y (2000) GIS software for measuring space-time accessibility in transportation planning and analysis. GeoInformatica 4(2):141-159
Mountain D, Raper J (2001) Modelling human spatio-temporal behaviour: A challenge for location-based services. Proceedings of GeoComputation, Brisbane, 6
Openshaw S (1994) Two exploratory space-time-attribute pattern analysers relevant to GIS. In: Fotheringham S, Rogerson P (eds) GIS and Spatial Analysis, chap. 5, Taylor and Francis, London New York, pp 83-104
Openshaw S, Turton I, MacGill J (1999) Using geographic analysis machine to analyze limiting long-term illness census data. Geographical and Environmental Modelling 3(1):83-99
Pfoser D, Jensen C (1999) Capturing the uncertainty of moving-object representations. In: Gueting R, Papadias D, Lochowsky F (eds) Advances in Spatial Databases, 6th International Symposium, SSD'99, Hong Kong, China, July 1999. LNCS 1651, Springer, Berlin Heidelberg, pp 111-131
Ramos E (1999) On range reporting, ray shooting and k-level construction. In: Proc. 15th Annu. ACM Symp. on Computational Geometry, pp 390-399
Roddick J, Hornsby K, Spiliopoulou M (2001) An updated bibliography of temporal, spatial, and spatio-temporal data mining research. In: Roddick J, Hornsby K (eds), Temporal, spatial and spatio-temporal data mining, TSDM 2000, LNAI 2007, Springer, Berlin Heidelberg, pp 147-163
Sibbald AM, Hooper R, Gordon IJ, Cumming S (2001) Using GPS to study the effect of human disturbance on the behaviour of red deer stags on a highland estate in Scotland. In: Sibbald A, Gordon IJ (eds) Tracking Animals with GPS, Macaulay Institute, pp 39-43
Smyth C (2001) Mining mobile trajectories. In: Miller H, Han J (eds) Geographic data mining and knowledge discovery, Taylor and Francis, London New York, pp 337-361
Tobler W (1970) A computer movie simulating urban growth in the Detroit region. Economic Geography 46(2):234-240
Spatial Hoarding: A Hoarding Strategy for Location-Dependent Systems
Karim Zerioh², Omar El Beqqali² and Robert Laurini¹
¹ LIRIS Laboratory, INSA-Lyon – Bât B. Pascal – 7 av. Capelle F – 69621 Villeurbanne Cedex, France, [email protected]
² Dhar Mehraz Faculty of Science, Mathematics and Computer Science Department, B.P. 1897 Fes-Atlas, 30000, Fes, Morocco, [email protected], [email protected]
Abstract
In a context-aware environment, the system must be able to refresh the answers to all pending queries in reaction to perpetual changes in the user's context. This, added to the fact that mobile systems suffer from problems like scarce bandwidth, low-quality communication and frequent disconnections, leads to high delays before up-to-date answers can be given to the user. A solution to reduce latency is to use hoarding techniques. We propose a hoarding policy particularly adapted for location-dependent information systems managing a huge amount of multimedia information and where no assumptions can be made about the user's future location. We use the user's position as a criterion for both hoarding and cache invalidation.
Keywords: Hoarding, cache invalidation, mobile queries, location-dependent systems, spatial information systems.
1 Introduction
The growing popularity of mobile computing has led to more and more elaborate mobile information systems. Nowadays, mobile applications are aware of the user's context (time, location, weather, temperature, surrounding noise, ...). One of the most popular context-aware applications is the tourist guide (Cheverst et al. 2000, Abowd 1997, Malaka 1999, Poslad et al. 2001, Zarikas et al. 2001). In this paper we deal only with location.
Let us consider the scenario of a tourist with a mobile tourist guide asking where the nearest restaurant is located. This query must be answered immediately. Otherwise, if the answer is delayed, it may be obsolete because the tourist, who is moving, is already nearer to a different restaurant. So the system must refresh, for all pending queries, the responses that have been invalidated by context changes. This operation can be repeated several times, depending on the number of users and the frequency of their queries and of the context changes. On the other hand, mobile systems still suffer from scarce bandwidth, low-quality communication and frequent network disconnections. All these factors lead to high delays before users' queries are satisfied. But this delay will not occur if the answer is already in the client's cache. Caching techniques have proven their usefulness in wired systems. The answer to a query is stored in the cache for future use, and when a user asks the same query again it is answered from the cache. However, in location-dependent systems, where the answer to the same query changes when only the user's position differs, and where users rarely return to the same place (for example, a user with a tourist guide, after visiting a museum, has a very low chance of returning to it after a while), the benefits of caching are not so obvious. But if useful information is transferred to the client before the user requests it, the problem of latency will be resolved. Hoarding techniques must predict in advance which information the user will request, and try to transfer as little unusable data as possible so as not to waste the scarce bandwidth resources and the usually limited memory and storage capacity of the user's device. Tourist guides nowadays are very elaborate. They use maps for guided tours, present audio content to allow the user to walk while listening to explanations, provide virtual 3D reconstructions of historical sites and 3D representations of the place where the target asked for by the user is located to make its recognition easier, and offer live shows of tours in hotels. So the amount of multimedia data dealt with is really huge. This makes appropriate cache invalidation schemes necessary for freeing space in the user's device. In this paper, we present a hoarding technique particularly adapted for location-dependent systems managing a large amount of data (called spatial hoarding). We use the user's location both as a prediction criterion and as a cache invalidation criterion. The rest of the paper is organised as follows. In Section 2 we begin with the related work. In Section 3 we give a description of our method, define the client's capability and propose to operate in disconnected mode to save power. In Section 4 we present two algorithms necessary for implementing the SH strategy, and discuss how data must be organised for the
determination of the information that must be hoarded. Section 5 gives an overview of our future work. Finally, section 6 concludes the paper.
2 Related Work
Caching is the operation of storing information in the user's device after it has been sent from the server. This allows future accesses to this information to be satisfied by the client. Invalidation schemes are used for maintaining consistency between the client cache and the server. Invalid data in the cache may be dropped to free memory for more accurate data. In location-dependent systems a cached data value becomes invalid when this data is updated in the server (temporal-dependent invalidation), or when the user moves to a new location (location-dependent invalidation). The team at the Hong Kong University of Science and Technology (Xu et al. 1999, Zheng et al. 2002) investigated the integration of temporal-dependent and location-dependent updates. They assume the geographical coverage area to be partitioned into service areas, and define the valid scope of an item as the set of service areas where the item value is valid. To every data item sent from the server its valid scope is attached. So a data item becomes invalid when the user enters a service area not belonging to its valid scope. However, as noted by Kubach and Rothermel (2001), caching never speeds up the first access to an information item, and caching location-dependent data is not beneficial if the user does not return frequently to previously visited locations. Their simulation results have proven that hoarding gives better results than caching in location-dependent systems, even though it is assumed that the memory available for caching is unlimited and no information is removed from the cache. Hoarding is the process of predicting the information that the user will request, in order to transfer it in advance to the client cache. So the user's future query will be satisfied by the client, even though the response contains a data item that has never been requested before. Several hoarding techniques have been proposed in the literature. The first proposed methods required user intervention, making the system less convivial, and are useless in systems where the user does not know in advance what kind of information he will need. Automated hoarding is the process of predicting the hoard set without user intervention. Kuenning and Popek (1997) propose an automated hoarding method where a measure called semantic distance between files is used to feed a clustering algorithm that selects the files that should be hoarded. Saygin et al. (2000) propose another method based on data mining techniques. This
latter uses association rules for determining the information that should be hoarded. Khushraj et al. (2002) propose a hoarding and reintegration strategy that substitutes whole-file transfers between the client and the server by transferring only the changes between their different versions. These changes are patched to the old version once on the destination machine. This incremental hoarding and reintegration mechanism is built within the Coda file system (Coda Group), based on the Revision Control System (RCS) (Tichy 1985). Cao (2002) proposes a method that allows a compromise to be made between hoarding and the available power resources. None of these methods deal with the spatial property of location-dependent systems. De Nitto et al. (1998) propose a model to evaluate the effectiveness of hoarding strategies for context-aware systems, based on cost measures. They apply these measures to some idealised motion models. For the two-dimensional model, the area is divided into adjacent polygons. The distance between two polygons is the number of polygons that must be traversed to pass from the first polygon to the other one. The ring k is defined as the set of all polygons whose distance from a given polygon is equal to k. All the information associated with rings 0, 1, …, k around the starting position is hoarded. The next hoard does not occur until the user enters a polygon outside the circle k. One drawback of this strategy is that a lot of unnecessary information is hoarded because the user's direction is not taken into account (all the information associated with the area behind the user is useless). Another drawback is that the hoard does not begin until the user is out of the hoarded circle. So hoard misses will occur until the information related to the user's new circle is hoarded. Kubach and Rothermel (2001) propose a hoarding mechanism for location-dependent systems based on infostations (Badrinath et al. 1996). When the user is near an infostation the information related to its area is hoarded. An access probability table is maintained where each data item is associated with the average probability that it will be requested. Only a fixed number of the data items with the highest probabilities are hoarded for the purpose of not wasting bandwidth and the user's device resources. No explanation is given of what a data item is. We find that a fixed number m is not a good criterion for this purpose in a realistic system because, whatever data items may be (files, tables, fields in a table, web pages, real-world entities, …), there will always be differences in the memory required for them, so a fixed number of data items can always exceed the space reserved for them. When no assumptions can be made about the user's future movement, all the information related to the infostation area is hoarded. So this mechanism is not adapted for systems managing a huge amount of data.
3 Proposed Method: Spatial Hoarding
3.1 Method Description
In pervasive location-dependent systems, the majority of the user's queries are related to the area where he is. The problem of latency is crucial for local queries (Zheng et al. 2002), whereas for non-local queries a short period of user movement does not invalidate the response. So our mechanism must hoard the information related to the current position of the user. As mobile devices usually suffer from limited storage and memory capacities, we cannot hoard the information associated with a large area. We consider the space divided into adjacent squares. We use the Peano encoding with N-ordering (Laurini and Thompson 1992, Mokbel et al. 2003) for its known stability (another area can be added to the area covered by the system without affecting the encoding), its usefulness for indexing spatial databases, and because it allows the use of the Peano algebra for solving spatial queries. In the following, a square is a square of side length 2 and a sub-square is a square of side length 1 (see Fig. 1). The curve in the figure shows the path of the user. The purpose of the method is to hoard the information that the user will most probably access. After leaving the square in which he is located, the user will be in one of the eight adjacent squares. But hoarding all eight squares would waste resources on unnecessary information because the user's direction is not taken into account. To exploit the user's orientation, we choose to make the hoarding decision when the user enters a new sub-square. This way, if the user leaves his current square, he will be in one of the three squares adjacent to his sub-square. Thus, dividing squares into sub-squares and making the hoarding decision at sub-square boundaries allows us to take the user's direction into account and to restrict the number of squares to hoard from eight to only three. Usually, one or two of the adjacent squares are already in the client and only the remaining one or two squares will be hoarded. As discussed above, a cache invalidation scheme is also necessary to free memory for more up-to-date data. The user's location is also used as an invalidation criterion, and we invalidate all the squares that are not adjacent to the user's square. We summarise this as follows:
• When the user enters a new sub-square, its three adjacent squares must be hoarded.
• The information located in the squares not adjacent to the user's square must be dropped.
As the spatial property is used both as a hoarding and a cache invalidation criterion, we call this method “Spatial Hoarding” (SH).
Fig. 1. User's route in an area divided into adjacent Peano N-ordered squares
Table 1 summarises how the Spatial Hoarding (SH) method is applied to the portion of the user's path shown in Fig. 1. When the user is in sub-square 36, the squares 8, 12, 32, 36 are available in the client's cache. When he enters sub-square 14, the hoarding and cache invalidation criteria are checked to decide which squares to hoard and which ones to delete. The three squares adjacent to sub-square 14 are 8, 32 and 36, which are already in the client's cache, so no hoard is needed. For the cache invalidation criterion, there are no cached squares that are not adjacent to square 12, so no deletion is needed. Then the user moves to sub-square 12, which is adjacent to squares 0, 4 and 8. Squares 0 and 4 are not available in the client's cache, so they will be hoarded. For the cache invalidation criterion, here again there are no cached squares that are not adjacent to square 12. The cache invalidation criterion is not satisfied until the user reaches sub-square 7, which is a sub-square of square 4. The squares adjacent to square 4 are 0, 8, 12, 16, and 24. As squares 32 and 36 are in the client's cache and are not adjacent to square 4, they are invalidated by the cache invalidation criterion and dropped. We give an algorithm for the SH method in Sect. 4.
Table 1. The Spatial Hoarding method applied to the user's path of Fig. 1

Sub-square | Available squares               | Squares to hoard | Squares to drop
36         | 8, 12, 32, 36                   | –                | –
14         | 8, 12, 32, 36                   | –                | –
12         | 8, 12, 32, 36                   | 0, 4             | –
13         | 0, 4, 8, 12, 32, 36             | 16, 24           | –
7          | 0, 4, 8, 12, 16, 24, 32, 36     | –                | 32, 36
18         | 0, 4, 8, 12, 16, 24             | –                | 0, 8
19         | 4, 12, 16, 24                   | 20, 28           | –
25         | 4, 12, 16, 20, 24, 28           | –                | –
27         | 4, 12, 16, 20, 24, 28           | 48, 52           | –
49         | 4, 12, 16, 20, 24, 28, 48, 52   | –                | 4, 16, 20
51         | 12, 24, 28, 48, 52              | 56, 60           | –
3.2 Client's Capability
Let us consider again the scenario of a tourist looking for the nearest restaurant. In Fig. 1, the tourist is in sub-square 14, restaurant A is in sub-square 26 and restaurant B is in sub-square 8. If the client answers the nearest-restaurant query, the response will be restaurant B, although restaurant A is nearer, because sub-square 26 is not available in the client. We define the capability of the client as the area where the client is able to give a correct answer to a query of this kind. The client's capability is the area delimited by the circle whose centre is the current user's location and whose radius is the distance between the user and the nearest point of the boundary of the polygon delimiting the squares available on the client. For our example, this nearest point lies on the upper side of square 12. So, before giving an answer to the user, the client must look for the nearest restaurant within its capability circle. If no restaurant is found, the query must be transferred to the server.
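To make the capability test concrete, the sketch below computes the capability radius as the distance from the user to the nearest boundary edge of the union of cached squares, and keeps only the restaurants that fall strictly inside the resulting circle. The coordinates, helper names and the boundary-edge test are illustrative assumptions and not part of the method as specified above.

import math

def _point_segment_distance(p, a, b):
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(px - cx, py - cy)

def capability_radius(user, cached_squares, side=2.0):
    """Distance from the user's position to the boundary of the union of cached squares.
    cached_squares: set of (x, y) lower-left corners of the cached side-2 squares."""
    best = math.inf
    for (x, y) in cached_squares:
        edges = [
            (((x, y), (x + side, y)), (x, y - side)),                 # bottom edge / south neighbour
            (((x, y + side), (x + side, y + side)), (x, y + side)),   # top edge / north neighbour
            (((x, y), (x, y + side)), (x - side, y)),                 # left edge / west neighbour
            (((x + side, y), (x + side, y + side)), (x + side, y)),   # right edge / east neighbour
        ]
        for (a, b), neighbour in edges:
            if neighbour in cached_squares:
                continue                      # interior edge, not on the boundary
            best = min(best, _point_segment_distance(user, a, b))
    return best

# Assumed coordinates for the example of Fig. 1: squares 8, 12, 32, 36 are cached,
# the user is in sub-square 14, restaurant A is in sub-square 26 and B in sub-square 8.
cached = {(2, 0), (2, 2), (4, 0), (4, 2)}
user = (3.5, 2.5)
r = capability_radius(user, cached)
restaurants = {"A": (3.5, 4.5), "B": (2.5, 0.5)}
inside = {n: p for n, p in restaurants.items()
          if math.hypot(p[0] - user[0], p[1] - user[1]) < r}
print(r, inside)   # both restaurants lie outside the circle, so the query goes to the server

With these assumed positions both restaurants fall outside the capability circle, so the query is forwarded to the server, which is consistent with the discussion above.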
3.3 Operating in Disconnected Mode
Another limitation of mobile devices is their low power capacity. Applying the SH method to the example of Fig. 1 shows that hoarding is needed in only 5 of the 11 sub-squares traversed. We can exploit this by allowing the user to operate in doze or disconnected mode to reduce power consumption. The application interface must let the user know when he can switch to disconnected mode and when he must reconnect.
4 Implementation
4.1 Algorithm
We model the squares available in the client's cache as a linked list whose nodes are squares, each carrying an attribute that stores its Peano key. Algorithm 1 implements the cache invalidation and hoarding operations; Algorithm 2 retrieves the three squares adjacent to a given sub-square.

Algorithm 1: Application of the cache invalidation and hoarding criteria
Input: list of the available squares (list), array of the squares adjacent to the current square (T1[9]), array of the squares adjacent to the current sub-square (T2[3])
Procedure:
integer i; square temp, temp2;
/* Cache invalidation criterion: drop cached squares that are not adjacent to the current square */
temp := list.first;
while ((temp != NULL) and (notin(temp, T1) == true)) do   /* non-adjacent squares at the head of the list */
    list.first := temp.next;
    free(temp);
    temp := list.first;
end while
while (temp != NULL) do
    if (temp.next != NULL) then
        if (notin(temp.next, T1) == true) then
            temp2 := temp.next;                  /* unlink and free the non-adjacent square */
            temp.next := temp2.next;
            if (temp2 == list.last) then list.last := temp; end if
            free(temp2);
        else
            temp := temp.next;
        end if
    else
        temp := temp.next;
    end if
end while
/* Hoarding criterion: add the missing squares adjacent to the current sub-square */
for i := 1 to T2.length do
    if (isnotin(T2[i], list) == true) then add(list, T2[i]); end if
end for

The algorithm begins with the application of the cache invalidation criterion. The first element of the list is treated separately because no previous node points to it. Each node of the list is compared, using the function "notin", with the array T1; if a node (square) does not appear in T1 (i.e., the square is not adjacent to the current square), it is dropped from the list. Then the hoarding criterion is applied: each element of T2 (the array of squares adjacent to the current sub-square) is compared with the list using the function "isnotin", and every element of T2 that does not appear in the list is added to it using the function "add".

Algorithm 2: Determination of the squares adjacent to the current sub-square
Input: Peano key of the current sub-square (P)
Output: the array of the 3 adjacent squares T[3]
Procedure:
integer T[3]; integer A[3][2]; integer x, y, i;
x := get_x(P);
y := get_y(P);
switch (P mod 4)
    case 0: A[1][1] := x - 1; A[1][2] := y;     A[2][1] := x - 1; A[2][2] := y - 1; A[3][1] := x;     A[3][2] := y - 1; break;
    case 1: A[1][1] := x;     A[1][2] := y + 1; A[2][1] := x - 1; A[2][2] := y + 1; A[3][1] := x - 1; A[3][2] := y;     break;
    case 2: A[1][1] := x;     A[1][2] := y - 1; A[2][1] := x + 1; A[2][2] := y - 1; A[3][1] := x + 1; A[3][2] := y;     break;
    case 3: A[1][1] := x;     A[1][2] := y + 1; A[2][1] := x + 1; A[2][2] := y + 1; A[3][1] := x + 1; A[3][2] := y;     break;
end switch
for i := 1 to 3 do
    T[i] := get_p(A[i][1], A[i][2]);
    T[i] := T[i] - (T[i] mod 4);
end for
output T
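As a concrete counterpart to Algorithm 2, here is a minimal Python sketch of the key manipulation, assuming the N-ordering is the usual interleaving of the bits of x and y (which matches the layout of Fig. 1); the function names mirror the pseudocode but the implementation details are our own.

def get_x(p):
    """x-coordinate of a unit sub-square from its Peano key (odd bit positions)."""
    x, bit = 0, 0
    p >>= 1
    while p:
        x |= (p & 1) << bit
        p >>= 2
        bit += 1
    return x

def get_y(p):
    """y-coordinate of a unit sub-square from its Peano key (even bit positions)."""
    y, bit = 0, 0
    while p:
        y |= (p & 1) << bit
        p >>= 2
        bit += 1
    return y

def get_p(x, y):
    """Peano key obtained by interleaving the bits of y (even positions) and x (odd positions)."""
    p, bit = 0, 0
    while x or y:
        p |= (y & 1) << (2 * bit)
        p |= (x & 1) << (2 * bit + 1)
        x >>= 1
        y >>= 1
        bit += 1
    return p

def adjacent_squares(sub_key):
    """Keys of the three length-2 squares adjacent to sub_key, following the P mod 4 case analysis."""
    x, y = get_x(sub_key), get_y(sub_key)
    offsets = {
        0: [(-1, 0), (-1, -1), (0, -1)],   # bottom-left sub-square
        1: [(0, 1), (-1, 1), (-1, 0)],     # top-left sub-square
        2: [(0, -1), (1, -1), (1, 0)],     # bottom-right sub-square
        3: [(0, 1), (1, 1), (1, 0)],       # top-right sub-square
    }
    result = []
    for dx, dy in offsets[sub_key % 4]:
        if x + dx < 0 or y + dy < 0:
            continue                       # outside the covered area
        k = get_p(x + dx, y + dy)
        result.append(k - (k % 4))         # the bottom-left sub-square key names the square
    return result

# Entering sub-square 13 yields squares 4, 16 and 24, consistent with Table 1
# (16 and 24 are hoarded; 4 is already cached).
assert sorted(adjacent_squares(13)) == [4, 16, 24]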
First, the coordinates x and y are derived from the Peano key P using the functions "get_x" and "get_y". Then the position of the current sub-square within its parent square (P mod 4) is determined, because the coordinates of the adjacent sub-squares depend on it. After the coordinates of the adjacent sub-squares have been determined, the corresponding Peano keys are obtained using the function "get_p". Finally, the key of the bottom-left sub-square is derived, because this key, together with the side length 2, identifies the square. We do not give the algorithm for determining the squares adjacent to the current square, because it is quite similar to Algorithm 2.

4.2 What to Hoard
Location-dependent systems such as tourist guides can, as noted before, use different kinds of data (text, graphics, maps, images, audio clips, film clips…), so the amount of multimedia data to be managed is very large.
We have proposed to divide the area covered by the system into adjacent squares because hoarding the information related to the whole area cannot be implemented in a real system, given the limited memory and storage capacity of the user's device. Depending on the user's device resources and the available bandwidth, a fixed amount of space A is reserved for the client's cache. As the maximal number of squares that can be available in the client's cache is 9, the amount of data hoarded for each square cannot exceed A/9. As we deal here with systems that manage a huge amount of data, the amount of information related to some squares can exceed the value A/9. Also, some data items associated with a square may have low probabilities of access, so transferring them may only waste resources. Kubach and Rothermel (2001) have described how to maintain access probability tables, in which each data item is associated with the average probability that it will be requested, and propose to hoard only a fixed number of the first elements. We think that a fixed number of data items is not a criterion that can be implemented in practice, because data items do not require the same space in memory, so hoarding a fixed number of data items can exceed the space reserved for the cache. In the following we determine what we mean by a data item within the SH method, explain how data items can be added and dropped dynamically depending on the amount of space available in the client's cache, and show how to retrieve all the information related to a given square.
Laurini and Thompson (1992) have generalised the concept of hypertext to the hyperdocument, whose content portions can always be displayed on a screen or presented via a loudspeaker, and define the hypermap as a multimedia document with geographic-coordinate-based access via mouse clicking or its equivalent. Database entities are represented by graphical means, and clicking a reference word or picture allows the user to go to another node. They present the following relational model:
WEB(Document_node-ID, (Word_locator (To_document_node-ID, Type_of_link)*)*, (From_document_node-ID, Type_of_link)*)
where * indicates nesting (the nested data are in a separate table; however, they may be stored as a long list in the parent table). Type_of_link refers to the nature of the path from one node to another. Word_locator is the word, graph unit, pixel, or other element in the first element. They also explain how to deal with spatial queries for retrieving hypermap nodes. With Peano relations the solution is:
Document(Document_node-ID, (Peano_key, Side_length)*)
Within the SH method we consider document nodes as the data items with which the average access probability is associated:
Document(Document_node-ID, P(Document_node-ID), (Peano_key, Side_length)*)
The first operation is to retrieve all the documents related to a candidate square for hoarding, sorted in decreasing order of their average probability of access. Then the application uses the operating system and DBMS primitives to associate each document with the amount of space it requires. The documents are added to the hoard set until the maximum value fixed for their square is reached. This value can be exceeded if there is sufficient space in the client. As noted before, the client can later drop the data items with the lowest probabilities if necessary. Every click on a node or document consultation is kept in a log file in the client and sent later to the server for updating the average access probability of each document node.
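The selection step can be sketched as a small greedy routine; the document tuples and the skip-if-too-large policy below are illustrative assumptions rather than the exact procedure of the method.

def select_hoard_set(documents, square_budget):
    """Greedy selection of the documents to hoard for one candidate square.
    documents: list of (doc_id, access_probability, size) tuples for the documents
    whose Peano relation intersects the square; square_budget is typically A/9."""
    hoard_set, used = [], 0
    # consider documents in decreasing order of average access probability
    for doc_id, prob, size in sorted(documents, key=lambda d: d[1], reverse=True):
        if used + size > square_budget:
            continue                      # skip documents that would overflow the budget
        hoard_set.append(doc_id)
        used += size
    return hoard_set, used

# Example with made-up documents (id, probability, size in KB):
docs = [("d1", 0.40, 300), ("d2", 0.25, 500), ("d3", 0.20, 150), ("d4", 0.15, 700)]
print(select_hoard_set(docs, square_budget=800))   # -> (['d1', 'd2'], 800)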
5 Future Work
We are in the final stages of developing a simulation prototype for the Spatial Hoarding method. Our preliminary simulation results show that the SH policy improves the cache hit ratio and significantly reduces the query latency. When there is a large number of users in a given area, the information related to the same square may be hoarded several times for different clients. In our future work we will focus on saving bandwidth in the case of multiple users.
6 Conclusion
We have presented an innovative policy for resolving the problem of latency in location-dependent systems. Our mechanism makes no assumptions about the user's future movement and thus deals with the complexity of real-world applications. We proposed solutions to all the problems related to the method's implementation in an elaborate spatial information system managing multimedia information. Our hoarding mechanism improves the cache hit ratio, thus reducing the uplink requests, and reduces the query latency. Compared with previous hoarding mechanisms, whose main aim was to allow disconnected information access but which put the user at risk of accessing obsolete information, our method allows the user to access the most recent data. Using the user's position as a cache invalidation criterion reduces the need for extra communication between the client and the server for checking cache consistency.
References
Abowd, G.D., Atkeson, C.G., Hong, J., Long, S., Kooper, R., and Pinkerton, M., 1997, Cyberguide: a mobile context-aware tour guide. Wireless Networks, 3, 5, pp. 421-433.
Badrinath, B.R., Imielinsky, T., Frankiel, R., and Goodman, D., 1996, Nimble: Many-time, many-where communication support for information systems in highly mobile and wireless environments, http://www.cs.rutgers.edu/~badri/dataman/nimble/.
Cao, G., 2002, Proactive power-aware cache management for mobile computing systems. IEEE Transactions on Computers, 51, 6, pp. 608-621.
Cheverst, K., Davis, N., Mitchell, K., Friday, and Efstriatou, C., 2000, Developing a context-aware electronic tourist guide: some issues and experiences. In Proceedings of CHI'2000 (Netherlands), pp. 17-24.
The Coda Group, Coda file system, http://www.coda.cs.cmu.edu/.
De Nitto, V.P., Grassi, V., and Morlupi, A., 1998, Modeling and evaluation of prefetching policies for context-aware information services. In Proceedings of the 4th Annual International Conference on Mobile Computing and Networking (Dallas, Texas, USA), pp. 55-64.
Khushraj, A., Helal, A., and Zhang, J., 2002, Incremental hoarding and reintegration in mobile environments. In Proceedings of the International Symposium on Applications and the Internet (Nara, Japan).
Kubach, U., and Rothermel, K., 2001, Exploiting location information for infostation-based hoarding. In Proceedings of the 7th International Conference on Mobile Computing and Networking (New York, ACM Press), pp. 15-27.
Kuenning, G.H., and Popek, G.J., 1997, Automated hoarding for mobile computers. In Proceedings of the 16th ACM Symposium on Operating Systems Principles (St. Malo, France), pp. 264-275.
Laurini, R., and Thompson, A.D., 1992, Fundamentals of Spatial Information Systems (A.P.I.C. Series, Academic Press, New York, NY).
Lee, D.L., Lee, W.C., Xu, J., and Zheng, B., 2002, Data management in location-dependent information services. IEEE Pervasive Computing, 1, 3, pp. 65-72.
Malaka, R., 1999, Deep Map: the multilingual tourist guide. In Proceedings of the C-STAR Workshop.
Mokbel, M.F., Aref, W.G., and Kamel, I., 2003, Analysis of multi-dimensional space-filling curves. GeoInformatica, 7, 3, pp. 179-209.
Poslad, S., Laamanen, H., Malaka, R., Nick, A., Buckle, P., and Zipf, A., 2001, CRUMPET: Creation of user-friendly mobile services personalised for tourism.
In Second International Conference on 3G Mobile Communication Technologies (London, UK), pp. 28-32.
Saygin, Y., Ulusoy, Ö., and Elmagarmid, A.K., 2000, Association rules for supporting hoarding in mobile computing environments. In Proceedings of the 10th International Workshop on Research Issues in Data Engineering (IEEE Computer Society Press).
Tichy, W.F., 1985, RCS - A system for version control. Software - Practice and Experience, 15, 7, pp. 637-654.
Xu, J., Tang, X., Lee, D.L., and Hu, Q., 1999, Cache coherency in location-dependent information services for mobile environments. In Proceedings of the 1st International Conference on Mobile Data Access (Springer, Heidelberg, Germany), pp. 182-193.
Zarikas, V., Papatzanis, G., and Stephanidis, C., 2001, An architecture for a self-adapting information system for tourists. In Proceedings of the 2001 Workshop on Multiple User Interfaces over the Internet, http://cs.concordia.ca/~seffah/ihm2001/papers/zarikas.pdf.
Zheng, B., Xu, J., and Lee, D.L., 2002, Cache invalidation and replacement strategies for location-dependent data in mobile environments. IEEE Transactions on Computers, 51, 10, pp. 1141-1153.
Distributed Ranking Methods for Geographic Information Retrieval Marc van Kreveld, Iris Reinbacher, Avi Arampatzis, and Roelof van Zwol Institute of Information and Computing Sciences, Utrecht University P.O.Box 80.089, 3508 TB Utrecht, The Netherlands [email protected], [email protected], [email protected], [email protected]
Summary. Geographic Information Retrieval is concerned with retrieving documents that are related to some location. This paper addresses the ranking of documents by both textual and spatial relevance. To this end, we introduce distributed ranking, where similar documents are ranked spread in the list instead of consecutively. The effect of this is that documents close together in the ranked list have less redundant information. We present various ranking methods, efficient algorithms to implement them, and experiments to show the outcome of the methods.
1 Introduction The most common way to return a set of documents obtained from a Web query is by a ranked list. The search engine attempts to determine which document seems to be the most relevant to the user and will put it first in the list. In short, every document receives a score, or distance to the query, and the returned documents are sorted by this score or distance. There are situations where the sorting by score may not be the most useful one. When a more complex query is done, composed of more than one query term or aspect, documents can also be returned with two or more scores instead of one. This is particularly useful in geographic information retrieval (Jones et al. 2002, Rauch et al. 2003, Visser et al. 2002). For example, the Web search could be for castles in the neighborhood of Koblenz, and the documents returned ideally have a score for the query term “castle” and a score for the proximity to Koblenz. This implies that a Web document resulting from this query can be mapped to a point in the 2-dimensional plane. A cluster of points in this plane could be several documents about the same castle. If this castle is in the immediate vicinity of Koblenz, all of these documents would be ranked high, provided that they also have a high score on the term “castle”. However, the user probably also wants documents about other castles that may be a bit further away, especially when these documents
This research is supported by the EU-IST Project No. IST-2001-35047 (SPIRIT).
are more relevant for the term “castle”. To incorporate this idea in the ranking, we introduce distributed ranking in this paper. We present various models that generate ranked lists that have diversity. We also present efficient algorithms that compute the distributed rankings. To keep server load low, it is important to have efficient algorithms. There are several reasons to rank documents according to more than one score. For example we could distinguish between the scores of two textual terms, or a textual term and metadata information, or a textual term and a spatial term, and so on. A common example of metadata for a document is the number of hyperlinks that link to that document; a document is probably more relevant if there are many links to it. In all of these cases we get two scores which need to be combined for a ranking. In traditional information retrieval, the two scores of each document would be combined into a single score (e.g., by a weighted sum or product) which produces the ranked list by sorting. Besides the problem that it is unclear how the two scores should be combined, it also makes a distributed ranking impossible. Two documents with the same combined score could be similar documents or quite different. If two documents have two scores that are the same, one has more reason to suspect that the documents themselves are similar than when two documents have one (combined) score that is the same. The topic of geographic information retrieval is studied in the SPIRIT project (Jones et al. 2002). The idea is to build a search engine that has spatial intelligence because it will understand spatial relationships like close to, to the North of, adjacent to, and inside, for example. The core search engine will process a user query in such a way that both the textual relevance and the spatial relevance of a document is obtained in a score. This is possible because the search engine will not only have a term index, but also a spatial index. These two indices provide the two scores that are needed to obtain a distributed ranking. The ranking study presented here will form part of the geographic search engine to be developed for the SPIRIT project. Related research has been conducted in (Rauch et al. 2003), which focuses on disambiguating geographic terms of a user query. The disambiguation of the geographic location is done by combining textual information, spatial patterns of other geographic references, relative geographic references from the document itself, and population heuristics from a gazetteer. This gives the final value for geoconfidence. The georelevance is composed of the geoconfidence and the emphasis of the place name in the document. The textual relevance of a document is computed as usual in information retrieval. Once both textual and geographic relevance are computed, they are combined by a weighted sum. Finding relevant information and at the same time trying to avoid redundancy has so far mainly been addressed in producing summaries of one or more documents. (Carbonell and Goldstein 1998) uses the maximal marginal relevance (MMR), which is a linear combination of the relevance of the document to the user query and its independence of already selected documents.
MMR is used for the reordering of documents. A user study has been performed in which the users preferred MMR to the usual ranking of documents. This paper contains no algorithm for actually (efficiently) computing the MMR. Following up on this, a Novelty Track of TREC (Harman 2002) discusses experimenting with ranking of textual documents such that every next document has as much additional information as possible. (Goldstein et al. 1999) proposes another scoring function for summarizing text documents. Every sentence is assigned a score based on the occurrence of statistical features and the occurrence of linguistic features; these are combined linearly with a weighting function. In (Goldstein et al. 2000), MMR is refined and used to summarize multiple documents. Different passages or sentences, respectively, are assigned a score instead of full documents. The remainder of this paper is organized as follows. In Section 2 we present several different ranking methods and the algorithms to compute them. In Section 3 we show how the ranking methods behave on real-world data. In the conclusions we mention other research and experiments that we have done or we are planning to do.
2 Distributed Ranking Methods In this section we will present specific ranking methods. Like in traditional information retrieval, we want the most relevant documents to appear in the ranking, while avoiding that documents with similar information appear close to documents already ranked. We will focus on the two dimensional case only, although in principle the idea and formulas apply in higher dimensions too. We assume that a Web query has been conducted and a number of relevant documents were found. Each document is associated with two scores, for example a textual score and a spatial score (which is the case in the SPIRIT search engine). The relevant documents are mapped to points in the plane, and the query is also mapped to a point. We perform the mapping in such a way that the query is a point Q at the origin, and the documents are mapped to points p1 , . . . , pn in the upper right quadrant, where documents with high scores are points close to Q. We can now formulate the two main objectives for our ranking procedure: 1. Proximity to query: Points close to the query Q are favored. 2. Spreading: Points farther away from already ranked points are favored. A ranking that simply sorts all points in the representation plane by distance to Q is optimal with respect to the first objective. However, it can perform badly with respect to the second. Selecting a highly distributed subset of points is good with respect to the second objective, but the ranked list would contain too many documents with little relevance early in the list. We therefore seek a compromise where both criteria are considered simultaneously. Note
that the use of a weighted sum to combine the two scores into one makes it impossible to obtain a spread-out ranking. The point with the smallest Euclidean distance to the query is considered the most relevant document and is always first in any ranking. The remaining points are ranked with respect to already ranked points. At any moment during the ranking, we have a subset R ⊂ P of points that have already been ranked, and a subset U ⊂ P of points that are not ranked yet. We choose from U the “best” point to rank next, where “best” is determined by a scoring function that depends on both the distance to the query Q and the set R of ranked points. Intuitively, an unranked point has a higher added value or relevance if it is not close to any ranked points. For every unranked point p,
Fig. 1. An unranked point p amidst ranked points p1 , p2 , p3 , pi , where p is closest to pi by distance and by angle.
we consider only the closest point pi ∈ R, where closeness is measured either in the Euclidean sense, or by angle with respect to the query point Q. This is illustrated by p − pi and φ, respectively, in Figure 1. Using the angle to evaluate the similarity of p and pi seems less precise than using the Euclidean distance, but it allows more efficient algorithms, and certain extensions of angle-based ranking methods give well-distributed results.
2.1 Distance to query and angle to ranked Our first ranking method uses the angle measure to obtain the similarity between an unranked point and a ranked point. In the triangle pQpi (see Figure 1) consider the angle φ = φ(p, pi ) and rank according to the score S(p, R) ∈ [0, 1], which can be derived from the following normalized equation:
S(p, R) = min_{pi∈R} [ (1 / (1 + ‖p‖))^k · 2(φ(p, pi) + c) / (π + 2c) ]    (1)
Here, k denotes a constant; if k < 1, the emphasis lies on the distribution, if k > 1, we assign a bigger weight to the proximity to the query. The additive constant 0 < c ≪ 1 ensures that all unranked points p ∈ U are assigned an angle-dependent factor greater than 0. The score S(p, R) necessarily lies between 0 and 1, and is appropriate if we do not have a natural upper bound on the maximum distance of unranked points to the query. If such an upper bound were available, there are other formulas that give normalized scores. During the ranking algorithm, we always choose the unranked point p that has the highest S(p, R) score and rank it next. This implies an addition to the set R, and hence, recomputation of the scores of unranked points may be necessary. We first give a generic algorithm with a running time of O(n²).
Algorithm 1:
Input: A set P with n points in the plane.
1. Rank the point r closest to the query Q first. Add it to R and delete it from P.
2. For every unranked point p ∈ P do
   a) Store with p the point r ∈ R with the smallest angle to p
   b) Compute the score S(p, R) = S(p, r)
3. Determine and choose the point with the highest score S(p, R) to be next in the ranking; add it to R and delete it from P.
4. Compute for every point p′ ∈ P the angle to the last ranked point p. If it is smaller than the angle to the point stored with p′, then store p with p′ and update the score S(p′, R).
5. Continue with step 3 as long as there are unranked points.
The first four steps all take linear time. As we need to repeat steps 3 and 4 until all points are ranked, the overall running time of this algorithm is O(n²). It is a simple algorithm, and can be modified to work for different score and distance functions. In fact, it can be applied to all the ranking models that will follow.
Theorem 1. A set of n points in the plane can be ranked according to the distance-to-query and angle-to-ranked model in O(n²) time.
If we are only interested in the top 10 documents of the ranking, we only need O(n) time for the computation. More generally, the top t documents are determined in O(tn) time.
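A compact Python sketch of Algorithm 1 with the score of Eq. 1 is given below; the parameter values and the example points are arbitrary, and no attempt is made at the O(tn) top-t shortcut.

import math

def distributed_ranking(points, k=1.0, c=0.1):
    """Generic O(n^2) distributed ranking (Algorithm 1) with the score of Eq. (1).
    points: (x, y) pairs of document scores mapped to the upper-right quadrant,
    the query being the origin.  Returns the points in ranked order."""
    def dist(p):
        return math.hypot(p[0], p[1])
    def angle(p, q):                      # angle p-Q-q at the query point Q = origin
        return abs(math.atan2(p[1], p[0]) - math.atan2(q[1], q[0]))
    n = len(points)
    first = min(range(n), key=lambda i: dist(points[i]))
    ranked = [first]
    unranked = [i for i in range(n) if i != first]
    closest = {i: first for i in unranked}        # ranked point with smallest angle to i
    def score(i):
        p, r = points[i], points[closest[i]]
        return (1.0 / (1.0 + dist(p))) ** k * 2.0 * (angle(p, r) + c) / (math.pi + 2.0 * c)
    while unranked:
        best = max(unranked, key=score)           # highest S(p, R) is ranked next
        unranked.remove(best)
        ranked.append(best)
        for i in unranked:                        # step 4: update the stored closest angle
            if angle(points[i], points[best]) < angle(points[i], points[closest[i]]):
                closest[i] = best
    return [points[i] for i in ranked]

# Illustrative run on a few two-score documents
docs = [(1, 1), (1.2, 1.1), (5, 0.5), (0.4, 4), (3, 3)]
print(distributed_ranking(docs, k=1, c=0.1))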
2.2 Distance to query and distance to ranked In the last section we ranked by angle to the closest ranked point. It may be more natural to consider the Euclidean distance to the closest ranked point instead. In the triangle pQpi of Figure 1, take the distance ‖p − pi‖ from p to the closest ranked point pi and rank according to the outcome of the following equation:
S(p, R) = min_{pi∈R} ‖p − pi‖ / ‖p‖²    (2)
The denominator needs a squaring of ‖p‖ (or another power > 1) to ensure that documents far from Q do not end up too early in the ranking, which would conflict with the proximity-to-query requirement. A normalized equation such that S(p, R) ∈ [0, 1] is the following:
S(p, R) = min_{pi∈R} (1 − e^{−λ·‖p−pi‖}) · 1 / (1 + ‖p‖)    (3)
Here, λ is a constant that defines the slope of the exponential function. Algorithm 1 can be modified to work here as well with a running time of O(n²).
Theorem 2. A set of n points in the plane can be ranked according to the distance-to-query and distance-to-ranked model in O(n²) time.
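For completeness, the normalized distance-based score of Eq. 3 can be written as follows (λ = 0.05, as used in the experiments of Sect. 3, is only an example); note that step 4 of Algorithm 1 must then track the closest ranked point by Euclidean distance rather than by angle.

import math

def score_distance_model(p, ranked, lam=0.05):
    """Score of Eq. (3): distance to the query combined with distance to the closest
    ranked point.  'ranked' must be a non-empty list of already ranked points."""
    d_q = math.hypot(p[0], p[1])
    d_r = min(math.hypot(p[0] - r[0], p[1] - r[1]) for r in ranked)
    return (1.0 - math.exp(-lam * d_r)) / (1.0 + d_q)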
2.3 Addition models So far, our distributed methods were all based on a formula that divided angle or distance to the closest ranked point by the distance to the query. In this way, points closer to the query get a higher relevance. We can obtain a similar effect but a different ranking by adding up these values. It is not clear beforehand which model will be more satisfactory for users, so we analyze these models as well.
S(p, R) = min_{pi∈R} [ α · (1 − e^{−λ·(‖p‖/‖pmax‖)}) + (1 − α) · φ(p, pi) · 2/π ]    (4)
In this equation, pmax is the point with maximum distance to the query, α ∈ [0, 1] denotes a variable which is used to put an emphasis on either distance or angle, and λ is a constant that defines the base e^λ of the exponential function. Algorithm 1 can be modified for the addition models, but as the angle φ(p, pi) is an additive and not a multiplicative part of the score equation, we can give algorithms with better running time. The point set is initially stored in the leaves of a binary tree T, sorted by counterclockwise (ccw) angle to the y-axis. In every leaf of the tree we also store: (i) ccw and clockwise (cw) angle to the y- and x-axis respectively; (ii) the distance to the query; (iii) ccw and cw score. We augment T as follows (see e.g. (Cormen et al. 1990) for augmenting data structures): in every internal node we store the best cw and ccw score per subtree. Later in the algorithm, we additionally store in the root of each subtree the angle of the closest ccw and cw ranked point and whether the closest ranked point is in cw or ccw direction. Furthermore, we store the best score per tree in a heap for quicker localization. As shown on the left in Figure 2, between two already ranked points p1 and p2, indicated by ℓ1 and ℓ2, there are two binary trees, T1 cw and T2 ccw of the bisecting barrier line ℓ12. All the points in T1 are closer in angle to p1 and all the points in T2 are closer in angle to p2. If we insert a new point p3 into the ranking, this means we insert a new imaginary line ℓ3 through p3 and we need to perform the following operations on the trees:
Fig. 2. The split and concatenate of trees in Algorithm 2.
1. Split T1 and T2 at the angle bisectors ℓ32 and ℓ13, creating the new trees T′cw and T′ccw and two intermediate trees T̄cw and T̄ccw.
2. Concatenate the intermediate trees from (1), creating one tree T̄.
3. Split T̄ at the newly ranked point p3, creating T″cw and T″ccw.
Figure 2, right, shows the outcome of these operations. Whenever we split or concatenate the binary trees we need to make sure that the augmentation remains correct. In our case, this is no problem, as we only store the best initial scores in the leaves. However, we need to update the information in the root of each tree about the closest cw and ccw ranked point and the best scores. As the scores are additive, and all scores for points in the same tree are calculated with respect to the same ranked point, we simply subtract (1 − α) · φ′ · 2/π, where φ′ denotes the cw (ccw) angle of the closest ranked point, from the cw (ccw) best score to get the new best score for the tree. We also need to update the score information in the heap. Now we can formulate an algorithm for the addition model that runs in O(n log n) time.
Algorithm 2:
Input: A set P with n points in the plane.
1. Create T with all points of P, the augmentation and a heap that contains only the point p closest to the query Q.
2. Choose the point p with the highest score S(p, R) as next in the ranking by deleting the best one from the heap.
3. For every last ranked point p do:
   a) Split and concatenate the binary trees as described above and update the information in their roots.
   b) Update the best-score information in the heap:
      i. Delete the best score of the old tree T1 or T2 that did not contain p.
      ii. Find the four best scores of the new trees T′cw, T′ccw, T″cw, and T″ccw and insert them in the heap.
4. Continue with step 2.
Theorem 3. A set of n points in the plane can be ranked according to the angle-distance addition model in O(n log n) time.
Another, similar, addition model adds up the distance to the query and the distance to the closest ranked point:
Again, pmax is the point with maximum distance to the query, α ∈ [0, 1] is a variable used to influence the weight given to the distance to the query (proximity to query) or to the distance to the closest point in the ranking (high spreading), and λ1 and λ2 are constants that define the base of the exponential function. Note that Algorithm 2 is not applicable for this addition model. This is easy to see, since the distance to the closest ranked point does not change by the same amount for a group of points. This implies that the score for every unranked point needs to be adjusted individually when adding a point to R. We can modify Algorithm 1 for this addition model. Alternatively, we can use the following algorithm that has O(n²) running time in the worst case, but a typical running time of O(n log n).
Algorithm 3:
Input: A set P with n points in the plane.
1. Rank the point p closest to the query Q first. Add it to R and delete it from P. Initialize a list with all unranked points.
2. For every newly ranked point p ∈ R do:
   a) Insert it into the Voronoi diagram of R.
   b) Create for the newly created Voronoi cell a list of unranked points that lie in it by taking those points that have p as closest ranked point from the lists of the neighboring cells. For all cells R′ ⊆ R that changed, update their lists of unranked points.
   c) Compute the point with the best score for the newly created Voronoi cell and insert it in a heap H. For all cells R′ ⊆ R that changed, recompute the best score and update the heap H accordingly.
3. Choose the point with the best overall score from the heap H as next in the ranking; add it to R and delete it from P and H.
4. Continue with step 2.
Since the average degree of a Voronoi cell is six, one can expect that a typical addition of a point p to the ranked points involves a set R′ with six ranked points. If we also assume that, typically, a point in R′ loses a constant fraction of the unranked points in its list, we can prove an O(n log n) time bound for the whole ranking algorithm. The analysis is the same as in (Heckbert and Garland 1995, van Kreveld et al. 1997). The algorithm can be applied to all ranking methods described so far.
Theorem 4. A set of n points in the plane can be ranked by the distance-distance addition model in O(n²) worst case and O(n log n) typical time.
3 Experiments We implemented the generic ranking Algorithm 1 for the basic ranking methods described in Subsections 2.1, 2.2, and 2.3.
Fig. 3. Ranking by distance to origin only.
Furthermore, we implemented an extension called staircase enforcement, explained in Subsection 3.2. We compare the outcomes of these algorithms for two different point sets shown in Figure 3. The point set at the left consists of 20 uniformly distributed points, the point set at the right shows the 15 most relevant points for the query ‘safari africa’ which was performed on a dataset consisting of 6,500 Lonely Planet web pages. The small size of the point sets was chosen out of readability considerations.
3.1 Basic ranking algorithms Figure 3 shows the output of a ranking by distance-to-query only. It will function as a reference point for the other rankings. Points close together in space are also close in the ranking. In the other ranking methods, see Figure 4, this is not the case anymore. This is visible in the ‘breaking up’ of the cluster of four points in the upper left corner of the Lonely Planet point set rankings. Note also that the points ranked last by the simple distance ranking are always ranked earlier by the other methods. This is because we enforced higher spreading over proximity to the query by the choice of parameters. The rankings are severely influenced by this choice. In our choice of parameters we did not attempt to obtain a “best” ranking. We used the same parameters in all three ranking methods to simplify qualitative comparison.
3.2 Staircase enforced ranking algorithms In the staircase enforced methods, shown in Figure 5, the candidates to be ranked next are only those points that lie on the (lower left) staircase of the point set. The scoring functions are as before. A point p is on the staircase of a point set P if and only if for all p′ ∈ P \ {p}, we have px < p′x or py < p′y.
Fig. 4. Top: Ranking by distance to origin and angle to closest (k = 1, c = 0.1). Middle: Ranking by distance to origin and distance to closest (Equation 3, λ = 0.05). Bottom: Ranking by additive distance to origin and angle to closest (α = 0.4, λ = 0.05).
40
So, with this limitation, proximity to the query gets a somewhat higher importance compared to the basic algorithms, which is clearly visible in the figures, as the points farthest away from the query are almost always ranked last. It appears that staircase enforced methods perform better on distance to query while keeping a good distribution. The staircase enforced rankings can be implemented efficiently by adapting the algorithms we presented before.
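The staircase itself is easy to extract; a minimal quadratic-time sketch following the definition of Sect. 3.2 is shown below (sorting by x followed by a scan would give O(n log n), but that refinement is omitted here).

def staircase(points):
    """Points on the lower-left staircase: p is kept iff every other point q
    satisfies q.x > p.x or q.y > p.y (the definition used in Sect. 3.2)."""
    return [p for p in points
            if all(q == p or q[0] > p[0] or q[1] > p[1] for q in points)]

print(staircase([(1, 5), (2, 2), (3, 1), (4, 4), (2.5, 2.5)]))   # -> [(1, 5), (2, 2), (3, 1)]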
4 Conclusions This paper introduced distributed relevance ranking for documents that have two scores. It is particularly useful for geographic information retrieval, where documents have both a textual and a spatial score. The concept can easily be extended to more than two scores, although it is not clear how to obtain efficient algorithms that run in subquadratic time. The experiments indicate that both requirements for a good ranking, distance to query and spreading, can be obtained simultaneously. Especially the staircase enforced methods seem to perform well. User evaluation is needed to discover which ranking method is preferred most, and which parameters should be used. We have examined more extensions and performed more experiments than were reported in this paper. For example, we also analyzed the case where the unranked points are only related to the 10 (or any number of) most recently ranked points, to guarantee that similar points are sufficiently far apart in the ranked list. Also for this variation, user evaluation is needed to determine the most preferred methods of ranking.
References
Carbonell, J.G., and Goldstein, J., 1998. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Research and Development in Information Retrieval, pages 335–336.
Cormen, T.H., Leiserson, C.E., and Rivest, R.L., 1990. Introduction to Algorithms. MIT Press, Cambridge, MA.
Goldstein, J., Kantrowitz, M., Mittal, V.O., and Carbonell, J.G., 1999. Summarizing text documents: Sentence selection and evaluation metrics. In Research and Development in Information Retrieval, pages 121–128.
Goldstein, J., Mittal, V.O., Carbonell, J.G., and Callan, J.P., 2000. Creating and evaluating multi-document sentence extract summaries. In Proc. CIKM, pages 165–172.
Harman, D., 2002. Overview of the TREC 2002 novelty track. In NIST Special Publication 500-251: Proc. 11th Text Retrieval Conference (TREC 2002).
Heckbert, P.S., and Garland, M., 1995. Fast polygonal approximation of terrains and height fields. Report CMU-CS-95-181, Carnegie Mellon University.
Jones, C.B., Purves, R., Ruas, A., Sanderson, M., Sester, M., van Kreveld, M.J., and Weibel, R., 2002. Spatial information retrieval and geographical
ontologies – an overview of the SPIRIT project. In Proc. 25th Annu. Int. Conf. on Research and Development in Information Retrieval (SIGIR 2002), pages 387–388.
Rauch, E., Bukatin, M., and Naker, K., 2003. A confidence-based framework for disambiguating geographic terms. In Proc. Workshop on the Analysis of Geographic References. http://www.metacarta.com/kornai/NAACL/WS9/Conf/ws917.pdf.
van Kreveld, M., van Oostrum, R., and Snoeyink, J., 1997. Efficient settlement selection for interactive display. In Proc. Auto-Carto 13: ACSM/ASPRS Annual Convention Technical Papers, pages 287–296.
Visser, U., Vögele, T., and Schlieder, C., 2002. Spatio-terminological information retrieval using the BUSTER system. In Proc. of the EnviroInfo, pages 93–100.
Representing Topological Relationships between Complex Regions by F-Histograms Lukasz Wawrzyniak, Pascal Matsakis, and Dennis Nikitenko Department of Computing and Information Science University of Guelph, Guelph, ON, N1G 2W1, Canada {lwawrzyn, matsakis, dnikiten}@cis.uoguelph.ca
Abstract In earlier work, we introduced the notion of the F-histogram and demonstrated that it can be of great use in understanding the spatial organization of regions in images. Moreover, we have recently designed F-histograms coupled with mutually exclusive and collectively exhaustive relations between line segments. These histograms constitute a valuable tool for extracting topological relationship information from 2D concave objects. For any direction in the plane, they define a fuzzy partition of all object pairs, and each class of the partition corresponds to one of the above relations. The present paper continues this line of research. It lays the foundation for generating a linguistic description that captures the essence of the topological relationships between two regions in terms of the thirteen Allen relations. An index to measure the complexity of the relationships in an arbitrary direction is developed, and experiments are performed on real data.
1 Introduction Work in the modeling of topological relationships often relies on an extension into the spatial domain of Allen’s temporal relations (Allen 1983). Although several alternatives and refinements have been proposed, a common procedure is to approximate the geometry of spatial objects by Minimum Bounding Rectangles (Nabil et al. 1995; Sharma and Flewelling 1995). Many authors, e.g., (Goodchild and Gopal 1990), have stressed the need to handle imprecise and uncertain information about spatial data. Qualitative spatial reasoning aims at modeling commonsense knowledge of space. Nevertheless, computational approaches for spatial modeling and reasoning can benefit from more quantitative measures, and the interest of fuzzy approaches has been widely recognized (Dutta 1991; Freeman 1975).
In previous publications, we introduced the notion of the F-histogram (Matsakis 1998; Matsakis and Wendling 1999), a generic quantitative representation of the relative position between two 2D objects. Most work focused on particular F-histograms called force histograms. As demonstrated in (Matsakis 2002), these histograms can be of great use in understanding the spatial organization of regions in images. For instance, they can provide inputs to systems for linguistic scene description (Matsakis et al. 2001). Moreover, we have recently shown (Matsakis and Nikitenko, to appear) that the F-histogram constitutes a valuable tool for extracting topological relationship information from 2D concave objects. The present paper builds both on (Matsakis et al. 2001) and (Matsakis and Nikitenko, to appear). It lays the foundation for generating a linguistic description that captures the essence of the topological relationships between two complex regions in terms of the thirteen Allen relations. The notion of the F-histogram is briefly described in Sect. 2. The way F-histograms can be coupled with Allen relations using fuzzy set theory is examined in Sect. 3. Section 4 describes experiments on real data. It shows that the F-histograms associated with a given pair of objects carry lots of topological relationship information. An index to measure the complexity of the relationships in an arbitrary direction is developed in Sect. 5. This index will play an important role in the generation of linguistic descriptions. Conclusions are given in Sect. 6.
2 F-Histograms As shown in Fig. 1, the plane reference frame is a positively oriented orthonormal frame (O, i, j). For any real numbers α and v, the vectors i_α and j_α are the respective images of i and j through the α-angle rotation, and Δ_α(v) is the oriented line whose reference frame is defined by i_α and the point of coordinates (0, v), relative to (O, i_α, j_α). An object is a nonempty bounded set of points, E, equal to its interior closure¹, and such that for any α and v the intersection E_α(v) = E ∩ Δ_α(v) is the union of a finite number of mutually disjoint segments. An object may have holes in it and may consist of many connected components. E_α(v) is a longitudinal section of E. Finally, T denotes the set of all triples (α, E_α(v), G_α(v)), where α and v are any real numbers and E and G are any objects. Now, consider two objects A and B (the argument and the referent), a direction θ and some proposition P_AB(θ) like "A is after B in direction θ," "A overlaps B in direction θ," or "A surrounds B in direction θ." We want
1
In other words, it is a 2D object that does not include any “grafting,” such as an arc or isolated point.
to attach a weight to P_AB(θ). To do so, the objects A and B are handled as longitudinal sections.
• For each v, the pair (A_θ(v), B_θ(v)) of longitudinal sections is viewed as an argument put forward to support P_AB(θ).
• A function F from T into IR+ (the set of non-negative real numbers) attaches the weight F(θ, A_θ(v), B_θ(v)) to this argument (A_θ(v), B_θ(v)).
• The total weight F_AB(θ) of the arguments stated in favor of P_AB(θ) is naturally set to (Fig. 2): F_AB(θ) = ∫_{−∞}^{+∞} F(θ, A_θ(v), B_θ(v)) dv.
The function F_AB so defined is called the F-histogram associated with (A, B). It is one possible representation of the position of A with regard to B. F-histograms include f-histograms, which include M-histograms, which themselves include force histograms (Matsakis 1998; Matsakis and Nikitenko, to appear). Most work has focused on force histograms (Matsakis 2002). Malki et al. (2002), however, use f-histograms² to attach weights to the propositions P_AB^r(θ) ≡ "A r B in direction θ," where r belongs to the set {>, mi, oi, f, d, si, =, s, di, fi, o, m, <} of Allen relations (Fig. 3). But the thirteen f-histograms are not defined in a consistent manner and only convex objects are considered. The work of Malki et al. is discussed and revisited in a book chapter by Matsakis and Nikitenko (to appear). The F-histograms designed in that chapter are presented in Sect. 3.
Fig. 1. Oriented straight lines and longitudinal sections. E_α(v) = E ∩ Δ_α(v) is here the union of three segments.
Fig. 2. The objects A and B are handled as longitudinal sections: F_AB(θ) = ∫_{−∞}^{+∞} F(θ, A_θ(v), B_θ(v)) dv.
2 Although the authors rely on the research presented in (Matsakis 1998), they refer to these f-histograms as the histogram of spatial relations. In their publications, they also use the term orientation histogram instead of M-histogram or force histogram. We do not subscribe to these changes in terminology.
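To illustrate the definition on rasterized objects, the following schematic Python sketch evaluates F_AB(θ) for the single direction θ = 0, where the longitudinal sections are simply the rows of a binary raster; the weight function F_demo is a stand-in chosen for readability and is not one of the F functions studied by the authors.

def runs(row):
    """Maximal runs of 1-pixels in a raster row, as (start, end) index pairs."""
    segs, start = [], None
    for i, v in enumerate(list(row) + [0]):
        if v and start is None:
            start = i
        elif not v and start is not None:
            segs.append((start, i))
            start = None
    return segs

def f_histogram_value(raster_a, raster_b, F):
    """Schematic evaluation of F_AB(0): the integral over v becomes a sum over
    raster rows (unit row spacing), each row giving one pair of longitudinal sections."""
    total = 0.0
    for row_a, row_b in zip(raster_a, raster_b):
        total += F(0.0, runs(row_a), runs(row_b))
    return total

def F_demo(theta, section_a, section_b):
    # stand-in weight: product of the total section lengths (for illustration only)
    la = sum(e - s for s, e in section_a)
    lb = sum(e - s for s, e in section_b)
    return la * lb

A = [[0, 1, 1, 0, 0, 0],
     [0, 1, 1, 1, 0, 0]]
B = [[0, 0, 0, 0, 1, 1],
     [0, 0, 0, 0, 1, 1]]
print(f_histogram_value(A, B, F_demo))   # 2*2 + 3*2 = 10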
3 F-Histograms and Allen Relations Consider a set of mutually exclusive and collectively exhaustive relations between segments of an oriented line. F-histograms can be coupled with such relations using fuzzy set theory. We consider here the well-known set of Allen relations (Fig. 3). More details about the F-histograms described below can be found in (Matsakis and Nikitenko, to appear).
< (before), m (meets), o (overlaps), s (starts), d (during), f (finishes), = (equals), fi (finished by), di (contains), si (started by), oi (overlapped by), mi (met by), > (after)
Fig. 3. Allen relations (Allen 1983) between two segments of an oriented line. The black segment is the referent, the gray segment is the argument. Two relations r1 and r2 are linked if and only if they are conceptual neighbors, i.e., r1 can be obtained directly from r2 by moving or deforming the segments in a continuous way.
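For crisp segments, the relation holding between two intervals is determined purely by comparing endpoints; the following sketch (our own, with the argument and referent given as (start, end) pairs) enumerates the thirteen cases of Fig. 3.

def allen_relation(i, j):
    """Crisp Allen relation of segment i = (i1, i2) (the argument) with respect to
    segment j = (j1, j2) (the referent) on an oriented line, with i1 < i2 and j1 < j2."""
    i1, i2 = i
    j1, j2 = j
    if i2 < j1:  return '<'    # before
    if i2 == j1: return 'm'    # meets
    if i1 < j1 and j1 < i2 < j2: return 'o'     # overlaps
    if i1 == j1 and i2 < j2:  return 's'        # starts
    if i1 == j1 and i2 == j2: return '='        # equals
    if i1 == j1 and i2 > j2:  return 'si'       # started by
    if i1 > j1 and i2 < j2:   return 'd'        # during
    if i1 > j1 and i2 == j2:  return 'f'        # finishes
    if i1 < j1 and i2 == j2:  return 'fi'       # finished by
    if i1 < j1 and i2 > j2:   return 'di'       # contains
    if j1 < i1 < j2 and i2 > j2: return 'oi'    # overlapped by
    if i1 == j2: return 'mi'   # met by
    return '>'                 # after

assert allen_relation((0, 1), (2, 5)) == '<'
assert allen_relation((2, 3), (1, 6)) == 'd'
assert allen_relation((4, 8), (1, 5)) == 'oi'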
Let r denote an Allen relation, A and B two objects (convex or not), and θ a direction. To attach a weight to the proposition P_AB^r(θ) ≡ "A r B in direction θ," each pair (A_θ(v), B_θ(v)) of longitudinal sections is viewed as an argument put forward to support P_AB^r(θ) (Sect. 2). A function F_r attaches the weight F_r(θ, A_θ(v), B_θ(v)) to this argument, and the total weight F_AB^r(θ) of the arguments stated in favor of P_AB^r(θ) is set to: F_AB^r(θ) = ∫_{−∞}^{+∞} F_r(θ, A_θ(v), B_θ(v)) dv.
The question, of course, is how to define F_r. Small changes in the longitudinal sections should not affect F_AB^r(θ) significantly. Fuzzy set theoretic approaches have been widely used to handle imprecision and achieve robustness in spatial analysis. Allen relations are fuzzified in Sect. 3.1 and longitudinal sections in Sect. 3.2. The last section, Sect. 3.3, defines the function F_r.
3.1 Fuzzification of Allen Relations
An Allen relation r can be fuzzified in many ways, depending on the intent of the work. For instance, Guesgen (2002) proceeds in a qualitative manner. Here, we proceed in a quantitative manner. The 13 Allen relations are fuzzified as shown in Fig. 4. Each relation, except =, is defined by the min of a few trapezoid membership functions. Let A be the set of all thirteen fuzzy relations. Three properties are worth noticing. First, for any pair (I, J) of segments, we have Σ_{r∈A} r(I, J) = 1, where r(I, J) denotes the degree to which the statement I r J is to be considered true. This, of course, comes from the definition of = (and it can be shown that = takes its values in [0,1]). Second, for any r in A, there exist pairs (I, J) such that r(I, J) = 1. Lastly, for any pair (I, J) and any r1 and r2 in A, if r1(I, J) ≠ 0 and r2(I, J) ≠ 0 then r1 and r2 are direct neighbors in the graph of Fig. 3.
Fig. 4. Fuzzified Allen relations between two segments I and J of an oriented line. Each relation, except =, is defined by the min of a few membership functions (one for <, >, m, mi, o, oi; three for s, si, f, fi; and two for d and di). x is the length of I (the argument), z is the length of J (the referent), a = min(x, z), b = max(x, z) and y is the signed distance from the end of J to the start of I.
3.2 Fuzzification of Longitudinal Sections
The idea is to consider that if two segments are close enough relative to their lengths, then they should be seen, to a certain extent, as a single segment. Let I be the longitudinal section E_θ(v) of some object E. Assume I is not empty. There exists one set {I_i}_{i∈1..n} (and only one) of mutually disjoint segments such that I = ∪_{i∈1..n} I_i. The indexing can be chosen such that, for any i in 1..n−1, the segment I_{i+1} is after I_i in direction θ. Let J_i be the open interval "between" I_i and I_{i+1}. The longitudinal section I is considered a fuzzy set on Δ_θ(v) with membership function μ_I. For any point M on any I_i, the value μ_I(M) is 1. For any point M on any J_i, the value μ_I(M) is α_i (initially, α_i = 0). Fuzzification of I proceeds by increasing these membership degrees α_i. An example is presented in Fig. 5. Details can be found in (Matsakis and Nikitenko, to appear).

3.3 Coupling F-Histograms with Allen Relations
Consider an Allen relation r and the longitudinal sections A_θ(v) and B_θ(v) of some objects A and B. We are now able to define the value F_r(θ, A_θ(v), B_θ(v))
Fig. 5. Fuzzification of a longitudinal section I. (a) Membership function μ_I before fuzzification. (b) Membership function after fuzzification.
(see the introductory paragraph of Sect. 3). If A_θ(v) = ∅ or B_θ(v) = ∅ then F_r(θ, A_θ(v), B_θ(v)) is naturally set to 0. Assume A_θ(v) ≠ ∅ and B_θ(v) ≠ ∅. Assume r, A_θ(v) and B_θ(v) have been fuzzified as described in Sects. 3.1 and 3.2. There exists a tuple (α_0, α_1, …, α_c) of real numbers such that α_0 = 0 < α_1 < … < α_c = 1 and, for any k in 1..c, there exists one set {I_i^k}_{i∈1..m_k} of segments such that the α_k-cut of A_θ(v) is ∪_{i∈1..m_k} I_i^k, and one set {J_j^k}_{j∈1..n_k} of segments such that the α_k-cut of B_θ(v) is ∪_{j∈1..n_k} J_j^k. For any i in 1..m_k, the length of I_i^k is denoted by x_i^k. For any j in 1..n_k, the length of J_j^k is denoted by z_j^k. The value F_r(θ, A_θ(v), B_θ(v)) is defined as³:

F_r(\theta, A_\theta(v), B_\theta(v)) = \frac{xz}{w} \sum_{k \in 1..c} \sum_{i \in 1..m_k} \sum_{j \in 1..n_k} \left[ x_i^k \, z_j^k \, (\alpha_k - \alpha_{k-1}) \right] r(I_i^k, J_j^k),   (1)

with x = Σ_{i∈1..m_c} x_i^c, z = Σ_{j∈1..n_c} z_j^c and w = Σ_{k∈1..c} Σ_{i∈1..m_k} Σ_{j∈1..n_k} [x_i^k z_j^k (α_k − α_{k−1})]. It can be shown that small changes in the longitudinal sections do not affect F_r(θ, A_θ(v), B_θ(v)) significantly (Matsakis and Nikitenko, to appear). Continuity is satisfied and, hence, robustness is achieved. Moreover, Σ_{r∈A} F_r^AB(θ) measures to what extent the objects are involved in some spatial relationships along direction θ. If this information is judged unimportant, the F_r histograms, of course, can be normalized. Let us denote by ⌈F_r^AB⌉ the histogram F_r^AB after normalization³:

\forall \theta \in \mathbb{R}, \quad \lceil F_r^{AB} \rceil(\theta) = F_r^{AB}(\theta) \Big/ \sum_{\rho \in A} F_\rho^{AB}(\theta).   (2)

For a given direction θ, the normalized F_r histograms define a fuzzy 13-partition of the set of all object pairs, and each class of the partition corresponds to an Allen relation.

³ In Eqs. 1 and 2, we agree that a fraction is 0 if its denominator is 0.
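A direct transcription of Eqs. 1 and 2 can be sketched as follows, assuming the α-cuts of the fuzzified longitudinal sections are already available as lists of segments and that `rel_degree` stands in for a fuzzified Allen relation of Sect. 3.1; the function and variable names are ours, not the authors'.

```python
def F_r(alpha_levels, A_cuts, B_cuts, rel_degree):
    """Eq. 1: weight attached to one pair of longitudinal sections.

    alpha_levels : [alpha_1, ..., alpha_c] with 0 < alpha_1 < ... < alpha_c = 1
    A_cuts[k], B_cuts[k] : lists of segments (start, end) forming the
        alpha_{k+1}-cut of A_theta(v) and B_theta(v)
    rel_degree(I, J) : degree r(I, J) of the Allen relation under study
    """
    length = lambda seg: seg[1] - seg[0]
    # x and z are the total lengths of the 1-cuts (k = c).
    x = sum(length(I) for I in A_cuts[-1])
    z = sum(length(J) for J in B_cuts[-1])
    num, w = 0.0, 0.0
    prev = 0.0  # alpha_0 = 0
    for k, alpha in enumerate(alpha_levels):
        for I in A_cuts[k]:
            for J in B_cuts[k]:
                weight = length(I) * length(J) * (alpha - prev)
                w += weight
                num += weight * rel_degree(I, J)
        prev = alpha
    return 0.0 if w == 0 else (x * z / w) * num

def normalize(histogram_values):
    """Eq. 2: normalized F_r^{AB}(theta) values for one direction theta.
    `histogram_values` maps each of the 13 Allen relations to F_r^{AB}(theta)."""
    total = sum(histogram_values.values())
    if total == 0:
        return {r: 0.0 for r in histogram_values}
    return {r: v / total for r, v in histogram_values.items()}
```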
4 Experiments

In Figs. 6 and 7, a grayscale value is associated with each Allen relation (Fig. 6(a)). The thirteen normalized F_r histograms that represent the extracted directional and topological relationship information are plotted in the same diagram (Fig. 6(b)). The topological relationships along direction θ (on the X-axis) are described by the vector composed of the thirteen ⌈F_r^AB⌉(θ) values (on the Y-axis). Usually, most of these values are zero. The histograms are arranged in "layers." Several synthetic examples and histogram properties are presented in (Matsakis and Nikitenko, to appear).

Fig. 6. (a) Allen relations and attached grayscale values. (b) An example of normalized F_r histograms.
Fig. 7. Mayfly mating sequence captured by Doppler radar and the corresponding normalized Fr histograms.
Figure 7 represents a sequence of National Weather Service Detroit / Pontiac Doppler radar images. The sequence, captured on June 26, 2001, shows a mayfly aerial courtship over St. Clair County, Michigan (http://www.crh.noaa.gov/dtx/mayfly.htm). The argument is the mayfly swarm (light gray) and the referent is St. Clair County (dark gray). In (a), only the relations before and after are present; the swarm is born. In (b), the swarm becomes an unconnected object. The fragments close to and at the county border are responsible for the introduction of the relations s, f, m, mi, o, oi, and d. In (c), the swarm has grown considerably and moved over the county. The relation during clearly dominates at θ ≈ 90° and θ ≈ 270°, with just a tiny bit of before and after caused by the single disjoint fragment below the county border. In (d), the relation equals becomes more prominent for the near-horizontal directions θ ≈ 0° and θ ≈ 180°. Figure 7(e) shows that the only prominent relations are equals, during, starts,
and finishes, which indicates that the swarm object is strictly contained by or is inner adjacent to the referent. Trace amounts of overlaps and overlapped by are still present, as the swarm object "spills over" the county boundary in some places. Note how before and after gradually diminish as we progress from (a) to (e). Figure 7(f) demonstrates that F_r histograms can handle highly irregular and unconnected objects.
5 Towards a Linguistic Description of the Topological Relationships

F_r histograms carry lots of topological relationship information. In (Matsakis et al. 2001), we used force histograms to generate a linguistic description of the relative position between two objects in terms of the four primitive directional relationships ("to the right of," "above," "to the left of," "below"). In future work, we plan to generate a linguistic expression that describes the topological relationships between two objects in terms of the thirteen Allen relations. Consider the direction θ = 0° and the object pair of Fig. 7(d). Is the argument before or contained by (during) the referent? Does it start, finish, or overlap it? All of these relations are present to some degree, but which one(s) give(s) the best description of the topological relationships between the two objects? The generated linguistic description should be terse and, at the same time, capture the essence of the relationships. Ultimately, it will consist of (i) a topological component to depict the relationships in terms of the most prominent Allen relations, (ii) a directional component to provide, if relevant, the direction where these relations hold true, and (iii) a self-assessment component to give an indication of how satisfactory the description is, or how ambiguous the configuration is. These components are not independent of each other. In this section, we focus our discussion on computing a satisfactory index for an arbitrary direction based on the Allen relations present along that direction.

5.1 Coherent Sets of Allen Relations and Satisfactory Indices
A linguistic expression like "A is before B in direction θ" might not describe satisfactorily a given configuration (another Allen relation might capture better the essence of the topological relationships between the two objects; the configuration might be ambiguous), but it certainly sounds coherent (it is easy to picture two such objects A and B). Although more complex, "A is mostly before but partially meets and overlaps B in direction θ" might also sound coherent to the reader. Three Allen relations are involved in this expression, but they do not semantically contradict each other. "A contains and is after B in direction θ" might seem less coherent.
Whether a description sounds coherent or not is, of course, a subjective matter. Let us now formalize this discussion.

5.1.1 Coherent Sets of Allen Relations
We will say that a set of Allen relations is coherent iff it belongs to some subset C of the power set 2^A. The relations within a coherent set are considered not to semantically contradict each other and, therefore, might be used together in a linguistic description. Here are 3 possible choices for C:

C_1 = \bigcup_{r \in A} \{\{r\}\}   (3)
C_2 = C_1 \cup \{\,\{r, r'\} \subseteq A \mid \delta_{rr'} = 1\,\}   (4)
C_3 = C_2 \cup \{\,\{r, r', r''\} \subseteq A \mid \delta_{rr'} = \delta_{rr''} = \delta_{r'r''} = 1\,\}   (5)
In these formulas, δ_rr' denotes the conceptual distance between r and r'. It is the length of the shortest path between r and r' in the graph of Fig. 3. For instance, δ_mm = 0, δ_mo = 1 and δ_mf = 3. The set C1 contains 13 singletons: {<}, {m}, {o}, etc. By choosing C = C1, we indicate that, in our opinion, coherent descriptions cannot involve more than one Allen relation. Therefore, an expression like "A is mostly before but partially meets B in direction θ" should not be produced by the system for linguistic description generation. The set C2 includes C1 and contains all pairs of neighbor relations (like {<, m}, but not {di, >}). In the case where C = C2, the above-mentioned expression might be generated. "A contains and is after B in direction θ," on the other hand, will be rejected by the system. C3 also contains elements like {s, d, eq} (but not {s, d, f}). There are, of course, other possible choices for C. It seems reasonable to state that C should not contain the empty set and should include C1.
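The conceptual distances and the families C1, C2 and C3 can be enumerated mechanically. The sketch below (our own, in Python) computes δ by breadth-first search over the neighborhood graph of Fig. 3, whose edge list is left as a placeholder here, and then applies Eqs. 3 to 5.

```python
from collections import deque
from itertools import combinations

RELATIONS = ["<", "m", "o", "s", "f", "d", "=", "di", "si", "fi", "oi", "mi", ">"]

# Edges of the conceptual neighborhood graph of Fig. 3.  Only two edges are
# shown as placeholders; the full edge list must be read off the figure.
NEIGHBOR_EDGES = [("<", "m"), ("m", "o")]

def conceptual_distances(relations, edges):
    """delta[r][r']: length of the shortest path between r and r' (BFS)."""
    adj = {r: set() for r in relations}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    delta = {}
    for src in relations:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        delta[src] = dist  # pairs unreachable in the placeholder graph are absent
    return delta

def coherent_sets(relations, delta):
    """C1, C2, C3 as in Eqs. 3-5 (families of frozensets of relations)."""
    c1 = {frozenset([r]) for r in relations}
    c2 = c1 | {frozenset(p) for p in combinations(relations, 2)
               if delta[p[0]].get(p[1]) == 1}
    c3 = c2 | {frozenset(t) for t in combinations(relations, 3)
               if all(delta[a].get(b) == 1 for a, b in combinations(t, 2))}
    return c1, c2, c3
```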
5.1.2 Satisfactory Indices

Several linguistic expressions can be associated with the same coherent set of Allen relations. For instance, "A is before B in direction θ" and "A is mostly before B in direction θ" are both associated with {<} (we will not discuss here the problem of finding the most appropriate expression). Again, such descriptions might be coherent, but not satisfactory for the configuration in hand. For any relation r, let v_r denote the value ⌈F_r^AB⌉(θ). Here is the simplest way to attach a satisfactory index V{r} to the coherent set {r} (i.e., to the most appropriate description associated with {r}): V{r} = v_r. One might argue, however, that V{<} should be higher when v_< = 0.7 and v_m = 0.3 (before and meets coexist) than when v_< = 0.7 and v_> = 0.3 (before and after coexist). V{r} = max(0, v_r − Σ_{ρ∈A−{r}} (δ_ρr / 6) v_ρ) is another way to define a satisfactory index. Note that 6 is the maximum possible conceptual distance between two Allen relations. Any relation ρ that coexists with r makes V{r} decrease, and the higher its distance to r, the bigger the decrease. V{r} belongs to the interval [0,1]. It is 1 if and only if v_r is 1 (r is the only relation present), and cannot be 0 if v_r is greater than 0.5. Generalization is easy, and a satisfactory index V_c can be attached to any coherent set c of Allen relations. Here are two possible definitions:

V_c = \sum_{r \in c} v_r ,   (6)
V_c = \max\Big(0, \; \sum_{r \in c} v_r - \sum_{\rho \in A - c} \frac{\delta_{\rho c}}{6}\, v_\rho \Big)   (7)
In Eq. 7, δ_ρc denotes a weighted average conceptual distance between the Allen relation ρ and the coherent set c: δ_ρc = (Σ_{r∈c} v_r δ_ρr) / Σ_{r∈c} v_r. The index V_c is a continuous function of all the v_r values. Moreover, if c = c' ∪ {r_0} with c' another coherent set and r_0 such that v_{r_0} = 0, then V_c = V_{c'}. The transition between different coherent sets is also continuous.

5.2 Examples and Future Work

Consider the object pair in Fig. 7(d) and the direction θ = 0°. Table 1 shows the highest satisfactory index for two different definitions of V_c (Eqs. 6 and 7) and three different choices of C (Eqs. 3 to 5). Here, both definitions agree. In direction θ = 0°, if the topological relationships between the swarm and the county had to be described by exactly one Allen relation, that relation should be equals. The description, however, would not be very satisfactory. A better description would be obtained if equals, finishes and during were considered not to semantically contradict each other. As expected, Eq. 7 gives lower values than Eq. 6 due to the negative influence of the relations outside of the winning coherent sets.

Table 1. Highest satisfactory index for the object pair in Fig. 7(d) and the direction θ = 0°.

              V_c defined by Eq. 6                V_c defined by Eq. 7
  C     max_{c∈C} V_c   argmax_{c∈C} V_c    max_{c∈C} V_c   argmax_{c∈C} V_c
  C1    0.373           {eq}                0.247           {eq}
  C2    0.541           {eq, f}             0.426           {eq, f}
  C3    0.687           {eq, f, d}          0.595           {eq, f, d}

Much work remains to be done before we can generate a linguistic description that captures the essence of the topological relationships between two complex objects in terms of the thirteen Allen relations. First, we intend to find a direction where a description would be most representative. Intuitively, the direction θ_0 we seek maximizes both the highest satisfactory index max_{c∈C} V_c(θ) and the degree Σ_{r∈A} F_r^AB(θ) of object interaction. The linguistic expression generated by the system will involve the Allen relations comprising c_0, the coherent set that maximizes V_c(θ_0). A fuzzy rule base will be used to produce the most appropriate expression given the values ⌈F_r^AB⌉(θ_0). Finally, the self-assessment component of the description will be derived from the satisfactory index V_{c_0}(θ_0).
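Assuming the normalized histogram values v_r and the conceptual distances δ are available (e.g. from the previous sketches), Eqs. 6 and 7 and the selection reported in Table 1 can be sketched as follows; again, the names are ours.

```python
MAX_DELTA = 6  # maximum conceptual distance between two Allen relations

def v_c_simple(c, v):
    """Eq. 6: V_c = sum of v_r over the coherent set c."""
    return sum(v[r] for r in c)

def v_c_penalized(c, v, delta):
    """Eq. 7: Eq. 6 minus a penalty for relations outside c, each weighted by
    its average conceptual distance to c (delta_rho_c) over MAX_DELTA."""
    inside = sum(v[r] for r in c)
    penalty = 0.0
    for rho, v_rho in v.items():
        if rho in c or v_rho == 0:
            continue
        # delta_rho_c: v_r-weighted average distance from rho to the set c
        d_rho_c = (sum(v[r] * delta[rho].get(r, MAX_DELTA) for r in c) / inside
                   if inside > 0 else 0.0)
        penalty += (d_rho_c / MAX_DELTA) * v_rho
    return max(0.0, inside - penalty)

def best_coherent_set(C, v, delta, use_penalty=True):
    """argmax / max of V_c over a family C of coherent sets (cf. Table 1)."""
    score = (lambda c: v_c_penalized(c, v, delta)) if use_penalty \
        else (lambda c: v_c_simple(c, v))
    best = max(C, key=score)
    return best, score(best)
```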
6 Conclusions

The F-histogram is a powerful generic quantitative representation of the relative position between two 2D objects. In this paper, we have considered F_r histograms, which are dedicated to the extraction of directional and topological relationship information. Imprecision is handled and robustness achieved through fuzzy set theoretic approaches. For any direction in the plane, the F_r histograms define a fuzzy 13-partition of all object pairs, and each class of the partition corresponds to an Allen relation. The objects are not necessarily convex, nor connected, and their geometry is not approximated through, e.g., Minimum Bounding Rectangles. Experiments on real data have shown that F_r histograms carry lots of topological relationship information. An index to measure the complexity of the relationships in an arbitrary direction has been developed. This index will play an important role in the generation of linguistic descriptions that capture the essence of the topological relationships between regions in terms of the Allen relations.
Acknowledgments

The authors want to express their gratitude for support from the Natural Sciences and Engineering Research Council of Canada (NSERC), grant 045638.
References

Allen JF (1983) Maintaining Knowledge about Temporal Intervals. Communications of the ACM 26(11):832-843
Dutta S (1991) Approximate Spatial Reasoning: Integrating Qualitative and Quantitative Constraints. International Journal of Approximate Reasoning 5:307-331
Freeman J (1975) The Modeling of Spatial Relations. Computer Graphics and Image Processing 4:156-171
Goodchild M, Gopal S (1990) (Eds.) The Accuracy of Spatial Databases. Taylor and Francis, Basingstoke, UK
Guesgen HW (2002) Fuzzifying Spatial Relations. In: Matsakis P, Sztandera L (Eds.) Applying Soft Computing in Defining Spatial Relations. Studies in Fuzziness and Soft Computing, Physica-Verlag, 106:1-16
Malki J, Zahzah EH, Mascarilla L (2002) Indexation et recherche d'image fondées sur les relations spatiales entre objets. Traitement du Signal 18(4)
Matsakis P (1998) Relations spatiales structurelles et interprétation d'images. Ph.D. Thesis, Institut de Recherche en Informatique de Toulouse, France
Matsakis P (2002) Understanding the Spatial Organization of Image Regions by Means of Force Histograms: A Guided Tour. In: Matsakis P, Sztandera L (Eds.) Applying Soft Computing in Defining Spatial Relations. Studies in Fuzziness and Soft Computing, Physica-Verlag, 106:99-122
Matsakis P, Keller J, Wendling L, Marjamaa J, Sjahputera O (2001) Linguistic Description of Relative Positions in Images. IEEE Transactions on Systems, Man and Cybernetics, Part B 31(4):573-588
Matsakis P, Nikitenko D (to appear) Combined Extraction of Directional and Topological Relationship Information from 2D Concave Objects. In: Cobb M, Petry F, Robinson V (Eds.) Fuzzy Modeling with Spatial Information for Geographic Problems, Springer-Verlag
Matsakis P, Wendling L (1999) A New Way to Represent the Relative Position between Areal Objects. IEEE Transactions on Pattern Analysis and Machine Intelligence 21(7):634-643
Nabil M, Shepherd J, Ngu AHH (1995) 2D Projection Interval Relationships: A Symbolic Representation of Spatial Relationships. SSD '95 (Advances in Spatial Databases: 4th Symposium) pp 292-309
Sharma J, Flewelling FM (1995) Inferences from Combined Knowledge about Topology and Direction. SSD '95 (Advances in Spatial Databases: 4th Symposium) pp 279-291
The Po-tree: a Real-time Spatiotemporal Data Indexing Structure

Guillaume Noël, Sylvie Servigne, and Robert Laurini
Liris, INSA-Lyon, Bat. B. Pascal, 20 av. A. Einstein, 69622 Villeurbanne Cedex, France
{noel.guillaume, sylvie.servigne, laurini}@insa-lyon.fr
Abstract

This paper describes the Po-tree, a new indexing structure for spatiotemporal databases with real-time constraints. Natural risk management and other systems can use arrays of spatially referenced sensors, each of them sending its measurements to a central database. Our solution aims to facilitate the indexing of these data, while favoring the newest ones. It does so by combining two sub-structures for the spatial and temporal components. While mobility is not yet supported, evolutions of the structure should be able to deal with it.

Keywords: Spatio-temporal, Database indexing, Soft real-time, Natural disaster prevention
1 Introduction

Geographic Information Systems provide solutions to a wide panel of problems, from agronomy to urban planning or natural risk management. The databases linked to such systems are usually very large and cumbersome. They have to keep track of numerous heterogeneous data. Solutions exist to address these needs. Yet, a particular aspect remains partially untouched: spatio-temporal indexing with real-time constraints. While our application case is linked to natural disaster prevention, we shall show that the structure we propose can be extended to cover different needs. We propose a new spatio-temporal database access method for spatially referenced sensors, the Po-tree.

The remainder of this paper is organized as follows. First we define more precisely the problem we intend to address by introducing our application case. Next comes a brief state of the art. Finally, we introduce our solution, the Po-tree, and some test results to study its usefulness.
2 Application case

Our study field is linked with the work of volcanologists trying to monitor a Mexican volcano, the Popocatepetl (CENAPRED, 2003). An array of fixed measurement stations, hosting various sensors, has been set up around the volcano. Fifteen stations record data within 1.5 km of the crater. Others are located further down the slopes. Each sensor, spatially referenced as a fixed point, sends measurement data toward a central database. This database is later replicated to a data warehouse. See Figure 1 for a visual description. The Po-tree aims at indexing the database, while keeping in mind some recommendations for environmental data warehousing (Adam et al., 2002).
Fig. 1. Application case
Real-time constraints stem from the number of data to process. Updates occur in a periodic, chronological order. The measurement frequency of a sensor can be up to 100 Hz, as for seismographs. In this respect, update transactions are more important than lookup queries. The volcanologists tend to consider the most recent data as the most valuable, as they help in understanding the current activity of the volcano. Users usually query the database so as to fetch data coming from a specific sensor (spatial location) for a given amount of time (temporal interval). Volcanologists generally use reference sensors so as to determine the global state of the volcano. Later on they query complementary sensors to confirm their analysis. Therefore, most lookups are spatial-point / time-interval. They are followed by range / interval queries, as defined by Erwig (Erwig et al., 1999). The specific needs of this kind of application are now quite well understood. The spatio-temporal access method used should focus on two aspects: the cost of update transactions and the priority given to the newest data. Different studies have already opened the path for real-time, spatial, temporal and spatio-temporal databases.
3 Other studies

Real-time approaches are based on the respect of time constraints, as defined in (Lam & Kuo, 2001). For databases, it means that transactions have to commit before a deadline. In the case of an array of sensors, the deadline is the arrival time of a new measurement.

Spatial approaches, as defined in (Ooi & Tan, 1997), can be divided into three categories:
- The first one tends to linearize the data, represented as points, in order to use well-known indexing structures, such as the B+-tree.
- Another approach uses a non-overlapping native space. The space is divided into non-overlapping sub-spaces, and the objects are referenced within these sub-spaces. An object spanning two sub-spaces may be duplicated or clipped.
- The last approach is to use an overlapping native space. The space is divided into overlapping sub-spaces.

Among the main indexes, two can be noted. The R-tree (Guttman, 1984) family uses minimum bounding structures as sub-spaces to create a hierarchy accessible through a B-tree. In the Kd-tree (Bentley, 1975), a binary tree, points are sorted according to reference points and reference dimensions.

In temporal approaches, the notion of time can lead to the use of balanced trees, such as the AP-tree (Gunadhi & Segev, 1993). Another approach is simply to consider the time dimension as another spatial dimension. This distinction has led to different spatio-temporal approaches, as defined in (Wang et al., 2000). Objects can be considered as:
- objects that continuously move,
- objects that discretely change, such as in our case,
- or continuous changes of movement.

Different families of access methods can be defined:
- Time can be considered as another spatial dimension, as in the 3D R-tree case (Theodoridis, Vazirgiannis & Sellis, 1996).
- Multiversion tables can be kept to track the data, as for the MVLQ (Tzouramanis, Vassilakopoulos & Manolopoulos, 2000).
- Overlapping snapshots can be used as an alternative to multiversioning; HR-trees are a good example (Nascimento & Silva, 1998).
- A dimension can be prioritized over another, as the spatial priority given in RT-trees (Xu, Han & Lu, 1990).

None of these approaches is completely compatible with our case study. R-trees require lengthy construction times, and these approaches usually do not focus on the most recent data. However, some of their ideas have led to the development of a new tree: the Po-tree.
Fig. 2. Po-tree structure
4 Po-Tree

Our solution, the Po-tree, is based on the differentiation of temporal and spatial data, with a focus given to the latter. The spatial aspect is indexed through a Kd-tree, while the temporal aspect uses modified B+-trees (Figure 2). The measurement stations being immobile, the current structure does not allow mobile sources of information, i.e. mobile sensors. Every spatial location, similar to a spatial object (sensor), is directly linked to a specific temporal tree. Queries shall first determine the spatial nodes concerned by a transaction and later on determine the temporal nodes.

4.1 Po-tree Structure

The Po-tree can be divided into two parts. A first sub-structure covers the spatial aspect. Each location, each sensor, is linked through this spatial sub-tree to a temporal sub-tree.

4.2 Spatial Component

Kd-trees are simple spatial structures but not perfect ones. Their main problem is the fact that their final shape relies on the data insertion order: if data are entered in different orders, the final trees may have different shapes. However, Index Concurrency Control methods, originally designed for B-trees with real-time constraints, can easily be adapted to cover Kd-trees. Latches, 'fast locks', can be converted to be used on binary trees. Different tests have also proven that Kd-trees fare reasonably well compared to R-trees for small numbers of data (Paspalis, 2002).

Within a Kd-tree, entries take the form <…>. In order to create a Kd-tree, a reference dimension D and a reference point Pi are taken. Then, all points with a greater value for the dimension D than Pi fall to the right part of the branch, and those with a lesser value to the left. At the next level of the tree, other reference points Pi+1 and Pi+2 are taken, and the reference dimension becomes D+1. In the Po-tree, each spatial point is composed of a spatial definition of the point and a link to a temporal sub-tree. It does not directly record the temporal components. Therefore, a whole definition of a spatial sub-tree entry would be <…>. As spatial deletes should not occur, there shall be no empty spatial nodes.
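A minimal sketch of the spatial component follows: a two-dimensional Kd-tree whose nodes carry the sensor position and a pointer to the corresponding temporal sub-tree (represented here by a plain list). The layout is illustrative only and does not reproduce the paper's exact entry format.

```python
class SpatialNode:
    """One sensor position in the Kd-tree; `temporal` stands in for the
    pointer to the sensor's temporal sub-tree (a plain list here)."""
    def __init__(self, point):
        self.point = point        # (x, y) of the fixed sensor
        self.temporal = []        # placeholder for the temporal sub-tree
        self.left = None          # smaller value on the reference dimension
        self.right = None         # greater or equal value

def insert(root, point, depth=0):
    """Insert a sensor position, alternating the reference dimension at each
    level (x at even depths, y at odd depths).  Returns (root, node); an
    already-indexed position is reused, since spatial deletes do not occur."""
    if root is None:
        node = SpatialNode(point)
        return node, node
    if root.point == point:
        return root, root
    dim = depth % 2
    if point[dim] < root.point[dim]:
        root.left, node = insert(root.left, point, depth + 1)
    else:
        root.right, node = insert(root.right, point, depth + 1)
    return root, node

def find(root, point, depth=0):
    """Point lookup: follow the reference dimensions down to the sensor."""
    if root is None or root.point == point:
        return root
    dim = depth % 2
    child = root.left if point[dim] < root.point[dim] else root.right
    return find(child, point, depth + 1)

# Usage sketch: index three fixed sensors and fetch one of them.
root = None
for p in [(2.0, 3.0), (5.0, 1.0), (1.0, 7.0)]:
    root, node = insert(root, p)
sensor = find(root, (5.0, 1.0))
sensor.temporal.append((0.0, 42.0))   # (timestamp, measurement)
```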
4.3 Temporal Component

The temporal components to index are held within modified B+-trees. The tree records the measurement times and links to the actual measured values. Each spatial object is linked to a different temporal sub-tree.

As for the temporal aspect, it has been noticed that the most recent data are considered of higher interest than older data. It has also been noticed that data insertions generally take place at the rightmost leaf of the temporal sub-structure, where the newest nodes are found. Therefore, the temporal sub-tree has been modified to add a direct link between the root and the last node. While maintaining this link requires minimal computation from the system, a simple test during query processing avoids traversing the whole tree to append or to find the requested data. This direct link, updated during the data insertion procedure, is useful to save processing time.

As each temporal sub-tree is linked to an object, it is possible to develop a secondary structure in order to directly access the data of specific objects, without the need to search through their spatial position. This would however require the addition of an Information Source identifier to the temporal sub-structure. It could be useful for the notion of a hierarchy of information sources.

As most, if not all, of the updates take place at the rightmost part of the temporal sub-tree, the fill factor of leaf nodes can be set higher than usual. Delete transactions should be somewhat rare under normal conditions, and a posteriori updates should be just as rare. The exception is when the transmitting systems experience lag due to network problems between the sensors and the database, or when the database has to restart update transactions, which can occur because of real-time constraints and node access concurrency control. Therefore, split and merge procedures can be changed so that the nodes can be filled almost to their maximum capacity.

Within the temporal sub-tree, the data are indexed according to their start time. The periodicity of the sensors, and the assumption that they shall not go down unnoticed, lead us to consider the start time only. For lower frequencies the end time should be included as well, in order to define the data lifetime. So far, entries take the shape of <pointer-0; key-1; pointer-1; key-2...; pointer-n>. Furthermore, a direct link between the root and the last node accelerates the query processing of the newest data. A simple test between the value to process and the first key of the last node determines whether the query is related to the most recent data or to older data. If the value is greater than the key, the last node is directly returned, searched
or updated. This greatly helps in lowering the processing time by preventing lengthy tree traversals.

4.4 Spatio-temporal linking

Requests on the Po-tree can be divided into three different parts: spatial localization, Information Source linking and temporal localization. Lookups start by searching the spatial tree. A point lookup directly fetches the Information Source and its temporal sub-tree; from there on, the lookup query searches the temporal sub-tree. A range lookup starts by determining the different Information Sources within the spatial range and then starts a temporal lookup for each of them. The queries are answered by giving the specific results of the Information Sources one by one.

For insertions, the transaction starts by defining the spatial position of the inserted data. If the position is already defined, the transaction can directly proceed with its temporal part. If the position is not defined, the transaction starts by inserting the spatial point in the sub-tree. After this first step, it creates a new temporal sub-tree and links it to a new Information Source. Information Source linking references the transition between the spatial and the temporal sub-trees, as each of them holds a part of the indexed data.

Temporal queries start by comparing the timestamp, or the end of the interval, with the first temporal key of the last node. This node is directly accessible through the root of the tree. If the query deals with recent data, it can act directly on the last node. In other cases, the query then proceeds with a B+-tree lookup. Temporal intervals are dealt with by first finding the end of the interval in the tree and by using the node links to cover the entire interval length. Queries use the B+-tree rules. For updates however, if a new leaf node is created to the right of the tree, the link between the root and the last node is updated accordingly.

This configuration implies that the Po-tree is more specifically designed for queries on the most recent data. Spatial range / temporal interval requests that do not end at the present time do not fully take advantage of the specificities of the tree.
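The temporal component and its root-to-last-node shortcut can be sketched with a simplified stand-in for the modified B+-tree: chained fixed-capacity leaves plus a direct `last` pointer, under the assumption that updates arrive mostly in chronological order. This is our own simplification, not the authors' implementation.

```python
import bisect

class TemporalIndex:
    """Simplified stand-in for the modified B+-tree of Sect. 4.3: leaves of
    fixed capacity chained left-to-right, plus a direct `last` shortcut so
    appends and queries on the newest data skip the tree traversal.
    Assumes measurements arrive mostly in chronological order."""
    LEAF_CAPACITY = 64

    def __init__(self):
        self.leaves = [[]]            # each leaf: sorted list of (time, value)
        self.last = self.leaves[-1]   # direct link kept by the root

    def insert(self, timestamp, value):
        # Test against the first key of the last leaf: chronological updates
        # go straight to the rightmost leaf without traversing the structure.
        if not self.last or timestamp >= self.last[0][0]:
            leaf = self.last
        else:                         # late arrival: locate the proper leaf
            firsts = [l[0][0] for l in self.leaves if l]
            i = bisect.bisect_right(firsts, timestamp) - 1
            leaf = self.leaves[max(i, 0)]
        bisect.insort(leaf, (timestamp, value))
        if len(leaf) > self.LEAF_CAPACITY and leaf is self.last:
            # Open a fresh rightmost leaf (real code would split per B+-tree
            # rules) and update the root-to-last-node link.
            self.leaves.append([])
            self.last = self.leaves[-1]

    def interval(self, t_start, t_end):
        """Time-interval lookup: scan leaves from the newest backwards and
        stop as soon as a leaf ends before the start of the interval."""
        out = []
        for leaf in reversed(self.leaves):
            if leaf and leaf[-1][0] < t_start:
                break
            out.extend((t, v) for t, v in leaf if t_start <= t <= t_end)
        return sorted(out)
```

In a full Po-tree, a lookup would first locate the sensor in the spatial Kd-tree of the previous sketch and then run `interval` on the temporal index attached to that node.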
5 Tests

Different tests have been conducted comparing the Po-tree with R-tree and R*-tree structures (Hadjieleftheriou, 2003). Randomly generated data have been sequentially issued to a fixed number of random points acting as Information Sources. Tests have been conducted changing the total number of data to index (1,000-200,000), the number of information sources used (10-5,000) and the portion of the base to scan for interval queries (the last 5-30 percent). The tests have been run on a 1.6 GHz, 128 MB RAM computer running Linux. The programming was done under Java SDK 1.2.
Fig. 3. Po-tree Construction Time
The first notable fact about the Po-tree comes from its construction time (see Figure 3). Tests with a fixed set of 30 sensors, compatible with our test case, along with 200,000 updates (batch) show a linear building time. From this point, it is interesting to examine the effect of the number of different positions on queries, as we use two complementary structures to obtain our Po-tree (see Figure 4). The tests have dealt with a fixed total set of data and a varying number of spatial positions, from 50 to 5,000. The actual variation of the construction time (similar results have been found for lookups) is steady, increasing by steps. Note that the Kd-tree performance is linked to the order of insertion of the data, as the shape of this tree is not deterministic.
Fig. 4. Influence of the number of sources

Complementary tests between the Po-tree and the R*-tree have shown some interesting properties of the new structure. First of all, the construction time is far smaller for the Po-tree. To index (batch) 25,000 points from 30 information sources, the R*-tree takes some 45 seconds, while the Po-tree takes less than one. This can be partly explained by the fact that the R*-tree has to deal with Minimum Bounding Structures, while the Po-tree simply has to determine which B+-tree to append to. While the R*-tree must consider the whole set of data when answering queries, the Po-tree segments the dataset into spatially distinct sub-parts.
200 Po RI
150
R* RI
100 50
67
61
55
49
43
37
31
25
19
13
7
0 1
Time (ms)
250
Number of Objects (x1000)
Fig. 5. Range-Interval Search Time
As for lookups, the R*-tree performs slightly better for small numbers of objects, but this trend soon changes when there are more than 5,000 objects (see Figure 5). This can be explained by the fact that the R*-tree must work with the whole set of data while the Po-tree, thanks to a first spatial filtering, only has to consider a part of the set. The more data are processed, the bigger the differences become. This has been verified for spatial point / temporal point, point / interval and range / interval queries. The results obtained have shown that the Po-tree is compatible with the constraints set by our application case: favoring the newest data, processing big quantities of data in a given time, a fixed set of spatial sources, and the possibility of use in a real-time system. Even though mobility is not yet easily managed, the Po-tree meets the initial specifications.
6 Conclusion

The Po-tree aims at indexing spatio-temporal data issued from a network of spatially referenced sensors, with a focus given to the newest data. Our goal was to accelerate answering time for real-time queries. Our application case is linked to volcanic monitoring, yet it can be extended to include other natural disaster prevention scenarios, or scenarios where a set of fixed spatially referenced sensors sends huge quantities of data to a central database.

The structure of the Po-tree uses two parts, and the main difference with existing solutions stems from this division. The Po-tree uses the spatial dimension to divide the dataset into temporal sub-trees. The spatial sub-tree references the positions of fixed sensors, the Information Sources. Each position, each sensor, is then linked to a temporal sub-tree pointing to the actual data. The spatial sub-tree is based on the Kd-tree, compatible with Index Concurrency Control protocols yet sensitive to the insertion order, while the temporal tree is a modified B+-tree, akin to an AP-tree. Different tests have shown that this solution can be used with ease when updates from a given set of sensors are frequent. The lookup strategy has been designed to favor the newest data thanks to a direct link between the root of the temporal sub-tree and its last node. The notion of Information Source allows fast interval queries, as the data from a given source are linked through the temporal sub-tree.

Further developments of the structure will include mobility and the use of Quadtrees to replace the current spatial sub-tree. Another point to be studied is the linkage of the database to the data warehouse. The Po-tree has been designed with specific needs in mind, but it can be easily
adapted to cover a majority of natural disaster cases where fixed sensors are the main source of information.

Acknowledgments should be made to the Universidad de las Americas, Puebla, for their work on the Popocatepetl monitoring. They coordinate and greatly help the different research efforts based on this volcano.
References

Adam N., Atluri V., Yu S. & al., 2002, Efficient Storage and Management of Environmental Information, in Proceedings of the 19th IEEE Symposium on Mass Storage Systems (USA, Maryland)
Bentley J.L., 1975, Multidimensional binary search trees in database application, IEEE Transactions on Software Engineering, 5(4), 333-340
Bliujute R., Jensen C.S., Saltenis S. & al., 2000, Light-Weight Indexing of Bitemporal Data, in Proceedings of the 12th International Conference on Scientific and Statistical Database Management (Germany, Berlin), pp. 125-138
CENAPRED, 2003, Monitoreo y Vigilancia del Volcan Popocatepetl, http://tornado.cenapred.unam.mx/mvolcan.html
Erwig M., Güting R.H., Schneider M. & al., 1999, Spatio-Temporal Data Types: An Approach to Modeling and Querying Moving Objects in Databases, GeoInformatica, 3(3), 269-296
Guttman A., 1984, R-trees: a dynamic index structure for spatial searching, in Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data (USA, Boston), pp. 47-57
Hadjieleftheriou M., 2003, Spatial Index Library, http://www.cs.ucr.edu/~marioh/spatialindex/
Haritsa J.R., Seshadri S., 2001, Real-time index concurrency control, in Real Time Database Systems – Architecture and Techniques, edited by K.Y. Lam and T.W. Kuo (Boston: Kluwer Academic Publishers), ISBN 0-7923-7218-2, pp. 60-74
Lam K.Y., Kuo T.W., 2001, Real time database systems: an overview of systems characteristics and issues, in Real Time Database Systems – Architecture and Techniques, edited by K.Y. Lam and T.W. Kuo (Boston: Kluwer Academic Publishers), ISBN 0-7923-7218-2, pp. 4-16
Lam K.Y., Kuo T.W., Tsang N.W.H., & al., 2000, The reduced ceiling protocol for concurrency control in real-time databases with mixed transactions, The Computer Journal, 43(1), 65-80
Mokbel M., Ghanem T.M. & Aref W.G., 2003, Spatio-temporal Access Methods, IEEE Data Engineering Bulletin, 26(2), pp. 40-49
Nascimento M. & Silva J., 1998, Towards Historical R-trees, in Proceedings of the 1998 ACM Symposium on Applied Computing (USA, Atlanta), pp. 235-240
Ooi B.C., Tan K.L., 1997, Temporal databases, in Indexing Techniques for Advanced Database Systems, edited by E. Bertino, B.C. Ooi, R. Sack-Davis & al. (Boston: Kluwer Academic Publishers), ISBN 0-7923-9985-4, 113-150
Paspalis N., 2003, Implementation of Range Searching Data-Structures and Algorithms, http://www.cs.ucsb.edu/~nearchos/cs235/cs235.html
Theodoridis Y., Vazirgiannis M. & Sellis T., 1996, Spatio-temporal indexing for large multimedia applications, in Proceedings of the 3rd IEEE Conference on Multimedia Computing and Systems (Japan, Hiroshima)
Tzouramanis T., Vassilakopoulos M. & Manolopoulos Y., 2000, Multiversion Linear Quadtrees for Spatio-temporal Data, in Proceedings of the 4th East-European Conference on Advanced Databases and Information Systems (Czech Republic, Prague), pp. 279-292
Xu X., Han J. & Lu W., 1990, RT-Tree: an improved R-tree indexing structure for temporal spatial databases, in Proceedings of the 4th International Symposium on Spatial Data Handling (Switzerland, Zurich), pp. 1040-1049
Wang X., Zhou X., Lu S., 2000, Spatiotemporal Data Modeling and Management: A Survey, in Proceedings of the 36th International Conference on Technology of Object-Oriented Languages and Systems (China, Xi'an), pp. 202-221
Empirical Study on Location Indeterminacy of Localities

Sungsoon Hwang and Jean-Claude Thill
Department of Geography, State University of New York at Buffalo, 105 Wilkeson Quad, Buffalo NY 14261, U.S.A.
Abstract Humans perceive the boundary of locality vaguely. This paper presents how the indeterminate boundaries of localities can be represented in GIS. For this task, indeterminate boundaries of localities are modeled by a fuzzy set membership function in which generic rules on geospatial objects are incorporated. Georeferenced traffic crash data reveal that police officers identify localities precisely at best 88% of the time. An empirical analysis indicates that people are 6% more confident in identifying urban localities than rural localities. As a conclusion, fuzzy set theory seems to provide a reasonable mechanics to represent vague concept of geospatial objects. The comparison of urban versus rural localities with respect to location indeterminacy suggests that neighborhood types may affect the way humans acquire spatial knowledge and forge mental representations of it. Keywords: GIS, uncertainty, fuzzy set theory, mental maps
1 Introduction Humans perceive localities on a daily basis but perception is likely to be vague, particularly when it comes to boundaries of localities. In contrast to vague human perception of localities, data and operations used in a geographic information system (GIS) are dominantly based on crisp sets. Due to the discrete nature of computing environments, vague concepts prevalent in spatial objects (e.g. nearness and other qualitative relations) are usually forced into discrete constructs. For example, a vaguely stated phrase such as “near Chicago” is usually only georeferenced either in or outside of Chicago in GIS, although it can be better seen as confidence interval.
272
Sungsoon Hwang, and Jean-Claude Thill
terval. Accordingly, if georeferencing is broadly understood as pinpointing the most plausible location intended by users, the reliability of georeferencing depends on mental maps of spatial entities and of their mutual spatial relationships. The problem is that such mental maps lack sharp boundaries of localities so that the question comes down to how to best model such indeterminate boundaries of localities given the crisp nature of common spatial datasets. There are two possible approaches to modeling indeterminate boundaries of localities (Robinson 1988). The first approach is purely empirical. In short, empirical evidence can be directly collected from those who perceive localities and weaved together to compose boundaries of localities, provided that a certain degree of agreement (i.e. consistently overlapping boundaries) is met (Montello et al 2003). Second, a hypothetical model can be adopted where it is assumed that the location indeterminacy of locality is governed by some generally accepted rules (e.g. spatial autocorrelation). In this study, the second approach is chosen over the first one because the first approach is not feasible when considering the purpose of this study – examining location indeterminacy across multiple localities. In this study, fuzzy logic (Zadeh 1965) is applied to geographic databases where fuzzy set membership of a locality is determined by (1) absolute distance measures, (2) qualitative spatial relations, and (3) scale associated with the locality. The model of fuzzy regions allows a vaguely-conveyed locality to be georeferenced while it could be considered inadequate to georeferencing by a conventional method. When traffic crash data are georeferenced on the basis of a fuzzy locality model, the georeferenced data can be used to examine the gap between what is perceived and what physically exists. In other words, it can help us delineate possible locality boundaries represented in mental maps. The location indeterminacy of locality is measured as the ratio of locality entities that are georeferenced outside of their actual boundaries relative to records believed to be in the boundary. The variation of location indeterminacy may be associated with some characteristics of localities. Analysis of variance examines if location indeterminacy can vary by different neighborhood types such as urban versus rural area. This study can help address the following related research questions: Can we represent the perceptual environments (e.g. mental maps) in a crisp-set based computing environment (e.g. GIS)? Is a fuzzy-set approach promising in accomplishing this task? What kind of factors do we have to take into account to represent mental maps in GIS? How can it be implemented in GIS? To what extent are humans confident in identifying the boundary of a locality? Does the level of confidence vary with certain characteristics of localities (such as urban versus rural)?
Empirical Study on Location Indeterminacy of Localities
273
The research objectives of this study are twofold: One is to model indeterminate boundaries of localities by applying the concept of fuzzy sets to geographic databases; the second is to examine if the location indeterminacy of localities significantly varies by the urban versus rural settings. The rest of this paper is organized as follows: Sect. 2 reviews background issues for the problems in hand. In Sect. 3, we will discuss how to model indeterminate boundaries of localities in GIS. Sect. 4 presents the results of an empirical analysis to compare the mean of location indeterminacy between urban and rural areas. In Sect. 5, analysis results are interpreted and the project is summarized.
2 Background Let us consider traffic crash sites where police officers compile required information in an accident report. The typical way of recording accident location is either by means of a linear referencing system (LRS) for highway accidents (which is not considered in this study), or by reporting the names of the roadway and locality where the crash occurred. As illustrated in Fig. 1, an accident report does not necessarily reflect what is out there as it actually is; In particular, accident location is captured through the cognitive and perceptual filter of filing officers. Moreover, police officers are often forced to make crisp judgment on vague perception and cognition in response to rigid requirements of coding forms. This study is particularly focused on the vague perception and cognition of locality boundaries. Given the spatial resolution of data used in this study, the type of locality we address is confined to local jurisdiction (e.g. City of Buffalo). The central issue of this study is depicted in Fig. 1. In the lower right corner of this figure, the CITY column suggests a spatial relation in, but it can be argued that near also applies under uncertainty.
274
Sungsoon Hwang, and Jean-Claude Thill
Fig. 1. How accident location is recorded: losing certainty The example of accident coding form is adapted from NHTSA (1995)
Mental maps can be seen from different levels of abstraction – idiosyncrasy versus general principles. As for idiosyncrasy, mental maps vary with individuals – their unique experiences, preferences, and the level of familiarity with areas (Gould and White 1986, Thill and Sui 1993). Thus, it is hardly amenable to generalization. Similarly, mental maps may be affected by unique characteristics of surroundings. For instance, people may find it easier to identify the boundary of localities surrounded by a water body (e.g. island, peninsular) or high mountains (e.g. river basin). Besides, it is easy to identify a locality with salient characteristics (e.g. Paris with Eiffel Tower). All these exemplify specifics of cognitive mapping of locality. In contrast, properties inherent to a geospatial object determine the degree to which its boundary is perceived with vagueness. First, the degree of belonging to a certain locality declines as the distance to the locality increases (similar to Tobler’s First Law of Geography). That is, it is a function of Euclidean distance or some other measure of spatial separation (Yao and Thill 2004). This is widely accepted, but this alone cannot explain irregular or asymmetric form of indeterminate boundaries. Second, it is affected by spatial qualitative relation. In Fig. 2, suppose that A and B are within the indeterminate boundaries of the locality of Syracuse. Location A may be perceived less near Syracuse than B due to the intervening locality of Geddes between A and Syracuse (Ullman 1956). Third, it is scale-dependent (Goodchild 2001). The size or hierarchy of referents matters (Gahegan 1995). For example, Buffalo, NY is near Syracuse, NY at a regional scale but the two localities cannot be considered near at the local
Empirical Study on Location Indeterminacy of Localities
275
scale. The three properties mentioned above are key determinants of location indeterminacy of locality.
Fig. 2. The effect of qualitative spatial relation on nearness
It can be noted that this study addresses localities with official recognition in contrast to localities with informal recognition such as vernacular regions (e.g. Midwest) (Zelinsky 1980). Moreover, accident reports are filled in by police officers who are presumably familiar with surrounding areas, so that the variation of mental maps on an individual basis is reasonably controlled for. In that sense, the results of this study should not simply be seen as representative of a statistical sample, but as conservative when it comes to location indeterminacy of locality.
3 Modeling Location Indeterminacy of Locality

Let us consider locality l to be a kind of fuzzy region. As such, it is denoted as Ãl. Ãl is composed of the following three parts: Core, Boundary, and Exterior (also known as the egg-yolk model; Cohn and Gotts 1996). These parts are defined as crisp regions (regular closed sets) denoted by regc. Also let ℝ² denote the two-dimensional geographic space, and let the fuzzy-set membership function of Ãl be μ_Ãl. Ãl and μ_Ãl are defined as follows:

Ãl = Core(Ãl) ∪ Boundary(Ãl) ∪ Exterior(Ãl)
Core(Ãl) = regc({(x,y) ∈ ℝ² | μ_Ãl(x,y) = 1})
Exterior(Ãl) = regc({(x,y) ∈ ℝ² | μ_Ãl(x,y) = 0})
Boundary(Ãl) = regc({(x,y) ∈ ℝ² | 0 < μ_Ãl(x,y) < 1})

The core identifies the part of the region that definitely belongs to Ãl. The exterior determines the part that definitely does not belong to Ãl. The indeterminate character of Ãl is summarized in the boundary of Ãl in a unified and simplified manner (Erwig and Schneider 1997; Schneider 1999). The core and boundary can be adjacent with a common border, and the core and/or boundary can be empty. When the boundary is an empty set, Ãl becomes a crisp region. Thus, a crisp region is a special case of a fuzzy region. The question boils down to delineating the nonempty set boundary, which is illustrated in Fig. 2.

The function named FirstOrderHNGroup(x) is defined on the set of neighbors which "meet" locality x, where each neighbor has the same spatial resolution level as x (HN is the abbreviation of HorizontalNeighbor, in the sense that the function results in a group with the same resolution level as x). For example, x and its considered neighbors should belong to the same level of administrative boundary (e.g. city, county, state). SecondOrderHNGroup(x) is defined on the set of neighbors which "meet" FirstOrderHNGroup(x), at the exclusion of x. The term "meet" is one of the spatial relation predicates defined in Egenhofer and Franzosa (1991) to extend Allen's (1983) temporal logic to the spatial domain. The exterior of FirstOrderHNGroup(x) becomes the 0.5-cut boundary. Accordingly, the fuzzy set membership value for the boundary can be redefined as follows:

μ_Ãl(x,y) ∈ [0, 0.5] if (x,y) is in SecondOrderHNGroup(l)
μ_Ãl(x,y) ∈ [0.5, 1] if (x,y) is in FirstOrderHNGroup(l)

To compute the continuous and linear¹ fuzzy-set membership value in Boundary(Ãl), we create a Delaunay triangulation whose nodes are comprised of any vertices on the core, the 0.5-cut boundary, and the exterior. Fig. 3 illustrates two nodes on the 0.5-cut boundary and one node on the core. The membership value is obtained by intersecting a vertical line with the plane defined by the three nodes of the triangle, as shown in Fig. 3.

¹ Some studies on nearness suggest that the relationship between distance and nearness can be approximated by a linear relationship (Gahegan 1995). An s-shaped function has also been proposed (Worboys 2001).
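Assuming a topological "meets" predicate between localities of the same administrative level is available (abstracted below as an adjacency mapping), the neighbor groups and the coarse membership bands can be sketched as follows; the function and variable names are ours.

```python
def first_order_hn_group(locality, adjacency):
    """FirstOrderHNGroup(x): localities of the same resolution level that
    'meet' x.  `adjacency` maps each locality to the set of its neighbors."""
    return set(adjacency.get(locality, set()))

def second_order_hn_group(locality, adjacency):
    """SecondOrderHNGroup(x): localities meeting FirstOrderHNGroup(x),
    excluding x itself and the first-order group."""
    first = first_order_hn_group(locality, adjacency)
    second = set()
    for n in first:
        second |= adjacency.get(n, set())
    return second - first - {locality}

def membership_band(point_locality, target, adjacency):
    """Coarse membership band for a point known to fall in `point_locality`,
    before the TIN interpolation refines it to a single value: 1 in the core,
    [0.5, 1] in the first-order group, [0, 0.5] in the second-order group,
    0 beyond the exterior."""
    if point_locality == target:
        return (1.0, 1.0)
    if point_locality in first_order_hn_group(target, adjacency):
        return (0.5, 1.0)
    if point_locality in second_order_hn_group(target, adjacency):
        return (0.0, 0.5)
    return (0.0, 0.0)
```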
Fig. 3. TIN surface created to interpolate the fuzzy set membership value
Fig. 4. Fuzzy set membership of locality Buffalo, NY
In Fig. 3, the generalized equation for linear interpolation of a point (x, y, z) in a triangle facet is Ax + By + Cz + D = 0, where A, B, C, and D are constants determined by the coordinates of the triangle's three nodes. Thus, the fuzzy set membership value can be obtained from the following equation given the x- and y-coordinates:

μ_Ãl(x,y) = (−Ax − By − D) / C.

As illustrated in Fig. 4, a continuous membership value between 1 and 0 is computed in the indeterminate boundary (Wang and Hall 1996; Stefanakis et al 1999). In Fig. 4, the locality Buffalo has a full membership in its core, a partial membership in its boundary (e.g. Amherst), and no membership beyond the exterior (e.g. Alden).
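The interpolation itself is a small computation once the three TIN nodes (x, y, membership) of the enclosing facet are known; a sketch, with the plane normal obtained from a cross product:

```python
def plane_coefficients(p1, p2, p3):
    """Coefficients (A, B, C, D) of the plane Ax + By + Cz + D = 0 through
    three TIN nodes p = (x, y, membership)."""
    ux, uy, uz = (p2[0] - p1[0], p2[1] - p1[1], p2[2] - p1[2])
    vx, vy, vz = (p3[0] - p1[0], p3[1] - p1[1], p3[2] - p1[2])
    A = uy * vz - uz * vy          # normal vector = u x v
    B = uz * vx - ux * vz
    C = ux * vy - uy * vx
    D = -(A * p1[0] + B * p1[1] + C * p1[2])
    return A, B, C, D

def interpolate_membership(x, y, p1, p2, p3):
    """Fuzzy-set membership at (x, y) inside the facet:
    mu(x, y) = (-Ax - By - D) / C."""
    A, B, C, D = plane_coefficients(p1, p2, p3)
    if C == 0:                     # degenerate (vertical) facet
        return None
    return (-A * x - B * y - D) / C

# Example with one core node (membership 1) and two 0.5-cut nodes, as in Fig. 3.
print(interpolate_membership(0.5, 0.25, (0, 0, 1.0), (1, 0, 0.5), (0, 1, 0.5)))
```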
4 Empirical Analysis of Location Indeterminacy of Localities

This section presents the location indeterminacy revealed in a time series of 8631 traffic crashes that occurred between 1996 and 2001 in New York State and compiled at the accident level in the Fatal Accident Reporting System (FARS) (NHTSA 1995). Of these, 5460 cases² are considered for examining location indeterminacy. First, crash records in the dataset are georeferenced by taking into account uncertainty (e.g. incompatible names of roadways between FARS and the reference database, nearness implicit in identifying locality) (Hwang and Thill 2003). Second, the geographic location of a georeferenced record is compared with the locality information given in the original FARS data record. The discrepancy between them may indicate the location indeterminacy of the concerned locality. If a record is georeferenced exactly in the locality as given in the original data (i.e. no discrepancy), the fuzzy set membership becomes 1. If a record is georeferenced to the indeterminate boundary of the locality (that is, near the locality), the fuzzy set membership becomes a value between 0 and 1 based on the definition of the fuzzy set membership function described in Sect. 3.

Fig. 5 graphs the 5460 cases by reference data (Place_PL or Place_PT³) and locality matching criteria (In or Near). Near-cases account for 12.4% (677/5460). This means that police officers misidentify the locality of car crashes 12.4% of the time. But it should be noted that this is a very conservative estimate because the total count of 5460 usable cases only accounts for cases that are georeferenced with reasonably good quality (i.e. data of poor quality are more likely to yield near-cases).

² Excluded cases are composed of 2053 cases georeferenced by LRS, 614 cases with an invalid code for locality matching (e.g. null or incorrect code), and non-geocodable cases (504).
³ Place_PL corresponds to County Subdivisions in U.S. Census terms. On the other hand, the polygon boundaries of Place_PT are usually not well-defined. Thiessen polygons drawn around centroids derived from hardcopy maps are used as a proxy of polygon boundaries if unknown.
Fig. 5. Probability of pinpointing localities exactly (in) versus roughly (near). Source: FARS accident-level data (1996-2001, New York State)
In the third part of the analysis, the location indeterminacy of locality i is computed as follows: U_i = 1 − (Σ P_i)/n, where P_i is the fuzzy set membership value of locality i for each record, and n is the total number of records that are reported to occur in locality i. To illustrate this, consider the three different scenarios in Fig. 6. A fuzzy set membership value of 1 is assigned to accidents within the core, and a value in [0,1] is assigned to accidents in the boundary. It is shown that B has the lowest location indeterminacy, 5%, while C has the highest location indeterminacy, 42%.
Fig. 6 Illustration of computation of location indeterminacy of locality
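A sketch of the per-locality computation, with a hypothetical list of membership values:

```python
def location_indeterminacy(memberships):
    """U_i = 1 - (sum of fuzzy memberships) / n for the n crash records
    reported to have occurred in locality i."""
    if not memberships:
        return None                      # no records for this locality
    return 1.0 - sum(memberships) / len(memberships)

# Hypothetical example: eight crashes georeferenced in the core (1.0) and
# two in the indeterminate boundary of the locality.
print(location_indeterminacy([1.0] * 8 + [0.8, 0.6]))   # approximately 0.06
```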
Finally, localities are classified as urban or rural⁴ in an attempt to examine whether location indeterminacy varies by neighborhood type. The average number of fatal crashes that occurred during the study period in rural areas is 2 (612 accidents / 298 localities), while that in urban areas is 16 (3822 accidents / 246 localities). The mean value of location indeterminacy for rural localities turns out to be 0.1165, which compares to 0.0966 for urban localities. At first glance the difference appears to be trivial, but interpretation of the location indeterminacy rate of rural areas requires some caution because of the small number problem (Kennedy 1989). That is, the location indeterminacy rate is highly variable when the number of crashes within an area is rather small. To work around the small number problem, we convert the observed rates to empirical Bayes estimates in such a way that the prior distribution is taken into account (Bailey and Gatrell 1995). Consequently, somewhat unreliable values of location indeterminacy are smoothed out while relatively reliable values are expected not to change much. Instead of adjusting for the overall distribution, the two sets of observed rates (urban versus rural) are adjusted for their within-group distributions because of the significant difference in their respective overall patterns. Table 1 presents descriptive statistics of the location indeterminacy values for urban and rural areas, after adjustment for the overall pattern of location indeterminacy within each group. It indicates that people are 94% (or somewhere between 93% and 95%) sure in identifying urban localities while they are 88% (or somewhere between 86% and 90%) sure in identifying rural localities. This means that the boundary of rural areas is perceived 6% less accurately than that of urban areas.
⁴ Depending on the spatial definition of Urbanized Area (U.S. Census Bureau 1999).
Table 1. Descriptive statistics of Bayesian estimates of location indeterminacy value: comparison between urban locality and rural locality
Urban: N = 246, Mean = 0.0597, Std. Dev = 0.0631, Std. Error = 0.0040, 95% CI for Mean = [0.0518, 0.0677]
Rural: N = 298, Mean = 0.1178, Std. Dev = 0.1759, Std. Error = 0.0102, 95% CI for Mean = [0.0977, 0.1378]
Total: N = 544, Mean = 0.0915, Std. Dev = 0.1399, Std. Error = 0.0060, 95% CI for Mean = [0.0798, 0.1033]
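The empirical Bayes adjustment described before Table 1 can be sketched as follows. This is a minimal method-of-moments shrinkage toward the group mean (a common textbook formulation in the spirit of Bailey and Gatrell 1995), not the authors' exact implementation, and the rates and counts are illustrative:

```python
def eb_smooth(rates, counts):
    """Shrink observed rates toward their group mean, weighting by counts.

    rates:  observed location indeterminacy per locality (U_i)
    counts: number of crash records per locality (n_i)
    Localities with few records are pulled strongly toward the group mean;
    well-observed localities change little.
    """
    total = sum(counts)
    mean = sum(r * c for r, c in zip(rates, counts)) / total
    # method-of-moments estimate of the between-locality variance
    var = sum(c * (r - mean) ** 2 for r, c in zip(rates, counts)) / total
    var_between = max(var - mean * (1 - mean) * len(counts) / total, 0.0)
    smoothed = []
    for r, c in zip(rates, counts):
        w = var_between / (var_between + mean * (1 - mean) / c) if c else 0.0
        smoothed.append(w * r + (1 - w) * mean)
    return smoothed

# illustrative rural-style group: several localities with only a few crashes
print(eb_smooth([0.0, 0.5, 0.1, 0.2], [1, 2, 20, 15]))
```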
Analysis of variance (Table 2) conducted on the Bayesian estimates of location indeterminacy confirms that the difference between urban and rural localities in terms of location indeterminacy is significant (F-statistic 24.209). Therefore, it can be concluded that the boundary of localities is perceived differently depending on neighborhood type. The interpretation and implications of this empirical analysis are given in the next section.
Table 2. The result of analysis of variance (one-way ANOVA) conducted on 544 localities grouped as urban or rural
Between Groups: Sum of Squares = 0.454, df = 1, Mean Square = 0.454, F = 24.209, Sig. = 0.000
Within Groups: Sum of Squares = 10.170, df = 542, Mean Square = 0.019
Total: Sum of Squares = 10.624, df = 543
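For reference, the same one-way comparison can be reproduced with standard tools; the two arrays below are placeholders for the per-locality Bayesian estimates, which are not reproduced in the paper:

```python
from scipy.stats import f_oneway

# placeholders for the smoothed location indeterminacy values of the
# 246 urban and 298 rural localities (not listed in the paper)
urban = [0.05, 0.07, 0.04, 0.06, 0.08]
rural = [0.10, 0.15, 0.09, 0.13, 0.12]

f_stat, p_value = f_oneway(urban, rural)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```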
5 Conclusions
The result of the analysis of variance sheds some light on our initial hypothesis that mental maps of urban settings may be less error-prone than those of rural settings. This may be because cities provide more landmarks and routes upon which judgments about the indeterminate boundaries of localities can be based. As suggested by Golledge et al (1995), direct experience (e.g. navigation) of geographic space helps humans acquire ultimate spatial knowledge (e.g. survey knowledge). Under the theory of spatial knowledge acquisition, it can be argued that urban settings provide more favorable conditions for forming consistent mental maps. The empirical study presented in this paper supports this idea. Moreover, it may be worthwhile to consider the characteristics of localities as fiat spatial objects (Smith 1995). According to Smith, a fiat object, unlike other physical environments (e.g. mountain, lake), does not exist in a physical way (and is thus intangible); rather, it is the outcome of human conceptualization. Therefore, any factor considered to facilitate human conceptualization may refine our mental maps. One of them is a scale factor. Identifying localities (here we focus on the "unity condition" (Guarino and Welty 2000), i.e. what composes localities as a part-whole relation) becomes easier at a micro-scale. As urban settings are denser, they provide a reasonable scale at which humans can conceptualize localities without much difficulty. Conversely, a highly dispersed pattern of settlements constitutes a rather challenging environment in which to conceptualize localities as a whole. In summary, we built a hypothetical model of localities in which location indeterminacy is incorporated as a fuzzy set membership value. As is evident in a multi-year empirical dataset of traffic crashes in New York State, location indeterminacy of spatial objects seems to prevail, even though we deal with spatial objects with official recognition. Moreover, the confidence level in identifying localities may vary with neighborhood type. That is, people are found to be 6% more confident in identifying urban areas than rural areas. Boundaries of rural areas are perceived more vaguely and variably than those of urban areas.
References Allen JF (1983) Maintaining knowledge about temporal intervals. Communications of the ACM 26(11):832-843 Bailey TC, Gatrell AC (1995) Interactive Spatial Data Analysis. Longman Scientific & Technical, Essex, pp.303-308 Cohn A, Gotts N (1996) The ‘egg-yolk’ representation of regions with indeterminate boundaries. In: Burrough P, Frank AU (ed) Geographic Objects with Indeterminate Boundaries, Taylor & Francis, London, pp 171–187 Egenhofer M, Franzosa R (1991) Point-set topological spatial relations. INT J of Geographical Information Systems 5(2):161-174 Erwig M, Schneider M (1997) Vague regions. In: 5th international symposium on advances in spatial databases (SSD’97), LNCS Vol 1262, Springer, pp 298-320 Gahegan M (1995) Proximity operators for qualitative spatial reasoning, In: Frank AU, Kuhn W (ed) Spatial Information Theory: A Theoretical Basis for GIS. LNCS Vol 988, Springer, Berlin, Germany, pp 31-44 Golledge RG, Dougherty V, Bell S (1995) Acquiring spatial knowledge: survey versus route-based knowledge in unfamiliar environments. Annals of AAG 85(1):134-158 Goodchild MF (2001) A geographer looks at spatial information theory, In: Montello DR (ed) COSIT 2001, Springer-Verlag, London, pp.1-13 Gould P, White R (1986) Mental Maps. Harmondsworth, Penguin Guarino N, Welty C (2000) Ontological analysis of taxonomic relationships. In: Laender A, Storey V (ed) Proceedings of ER-2000: The International Conference on Conceptual Modeling, Springer-Verlag Hwang S, Thill JC (2003) Georeferencing Historical FARS Accident Data: A Preliminary Report, Unpublished document, Department of Geography and NCGIA, State University of New York at Buffalo Kennedy, S (1989) The small number problem and the accuracy of spatial databases. In: Goodchild M, Gopal S (ed) Accuracy of Spatial Databases, Taylor & Francis, London, pp 187-196 Montello DR, Goodchild MR, Gottsegen J, and Fohl P (2003) Where’s downtowns?: behavioral methods for determining referents of vague spatial queries. Spatial Cognition and Computation, 3(2&3):185-204 NHTSA (1995) FARS 1996 Coding and Validation Manual. National Center for Statistics and Analysis, National Highway Traffic Safety Administration, Department of Transportation, Washington, D.C.
Robinson VB (1988) Some implications of fuzzy set theory applied to geographic databases. Computers, Environment, and Urban Systems 12:89-97 Schneider M (1999) Uncertainty management for spatial data in databases: fuzzy spatial data types. In: Guting RH, Papadias D, Lochovsky F (ed) SSD’99, LNCS Vol 1651, Springer-Verlag, Heidelberg, pp 330-351 Smith B (1995) On drawing lines on a map. In: Frank AU, Kuhn W (ed) Spatial information theory: A theoretical basis for GIS. LNCS Vol 988, Springer, Berlin, Germany, pp 475-484 Stefanakis E, Vazirgiannis M, Sellis T (1999) Incorporating fuzzy set methodologies in a DBMS repository for the application domain of GIS. INT J Geographical Information Science 13(7):657-675 Thill JC, Sui DZ (1993) Mental maps and fuzzy preferences. Professional Geographer 45: 264-276 Ullman EL (1956) The Role of Transportation and the Bases for Interaction. In: William TJ (ed) Man's Role in Changing the Face of the Earth, University of Chicago Press, pp 862-80. Wang F, Hall GB (1996) Fuzzy representation of geographic boundaries in GIS. INT J of Geographical Information Systems 10(5):573-590 Worboys M (2001) Nearness relations in environmental space. INT J Geographical Information Science 15(7):633-651 Yao X, Thill JC (2004) How far is too far? A statistical approach to contextcontingent proximity modeling. Transactions in GIS, forthcoming Zadeh LA (1965) Fuzzy sets. Information and Control 8:338-353 Zelinsky W (1980) North America’s vernacular regions. Annals of AAG 70(1):116
Registration of Remote Sensing Image with Measurement Errors and Error Propagation
Yong Ge1,4, Yee Leung2, Jianghong Ma3, and Jinfeng Wang4
1. Department of Earth and Atmospheric Science, York University, 4700 Keele St., Toronto, ON, Canada, M3J 1P3. Email: [email protected], Fax: 1 416 736 5817; 2. The Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, Hong Kong; 3. Institute for Information and System Science, Faculty of Science, Xi'an Jiaotong University, Xi'an 710049, China; 4. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences & Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China.
Abstract
Reference control points (RCPs) used to establish the regression model in registration or geometric correction are commonly assumed to be "perfect". However, this assumption is often violated in practice because RCPs in fact always contain errors. Moreover, the errors in RCPs are one of the main sources lowering the accuracy of the geometric correction of an uncorrected image. In this case the ordinary least squares (OLS) estimator, widely used in the geometric correction of remotely sensed data, is biased and is unable to handle explanatory variables with error or to propagate errors appropriately from the RCPs to the corrected image. In this paper, we introduce the consistent adjusted least squares (CALS) estimator and propose a relaxed consistent adjusted least squares (RCALS) method, which can be applied to more general relationships, for geometric correction or registration. These estimators are capable of correcting the errors contained in the RCPs and of correctly propagating the errors of the RCPs to the corrected image, with and without prior information. The objective of the CALS and our proposed RCALS estimators is to improve the accuracy of the measurement values by weakening the measurement errors. To validate the CALS and RCALS estimators, we apply them to real-life remotely sensed data. It is argued and demonstrated that the CALS and RCALS estimators give superior overall performance in estimating the regression coefficients and the variance of the measurement error.
Keywords: Geometric correction, Registration, OLS, CALS, RCALS, Accuracy.
1 Introduction
The purpose of geometric correction or registration is to determine explicitly the mapping polynomials by the use of reference control points (RCPs) and then determine the pixel brightness values in the image (Jensen, 1996; Richards and Jia, 1999). Ordinary least squares (OLS) is most frequently used for this preprocessing. However, if accurate registration between images is not achieved, spurious differences will be detected (Townshend et al., 1992). That is, instead of comparing properties of the same location in different images, we might mistakenly compare properties of different locations. The accuracy of the corrected image will, of course, have a direct impact on the results of classification, change detection and data fusion. In registration or geometric correction, the main uncertainties affecting accuracy include (1) the quality of the uncorrected or corrupted image; (2) the size and arrangement of the RCPs; (3) the proficiency of the operator; (4) error from the model of geometric correction; and (5) error from the RCPs. The effects of the first four factors on image classification and change detection have been studied in the literature (Congalton and Green, 1999; Janssen and Van der Wel, 1994; Dai and Khorram, 1998; Flusser and Suk, 1994; Moreno and Melia, 1993; Shin, 1997). Though the error of the RCPs, i.e. factor (5), is one of the main sources affecting the accuracy of geometric correction for an uncorrected image, it has seldom been studied (Congalton and Green, 1999; Townshend et al., 1992; Carmel et al., 2001; Dai and Khorram, 1998). Since RCPs mainly come from GIS and remote sensing images, errors in RCPs are essentially due to errors in data processing and data analysis (Lunetta et al., 1991). Such errors will then be propagated into the corrected image during the process of registration or geometric correction. Though the most effective way to improve the accuracy of geometric correction is through a ground survey with differential GPS, it is generally too costly to implement, so a statistical procedure is usually used as a surrogate. Common questions for registration or geometric correction are: (1) When the reference control points contain errors, how do these errors affect the regression coefficients and the accuracy of registration? (2) How large an error in the explanatory variables is negligible? (3) How can the errors contained in the explanatory variables be corrected in order to improve the accuracy of registration? (4) Most importantly, how can the error in the RCPs be propagated to the corrected image and measured in the corrected image? Since the OLS estimator of the regression coefficients is biased when the explanatory variables have errors (see section 2.2 for details), registration based on OLS does not propagate errors appropriately from the RCPs to the corrected image. Though researchers such as Buiten (1988, 1993) employed a variance-ratio and data-snooping test for the residuals calculated in the registration, error propagation and the compensation of errors in RCPs have not been discussed. It is well known that error propagation plays a crucial role in the uncertainty about remotely sensed images, and RCPs, as explanatory variables in the regression equation, always contain errors. Therefore, it is essential to develop new feasible methods to handle such a problem. In this paper, we concentrate on error analysis in image-to-image registration. We introduce the consistent adjusted least squares (CALS) estimator and propose a relaxed consistent adjusted least squares (RCALS) method for registration. These estimators are capable of correcting the errors contained in the RCPs and of correctly propagating the errors of the RCPs to the corrected image, with and without prior information. The objective of the CALS and our proposed RCALS estimators is to improve the accuracy of the measurement values by weakening the measurement errors.
2 The Regression Model and Estimation Methods
In this section, we first introduce a multiple linear measurement error (ME) model, also called the errors-in-variables model in statistics, in which both the response variable and the explanatory variables contain measurement errors. The limitations of the classical OLS estimation method are identified. Then a flexible approach, the CALS, is introduced as a more appropriate method to handle errors in variables. The RCALS, which can overcome the shortcomings of CALS, is then proposed for more flexible applications. Finally, we examine the issue of error propagation and give a significance test.
2.1 Multiple Linear ME Model
In statistics, the standard multiple linear ME model assumes that the "true" response η and the "true" explanatory vector ξ are related by
η = β₀ + ξᵀβ   (1)
Due to measurement errors, we can only observe the variables y and x. That is,
y = η + ε,  x = ξ + δ,   (2)
where the observed variables are x and y, the unobserved true variables are ξ and η, and the measurement errors are δ and ε. As always, with β̂₀ = η̄ − ξ̄ᵀβ̂, equation (1) becomes η̂ − η̄ = (ξ − ξ̄)ᵀβ̂. To facilitate our discussion, we assume in this paper that all data are centered. Thus, for a sample of size n, equations (1) and (2) become
Y = Ξβ + ε,  X = Ξ + Δ,   (3)
where β = (β₁, …, β_p)ᵀ, ξᵢ and δᵢ are p-dimensional vectors, while ηᵢ, yᵢ and εᵢ are scalars. The measurement errors (δᵢᵀ, εᵢ) are independent identically distributed (i.i.d.) random vectors, which are independent of the true values ξᵢ.
2.2 OLS Estimator
If we ignore the measurement error when regressing Y on X, the OLS estimators of β and σ²_ε are respectively
β̂_OLS = (XᵀX)⁻¹XᵀY   (4)
(σ̂²_ε)_OLS = (1/(n−p)) Yᵀ[Iₙ − X(XᵀX)⁻¹Xᵀ]Y.   (5)
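As a quick numerical reference (a minimal sketch with synthetic data, not the paper's experiment), (4) and (5) can be computed directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 2
X = rng.normal(size=(n, p))                     # centered regressors
beta = np.array([1.5, -0.8])
Y = X @ beta + rng.normal(scale=0.3, size=n)    # response with noise

beta_ols = np.linalg.solve(X.T @ X, X.T @ Y)    # equation (4)
resid = Y - X @ beta_ols
sigma2_eps_ols = resid @ resid / (n - p)        # equation (5)
print(beta_ols, sigma2_eps_ols)
```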
It should be noted that the above two expressions are no longer consistent estimators of β and σ²_ε (Wansbeek and Meijer, 2000). In fact,
Y = Ξβ + ε = Xβ + ε̃,  where ε̃ = ε − Δβ.   (6)
ε̃ shares the stochastic term Δ with the regressor matrix X (see (3)). It implies that ε̃ is correlated with X and hence E(ε̃ | X) ≠ 0. This lack of orthogonality means that a crucial assumption underlying the use of OLS is violated.
2.3 CALS Estimator
We consider the case that there are a sufficient number of restrictions on the parameters for model identification. These restrictions can be combined with the statistics from OLS to yield a consistent estimator of the model parameters. This estimation method is called consistent adjusted least squares (CALS) (Kapteyn and Wansbeek, 1984; Wansbeek and Meijer, 2000). We now consider equation (3). Assume that the rows of Δ are i.i.d. with zero expectation and covariance matrix Σ_Δ, and are uncorrelated with ξ and ε, i.e., E(Δ | ξ) = 0 and E(ε | Δ) = 0. Let S_Ξ ≡ (1/n)ΞᵀΞ → Σ_Ξ in probability; then S_X ≡ (1/n)XᵀX → Σ_X in probability. It can be shown (Wansbeek and Meijer, 2000) that, in probability,
β̂_OLS = (XᵀX)⁻¹XᵀY → β − Σ_X⁻¹Σ_Δβ = (I − Σ_X⁻¹Σ_Δ)β   (7)
The bias of the OLS estimator of β is
ω ≡ −Σ_X⁻¹Σ_Δβ   (8)
When there is no measurement error, Σ_Δ = 0; this implies that ω = 0 and OLS is consistent. In addition, in probability,
(σ̂²_ε)_OLS = (1/(n−p)) Yᵀ[Iₙ − X(XᵀX)⁻¹Xᵀ]Y → σ²_ε + βᵀΣ_ΔΣ_X⁻¹Σ_Ξβ ≥ σ²_ε   (9)
can be obtained.
Prior Information Known: If Σ_Δ were known, we can obtain a least squares estimator that is adjusted to attain consistency, denoted by CALS1 (here CALS1 is used to distinguish it from the CALS estimator without prior information, which is denoted by CALS2):
β̂_CALS1 = (S_X − Σ_Δ)⁻¹S_XY → β  (in probability)   (10)
where S_XY ≡ (1/n)XᵀY. Thus, the CALS estimator β̂_CALS1 can obtain better regression results than OLS when there are measurement errors in the variables.
Prior Information Unknown: In practice, Σ_Δ is rarely known. We adopt the following equation to estimate the regression coefficients, denoted by CALS2.
β̂_CALS2 = (I − S_X⁻¹λ̂)⁻¹β̂_OLS   (11)
if Σ_Δ = σ²_ε I, where λ̂ is the minimum eigenvalue of S, i.e. the minimum solution of |S − λI| = 0, and S is defined as in (12).
2.4 RCALS Estimator
The RCALS method relaxes this assumption to Σ_Δ = t I, where t > 0 is a scalar. That is, σ²_ε and t are not necessarily equal to each other. According to the definition of Σ_Δ, Σ_Δ = t I implies that all errors in the explanatory variables are independent and have the same variance t. It should be noted that the CALS estimator of β is
β̂(t) = β̂_CALS = (I − S_X⁻¹Σ_Δ)⁻¹β̂_OLS = (S_X − t I)⁻¹S_X(S_X⁻¹S_XY) = (S_X − t I)⁻¹S_XY.   (13)
When t is very small, (13) can be expressed approximately as
β̂(t) ≈ (S_X⁻¹ + t I)S_XY   (14)
According to the idea of orthogonal regression, we can establish the objective function f(t) as follows:
f(t) = (Y − Xβ̂(t))ᵀ(Y − Xβ̂(t)) / (1 + β̂(t)ᵀβ̂(t))   (15)
We thus select t, as the estimator of the variance σ²_δ of all explanatory variables, such that f(t) is minimized. The optimization problem reduces to the solution of the following quadratic equation
α₀ + α₁t + α₂t² = 0   (16)
where
α₀ ≡ −2C₀aᵀb − C₁aᵀa,  α₁ ≡ 2C₀bᵀb − 2C₂aᵀa,  α₂ ≡ C₁bᵀb + 2C₂aᵀb,
a ≡ Y − Xβ̂_OLS,  b ≡ X S_XY,   (17)
C₀ ≡ 1 + S_XYᵀS_X⁻²S_XY,  C₁ ≡ 2 S_XYᵀS_X⁻¹S_XY,  C₂ ≡ S_XYᵀS_XY.
It can be shown that one solution to equation (16) is positive and the other is negative. Only the positive solution t̂ can be selected as the estimator of σ²_δ. Thus we have
σ̂²_δ = t̂,
σ̂²_ε = (1/n)[(Y − Xβ̂_OLS)ᵀ(Y − Xβ̂_OLS) − t̂ β̂_OLSᵀ(I − t̂ S_X⁻¹)⁻¹β̂_OLS],   (18)
β̂ = (I − t̂ S_X⁻¹)⁻¹β̂_OLS.
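To make the attenuation in (7) and the corrections in (10) and (18) concrete, here is a small simulation under assumed synthetic data with Σ_Δ = t·I. It contrasts plain OLS with the CALS1-type correction (S_X − t I)⁻¹S_XY when the error variance of the regressors is known; the RCALS step would estimate t from the data instead of assuming it:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, t_true = 5000, 2, 0.5
beta = np.array([1.0, 2.0])

Xi = rng.normal(size=(n, p))                               # true (unobserved) regressors
X = Xi + rng.normal(scale=np.sqrt(t_true), size=(n, p))    # observed with error
Y = Xi @ beta + rng.normal(scale=0.3, size=n)

S_X = X.T @ X / n
S_XY = X.T @ Y / n

beta_ols = np.linalg.solve(S_X, S_XY)                          # attenuated, see (7)
beta_cals = np.linalg.solve(S_X - t_true * np.eye(p), S_XY)    # corrected, see (10)/(18)

print("true :", beta)
print("OLS  :", beta_ols.round(3))
print("CALS1:", beta_cals.round(3))
```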
2.5 The Error Propagation Model
One of the great advantages of the CALS and the proposed RCALS estimators over OLS is that they can propagate errors. Having an error propagation mechanism is crucial to the analysis of remotely sensed data. According to the law of error propagation, from equations (1) and (2) we know that the CALS and RCALS estimators can obtain the variance estimators of the measurement error and propagate the errors in the explanatory variables to the response variable. OLS, on the other hand, does not have such a capability. Subsequently, we can obtain the estimator of the variance of the response variable as follows:
σ²_y = βᵀΣ_Ξβ + σ²_ε = βᵀΣ_Xβ − βᵀΣ_Δβ + σ²_ε,   (19)
where Σ_X is the variance matrix of the measurement vector X. Here we have two situations: (a) when ξ is a deterministic variable, we can obtain the variance estimator of the response variable as σ̂²_y = σ̂²_ε; (b) when ξ is a random variable, the variance estimator of the response variable is obtained as σ̂²_y = β̂ᵀ(S_X − Σ̂_Δ)β̂ + σ̂²_ε.
2.6 The Significance Test
We answer here the question of how small an error in the explanatory variables can be ignored. In other words, we need a significance test on the variances of the measurement errors. Under the assumptions Σ_Δ = σ²_ε I and σ²_δ = σ²_ε = σ², it can be proved that the approximate relationship σ̂²_ε ~ N(σ², 2σ⁴) holds (Kapteyn and Wansbeek, 1984). In order to test whether σ² differs significantly from zero, we first specify a sufficiently small positive σ₀² > 0. The one-sided significance test is structured as follows: H₀: σ² ≤ σ₀² versus H₁: σ² > σ₀². When the inequality σ̂²_ε ≥ (1 + √2 u_α)σ₀² holds (where u_α is the α upper quantile of the standard normal distribution), H₀ is rejected at the significance level α. We can then say that there are significant measurement errors in the regression variables under the assumption Σ_Δ = σ²_ε I.
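A sketch of this decision rule as stated above; the estimate σ̂²_ε from the fitted ME model, the threshold σ₀² and the level α are assumed inputs chosen by the analyst:

```python
from math import sqrt
from scipy.stats import norm

def significant_measurement_error(sigma2_eps_hat, sigma2_0, alpha=0.05):
    """Reject H0: sigma^2 <= sigma_0^2 when sigma2_eps_hat exceeds the
    critical value (1 + sqrt(2) * u_alpha) * sigma_0^2."""
    u_alpha = norm.ppf(1 - alpha)        # upper alpha quantile of N(0, 1)
    return sigma2_eps_hat >= (1 + sqrt(2) * u_alpha) * sigma2_0

print(significant_measurement_error(0.9, 0.5))   # illustrative values
```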
3 The Registration of Remotely Sensed Data
In the remaining part of this paper, we discuss how to apply the CALS and RCALS estimators to improve the accuracy in registering remotely sensed data. The real-life data set is a SPOT multispectral image acquired over Xinjiang, China, on August 30, 1986. The size of the data set is 3000-by-3000 pixels with three channels. An 800-by-818 subset of the source image is used in this experiment (Figure 1). RCPs are obtained from an ETM multispectral image acquired on September 30, 2000. The matching error of the locations of the points should be controlled within 0.1 pixels, so that the impact of matching error on the accuracy of the registration can be ignored.
Fig. 1 SPOT multispectral image acquired over Xinjiang on August 30, 1986
3.1 Mapping Registration into Polynomials
It is assumed that a map (or an image) corresponding to the concerned image is of a higher level of geometric accuracy. The location of a point on the map is defined by coordinates (g_x, g_y) and that of the image is defined by coordinates (m_x, m_y). Suppose that the two coordinate systems can be related via a pair of mapping functions f and h. Though explicit forms for the mapping functions are not known, they are generally chosen as simple polynomials of first, second or third degree. For example, in the case of a first-order polynomial, the pair of functions is expressed as
z₁ = β₀₁ + ξᵀβ⁽¹⁾,  z₂ = β₀₂ + ξᵀβ⁽²⁾.   (20)
If we let ξ = (g_x, g_y, g_x g_y, g_x², g_y²)ᵀ and β⁽ⁱ⁾ = (β_1i, β_2i, …, β_5i)ᵀ, i = 1, 2, then the above equations can still be expressed in the form of (20).
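For illustration, the second-order design vector ξ used in (20) can be assembled from ground coordinates as follows; this is a sketch with made-up point coordinates, not the RCPs of the case study:

```python
import numpy as np

def design_vector(gx, gy):
    """xi = (g_x, g_y, g_x*g_y, g_x**2, g_y**2)^T for one reference point."""
    return np.array([gx, gy, gx * gy, gx ** 2, gy ** 2])

# one row per reference control point (coordinates are made up)
points = [(10.0, 20.0), (35.5, 12.25), (50.0, 60.0)]
Xi = np.vstack([design_vector(gx, gy) for gx, gy in points])
print(Xi.shape)   # (3, 5): ready for the ME regression of z1 and z2
```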
Table 1. One set of remotely sensed data with 24 samples; UI(X), UI(Y) are explanatory variables; RI(X), RI(Y) are output variables. UI means Uncorrected Image; RI means Reference Image.
[Only the first UI(X) column of the table is recoverable here: 286.0625, 223.8125, 197.9375, 672.0625, 616.0625, 538.8750, 507.0312, 452.0625, 392.0625, 416.9688, 563.8125, 622.0625.]
In this example, we do not have any prior information, i.e. σ²_ε and σ²_δ are unknown. What we do is to use the ME models to estimate the variance of the measurement errors of the explanatory variables and response variables from the sample data. For the CALS estimator, the prerequisite for estimating the variance of the measurement errors when prior information is unknown is σ²_ε = σ²_δ. However, this is not practical in the registration of remotely sensed data. Generally speaking, the value of σ²_ε is much larger than that of σ²_δ, so under this condition the RCALS estimator is suitable. Before the registration of a remote sensing image, some preprocessing needs to be done. For efficient computation, we first convert the geodetic coordinates of the RCPs, RG = (RG_x, RG_y)ᵀ, into the corresponding image coordinates RI = (RI_x, RI_y)ᵀ, with RI = RG − RG_min, where RG_min = (RG_min,x, RG_min,y)ᵀ is the minimum vector in the x direction and y direction. The RCPs are shown in Table 1. These control points are then put into the OLS and RCALS estimators. Table 2 and Figures 2 and 3 show the result of the registration. At the same time, we can obtain the variances of the measurement errors. We can observe from Table 2 that in a noisy environment the RCALS estimator has an overall superior performance compared to OLS.
Table 2. The comparison of the RCALS and OLS methods
RCALS: β̂_01 = 47.881707, β̂_02 = -170.90637; β̂_11 = 0.27295907, β̂_12 = 0.97551516; β̂_21 = 0.96943027, β̂_22 = 0.22110812
OLS: β̂_01 = 47.881935, β̂_02 = -170.90626; β̂_11 = 0.27295885, β̂_12 = 0.97551490; β̂_21 = 0.96942965, β̂_22 = 0.22110810
4 Conclusion
We have discussed some basic issues of error analysis in image-to-image registration, and proposed an errors-in-variables model, called RCALS, for registration. It has been demonstrated that the OLS model is incapable of handling problems in which errors exist in the response and explanatory variables. While CALS is a suitable model to perform the task, it is too restrictive in its assumption. By introducing a more general relationship, the proposed RCALS model is more flexible in analyzing errors in registration. It also provides a significance test and an error propagation mechanism. The conceptual arguments have been substantiated by simulated and real-life experiments. While the proposed RCALS model has a reasonably good performance, there are issues that need further investigation.
Fig. 2 Corrected image by RCALS
Fig. 3 Corrected image by OLS
[Corner coordinates shown on the corrected images: (401727.942E, 4542481.925N) and (381797.303E, 4561611.598N).]
Acknowledgement
The work is supported in part by the Research Grants Council of Hong Kong under grant CUHK 4362/00E, by a Chinese Academy of Sciences project fund, by National Natural Science Foundation of China project 40201033, and by National 863 project 2001AA135151. The authors would like to express their appreciation for this support.
References
[1]
Buiten, H. J., 1988, Matching and mapping of remote sensing images: aspects of methodology and quality. Proceedings of 16th ISPRS congress, Kyoto, Japan, 27-B10 (III), pp. 321-330. [2] Buiten, H. J.; Clevers, J. G. P. W., 1993, Land observation by remote sensing: theory and applications, Gordon and Breach Science Publisher, USA. [3] Carmel, Y.; Dean, D. J.; Flather, C. H., 2001, Combing location and classification error sources for estimating multi-temporal database accuracy, Photogrammetic Engineering and Remote Sensing, Vol. 67(7), pp. 865-872. [4] Congalton, R. G.; Green, K., 1999, Assessing the Accuracy of Remotely Sensed Data: Principle and Practices, Lewis Publishers. [5] Dai, X. L.; Khorram, S., 1998, The effects of image misregistration on the accuracy of remotely sensed change detection. IEEE Transactions on Geoscience and Remote Sensing, Vol. 36(5), pp.1566-1577. [6] Flusser, J.; Suk, T., 1994, A moment-based approach to registration of images with affine geometric distortion, IEEE Transactions on Geoscience and Remote Sensing, Vol. 32(2), pp.382 –387. [7] Janssen, L. L. F.; Van der Wel, F. J. M., 1994, Accuracy Assessment of Satellite Derived Land-Cover Data: A Review, Photogrammetic Engineering and Remote Sensing, Vol. 60(4), pp.419426. [8] Jensen, J. R., 1996, Introductory digital image processing: a remote sensing perspective, Upper Saddle River, N.J.: Prentice Hall. [9] Kapteyn, A.; Wansbeek, T.J., 1984, Errors in variables: Consistent Adjusted Least Squares (CALS) estimator. Communication in Statistics ----Theory and Methods, 13, 1811-1837. [10] Lunetta, R. S.; Congalton, R. G.; Fenstermaker, L. K., et al., 1991, Remote Sensing and Geographic Information System Data Integration: Error Sources and Research Issues, Photogrammetic Engineering and Remote Sensing, 57(6): 677-687. [11] Moreno, J. F.; Melia, J., 1993, A method for accurate geometric correction of NOAA AVHRR HRPT data, IEEE Transactions on Geoscience and Remote Sensing, Vol. 31(1), pp.204 –226.
[12] Richards, J. A.; Jia, X. P., 1999, Remote sensing digital image analysis: an introduction, Berlin; New York: Springer-Verlag, 3rd
Double Vagueness: Effect of Scale on the Modelling of Fuzzy Spatial Objects
Tao Cheng1, Pete Fisher2 and Zhilin Li1
1 Department of Land Surveying and GeoInformatics, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong, {lstc, lszlli}@polyu.edu.hk
2 Department of Geography, University of Leicester, Leicester, LE1 7RH, United Kingdom [email protected]
Abstract
In the identification of landscape features, vagueness arises from the fact that the attributes and parameters that make up a landscape vary over space and scale. In most existing studies, these two kinds of vagueness are studied separately. This paper investigates their combination (double vagueness) in the identification of coastal landscape units. Fuzzy set theory is used to describe the vagueness of geomorphic features based on continuity in space. The vagueness resulting from the scale of measurement is evaluated by statistical indicators. The differences between fuzzy objects derived from data at differing resolutions are studied in order to examine these higher-order uncertainties. Multi-scale analysis of the landscape is carried out using a moving window, ranging in size from 60x60 meters to 1500x1500 meters. The statistics of the fuzziness of the fuzzy landscape units are calculated, and their variability with scale is assessed. It is shown that a major effect of scale on the mapping of geomorphic landscape units is the determination of the area of those units. This result implies that caution must be exercised in comparing landscapes at different scales and in choosing the resolution of the data that best describes the process under study.
Keywords: Uncertainty, Vagueness, Multiscale, Fuzzy Spatial Objects
1 Introduction The inherent characteristics of geographical entities, i.e. continuity, heterogeneity, dynamics and scale-dependence, imply that most of them are naturally indeterminate or fuzzy (Cheng, 2002). A number of researchers have introduced the idea that terrain objects are fundamentally vague, and may be appropriate for analysis by fuzzy sets. Vague landscape features have been defined by two methods: either elevation as the basis of a semantic import model where some a priori knowledge is used to assign a certain value of fuzzy membership of a landscape feature to a particular height above datum (Cheng & Molenaar, 1999; Usery, 1996), or surface derivatives (such as slope and curvature) as input to a multivariate fuzzy classification (Burrough et al., 2000; Mackay et al., 2003). An extensive review of the theory of fuzzy sets and their use in GIS can be found in Robinson (2003). It is argued that as GIS-related applications increase in their levels of complexity and sophistication fuzzy sets will play a major cost effective role in their development. So far, most fuzzy set methods have been applied only at one spatial scale, for example the fixed resolution of a DEM used in terrain analysis. The effect of scale on the fuzzy modeling is ignored. Scale is inherent in the ways process operates. It is a major unsolved issue in geographical information related sciences although some attempts have been made. For example, in cartography, “how to derive small scale maps from large scale maps” is a key issue for automated map generalization (Li & Openshaw, 1993). In human geography, “how to aggregate data from small enumeration units to larger units for processing” on spatial analysis and modelling is called “the modifiable areal unit problem” (Openshaw, 1983; Marceau, 1999). In physical geography, “how to extrapolate information across scales” is often being asked to improve the cost-befit ratio of sampling (Wu & Qi, 2000). When vagueness is involved in the definition of the geographical objects, the multiscale problem makes the modeling and representation of the geographical objects more complicated and uncertain; there is not just uncertainty about the extent of the object, but about the estimate of the uncertainty of the extent of the object. However, very little research has looked at this double vagueness: vagueness from both space and scale, at the same time. “Will scale affect the result of modeling fuzzy spatial objects” is an un-tackled question. The research reported here aims to answer this question. In this paper, we evaluate the effect of scale on modeling fuzzy spatial objects, i.e., the subjectivity of the assignment of the fuzzy membership values to the scale
of measurement. The work is illustrated by a coastal geomorphologic case. Multi-scale analysis of the landscape is carried out using a moving window ranging in size from 60x60 meters to 1500x1500 meters. The differences between fuzzy objects derived from data at differing resolutions are studied in order to examine the effect of double vagueness. The statistics of the fuzziness of the fuzzy landscape units are calculated, and their variability with scale is assessed. This paper is organized as follows. Section 2 reviews multiscale approaches. Section 3 introduces the methodology applied. Section 4 presents the case study and Section 5 reports the experimental results and statistical analysis. The last section summarizes the major findings with suggestions for further research.
2 Modeling Of Scale Scale can refer both to the level of detail of a description, and to the scope or extent of the area covered. To deal with scale in modeling human and physical systems, and to model the effect of scale on description is a challenging issue in geographical information science. In cartography, maps are produced at certain scales with different application, e.g. 1:10,000 and 1:100,000. Small-scale maps provide better overview while large-scale maps provide more detailed and precise information. It is intuitive that the same number of map symbols cannot be represented when the map scale is smaller. It means that the representation of the same features on the ground will be different on maps of different scales. The issue arising is “how to derive small scale maps from large scale maps” through operations such as simplification, aggregation and selective omission (Brassel & Weibel, 1988; McMaster & Shea, 1992). This issue is on the representation of spatial data and is called “map generalization”. As map generalization is not directly relevant to current research, it will not be discussed further here. In geography, there is a similar issue. Normally, geographical data are sampled in small enumeration units (also called small scale), and in some applications, these data need to be aggregated to a larger enumeration unit. However, the statistical results will be different when the analysis is carried out based on different size of enumeration units (specially on the zones used to produce aggregate statistics, i.e. different scales), and different aggregations of the same size. Therefore, there is an issue of “how to aggregate data from small enumeration units to larger units for processing”. This issue is called “the modifiable areal unit problem” (Openshaw, 1983; Marceau, 1999).
There is a similar issue in all geographical information related sciences, such as geomorphology, oceanography, soil science, biology, biophysics, social sciences, hydrology, environmental sciences and landscape ecology. In general, there are two related but distinctive goals for conducting a multiscale analysis in these studies. The first is to characterize the multiscale structure of a landscape. The second is to detect or identify “scale breaks” or “hierarchical levels” in the landscape, which often can be studied as a spatially nested hierarchy (Wu et al, 2000). Two approaches to multiscale analyses are possible: (1) the direct multiscale approach that uses inherently multiscale methods, and (2) the indirect multiscale approach that uses single methods repeatedly at different scales. Frequently used direct multiscale methods include semivariance analysis, wavelet analysis, fractal analysis, lacunarity analysis, and blocking quadrate variance analysis. All these methods contain multiscale components in their mathematical formulation or procedures, and thus are either hierarchical or multiscaled (Wu et al., 2000). On the other hand, the indirect approach to multiscale analysis can use methods redesigned from single scale analysis, such as a wide variety of landscape metrics (e.g. diversity, contagion, perimeter-area ratios, spatial autocorrelation indices) as well as statistical measures (e.g. mean, variance, correlation or regression coefficients). The scale multiplicity in the indirect approach is realized by resampling the data at different scales, albeit grain or extent, and then repeatedly computing the metrics or statistical measures using sampled data at different scales (Francis & Klopatek, 2000). Many studies have dealt with numerical aggregation such as zoning or modifiable area unit problems (Fotheringham & Wong, 1991; Jelinski & Wu 1996). Some studies have used categorical aggregation based either on a majority or a random rule (He at al, 2002). The statistical approach has been broadly applied in multiscale analysis, just as it has been widely used to model spatial uncertainty and its propagation (Heuvelink & Burrough, 2002). However, Fisher et al (2004) have used fuzzy sets to calibrate the vagueness resulting from multiscale analysis. Recent research on scales in GIS can be found in the book edited by Tate & Atkinson (2001). Five key issues of scales such as, “changing the scale of measurement”, “non-stationary modeling”, “dynamic modeling”, “conditional simulation” and “constrained optimization”, are put forward as recommended for further research for GI Science (p.268 – 269). It is argued that while regularization provides an important tool for modeling change of scale, it does not solve the problem of changing the scale of measurement for an actual data layer. When the change to the scale of measurement is facilitated by interpolation, the inherent smoothing which results in the predicted values may alter the bivariate distribution between
that variable and any other. Solutions based on simulation are inadequate. Therefore, the issue of “Changing the scale of measurement” is most important and should be given the highest priority by researchers among these five problems. Although fuzzy set theory has been widely used in GIS, the scale issue has not been investigated; while statistical approaches have been applied in multiscale analysis, the fuzzy aspects of the geographical features are ignored. This current research attempts to combine these two relevant and inherent issues by studying the effect of scale on modeling fuzzy spatial objects, as an approach to modeling the higher order fuzziness in spatial objects: double vagueness.
3 Principle Of Experimental Design
Several steps must be implemented to derive fuzzy spatial objects from field observations (Cheng et al., 1997), i.e., field observation, interpolation, classification, and segmentation and identification. If field measurements are available and landscape units are mapped from such data, the effect of scale on the modeling will act on the interpolation, and the effect of continuity in space (fuzziness) will act on the classification. The effect of double vagueness will be revealed in the final identification of fuzzy spatial objects. Therefore, the experiment consists of three steps, as follows.
3.1 Multiscale Analysis
Since hierarchical analysis does not have to assume the existence of a hierarchical structure in the landscape under study, we adopted the indirect approach to multi-scale analysis. The multiplicity of scales is realized by resampling the data at different resolutions (resolution acting as a surrogate for scale), and then repeatedly computing the statistical measures using the sampled data at different resolutions. One way of resampling data is to systematically aggregate the original fine-resolution data set and produce a hierarchically nested data set, which leads to a hierarchical analysis using single-resolution methods. Therefore, in this step a fine-resolution data set will be scaled up by the same algorithms to derive a series of coarser-resolution data sets.
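As a minimal sketch of this kind of aggregation (simple block averaging over an n x n window, used here only as a stand-in for the quadratic-surface generalization applied later in the case study), one coarser grid can be derived from the fine-resolution grid as follows:

```python
import numpy as np

def aggregate(dem, n):
    """Aggregate a fine-resolution grid into n x n blocks by averaging.

    dem: 2-D array of cell values (e.g. elevations); edge cells that do not
    fill a complete block are trimmed for simplicity.
    """
    rows, cols = (dem.shape[0] // n) * n, (dem.shape[1] // n) * n
    blocks = dem[:rows, :cols].reshape(rows // n, n, cols // n, n)
    return blocks.mean(axis=(1, 3))

fine = np.random.default_rng(2).normal(size=(100, 100))   # toy fine-resolution grid
coarse = aggregate(fine, 5)                                # 5x larger cells
print(fine.shape, "->", coarse.shape)
```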
3.2 Fuzzy Classification
As described in Section 1, the extent of the landscape units is a fuzzy concept, but based on height observations it is possible to derive a measure of the different landscape forms. We can use a fuzzy set to represent the vagueness in the definition of the landscape units. A fuzzy membership function should be built to modify the crisp classification criteria. It will be applied to the multiscale data sets and results in a series of multiscale fuzzy classifications.
3.3 Identification of Multiscale Fuzzy Spatial Objects
The estimation of the spatial extent of objects from the fuzzy classifications is related to the interpretation of the fuzziness of the objects and their topological relationships, i.e., a pre-defined fuzzy object model is required. For example, if foreshore, beach and foredune are considered to be spatially disjoint objects, the conceptual model suggests that a specific location should belong either to beach or to foredune, but not to both, and a boundary has to be set to define explicitly the spatial extent of any object by assigning each grid cell to exactly one object. In such cases criteria (conditions) have to be applied to assign a cell to a specific class. After segmentation, the spatial extents of the objects are identified and the boundaries between them emerge automatically. These boundaries are called conditional boundaries since they are based upon conditions (or criteria). In this case, the concept of objects with fuzzy spatial extent is applied, which means the objects are represented as fields with varying fuzziness and conditional boundaries (see also Cheng, 2002). A series of multiscale fuzzy spatial objects will be formed based upon the fuzzy classifications in the previous step.
4 The Case Study
4.1 The Case Study Area A barrier island, Ameland, in the north of the Netherlands is adopted as a case study here. The process of coastal change involves the erosion and accumulation of sediments along the coast. It can be monitored through the observation of changes of landscape units such as foreshore, beach and
foredune. The process of coastal change is scale-dependent in space and time. Height observations have been made by laser scanning of the beach and dune area and by echo sounding on the foreshore. These data have been interpolated to form a full height raster of the test area. Experiments show that the uncertainty of the interpolated heights of the raster can be expressed by a standard deviation of σ = 0.15 m. However, in the following analysis the error of the height raster, which was used as the original fine-resolution DEM, is ignored.
4.2 Multiscale DEMs
We used the software Landserf developed by Wood (2003) to aggregate the original fine-resolution DEM to coarser data sets. In Wood's software the surface is modeled as a quadratic surface using the central point and the outer points of an expanding window, and a generalized value of the elevation is calculated for the centre point of the surface. The reason for using this algorithm is the availability of the implementation. Due to the size of the test area, we created DEMs at 13 resolutions. This means that we aggregated the original fine-resolution DEM (60 m * 60 m) to coarse-resolution DEMs using a moving window ranging in size from 3x3 cells to 25x25 cells. Therefore, a series of DEMs with cell sizes from 60 m * 60 m (1 * 1 cells) to 1500 m * 1500 m (25 * 25 cells) was created. The characterization of scale-based uncertainty so far has been described independently of the model of the surface and of any operational definition of scale itself. In this way, a series of DEMs with different resolutions was created.
4.3 Fuzzy Classification of Coastal Landscape Units
The landscape units are defined based upon water lines. The foreshore is the area above the closure depth and beneath the low water line, the beach is the area above the low water line and beneath the dune foot, and the foredune is the first row of the dunes inland from the dune foot. These definitions normally differ from surveyor to surveyor, from case to case and from time to time. For example, the low water line was set to be -6 m in 1965 to 1984 and 1989, and -8 m in 1985 to 1988 and in 1990 to 1993. Therefore, the extent of these landscape units is a fuzzy concept, but based on height observation it is possible to derive a measure of foreshore, beach and duneness.
The fuzzy membership function is built to modify the crisp classification criteria, and a trapezoidal membership function was adopted (Cheng et al., 1997) for the fuzzy classification. After fuzzy classification, each grid cell has a membership vector containing a value for each of the three classes. As multiresolution DEMs were created in the previous section, a series of fuzzy membership vectors was created at 13 resolutions, of which three are shown in Figure 1.
Fig. 1. Three landscape units derived from fuzzy classification at 3 selected resolutions (1, 13 and 25) shown as objects with fuzzy spatial extents.
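A minimal sketch of such a height-based trapezoidal membership function, together with the conditional-boundary segmentation (assigning each cell to the class with the highest membership), is given below; the breakpoints are illustrative and are not the values used for Ameland:

```python
import numpy as np

def trapezoid(h, a, b, c, d):
    """Trapezoidal membership: 0 below a, rising to 1 on [b, c], 0 above d."""
    h = np.asarray(h, dtype=float)
    up = np.clip((h - a) / (b - a), 0.0, 1.0)
    down = np.clip((d - h) / (d - c), 0.0, 1.0)
    return np.minimum(up, down)

heights = np.array([-7.0, -4.0, 0.5, 2.5, 6.0])   # toy elevations (m)
classes = {
    "foreshore": trapezoid(heights, -12.0, -10.0, -5.0, -3.0),
    "beach":     trapezoid(heights, -5.0, -3.0, 2.0, 3.0),
    "foredune":  trapezoid(heights, 2.0, 3.0, 15.0, 20.0),
}
membership = np.vstack(list(classes.values()))   # 3 x n membership vectors
label = np.argmax(membership, axis=0)            # conditional boundary: one class per cell
print(label)
```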
5 The Effect Of Scale On Fuzzy Spatial Objects In multiscale analysis, two opposing hypotheses are actually proposed: fuzzy spatial object will vary smoothly with changing grain size as pixels are aggregated reflecting a decrease in variability; or fuzzy spatial objects will show discrete changes as grain changes. Additionally, we want to determine if these changes could be modeled and, if so, could these models predict scale change effects on fuzzy spatial objects at either finer or coarser scales. In order to test the hypotheses, we used statistical analysis.
We calculated the total number of cells belonging to the three landscape units based on the effective image window created at grain size 25 * 25 cells. We then calculated, by scale, the mean, minimum and standard deviation of fuzziness for the cells belonging to each landscape unit. Notice that the regression equations y1, y2 and y3 in the following figures represent the relationship of foreshore, beach and foredune with scale, respectively.
5.1 Change in The Fuzzy Area
There is an obvious change in the area of the three fuzzy objects with scale (Figure 2). With the increase of scale, the area of beach decreased up to scale 15, then increased up to scale 19, and then decreased up to scale 25, whereas the areas of foreshore and foredune increased up to scales 9 and 15, respectively, and then changed in the opposite direction.
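The per-scale summaries and trend lines reported in this section can be reproduced with a few lines of standard code; the membership array below is a placeholder for the classified grids at each of the 13 window sizes, not the Ameland data:

```python
import numpy as np
from scipy.stats import linregress

scales = list(range(1, 26, 2))            # window sizes 1, 3, ..., 25
rng = np.random.default_rng(3)

stats = []
for s in scales:
    # placeholder for the membership values of, e.g., beach cells at scale s
    memberships = rng.uniform(0.5, 1.0, size=1000 // s)
    stats.append((s, memberships.mean(), memberships.min(), memberships.std()))

means = [m for _, m, _, _ in stats]
fit = linregress(scales, means)           # linear trend of mean fuzziness vs scale
print(f"slope={fit.slope:.4f}, intercept={fit.intercept:.4f}, R2={fit.rvalue**2:.3f}")
```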
[Figure 2: chart of the number of cells (fuzzy area) of foreshore, beach and foredune against scale (1 to 25), with linear trend lines. Fitted trends: foreshore y1 = 0.9835x + 135.35 (R² = 0.3547); beach y2 = -2.9451x + 858.69 (R² = 0.6664); foredune y3 = 1.9615x + 85.962 (R² = 0.3311).]
Fig. 2 Change in the fuzzy area of the three objects with scale
In order to test the hypotheses set out at the beginning of this section, linear regression analyses and diagnostics were implemented. The results are also shown in Figure 2. We can see that the areas of all the landscape units change significantly (at the 95% significance level) with scale, with foreshore and foredune in a positive relationship and beach in a negative relationship. However, the coefficient of determination for foreshore and foredune (R² = 0.35) shows that only about 35% of the variance in area is explained by
308
Tao Cheng, Pete Fisher and Zhilin Li
its common variance with scale, suggesting a low level of explanation and indicating that other factors could be involved; the coefficient of determination for the change in the area of beach, however, indicates that 67% of the variance comes from the effect of scale. Further, it is not only the area of each landscape unit that changes; the percentages of the different landscape types also change dramatically (see the pie charts in Figure 3).
Figure 3. The change in the percentage of the three landscape units at scales 1, 13 and 25
5.2 Change in The Mean of The Fuzziness
The change in the mean of the fuzziness of the three landscape units with scale is reported in Figure 4. It can be seen from Figure 4 that the changes in the mean fuzziness of foreshore and beach follow a similar trend, mostly decreasing with scale, but the change for foredune is unstable. Regression analyses and diagnostics have also been applied to these data. From the results we can only say that the mean fuzziness of foreshore and foredune changes significantly and linearly (at the 95% level) with scale; for beach, the linear regression test is not passed.
5.3 Change in The Minimum of The Fuzziness
The change in the minimum of the fuzziness of the three landscape units with scale is illustrated in Figure 5, which shows cyclic patterns, although the amplitude and range differ for the three units. The change exhibits cyclic fluctuations, indicative of the periodic pattern in the landscape. Regression analyses reveal that there is no obvious linear correlation of the minimum fuzzy membership value of the landscape units with scale. This is shown intuitively in Figure 5.
[Figure 4: line chart of the mean fuzziness (y-axis approximately 0.82 to 0.98) against scale (1 to 25) for foreshore, beach and foredune, with linear trend lines. Fitted trends: foreshore y1 = -0.0022x + 0.9467 (R² = 0.7957); beach y2 = -0.0004x + 0.9661 (R² = 0.0423); foredune y3 = -0.0082x + 0.9398 (R² = 0.649).]
Fig. 4. Change in the mean fuzziness of three landscape units with the scale
5.4 Change in The Standard Deviation (σ) of Fuzziness
We also calculated the standard deviation of the fuzziness of each landscape unit at different scales. The change is not strictly systematic with scale, but it is related to it. It can be seen that a linear relationship exists between the σ of the fuzziness of foreshore and of foredune and the scale. We can see that the standard deviation of the fuzziness of all the landscape units changes significantly (at the 95% level) and positively with scale. The coefficient of determination for foreshore and foredune (R² ≈ 0.6) suggests a high level of explanation, i.e. around 60% of the variance of the standard deviation is explained by its common variance with scale, but the coefficient of determination for beach is relatively low at 0.29, indicating that other factors might be involved in the variance (Figure 6).
[Figure 5: line chart of the minimum of fuzziness (y-axis approximately 0.50 to 0.55) against scale (1 to 25) for foreshore, beach and foredune.]
[Regression results for the standard-deviation trend lines shown in Figure 6: foreshore y1 = 0.0021x + 0.1193 (R² = 0.58); beach y2 = 0.0015x + 0.0885 (R² = 0.2881); foredune y3 = 0.0046x + 0.1236 (R² = 0.632).]
Fig. 5. Change of the minimum of fuzziness of three landscape units
3
5
7
9
11
Foreshore Foredune Linear (Beach)
13
15
17
19
21
23
25 Scale
Beach Linear (Foreshore) Linear (Foredune)
Fig. 6. Change of the standard deviation of fuzziness of three landscape units
5.5 Summary
In summary, we found that the area of the three landscape units, and the mean and the standard deviation of the fuzziness of foreshore and beach, change significantly with scale, but the minimum of the fuzziness of the landscape units does not
change significantly with scale. This implies that scale has an effect on the fuzzy classification, i.e. the fuzzy membership values change so that the class of some cells changes, with the result that the areas of the fuzzy objects differ with scale. The obvious change of the standard deviation with scale implies that the fuzziness of the landscape units increases with scale (they become more uncertain with scale). This is because the aggregation enlarges (and smoothes) the transition zones between the landscape units (please refer to Figure 1). Further, although linear regressions have been applied, the coefficients of determination (R²) are generally low, especially for the area of the three landscape units, indicating a low level of explanation of the variance of the statistical indicators by scale. This also suggests that the linear regression lines are not the best-fit lines. Therefore, polynomial trend lines were tried out. We may say that a fourth-order polynomial fits the trend of the change in area very well.
6 Conclusions
This paper combined double vagueness due to continuity in space and scale. It evaluated the effect of resolution (as a surrogate for scale) on modeling fuzzy spatial objects. A multi-resolution analysis was implemented, ranging in window size from 3x3 cells to 25x25 cells. The statistics of the fuzziness of the landscape units were calculated. In general, the scale of the measurement affects the identification of geomorphic landscape units, particularly the area of the landscape units. Fine resolution affords more detail in the original data, but does not necessarily result in the landscape appearing to be more highly fragmented and complex than the same landscape examined at a coarser resolution. The results show that it is difficult to predict the effect of resolution on fuzzy spatial objects, although the change with resolution exhibits a linear relationship in some statistical indicators. In others, the change of the fuzzy spatial objects with resolution exhibits cyclic fluctuations, indicative of the periodic pattern in the landscape, which are more suitable for a polynomial regression. In view of this, caution must be exercised in comparing landscapes at different scales and in choosing the resolution of the data that best describes the process under study. Much research has discussed the effect of generalization algorithms on the DEM itself, rather than on the information derived from the DEM. For example, many studies discuss the effect of the support of the remote sensing image itself, not of the spatial objects finally extracted from the images, which are the final product we obtain. Therefore, how to evaluate the algorithm of generalization will be a critical point for further research. Moreover, the effect of the fuzzy membership function on fuzzy classification is also relevant and needs further investigation. Also, further research using additional landscape data and a greater range of resolutions is necessary to determine whether general scaling laws can be determined. The study of the effect of scale on modeling fuzzy spatial information is still in its infancy. We do not think we can draw a general conclusion about the question proposed in our paper, but rather raise further questions. The results from this study will enable us to be aware of the level of uncertainty associated with the modeling outcome and thus take precautions if necessary.
Acknowledgements The first author wishes to thank the Hong Kong Polytechnic University for the Postdoctoral Research Fellowship (no. G-YW92).
References Brassel, K. E. and Weibel, R., 1988, A Review and Conceptual Framework for Automated Map Generalisation, International Journal of Geographical Information Systems, 2, 229-244. Burrough, P. A., van Gaans, P. F. M., and MacMillan, R. A., 2000, Highresolution landform classification using fuzzy k-means. Fuzzy Sets and Systems, 113, 37–52. Cheng, T., Molenaar, M. and Bouloucos, T., 1997, Identification of fuzzy objects from field observation data. In Spatial Information Theory: A Theoretical Basis for GIS, Lecture Notes in Computer Sciences, edited by S. C. Hirtle and A. U. Frank (Berlin: Spring-Verlag), Vol. 1329, pp. 241–259. Cheng, T. and Molenaar, M, 1999, Objects with fuzzy spatial extent. Photogrammetric Engineering and Remote Sensing, 65, 797–801. Cheng, T., 2002, Fuzzy spatial objects, their change and uncertainties, Photogrammetric Engineering and Remote Sensing, 68, 41–49. Fisher, P., Wood, J., Cheng, T., and Rogers, P., 2004, Where is Helvellyn? Fuzziness of multiscale landscape morphometry. Transactions of the Institute of British Geographers, 29, 106–128. Foody, G. M., 2002, Status of land cover classification accuracy assessment. Remote Sensing of Environment, 80, 185–201.
Fotheringham, A. S., and Wong, D. W. S., 1991, The modifiable areal unit problem in statistical analysis, Environment and Planning, 23, 1025–1044. Francis, J. M. and Kopatek, J. M., 2000, Multiscale effects of grain size on landscape pattern analysis. Geographic Information Sciences, 6, 27–37. Hay, G. J., Blaschke., T., Marceau, D. J., and Bouchard, A., 2003, A comparision of three image-object methods for the multiscale analysis of landscape structure. ISPRS Journal of Photogrammetry & Remote Sensing, 57, 327–345. He, H.S., Venturta, S. J., and Mladenoff, D.J., 2002, Effects of spatial aggregation approaches on classified satellite imagery. International Journal of Geographical Information Science, 16, 93–109. Heuvelink, G. B. M. and Burrough, P., 2002, Developments in statistical approaches to spatial uncertainty and its Propagation, International Journal of Geographical Information Science, 16, 111–113. Jelinski, D. E., and Wu, J., 1996, The modifiable area unit problem and implications for landscape. Landscape Ecology, 11,129–140. Li, Z. L. and Openshaw, S., 1993, A natural principle for objective generalisation of digital map data. Cartography and Geographic Information System, 20, 19–29. Mackay, D. S., Samanta, S., Ahl, D. E., Ewers, B. E., Gower, S., and Burrows, S. N., 2003, Automated parameterization of land surface process models using fuzzy logic. Transactions in GIS, 7, 139–153. Marceau, D. J., 1999, The scale issue in the social and natural sciences. Canadian Journal of Remote Sensing, 25, 347–356. McMaster, R. B. and Shea, K. S., 1992, Generalization in Digital Cartography (Association of American Geographers), 134p. Openshaw, S., 1983, The Modifiable Areal Unit Problem, CATMOG 38 (Norwich, UK: Geo Books). Robinson, V. B., 2003, A perspective on the fundamentals of fuzzy sets and their use in geographic information systems. Transactions in GIS, 7, 3–30. Tate, N. and Atkinson, P. (eds.)., 2001. Modelling Scale in Geographical Information Science (Chichester: Jone Wiley & Sons), 277pp. Usery, E. L., 1996, A conceptual framework and fuzzy set implementation for geographic feature. In Geographic Objects with Indeterminate Boundaries, edited by P. A. Burrough and A. U. Frank (London: Taylor & Francis), pp. 71–85. Wang, F. and Hall, G. B., 1996, Fuzzy representation of geographical boundaries in GIS, International Journal of Geographical Information Systems, 10, 573– 590 Wood, J., 1996, The Geomorphological Characterisation of Digital Elevation Models Unpublished PhD Thesis, Department of Geography, University of Leicester. (http://www.soi.city.ac.uk/~jwo/phd), Accessed February 2003. Wu, J., and Qi, Y., 2000, Dealing with scale in landscape analysis: an overview. Geographic Information Sciences, 6, 1–5. Wu, J., Jelinski, D. E., Luck, M. and Tueller, P.T., 2000, Multiscale analysis of landscape heterogeneity: scale variance and pattern metrics. Geographic Information Sciences, 6, 6–19.
Area, Perimeter and Shape of Fuzzy Geographical Entities
Cidália Costa Fonte 1 and Weldon A. Lodwick 2
1 Secção de Engenharia Geográfica, Departamento de Matemática, Faculdade de Ciências e Tecnologia, Universidade de Coimbra, Apartado 3008, 3001-454 Coimbra, Portugal, e-mail: [email protected]
2 Mathematics Department, University of Colorado at Denver, Campus Box 170, P.O. Box 173364, Denver, Colorado, USA 80217-3364, e-mail: [email protected]
Abstract This paper focuses on crisp and fuzzy operators to compute the area and perimeter of fuzzy geographical entities. The limitations of the crisp area and perimeter operators developed by Rosenfeld (1984) are discussed, as well as the advantages of the fuzzy area operator developed by Fonte and Lodwick (2004). A new fuzzy perimeter operator generating a fuzzy number is proposed. The advantage of using operators generating fuzzy numbers is then illustrated by computing the shape of a FGE, through its compactness, using the extension principle and the fuzzy area and perimeter.
1 Introduction In object based Geographical Information Systems (GIS) the geographical information is represented by geographical entities. These entities are characterized by an attribute and a spatial location. The spatial location of a geographical entity may be represented either in a vector data structure or by a tessellation of the geographical space formed by elementary regions to which an attribute is assigned. In the second case, the geographical entities are formed by aggregating contiguous regions to which the attribute characterizing the geographical entity was assigned. The elementary regions forming the tessellation may be, for example, cells in a raster data structure, Voronoi polygons or Delaunay triangles. In this paper, for simplicity, we restrict ourselves to the second type of geographical entities,
formed by elementary regions $r_i$, $i = 1, \dots, n$. However, the operators presented herein can be extended to the continuous space. Since there may be uncertainty in the construction of the geographical entities, there may be uncertainty in their spatial location (Fonte and Lodwick 2003). In these cases, the spatial extent of the geographical entities may be represented by a fuzzy set, generating a Fuzzy Geographical Entity. A Fuzzy Geographical Entity (FGE) $E_{At}$, characterized by the attribute 'At', is a geographical entity whose position in the geographical space is defined by the fuzzy set
$$E_{At} = \{\, r_i : r_i \text{ belongs to the geographical entity characterized by attribute 'At'} \,\},$$
with membership function $\mu_{E_{At}}(r_i) \in [0,1]$ defined for every elementary region $r_i$ in the space of interest. The membership value one represents full membership. The membership value zero represents no membership, and the values in between correspond to membership grades to $E_{At}$, decreasing from one to zero. Many GIS applications require the computation of geometric properties of the geographical entities, such as their area, perimeter or shape. Therefore, it is necessary to develop operators capable of computing these geometric properties of Fuzzy Geographical Entities (FGEs). In section 2 some area operators are presented, such as the Rosenfeld area operator and the fuzzy area operator. Section 3 is dedicated to the perimeter operators. A crisp operator due to Rosenfeld (1984) is presented, and a new fuzzy operator, generating a fuzzy number, is proposed. The evaluation of the shape of a FGE is analysed in section 4 based on the compactness operator, where the fuzziness in the area and perimeter values is propagated to the compactness using the extension principle. Throughout this paper it is assumed that the reader is familiar with the basic ideas of fuzzy set theory (Klir and Yuan 1995). The presented concepts are applicable to FGEs represented by normal fuzzy sets, called normal fuzzy geographical entities.
2 Area of a Fuzzy Geographical Entity Even though the computation of the area of geographical entities is a basic operator of any GIS, little attention has been given to the computation of the area of FGEs. Katinsky (1994) stated that an ambiguous determination of the area of a FGE required its defuzzification, while Erwig and Schneider (1997) stated that the area of a FGE was an interval, but since
new operators would have had to be developed to process intervals, they used only the lower and upper limits of the interval. Since a FGE is represented by a fuzzy set, the operator proposed by Rosenfeld (1984) can be applied to compute the area of a FGE. The method proposed by Rosenfeld considers that the contribution of the area of each elementary region to the total area is proportional to the membership function value assigned to it. That is, for a fuzzy set E, in the discrete case,
$$A_R(E) = \sum_x \sum_y \mu_E(r_i)\, A(r_i), \qquad (2.1)$$
where the double sum runs over the rows and columns of the tessellation and $A(r_i)$ denotes the area of elementary region $r_i$.
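As a concrete illustration of (2.1), the short Python sketch below (our own example, not code from the paper) computes the Rosenfeld area of a FGE stored as a raster of membership values, assuming all cells have the same area; the grid values are hypothetical.

def rosenfeld_area(membership_grid, cell_area=1.0):
    # Sum of the membership values weighted by the (uniform) cell area, as in eq. (2.1).
    return sum(mu * cell_area for row in membership_grid for mu in row)

# Hypothetical 3 x 3 membership grid with unit cells.
grid = [
    [0.0, 0.5, 1.0],
    [0.2, 0.9, 1.0],
    [0.0, 0.3, 0.4],
]
print(rosenfeld_area(grid))  # 4.3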
This operator is appropriate when the concept of FGE arises from mixed pixels and the membership function values correspond to the percentage of the pixel area occupied by the attribute characterizing the FGE. However, if the membership function values represent the degree of uncertainty of whether the pixels can be classified as belonging to a geographical entity characterized by a certain attribute, the spatial extent of the geographical entity is not known and its corresponding area cannot be known either. For this situation the value of the Rosenfeld area operator is just a crisp approximate value of the entity's area, and no information regarding other possible values is given. Similarly, if the membership function values translate a degree of belonging based on similarity to the attribute, the Rosenfeld area operator also has limited applicability (Fonte and Lodwick 2004). To overcome the limitations of the Rosenfeld area operator, a new operator with a fuzzy output, called the fuzzy area operator ($A_F$), was developed (Fonte and Lodwick 2004). A FGE is characterized by a fuzzy set. A fuzzy set can be represented in a unique way by a family of level cuts, for $\alpha \in (0,1]$. Since an alpha level cut is a crisp set
$${}^{\alpha}E = \{\, r_i : \mu_E(r_i) \ge \alpha \,\},$$
its area, denoted here by $Area({}^{\alpha}E)$, is the sum of the areas of the regions belonging to the level cut. That is, for a FGE E,
$$Area_E : (0,1] \to \mathbb{R}_0^+, \qquad \alpha \mapsto Area_E(\alpha) = Area({}^{\alpha}E) = z.$$
Since the level cuts of a fuzzy set are nested, that is, $\beta \le \gamma \Rightarrow {}^{\beta}E \supseteq {}^{\gamma}E$, the function $Area_E$ is decreasing, because $\beta \le \gamma \Rightarrow {}^{\beta}E \supseteq {}^{\gamma}E \Rightarrow Area({}^{\beta}E) \ge Area({}^{\gamma}E) \Rightarrow Area_E(\beta) \ge Area_E(\gamma)$.
Let us consider a set of values $\alpha_i$, $i = 1, \dots, n$, belonging to $(0,1]$ and a set of values $z_i$, $i = 1, \dots, n$, belonging to $\mathbb{R}_0^+$, such that $z_i = Area_E(\alpha_i)$, where $0 < \alpha_i < \alpha_{i+1} \le 1$.
Definition: The fuzzy area $A_F(E)$ of a FGE E is the fuzzy set
$$A_F(E) = \{\, (z, \mu_{A_F(E)}(z)) \,\}, \quad \text{where} \quad \mu_{A_F(E)} : \mathbb{R}_0^+ \to [0,1], \; z \mapsto \mu_{A_F(E)}(z),$$
and
$$\mu_{A_F(E)}(z) =
\begin{cases}
\max\{\, \alpha_i : z_i = Area_E(\alpha_i) = z \,\} & \text{when } \exists\, i : z = z_i \\
\dfrac{z - z_k}{z_{k+1} - z_k}\,(\alpha_{k+1} - \alpha_k) + \alpha_k & \text{when } \nexists\, i : z = z_i \\
0 & \text{when } z \notin [\,Area_E(1),\, Area_E(0)\,]
\end{cases} \qquad (2.2)$$
where $z_k = \max\{\, z_i : z_i \le z \,\}$ and $\alpha_k = \min\{\, \alpha_i : z_k = Area_E(\alpha_i) \,\}$.
The fuzzy area operator generates a fuzzy number (Fonte and Lodwick 2004). Its support is the set of all values the area can take, and each alpha level cut is the set of all values the area can take if level cuts of the FGE corresponding to values larger than or equal to alpha are considered. Table 1 shows the areas of the level cuts of the FGE represented in Fig. 1, considering that each cell has unit area. The resulting fuzzy area is represented in Fig. 2.
Fig. 1. Fuzzy geographical entity E (raster of membership values).
Fig. 2. Fuzzy area of the fuzzy geographical entity E (degree of membership plotted against the area z).
Table 1. Area $z_i$ of the alpha level cuts of E.

 i    α_i     z_i = Area_E(α_i)
 1    0.001   26
 2    0.1     26
 3    0.2     25
 4    0.3     23
 5    0.4     19
 6    0.5     16
 7    0.6     14
 8    0.7     12
 9    0.8     10
10    0.9      7
11    1        4
Several properties of both the Rosenfeld area operator and the fuzzy area operator are analysed by Fonte and Lodwick (2004).
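To make the construction of the fuzzy area concrete, the sketch below (our own illustration, not code from the paper; the grid is hypothetical) computes the level-cut areas $z_i = Area_E(\alpha_i)$ of a membership raster for a chosen list of alpha levels, i.e. the raw values from which the fuzzy number in (2.2) is interpolated.

def level_cut_areas(membership_grid, alphas, cell_area=1.0):
    # Area of each alpha level cut: number of cells with membership >= alpha,
    # multiplied by the (uniform) cell area.
    areas = {}
    for alpha in alphas:
        cells_in_cut = sum(1 for row in membership_grid for mu in row if mu >= alpha)
        areas[alpha] = cells_in_cut * cell_area
    return areas

alphas = [0.001, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
grid = [
    [0.0, 0.5, 1.0],
    [0.2, 0.9, 1.0],
    [0.0, 0.3, 0.4],
]
print(level_cut_areas(grid, alphas))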
3 Perimeter of a Fuzzy Geographical Entity Rosenfeld (1984) proposed an operator to compute the perimeter of a fuzzy set formed by a finite set of contiguous and homogeneous regions. This operator, herein designated the Rosenfeld perimeter ($P_R$), may be applied to FGEs represented by a tessellation. The Rosenfeld perimeter is given by:
$$P_R(E) = \sum_{\substack{i,j=1 \\ i<j}}^{n} \; \sum_{k=1}^{n_{ij}} \big|\, \mu_E(r_i) - \mu_E(r_j) \,\big|\; l(a_{ijk}) \qquad (3.1)$$
where n is the number of elementary regions forming the FGE, $n_{ij}$ is the number of arcs separating regions $r_i$ and $r_j$, and $l(a_{ijk})$ is the length of the k-th arc $a_{ijk}$ of contact between regions $r_i$ and $r_j$. This operator considers that each arc separating contiguous elementary regions has a degree of belonging to the perimeter equal to the difference of the membership grades associated with the elementary regions. Therefore, as for the Rosenfeld area operator, the Rosenfeld perimeter operator generates an approximate real value for the perimeter of the FGE, giving no other information about the other values it can take, nor about its variability with the grades of membership to the FGE. To overcome these limitations of the Rosenfeld perimeter, a fuzzy perimeter operator is proposed.
A level cut of a FGE E is a classical set. Let $P({}^{\alpha}E)$ be the perimeter of the alpha level cut of the fuzzy set representing the FGE. Let us now consider for each FGE E a function $P_E$ such that
$$P_E : (0,1] \to \mathbb{R}_0^+, \qquad \alpha \mapsto P_E(\alpha) = P({}^{\alpha}E) = p.$$
The perimeter values $p_i$ obtained for the FGE E represented in Fig. 1, corresponding to the alpha levels $\alpha_i$, are shown in Table 2.

Table 2. Perimeter $p_i$ of the alpha levels of E.

 i    α_i     p_i = P_E(α_i)
 1    0.001   24
 2    0.1     24
 3    0.2     24
 4    0.3     26
 5    0.4     20
 6    0.5     20
 7    0.6     16
 8    0.7     16
 9    0.8     16
10    0.9     14
11    1        8
Notice that while for the area $\beta \le \gamma \Rightarrow {}^{\beta}E \supseteq {}^{\gamma}E \Rightarrow Area({}^{\beta}E) \ge Area({}^{\gamma}E)$, since the level cuts of a fuzzy set are nested, that does not happen for the perimeter. It can vary in any way over the several level cuts of the FGE. That is,
$$\beta \le \gamma \Rightarrow {}^{\beta}E \supseteq {}^{\gamma}E \;\nRightarrow\; P({}^{\beta}E) \ge P({}^{\gamma}E).$$
Table 2 and Fig. 3 show the values of the perimeter of several level cuts of the FGE represented in Fig. 1. Notice that the perimeters of alpha levels 0.1 and 0.2 are smaller than the perimeter value of alpha level 0.3 (which has a larger area, as can be seen in Table 1). Therefore, the fuzzy perimeter cannot be built in the same way the fuzzy area was, because otherwise, in some cases, the output might not be a fuzzy set. Let us denote by $p_i$, $i = 1, \dots, n$, a set of values of $\mathbb{R}_0^+$ such that $\forall\, \alpha_i : p_i = P_E(\alpha_i)$, where $0 < \alpha_i < \alpha_{i+1} \le 1$.
Definition: The fuzzy perimeter of a FGE E, is the fuzzy set
The continuous line in Fig. 3 shows the fuzzy perimeter of the FGE of Fig. 1. The fuzzy perimeter operator satisfies the following properties (proofs can be found in Appendix A). Property 1: For all FGE E, the support of $P_F(E)$ is a subset of $\mathbb{R}_0^+$. Property 2: If E is a normal FGE then $P_F(E)$ is a fuzzy number.
Fig. 3. Fuzzy perimeter of the geographical entity E (membership function plotted against the perimeter p).
As with the fuzzy area operator, the support of the fuzzy perimeter shows the set of values the perimeter of the FGE can take. As for the fuzzy area, each alpha level cut of the fuzzy perimeter is the set of all values the perimeter can take considering level cuts of the FGE corresponding to values larger than or equal to alpha.
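On a raster, the perimeter $P_E(\alpha)$ of each alpha level cut can be obtained by counting the cell edges that separate cells inside the cut from cells outside it. The sketch below is our own illustration of this idea (hypothetical grid, unit cell sides), producing values analogous to those of Table 2.

def level_cut_perimeter(membership_grid, alpha, cell_side=1.0):
    # A cell belongs to the cut if its membership >= alpha; every side shared with
    # a cell outside the cut (or with the raster border) contributes to the perimeter.
    rows, cols = len(membership_grid), len(membership_grid[0])

    def inside(r, c):
        return 0 <= r < rows and 0 <= c < cols and membership_grid[r][c] >= alpha

    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if inside(r, c):
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    if not inside(r + dr, c + dc):
                        perimeter += 1
    return perimeter * cell_side

grid = [
    [0.0, 0.5, 1.0],
    [0.2, 0.9, 1.0],
    [0.0, 0.3, 0.4],
]
for alpha in (0.2, 0.5, 1.0):
    print(alpha, level_cut_perimeter(grid, alpha))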
4 Shape of a Fuzzy Geographical Entity The shape of a FGE may be evaluated in several ways. One of them is the compactness, given by
$$C(E) = \frac{2\sqrt{\pi\, Area(E)}}{P(E)} \qquad (4.1)$$
where Area(E) is the area of entity E and P(E) represents its perimeter (Forman 1997). The compactness takes values between zero and one. A value of one is obtained for a circle, the value 0.88 for a square and values gradually smaller for more contorted shapes. As the fuzzy operators for the area and perimeter generate fuzzy numbers, a fuzzy shape can be obtained by applying the extension principle to the results of these fuzzy operators (see Appendix B). That is,
$$C_F(E) = \frac{2\sqrt{\pi\, A_F(E)}}{P_F(E)} \qquad (4.2)$$
As the area and perimeter of a geometric figure are not independent variables, the Cartesian product of the fuzzy area and perimeter values should not be considered, but a relation between these values. The result of applying the extension principle to the Cartesian product generates the fuzzy set represented in Fig. 4. Notice that the support of this fuzzy set is even larger than the set [0,1]. That is, we obtain impossible values for the compactness. This happens because we are considering that it is possible to have the area of one level cut and the perimeter of another. That is, we are combining areas and perimeters of different geometric figures, which makes no geometrical sense. A relation should then be defined between the values of the fuzzy area and perimeter. To each area value corresponds a certain membership grade. So, that area value can only be related to the perimeter value having the same membership grade. Considering the fuzzy area and perimeter of the FGE represented in Fig. 1, the relation presented in Table 3 is obtained. Applying the extension principle considering this relation (see Appendix B), the fuzzy set of Fig. 5 is obtained. Its support represents the possible variation of the compactness of the FGE, and the level cuts of the fuzzy compactness are the sets of all values the compactness can take considering alpha level cuts of the FGE corresponding to values larger than or equal to alpha.
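The computation through this relation can be sketched as follows (our own illustration, not code from the paper): for each alpha level, the area and perimeter obtained for that same level (Tables 1 and 2) are combined with (4.1). Building the complete fuzzy number would additionally require taking, for equal compactness values, the supremum of the corresponding membership grades, as stated by the extension principle in Appendix B.

import math

def fuzzy_compactness(alphas, areas, perimeters):
    # One (compactness, membership) pair per alpha level; area and perimeter values
    # are paired only when they share the same alpha level (the relation of Table 3).
    return [(2.0 * math.sqrt(math.pi * z) / p, a)
            for a, z, p in zip(alphas, areas, perimeters)]

alphas = [0.001, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
areas = [26, 26, 25, 23, 19, 16, 14, 12, 10, 7, 4]      # Table 1
perims = [24, 24, 24, 26, 20, 20, 16, 16, 16, 14, 8]    # Table 2

for c, a in fuzzy_compactness(alphas, areas, perims):
    print(f"alpha = {a:<5}  compactness = {c:.2f}")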
Fig. 4. Compactness of a FGE obtained through the application of the extension principle to the Cartesian product of the fuzzy area and perimeter of a FGE (membership function plotted against compactness).
Fig. 5. Compactness of a FGE obtained through the application of the extension principle to the relation in Table 3 defined between the fuzzy area and perimeter of the FGE represented in Fig. 1 (membership function plotted against compactness).
Table 3. Relation between the area and perimeter values of a FGE: each area value $z_i$ is related to the perimeter value $p_i$ obtained for the same alpha level, with degree of membership $\alpha_i$ (cf. Tables 1 and 2).
5 Conclusions The inclusion of FGEs in a GIS requires operators capable of computing their geometric properties, such as area, perimeter and shape. The computation of crisp areas and perimeters of fuzzy sets, and therefore of FGEs, can be done using respectively the area and perimeter operators developed by Rosenfeld (1984). The crisp values generated by these operators are not satisfactory for many semantics, as they just provide an
approximate value, giving no information about the area or perimeter variation with the grades of membership, nor about the set of the other possible values. The limitations of the crisp operators motivated the development of the fuzzy area and perimeter operators. When applied to normal FGEs, both operators generate fuzzy numbers. Their support is the set of all values the area or perimeter can take, and the degrees of belonging to the fuzzy area and fuzzy perimeter represent the possibility of occurrence associated to each area or perimeter value. Fuzzy operators incorporate therefore much more information than the crisp ones. The shape of geographical entities may be evaluated computing their compactness, which is a function of area and perimeter. As the output of both the fuzzy area and the fuzzy perimeter operators are fuzzy numbers, the compactness of a FGE can be computed using the extension principle, generating a fuzzy value for the compactness. Therefore, fuzzy area and perimeter operators enable not only the delivery of more information to the user, but also the propagation of uncertainty to other quantities. The computation of the compactness of FGE is an example of this propagation.
Appendix A
Proof of property 1: The support of $P_F(E)$ is the set $[\min p_i, \max p_i]$. Since the perimeter of a geometric figure is always a positive number, $p_i \ge 0$ for all $i = 1, \dots, n$. Consequently, the support of $P_F(E)$ is a subset of $\mathbb{R}_0^+$. ∎
Proof of property 2: A fuzzy number is a fuzzy interval with a bounded support whose core has one and only one value of $\mathbb{R}$ (Dubois et al. 2000), and a fuzzy set A is an upper semi-continuous fuzzy interval if and only if: 1. The core of A is a closed interval, represented by $[\underline{a}, \overline{a}]$, or an interval of the form $(-\infty, \overline{a}]$, $[\underline{a}, +\infty)$ or $(-\infty, +\infty)$; 2. The restriction of $\mu_A(x)$ to $(-\infty, \underline{a}]$ (when appropriate) is a right-continuous non-decreasing function, and the restriction of $\mu_A(x)$ to $[\overline{a}, +\infty)$ (when appropriate) is a left-continuous non-increasing function. If E is a normal FGE then its core is not an empty set. According to the definition of fuzzy perimeter, the perimeter p of the core of E has a degree
of belonging to the fuzzy perimeter equal to one, and consequently $P_F(E)$ is a normal fuzzy set. Since in the definition of fuzzy perimeter the only value p that has a degree of belonging to the fuzzy perimeter equal to one is the core's perimeter, it can be stated that there is only one value p such that $\mu_{P_F(E)}(p) = 1$. Denoting this value by $p_n$, the core of $P_F(E)$ is the closed set $[p_n, p_n]$. According to the definition of $\mu_{P_F(E)}(p)$, the restriction of $\mu_{P_F(E)}(p)$ to $(-\infty, p_n]$ is an increasing function continuous to the right and the restriction of $\mu_{P_F(E)}(p)$ to $[p_n, +\infty)$ is a decreasing function continuous to the left. Then, $\mu_{P_F(E)}(p)$ is an upper semi-continuous fuzzy interval and, since there is only one value p such that $\mu_{P_F(E)}(p) = 1$, $P_F(E)$ is a fuzzy number. ∎
Appendix B
Extension principle: Let us consider a function $f : X \to Y$ and let $F(X)$ and $F(Y)$ be respectively the fuzzy power sets of X and Y. Then, for every set $A \in F(X)$,
$$f : F(X) \to F(Y), \qquad A \mapsto f(A) = \{\, y : y = f(x),\; x \in A \,\},$$
and the degree of belonging of each value $y \in Y$ to $f(A)$ is given by
$$\mu_{f(A)}(y) =
\begin{cases}
\sup\limits_{x :\, y = f(x)} \mu_A(x) & \text{if } \exists\, x : y = f(x) \\
0 & \text{if } \nexists\, x : y = f(x).
\end{cases}$$
The extension principle is also valid when the function f is defined on a Cartesian product, that is, when $X = X_1 \times X_2 \times \dots \times X_n$ and
$$f : X_1 \times X_2 \times \dots \times X_n \to Y, \qquad (x_1, x_2, \dots, x_n) \mapsto f(x_1, x_2, \dots, x_n) = y.$$
In this case, if $x_1, x_2, \dots, x_n$ are non-interactive variables and $A_1, A_2, \dots, A_n$ are fuzzy subsets of, respectively, $X_1, X_2, \dots, X_n$, the extension principle states that (Dubois et al. 2000)
$$f : F(X_1) \times F(X_2) \times \dots \times F(X_n) \to F(Y)$$
is such that
$$f(A_1, A_2, \dots, A_n) = \{\, y : y = f(x_1, x_2, \dots, x_n),\; (x_1, x_2, \dots, x_n) \in A_1 \times A_2 \times \dots \times A_n \,\}$$
and the degree of belonging of each value $y \in Y$ to $f(A_1, A_2, \dots, A_n)$ is given by
$$\mu_{f(A_1, \dots, A_n)}(y) =
\begin{cases}
\sup\limits_{(x_1, \dots, x_n) :\, y = f(x_1, \dots, x_n)} \min\big[\, \mu_{A_1}(x_1), \dots, \mu_{A_n}(x_n) \,\big] & \text{if } \exists\, (x_1, \dots, x_n) : y = f(x_1, \dots, x_n) \\
0 & \text{if } \nexists\, (x_1, \dots, x_n) : y = f(x_1, \dots, x_n).
\end{cases}$$
The extension principle can also be applied to relations. In this case the extension principle is only applied to the combinations of elements of X 1 , X 2 ,..., X n that belong to the relation (Dubois et al. 2000).
References
Dubois D, Prade H (2000) Fuzzy interval analysis. In: Dubois D, Prade H (eds) Fundamentals of Fuzzy Sets. The Handbook of Fuzzy Sets Series, Kluwer Acad. Publ., pp 483-581
Erwig M, Schneider M (1997) Vague Regions. In: 5th International Symposium on Advances in Spatial Databases. Springer Verlag, LNCS 1262, pp 298-320
Fonte C, Lodwick W (2003) Modelling the fuzzy spatial extent of geographical entities. In: Cobb M, Petry F, Robinson V (eds) Fuzzy Modeling with Spatial Information for Geographic Problems. Springer-Verlag. In press
Fonte C, Lodwick W (2004) Areas of Fuzzy Geographical Entities. International Journal of Geographical Information Science, Vol. 18, No. 2, 127-150
Forman R (1997) Land Mosaics: The Ecology of Landscapes and Regions. Cambridge University Press
Katinsky M (1994) Fuzzy Set Modeling in Geographic Information Systems. Unpublished Master's Thesis, Department of Geography, University of Wisconsin-Madison, Madison, WI
Klir G, Yuan B (1995) Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall PTR, New Jersey
Rosenfeld A (1984) The diameter of a fuzzy set. Fuzzy Sets and Systems, Vol. 13, 241-246
Why and How Evaluating Generalised Data?
Sylvain Bard and Anne Ruas
Laboratoire COGIT, Institut Géographique National, 2 Avenue Pasteur, 94165 Saint Mandé, France. [email protected]
1 Introduction In science, one of the main challenges is measuring. A measure allows us to describe a phenomenon (e.g. the size of a human) or to compare different phenomena (e.g. one human is taller than another one). Science is divided over our capacity to measure: we talk of 'exact sciences' - the sciences that can be measured - and the others, the 'experimental sciences'. In experimental sciences, such as medicine, biology or geography, measuring is essential but always complex and subject to human interpretation. Some measurements are performed on specific phenomena in order to understand a part of them. As an example one can measure different chemical components in human blood to detect a disease, the state of a disease, or the reaction of a human to a medicine used to cure a disease. Thus the health of a
human is perceived by means of a set of partial measures. Experimental or human sciences deal with complex phenomena that we want to understand, perhaps master, or to improve. To do so we measure these phenomena and we learn to manage the uncertainty associated with these measurements as well as the measures that we omitted. In GIS - the science of geographic information - measuring is also essential. Measuring the earth is the core of our activity, and has been since the time of Ptolemy. During the twentieth century, a tremendous effort has been made to improve the quality of the earth measurement. Much research has been undertaken to measure the quality of these new methods of positioning as well as the quality of the coordinates (geometric positional accuracy). As users, this geometric accuracy gives us a way of understand and trust the position of the data through analysis and processes. But GIS activity encompasses not only earth measurement but also its representation which includes a necessary synthesis of reality. As noted by J.L. Borges, the creation of a one to one scale map is unrealistic (and of little use), a map is always the result of a synthesis of the geographic world. This synthesis starts with the acquisition step – only a part of the reality is digitized – and continues with the representation step: only a part of the digitized information is represented on a map. In the GIS community, this process of synthesis is named generalisation. Generalisation is not a mere reduction of information – the challenge is one of preserving the geographical meaning. To do this, the initial information is analysed, structured, compared, and evaluated in order to remove what is not essential and to enhance what is important. A geographical data base is a generalisation of the reality. A map is also a generalisation of the reality. As research has been undertaken to evaluate the quality of geographical object positioning (Goodchild et al 1998), some research should also be performed to evaluate the quality of the abstraction and of the representation of earth on a map. We are thus looking for a way to measure the quality of generalised data, following other research studies such as (Ehrliholzer 1995) and (João 1998). As users, the description of the quality of generalised data would give us a way to understand how far we can use and trust the information contained in the generalised data for specific analysis and processes performed on this generalised data. This paper presents our proposal to evaluate generalised data from non generalised data. Ideally, we would like to conceive a single measurement that would be able to compute a single number that would characterise the quality of a generalised data set. Of course the evaluation is a complex process and we are still far from being able to conceive this magical generic measurement that would fit with all types of generalisation. How-
2 Decomposition of the Evaluation Process 2.1 Different types of evaluation The evaluation is useful to qualify a product but also to improve the quality of the product. The product to evaluate is either the output (i.e. the generalised data) or the system (i.e. the process of generalisation). Evaluating the results is useful in itself but can also be used to improve the system. Evaluating the system is useful to improve itself and consequently the future results (Ruas 2001). In this research we concentrate only on the evaluation of the output (the generalised data), keeping in mind it will be useful for the system improvement in the future. In the following we name ‘object’ the digital representation of a geographic entity. When we evaluate objects, we evaluate the digital representation of geographic entities. In order to reduce the complexity of evaluation, we first need to differentiate between the types of evaluation. An evaluation can be processed: x to detect inconsistency, which are the objects or groups of objects that have not been correctly generalised. In such a case, the aim of the evaluation is to detect automatically the inconsistencies to allow a user to correct the generalised data as a final step of generalisation process. We call this evaluation the evaluation for editing. The errors may come
from a missing algorithm, or an algorithm that gives non coherent results for specific cases (e.g. a building simplification algorithm that would badly generalise ‘round shape’ buildings). x to improve the description of the final product as metadata. In such a case it should be more synthetic than the evaluation for editing. The aim is not to describe what is wrong but what are the characteristics of the generalised data. We call this evaluation the descriptive evaluation. This information describes the ‘distance’ between the reference and the generalised data. It is also fundamental to tune or to improve the system and its algorithms. More precisely, it describes : What has been removed from the reference: the type and the quantity, and if possibly why, under which conditions some objects are removed or aggregated (e.g. smallest buildings have been removed, whatever their type), The type and average quantity of the distortions that have been made on the maintained objects (e.g. the average of displacement from initial position is 10 m), x to mark the evaluation with a unique value that defines the quality of the generalisation. This would be very useful to compare different solutions of generalisation obtained with different set of parameters or with different software of generalisation. We name this evaluation the evaluation for marking. This evaluation is the most synthetic and the most complex one. To sum up, we differentiate three type evaluations : 1. the evaluation for editing aims at identifying errors and mistakes. It is an automatic process necessary as a checking final step of generalisation process, 2. the descriptive evaluation aims at summarising : the information reduction: what has been removed ? the information distortions: what are the differences between the generalised data and the initial ones ? 3. the evaluation for marking, aims at aggregating the descriptive evaluation into a single synthetic value. 2.2 Using different levels of detail for the evaluation Usually, a process of evaluation consists in measuring a distance between the data to evaluate and a reference. Here we have no perfect reference: the only reference we have are the non generalised data.
x The advantage is that we do not have the ‘classical evaluation problems’ such as the appropriate choice of samples, the data matching or the choice of the extrapolation function. We have all the data before and after generalisation and we measure and interpret the difference between both data sets. x The drawback is that the reference is over detailed: the functions of evaluation f are more complex than the identity (xgen = xini + H ). These functions should integrate the appropriate and required level of detail (LOD). As an example, the number of objects should be reduced and the size of the polygons should be bigger than a certain value. Moreover, some objects are removed or aggregated during the process. The evaluation in such a case should occur at a more global level. As we needed different levels of analysis to automate the process, we propose to use different levels of analysis for the evaluation. As a consequence some evaluation will occur at the object, at the group or at the population level : x We call micro evaluations, the evaluations that occur at the object level. We compare one non generalised object with one generalised object (1to-1 relationship) x We call meso evaluations the evaluations that occur at the group level: we compare a group of non generalised objects with a group of generalised objects (n-to-m relationships, n greater than m; n and m having small values). For example a meso evaluation is used to compare two groups of buildings, or the street network of a town. x We call macro evaluations the evaluations that occur at the data set level: we compare all data of a type before and after generalisation to control object removal and maintenance (n-to-m relationships, n greater than m; n and m having big values). Macro evaluation is used to control the evolution of the quantity (in number, in length, in area) of the population such as houses, streets, administrative buildings or forest. Two specific cases should be enlighten : x the 1-to-0 relationship corresponds to an object removal. It is studied at the meso level to check if this object should be contextually removed and at the macro level to check if this object should be globally removed, x the n-to-1 relationship corresponds to an aggregation : several objects are merged into a single one. This particular case can be viewed as a micro or meso evaluation. Whatever the case, the generalised object should represent the n non generalised objects.
2.3 A bottom up approach To evaluate the generalised data, we propose to use a bottom up approach based on an accurate description and evaluation of each micro, meso and macro object. This information is then gradually aggregated according to the evaluation needs: 1. Each object is described by a set of characteristics before and after generalisation. Taking the example of buildings, it means that : each building (micro) is described by its characteristics (its size, its shape, its orientation, its granularity) before and after generalization, each group of buildings (meso) is described by its characteristics before and after generalisation: its number of objects, the set of distances, and the specific positioning if any, the population of buildings (macro) is described by its characteristics before and after generalisation: the number of buildings and the sum of building areas; 2. for each data type and each characteristic, a specific evaluation function is used to compare and to interpret the value of each characteristic of each object before and after generalisation. This evaluation is named the ‘characteristic evaluation’. At this step there is no aggregation of values at all. One by one, each characteristic is evaluated from very good to bad by means of an appropriate evaluation function. As a consequence all ‘bad or very bad generalisation’ cases are detected. This level is used for the evaluation for editing. It detects errors and mistakes. It also gives the qualitative values that will be used for the different aggregations; 3. a set of aggregations can then be performed at the object or characteristic levels: the ‘object evaluation’ allows to evaluate the generalisation of each object by a synthesis of the evaluation of its characteristics. As a result, each building will have one quality value that synthesises its generalisation, the ‘characteristic type evaluation’ allows to evaluate one characteristic for all objects of a type. This information enriches the description of the macro object. As an example one function is used to summarise the size of all buildings. The building macro object (i.e. the population of buildings) will have one ‘size quality value’ that describes the way the size has been generalised for all buildings, 4. an ‘object type evaluation’ can then be computed either from the synthesis of all ‘object evaluation’ of a type (i.e. the aggregation of all
buildings ‘quality value’) or from the synthesis of all ‘characteristic evaluation’ (i.e. the aggregation of all characteristics quality value) of the macro object ‘population of buildings’ 5. a final aggregation can then be performed to compute a ‘global mark’ from the previous evaluation results. In the following we present two fundamental points : the data modelling used to automate this evaluation (section 3) and the evaluation functions used to compute the ‘characteristic evaluation’ (section 4). The results and a first aggregation are presented in section 5.
3 Data Modelling for the Evaluation The proposed evaluation model combines (1) the information required to provide a generalisation evaluation modelled as classes and attributes; and (2) the information required to perform generalisation evaluation, that is, the algorithms modelled as methods. The geographic information is represented through a geographic model composed of classes of objects representing features of the real world, such as buildings, roads or vegetation. The model for evaluation schematised in figure 1 is an extension of a classical data model, with the addition of an explicit representation of the two classes that represent data before and after generalisation : Geographic_object_ungeneralised and Geographic_object_generalised. According to the multiple representation paradigm, we would say that we explicitly represent different classes for each level of detail. Then, two classes Characterisation and Evaluation have been added to hold the evaluation functions and results and at least two others classes Specification and Generalisation_Parameter contain the information that describe the parameters used for each evaluation process. These four classes are described hereafter.
Fig. 1. The proposed model to evaluate generalised data (classes Characterisation and Evaluation with the methods Compare(), Evaluate() and Aggregate()).
Characterisation. The Characterisation class holds the descriptive information of the ungeneralised and generalised objects such as the size, the shape and the orientation for a building. This class is associated with the Geographic_object class. The inheritance relation permits the characterisation of ungeneralised and generalised geographic objects. A method Characterise() is associated with this class to calculate the value of the attributes (i.e. the characteristics). Evaluation. The Evaluation class reports the evaluation information computed during the process. So this class is associated with the class of generalised objects only. One object ‘evaluation’ is thus associated with one generalised geographic object. This evaluation object holds two kinds of information : (1) a qualitative evaluation for each characteristic (the characteristic evaluation), and (2) a qualitative evaluation for the geographical object (the object evaluation) computed from the aggregation of the different characteristic evaluation values. Three methods are defined for this class: Compare(), Evaluate() and Aggregate(). Compare() quantifies for each type of characteristic the distance between the value after generalisation and what it should be according to generalisation specification; Evaluate() interprets the quantitative value computed by Compare(). The result is the ‘characteristic evaluation’ which is a qualitative evaluation for each characteristic (e.g. let us named eb.12 the evaluation object of one building, Evaluate(size(eb.12)) = good). The last method Aggregate() computes the ‘object evaluation’ which is a qualitative value for each geographic object (e.g. Aggregate(eb.12) = rather good). Parameter_generalisation. The Parameter_generalisation class holds the general information on the generalisation: initial scale, final scale and a list of thresholds. This list of thresholds corresponds to the cartographic
threshold of legibility. These thresholds are automatically computed from the scales with the method Default(). An interface allows the user to change these default values on demand. Specification. The Specification class holds the user specifications for the evaluation, i.e. the generalisation requirements. It is structured by feature class so one object of this class describes all the evaluation parameters of a class of Geographic_object: the list of the characteristics to assess, the shape of the evolution function associated to each characteristic, the parameters of this function, the tolerance and the aggregation operator. The method Default() proposes default functions and computes the parameters of these functions from the generalisation scales. An interface allows the user to change these default values on demand (see figure 5 in (Bard 2004a)).
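The model of Fig. 1 can be sketched in code as follows (our own minimal illustration; only the class and method names come from the text, while the attributes and the simplified two-value interpretation are assumptions - the four-value interpretation of section 4.2 would replace it).

class Specification:
    # Per characteristic: evolution function and tolerance (cf. section 4).
    def __init__(self, evolution, tolerance):
        self.evolution = evolution    # e.g. {"size": callable}
        self.tolerance = tolerance    # e.g. {"size": 50.0}

class Characterisation:
    # Descriptive measures of one geographic object, before or after generalisation.
    def __init__(self, geometry):
        self.geometry = geometry
        self.values = {}

    def characterise(self, measures):
        # measures: {"size": callable, ...} applied to the object's geometry
        for name, measure in measures.items():
            self.values[name] = measure(self.geometry)

class Evaluation:
    # Evaluation object attached to one generalised geographic object.
    def __init__(self, initial, generalised, spec):
        self.initial = initial
        self.generalised = generalised
        self.spec = spec
        self.char_eval = {}

    def compare(self, name):
        # Deviation between the observed value and the ideal value given by the
        # evolution function applied to the initial value.
        ideal = self.spec.evolution[name](self.initial.values[name])
        return self.generalised.values[name] - ideal

    def evaluate(self, name):
        # Placeholder two-value interpretation of the deviation.
        deviation = self.compare(name)
        self.char_eval[name] = "good" if abs(deviation) <= self.spec.tolerance[name] else "bad"
        return self.char_eval[name]

    def aggregate(self):
        # Placeholder aggregation: the worst characteristic drives the object evaluation.
        return "bad" if "bad" in self.char_eval.values() else "good"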
4 Function of Evaluation of a Characteristic In this section we only concentrate on the method used to evaluate the generalisation of a characteristic of an object. This part is fundamental as the different aggregations (point 3, 4 and 5 of section 2.3) are all based on the values computed at this step of the process. The function Compare() computes a distance between the final value of the characteristic of an object and what it should be. This ideal value is computed by means of the initial value and a function of evolution that holds the constraints of generalisation. Then, the function Evaluate() computes a qualitative interpretation of the quantitative value (the deviation or the distance) computed by Compare(). Thus the evaluation is based on the evolution function (4.1) and on an interpretation function (4.2) based on a tolerance and a sensitivity values. Char_eval = Evaluate [Compare (evolution-fct, ini_value, gen_value); tolerance; sensitivity] 4.1 The evolution function The evolution function describes the ideal (or theoretical) value of a characteristic after generalisation from its initial value before generalisation. Actually, during generalisation because of legibility constraints, some characteristics should be preserved (for instance the position, the orientation and the shape), while others should often be modified (such as the size, the granularity or the proximity). These functions reflect and are parameterised by the cartographic thresholds necessary to perceive, separate
and differentiate features on a map. Different shapes for the evolution function have been reported in Figure 2: Preservation for preservation constraints; Simplification, Amplification and Threshold effect for modification constraints. Nevertheless, to obtain a good generalisation, a raw and systematic change of feature characteristics up to the legibility thresholds is not sufficient because it creates a loss of differentiation. A 'good generalisation' should preserve the main relationships between features, such as relative sizes. Thus the appropriate evolution function is often a combination of several typical evolution functions. Anyway, these different shapes of evolution functions make it possible to compute the theoretical or ideal value of a characteristic from each initial value and from the user's specification. This ideal value is compared to the observed value after generalisation.
Fig. 2. Different evolution functions (a) Definition of the theoretical value (b).
Deviation = Compare(..) = obs_val – ideal_val = obs_val – (evolution_function (initial_val)) Then, the evaluation (4.2) consists in the interpretation of this deviation: is it acceptable or not? The detection of the appropriate evolution function per characteristic has been studied for buildings at the micro level and groups of buildings at the meso level. Hereafter, we present the synthesis of the results, the tests can be found in detail in (Bard 2004b). Figure 3 synthesises the classical possible functions that can be used for the evolution. Table 1 summarises our choices for buildings at the micro and meso levels. These functions are proposed to the user during the initialisation step but they can be changed on demand.
Fig. 3. The main functions of evolution detected during our experiments: preservation, reduction, amplification, threshold, saturation, differentiation and combination of shapes (initial value Vi plotted against final value Vf).

Table 1. The evolution functions for each characteristic, by level of analysis: the building (micro level) is described by the characteristics size, shape, position, orientation and granularity; the group of buildings (meso level) by alignment, free space, cluster of buildings, density, proximity and semantic repartition.
4.2 The interpretation from the deviation, the tolerance and the sensitivity The result of the evaluation is an interpretation of the deviation computed by Compare() (4.1). The aim is to qualify the generalisation of a characteristic. This evaluation has to be flexible enough to accept different possible generalisation solutions (Spiess 1995). We propose the use of a tolerance value to distinguish the set of acceptable solutions of generalisation from the set of non acceptable ones (Figure 4a). If the observed value is con-
tained inside the tolerance area, the generalisation is acceptable; if not, the generalisation of the characteristic is bad. Moreover we add another threshold - the sensitivity - which corresponds to the solutions of generalisation perceived as equivalent. The sensitivity describes an interval of values in which we do not visually perceive any differences of generalisation, while the digital values are different. For instance at 1:50 000, due to the limitation of the human eye, we cannot perceive any difference between a building of 450 m² and the same building enlarged up to 465 m². As a consequence we consider that these buildings have equivalent sizes.
Fig. 4. a) the tolerance, b) the sensitivity c) the qualitative interpretation
The qualification of the deviation with an interpretation function facilitates the understanding of the evaluation and constitutes a normalisation of the different values of evaluation. Four qualitative values have been selected to describe good, rather good, rather bad and bad quality (Figure 4c). The tolerance value separates acceptable (1) and not acceptable results of generalisation (2). The sensitivity value (not shown in figure 4c) allows a better interpretation of values, particularly at the aggregation step. Both threshold values (the tolerance and the sensitivity) depend on the final scale and on the user requirement. Default values are defined by means of text analysis (such as (Spiess 1995)) and specific experiments.
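The following sketch (our own illustration, not the authors' implementation) shows one way such an interpretation function might map a deviation onto the four qualitative values using the tolerance and the sensitivity; the intermediate cut-off between 'rather bad' and 'bad' is an assumption, since the text does not give it.

def interpret(deviation, tolerance, sensitivity):
    # Tolerance separates acceptable from non-acceptable generalisations;
    # sensitivity is the band within which solutions are perceived as equivalent.
    d = abs(deviation)
    if d <= sensitivity:
        return "good"            # visually indistinguishable from the ideal value
    if d <= tolerance:
        return "rather good"     # acceptable generalisation
    if d <= 2 * tolerance:       # assumed cut-off, not taken from the paper
        return "rather bad"
    return "bad"

# Example for the building size characteristic: tolerance 50 m2, hypothetical sensitivity 15 m2.
print(interpret(80.0, tolerance=50.0, sensitivity=15.0))   # rather bad
print(interpret(30.0, tolerance=50.0, sensitivity=15.0))   # rather good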
5 Experiments and Results The aim of this section is to present some results to illustrate our method of evaluation. For this we have selected 1) the evaluation of a characteristic at micro and meso level (5.1 and 5.2), and (5.3) the complete evaluation of a geographic object for a set of characteristics (object evaluation). We deliberately made bad generalisation examples to enrich our experiments. As said previously, the evolution functions we have chosen for our ex-
periments are only choices (and not the truth) and they can be changed on demand. 5.1 The evaluation of the size characteristic (micro level) The assessment of the size characteristic is based on three components: (1) the respect of a minimum size (legibility threshold), (2) the preservation of the order of building sizes and (3) a moderate amplification of buildings of average size. The shape of the evolution function is based on these components and is presented in figure 5: the threshold is fixed to 400 m² and the tolerance for the interpretation is 50 m². This function is adapted to 1-to-1 relationships between generalised and non generalised buildings. For n-to-1 relationships (aggregation) the initial value is replaced by a representative value. The representative value is based either on the sum of the sizes of the n initial objects, or on the size of the convex hull of the n initial objects.
Fig. 5. The evolution function for the characteristic of size
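As a minimal sketch of such an evolution function (our own illustration; the paper's exact curve is not reproduced, only the 400 m² legibility threshold and the 50 m² tolerance mentioned above), the ideal size can be modelled as the initial size raised to the legibility threshold when it falls below it.

LEGIBILITY_THRESHOLD = 400.0   # m2, minimum legible building size at the target scale
TOLERANCE = 50.0               # m2, tolerance used for the interpretation

def ideal_size(initial_size):
    # Threshold-effect evolution: small buildings are enlarged to the legible minimum,
    # larger buildings keep their size (so the order of sizes is preserved).
    return max(initial_size, LEGIBILITY_THRESHOLD)

def evaluate_size(initial_size, generalised_size):
    deviation = generalised_size - ideal_size(initial_size)
    return "acceptable" if abs(deviation) <= TOLERANCE else "not acceptable"

print(evaluate_size(250.0, 430.0))   # acceptable: enlarged close to the legible minimum
print(evaluate_size(250.0, 700.0))   # not acceptable: the building is too much emphasised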
This function has been tested on an extract of urban data. Figure 6 shows some bad (b) and good (c) examples of generalisation. Our evaluation function correctly separates the good and the bad results. As an example, this function detects the buildings that have been too much emphasized.
Fig. 6. Evaluation of size characteristic : initial state (a), generalisation of bad quality (b) and generalisation of good quality (c) (black : bad, white : good)
important, intermediate or not important. A weight is computed according to the value of evaluation (from Good to Bad), and a global value is then computed by the sum of the weights for all the characteristics. Figure 9 illustrates the application of this aggregation operator on a set of buildings.
Fig. 8. Weight of building characteristics (a) -The interpretation function (b)
Fig. 9. Evaluation of building characteristics: initial state (a), different generalisations and their evaluation (Good, Rather good, Rather bad, Bad).
6 Conclusion Whereas much research has been undertaken to measure positional accuracy, some more research should be done on the representation issue (Hunter 2001). Without yet being able to propose one single measure that would mark generalised data, we propose a methodology, a model and algorithms that can be used for evaluation purposes. As it is unrealistic to foresee user needs, our strategy is to propose interfaces with default values and default functions that the user can change according to their needs. Moreover, to add more flexibility, we decomposed the process of evaluation into sub-processes. Specifically, the evaluation of the generalisation of an object or a population of objects follows a bottom-up approach by means of an aggregation function chosen by the user according to their needs. The evaluation of generalised data is an important research domain
as we should be able to evaluate how geographical data represents the real world.
References
Bard, S., 2004a, Quality Assessment of Cartographic Generalisation. Transactions in GIS, 8, 63-81.
Bard, S., 2004b, Méthode d'évaluation de la qualité de données généralisées - applications aux données urbaines. PhD Thesis, University of Paris 6, France.
Boffet, A., and Rocca-Serra, S., 2001, Identification of Spatial Structures within Urban Blocks for Town Characterization. In Proceedings of the 20th International Cartographic Conference (Beijing: ICC), Vol. 3, pp. 1974-1983.
Ehrliholzer, R., 1995, Quality assessment in generalization: integrating quantitative and qualitative methods. In Proceedings of the 17th International Cartographic Conference (Barcelona: ICC), pp. 2241-2250.
Goodchild, M., and Jeansoulin, R., 1998, Foreword. In Data Quality in Geographic Information: From Error to Uncertainty, edited by Goodchild, M., and Jeansoulin, R. (Paris: Hermes), pp. 11-14.
Hunter, G. J., 2001, Spatial Data Quality Revisited. Keynote Address in Proceedings of the Geo 2001 Symposium, edited by Claudia Bauzer-Medeiros, pp. 1-7.
João, E. M., 1998, Causes and Consequences of Map Generalization (London: Taylor and Francis).
Ruas, A., 2001, Automatic generalisation project: learning process from interactive generalisation. OEEPE publication, n° 39.
Spiess, E., 1995, The need for generalization in a GIS environment. In GIS and Generalization: Methodology and Practice, edited by Müller, J. C., Lagrange, J. P. and Weibel, R. (London: Taylor & Francis), pp. 31-46.
Road Network Generalization Based on Connection Analysis Qingnian Zhang1,2 1
Department of RS and GIS, Sun Yat-Sen University, Guangzhou, China; GIS Centre, Lund University SE-223 62 Lund, Sweden Email: [email protected]
2
Abstract Road network generalization aims to simplify the representation of road networks by reducing details, while maintaining network connectivity and overall characteristics. This paper presents a method to select salient roads based on connection analysis. The number of connections is counted at each junction, which acts as a parameter indicating the association between salient roads. Several strategies are proposed to combine connection criterion with road length and road attributes (when available), so as to order roads corresponding to their relative importance in road network generalization. A case study shows that connection criterion is favored for maintaining the density differences in road network, and the combined criteria creates more reasonable results than only length criterion. Keywords: Road network, Map generalization, Connection analysis
1 Introduction Road network generalization is an important issue in map making, and many methods have been proposed. The most common technique in road generalization is to select and omit roads according to semantic information, e.g., function class. However, semantic information is not enough for road network generalization. More advanced techniques take topological and geometric information into account, such as Graph Theory based methods (Mackaness and Beard 1993; Mackaness 1995; Thomson and Richardson 1995; Jiang and Claramunt 2003; Jiang and Harrie 2003), the perceptual grouping based method (Thomson and Richardson 1999; Thomson and Brooks 2002) and the agent-based method (Morisset and Ruas 1997; Lamy et al 1999).
Graph Theory has been popularly used in the structure analysis of road networks. Mackaness and his colleague (Mackaness and Beard 1993, Mackaness 1995) computed connectivity properties and the MCST (Minimum Cost Spanning Tree) according to graph theory in road/street network generalization. (Thomson and Richardson 1995) decomposed road network into a set of SPSTs (Shortest Path Spanning Tree). Each segment is ordered on a SPST according to rules similar to that of stream ordering. Road attributes can be further integrated into SPSTs for the evaluation of the relative importance of road segments. Jiang and his co-authors (Jiang and Claramunt 2003; Jiang and Harrie 2004) used a modified graph theory framework to model road networks, which took named streets as nodes and street intersections as links of a connectivity graph. Road geometry can be used as another important factor in road network generalization. Thomson and Richardson (1999) introduced the concept “stroke” into road generalization using road geometry. A stroke is a chain of road segments with good continuation. The term “stroke” indicates that the chain (a curvilinear element) is smooth, as if it can be drawn in one smooth movement and without a dramatic change in style. A road network is partitioned into a set of strokes. Strokes have no branches, but they may intersect each other. Further analysis based on road attribute and length allows the strokes to be ordered, which creates a sequence reflecting their relative importance in the network. From a functional point of view, Morisset and Ruas (1997) evaluated the importance of roads according to the amount of road use. They proposed a method to select roads of high frequency usage by means of an agent-based simulation in road generalization. This paper proposes a method concentrated on the geometric aspects and the association between roads. It is organized as follows. The next section introduces the criteria for road network generalization. The third section proposes a method for road network generalization based on connection analysis. This method was implemented in a Java environment. Section 4 presents some results in a case study testing this implementation. Existing problems and further research are discussed in Section 5. We conclude the paper in Section 6.
2 Criteria for Road Network Generalization 2.1 Important Roads/Junctions by Attributes Attributes are widely used in road network generalization. Some attributes can indicate the relative importance of roads and junctions to a large degree, and are thus quite helpful in map generalization. For example, roads may be divided into different function classes; Cities at junctions may belong to different administrative levels. Those attributes can be used conveniently to select important roads/junctions, when they are available. 2.2 Connectivity of a Road Network Most roads are connected to each other in original maps. Generally speaking, the connectivity between roads should be remained. Some researchers defined the connectivity criteria as to maintain the shortest link between two terminals as adopted in MCSTs (Mackaness and Beard 1993) and SPSTs (Thomson and Richardson 1995). 2.3 Salient Roads and Junctions Hub-like junctions are those ones where many roads converge. Long roads and hub-like junctions are salient elements in road networks, whose overall structure depend on those salient elements to a large extent. Furthermore, those elements often indicate important roads and cities in road networks (Thomson and Richardson 1999, Mackaness and Beard 1993). 2.4 Characteristic Patterns Regular patterns may exist in road networks. For instance, star-like road networks exist in many regions, where many roads converge at a hub-like junction. Those patterns may not exist in the whole map area, but they consist of the most important characteristics in road networks. However, regular patterns are somehow difficult to be formalized and, accordingly, maintained in generalized maps.
2.5 Density Difference between Different Areas Road density is generally different between different areas. Thus more roads should be maintained in dense areas than those in sparse areas, so as to keep density differences in generalized maps. 2.6 Summary We described several criteria in road network generalization. Additional criteria may be introduced, e.g., the legible space between roads. However, it may be impossible to implement comprehensive criteria in the current environment. We tend to emphasize two criteria in our method, that is, salient (long) roads and density differences. Density differences are indicated by the number of connections in this study. Since connectivity and network patterns are related to salient roads and density differences in a way, we expect to satisfy these two additional criteria to some extent at the same time.
3 Methodology

Our methodology selects salient roads based on connection analysis in road networks. It can be viewed as a four-stage process. The first stage creates the salient elements, that is, "strokes". The following stage computes the number of connections for each stroke. Then strokes are ordered by the number of connections and other parameters, indicating their relative importance. Lastly, strokes are selected according to this sequence with a specified reduction degree.

3.1 Create "Strokes"

We create "strokes" as salient elements using road geometry, as suggested by Thomson and Richardson (1999), but with a minor alteration: when road attributes are available, we merge segments into strokes based primarily on attributes and secondarily on good continuation, rather than primarily on good continuation. Here, however, we concentrate on the case where no attributes are available. We create strokes by merging segment pairs. At each junction, all connected segments are compared to each other.
The two segments with the most similar directions form the best pair. Taking Fig. 1 as an example, segments e and f, which meet at point P, form the best pair.
Fig. 1 Grouping segments into strokes with good continuation
When road attributes are available, we select two segments with the same attributes as the best pair. If more than two segments have the same attributes, similar direction is considered as well. Otherwise, if no segments have the same attributes, we select the pair with the most similar attributes and directions. When more than one segment pair exists at a junction, the pairs are merged one by one: the best pair is merged first, then the next best pair. The process is repeated until no further pair exists.

3.2 Connection Analysis

The purpose of connection analysis is to provide an auxiliary parameter for evaluating the relative importance of the strokes created in the first stage. In this stage, we count the number of connections at all junctions for each stroke; that is, for each stroke, how many times other strokes connect to it. A stroke may touch another stroke at a point, cross another stroke at a point, cross another stroke at several points, or not intersect another stroke at all. As shown in Fig. 2, strokes a, b, c and d intersect each other at two or more points; segment b touches segment a at point P1, while segment c crosses segment b at point P2. We distinguish these cases with different connection counts as follows (see the sketch after this list).
a. When a stroke touches the current stroke at a point, we count it as one connection.
b. When a stroke crosses the current stroke at a point, we count it as two connections, regarding it as two strokes touching the current stroke on different sides.
c. When a stroke crosses or touches the current stroke at several points, we count the connections at each point as in the touching or crossing cases, and sum the counts.
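As an illustration of this counting rule only, the following minimal Java sketch tallies connections per stroke from a pre-computed list of touch/cross events; it is not the original implementation, and the Connection record and all identifiers are hypothetical.

    import java.util.*;

    public class ConnectionCount {
        // One event recorded from the perspective of `stroke`: another stroke
        // touches it (counted as 1) or crosses it (counted as 2) at one point.
        record Connection(int stroke, String kind) {}

        // Accumulate counts per stroke; multiple intersection points of the
        // same pair of strokes simply add up, as in rule c above.
        static Map<Integer, Integer> countConnections(List<Connection> events) {
            Map<Integer, Integer> counts = new HashMap<>();
            for (Connection e : events) {
                counts.merge(e.stroke(), e.kind().equals("cross") ? 2 : 1, Integer::sum);
            }
            return counts;
        }

        public static void main(String[] args) {
            // Mirrors Fig. 2: b touches a at P1 (counted for a),
            // c crosses b at P2 (counted for b).
            List<Connection> events = List.of(
                new Connection(0, "touch"),
                new Connection(1, "cross"));
            System.out.println(countConnections(events)); // e.g. {0=1, 1=2}
        }
    }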
Fig. 2 Strokes touching or crossing other strokes
A straightforward algorithm is to count the number of connections by detecting intersections between strokes. In this way, however, we have to know how many times two strokes intersect and further distinguish the crossing cases from the touching cases in order to get the right count. We adopted an alternative algorithm that counts the number of original segments touching a stroke. Original segments are often stored as primitive geometries broken up at intersections. Such segments touch strokes at junctions rather than crossing them, let alone intersecting a stroke at two or more points. An original segment that touches a stroke but is not part of that stroke is counted as one connection of the stroke.

3.3 Order Strokes

3.3.1 Compare three criteria

At this stage we have three parameters: stroke length, number of connections, and attributes (when available). Strokes are ordered using these three parameters. Among them, road attributes are the most important factor when they are available. When road attributes are unavailable, road length is used as the importance factor. As mentioned in Sect. 2, long roads coincide with important roads with a greater probability than short roads. However, density differences often exist in road networks, which result in different road lengths in different areas. A uniform length criterion is not good at maintaining such differences. The connection criterion may be more important than the length criterion when we emphasize network structure, because it reflects how tightly a stroke is interwoven into the network. This factor also indicates road density effectively, since, for the same road length, the number of connections will be larger in dense areas than in sparse areas.
However, since we have to omit more roads in dense areas than in sparse areas, a balance has to be struck when using the connection criterion.

3.3.2 Define strategies to combine the criteria

Since all three factors are important in road network generalization, we should combine them in reasonable ways. A combination strategy may stress one factor at the cost of the others, or balance them. When road attributes are unavailable, we order strokes by the geometric criteria, i.e., length and connection. We propose four possible strategies to combine these two criteria (see the sketch after this list).
a. Order strokes by the length criterion only, as suggested by Thomson and Richardson (1999).
b. Order strokes by the connection criterion first and length second.
c. Order strokes by the product of stroke length and a connection ratio. For each stroke, the connection ratio is calculated as the number of connections of this stroke divided by the average number of connections over all strokes; the length of each stroke is adjusted by multiplying it by this ratio.
d. Order strokes by the product of the number of connections and a length ratio. For each stroke, the length ratio is calculated as the length of this stroke divided by the average length of all strokes; the number of connections of each stroke is adjusted by multiplying it by this ratio.
Among the four strategies, a and b order strokes by multiple keys, while c and d combine the two parameters in product form, trying to achieve a better balance between them. When road attributes are available, they are also combined, either as an additional key or in the product form.

3.4 Select Strokes

This stage simply selects strokes according to the sequence created in the previous stage. Strokes are selected sequentially until their total length equals or exceeds a value determined by the specified reduction degree. The reduction degree is specified interactively as a reduction ratio of the total length of all roads.
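The following Java sketch illustrates strategies c and d together with the selection stage. It is a simplified illustration rather than the original JUMP plug-in code; the Stroke record and all identifiers are hypothetical, and the stopping rule assumes the reduction ratio applies to total stroke length as described above.

    import java.util.*;

    public class StrokeSelection {
        record Stroke(int id, double length, int connections) {}

        // Strategy c: order by length * (connections / average connections).
        // Strategy d differs only by swapping the roles of length and
        // connections, so it yields the same ranking up to a constant factor.
        static List<Stroke> orderByLengthTimesConnectionRatio(List<Stroke> strokes) {
            double avgConn = strokes.stream().mapToInt(Stroke::connections).average().orElse(1.0);
            List<Stroke> ordered = new ArrayList<>(strokes);
            ordered.sort(Comparator.comparingDouble(
                (Stroke s) -> s.length() * (s.connections() / avgConn)).reversed());
            return ordered;
        }

        // Keep strokes in order until the retained total length reaches
        // (1 - reductionRatio) of the original total length.
        static List<Stroke> select(List<Stroke> ordered, double reductionRatio) {
            double total = ordered.stream().mapToDouble(Stroke::length).sum();
            double target = total * (1.0 - reductionRatio);
            List<Stroke> kept = new ArrayList<>();
            double sum = 0.0;
            for (Stroke s : ordered) {
                if (sum >= target) break;
                kept.add(s);
                sum += s.length();
            }
            return kept;
        }

        public static void main(String[] args) {
            List<Stroke> strokes = List.of(
                new Stroke(1, 1200.0, 6), new Stroke(2, 900.0, 2),
                new Stroke(3, 400.0, 8), new Stroke(4, 300.0, 1));
            List<Stroke> kept = select(orderByLengthTimesConnectionRatio(strokes), 0.30);
            kept.forEach(s -> System.out.println("keep stroke " + s.id()));
        }
    }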
4 Case Study

4.1 Implementation Environment

The algorithm was implemented in a Java environment, based on the open-source Java packages JTS (JTS Topology Suite) and JUMP (JUMP Unified Mapping Platform), both from Vivid Solutions. JTS is a Java API that implements a core set of spatial data operations, while JUMP provides an interactive GUI and more. The road network generalization routines were implemented as JUMP plug-ins, providing functions to create strokes, count the number of connections, order the strokes with a specified strategy, and select strokes according to a specified reduction ratio.

4.2 Data Source

The map data used in this case study came from the local municipality of Lund. The data is a comprehensive and detailed representation of the road and street network in Lund, Sweden. Roads are divided into highways, class one roads, class two roads, class three roads, class four (bad) roads and streets. However, this case study concentrated on road geometry, especially the connection criterion, so the road attributes were neglected.

4.3 Results

Figure 3 presents some of the results of the case study. Figure 3a shows the original road network, while Figs. 3b-d present three generalized maps with a 30% reduction of total road length. Figure 3b was created with the stroke-length criterion; Fig. 3c with the connection criterion and length as an auxiliary criterion; and Fig. 3d with the product of length and connection ratio, which is also the result created by the product of connection number and length ratio. As expected, the connection criterion emphasizes density: it selects more roads in dense areas (city center) and fewer roads in sparse areas (suburb) than the length criterion does. It is also obvious that neither the length-based nor the connection-based criterion alone creates perfect results. Their combination in product form brings about more reasonable results.
The product-form criterion was also tested with 5-35% reduction degrees. The results show that road density differences between the city center and the suburbs were maintained quite well at the different reduction degrees. However, a few roads became disconnected from each other as the reduction degree increased.
5 Discussion

Although the case study produced reasonable results, a few problems were also found, including irregular patterns, disconnected roads, and closely positioned roads.
These problems have to be dealt with in additional processes.
6 Conclusions

Since roads are interwoven into complicated networks, road network generalization is rather difficult. This paper proposes a method to generalize road networks based on connection analysis. The number of connections is counted at each junction for each salient road, or "stroke". Several strategies are proposed to integrate the connection criterion into road network generalization. A case study shows that the connection criterion is good at maintaining density differences in generalized maps. It is also found that combining the connection criterion with the length criterion in product form creates more reasonable results than using the length criterion alone.
Acknowledgments

This research was financed by the China National Science Foundation under grant No. 40101024. It was also supported by the International Office and the GIS Centre at Lund University. I would like to thank my colleagues at the GIS Centre, especially Lars Harrie for commenting on the manuscript, Karin Larsson for helping me access the test data, and Jean-Nicolas Poussart for correcting my English. Comments from two anonymous referees are also appreciated. Data for the case study was kindly provided by the National Land Survey of Sweden.
References

Jiang B and Claramunt C (2004) A Structural Approach to the Model Generalisation of an Urban Street Network. GeoInformatica 8(2): 157-171
Jiang B and Harrie L (2004) Selection of Streets from a Network Using Self-Organizing Maps. Transactions in GIS 8(3): 335-350
Lamy S et al (1999) The Application of Agents in Automated Map Generalisation. In: Proceedings of the ICA 19th International Cartographic Conference (Ottawa 1999), pp 1225-1234
Mackaness WA and Beard MK (1993) Use of Graph Theory to Support Map Generalisation. Cartography and Geographic Information Systems 20: 210-221
Mackaness WA (1995) Analysis of Urban Road Networks to Support Cartographic Generalization. Cartography and Geographic Information Systems 22: 306-316
Morisset B and Ruas A (1997) Simulation and Agent Modelling for Road Selection in Generalisation. In: Proceedings of the ICA 18th International Cartographic Conference (Stockholm 1997), pp 1376-1380
Thomson RC and Richardson DE (1995) A Graph Theory Approach to Road Network Generalisation. In: Proceedings of the 17th International Cartographic Conference, pp 1871-1880
Thomson RC and Richardson DE (1999) The 'Good Continuation' Principle of Perceptual Organization Applied to the Generalization of Road Networks. In: Proceedings of the ICA 19th International Cartographic Conference (Ottawa 1999), pp 1215-1223
Thomson RC and Brooks R (2002) Exploiting perceptual grouping for map analysis, understanding and generalization: The case of road and river networks. Graphics Recognition: Algorithms and Applications, Lecture Notes in Computer Science 2390: 148-157
Continuous Generalization for Visualization on Small Mobile Devices

Monika Sester and Claus Brenner

Institute of Cartography and Geoinformatics, University of Hannover, Appelstr. 9a, 30167 Hannover, Germany
[email protected], [email protected]
Abstract

Visualizing spatial information on small mobile displays is both a great opportunity and a challenge. In order to address the trade-off between huge spatial data sets on the one hand and small storage capacities and visualization screens on the other, we propose to visualize only the information that adequately fits the current resolution. To realize this, we automatically decompose the generalization of an object into a sequence of elementary steps. From this, one can later easily obtain any desired generalization level by applying the appropriate subpart of the sequence. The method not only leads to smooth transitions between different object representations but is also useful for the incremental transmission of maps over limited-bandwidth channels.
1 Introduction

Small mobile devices are becoming more and more important for the visualization of spatial information. This tendency is driven on the one hand by the availability of such small devices with displays (e.g. mobile phones and PDAs), and on the other by a growing number of popular applications, namely Location Based Services (LBS) in general and navigation in particular. However, these small devices also have limitations: first of all, their computing and storage capabilities are limited, so that huge data sets cannot be loaded and complicated calculations cannot be run on the devices. Furthermore, the displays are quite small and of relatively low resolution, so that the visualization is not comparable to a map presentation. In order to still be able to employ this new technology for mobile navigation, the following demands can be formulated: firstly, spatial data sets in different resolutions or levels of detail have to be made available to the mobile user in order to allow for flexible zooming in and out, giving both detail and overview information. Secondly, this information has to be transmitted to the user from a server, taking the possibly limited bandwidth of the communication channel into account.

In this paper a solution is presented that allows for a progressive transmission of spatial data to a small mobile device, starting with coarse overview information and progressively streaming more detailed information to the client. This information is interpreted by the client and animated in a way that simulates a continuous generalization from coarse to fine and vice versa. A prototype has been developed to illustrate the functionality and evaluate the results.

The paper is structured as follows: after a brief review of related work, a set of elementary generalization operations (EGO's) is defined, which allows for the modification of spatial objects. Based on this elementary vocabulary, two generalization applications are described: generalization of building ground plans and typification of buildings. The last section describes the client-server architecture, as well as an extension to gradually animate the changes in order to make the discrete changes less visible. A summary and outlook concludes the paper.

This research is supported by the EU-IST Project No. IST-2000-30090 (GiMoDig) and the VolkswagenStiftung.
2 Related Work

In order to provide adequate information to a user, it has to be adapted to the personal situation, the device used, the task to be solved, etc. (see e.g. [Nivala, Sarjakoski, Jakobsson & Kaasinen 2003, Reichenbacher 2004]). One major component is the adaptation to the level of detail of the information. In cartography, the derivation of information at different levels of detail is achieved using generalization operations [Hake, Grünreich & Meng 2002], which traditionally have been applied by human cartographers. Over 40 years of research into the automation of generalization processes have led to a series of algorithms specified for dedicated purposes, e.g. line generalization [Douglas & Peucker 1973], area aggregation, building simplification, typification [Regnauld 1996, Sester 2004] or displacement [Højholt 1998, Harrie 1999, Sester 2000]. In recent years, research has focused on the integration of different generalization algorithms by modelling the relative dependencies between the operations and the spatial context of the objects (see e.g. [Lamy, Ruas, Demazeau, Jackson, Mackaness & Weibel 1999]). In the context of mobile applications, a combination of pre-computed generalization levels in terms of an MRDB and on-line generalization using web technology has been proposed (e.g. [Sarjakoski, Sarjakoski, Lehto, Sester, Illert, Nissen, Rystedt & Ruotsalainen 2002]). Progressive transmission of spatial information over a limited bandwidth has been implemented for image formats, e.g. in the GIF format.
Bertolotto & Egenhofer [1999] proposed to also use it for vector data. van Kreveld [2001] presented concepts for the continuous visualization of spatial information in order to reduce the popping effect when discrete changes in the geometry appear. In order to represent different levels of detail of vector geometry, hierarchical schemes can be applied. An example is the GAP-tree for the coding of an area partitioning at different levels of detail [van Oosterom 1995]. The BLG (binary line generalization) tree hierarchically decomposes a line using e.g. the Douglas-Peucker algorithm. In mesh simplification, used for the generalization of surfaces, there are methods that animate smooth changes, e.g. by animating the insertion or removal of nodes in a triangular network [Hoppe 1998].
3 Elementary Generalization Operations

Our concept is explained using the example of the simplification of building ground plans. It can, however, be expanded to other generalization operations, as will be shown later.

3.1 The Generalization Chain

Similar to the ideas introduced by Hoppe [1998] for triangulated meshes, we define for a polygon P consisting of n vertices a maximal representation P^n ≡ P, consisting of all original vertices, and a minimal representation P^m, with m ≤ n vertices. The minimal representation is the one which is still sensible from a cartographic viewpoint, for example a rectangle (m = 4) or the empty polygon. When a polygon is generalized during pre-processing, one starts from polygon P^n and successively simplifies its representation using generalization operations, finally yielding polygon P^m. Assume that k generalization steps are involved (each leading to one or more removed polygon vertices) and that the numbers of polygon vertices are i_0 = n, i_1, ..., i_k = m; then a sequence of generalized polygons

  P ≡ P^n ≡ P^{i_0} --g_0--> P^{i_1} --g_1--> ... --g_{k-1}--> P^{i_k} ≡ P^m

is obtained, where g_j denotes the j-th generalization operation. Every generalization step g_j is tied to a certain value of a control parameter ε_j, which relates to the display scale and can be, for example, the length of the shortest edge in the polygon. Since generalization proceeds using increasing edge lengths, the sequence of ε_j is monotonically increasing. As a first consequence of this, one can pre-compute and record all operations g_j in order to derive quickly any desired generalization level ε of polygon P by executing all generalization operations g_0, ..., g_j, where ε_0, ..., ε_j ≤ ε and ε_{j+1} > ε.
However, it is obvious that for most applications the inverse operations g_j^{-1} are more interesting, producing a more detailed polygon from a generalized one. Thus, we have the sequence

  P^m ≡ P^{i_k} --g_{k-1}^{-1}--> P^{i_{k-1}} --g_{k-2}^{-1}--> ... --g_0^{-1}--> P^{i_0} ≡ P^n,

where again one can decide up to which point the polygon modification should be carried out, characterized by the corresponding parameter ε. This way, the inverse generalization chain can be used for progressively transmitting information over a limited bandwidth channel by transmitting P^m followed by a sufficient number of inverse generalization operations.

3.2 Elementary Generalization Operations and Simple Operations

We can define a set of elementary generalization operations (EGO's) such that every generalization chain will be made up of a combination of EGO's. EGO's are intended to be 'meaningful' operations such as 'remove an extrusion'. Each EGO in turn consists of one or more simple operations (SO's) modifying the polygon. SO's are low-level, basic operations, and are the most atomic operations available (IV: insert vertex, DV: duplicate vertex, MV: move vertex, RV: remove vertex). It is obvious that there are operations which modify the topology of a polygon, namely the insertion and removal of vertices, and operations which affect the geometry only. Table 1 shows the list of simple operations. This list is not minimal, since e.g. a 'DV i' operation is equivalent to 'IV i,0'. However, for convenience and for achieving a most compact encoding, the operations may be defined redundantly.

  SO   Parameters                   Inverse Operation
  IV   [edge id] [rel. position]    RV [edge id + 1]
  DV   [vertex id]                  RV [vertex id + 1]
  MV   [vertex id] [dx] [dy]        MV [vertex id] [-dx] [-dy]
  RV   [vertex id]                  -

Table 1. Set of simple generalization operations.

The parameters involved are edge and vertex id's and relative positions. Edge and vertex id's refer to an implicit numbering of polygon edges and vertices in clockwise or counterclockwise order. Knowing the parameters of a simple operation allows one to immediately give the inverse operation, except for the 'remove vertex' operation, for which the inverse would require an additional parameter to specify the location of the vertex to be inserted.
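Purely as an illustration (this is not the authors' prototype code, and all type and method names are hypothetical), the four SO's of Table 1 and the inverses listed there could be encoded as follows in Java:

    public class SimpleOps {
        interface SO {}
        // IV: insert a vertex on edge `edgeId` at relative position `relPos`.
        record IV(int edgeId, double relPos) implements SO {}
        // DV: duplicate vertex `vertexId`.
        record DV(int vertexId) implements SO {}
        // MV: move vertex `vertexId` by the increments (dx, dy).
        record MV(int vertexId, double dx, double dy) implements SO {}
        // RV: remove vertex `vertexId`.
        record RV(int vertexId) implements SO {}

        // Inverses as given in Table 1. RV has no self-contained inverse:
        // re-inserting the vertex would require its stored position.
        static SO inverse(SO op) {
            if (op instanceof IV iv) return new RV(iv.edgeId() + 1);
            if (op instanceof DV dv) return new RV(dv.vertexId() + 1);
            if (op instanceof MV mv) return new MV(mv.vertexId(), -mv.dx(), -mv.dy());
            throw new IllegalArgumentException("RV has no parameter-free inverse");
        }

        public static void main(String[] args) {
            System.out.println(inverse(new MV(3, 2.0, 0.0)));  // MV[vertexId=3, dx=-2.0, dy=-0.0]
        }
    }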
4 Generalization Operations

In this section, the simple operations (SO's) described above are applied to two generalization problems that relate to the generalization of buildings.
When presenting buildings at increasingly smaller scales, different generalization operations have to be applied, which can be distinguished into three main groups depending on the scale level. In scale ranges up to 1:20.000 the individual building shapes are simplified by inspecting the visibility of the shortest facades; this is shown in section 4.1. Further reducing the scale requires an amalgamation of adjacent buildings to form larger objects, as individual buildings would be too small to still be represented. At even smaller scales, buildings have to be selectively removed, enlarged and symbolized. This requires the typification operation, where groups of individual buildings are replaced by new groups with fewer buildings; this is demonstrated in section 4.2. Thus, in the following, the encoding of two building generalization functions in terms of EGO's will be demonstrated. Amalgamation is not treated here in detail; however, section 6.1 gives an overview of how other generalization operations can be realized.

4.1 Building Generalization

The generalization of building ground plans has to take the regularities of the object into account. In particular, rectangularity and parallelism have to be respected and even enforced. We use a rule-based approach which iteratively inspects small building facades and tries to replace them depending on the spatial context, i.e. the preceding and succeeding building facades [Sester 2000]. In this way, intrusions and extrusions can be eliminated, as well as offsets and corners. These three rules can be decomposed into simple operations. In the following, the generation of the sequence of simple operations is described for each of the three rules, leading to three EGO's. The operations are triggered by a building side s_n that is smaller than a minimum value ε, corresponding to the minimum side just visible at a given scale. As noted above, the inverse operation is described, namely going from a generalized version to the more specific one.

Offset. An offset is removed by extending the longer of the two sides adjacent to the shortest edge s_n. In Figure 1, s_{n-1} is intersected with s_{n+2} to produce the new node P_new. Inverting this process and describing it with the simple operations given above leads to the following sequence of SO's, composing the offset-EGO for the re-insertion of offsets.
Figure 1. Removing an offset and generating simple operations for it.
1. Note the length l_n of the shortest edge s_n: EPS l_n
2. Insert vertex P_n on side s_{n+1,new} at relative position RelPos = s_{n-1} / s_{n+1,new}: IV s_{n+1,new}, RelPos
3. Duplicate this vertex, thus creating the new point P_{n+1}: DV P_n
4. Move this new vertex P_{n+1} to the position of the old vertex P_{n+1}, and move vertex P_new to the position of the old vertex P_{n+2}, by increments (dx1, dy1) and (dx2, dy2): MV P_{n+1}, dx1, dy1 and MV P_{n+2}, dx2, dy2

Extrusion. An extrusion is eliminated by shifting it back or forth to be in line with the corresponding main building side. Figure 2 shows how the extrusion is shifted back to the corpus of the building.
Figure 2. Removing an extrusion and generating simple operations for it.
The extrusion is cut off along the shortest neighboring facade of s_n. In the case of Figure 2 this is s_{n-1}, so the new structure is derived by intersecting side s_{n-2} with s_{n+1}, leading to the intersection vertex P_new. Inverting this operation leads to the following sequence of operations, representing the extrusion-EGO:
1. Note the length l_n of the shortest edge s_n: EPS l_n
2. Insert vertex P_{n-1} on side s_{n-2,new} at relative position RelPos = s_{n-2} / s_{n-2,new}: IV s_{n-2,new}, RelPos
3. Duplicate this vertex, thus creating the new vertex P_n: DV P_{n-1}
4. Move this vertex to the position of the old vertex P_n, and move vertex P_new to the position of the old vertex P_{n+1}, by increments (dx1, dy1) and (dx2, dy2): MV P_n, dx1, dy1 and MV P_new, dx2, dy2

Corner. A corner is removed by intersecting the adjacent edges (see Figure 3).
Figure 3. Removing a corner and generating simple operations for it.
A new node is inserted at the intersection point P_new, thus creating two new building sides s_{n-1,new} and s_{n+1,new} and removing s_n. Inverting the operation leads to the following sequence for the corner-EGO:
1. Note the length l_n of the shortest edge s_n: EPS l_n
2. Insert vertex P_n on side s_{n-1,new} at relative position RelPos = s_{n-1} / s_{n-1,new}: IV s_{n-1,new}, RelPos
3. Move vertex P_new to the position of the old vertex P_{n+1} by increments (dx1, dy1): MV P_new, dx1, dy1

Using the EGO's defined above, all facades of a building are iteratively removed, starting with the shortest one and ending when the removal of one facade leads to the removal of the entire building. Figure 4 shows a sequence of operations in the inverse generalization process. In this case the building is created at an EPS of 5, which is the width of the building, and an extrusion is then created using the SO's specified.
Figure 4. Sequence generating a simple building and the corresponding SO's (panels left to right: EPS 5; IV 1,60%; DV 2; MV 3,2,0 and MV 4,2,0).
In Figure 5, the progressive refinement of four buildings is shown: the snapshots visualize how, at certain stages of the parameter EPS that describes the discernible minimal distance, more and more buildings and building details appear. Figure 6 shows some snapshots of a larger area at increasing levels of detail: it is clearly visible how more and more small buildings appear and how they are displayed with more and more detail.

4.2 Typification

Typification denotes the process of reducing the number of objects in a set of similar objects while preserving their relative spatial density. Typification is a discrete process, where a set of objects is replaced by another set containing a smaller number of objects. Such a change in representation can be coded with the SO's described above.
Figure 5. Progressive visualization of four buildings at four levels of detail.
Figure 6. Snapshots of incrementally adding more and more detail.
It has to be realized by removing the objects at one generalization level and introducing new objects at the same time. The sequence in Figure 7 shows the generation and removal of a building: a new polygon is created at the generalization level EPS=10 at position (65,65) by first inserting a vertex; three other vertices are created by duplicating this vertex, then all four vertices are moved to four positions around the center point, so that a square with a side of 20 is finally created. This object 'survives' until the level EPS=5, when it 'collapses' by moving all its four corners back to the center point.

  POLY             new Polygon
  EPS 10           the following instructions appear at min. distance 10
  NPR 65 65        create new polygon ring, consisting of only one point at
                   position 65/65; this point gets ID 0
  DV 0
  DV 0
  DV 0             duplicate this point three times, thus creating points 1 to 3
  MV 0 -10 10      move point 0 to the left and upwards
  MV 1 -10 -10
  MV 2 10 -10
  MV 3 10 10       also move the other points
  EPS 5            the next event happens at minimum value EPS=5
  MV 0 10 -10      move point 0 to the right and downwards (i.e. back to the
                   original position 65/65)
  MV 1 10 10
  MV 2 -10 10
  MV 3 -10 -10     do the same with the other points
Figure 7. Example operations to generate and collapse an object.
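To make the semantics of such a script concrete, the following Java sketch replays the operations of a single ring up to a requested EPS level. It is purely illustrative: the opcode names follow the listing above, but the class and all other identifiers are hypothetical, and the stopping rule (stop at the first EPS marker finer than the requested level) is an assumption based on the description of the refinement stream.

    import java.util.*;

    public class SoScriptPlayer {
        // Replay a refinement script (as in Figure 7) until the next EPS
        // marker falls below the requested display level.
        static List<double[]> replay(List<String> script, double targetEps) {
            List<double[]> ring = new ArrayList<>();   // vertices as {x, y}
            for (String line : script) {
                String[] t = line.trim().split("\\s+");
                switch (t[0]) {
                    case "POLY" -> { /* start a new polygon */ }
                    case "EPS" -> {
                        // EPS markers decrease along the refinement stream.
                        if (Double.parseDouble(t[1]) < targetEps) return ring;
                    }
                    case "NPR" -> ring.add(new double[] {
                        Double.parseDouble(t[1]), Double.parseDouble(t[2]) });
                    case "DV" -> {
                        double[] v = ring.get(Integer.parseInt(t[1]));
                        ring.add(new double[] { v[0], v[1] });
                    }
                    case "MV" -> {
                        double[] v = ring.get(Integer.parseInt(t[1]));
                        v[0] += Double.parseDouble(t[2]);
                        v[1] += Double.parseDouble(t[3]);
                    }
                    default -> throw new IllegalArgumentException("unknown SO: " + t[0]);
                }
            }
            return ring;
        }

        public static void main(String[] args) {
            List<String> script = List.of("POLY", "EPS 10", "NPR 65 65",
                "DV 0", "DV 0", "DV 0", "MV 0 -10 10", "MV 1 -10 -10",
                "MV 2 10 -10", "MV 3 10 10", "EPS 5", "MV 0 10 -10",
                "MV 1 10 10", "MV 2 -10 10", "MV 3 -10 -10");
            // At target EPS 8, the square of side 20 around (65,65) is present.
            for (double[] v : replay(script, 8.0))
                System.out.println(v[0] + " " + v[1]);
        }
    }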
In this way, a discrete change in representation can be coded. Figure 8 shows the sequence of a typification of a regular structure of 16 buildings being replaced by four and finally by one in the different scales, respectively.
Figure 8. Three different resolutions: high, medium and low, corresponding to EPS = 1.8, 6 and 11, respectively.
The generation of the different representations has to be done in a separate process. In our case, we used an approach based on Kohonen Feature Nets to rearrange the reduced number of objects [Sester 2004]. Figure 9 shows a spatial situation before and after typification. In the process the number of buildings is reduced, and the remaining objects are rearranged and then represented by either the original building or a square symbol, depending on their size.
Figure 9. Situation before (left) and after (right) typification of buildings.
5 Continuous Generalization and Transmission

5.1 Continuous Generalization

When an object representation is switched due to generalization, this usually leads to a visible 'popping' effect. Compared to switching between different, fixed levels of detail, the use of EGO's is already an improvement, since it gradually modifies the polygon rather than just replacing it as a whole. However, one can still improve on this. Intermediate states can be defined which continuously change the object in response to an EGO.
For example, a 'collapse extrusion' EGO (see Figure 2) would be interpreted as 'move the extrusion until it coincides with the main part, then change the topology accordingly'. We term this approach continuous generalization, as it effectively allows the object to be morphed continuously from its coarsest to its finest representation. Since each EGO is made up of one or more SO's, their effects on display popping have to be taken into account. However, this is trivial, since we can deduce immediately that IV and DV do not change the object's geometry. RV will only lead to a visible effect if the vertex, its predecessor and its successor are non-collinear. Thus, MV is the only SO that has to be considered. This means that a continuous generalization can be achieved by using an appropriate encoding of EGO's in terms of SO's, together with an animation in the client which gradually shifts vertices instead of moving them in one step upon encountering an MV operation. This is implemented in the prototype.

5.2 A Client-Server Communication Scheme for Progressively Streaming Map Data

To describe the mechanisms of progressive streaming of map data, we now introduce the notion of a client and a server. In the case of internet map displays, off-board car navigation and personal navigation systems, these take exactly the roles one would expect. In other applications, however, they might be defined differently. For example, an on-board car navigation system might define the server as the main CPU unit where the mass storage resides, and the client as the head unit CPU used for map display and user input. One possible realization is depicted in Figure 10, based on the assumption that the server keeps track of the state of the client. A stateless approach could be used instead; however, this would imply a larger amount of communication, telling the server each time which object id's and generalization levels are present in the client in order to allow the server to compute the appropriate differential SO's. When the user requests a new part of the map, the client is able to compute the bounding box in world coordinates and the generalization level ε, the latter possibly being based on the scale as well as some preferences which could balance speed against 'map quality'. The client sends this information to the server, which can retrieve the appropriate objects from the database. Since the server keeps track of which objects have already been sent to the client, it can deduce the appropriate SO's needed to update the display and send them to the client. While receiving SO's, the client will constantly refresh its display. If the user interacts before the entire set of SO's has been sent, the client may send a break request to the server, which in turn will stop sending SO's. There might be additional communication items, for example to allow the client to drop objects currently out of view in order to conserve memory.
Figure 10. Example interaction diagram for client-server communication.
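A minimal sketch of the server side of this exchange is given below. It assumes the stateful variant described above; the ObjectStore interface, the SO and BBox types and every other identifier are hypothetical placeholders rather than the authors' prototype interfaces.

    import java.util.*;

    public class MapStreamingServer {
        // Placeholder types for the sketch.
        record BBox(double minX, double minY, double maxX, double maxY) {}
        record SO(String encoded) {}
        interface ObjectStore {
            Collection<Integer> idsInside(BBox box);
            // SO's that refine object `id` from level `fromEps` down to `toEps`.
            List<SO> refinementOps(int id, double fromEps, double toEps);
            double coarsestEps();
        }

        private final ObjectStore store;
        // Per-session record of the generalization level already sent per object.
        private final Map<String, Map<Integer, Double>> sessions = new HashMap<>();

        MapStreamingServer(ObjectStore store) { this.store = store; }

        String startSession() {
            String id = UUID.randomUUID().toString();
            sessions.put(id, new HashMap<>());
            return id;
        }

        // Handle a pan/zoom request: send only the differential SO's needed
        // to bring each visible object from its last sent level down to `eps`.
        List<SO> request(String sessionId, BBox box, double eps) {
            Map<Integer, Double> sent = sessions.get(sessionId);
            List<SO> out = new ArrayList<>();
            for (int objId : store.idsInside(box)) {
                double from = sent.getOrDefault(objId, store.coarsestEps());
                if (eps < from) {                       // client needs more detail
                    out.addAll(store.refinementOps(objId, from, eps));
                    sent.put(objId, eps);
                }
            }
            return out;
        }
    }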
6 Evaluation of results and discussion

6.1 Discussion on application for other generalization operations

Generalization operations can be characterized by the changes they cause to objects, which are either discrete or continuous. These changes can affect individual objects or groups of objects, respectively. The two generalization operations described above are of the type of discrete changes affecting individual objects (section 4.1) and groups of objects (section 4.2). What has been shown for these two generalization operations can be extended to other functions as well: amalgamation is an operation that leads to discrete changes of groups of objects, and thus falls in the same category as the typification operation. Once an appropriate algorithm is available (e.g. [Bundy, Jones & Furse 1995]), the coding has to be done in terms of SO's and EGO's. In this case, similarly to the typification encoding, objects have to be replaced by other objects (in the case of merging two buildings, the two objects cease to exist and are replaced by the merged one). Displacement is another important generalization operation, used to solve spatial proximity conflicts by shifting objects apart and possibly deforming their shapes slightly; it can be categorized as a continuous operation. Displacement can be coded easily, as only an MV operation with respect to all object points is needed. In terms of EGO's, a compound movement of all points of an object can be bundled, however only when no deformation of the object occurs.
Existing generalization algorithms can be extended to generate the coding in terms of SO's, e.g. Sester [2004].

6.2 Data volume

In order to evaluate the data volume to be transmitted using the coding scheme proposed above, the following estimate can be made. In our coding scheme, only incremental refinements have to be sent to the client, instead of sending a new representation of the whole data set whenever the level of detail changes. In the case of building simplification, a building consisting of n points is iteratively simplified until a polygon with typically 4 points is reached, after which it vanishes. In each of the EGO's one or two points are removed; thus, to present all possible LOD's, on the order of O(n) different representations would have to be transmitted, leading to O(n^2) transmitted points. Our coding scheme, however, requires transmitting only O(n) operations, leading to a reduction factor of O(n). The representation in our coding scheme is compared to the original representation of the data set in terms of an ESRI shapefile for different data sets: the size of the data set in Figure 6, consisting of 119 buildings, is 30 kByte; the original representation in terms of the ESRI shapefile (the .shp file) is 20 kByte. Another data set consisting of 2000 buildings requires 0.45 MByte in terms of SO's, while the original shapefile needs 0.312 MByte. The following considerations have to be made when evaluating these numbers:
• The SO coding includes all generalization levels that are possible between the most detailed representation of the objects and the most general one; the shapefiles contain only the most detailed one.
• There have not yet been any considerations concerning efficient storage of the code.
• The coding in terms of EGO's allows several SO's to be grouped into a higher-level standard operation. In the current implementation, however, the EGO's have not yet been efficiently coded, but are described in terms of SO's.
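As a rough illustration of this estimate, using an assumed vertex count that is not taken from the paper: for a single building of n = 20 vertices, transmitting every intermediate representation separately would send roughly

  20 + 18 + 16 + ... + 4, which is about n^2/4, i.e. on the order of 100 vertices,

whereas the SO stream encodes the same chain with on the order of n = 20 operations, giving the O(n) reduction factor mentioned above.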
7 Conclusion and Future Work

The research presented in this paper was motivated by the fact that a spatial data set on a small display device like a PDA typically has to be visualized at different levels of detail: for an overview, only coarse information over a larger extent of the map is needed; when the user zooms in to inspect details, new information is progressively loaded for the smaller section of the map the user is interested in. The approach is generally applicable whenever the spatial data to be presented is not available on the device itself but has to be transmitted from a remote server. Besides the transmission to mobile devices described above, this also holds for the presentation of spatial data on the internet.
In the paper an approach was presented that decomposes generalization operations into simple operations leading to changes in topology and geometry. It was shown how these simple operations can be used to code more complex operations dealing with continuous and also discrete changes in the objects. Finally, an implementation for client-server data transmission was presented. In the current concept, the generation of the EGO's is an off-line process that is done beforehand. It is, however, also possible to generate the code on the fly upon a request of the client. In this case, the response time on the client will be delayed by the time needed for the generalization process. Thus the trade-off between the storage overhead of the data in terms of EGO's and the processing time for the generalization has to be balanced. This will be the subject of further investigations. Future work will also concentrate on implementing this concept in a distributed system environment and on the integration of other generalization operations.
References

Bertolotto, M. & Egenhofer, M. [1999], Progressive Vector Transmission, in: 'Transactions of the ACMGIS99', Kansas City, MO, pp. 152-157.
Bundy, G., Jones, C. & Furse, E. [1995], Holistic generalization of large-scale cartographic data, in Müller, Lagrange & Weibel [1995], pp. 106-119.
Douglas, D. & Peucker, T. [1973], 'Algorithms for the reduction of the number of points required to represent a digitized line or its caricature', The Canadian Cartographer 10(2), 112-122.
Hake, G., Grünreich, D. & Meng, L. [2002], Kartographie, Gruyter.
Harrie, L. E. [1999], 'The constraint method for solving spatial conflicts in cartographic generalization', Cartography and Geographic Information Science 26(1), 55-69.
Højholt, P. [1998], Solving Local and Global Space Conflicts in Map Generalization Using a Finite Element Method Adapted from Structural Mechanics, in: T. Poiker & N. Chrisman, eds, 'Proceedings of the 8th International Symposium on Spatial Data Handling', Vancouver, Canada, pp. 679-689.
Hoppe, H. [1998], Smooth view-dependent level-of-detail control and its application to terrain rendering, in: 'IEEE Visualization '98', pp. 35-42.
Lamy, S., Ruas, A., Demazeau, Y., Jackson, M., Mackaness, W. & Weibel, R. [1999], The Application of Agents in Automated Map Generalization, in: 'Proceedings of the 19th International Cartographic Conference of the ICA', Ottawa, Canada, pp. 1225-1234.
Müller, J.-C., Lagrange, J.-P. & Weibel, R., eds [1995], GIS and Generalization: Methodology and Practice, Taylor & Francis.
Nivala, A.-M., Sarjakoski, L., Jakobsson, A. & Kaasinen, E. [2003], Usability Evaluation of Topographic Maps in Mobile Devices, in: 'Proceedings of the 21st International Cartographic Conference of the ICA', Durban, South Africa, pp. 1903-1913, CD-ROM.
Regnauld, N. [1996], Recognition of Building Clusters for Generalization, in: M. Kraak & M. Molenaar, eds, 'Advances in GIS Research, Proc. of 7th Int. Symposium on Spatial Data Handling (SDH)', Vol. 1, Faculty of Geod. Engineering, Delft, The Netherlands, pp. 4B.1-4B.14.
Reichenbacher, T. [2004], Mobile Cartography: Adaptive Visualization of Geographic Information on Mobile Devices, PhD thesis, Technische Universität München.
Sarjakoski, T., Sarjakoski, L., Lehto, L., Sester, M., Illert, A., Nissen, F., Rystedt, R. & Ruotsalainen, R. [2002], Geospatial Info-mobility Services - a Challenge for National Mapping Agencies, in: 'Proceedings of the Joint International Symposium on GeoSpatial Theory, Processing and Applications (ISPRS/Commission IV/SDH2002)', Ottawa, Canada, CD-ROM.
Sester, M. [2000], Generalization based on Least Squares Adjustment, in: 'IAPRS', Vol. 33, ISPRS, Amsterdam, Holland.
Sester, M. [2004], Optimizing Approaches for Generalization and Data Abstraction, Technical report, accepted for publication in: International Journal of Geographical Information Science.
van Kreveld, M. [2001], Smooth Generalization for Continuous Zooming, in: 'Proceedings of the ICA', Beijing, China.
van Oosterom, P. [1995], The GAP-tree, an approach to 'on-the-fly' map generalization of an area partitioning, in Müller et al. [1995], pp. 120-132.
Shape-Aware Line Generalisation With Weighted Effective Area

Sheng Zhou and Christopher B. Jones

School of Computer Science, Cardiff University, Cardiff, CF24 3XF, UK
{S.Zhou; C.B.Jones}@cs.cf.ac.uk
Abstract

Few line generalisation algorithms provide explicit control over the style of generalisation that results. In this paper we introduce weighted effective area, a set of area-based metrics for cartographic line generalisation following the bottom-up approach of the Visvalingam-Whyatt algorithm. Various weight factors are used to reflect the flatness, skewness and convexity of the triangle upon which the Visvalingam-Whyatt effective area is computed. Our experimental results indicate these weight factors may provide much greater control over generalisation effects than is possible with the original algorithm. An online web demonstrator for weighted effective area has been set up.
1 Introduction

Cartographic line generalisation is one of the major processes in map generalisation. Each cartographic line represents a real-world geographic feature. When a map is compiled for a certain purpose from a source map at the same or a greater scale, some "unwanted" details of a cartographic line in the source dataset are eliminated from the representation of the same feature in the target map. In addition, some retained details may be exaggerated to enhance certain characteristics of the feature represented by the line.

1.1 Graphic and semantic line generalisation

The main controlling elements in map generalisation include map purpose and conditions of use, map scale, quality and quantity of data, and the graphic limit (Robinson et al. 1995, p. 458).
The graphical limitations of the presentation media imply that some details of a line are often not presentable and hence become redundant. Normally the process of eliminating these details is graphically, geometrically and visually driven and will be referred to as graphic line generalisation. On the other hand, detail removal or modification according to the purpose and conditions of use is semantically driven and will be referred to as semantic line generalisation. In the context of this discussion, the term "detail" refers to any three or more consecutive and non-collinear vertices in a cartographic line. Broadly speaking, graphic line generalisation is very similar to the process normally regarded as line simplification, represented by such algorithms as the Ramer-Douglas-Peucker algorithm (Ramer 1972; Douglas and Peucker 1973). With such a graphically-driven view, the presence of a graphic limit has a two-fold implication: on the one hand, there should be a close association between the tolerances in these algorithms and the graphic limit; on the other hand, it is inappropriate to apply these algorithms with tolerances far beyond (i.e. greater than) the graphic limit. For digital datasets, there is a simple relation between the graphic limit (represented here by the presentation resolution R) and the database resolution Rdb, as described in Zhou and Jones (2003):

  Rdb = R / S = R * FE / ME    (1)
Here S is the real scale of the presentation (the ratio of ME, the physical extent on the presentation medium, to FE, the corresponding real-world extent). Using this relation, graphic generalisation tolerances (normally functions of the database resolution) may be linked to graphic limits in order to select appropriate tolerance values. Semantic line generalisation is also affected by graphical limitations but is mainly controlled by semantic knowledge. To an extent, semantic generalisation is a process of discriminating against some details that are graphically equal to other details in the line. Such a discrimination process could be positive (e.g. a detail below the graphic threshold may be exaggerated and retained) or, more frequently, negative (e.g. one of two similar details above the graphic threshold may be eliminated to highlight the other). Note that some effects of semantic generalisation may be achieved by purely graphically driven procedures, albeit unintentionally. Generally speaking, graphic generalisation tends to retain "extreme" vertices, whereas semantic generalisation tends to eliminate these vertices.
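As a worked example of relation (1), using illustrative values that are not taken from the paper: assume a presentation resolution R of 0.5 mm and a display scale S of 1:10,000 (i.e. ME/FE = 1/10,000). Then

  Rdb = R / S = 0.5 mm * 10,000 = 5 m,

so graphic generalisation tolerances derived from the database resolution would be on the order of 5 m on the ground.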
1.2 Line generalisation using effective area

Among the various (graphic or semantic) line generalisation algorithms, the Visvalingam-Whyatt (1993) algorithm (referred to here as the VW algorithm) is of particular interest. This algorithm may be summarised as follows:
• Input: a cartographic line of n vertices (for clarity of illustration, we assume the line is open) with vertices v0 and vn-1 as its endpoints.
• Step 1: for each internal vertex vi (i = 1, ..., n-2), calculate its "effective area" EAi as the area of the triangle vi-1, vi, vi+1 and assign this value to vi.
• Step 2: repeat until only v0 and vn-1 are left in the line:
  - find the vertex vj with the smallest effective area EAj and temporarily remove it from the line;
  - re-calculate the effective areas of the two adjacent vertices vj-1 and vj+1 as described in Step 1 (if a re-calculated area value is smaller than EAj, EAj should be assigned to that vertex).
• Step 3: restore all previously removed vertices to the line in their original order.
• Output: a cartographic line with all internal vertices labelled with their effective area.
With all vertices processed in this way, given a threshold of minimum effective area, a subset of vertices may be selected to form a less detailed representation of the line. The VW algorithm represents a localised bottom-up approach to line generalisation, using an area-based metric and a fixed detail size of three vertices. The effect of generalisation is gradually propagated from small local details to a more global scale, and the effects of both graphic and semantic generalisation may be achieved simultaneously. Comparative evaluations of the algorithm are presented in Visvalingam and Williamson (1995) and Visvalingam and Herbert (1999).

In the original form of the algorithm, the shape of the triangle upon which the effective area is measured is not taken into consideration and the simple area value is used directly. Consequently, a very flat triangle and a very tall triangle of the same area (figure 1-A) are treated as having the same effective area and the same cartographic significance. This is, however, not an inherent deficiency of the algorithm. As pointed out in Visvalingam and Whyatt (1993), the essence is the process, and metrics of any complexity may be used to generate the effective area. In the following sections, we present some more sophisticated methods for area computation under which the original form of effective area becomes a special case. We will demonstrate how these methods offer increased control over the generalisation effects.
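For illustration only, a straightforward (unoptimised) Java sketch of the VW procedure summarised above follows. It is not the authors' code; Step 2's rule that a later label never drops below an earlier one is implemented here simply by keeping the labels monotonic, and all names are hypothetical.

    import java.util.*;

    public class VisvalingamWhyatt {
        static double triangleArea(double[] a, double[] b, double[] c) {
            return 0.5 * Math.abs((b[0] - a[0]) * (c[1] - a[1])
                                - (c[0] - a[0]) * (b[1] - a[1]));
        }

        // Label every internal vertex of an open line with its effective area;
        // endpoints get +infinity so that they are always retained.
        static double[] labelEffectiveAreas(double[][] line) {
            int n = line.length;
            double[] ea = new double[n];
            Arrays.fill(ea, Double.POSITIVE_INFINITY);
            List<Integer> alive = new ArrayList<>();
            for (int i = 0; i < n; i++) alive.add(i);

            double maxSoFar = 0.0;
            while (alive.size() > 2) {
                int minPos = -1;
                double minArea = Double.POSITIVE_INFINITY;
                for (int p = 1; p < alive.size() - 1; p++) {
                    double a = triangleArea(line[alive.get(p - 1)],
                                            line[alive.get(p)],
                                            line[alive.get(p + 1)]);
                    if (a < minArea) { minArea = a; minPos = p; }
                }
                maxSoFar = Math.max(maxSoFar, minArea);
                ea[alive.get(minPos)] = maxSoFar;   // label, then temporarily remove
                alive.remove(minPos);
            }
            return ea;
        }

        // Keep only the vertices whose label is at least the given threshold.
        static List<double[]> filter(double[][] line, double threshold) {
            double[] ea = labelEffectiveAreas(line);
            List<double[]> kept = new ArrayList<>();
            for (int i = 0; i < line.length; i++)
                if (ea[i] >= threshold) kept.add(line[i]);
            return kept;
        }

        public static void main(String[] args) {
            double[][] line = {{0, 0}, {1, 0.1}, {2, 2}, {3, 0.1}, {4, 0}};
            System.out.println(filter(line, 1.0).size());  // 3: endpoints plus the peak at (2,2)
        }
    }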
2 Shape Awareness and Weighted Effective Area

A straightforward method of taking the shape of the triangles into account in the effective area computation is to apply a weight factor to the initial effective area. This weight factor reflects some aspects of the shape of a triangle. Consequently, weighted effective area values are obtained that differentiate triangles with the same area value but different shape characteristics. By using different weight definitions, different shape characteristics of triangles may be highlighted. These weight definition functions may be viewed as filters. Filters map the values of some parameters representing the shape of a triangle to a weight factor whose value may be drawn from a finite range (bounded filter) or may take any non-negative value (un-bounded filter). For a particular filter, triangles with certain parameter values will have a weight of 1, so that their effective areas are equal to their weighted effective areas under this filter. These triangles are regarded as "standard forms" for this filter.
Fig. 1. (A) Two triangles of equal area. (B) Triangle shape indicators.

2.1 Some observations on the shape of triangles

For a triangle T(v0, v1, v2), where v1 is the vertex whose effective area is to be calculated (figure 1-B), the following parameters may be used to describe the shape characteristics of the triangle (referred to as the effective triangle of v1):
• Length of base line: W = Distance(v0, v2)
• Height: H = Distance(v1, Line(v0, v2))
• Length of middle line: ML = Distance(v1, vM), where vM is the middle point of segment v0-v2
Using these parameters, we may measure a triangle's flatness (as illustrated in figure 1-A) and skewness (the degree to which it differs from the isosceles triangle with the same W and H values). In addition, we consider the convexity of a triangle, indicating its orientation relative to a pre-defined vertex order.
For example, assuming the three vertices in figure 1-B are part of a closed cartographic line with a counter-clockwise vertex order, if the three-point orientation of v0-v1-v2 is counter-clockwise, then triangle T is convex; otherwise, it is concave.

2.2 Weighted effective area

For vertex v1 in figure 1-B, its original effective area is EA = 0.5*H*W (note that in practice the constant 0.5 may be omitted to improve efficiency). Subsequently, we may define the weighted effective area (WEA) as:

  WEA = WFlat*WSkew*WConvex*EA    (2)
where WFlat, WSkew and WConvex are flatness, skewness and convexity filter functions respectively.

2.3 Flatness filters

Potentially there are numerous ways to define flatness filters. Below are a few examples we have used in our experiments.

2.3.1 High-pass filter (LF)
  Wflat = ((4*M*arctan(H/(KS*W))/PI + N) / (M + N))^KH    (3)

This filter favours taller triangles and removes flatter triangles. Consequently, extreme points are likely to be retained and the effect is graphic simplification of the line being processed.
• Parameters: M > 0, N >= 0 (controlling the maximum range of the weight), KS > 0, KH >= 1
• For H/(KS*W) = 1, Wflat = 1 (standard form)
• For H/(KS*W) > 1, Wflat > 1 (enhancing taller triangles)
• For H/(KS*W) < 1, Wflat < 1 (weakening flatter triangles)
• For M = 1 and N = 0, Wflat lies in [0, 2^KH)

2.3.2 Low-pass filter (HF-01)
  Wflat = ((4*M*arctan(KS*W/H)/PI + N) / (M + N))^KH    (4)

This filter is in fact a symmetric form of the LF described above. Thus, it tends to eliminate extreme points and achieves the effect of semantic generalisation.
• Parameters: M > 0, N >= 0, KS > 0, KH >= 1
• For (KS*W)/H = 1, Wflat = 1 (standard form)
• For (KS*W)/H > 1, Wflat > 1 (enhancing flatter triangles)
• For (KS*W)/H < 1, Wflat < 1 (weakening taller triangles)
• For M = 1 and N = 0, Wflat lies in [0, 2^KH)

2.4 Skewness and Convexity filters

For triangle T(v0, v1, v2), ML is the distance between v1 and the middle point of edge v0-v2. Consequently, we have the ratio H/ML in [0, 1], which might be used in a skewness filter to retain points whose effective triangles are close to isosceles:
If we consider a cartographic line as directed, a convexity filter may be defined as:

  Wconvex = C (if convex) or 1 (if concave)    (6)
Here C is a positive constant. If C > 1 is used, this filter tends to retain points with convex effective triangles. Otherwise, points with concave effective triangles are retained.
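A compact sketch of how these weights might be combined with the basic effective area (equations 2, 3 and 6) is given below. It is an illustration only: the class and method names are hypothetical, and the skewness weight is omitted since equation (5) is not reproduced here.

    public class WeightedEffectiveArea {
        // Plain Visvalingam-Whyatt effective area of the triangle (v0, v1, v2).
        static double effectiveArea(double[] v0, double[] v1, double[] v2) {
            return 0.5 * Math.abs((v1[0] - v0[0]) * (v2[1] - v0[1])
                                - (v2[0] - v0[0]) * (v1[1] - v0[1]));
        }

        // High-pass flatness weight (equation 3): favours tall triangles.
        static double highPassFlatness(double h, double w,
                                       double m, double n, double ks, double kh) {
            double core = (4.0 * m * Math.atan(h / (ks * w)) / Math.PI + n) / (m + n);
            return Math.pow(core, kh);
        }

        // Convexity weight (equation 6): C for convex triangles, 1 for concave ones.
        static double convexity(boolean convex, double c) {
            return convex ? c : 1.0;
        }

        // Weighted effective area (equation 2), without the skewness term.
        static double wea(double[] v0, double[] v1, double[] v2,
                          double m, double n, double ks, double kh,
                          boolean convex, double c) {
            double ea = effectiveArea(v0, v1, v2);
            double w = Math.hypot(v2[0] - v0[0], v2[1] - v0[1]);    // base length W
            double h = (w > 0.0) ? 2.0 * ea / w : 0.0;              // height H
            return highPassFlatness(h, w, m, n, ks, kh) * convexity(convex, c) * ea;
        }

        public static void main(String[] args) {
            double[] a = {0, 0}, b = {1, 3}, c = {2, 0};
            // Standard-form parameters M=1, N=0, KS=1, KH=1 and no convexity bias.
            System.out.println(wea(a, b, c, 1, 0, 1, 1, true, 1.0));
        }
    }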
3 Experimental Results

3.1 Sample dataset and a web demo for weighted effective area

To evaluate the generalisation effects of the various filters described above, we have used a sample dataset (figure 2) of the coastline of the Isle of Wight, extracted from an original Ordnance Survey Land-Form PANORAMA dataset. There are five (closed) linear objects and 2376 vertices in total in the dataset, of which the largest object contains 2236 vertices. In order to provide a better view of the effects of generalisation based on weighted effective area, we have developed a Java applet-based web demonstrator, which may be accessed at the following link: http://www.cs.cf.ac.uk/user/S.Zhou/WEADemo/
Fig. 2. Sample dataset (Crown Copyright 2002) and the web demo
This demonstrator allows various generalisation results based on weighted effective area to be compared with those of RDP and the original Visvalingam-Whyatt algorithm, where the parameter values (i.e. RDP tolerance, effective area or weighted effective area) or the number of vertices retained after generalisation are adjustable. In the following subsections, we present a few experimental results that demonstrate the different effects of the various generalisation filters described above. These results were obtained from the web demonstrator. For each comparison, the same number of filtered vertices is retained in the whole generalised dataset, so that the differences in vertex selection/filtering between the various algorithms or parameter values may be highlighted. Also, in all experiments M=0 and N=1.
3.2 Experiments

3.2.1 Skewness

According to our current experimental results, the weight based on the skewness of the effective triangle does not have a significant impact on the output if moderate parameter values are used (e.g. figure 3-D, SM = 0 and SK = 2). On the other hand, more extreme parameter values may generate unpredictable and undesirable results. These results cast doubt on the value of using this weight factor.
Fig. 3. Effects of VW with Skewness (D) compared to original (A), RDP(B) and VW (C) (for B,C and D, 1000 vertices retained in the whole dataset)
Fig. 4. Effects of applying convexity weight to VW
3.2.2 Convexity

In our experiments, extreme convexity weight values generate quite significant effects (figure 4). On top of the initial effective area value, application of a very small weight (C = 0.004) tends to retain vertices on the external local convex hulls, while a very large weight (C = 25) has the opposite effect of retaining vertices on the internal local convex hulls (see Normant and van de Walle 1996, regarding local convex hulls).
3.2.3 Line simplification with WEA - LF Scheme

Figures 5 and 6 demonstrate the effect of graphic simplification using the "high-pass" weighted effective area, in comparison with the results of RDP. This filter appears to be able to generate simplification effects similar to those of RDP.
Fig. 5. Simplification by RDP, LF (KS=0.5, KH=1) and LF(KS=1, KH=1) (1000 vertices retained in the whole dataset)
Fig. 6. Simplification by RDP and LF(KS=0.2, KH=1), 200 vertices retained
Fig. 7. Effects of HF01 - Original VW, KS/KH as: 0.2/1; 0.5/1; 1/1 (1000 vertices retained in the whole dataset)
3.2.4 Line generalisation with WEA - HF01

The effects of the low-pass filter HF01 are shown in figures 7-9. Figure 7 demonstrates the effect of defining different "standard forms" (i.e. weight equal to 1), represented by KS (KS = 0.2, 0.5 and 1). Clearly, a "flatter" standard form (i.e. with a smaller KS) results in heavier generalisation.
Figure 8 shows the generalisation effects of the same set of parameter values at different levels of detail (the number of vertices retained in the whole dataset ranges from 1000 down to 150).
Fig. 8. Effects of HF01 at KS=0.5 & KH=2 with 1000/600/300/150 vertices (1-4) retained
Finally, figure 9 illustrates the effect of different values of KH. It is obvious that at the same level of detail (1000 vertices), a larger KH value results in heavier generalisation.
Fig. 9. Effects of different KH (1/2/4/8) for the same KS (1), 1000 vertices
4 Discussion

The experimental results in the previous section have demonstrated that the application of weight factors for flatness, convexity and (to a lesser extent) skewness can provide considerable control over both graphic and semantic generalisation effects. Apart from the skewness filter, the filter parameters provide consistent and predictable control over the resulting generalisation.
It is worth noting that greater control does not always result in a "better" effect, but the parameters described and demonstrated here do appear to provide excellent potential for obtaining generalisations that are adapted to the requirements of particular applications. At the current stage of research and development, we suggest that WEA may best be applied in an interactive manner in order to obtain preferred results, for which an interactive mapping tool has been provided. In future it may be possible to select parameter values automatically based on the results of training with different types of generalisation.

4.1 Topologically consistent generalisation
A problem associated with the VW algorithm (as well as many other algorithms such as RDP) is that topological consistency is not guaranteed. WEA-based generalisation is no exception, as it adopts the same bottom-up process as the VW algorithm. It is however fairly easy to geometrically (i.e. graphically) remove inconsistencies by adopting simple approaches such as retaining any vertex whose removal may cause an inconsistency. For example, the topologically consistent multi-representational dataset used in (Zhou and Jones 2003) was generated in this way. There a Delaunay triangulation was used to detect when removal of a vertex would cause an inconsistency.

4.2 The issue of feature partition
As mentioned earlier, the bottom-up process in the VW algorithm is a localised, minimalist approach, as only the smallest details (three vertices) are considered. Generalisation effects on larger details are achieved progressively without explicit knowledge of them. The lack of direct control over these (often semantically significant) details makes it more difficult to decide the best combination of parameter values. Indeed, a single set of parameters may often not be appropriate for every large detail in the cartographic line to be generalised (which is especially true for the convexity filters). Therefore, it is natural to consider partitioning the line into several large details and subsequently applying bottom-up or other types of generalisation to them with appropriate individual parameter sets. Many methods for (geometric or semantic) detail identification and feature partition have been proposed, such as sinuosity measures (Plazanet 1995), Voronoi skeletons (Ogniewicz and Kübler 1995), skeletons based on Delaunay triangulation (van der Poorten and Jones 2002) and various convex hull based methods (e.g. Normant and van de Walle 1996; Zhou
and Jones 2001). Following partitioning of features and addressing issues such as hierarchical details, overlapped details and oriented details of larger scale (i.e. more vertices), the best overall generalisation effects may be achieved by combining a localised bottom-up generalisation method, as presented here, with methods which take a more global view of features and operate successfully at the level of larger details (such as the branch pruning approach of van der Poorten and Jones, 2002) or multiple features.
References

Douglas, D.H. and Peucker, T.K., 1973, Algorithms for the reduction of the number of points required to represent a digitised line or its caricature. The Canadian Cartographer, 10(2), 112-122.
Normant, F. and van de Walle, A., 1996, The Sausage of Local Convex Hulls of a Curve and the Douglas-Peucker Algorithm. Cartographica, 33(4), 25-35.
Ogniewicz, R.L. and Kübler, O., 1995, Hierarchic Voronoi Skeletons. Pattern Recognition, 28(3), 343-359.
Plazanet, C., 1995, Measurements, Characterization, and Classification for Automated Line Feature Generalization. ACSM/ASPRS Annual Convention and Exposition, Vol. 4 (Proc. Auto-Carto 12), 59-68.
Ramer, U., 1972, An iterative procedure for polygonal approximation of planar closed curves. Computer Graphics and Image Processing, 1, 244-256.
Robinson, A.H., Morrison, J.L., Muehrcke, P.C., Kimerling, A.J. and Guptill, S.C., 1995, Elements of Cartography, sixth edition. John Wiley & Sons, Inc.
van der Poorten, P.M. and Jones, C.B., 2002, Characterisation and generalisation of cartographic lines using Delaunay triangulation. International Journal of Geographical Information Science, 16(8), 773-794.
Visvalingam, M. and Whyatt, J.D., 1993, Line generalisation by repeated elimination of points. Cartographic Journal, 30(1), 46-51.
Visvalingam, M. and Williamson, P.J., 1995, Simplification and generalization of large scale data for roads. Cartography and Geographic Information Science, 22(4), 3-15.
Visvalingam, M. and Herbert, S., 1999, A computer science perspective on the bend-simplification algorithm. Cartography and Geographic Information Science, 26(4), 253-270.
Zhou, S. and Jones, C.B., 2001, Multi-Scale Spatial Database and Map Generalisation. ICA Commission on Map Generalization, 4th Workshop on Progress in Automated Map Generalization.
Zhou, S. and Jones, C.B., 2003, A Multi-representation Spatial Data Model. Proc. 8th International Symposium on Advances in Spatial and Temporal Databases (SSTD'03), LNCS 2750, 394-411.
Introducing a Reasoning System Based on Ternary Projective Relations

Roland Billen (1) and Eliseo Clementini (2)

(1) Center for Geosciences, Department of Geography and Geomatics, University of Glasgow, Glasgow G12 8QQ, Scotland (UK), [email protected]
(2) Department of Electrical Engineering, University of L'Aquila, I-67040 Poggio di Roio (AQ), Italy, [email protected]
Abstract

This paper introduces a reasoning system based on ternary projective relations between spatial objects. The model applies to spatial objects of type point and region, is based on basic projective invariants, and takes into account the size and shape of the three objects that are involved in a relation. The reasoning system uses permutation and composition properties, which allow the inference of unknown relations from given ones.
1 Introduction

The field of Qualitative Spatial Reasoning (QSR) has attracted great interest in the spatial data handling community due to its potential applications [1]. An important topic in QSR is the definition of reasoning systems on qualitative spatial relations. For example, regarding topological relations, the 9-intersection model [2] provides formal definitions for the relations and a reasoning system based on composition tables [3] establishes a mechanism to find new relations from a set of given ones. Topological relations capture an important part of geometric knowledge and can be used to formulate qualitative queries about the connection properties of close spatial objects, like "retrieve the lakes that are inside Scotland". Other qualitative queries that involve disjoint objects cannot be formulated in topological terms, for example: "the cities that are between Glasgow and Edinburgh", "the lakes that are surrounded by the mountains", "the shops that are on the right of the road", "the building that is before the crossroad". All these examples can be seen as semantic
interpretations of underlying projective properties of spatial objects. As discussed in [4], geometric properties can be subdivided into three groups: topological, projective and metric. Most qualitative relations between spatial objects can be defined in terms of topological or projective properties [5], with the exception of qualitative distance and direction relations (such as close, far, east, north) that are a qualitative interpretation of metric distances and angles [6]. The use of projective properties for the definition of spatial relations is rather new. A model for ternary projective relations has been introduced for points and regions in [7]. The model is based on a basic geometric invariant in projective space, the collinearity of three points, and takes into account the size and shape of the three objects involved in a relation. As a first approximation, this work can be compared to research on qualitative relations dealing with relative positioning or cardinal directions [8-13]. Most approaches consider binary relations to which a frame of reference is associated, and they do not avoid the use of metric properties (minimum bounding rectangles, angles, etc.). In this respect, the main difference in our approach is that we only deal with projective invariants, disregarding distances and angles. Most work on projective relations deals with point abstractions of spatial features. In [9], the authors develop a model for cardinal directions between extended objects. Composition tables for the latter model have been developed in [14]. Freksa's double-cross calculus [15] is similar to our approach in the case of points. Such a calculus, as further discussed in [16, 17], is based on ternary directional relations between points. However, in Freksa's model, an intrinsic frame of reference centred in a given point partitions the plane into four quadrants that are given by the front-back and right-left dichotomies. This leads to a greater number of qualitative distinctions with different algebraic properties and composition tables. In this paper, we establish a reasoning system based on the ternary projective relations that were introduced in [7]. From a basic set of rules about the permutation and composition of relations, we will show how it is possible to infer unknown relations using the algebraic properties of projective relations. The paper is organized as follows. We start in Section 2 by introducing the general aspects of a reasoning system with ternary relations. In Section 3 we summarize the model for ternary projective relations between points and we present the associated reasoning system. In Section 4, we recall the model in the case of regions and introduce the reasoning system for this case too. In Section 5, we draw short conclusions and discuss some future developments.
2 Reasoning systems on ternary relations

In this section, we present the basis of a reasoning system on ternary projective relations. Usually, reasoning systems apply to binary spatial relations, for example, to topological relations [3] and to directional relations [18]. For binary relations, given three objects a, b, c and two relations r(a,b) and r(b,c), the reasoning system allows one to find the relation r(a,c). This is done by giving an exhaustive list of results for all possible input relations, in the form of a composition table. The inverse relations complete the reasoning system, by finding, given the relation r(a,b), the relation r(b,a). Reasoning with ternary relations is slightly more complex and it has not been applied much to spatial relations until now, with few exceptions [16, 17, 19]. The notation we use for ternary relations is of the kind r(PO, RO1, RO2), where the first object PO represents the primary object, the second object RO1 represents the first reference object and the third object RO2 represents the second reference object. The primary object is the one that holds the relation r with the two reference objects, i.e., PO holds the relation r with RO1 and RO2. A reasoning system with ternary relations is based on two different sets of rules:
• a set of rules for permutations. Given three objects a, b, c and a relation r(a,b,c), these rules allow one to find the relations that hold for permutations of the three arguments. There are 6 (= 3!) potential arrangements of the arguments. The permutation rules correspond to the inverse relation of binary systems.
• a set of rules for composition. Given four objects a, b, c, d and the two relations r(a,b,c) and r(b,c,d), these rules allow one to find the relation r(a,c,d). The composition of relations r1 and r2 is indicated r1 ∘ r2.
Considering a set of relations, it is possible to prove that the following four rules, three permutations and one composition, are sufficient to derive all the possible ternary relations over a set of four arguments:

(1) r(a,b,c) → r'(a,c,b)
(2) r(a,b,c) → r''(b,a,c)
(3) r(a,b,c) → r'''(c,a,b)
(4) r1(a,b,c) ∘ r2(b,c,d) → r3(a,c,d)
In the next sections, we will see how to apply such a ternary reasoning system in the case of projective ternary relations between points and between regions.
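The permutation part of this claim is easy to check mechanically: rules (1)-(3) act as permutations of the three argument positions and together generate all 3! = 6 argument orders. The following short Python check is an illustration, not part of the original paper.

```python
from itertools import permutations

# Each rule rewrites the argument order (a, b, c) of a known relation.
RULES = {
    "(1)": lambda t: (t[0], t[2], t[1]),  # r(a,b,c) -> r'(a,c,b)
    "(2)": lambda t: (t[1], t[0], t[2]),  # r(a,b,c) -> r''(b,a,c)
    "(3)": lambda t: (t[2], t[0], t[1]),  # r(a,b,c) -> r'''(c,a,b)
}

def reachable(start=("a", "b", "c")):
    """All argument orders obtainable by repeatedly applying rules (1)-(3)."""
    seen, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for rule in RULES.values():
            nxt = rule(current)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

orders = reachable()
assert orders == set(permutations(("a", "b", "c")))
print(len(orders))   # 6: every ordering of the three arguments is derivable
```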
3 Reasoning system on ternary projective relations between points

The ternary projective relations between points have been introduced in a previous paper [7]. They have a straightforward definition because they are related to common concepts of projective geometry [20]. In section 3.1, we only present the definitions and the concepts necessary for a good understanding of the reasoning system. In section 3.2, we show how to apply the reasoning system to these ternary projective relations.

3.1 Ternary projective relations between points

Our basic set of projective relations is based on the most important geometric invariant in a projective space: the collinearity of three points. Therefore, the nature of projective relations is intrinsically ternary. Given a relation r(P1, P2, P3), the points that act as reference objects must be distinct, in such a way that they define a unique line passing through them, indicated with P2P3. When the relation needs an orientation on this line, the orientation is assumed to be from the first reference object to the second one: the oriented line is indicated with P2P3. The most general projective relations between three points are the collinear relation and its complement, the aside relation. The former can be refined into the between and nonbetween relations, and the latter into the rightside and leftside relations. In turn, the nonbetween relation can be subdivided into the before and after relations, completing the hierarchical model of the projective relations between three points of the plane (see Figure 1.a). Out of this hierarchical model, five basic projective relations (before, between, after, rightside, leftside) are extracted. They correspond to the finest projective partition of the plane (see Figure 1.b).
Fig. 1. Projective relations between points: a. Hierarchical model of the relations; b. Projective partition of the plane
Definitions are given only for the collinear relation and the five basic relations.

Definition 1. A point P1 is collinear to two given points P2 and P3, with P2 ≠ P3, collinear(P1, P2, P3), if P1 ∈ P2P3.

Definition 2. A point P1 is before points P2 and P3, with P2 ≠ P3, before(P1, P2, P3), if collinear(P1, P2, P3) and P1 ∈ (−∞, P2), where the last interval is part of the oriented line P2P3.

Definition 3. A point P1 is between two given points P2 and P3, with P2 ≠ P3, between(P1, P2, P3), if P1 ∈ [P2, P3].

Definition 4. A point P1 is after points P2 and P3, with P2 ≠ P3, after(P1, P2, P3), if collinear(P1, P2, P3) and P1 ∈ (P3, +∞), where the last interval is part of the oriented line P2P3.

Considering the two halfplanes determined by the oriented line P2P3, respectively the halfplane to the right of the line, which we indicate with HP−(P2P3), and the halfplane to the left of the line, which we indicate with HP+(P2P3), we may define the relations rightside and leftside.
Definition 5. A point P1 is rightside of two given points P2 and P3, rightside(P1, P2, P3), if P1 ∈ HP−(P2P3).

Definition 6. A point P1 is leftside of two given points P2 and P3, leftside(P1, P2, P3), if P1 ∈ HP+(P2P3).
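A direct computational reading of Definitions 1-6 is sketched below (an assumed illustration, not code from the paper): the halfplane of P1 is decided by the sign of a cross product with respect to the oriented line P2P3, and the collinear cases are separated by the position of P1 along that line.

```python
def point_relation(p1, p2, p3, eps=1e-12):
    """Classify p1 against the oriented reference line p2 -> p3 (p2 != p3)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    dx, dy = x3 - x2, y3 - y2                   # direction of p2 -> p3
    cross = dx * (y1 - y2) - dy * (x1 - x2)     # sign selects the halfplane
    if cross < -eps:
        return "rightside"
    if cross > eps:
        return "leftside"
    # Collinear: locate p1 by its parameter t along the oriented line.
    t = (dx * (x1 - x2) + dy * (y1 - y2)) / (dx * dx + dy * dy)
    if t < 0:
        return "before"
    if t > 1:
        return "after"
    return "between"

print(point_relation((0, -1), (0, 0), (2, 0)))   # rightside
print(point_relation((3, 0), (0, 0), (2, 0)))    # after
```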
3.2 Reasoning system

Using this model for ternary relations between points, it is possible to build a reasoning system which allows the prediction of ternary relations between specific points. Such a reasoning system is an application of the reasoning system on ternary relations previously introduced. The four rules become:

(1) r(P1, P2, P3) → r'(P1, P3, P2)
(2) r(P1, P2, P3) → r''(P2, P1, P3)
(3) r(P1, P2, P3) → r'''(P3, P1, P2)
(4) r1(P1, P2, P3) ∘ r2(P2, P3, P4) → r3(P1, P3, P4)

For any ternary relation r(P1, P2, P3), Table 1 gives the corresponding relations resulting from permutation rules (1), (2) and (3). The following abbreviations are used: bf for before, bt for between, af for after, rs for
rightside and ls for leftside. For example, knowing bf(P1,P2,P3), one can derive the relations corresponding to the permutations of the three points, which are in this case af(P1,P3,P2), bt(P2,P1,P3) and af(P3,P1,P2).

Table 1. Permutation table of ternary projective relations between points

r(P1,P2,P3) | r(P1,P3,P2) | r(P2,P1,P3) | r(P3,P1,P2)
bf          | af          | bt          | af
bt          | bt          | bf          | bf
af          | bf          | af          | bt
rs          | ls          | ls          | rs
ls          | rs          | rs          | ls
Table 2 gives relations resulting from the composition rule (4). The first column of the table contains the basic ternary relations for (P1,P2,P3) and the first row contains the basic ternary relations (P2,P3,P4). The other cells give the deduced transitive relations for (P1,P3,P4). For some entries, several cases may occur and all the possibilities are presented in the table.

Table 2. Composition table of ternary projective relations between points

   | bf     | bt     | af     | rs             | ls
bf | bf     | bf     | af, bt | rs             | ls
bt | af, bt | bt     | bf     | ls             | rs
af | af     | af, bt | bf     | ls             | rs
rs | rs     | rs     | ls     | af, rs, ls, bt | bf, rs, ls
ls | ls     | ls     | rs     | bf, rs, ls     | af, rs, ls, bt
Using this reasoning system and knowing any two ternary relations between three different points out of a set of four, it is possible to predict the ternary relations between all the other possible combinations of three points out of the same set.
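As an illustration (not part of the original paper), the two tables can be encoded directly as lookup structures; the permutation entries below follow Table 1 and the composition entries are transcribed from Table 2 as printed.

```python
# Permutation images r'(P1,P3,P2), r''(P2,P1,P3), r'''(P3,P1,P2), per Table 1.
PERM = {
    "bf": ("af", "bt", "af"),
    "bt": ("bt", "bf", "bf"),
    "af": ("bf", "af", "bt"),
    "rs": ("ls", "ls", "rs"),
    "ls": ("rs", "rs", "ls"),
}

# Composition results transcribed from Table 2 (row label, then column label).
COMPOSE = {
    "bf": {"bf": {"bf"}, "bt": {"bf"}, "af": {"af", "bt"}, "rs": {"rs"}, "ls": {"ls"}},
    "bt": {"bf": {"af", "bt"}, "bt": {"bt"}, "af": {"bf"}, "rs": {"ls"}, "ls": {"rs"}},
    "af": {"bf": {"af"}, "bt": {"af", "bt"}, "af": {"bf"}, "rs": {"ls"}, "ls": {"rs"}},
    "rs": {"bf": {"rs"}, "bt": {"rs"}, "af": {"ls"},
           "rs": {"af", "rs", "ls", "bt"}, "ls": {"bf", "rs", "ls"}},
    "ls": {"bf": {"ls"}, "bt": {"ls"}, "af": {"rs"},
           "rs": {"bf", "rs", "ls"}, "ls": {"af", "rs", "ls", "bt"}},
}

def permutations_of(relation):
    """All relations derivable from r(P1,P2,P3) by rules (1)-(3)."""
    r1, r2, r3 = PERM[relation]
    return {"(P1,P3,P2)": r1, "(P2,P1,P3)": r2, "(P3,P1,P2)": r3}

# Matches the worked example in the text for bf(P1,P2,P3).
print(permutations_of("bf"))
# A safe composition lookup: bf(P1,P2,P3) and bf(P2,P3,P4) give bf(P1,P3,P4).
print(COMPOSE["bf"]["bf"])
```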
4 Reasoning system on ternary projective relations between regions

As in the section on relations between points, we first recall some concepts about the ternary projective relations between regions (section
4.1). Afterwards, we introduce the associated reasoning system, including an example of application of such a system (section 4.2).

4.1 Ternary projective relations between regions

We will assume that a region is a regular closed point set, possibly with holes and separate components. We will only present briefly the basic projective relations and the related partition of the space, while we refer to [7] for a more extended treatment. In the following, we indicate the convex hull of a region with a unary function CH(). As in the case of points, we use the notation r(A, B, C) for projective relations between regions, where the first argument A is a region that acts as the primary object, while the second and third arguments B and C are regions that act as reference objects. The latter two regions must satisfy the condition CH(B) ∩ CH(C) = ∅, that is, the intersection of their convex hulls must be empty. This condition allows us to build a reference frame based on B and C, as will be defined in this section. We also use the concept of orientation, which is represented by an oriented line connecting any point in B with any point in C.

Definition 7. Given two regions B and C, with CH(B) ∩ CH(C) = ∅, a region A is collinear to regions B and C, collinear(A, B, C), if for every point P ∈ A there exists a line l intersecting B and C that also intersects P, that is: ∀P ∈ A, ∃l, (l ∩ B ≠ ∅) ∧ (l ∩ C ≠ ∅) | l ∩ P ≠ ∅.

The projective partition of the space into five regions corresponding to the five basic projective relations is based, as it was for the points, on the definition of the general collinear relation between three regions. The portion of the space where this relation is true is delimited by four lines that are the common external tangents and the common internal tangents. The common external tangents of B and C are defined by the fact that they are also tangent to the convex hull of the union of B and C (figure 2.a). The common internal tangents intersect inside the convex hull of the union of regions B and C and divide the plane into four cones (figure 2.b). In order to distinguish the four cones, we consider an oriented line from region B to region C and we call Cone−∞(B,C) the cone that contains region B, Cone+∞(B,C) the cone that contains region C, Cone−(B,C) the cone that is to the right of the oriented line, and Cone+(B,C) the cone that is to the left of
the oriented line. We obtain a partition of the space into five regions, which correspond to the five basic projective relations before, between, after, rightside, and leftside (figure 3.a).
Fig. 2. Internal and external tangents: a. Common external tangents; b. Common internal tangents
Definition 8. A region A is before two regions B and C, before(A, B, C), with CH(B) ∩ CH(C) = ∅, if A ⊆ Cone−∞(B,C) − CH(B ∪ C).

Definition 9. A region A is between two regions B and C, between(A, B, C), with CH(B) ∩ CH(C) = ∅, if A ⊆ CH(B ∪ C).

Definition 10. A region A is after two regions B and C, after(A, B, C), with CH(B) ∩ CH(C) = ∅, if A ⊆ Cone+∞(B,C) − CH(B ∪ C).

Definition 11. A region A is rightside of two regions B and C, rightside(A, B, C), with CH(B) ∩ CH(C) = ∅, if A is contained inside Cone−(B,C) minus the convex hull of the union of regions B and C, that is, if A ⊆ (Cone−(B,C) − CH(B ∪ C)).

Definition 12. A region A is leftside of two regions B and C, leftside(A, B, C), with CH(B) ∩ CH(C) = ∅, if A is contained inside Cone+(B,C) minus the convex hull of the union of regions B and C, that is, if A ⊆ (Cone+(B,C) − CH(B ∪ C)).

The set of five projective relations before, between, after, rightside, and leftside can be used as a set of basic relations to build a model for all projective relations between three regions of the plane. The model, which we call the 5-intersection, is synthetically expressed by a matrix of five values that are the empty/non-empty intersections of a region A with the five regions defined in Figure 3.b. In the matrix, a value 0 indicates an empty intersection, while a value 1 indicates a non-empty intersection. The five basic relations correspond to values of the matrix with only one non-empty value (Figure 4). In total, the 5-intersection matrix can have 2^5 different values that correspond to the same theoretical number of projective
relations. Excluding the configuration with all zero values, which cannot exist, we are left with 31 different projective relations between the three regions A, B and C.
Fig. 4. The projective relations with object A intersecting only one of the regions of the plane.
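As an illustration of the 5-intersection matrix (an assumed sketch, not code from the paper), once the five zones of the reference frame built on B and C are available as polygons, each matrix value is simply a test for an empty or non-empty intersection of A with the corresponding zone. Constructing the zones themselves from the common tangents is not shown; the zones below are toy rectangles used only to exercise the test.

```python
from shapely.geometry import Polygon

def five_intersection(a, zones):
    """Return the 0/1 matrix entry of region a for each of the five zones."""
    return {name: int(not a.intersection(zone).is_empty)
            for name, zone in zones.items()}

def basic_relation(matrix):
    """If exactly one entry is non-empty, A holds that basic relation."""
    hits = [name for name, bit in matrix.items() if bit == 1]
    return hits[0] if len(hits) == 1 else None   # None: a compound relation

# Toy zones (axis-aligned rectangles for illustration only).
zones = {
    "before":    Polygon([(-10, -1), (0, -1), (0, 1), (-10, 1)]),
    "between":   Polygon([(0, -1), (4, -1), (4, 1), (0, 1)]),
    "after":     Polygon([(4, -1), (14, -1), (14, 1), (4, 1)]),
    "rightside": Polygon([(-10, -6), (14, -6), (14, -1), (-10, -1)]),
    "leftside":  Polygon([(-10, 1), (14, 1), (14, 6), (-10, 6)]),
}
a = Polygon([(5, -0.5), (6, -0.5), (6, 0.5), (5, 0.5)])
m = five_intersection(a, zones)
print(m, basic_relation(m))   # only the 'after' zone is intersected
```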
4.2 Reasoning system

The reasoning system for regions is fully defined on the basis of the following rules:

(1) r(A, B, C) → r'(A, C, B)
(2) r(A, B, C) → r''(B, A, C)
(3) r(A, B, C) → r'''(C, A, B)
(4) r1(A, B, C) ∘ r2(B, C, D) → r3(A, C, D)
Currently, the reasoning system has been established for the five basic projective relations only. We are working on its extension to the whole set of projective relations by developing a system which will combine the results of permutations and compositions of the basic cases.

Table 3. Permutation table of ternary projective relations between regions

r(A,B,C) | r'(A,C,B) | r''(B,A,C)                               | r'''(C,A,B)
bf       | af        | bt, (bt ∧ rs), (bt ∧ ls), (bt ∧ rs ∧ ls) | af, (af ∧ rs), (af ∧ ls), (af ∧ rs ∧ ls)
bt       | bt        | bf, (bf ∧ rs), (bf ∧ ls), (bf ∧ rs ∧ ls) | bf, (bf ∧ rs), (bf ∧ ls), (bf ∧ rs ∧ ls)
af       | bf        | af, (af ∧ rs), (af ∧ ls), (af ∧ rs ∧ ls) | bt, (bt ∧ rs), (bt ∧ ls), (bt ∧ rs ∧ ls)
rs       | ls        | ls                                       | rs
ls       | rs        | rs                                       | ls
For any basic ternary relation r(A,B,C), Table 3 gives the corresponding relations resulting from permutation rules (1), (2) and (3). The similarity with the permutation table for three points is clear. Only in some cases are there exceptions to the basic permutations for points. In those cases, the "strong" relation (which is the one that holds also for points) can be combined with one or both of the leftside and rightside relations. The results of the composition rule (4) of the reasoning system are presented in Table 4. The first column of the table contains the basic ternary relations for r1(A,B,C) and the first row contains the basic ternary relations for r2(B,C,D). The other cells give the deduced r3(A,C,D) relations. In this table, we present only the single relations as results. The full composition relations can be obtained by combinations of these single relations. For example, the result of the composition before(A,B,C) ∘ before(B,C,D) is: bf, rs, ls, (bf ∧ rs), (bf ∧ ls), (ls ∧ rs), (bf ∧ rs ∧ ls).

Table 4. Composition table of ternary projective relations between regions

   | bf             | bt             | af                 | rs             | ls
bf | bf, rs, ls     | bf             | bf, bt, af, rs, ls | bf, bt, af, rs | bf, bt, af, ls
bt | bt, af, rs, ls | bt             | bf, bt, rs, ls     | bf, bt, af, ls | bf, bt, af, rs
af | af             | bt, af, rs, ls | bt, af             | bf, bt, af, ls | bf, bt, af, rs
rs | af, rs         | bf, bt, rs     | bf, bt, ls         | bt, af, rs, ls | bf, rs, ls
ls | af, ls         | bf, bt, ls     | bf, bt, rs         | bf, rs, ls     | bt, af, rs, ls
We will end this section with an example of application of the reasoning system. Given the relations before(A,B,C) and rightside(B,C,D), we find the potential relations for r(A,B,D).
• Step 1: apply (1) to the first term and (2) to the second term (fig. 5a, b and c):
before(A,B,C) → after(A,C,B); rightside(B,C,D) → leftside(C,B,D).
• Step 2: apply (4) to the following composition:
after(A,C,B) ∘ leftside(C,B,D) → before(A,B,D) (fig. 5.d)
or between(A,B,D) (fig. 5.e)
or rightside(A,B,D) (fig. 5.f)
or (before(A,B,D) ∧ between(A,B,D)) (fig. 5.g)
or (before(A,B,D) ∧ rightside(A,B,D)) (fig. 5.h)
or (between(A,B,D) ∧ rightside(A,B,D)) (fig. 5.i)
or (before(A,B,D) ∧ between(A,B,D) ∧ rightside(A,B,D)) (fig. 5.j)
5 Conclusion and future work

In this paper, we have introduced a reasoning system based on ternary projective relations between points and between regions. These sets of qualitative spatial relations, invariant under projective transformations, provide a new classification of configurations between three objects based on a segmentation of the space into five regions. The associated reasoning system allows inferring relations between three objects using permutation and composition rules. It is the first step in the establishment of a complete qualitative reasoning system based on projective properties of space. In the future, the reasoning system has to be defined more formally; in particular, the relations contained in the permutation and composition tables have to be proved. Another issue that should be explored is the realisation of a complete qualitative spatial calculus for reasoning about ternary projective relations.
Fig. 5. Example of application of the reasoning system between regions: a. applying (1) to bf(A,B,C) implies af(A,C,B); b. applying (2) to rs(B,C,D) implies… c. …ls(C,B,D); d. bf(A,B,D); e. bt(A,B,D); f. rs(A,B,D); g. bf(A,B,D) ∧ bt(A,B,D); h. bf(A,B,D) ∧ rs(A,B,D); i. bt(A,B,D) ∧ rs(A,B,D); j. bf(A,B,D) ∧ bt(A,B,D) ∧ rs(A,B,D)
Acknowledgements This work was supported by M.I.U.R. under project “Representation and management of spatial and geographic data on the Web”.
References

1. Cohn, A.G. and S.M. Hazarika, Qualitative Spatial Representation and Reasoning: An Overview. Fundamenta Informaticae, 2001. 46(1-2): p. 1-29.
2. Egenhofer, M.J. and J.R. Herring, Categorizing Binary Topological Relationships Between Regions, Lines, and Points in Geographic Databases. 1991, Department of Surveying Engineering, University of Maine, Orono, ME.
3. Egenhofer, M.J., Deriving the composition of binary topological relations. Journal of Visual Languages and Computing, 1994. 5(1): p. 133-149.
4. Clementini, E. and P. Di Felice, Spatial Operators. ACM SIGMOD Record, 2000. 29(3): p. 31-38.
5. Waller, D., et al., Place learning in humans: The role of distance and direction information. Spatial Cognition and Computation, 2000. 2: p. 333-354.
6. Clementini, E., P. Di Felice, and D. Hernández, Qualitative representation of positional information. Artificial Intelligence, 1997. 95: p. 317-356.
7. Billen, R. and E. Clementini, A model for ternary projective relations between regions, in EDBT2004 - 9th International Conference on Extending DataBase Technology, E. Bertino, Editor. 2004, Springer-Verlag: Heraklion - Crete, Greece. p. 310-328.
8. Gapp, K.-P., Angle, Distance, Shape, and their Relationship to Projective Relations, in Proceedings of the 17th Conference of the Cognitive Science Society. 1995. Pittsburgh, PA.
9. Goyal, R. and M.J. Egenhofer, Cardinal directions between extended spatial objects. IEEE Transactions on Knowledge and Data Engineering, 2003. (in press).
10. Kulik, L. and A. Klippel, Reasoning about Cardinal Directions Using Grids as Qualitative Geographic Coordinates, in Spatial Information Theory. Cognitive and Computational Foundations of Geographic Information Science: International Conference COSIT'99, C. Freksa and D.M. Mark, Editors. 1999, Springer. p. 205-220.
11. Schmidtke, H.R., The house is north of the river: Relative localization of extended objects, in Spatial Information Theory. Foundations of Geographic Information Science: International Conference, COSIT 2001, D.R. Montello, Editor. 2001, Springer. p. 415-430.
12. Moratz, R. and K. Fischer, Cognitively Adequate Modelling of Spatial Reference in Human-Robot Interaction, in Proc. of the 12th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2000. 2000. Vancouver, BC, Canada.
13. Schlieder, C., Reasoning about ordering, in Spatial Information Theory: A Theoretical Basis for GIS - International Conference, COSIT'95, A.U. Frank and W. Kuhn, Editors. 1995, Springer-Verlag: Berlin. p. 341-349.
14. Skiadopoulos, S. and M. Koubarakis, Composing cardinal direction relations. Artificial Intelligence, 2004. 152(2): p. 143-171.
15. Freksa, C., Using Orientation Information for Qualitative Spatial Reasoning, in Theories and Models of Spatio-Temporal Reasoning in Geographic Space, A.U. Frank, I. Campari, and U. Formentini, Editors. 1992, Springer-Verlag: Berlin. p. 162-178.
16. Scivos, A. and B. Nebel, Double-Crossing: Decidability and Computational Complexity of a Qualitative Calculus for Navigation, in Spatial Information Theory. Foundations of Geographic Information Science: International Conference, COSIT 2001, D.R. Montello, Editor. 2001, Springer. p. 431-446.
17. Isli, A., Combining Cardinal Direction Relations and other Orientation Relations in QSR, in AI&M 14-2004, Eighth International Symposium on Artificial Intelligence and Mathematics. 2004. January 4-6, 2004, Fort Lauderdale, Florida.
18. Frank, A.U., Qualitative Reasoning about Distances and Directions in Geographic Space. Journal of Visual Languages and Computing, 1992. 3(4): p. 343-371.
19. Isli, A. and A.G. Cohn, A new approach to cyclic ordering of 2D orientations using ternary relation algebras. Artificial Intelligence, 2000. 122: p. 137-187.
20. Struik, D.J., Projective Geometry. 1953, London: Addison-Wesley.
A Discrete Model for Topological Relationships between Uncertain Spatial Objects

Erlend Tøssebro and Mads Nygård

Department of Computer Science, Norwegian University of Science and Technology, NO-7491 Trondheim, Norway. {tossebro, mads}@idi.ntnu.no
Abstract

Even though the positions of objects may be uncertain, one may know some topological information about them. In this paper, we develop a new model for storing topological relationships in uncertain spatial data. It is intended to be the equivalent of such representations as the Node-Arc-Area representation, but for spatial objects with uncertain positions.

Keywords: Uncertain spatial data, topology, node-arc-area.
1 Introduction

Several models have been presented for storing spatial data with uncertain or vague positions, including the models from (Tøssebro and Nygård 2002a and 2002b) that this paper builds on. However, these models do not handle topological relationships well. One example would be to describe two regions that share a stretch of boundary, but one does not know quite where this boundary lies. The goal of this paper is to develop a vector model of uncertain spatial points, lines and regions that incorporates this sort of information. The following examples describe situations in which one wants to store topological relationships between uncertain objects:

Example 1: In a historical database, it may be known that two empires, such as the Egyptians and the Hittites, shared a boundary. However, it is not necessarily known exactly where the boundary was. If a coarse temporal granularity is used, the border may also have changed inside the time period of one snapshot. In such cases one may want to store the fact that the boundaries were shared rather than there being a possible overlap. A more recent example of this problem from (Plewe 2002) is the fact that the textual descriptions of county borders in Utah in the 19th century sometimes gave a lot of room for interpretation.

Example 2: In a database containing information on past rivers and lakes, the location of the rivers and lakes may be partially uncertain. How-
ever, one may know that a particular river came out of a particular lake. This relationship must be stored explicitly because it cannot be deduced using existing representations. Based only on the uncertain representation, the river may overlap the lake or start some distance away from it. Additionally, one must store such topological information if one wants to store uncertain or vague partitions. A partition is a set of uncertain regions that covers the entire area of interest and where the regions do not overlap each other. One example of a vague partition is: Example 3: A database of soil types or land cover should store a fuzzy partition because soil types typically change gradually rather than abruptly. However, the total map covers the entire terrain and has no gaps, thus being a partition. All these examples illustrate the need to store topology in uncertain data. Because such topology in many cases cannot be inferred from the data representation itself, it must somehow be stored explicitly. This paper will describe representations for storing topological data as well as methods for generating these representations from a collection of vectorized regions. The representations are based on existing models for uncertain spatial data by the present authors as well as how topology is stored for crisp data. (The word “crisp” will be used in this paper as the opposite of uncertain or vague.) A vector model is chosen rather than a raster model because vector data take much less space than raster data, and topology is much easier to generate and store for vector data. Most models for dealing with topology in crisp data are vector models. The next section will discuss the models and papers on which these new representations are built. Section 3 will discuss how to represent those topological relationships that cannot be inferred from the spatial extents of the uncertain data. Section 4 will discuss how one can extract information of one type from another, such as finding the uncertain border curve of an uncertain face. Section 5 extends the methods from Sections 3 and 4 to generate the uncertain border curves between regions. Section 6 will discuss other previous approaches to similar problems and compare them with this approach when appropriate. Section 7 summarizes the contributions of this paper.
2 Basis for the model

This section will discuss some of the means that have been used to represent topological relationships before. Section 2.1 looks at how topology is represented for crisp data, and Section 2.2 looks at how topology is represented for uncertain data. Section 2.2 also looks at which relationships can be inferred from the spatial representation alone when the data are uncertain.
Figure 1: Node-Arc-Area representation. a) Example; b) ER Diagram. The relational entry for line a in the example is: ID a, Start 1, End 2, Left Reg. A, Right Reg. B.
2.1 Topology in crisp data
In the crisp case, topology can ideally be inferred from the data itself. However, rounding errors and update inconsistencies mean that in practice it is often desirable to store topological information explicitly. Topology is therefore often stored explicitly in databases for crisp spatial data. One common representation is the so-called node-arc-area (NAA) representation. It is described in, for instance, (Worboys 1995), page 193. An example of this representation is shown in Fig. 1a. The table row below the main figure shows the relational entry for line a. An ER-diagram of this representation is given in Fig. 1b. In this representation, each face is bounded by a set of line segments. Each line segment has a pointer to the faces that are on each side of it, as well as to the start and end points.
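A minimal sketch of the NAA idea as relational data (an assumed illustration, not taken from the book): each arc stores its end nodes and the faces to its left and right, so two faces meet exactly when some arc separates them.

```python
from dataclasses import dataclass

@dataclass
class Arc:
    """One line segment of the NAA representation."""
    arc_id: str
    start: int    # start node id
    end: int      # end node id
    left: str     # face id on the left of the arc
    right: str    # face id on the right of the arc

# Mirrors the relational entry for line 'a' shown in Figure 1.
arcs = [Arc("a", start=1, end=2, left="A", right="B")]

def faces_meeting(arcs):
    """Pairs of faces that share at least one arc."""
    return {(arc.left, arc.right) for arc in arcs}

print(faces_meeting(arcs))   # {('A', 'B')}
```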
2.2 Abstract uncertain topology
Much work has been done on determining topological relationships between indeterminate regions. However, the conclusions of most of these papers are either that there are many more topological relationships in the uncertain case (Clementini and Di Felice 1996), or that one cannot determine the relationships for certain (Roy and Stell 2001). This is particularly a problem for relationships like equals, which requires precise overlap, or meet, which requires that the borders but not the interiors overlap. The topological relationships that are used in (Schneider 2001) for uncertain regions are given in Table 1.
The contains, inside, disjoint and overlap relationships can be determined by geometric computations for uncertain data as well as for crisp data. These computations may yield an uncertain answer, but at least a crisp answer is possible. However, this is not the case for covers, coveredBy, equals and meet. Determining for certain that two regions are equal is only possible if they are known to be the same object (have the same object identifier or an explicit link to each other). Even if the uncertain spatial representations are equal, there is no guarantee that the actual objects are equal. It is also impossible to determine the meet relationship for certain, for the same reason as for equals. Although this operation is but one of eight, it encompasses a number of cases:
• Point meet point: This is the same as point equals point and is solved by checking object identities as per the other equals operations.
• Point meet line: Is the point P on the line L? (Did the ancient city C lie on the river R?)
• Point meet face: Is the point P on the border of the face F?
• Line meet line: Do the two lines share an end point?
• Line meet face: Does the line end in the face? (Does the river R come from the lake L?)
• Face meet face: Do the two faces share a common boundary? (Were the ancient empires A and B neighbours?)
The node-arc-area representation stores regions by their boundary and lets the edges of bordering regions store links to each other. In this fashion one knows that these two regions meet each other. Covers and coveredBy cannot be determined for certain with uncertain data for the same reason as with meet, and the solution presented in this paper will work for them as well.
Table 1: The topological relationships used for uncertain regions (after Schneider 2001)

covers    | The second object is inside the first but shares a part of its boundary
coveredBy | The first object is inside the second but shares a part of its boundary
contains  | The second object is entirely inside the first
inside    | The first object is entirely inside the second
disjoint  | The intersection of the two objects is empty
equals    | The two objects are equal
overlap   | The two objects overlap
meet      | The boundaries of the objects overlap but the interiors do not
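The following sketch (an assumed illustration, not from the paper, using the Shapely library) shows the kind of geometric test that does work for a relationship such as disjoint on egg-yolk objects, returning a three-valued answer; no analogous test can confirm equals or meet, which is why those relationships must be stored explicitly.

```python
from shapely.geometry import Polygon

def disjoint_status(core_a, sup_a, core_b, sup_b):
    """Three-valued answer for 'A disjoint B' on egg-yolk (core, support) pairs."""
    if not sup_a.intersects(sup_b):
        return "certain"        # even the supports do not touch
    if core_a.intersects(core_b):
        return "impossible"     # the cores already overlap
    return "possible"           # depends on where the true boundaries lie

a_core = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
a_sup = a_core.buffer(1.0)
b_core = Polygon([(3.5, 0), (5.5, 0), (5.5, 2), (3.5, 2)])
b_sup = b_core.buffer(1.0)
print(disjoint_status(a_core, a_sup, b_core, b_sup))   # 'possible'
```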
Figure 2: ER-diagram of the simple method of storing connections
3 Storing connections between types

Let us say that the data provider knows that two uncertain objects meet each other but does not know precisely where. Because he does not know precisely where the objects are, they should be stored as uncertain or fuzzy spatial objects. However, existing representations of such objects do not provide any means of determining whether they meet or not. To be able to determine the meet relationships, one needs to store them somehow. Two different approaches will be presented in this section. The first is to simply store the relationship explicitly in the data objects. The second is to create an uncertain equivalent of the node-arc-area representation. This second method is also useful in storing an uncertain partition.

3.1 Simple method
The simplest solution is to store the relationships as lists in the objects themselves. In this method, each object stores a link to each other object to which it is related. An ER diagram showing these relationships is presented in Fig. 2. The advantage of this model is simplicity. There is for instance no need to convert between types. However, it is impossible to check where the relationship holds in this model. For instance, it is impossible to check which part of the boundary of region A is shared with region B. This model also cannot be used to store an uncertain partition because one cannot determine whether or not a given set of faces is a partition.

3.2 Uncertain node-arc-area
The alternative is to use a representation that is similar to the Node-Arc-Area (NAA) representation that is used in the crisp case. The NAA representation can be translated into a representation for uncertain data using much the same ER diagram as for the crisp case. The only difference is that one might want to store the On relationship explicitly.
Figure 3: ER-diagram of uncertain Node-Arc-Area
In the crisp case this can be computed from the geometry. In the uncertain case it cannot, because even if the support of an uncertain point is entirely within the support of the uncertain curve, the point still probably is not on the curve. The ER-diagram for the uncertain node-arc-area representation is given in Fig. 3. In an uncertain node-arc-area representation, the following relationships from Fig. 2 are checked by testing relationships with different types:
• p.OnBorder(F) ≡ p.On(F.Border())
• c1.Meet(c2) ≡ ∃p (c1.EndPoint(p) ∧ c2.EndPoint(p))
• c.Enters(F) ≡ ∃p (c.EndPoint(p) ∧ p.OnBorder(F))
• F1.Touches(F2) ≡ ∃c (F1.BorderedBy(c) ∧ F2.BorderedBy(c))
Notice that this representation requires that the border of the face and the end points of the curve be represented explicitly. To ensure consistency between the curve and the end point, it should be possible to generate the end points from the curve. In some cases it may also be necessary to generate some of these data in order to create an uncertain ordered edge list. For instance, if one has a set of regions, one must generate their border curves as well as the end points of those curves. In this representation, it is assumed that if two regions are bordered by the same curve, they meet. If they are bordered by different curves, they do not meet. If one knows that they may meet, one may store this as if they were certain to meet and store a probability of meeting in the border curve.
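A minimal sketch of the stored links and the derived predicates listed above (an assumed illustration, not the authors' implementation): meet, enters and touches are answered purely from the explicitly stored end points, On relationships and border curves, never from geometry.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UPoint:
    pid: str

@dataclass(frozen=True)
class UCurve:
    cid: str
    end_points: frozenset = frozenset()   # explicitly stored end points
    points_on: frozenset = frozenset()    # explicitly stored 'On' relationship

@dataclass(frozen=True)
class UFace:
    fid: str
    border: frozenset = frozenset()       # explicitly stored border curves

def on_border(p, f):                      # p.OnBorder(F) := p.On(F.Border())
    return any(p in c.points_on for c in f.border)

def meet(c1, c2):                         # c1.Meet(c2)
    return bool(c1.end_points & c2.end_points)

def enters(c, f):                         # c.Enters(F)
    return any(on_border(p, f) for p in c.end_points)

def touches(f1, f2):                      # F1.Touches(F2)
    return bool(f1.border & f2.border)

# A river whose end point is stored as lying on the shared border of two faces.
p = UPoint("p1")
border = UCurve("b1", points_on=frozenset({p}))
river = UCurve("r1", end_points=frozenset({p}))
face_a = UFace("A", border=frozenset({border}))
face_b = UFace("B", border=frozenset({border}))
print(enters(river, face_a), touches(face_a, face_b))   # True True
```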
4 Converting between types

To be able to generate the uncertain ordered edge list, one needs to be able to find the border of a region and the end points of a curve. In this section, one method for doing this for a general egg-yolk style representation (Cohn and Gotts 1996) is given. In an egg-yolk model, a face is represented with two boundaries. Each uncertain face is expected to have a core in which it is certain to exist, and a support in which it can possibly be. The object may also store a probability function or fuzzy set to indicate where the object is most likely to be.
Figure 4: Uncertain Face and Boundary Curve
One way of producing the boundary curve of a face is the following:
1. Let BR be the boundary region of the face.
2. If there is a probability function, create the expected border line along the 0.5 probability contour. Otherwise, create it in the middle of the boundary region.
3. Assign a probability of existence of 1 along the entire expected border line.
4. Let the uncertain border curve be the expected border line and the boundary region of the face.
5. The boundary region indicates where the border line may be, and the expected border line indicates where it is most likely to be.
This procedure assumes that there are no peaks in the probability function outside the core. If there are, a more complex algorithm is needed. This algorithm has been omitted due to space limitations. Figure 4 shows an example of an uncertain face and its boundary curve. To create the uncertain node-arc-area representation, one also needs to find the end points of an uncertain curve. An uncertain cycle like the ones that form the borders of uncertain faces does not have end points, but other uncertain curves may have. When storing topological relationships, one possible method of finding the points where two or more curves meet is to use the area that is overlapped by the supports of all the curves that are supposed to meet there. This method is used in the algorithm presented in Sect. 5. Computing the end points from the representation of the uncertain curve would require an unambiguous representation of the probability that the curve is at particular places. Such a representation is presented in (Tøssebro and Nygård 2002a), but it is beyond the scope of this paper. All uncertain objects have a face as support and an object of the appropriate type as the core. Probability functions are computed by triangulating the support and computing the probability functions along the triangle edges. The basic approach that was described for point-set models can also be used in this discrete model.
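Steps 1 and 2 can be rendered with polygon operations as sketched below (an assumed illustration using the Shapely library, not the authors' code). Since no probability function is assumed here, the expected border line is only approximated as a contour lying roughly midway between core and support.

```python
from shapely.geometry import Polygon

def uncertain_boundary_curve(core, support):
    """Return (expected_border_line, boundary_region) for an egg-yolk face."""
    # Step 1: the boundary region BR is the support minus the core.
    boundary_region = support.difference(core)
    # Step 2 (no probability function): place the expected border line roughly
    # in the middle of BR, here by dilating the core outward by half of an
    # estimated average width of BR (area divided by the core's boundary length).
    half_width = 0.5 * boundary_region.area / max(core.exterior.length, 1e-12)
    expected_border_line = core.buffer(half_width).exterior
    # Steps 3-5: the uncertain border curve is the pair (expected line, BR),
    # with probability of existence 1 along the expected line.
    return expected_border_line, boundary_region

# Example: a square core inside a larger square support.
core = Polygon([(1, 1), (3, 1), (3, 3), (1, 3)])
support = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
line, region = uncertain_boundary_curve(core, support)
print(round(line.length, 2), round(region.area, 2))
```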
Figure 5: Possible configurations when four uncertain curves meet
5 Creating a representation of the meeting relationships for uncertain regions

In an uncertain ordered edge list, each uncertain curve is supposed to form the boundary between two specific faces. However, the conversion method from the previous section only gives the boundary of a face as a single curve. Thus this curve has to be split into several curves. Because of rounding errors, digitizing errors and other inaccuracies, one cannot guarantee that two regions meet up perfectly even if the data provider knows that they do. Additionally, one does not know whether four lines meet in one point or two. One simple example of this last problem is shown in Fig. 5. For the four uncertain regions to the left, it is impossible to know which of the three cases to the right is the actual one. For these reasons creating an uncertain partition from its component set of regions is not easy. One way of storing the topological relationships is to try to make the regions fit together. This requires that the regions are changed slightly so that they do fit together. The method presented below is one way of creating such a best fit for the advanced discrete vector model from (Tøssebro and Nygård 2002a):
1. Add a buffer to the supports to ensure that the supports overlap entirely. For each point of the support of A that is inside the support of B, increase the distance from the core of A.
   a. Advantage of a large buffer: Makes certain that the supports overlap entirely.
   b. Disadvantage of a large buffer: Makes end points larger than necessary.
2. Remove those parts of the support of A that are inside the core of B.
3. Repeat points 1 and 2 for region B.
4. Take the intersection of the supports of A and B. This is the support of the border curve between the two faces.
5. Find the end points of the border curve. The support of an end point is all the line segments of the support of the uncertain curve that are not
Figure 6: Creating the advanced uncertain partition
shared with the core of either A or B. These line segments will form two lines. Turn each of these lines into cycles by adding a straight closing line segment.
6. The core of the border curve and the border points are determined as for normal uncertain curves and points.
A geometric sketch of steps 1 to 4 in terms of polygon operations is given at the end of this section. Figure 6 shows an example of two uncertain regions that border each other using this algorithm. The gray areas in the final representation are the meeting points of the border lines. If several faces border each other with no gaps between them (a partial partition), each end point should be the end point of three or more border curves. One may use the following procedure to join the results together:
1. Take the union of the supports of A from all the topological relationships that it is involved in.
2. Remove those parts of A that are inside the cores of any region that it should border.
3. Join all meeting points that overlap by creating a new meeting point. The support of this point is the intersection of the supports of all the regions that meet there. (An alternative would be to use the union of the supports of the existing points. However, this may result in points with a weird and unnatural shape.)
The number of possible configurations like in Fig. 5 grows rapidly with the number of meeting curves. For four curves, the number is three, for five it is 11, and for six it is around 45. Therefore it is important that the algorithm decides which of the configurations is most likely the correct one, and does not delay that decision to query time. The above algorithm assumes that all the lines end in a single point. This assumption makes the algorithm simpler. An alternative assuming that at most three lines meet in each end point has been constructed but is omitted due to space limitations. An additional aspect of the advanced uncertain partition is whether the shape of an uncertain region should be stored only as the border curves or
also stored separately. Because the border curves are needed anyway in an uncertain partition, letting the shape of the region be implied by its border curves costs less space and prevents inconsistencies between the representations. However, it also means that to get the shape of the region, one must retrieve all its border curves. If the shape were also stored in the region itself, it could be retrieved from the region object without retrieving the border curves, which probably are stored in another table.
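The geometric core of steps 1 to 4 of the best-fit construction can be sketched as follows (an assumed illustration using the Shapely library, not the authors' implementation; the buffering of step 1 is simplified to a uniform padding of the supports).

```python
from shapely.geometry import Polygon

def border_curve_support(core_a, sup_a, core_b, sup_b, pad=0.5):
    """Support of the shared border curve between two neighbouring regions."""
    # Step 1 (simplified): pad the supports so that they are sure to overlap.
    sup_a, sup_b = sup_a.buffer(pad), sup_b.buffer(pad)
    # Steps 2 and 3: remove from each support the parts inside the other core.
    sup_a = sup_a.difference(core_b)
    sup_b = sup_b.difference(core_a)
    # Step 4: the intersection of the trimmed supports is the support of the
    # uncertain border curve between the two faces.
    return sup_a.intersection(sup_b)

# Two neighbouring regions whose supports originally leave a small gap.
a_core = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)])
a_sup = a_core.buffer(0.3)
b_core = Polygon([(3, 0), (5, 0), (5, 2), (3, 2)])
b_sup = b_core.buffer(0.3)
border_support = border_curve_support(a_core, a_sup, b_core, b_sup, pad=0.5)
print(border_support.area > 0)   # True: the gap between the supports is bridged
```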
6 Related work

Many studies have looked at topology in uncertain data from a slightly different direction: How is the classical 9-intersection model affected by uncertainty? Early works in this direction like (Clementini and Di Felice 1996), (Clementini and Di Felice 1997) and (Cohn and Gotts 1996) have found that the number of distinct topological relationships increases from 8 to 44 when boundaries are broad rather than crisp. These 44 different relationships are then grouped into 14 different cases, one for each of the 8 crisp relationships from Table 1, and 6 that indicate which of the 8 the relationship is closest to. (Roy and Stell 2001) uses a method similar to that used in (Cohn and Gotts 1996) but defines the relationships differently. Rather than define a lot of possible relationships, they show how one can find approximate answers to some core relationships. Rather than having an explicit "Almost overlap" relationship, they determine for each of the topological relationships the probability that it is the true relationship. (Schneider 2001) and (Zhan 1998) present versions of this method which give a mathematical way of computing a fuzzy Boolean value for uncertain topological relationships. (Winter 2000) presents a statistical method for computing the probability that various topological relationships are true for two uncertain regions. (Cobb and Petry 1998) uses fuzzy sets to model topological relationships and direction relationships between spatial objects modelled as rectangles. Their approach is to find out how much of object B is west of object A or how much of object B overlaps object A. They store this as fuzzy set values in a graph that has one node for each direction and edges to nodes representing the objects. In Chap. 9 of (Molenaar 1998), a model for fuzzy regions and lines is defined. This model is defined on a crisp partition of the space. The shape of a region is defined by which of the faces in this underlying partition it contains. A fuzzy region is defined by assigning a fuzzy set value for each underlying face the region contains. A fuzzy line is defined by two fuzzy regions, one that lies to the right of the line, and one that lies to the left. This
model also includes methods for computing fuzzy overlap, fuzzy adjacency and fuzzy intersection of a line and a region. The fuzzy adjacency relationship states that if the supports of the regions are adjacent or overlapping, the regions are adjacent. It also defines strict adjacency as adjacent but not overlapping. This method works for vague regions but not for uncertain regions. For uncertain regions, overlapping supports may just as easily indicate a small overlap or that the regions are near each other. Therefore one needs to store adjacency explicitly in the uncertain case rather than derive it from the geometry as the model from (Molenaar 1998) does. (Bjørke 2003) uses a slightly different way of deriving fuzzy lines from regions and computing topological relationships. The fuzzy boundary of a fuzzy region is defined with the function 2 ⋅ min [ fv ( x , y) , 1 – fv ( x, y) ] where fv is the fuzzy membership value at those coordinates. To compute the relationship between two fuzzy objects, (Bjørke 2003) computes the fuzzy set values of the four intersections between the boundary and the interior of the two regions. Bjørke then computes the similarity between this result and all the 8 valid topological relationships from Table 1. The model from (Bjørke 2003) is capable of computing the border of a region as well as topological relationships on the regions, including equals and meets. It does not deal with meeting points. It is also an abstract model while this paper focuses on a discrete model.
7 Conclusions This paper has presented a way to represent topological information in a vector model for uncertain or fuzzy data. In particular, it shows how to represent the various relationships classified under meet by (Schneider 2001) in a vector model such that one can get definite results in queries if such results are known by the data providers. This paper also shows how one can store a partition made up of uncertain or vague regions in a vector model. The representation is derived from an earlier model for representing individual uncertain objects. The paper further presents algorithms for generating the uncertain boundary of an uncertain face and the end points of an uncertain curve. These are used to generate uncertain partitions from a set of regions. Regions that the data provider knows share a border do not necessarily meet perfectly. Various errors may cause slight overlaps or gaps between them. This paper presents a solution to this problem that is based on altering the regions so that they meet perfectly. One challenge is that our model requires fairly complex algorithms for extracting the border curves of faces and end points of lines. Additionally,
the algorithm for end points is only an approximate solution. We can think of no way to generate the uncertain end point precisely. A simpler representation based on another discrete model has been constructed but is not included due to space limitations.
References

Bjørke JT (2003) Topological relations between fuzzy regions: derivation of verbal terms. Accepted for publication in Fuzzy Sets and Systems.
Clementini E and Di Felice P (1996) An Algebraic Model for Spatial Objects with Indeterminate Boundaries. In Geographic Objects with Indeterminate Boundaries, GISDATA series vol. 2, Taylor & Francis, pp. 155-169.
Clementini E and Di Felice P (1997) Approximate Topological Relations. International Journal of Approximate Reasoning, 16, pp. 173-204.
Cohn AG and Gotts NM (1996) The ‘Egg-Yolk’ Representation of Regions with Indeterminate Boundaries. In Geographic Objects with Indeterminate Boundaries, GISDATA series vol. 2, Taylor & Francis, pp. 171-187.
Cobb MA and Petry FE (1998) Modeling Spatial Relationships within a Fuzzy Framework. Journal of the American Society for Information Science, 49(3), pp. 253-266.
Molenaar M (1998) An Introduction to the Theory of Spatial Object Modelling for GIS. Taylor & Francis.
Plewe B (2002) The Nature of Uncertainty in Historical Geographic Information. Transactions in GIS, 6(4), pp. 431-456.
Roy AJ and Stell JG (2001) Spatial Relations between Indeterminate Regions. International Journal of Approximate Reasoning, 27(3), pp. 205-234.
Schneider M (2001) Fuzzy Topological Predicates, Their Properties, and Their Integration into Query Languages. In Proceedings of the 9th ACM Symposium on Geographic Information Systems (ACM GIS), pp. 9-14.
Tøssebro E and Nygård M (2002a) An Advanced Discrete Model for Uncertain Spatial Data. In Proceedings of the 3rd International Conference on Web-Age Information Management (WAIM), pp. 37-51.
Tøssebro E and Nygård M (2002b) Abstract and Discrete Models for Uncertain Spatiotemporal Data. Poster presentation at the 14th International Conference on Scientific and Statistical Database Management (SSDBM). Abstract on page 240 of the proceedings.
Winter S (2000) Uncertain Topological Relations between Imprecise Regions. International Journal of Geographical Information Science, 14(5), pp. 411-430.
Worboys M (1995) GIS: A Computing Perspective. Taylor & Francis.
Zhan FB (1998) Approximate analysis of binary topological relations between geographic regions with indeterminate boundaries. Soft Computing, 2, pp. 28-34.
Modeling Topological Properties of a Raster Region for Spatial Optimization Takeshi Shirabe Institute for Geoinformation, Technical University of Vienna, 1040 Vienna, Austria, [email protected]
Abstract Two topological properties of a raster region – connectedness and perforation – are examined in the context of spatial optimization. While topological properties of existing regions in raster space are well understood, creating a region of desired topological properties in raster space is still considered a complex combinatorial problem. This paper attempts to formulate constraints that guarantee to select a connected raster region with a specified number of holes, in terms amenable to mixed integer programming models. The major contribution of this paper is to introduce a new intersection of two areas of spatial modeling – discrete topology and spatial optimization – that are generally separate.
1 Introduction A classic yet still important class of problems frequently posed to raster-based geographic information systems (GIS) is one of selecting a region in response to specified criteria. It dates back to the early 1960s when landscape architects and environmental planners started the so-called overlay mapping – a process of superimposing transparent single-factor maps on a light table – to identify suitable regions for particular land uses (see, e.g. McHarg 1969). The then-manual technique is today automated by most current GIS. Expressing spatial variation in numerical terms, rather than by color and texture, makes the technique more ‘rigorous and objective’ (Longley et al. 2001) as well as reproducible and economical. If a region-selection problem can be reduced to a location-wise screening, the overlay mapping technique will suffice. For example, a query like ‘select locations steeper than 10% slope within 200m from streams’ can be easily answered. If selection criteria apply not to individual locations but to loca-
tions as a whole, however, one will encounter combinatorial complexity. To see this, add to the above query a criterion ‘selected locations must be connected and their total area is to be maximized.’ Criteria like this are called ‘holistic’ as opposed to ‘atomistic’ (Tomlin 1990, Brookes 1997), and found in a variety of planning contexts such as land acquisition (Wright et al. 1983, Diamond and Wright 1991, Williams 2002, 2003), site selection (Diamond and Wright 1988, Cova and Church 2000), and habitat reserve design (McDonnell et al. 2002, Church et al. 2003). Storing cartographic data in digital form also enables one to cast such a holistic inquiry as a mathematical optimization problem. In general, to address an optimization problem, two phases are involved: model formulation and solution. A problem may be initially stated in verbal terms as illustrated above. It is then translated into a set of mathematical equations. These equations tend to involve discrete (integer) as well as continuous decision variables – decision variables are the unknowns whose values determine the solution of the problem under consideration – for dealing with indivisible raster elements. A system of such equations is called a mixed integer programming (MIP) model. Once a problem is formulated as such, it is solved by either heuristic or exact methods. Heuristic methods are designed to find approximate solutions in reasonable times and are useful for solving large-scale models (as is often the case with raster space). Exact methods, on the other hand, aim to find best (or optimal) solutions – with respect to criteria explicitly considered. If there is no significant difference between their computational performances, exact methods are preferred. Even when exact methods are not available, good heuristic methods should be able to tell how good obtained solutions are relative to possible optima. Thus, whether a problem is solved approximately or exactly, for solutions to be correctly evaluated, the problem needs to be formulated exactly. Many region-selection criteria have been successfully formulated in MIP format (see, e.g. Wright et al. 1983, Gilbert et al. 1985, Benabdallah and Wright 1991, Cova and Church 2000, Williams 2002). Typical among them are (geometric) area, (non-geometric) size, compactness, and connectedness. The area of a region can be equated with the number of raster elements in the region. Similarly, the size of a region in terms of a certain attribute (such as cost or land use suitability) is computed by aggregating the attribute value of each element in the region. Compactness is a more elusive concept and has no universally accepted measure. This allows various formulae for compactness, such as the ratio between the perimeter and the area of a region, the sum of the distances – often squared or weighted – between each element and a region center, and the number of adjacent pairs of elements in a region.
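As a side illustration of the compactness measures just listed, the following sketch (hypothetical helper code written for this discussion, not taken from any of the cited formulations) computes two of them – the perimeter-to-area ratio and the number of 4-adjacent pairs – for a compact block and an elongated strip of equal area.

```python
from itertools import combinations

def compactness_measures(region):
    """region: set of (row, col) cells. Returns (perimeter/area, number of 4-adjacent pairs)."""
    area = len(region)
    # Each cell side not shared with another region cell contributes one unit of perimeter.
    perimeter = sum(4 - sum((r + dr, c + dc) in region
                            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)))
                    for r, c in region)
    adjacent_pairs = sum(abs(r1 - r2) + abs(c1 - c2) == 1
                         for (r1, c1), (r2, c2) in combinations(region, 2))
    return perimeter / area, adjacent_pairs

square = {(r, c) for r in range(3) for c in range(3)}   # compact 3x3 block of 9 cells
strip = {(0, c) for c in range(9)}                      # 1x9 strip, same area
print(compactness_measures(square))   # (1.33..., 12): low ratio, many adjacencies
print(compactness_measures(strip))    # (2.22..., 8): less compact by both measures
```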
Connectedness (or contiguity) is, on the other hand, a difficult property to express in MIP form. In fact it had not been modeled as such until recently (Williams 2002). A frequently used alternative for addressing connectedness is to pursue a compact region, which tends (but is not guaranteed) to be connected. Though connectedness and compactness are in theory independent of each other, this approach is of practical value particularly when both properties are required. ‘Region-growing’ (Brookes 1997) is another technique which guarantees to create a connected region. The procedure starts with an arbitrarily chosen raster element and attaches to it one element after another until a region of a desired size and shape is built. Cova and Church (2000) devised another heuristic for growing a region along selected shortest paths. In contrast to these heuristics, Williams (2002) proposed a MIP formulation of a necessary and sufficient condition to enforce connectedness. The formulation exploits the facts that a set of discrete elements can be represented by a planar graph, whose dual is also a planar graph; and that the two graphs have spanning trees that do not intersect each other. More recently, Shirabe (2004) proposed another MIP formulation of an exact connectedness condition based on concepts of network flows (see Section 3). There is no difference between the two exact models in accuracy. They also involve the same number of binary (0-1) decision variables – a major factor affecting tractability. The latter formulation, however, requires fewer continuous variables and constraints and seems more tractable. Perforation is another important region-selection criterion. A region with holes, even though connected, often lacks integrity and utility. Neither of the two aforementioned connectedness models prevents a region from having holes. In fact, the more elements a region is required to contain, the more likely it will have holes. The closest approach to achieving non-perforation is Williams's MIP model for selecting a convex (i.e. not perforated or indented) region from raster space (Williams 2003). It does not consider all convex regions but guarantees to make a certain type of convex region. Convexity is, however, not a necessary but a sufficient condition for connectedness. Thus a potentially good connected region without holes might be overlooked. The last two difficult criteria are topological in nature. Unlike their metric counterparts, they are strictly evaluated in Boolean terms. Topology in discrete space – as opposed to that in continuous space (Egenhofer and Franzosa 1991, Randell et al. 1992) – has been formalized by many researchers in terms that can be implemented in raster GIS and image processing software (see Section 2). Their topological models have enhanced spatial reasoning and query, which deal with regions already recorded. They may
not, however, lend themselves to spatial optimization, which is concerned with regions yet to be realized. This paper therefore aims to bridge this gap. More specifically, it proposes a MIP formulation of a necessary and sufficient condition for selecting a connected region without holes from raster space. It then generalizes the formulation to be able to achieve a desired degree of perforation – from no hole to the largest possible number of holes. The rest of the paper is organized as follows. Section 2 reviews the topology of a single region in raster space. Section 3 presents an existing MIP-based connectedness constraint set and two new MIP-based perforation constraint sets. Section 4 reports computational experiments. Section 5 concludes the paper.
2 Topology of a Single Raster Region Raster space is a two-dimensional space that is discretized into equally-sized square elements called cells (or pixels in image processing). Two cells are said to be 4-adjacent (resp. 8-adjacent) if they share a side (resp. a side or a corner) (figure 1). A finite subset of raster space is here referred to as a raster map. A subset taken from a raster map constitutes a region.
Fig. 1. Adjacency. Shaded cells are (a) 4- or (b) 8-adjacent to the center cell.
Connectedness and perforation in raster space are roughly described as follows. A region is said to be connected if one can travel from any cell to any other cell in the region by following a sequence of adjacent cells without leaving the region. Likewise, a region is said to be perforated if it has a hole, which is a maximal connected set of cells not included in but completely surrounded by the region. Connected regions are classified into two kinds: simply connected and multiply connected (Weisstein 2004). A simply connected region has no hole, while a multiply connected region has one or more holes. In general, a connected region with n holes is called an (n+1)-ply connected region. These topological properties are easily recognized by visual inspection, and should also be implied by adjacency relations between cells. Unlike its Euclidean counterpart, however, connectedness in raster space is not so
obvious, which in turn makes perforation ambiguous, too. This is well exemplified by the so-called connectedness paradox (Kovalevsky 1989, Winter 1995, Winter and Frank 2000, Roy and Stell 2002). To see it, consider a raster curve illustrated in figure 2. In terms of the 4-adjacency, the curve is not closed and the inside of the curve is not connected to the outside of the curve. This violates the Jordan curve theorem (Alexandroff 1961) as a non-closed curve has separated space into two parts. If the 8-adjacency is employed, the curve is closed and the inside of the curve is connected to the outside of the curve. This, too, violates the theorem, since a closed curve has not separated space into two parts.
Fig. 2. A raster curve
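To make the paradox tangible, the sketch below (an illustration written for this discussion; the diamond-shaped ring is our own stand-in for the curve of Fig. 2) counts connected components of the ring and of its background under the 4- and 8-adjacency, and under the mixed rule discussed next.

```python
from collections import deque

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def components(cells, offsets):
    """Number of connected components of a cell set under the given adjacency offsets."""
    count, unvisited = 0, set(cells)
    while unvisited:
        count += 1
        queue = deque([unvisited.pop()])
        while queue:
            r, c = queue.popleft()
            for dr, dc in offsets:
                nb = (r + dr, c + dc)
                if nb in unvisited:
                    unvisited.remove(nb)
                    queue.append(nb)
    return count

grid = {(r, c) for r in range(7) for c in range(7)}
curve = {(r, c) for (r, c) in grid if abs(r - 3) + abs(c - 3) == 2}   # diagonal ring of cells
background = grid - curve

print(components(curve, N4), components(background, N4))  # 8, 2: an open "curve" separates space
print(components(curve, N8), components(background, N8))  # 1, 1: a closed curve fails to separate
print(components(curve, N8), components(background, N4))  # 1, 2: mixed rule, consistent with Jordan
```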
To overcome the connectedness paradox, Kong and Rosenfeld (1989) applied different adjacency rules to a region and its complement (i.e. its background). For instance, if the curve mentioned above has the 8-adjacency and its complement has the 4-adjacency, the curve is closed and separates space into two parts. The paradox is then resolved. Kovalevsky (1989), Winter (1995), Winter and Frank (2000), and Roy and Stell (2002) took another approach to the paradox. They decompose space into geometric elements of n different dimensions called ‘n-cells’, which are collectively referred to as a ‘cellular complex’ (Kovalevsky 1989). A cellular complex of raster space or a ‘hyper raster’ (Winter 1995, Winter and Frank 2000) consists of 0-cells (cell corners), 1-cells (cell sides), and 2-cells (cells). Euclidean topology applies to this model (Winter 1995), so the paradox does not arise. This paper relies on a mixed-adjacency model – more specifically, the (4, 8) adjacency model (Kong and Rosenfeld 1989) which applies the 4-adjacency to a region and the 8-adjacency to its complement – for two reasons. First, as far as a single region is concerned, no paradox occurs. Second, a raster map can be represented by a graph, which enables one to evaluate connectedness and perforation of a region without reference to its complement (see Section 3). To represent a raster map in terms of a graph, each cell is equated with a vertex and each adjacency relation between a pair of cells is equated with an edge. If the 4-adjacency is employed, the graph is planar, that is, it can
be drawn in the plane such that no two edges cross (Ahuja et al. 1993). A face is formed by a cycle of four edges. In the case of the 8-adjacency, however, a raster map cannot be modeled by a planar graph. A connected planar graph has an important property expressed by the following Euler's formula:

v − e + f = 2    (1)

where v, e, and f are the numbers of vertices, edges, and faces, respectively, in the graph. Consider a 10-by-10 raster map as an example. Its associated graph satisfies the formula as it has 100 vertices, 180 edges, and 82 faces (including an unbounded face). The formula also applies to a connected subgraph of it. Figure 3 illustrates two connected regions and their corresponding graphs. The graph on the left has 25 vertices, 34 edges, and 11 faces, while the graph on the right has 25 vertices, 29 edges, and six faces. Here it is important to note that a perforated region has faces formed by more than four edges.
Fig. 3. Graph representations of two regions. Each region (shaded) is represented by a graph of vertices (white dots) and edges (white lines).
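The counts quoted above are easy to reproduce. The following sketch (ours; the ring region at the end is an invented example, not one of the regions in Fig. 3) builds the 4-adjacency graph of a cell set, counts vertices, edges and unit-square faces, and checks Euler's formula; for a connected region, any shortfall of unit-square faces signals holes, which is the observation exploited in Section 3.

```python
def graph_counts(cells):
    """4-adjacency graph of a cell set: vertices, edges, and unit-square (4-cycle) faces."""
    v = len(cells)
    e = sum(((r + 1, c) in cells) + ((r, c + 1) in cells) for r, c in cells)
    f4 = sum(all(p in cells for p in ((r, c), (r + 1, c), (r, c + 1), (r + 1, c + 1)))
             for r, c in cells)
    return v, e, f4

raster_map = {(r, c) for r in range(10) for c in range(10)}
v, e, f4 = graph_counts(raster_map)
print(v, e, f4 + 1)        # 100 180 82  (faces include the unbounded one)
print(v - e + (f4 + 1))    # 2           (Euler's formula holds)

# For a connected region, any shortfall of unit-square faces reveals holes:
ring = {(r, c) for r in range(3) for c in range(3)} - {(1, 1)}   # 3x3 block minus its centre
v, e, f4 = graph_counts(ring)
print(2 - (v - e + (f4 + 1)))   # 1 hole: Euler's formula needs one extra, larger face
```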
3 Formulation In this section, based on the (4, 8) adjacency, we present three sets of connectedness and perforation constraints. They are formulated self-contained so as to be used as components of region-selection models that may involve additional criteria for optimization. The first constraint set guarantees to select a connected region regardless of the number of holes. The second constraint set, coupled with the first one, guarantees to select a simply connected region. The third constraint set generalizes the second one to control the number of holes in a connected region.
3.1 Connectedness Constraints Connectedness is defined in graph-theoretic terms as follows: a region is said to be connected if there is at least one path between every pair of vertices in a graph, G, associated with the region. Since the adjacency relation between any pair of cells is symmetric in raster space, a necessary and sufficient condition for G to be connected is that there is at least one vertex, p, in G, to which there is a path from any other vertex in G. This can be interpreted in terms of network flows by regarding p as a sink and the rest of the vertices in G as sources and replacing each edge with two opposite directed arcs. In this setting, for G to be connected, every source must have a positive amount of supply that ultimately reaches the sink (Shirabe 2004). Thus, selecting a connected region from a raster map is equivalent to selecting, from a network associated with the map, a sink vertex and source vertices such that they make an autonomous subnetwork that satisfies the above condition. This is expressed by the following set of constraints.

\sum_{j | (i,j) \in A} y_{ij} - \sum_{j | (j,i) \in A} y_{ji} \geq x_i - m w_i \quad \forall i \in I    (2)

\sum_{i \in I} w_i = 1    (3)

\sum_{j | (j,i) \in A} y_{ji} \leq (m - 1) x_i \quad \forall i \in I    (4)

where
  I     set of cells in a given raster map. Each cell of I is denoted by i or j.
  A     set of ordered pairs of adjacent cells in the map. By definition, (i, j) ∈ A if and only if (j, i) ∈ A.
  m     given nonnegative integer indicating the number of cells to be selected for the region.
  x_i   binary decision variable indicating if cell i is selected for the region; x_i = 1 if selected, 0 otherwise.
  w_i   binary decision variable indicating if cell i is a sink; w_i = 1 if a sink, 0 otherwise.
  y_ij  nonnegative continuous decision variable indicating the amount of flow from cell i to cell j.

Constraints (2) represent the net flow of each cell. The two terms on the left-hand side represent, respectively, the total outflow and the total inflow of cell i. Constraint (3) requires that one and only one cell be a sink. Constraints (4) ensure that there is no flow into any cell in the region's complement (where x_i = 0) and that the total inflow of any cell in the region
(where x_i = 1) does not exceed m−1. This implies that there may be flow (though unnecessary) from a cell in the region's complement to a cell in the region. Even in this case, the supply from each cell in the region remains constrained to reach the sink and the contiguity condition holds. This formulation has a fairly efficient structure, as the numbers of binary variables, continuous variables, and constraints are |I|, |A|, and 2|I|+1, respectively.

3.2 Simply Connectedness Constraints The above constraint set only guarantees to make a connected region, so it is possible that the region has one or more holes. To prevent perforation, a new set of constraints is added. It takes advantage of a special topological structure of a simply connected region in the (4, 8) adjacency model. That is, in a graph representation of a simply connected region, all faces except one unbounded face are cycles of four edges (see the region on the left in Figure 3). Thus the enforcement of Euler's formula, while counting only such cycles as faces, will result in a simply connected region. Accordingly, the following constraints, together with constraints (2)-(4), guarantee to select a simply connected region from a raster map.

m - \sum_{(i,j) \in E} e_{ij} + \left(1 + \sum_{(i,j,k,l) \in F} f_{ijkl}\right) = 2    (5)

e_{ij} \leq x_i \quad \forall (i,j) \in E    (6)
e_{ij} \leq x_j \quad \forall (i,j) \in E    (7)
e_{ij} \geq x_i + x_j - 1 \quad \forall (i,j) \in E    (8)
f_{ijkl} \leq e_{ij} \quad \forall (i,j,k,l) \in F    (9)
f_{ijkl} \leq e_{kl} \quad \forall (i,j,k,l) \in F    (10)
f_{ijkl} \geq e_{ij} + e_{kl} - 1 \quad \forall (i,j,k,l) \in F    (11)

where
  E       set of unordered pairs of adjacent cells in the map.
  F       set of unordered quads of adjacent cells in the map.
  e_ij    binary decision variable indicating if cells i and j (unordered) are both selected for the region; e_ij = 1 if selected, 0 otherwise.
  f_ijkl  binary decision variable indicating if cells i, j, k, and l (unordered) are all selected for the region; f_ijkl = 1 if selected, 0 otherwise.
Constraint (5) is Euler's formula for a graph representation of a simply connected region. The three terms on the left-hand side of constraint (5) respectively represent the numbers of vertices, edges, and faces (including one unbounded face). Constraints (6)-(8) make e_ij = 1 if cells i and j are included in the region, and 0 otherwise. Constraints (9)-(11) make f_ijkl = 1 if cells i, j, k, l are included in the region, and 0 otherwise. Since all x_i's are constrained to be 0 or 1, all e_ij's and f_ijkl's are guaranteed to be 0 or 1 without explicit integrality constraints on them. Thus the number of binary variables remains the same as in the previous model.

3.3 Multiply Connectedness Constraints A connected region that does not satisfy constraints (5)-(11) has one or more holes. This, however, does not mean that such a region violates Euler's formula, but that it has faces the previous formulation cannot detect. Those overlooked faces are formed by more than four edges. Fortunately such faces are easy to enumerate, since each of them encircles one and only one hole (see the region on the right in Figure 3). Therefore, the following constraint generalizes Euler's formula for a connected region with n holes (an (n+1)-ply connected region), with the (4, 8) adjacency assumed.

m - \sum_{(i,j) \in E} e_{ij} + \left(1 + n + \sum_{(i,j,k,l) \in F} f_{ijkl}\right) = 2    (12)

This constraint, together with constraints (2)-(4) and (6)-(11), guarantees to select a connected region with n holes from a raster map. It should be noted that the number of holes can be confined to a specific range (rather than a single value) by making constraint (12) an inequality.
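As a rough indication of how constraints (2)-(4) might be assembled with an off-the-shelf modelling library, the sketch below uses the open-source PuLP package (our choice; the paper's experiments use CPLEX), a cell-cost objective of the kind used in the experiments of Section 4, and a cardinality constraint fixing the region to exactly m cells, which the problem statements in Section 4 imply. The function name and structure are ours, not the author's formulation files.

```python
import pulp

def connected_region_model(costs, m):
    """Constraints (2)-(4) plus a cardinality constraint, for a dict of cell costs.

    costs: dict mapping cell (row, col) -> cost; m: number of cells to select.
    Illustrative reconstruction only, not the author's CPLEX model.
    """
    cells = list(costs)
    nbrs = {i: [j for j in cells
                if abs(i[0] - j[0]) + abs(i[1] - j[1]) == 1] for i in cells}
    arcs = [(i, j) for i in cells for j in nbrs[i]]

    prob = pulp.LpProblem("connected_region", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", cells, cat="Binary")   # cell selected
    w = pulp.LpVariable.dicts("w", cells, cat="Binary")   # cell is the sink
    y = pulp.LpVariable.dicts("y", arcs, lowBound=0)      # flow along each arc

    prob += pulp.lpSum(costs[i] * x[i] for i in cells)    # minimise total cell cost

    for i in cells:                                       # (2) net outflow of each cell
        prob += (pulp.lpSum(y[(i, j)] for j in nbrs[i])
                 - pulp.lpSum(y[(j, i)] for j in nbrs[i]) >= x[i] - m * w[i])
    prob += pulp.lpSum(w[i] for i in cells) == 1          # (3) exactly one sink
    for i in cells:                                       # (4) inflow only into selected cells
        prob += pulp.lpSum(y[(j, i)] for j in nbrs[i]) <= (m - 1) * x[i]
    prob += pulp.lpSum(x[i] for i in cells) == m          # select exactly m cells
    return prob, x
```

Calling prob.solve() with any installed back end (for example pulp.PULP_CBC_CMD()) then returns a connected m-cell region of minimum total cost; the perforation constraints (5)-(12) could be layered on in the same style by introducing the e and f variables.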
4 Experiments To show how the proposed constraints address connectedness and perforation, three sample problems are considered. All use the same dataset from Williams (2002): a 10-by-10 raster map, in which each cell is assigned a random number taken from a uniform distribution of values ranging from 0.2 to 1.8 with a step size of 0.1. Note that for simplicity and transparency the problems are rather contrived, admittedly, and that real-life problems often involve multiple regions (see, e.g. Benabdallah and Wright 1992, Aerts et al. 2003) and multiple objectives (see, e.g. Wright et al. 1983, Gilbert et al. 1985, Diamond and Wright 1988).
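The exact random values used by Williams (2002) are not reproduced here, but a grid with the stated distribution is easy to mimic; the following snippet (illustration only; the seed is arbitrary) draws one value per cell from 0.2 to 1.8 in steps of 0.1.

```python
import random

random.seed(11)                                          # arbitrary; not the original draw
values = [round(0.2 + 0.1 * k, 1) for k in range(17)]    # 0.2, 0.3, ..., 1.8
cost = {(r, c): random.choice(values) for r in range(10) for c in range(10)}
```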
Problem 1: Select a connected region of m cells from the raster map to minimize the sum of each cell's value. The problem is formulated as the following MIP model.

Min \sum_{i \in I} c_i x_i    (13)
Subject to (2)-(4) where ci represents the random number attributed to cell i. The model was solved with 100 different m’s from one to 100 on a Pentium 4 with 512 MB RAM, using the CPLEX 8.0 MIP solver. The model is generally tractable at this scale, as solution times (in wall clock) range from nearly zero to 23.54 seconds with a median of 1.08 second. All instances that took more than 10 seconds to solve are clustered where m is between 20 and 52. The model’s tractability generally improved as m departed from this range. This indicates that while a region is relatively small, it gets harder to achieve connectedness as the region grows; but that once a region becomes sufficiently large, it gets easier to achieve connectedness as the region further grows. Problem 2: Select a simply connected region of m cells from the map to minimize the sum of each cell’s value. The problem differs from Problem 1 only in that no holes are allowed. It is formulated as a MIP model consisting of objective function (13) and constraints (2)-(11). The model was again solved with 100 different m’s using the same computing system. It turned out that the model is less tractable than the previous one, as the maximum solution time was 121.24 seconds and the median was 4.355 seconds. It is also found that the model has two separate clusters of difficult instances (figure 4). The first cluster appears where m is between about 10 and 50, with the most difficult case at m = 24. It corresponds to the cluster found in Problem 1, which has already been explained. The second cluster is seen where m is relatively large, and the most difficult case is where m = 93. The implication of this is that a large region (in terms of the number of cells) is more susceptible to perforation. In fact, Problems 1 and 2 have identical optimal solutions, where m is smaller than 78. Problem 3: Select a connected region of m cells with n holes from the map to minimize the sum of each cell’s value. The problem is formulated as a MIP model consisting of objective function (13) and constraints (2)-(4) and (6)-(12). Two experiments were conducted. In the first experiment, m was fixed to 24 (corresponding to the most difficult case in the first cluster for Problem 2) while n was varied from zero to four (the largest possible number of holes). In the second experiment, m was fixed to 93 (corresponding to the most difficult case in the
second cluster for Problem 2) while n was varied from zero to seven (the largest possible number of holes). Selected solutions are illustrated in figure 5, and numerical results are summarized in table 1.
Fig. 5. Optimal solutions to Problem 3 with m = 24
Table 1. Numerical results for Problem 3

n   MIP(24)   LP(24)   Time(24)    BB(24)      MIP(93)   LP(93)   Time(93)   BB(93)
0   10.9      9.80     45.62       3,770       82.40     81.95    123.49     30,459
1   11.7      10.15    164.63      11,131      81.70     81.70    0.21       0
2   13.5      10.52    8,780.39    668,779     81.70     81.70    0.21       0
3   14.7      10.88    22,442.42   1,704,274   81.80     81.80    0.21       0
4   16.3      11.25    87,882.50   6,737,713   81.90     81.90    0.19       0
5   -         -        -           -           82.10     82.07    0.33       0
6   -         -        -           -           82.30     82.25    0.29       0
7   -         -        -           -           82.70     82.50    0.98       0
MIP(24), LP(24), Time(24), and BB(24) respectively represent the optimal objective value of the MIP model, the optimal objective value of its LP relaxation (with no integer constraints on the x_i's), the solution time (wall clock), and the number of branch-and-bound nodes required, when m = 24. MIP(93), LP(93), Time(93), and BB(93) represent the same quantities when m = 93. Roughly speaking, a branch-and-bound
algorithm, which is employed by CPLEX and many other MIP solvers, solves a model by moving from one LP-feasible solution to another, in a treelike fashion, until a solution is found to be MIP optimal. So, in general, the larger the gap between the LP and the MIP optima, and the larger the tree (consisting of branch-and-bound nodes), the less tractable the model is.
As seen in table 1, where m = 93, any degree of perforation is a trivial requirement. It may be speculated that when a problem requires a perforated region to encompass a large part of a raster map, there is only a relatively small number of feasible solutions. Where m = 24, on the other hand, the model's tractability significantly deteriorates. The model becomes less tractable as n increases, and takes about one full day of computation to solve when n = 4. Other computational experiments have found even more difficult cases, whose solution requires several days of computation. These suggest that the perforation requirement adds excessive computational complexity.

5 Conclusion We have formulated three sets of MIP constraints that guarantee to select from a raster map a connected region regardless of the number of holes, a connected region without holes, and a connected region with a specified number of holes, respectively. Though they involve relatively few binary decision variables, computational experiments suggest that they are, in a practical sense, not tractable enough to be solved by a general-purpose MIP solver such as CPLEX. The most difficult problems seem to be those that select a medium-sized connected region with many holes from a large raster map (though perforated regions are largely of more theoretical interest than practical application). Thus, to solve such problems, one may need to resort to heuristic methods (e.g. Brookes 1997, Aerts and Heuvelink 2002). We have assumed the (4, 8) adjacency in this paper. It is then natural to ask whether connectedness and perforation constraints can be similarly formulated in the (8, 4) adjacency and the hyper raster model. At present, we do not have answers. The difficulty lies in the fact that those models do not represent a region by a planar graph. Still, they are structured so regularly that other approaches might exist. This should be explored in future research. Lastly, we have seen significant overlap between discrete topology and spatial optimization. Connectedness and perforation are a fundamental yet small part of it. It would be interesting to see how other topological properties – including topological relations between two regions – are incorporated into spatial optimization models. This, too, should be explored in future research.
References

Aerts JCJH, Heuvelink GBM (2002) Using simulated annealing for resource allocation. International Journal of Geographical Information Science 16: 571-587
Aerts JCJH, Eisinger E, Heuvelink GBM, Stewart TJ (2003) Using linear integer programming for multi-region land-use allocation. Geographical Analysis 35: 148-169
Ahuja RK, Magnanti TL, Orlin JB (1993) Network flows: theory, algorithms, and applications. Prentice Hall, Englewood Cliffs, New Jersey
Alexandroff P (1961) Elementary concepts of topology. Dover, New York
Brookes CJ (1997) A parameterized region-growing programme for region allocation on raster suitability maps. International Journal of Geographical Information Science 11: 375-396
Benabdallah S, Wright JR (1991) Shape considerations in spatial optimization. Civil Engineering Systems 8: 145-152
Benabdallah S, Wright JR (1992) Multiple subregion allocation models. ASCE Journal of Urban Planning and Development 118: 24-40
Church RL, Gerrard RA, Gilpin M, Sine P (2003) Constructing Cell-Based Habitat Patches Useful in Conservation Planning. Annals of the Association of American Geographers 93: 814-827
Cova TJ, Church RL (2000) Contiguity constraints for single-region region search problems. Geographical Analysis 32: 306-329
Diamond JT, Wright JR (1988) Design of an integrated spatial information system for multiobjective land-use planning. Environment and Planning B 15: 205-214
Diamond JT, Wright JR (1991) An implicit enumeration technique for the land acquisition problem. Civil Engineering Systems 8: 101-114
Egenhofer M, Franzosa R (1991) Point-set topological spatial relations. International Journal of Geographical Information Systems 5: 161-174
Gilbert KC, Holmes DD, Rosenthal RE (1985) A multiobjective discrete optimization model for land allocation. Management Science 31: 1509-1522
Kong TY, Rosenfeld A (1989) Digital topology: introduction and survey. Computer Vision, Graphics, and Image Processing 48: 357-393
Kovalevsky VA (1989) Finite topology as applied to image analysis. Computer Vision, Graphics, and Image Processing 46: 141-161
Longley PA, Goodchild MF, Rhind DW (2001) Geographic information systems and science. John Wiley and Sons, New York
McDonnell MD, Possingham HP, Ball IR, Cousins EA (2002) Mathematical methods for spatially cohesive reserve design. Environmental Modeling and Assessment 7: 107-114
McHarg I (1969) Design with nature. Natural History Press, New York
Randell DA, Cui Z, Cohn AG (1992) A spatial logic based on regions and connection. Proceedings of the third international conference on knowledge representation and reasoning, Morgan Kaufmann, San Mateo: 165-176
Roy AJ, Stell JG (2002) A quantitative account of discrete space. Proceedings of GIScience 2002, Lecture Notes in Computer Science 2478. Springer, Berlin: 276-290
Shirabe T (2004) A model of contiguity for spatial unit allocation, in revision
Tomlin CD (1990) Geographical information systems and cartographic modelling. Prentice Hall, Englewood Cliffs, New Jersey
Williams JC (2002) A zero-one programming model for contiguous land acquisition. Geographical Analysis 34: 330-349
Williams JC (2003) Convex land acquisition with zero-one programming. Environment and Planning B 30: 255-270
Weisstein EW (2004) World of mathematics. http://mathworld.wolfram.com
Winter S (1995) Topological relations between discrete regions. In: Egenhofer MJ, Herring JR (eds) Advances in Spatial Databases, Lecture Notes in Computer Science 951. Springer, Berlin: 310-327
Winter S, Frank AU (2000) Topology in raster and vector representation. Geoinformatica 4: 35-65
Wright JR, ReVelle C, Cohon J (1983) A multiobjective integer programming model for the land acquisition problem. Regional Science and Urban Economics 13: 31-53
Sandbox Geography – To learn from children the form of spatial concepts Florian A. Twaroch & Andrew U. Frank Institute for Geoinformation and Cartography, TU Vienna, [email protected], [email protected]
Abstract The theory theory claims that children's acquisition of knowledge is based on forming and revising theories, similar to what scientists do (Gopnik and Meltzoff 2002). Recent findings in developmental psychology provide evidence for this hypothesis. Children have concepts about space that differ from those of adults. During development these concepts undergo revisions. This paper proposes the formalization of children's theories of space in order to reach a better understanding of how to structure spatial knowledge. Formal models can help to make the structure of spatial knowledge more comprehensible and may give insights into how to build GIS. Selected examples of object appearances are modeled using an algebra. An Algebra Based Agent is presented and coded in a functional programming language as a simple computational model.
1 Introduction Watch children playing in a sandbox! Although they have not yet gathered much experience of their surroundings, they follow rules about objects in space. They observe solid objects and liquids, they manipulate them, and they can explain the spatial behavior of objects, although their judgments are sometimes in error (Piaget and Inhelder 1999). Infants individuate objects; they seem to form categories and can make predictions about object occurrences. Like geographers, they explore their environment, make experiments, and derive descriptions. Geographers omit large-scale influences like geomorphologic movements, and so do children. Lately social
interactions are considered in spatial models. These can also be found in the sandbox. The playing toddlers have contact with other kids, and from time to time they check whether their caring parents are still in the vicinity. Children also form theories about people (Gopnik, Meltzoff et al. 2001). This paper proposes to exploit recent findings of psychologists in order to build formal models for GIS. Different approaches are taken to explain how adults manage spatial knowledge. Newcombe and Huttenlocher (2003) review three approaches that have influenced spatial development research during the last fifty years: Piagetianism, Nativism and Vygotskyanism. Followers of Piaget assume that children start out with no knowledge about space. In a four-stage process child knowledge develops into adult knowledge. A follower of the nativist view is Elizabeth S. Spelke, who has identified in very young children components of cognitive systems that adults still make use of. This is called core knowledge (Spelke 2000). New knowledge can be built by the composition of these core modules. The modules themselves are encapsulated, and once they are triggered they do not change (Fodor 1987). Vygotskyanists believe that children are guided and tutored by elders, that cognitive efforts are adapted to particular situations, and that humans have a well-developed ability in dealing with symbolic material (Newcombe and Huttenlocher 2003). The present work concentrates on a view called the theory theory, explored by A. Gopnik and A. N. Meltzoff. From the moment of birth the acquired knowledge undergoes permanent change whenever beliefs do not fit together with observed reality (Gopnik, Meltzoff et al. 2001; Gopnik and Meltzoff 2002). The present paper starts with a formalization of this model using Algebra as a mathematical tool for abstracting and prototyping. Finding very simple and basic concepts about the physical world is not new. Patrick Hayes proposes a Naïve Physics in a manifesto (Hayes 1978; Hayes 1985). A Naïve Theory of Motion has been investigated (McCloskey 1983). The geographic aspects have been considered in a Naïve Geography that forms a “body of knowledge that people have about the surrounding geographic world” (Egenhofer and Mark 1995). This knowledge develops through space and time. It starts at very coarse core concepts and develops into a fully fledged theory. An initiative for common-sense geography has been set up to identify core elements (Mark and Egenhofer 1996). The demand for folk theories has been stated by several authors in Artificial Intelligence to achieve a more usable, intuitive interface. Recent results from the psychology research community can influence GIS by forming new and sound models. In extension to the naïve
geography, new insights about how to structure space can be won by investigating children's minds. Recent research in developmental psychology is discussed in section two of the paper. Section three connects these findings to current GIS research. The use of Algebra in GIS is introduced in section four. An Algebra Based Agent is proposed in section five, using simple examples of object appearances for static and moving objects. Section six introduces the prototypic modeling done so far. In the concluding seventh section the results and future research topics are discussed.
2 Children and Space In a multi-tiered approach for GIS (Frank 2001) the human plays a central role – a cognitive agent is modeled as its own tier. For the last fifty years children were ignored in geo-sciences. Children for a long time were not investigated as an object of psychological research. Aristotle and the English philosopher John Locke considered them tabulae rasae, not knowing anything in advance. Nowadays a whole research enterprise has developed which investigates children’s mental models. It started with Piaget in the early fifties (Piaget, Inhelder et al. 1975; Piaget and Inhelder 1999). Although he was wrong in some of his assumptions his ideas have been studied in detail. Piaget had the opinion that children start out into the world without any innate knowledge. All the knowledge a person has at a certain point of time had to be acquired before. Today researchers suppose that there is some innate knowledge available that is either triggered in some way and reused or developed in form of adaption. According to the theory theory the learning process is driven by three components (Gopnik and Meltzoff 2002): Innate knowledge – core knowledge: Evidence shows that babies are born with certain abilities. Object representations consist of 3 dimensional solid objects that preserve identity and persist over occlusion and time (Spelke and Van de Walle 1993). Gopnik, Meltzoff and Kuhl show that there is also an innate understanding of distance. The same authors also detected evidence that there are links between information picked up from different sensor modalities (Gopnik, Meltzoff et al. 2001). Powerful learning abilities: Equipped with those innate structures babies start a learning process. Language acquisition especially shows how powerful this mechanism must be (Pinker 1995). In the first six years a child learns around 14 000 words. Another thing that has to be learned is
the notion of object permanence. To understand object permanence means to understand that a hidden object continues to exist. Different approaches seem to be used by children to explain this phenomenon during their learning process. The formation of object categories and the understanding of causal connections are two other aspects that have to be learned by children throughout many years (Gopnik, Meltzoff et al. 2001). Unconscious tutoring by others: Adults teach children by doing things in certain ways. By repeating words, accentuating properly and speaking slowly they help children unconsciously to acquire the language. Children learn many things by imitation. The absence of others can heavily influence social behavior, as demonstrated by the case of Kaspar Hauser in Nürnberg, Germany, in 1828. These three components – innate knowledge, powerful learning and tutoring by others – are also the basis for the theory theory by A. Gopnik and A. N. Meltzoff (Gopnik, Meltzoff et al. 2001; Gopnik and Meltzoff 2002). Children acquire knowledge by forming and revising theories, similar to what scientists do. The spatial concepts infants live in are obviously different from an adult's concepts. Ontological commitments are made in order to explain events in the world. The theories babies build about the world are revised and transformed. Children form theories about objects, people, and kinds; they learn language, and all this is connected to space. The core of the theory theory is the formulation and testing of hypotheses. It is a theory about how humans acquire knowledge about the world (by forming theories). When children watch a situation, they are driven by an eagerness to learn. They set up a hypothesis about a spatial situation and they try to prove it by trial and error. If the outcome is as expected they become uninterested (bored) and give up testing after some tries. If something new happens they test again and again, even using methodology. When they are puzzled they try new hypotheses and test alternatives. An 18-month-old child is not supposed to concentrate for a long time, but an experiment shows that they test hypotheses for up to half an hour (Gopnik, Meltzoff et al. 2001). User requirement analysis is a common way to build ontologies for GIS, using interviews, questionnaires, and desktop research. Infants cannot communicate their experiences with space through language, so psychologists make use of passive and active measure studies. Two methods will be described briefly here. Studies of predictive action like reaching with the hand for an object: Infants are presented with a moving object while their reaching and tracking actions are observed and measured. When doing so, children act predictively. They start reaching before the object enters their reaching space, aiming for a position where the object will appear when it reaches their hands.
Similar observations can be made for visual tracking studies and studies that measure predictive motion of the head. There is evidence that infants extrapolate future object positions (von Hofsten, Feng et al. 2000). Studies of preferential looking for novel or unexpected events: When children are confronted with outcomes different from their predictions they are puzzled. It is like watching magic tricks (Gopnik and Meltzoff 2002). The surprise can be noticed by the children’s stare. An independent observer can measure how long children watch a certain situation in an experimental setup. It is evident that children make inferences about object motions behind other objects (Spelke, Breinlinger et al. 1992).
3 Sandbox Geography A sandbox is a place for experimentation; the laws of physics can be investigated using very simple models. The models are made of sand, so they do not last forever, but they can raise new insights into the little engineers' understanding. The objects treated in a sandbox underlie a mesoscopic partitioning (Smith and Mark 2001): they are on human scale and they belong to categories that geographers form. “Sandbox Geography” is motivated by children's conception of space and can be seen as a contribution to the naïve geography (Egenhofer and Mark 1995). The investigation of very simple spatial situations is necessary to find out more about how space is structured in mental models. The goal is a formalization of these simple models. There is no need to connect these models to a new theory of learning, nor do the authors intend to build a computational model for a child's understanding of space. Furthermore, the sandbox is also a place to meet, a place of social interaction. The social aspect is considered more and more in building ontologies for GIS. The presented research may contribute new insights for finding structures to define sound GIS interfaces. The basis of the present investigation is the theory theory as explained in the previous section. An initial geographic model formed under this assumption will undergo changes. The necessity for adaptation can arise for two reasons. First, the environment may change and the models we made about it may not be applicable anymore. Second, we may acquire new knowledge or be endowed with new technology. Our conceptual models then change and we perceive the environment differently. Consequently we model the environment differently. We select one example of several theories in this paper for modeling what is called object permanence in psychology. Where is an object when
it is hidden? Adults have a quite sophisticated theory about “hidden objects”. Four factors contribute to their knowledge. Adults know about spatial relations between stationary objects, they assume the objects to have properties and they know about some laws that govern the movement and the perception of objects. Equipped with this knowledge they can predict where and when an object will be visible to an observer. They can explain disappearance and reappearance and form alternate hypothesis about where the object might be if the current rules do not hold (Gopnik and Meltzoff 2002). Children start out with quite a simple theory where an object might be. 2.5 months old infants expect an object to be hidden when behind a closer object, irrespective of their relative sizes. After about a month they will revise this theory and consider the size as well. An object that disappeared is firstly assumed to be at the place where it appeared before. That is habituation – parents tidy up in the world of infant’s objects. A later hypothesis is that an object will reappear at the place where it disappeared. The object is individuated only by its location. The properties of the object seem to be ignored. It is even possible to exchange a hidden object. In a series of experiments an object is presented to the child and then hidden behind a screen. An experimenter exchanges the hidden object e.g. a blue ball by a red cube. Then the screen is removed. A child around the age of six months will not be surprised as long as an object reappears where it disappeared. Surprise appears only if observations and predictions about an object do not fit together. Because the child’s prediction does not consider properties of objects, the exchange of the object will not lead to a contradiction between prediction and observation. The object individuation by location will change in the further development to an object individuation by movement. An object that moves along a trajectory will be individuated as a unique object even when it does change its properties. Additionally, there seems to be a rule that solid 3D objects can not move into each other as long as they are on the same path. The child will even be able to make a prediction about when the object will appear on a certain point in the trajectory. This theory will again change to an object individuation by physical properties like shape, color, and size. As it goes through this process the child will come closer to an adult’s theory of objects with every new experience it makes about the objects. In the following sections we want to present a formalization of the “hidden object” problem. It is the first model in the necessary series of models for the sandbox geography.
4 Algebra An algebraic specification consists of a set of sorts, operations, and axioms (Loeckx, Ehrich et al. 1996). There are well known algebras, like the algebra for natural numbers, the Boolean algebra or the linear algebra for vector calculations. An algebra groups operations that are applied to the same data type. The Boolean algebra has operations that are all applied to truth values. Axioms describe the behavior of these operations. An example is given below.

Algebra Bool b
  operations
    not :: b -> b
    and, or, implies :: b -> b -> b
  axioms (for all p, q)
    not (not p) = p
    p and q = if p then q else False
    p or q = if p then True else q
    p implies q = (not p) or q

A structure preserving mapping from a source domain to a target domain is called a morphism. Morphisms are graded by their strength and describe the similarity of objects and operations in source and target domain. Finding or assuming morphisms helps to structure models. They help to link a cognitive model to a model of the real world. Previous work has successfully used algebra to model geographic problems (Frank 1999; Frank 2000; Raubal 2001). Algebras help to abstract geographic problems and offer the possibility to do this in several ways. An algebra can be used as a sub-algebra within another algebra and thus allows the combination of algebras. Instantiation is another way to reuse algebras (Frank 1999). This research assumes the following hypothesis: theories of space can be described by a set of axioms. It is possible to revise such a theory by adding, deleting, or exchanging axioms. Therefore algebra seems to be the right option for modeling the problem. Algebras for different spatial situations can be built and quickly tested with an executable functional programming language.
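The prototypes in this paper are written in Haskell; purely as an illustration of what “quickly tested” can mean, the following sketch (ours, in Python) renders the Boolean algebra above as ordinary functions and checks each axiom exhaustively over the two truth values.

```python
from itertools import product

# Operations of the Boolean algebra (the sort b is rendered as Python bool).
def not_(p):        return not p
def and_(p, q):     return q if p else False
def or_(p, q):      return True if p else q
def implies(p, q):  return or_(not_(p), q)

# Axioms, checked exhaustively over the carrier set {True, False}.
for p, q in product([True, False], repeat=2):
    assert not_(not_(p)) == p
    assert and_(p, q) == (p and q)
    assert or_(p, q) == (p or q)
    assert implies(p, q) == ((not p) or q)
print("all axioms hold")
```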
5 Agents and Algebra To model the “hidden objects”, an agent-based approach has been chosen. An agent can be defined as “Anything that can be viewed as perceiving its environment through sensors and acting upon the environment through effectors” (Russell and Norvig 1995). Several definitions can be found in the literature (Ferber 1998; Weiss 1999). Modeling an Algebra Based Agent is motivated by the tiered belief computational ontology model proposed by Frank (2000). A two-tiered reality-beliefs model allows one to model errors in a person's perception by separating facts from beliefs. This distinction is vital for modeling situations where agents are puzzled. This always happens when a belief about the “real world” does not fit together with the actual facts. Several reactions are possible in this situation:
1. The agent retests the current belief against the reality.
2. The agent makes use of an alternative hypothesis and tests it.
3. If no rule explains the model of reality, the agent has to form a new ad-hoc rule that fits.
4. If all rules fail and ad-hoc rules also do not work, the agent has to exchange its complete theory. This case does not arise under the hypothesis taken here, that theories can be revised by adding, deleting, or exchanging axioms.
The agent generates a reaction of surprise when beliefs do not fit together with facts about the world. An environment with a cognizant agent has to be built as a computational model.
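A minimal sketch of this belief-versus-fact loop (our own illustration in Python, not the authors' Haskell prototype): the agent predicts from its currently believed hypothesis, compares the prediction with the observed fact, and either keeps the belief, switches to an alternative hypothesis, or reports that a new ad-hoc rule is needed.

```python
def surprised(prediction, observation):
    """The agent is surprised whenever belief and observed fact disagree."""
    return prediction != observation

def revise(situation, observation, hypotheses, current):
    """hypotheses: list of functions situation -> prediction; current: index of the
    hypothesis believed so far. Returns the index to believe next plus a note.
    A toy control loop written for this description, not the authors' code."""
    if not surprised(hypotheses[current](situation), observation):
        return current, "belief confirmed; keep testing until bored"
    for i, h in enumerate(hypotheses):                 # reactions 1-2: retest, try alternatives
        if not surprised(h(situation), observation):
            return i, "switched to an alternative hypothesis"
    return current, "no existing rule fits; form a new ad-hoc rule"   # reaction 3
```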
6 Computational Model The computational model consists of a simple world with named solid objects. The objects can be placed in the world. Their locations are described by vectors. It is also possible to remove objects. An agent has been modeled that can observe its own position and orientation in the world.

Algebra World (world of obj, obj, id, value)
  Operations
    putObj
    removeObj
    getObj
Algebra Positioned (obj, vec)
  Uses VecSpace
  Operations
    putAt
    isAt

Algebra VecSpace (Vector, length)
  Operations
    dotProd
    orthogonal
    distance
    direction
    ...

Algebra Object (obj)
  Operations
    maxHeight
    maxWidth
    color
    ...

The computational model motivated by the theory theory makes use of three basic properties. The prototypic agent has to be endowed with innate knowledge. Jerry Hobbs claims that there is a certain minimum of “core knowledge” that any reasonably sophisticated intelligent agent must have to make its way in the world (Hobbs and Moore 1985). The agent is able to observe the distance and direction between itself and an object. The objects are given names in order to identify them. The agent can give an egocentric description of the objects in the environment. A learning mechanism shall enable the agent to revise its knowledge and have new experiences with its environment. The agent shall apply a mechanism of theory testing by making hypotheses and testing and verifying them. To determine the location of an object, the agent is equipped with an object-location memory. Each object is situated at a vector-valued location in the world. The agent stores locations with a timestamp in order to distinguish when objects have been perceived at a certain location. For the first version of the computational model, agents do not move. The agent generates a reaction of surprise when beliefs do not fit together with facts about the world.

Algebra Agent
  Operations
    position   :: agent -> pos
    direction  :: agent -> dir
    observe    :: t -> world -> [obj]
    predict    :: t -> [[obj]] -> Maybe
    egocentric :: t -> [obj]
  Axiom
    isSurprise = If observe(t1, world) <> predict(t1, [[obj]]) then TRUE

By the exchange of one axiom, three different behaviors and thus three spatial conceptualizations can be achieved. In the first model a disappearance of an object will be explained by the following hypothesis: the object will be behind the occluding object where it disappeared. The predict function will return a list of objects at time ti, being behind an occluding object.

Axiom: predict (t, [[obj]]) = [obj(ti)]

Observing a contradiction with the prediction, an axiom is replaced. This gives a new prediction. The second model formalized will consider that the object will be where it appeared before. The predict function will return a list of objects perceived at an initial observation time t0.

Axiom: predict (t, [[obj]]) = [obj(t0)]

The final model will assume that an object will be where it disappeared. The predict function will deliver a list of objects visible at the observation time tv.

Axiom: predict (t, [[obj]]) = [obj(tv)]

For the realization of this computational model an executable functional language has been chosen. Haskell is widely accepted for rapid prototyping in the scientific community (Bird 1998).
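Read operationally, the three axioms are interchangeable prediction rules over the agent's timestamped object-location memory. The sketch below is a simplified illustration of that idea in Python; the authors' actual model is the Haskell algebra above, and names such as predict_where_it_appeared are ours. Swapping the rule yields the three behaviours, and surprise is reported when an observation contradicts the prediction.

```python
# Timestamped object-location memory: history[name] = [(t, position), ...]
def last_seen(history, name):
    return max(history[name])[1]             # position at the latest timestamp

def first_seen(history, name):
    return min(history[name])[1]             # position at the first observation (t0)

# Three interchangeable prediction rules (one per axiom in the text, simplified
# to return a single position rather than a list of objects).
def predict_behind_occluder(history, name, occluders):
    # Rule 1: the object is behind the occluding object where it disappeared.
    return occluders.get(name, last_seen(history, name))

def predict_where_it_appeared(history, name, occluders):
    # Rule 2: the object will reappear where it was first observed (t0).
    return first_seen(history, name)

def predict_where_it_disappeared(history, name, occluders):
    # Rule 3: the object will reappear where it was last visible (tv).
    return last_seen(history, name)

def surprise(predict, history, name, occluders, observed_position):
    """True when the observed reappearance contradicts the current rule."""
    return predict(history, name, occluders) != observed_position

# Example: a ball first seen at (0, 0), last seen at (3, 0) before vanishing.
history = {"ball": [(0, (0, 0)), (1, (1, 0)), (2, (3, 0))]}
occluders = {"ball": (4, 0)}                 # a screen assumed at (4, 0)
print(surprise(predict_where_it_appeared, history, "ball", occluders, (3, 0)))    # True
print(surprise(predict_where_it_disappeared, history, "ball", occluders, (3, 0))) # False
```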
7 Conclusion and Outlook Naïve Geography theories can benefit from the presented investigations. We have shown that it is possible to formalize the conceptual models children have about space. Further research in developmental psychology will be beneficial for this work, but the already existing body of research will
be sufficient for my Ph.D. Future research will concentrate on moving objects and extend the presented approach. The formalization has to be enhanced and different spatial situations have to be modeled using algebra. We will not undertake human-subject testing, but concentrate on formalizing operations reported in the literature. The model of our current agent can be extended by the inclusion of perspective taking, delivering intrinsic and absolute allocentric descriptions of the world. If an agent is tutored by other agents, it requires rules about when and how knowledge is acquired. These last considerations are omitted from our research at this time; however, we want to keep them as an interesting topic for the future. It is important to identify the structures in the spatial models and find mappings between them. To form a sound GIS theory we need to find simple commonsense concepts. Children’s understanding of space can be exploited to find these concepts. This paper aims to contribute towards a better understanding of the formal structure of spatial models – the Sandbox Geography.
Acknowledgements This work has been funded by the E-Content project GEORAMA EDC11099 Y2C2DMAL1. I would especially like to thank Prof. Andrew Frank for guiding me through the process of finding a Ph.D. topic, for his discussions and all the help with this paper. We would also gratefully like to mention Christian Gruber and Stella Frank for correcting the text and checking my English. Last but not least, many thanks to all colleagues, especially Claudia Achatschitz from the Institute for Geoinformation, for their valuable hints.
Bibliography Bird, R. (1998). Introduction to Functional Programming Using Haskell. Hemel Hempstead, UK, Prentice Hall Europe. Egenhofer, M. J. and D. M. Mark (1995). Naive Geography. Lecture Notes in Computer Science (COSIT '95, Semmering, Austria). A. U. Frank and W. Kuhn, Springer Verlag. 988: 1-15. Ferber, J., Ed. (1998). Multi-Agent Systems - An Introduction to Distributed Artificial Intelligence, Addison-Wesley. Fodor, J. A. (1987). The modularity of mind: an essay on faculty psychology. Cambridge, Mass., MIT Press.
Frank, A. U. (1999). One Step up the Abstraction Ladder: Combining Algebras From Functional Pieces to a Whole. Spatial Information Theory - Cognitive and Computational Foundations of Geographic Information Science (Int. Conference COSIT'99, Stade, Germany). C. Freksa and D. M. Mark. Berlin, Springer-Verlag. 1661: 95-107. Frank, A. U. (2000). "Spatial Communication with Maps: Defining the Correctness of Maps Using a Multi-Agent Simulation." Spatial Cognition II: 80-99. Frank, A. U. (2001). "Tiers of ontology and consistency constraints in geographic information systems." International Journal of Geographical Information Science 15(7 (Special Issue on Ontology of Geographic Information)): 667-678. Gopnik, A. and A. N. Meltzoff (2002). Words, Thoughts, and Theories. Cambridge, Massachusetts, MIT Press. Gopnik, A., A. N. Meltzoff, et al. (2001). The Scientist in the Crib - What early learning tells us about the mind. New York, Perennial - HarperCollins. Hayes, P. (1985). The Second Naive Physics Manifesto. Formal Theories of the Commonsense World. J. R. Hobbs and R. C. Moore. Norwood, New Jersey, Ablex Publishing Corporation: 1-36. Hayes, P. J. (1978). The Naive Physics Manifesto. Expert Systems in the Microelectronic Age. D. Mitchie. Edinburgh, Edinburgh University Press: 242-270. Hobbs, J. and R. C. Moore, Eds. (1985). Formal Theories of the Commonsense World. Ablex Series in Artificial Intelligence. Norwood, NJ, Ablex Publishing Corp. Loeckx, J., H.-D. Ehrich, et al. (1996). Specification of Abstract Data Types. Chichester, UK and Stuttgart, John Wiley and B.G. Teubner. Mark, D. M. and M. J. Egenhofer (1996). Common-Sense Geography: Foundations for Intuitive Geographic Information Systems. GIS/LIS '96, Bethesda, American Society for Photogrammetry and Remote Sensing. McCloskey, M. (1983). Naive Theories of Motion. Mental Models. D. Gentner and A. L. Stevens, Lawrence Erlbaum Associates. Newcombe, N. S. and J. Huttenlocher (2003). Making Space: The Development of Spatial Representation and Reasoning. Cambridge, Massachusetts, MIT Press. Piaget, J. and B. Inhelder (1999). Die Entwicklung des räumlichen Denkens beim Kinde. Stuttgart, Klett-Cotta. Piaget, J., B. Inhelder, et al. (1975). Die natürliche Geometrie des Kindes. Stuttgart, Ernst Klett Verlag. Pinker, S. (1995). The Language Instinct. New York, HarperPerennial. Raubal, M. (2001). Agent-based Simulation of Human Wayfinding: A Perceptual Model for Unfamiliar Buildings. Institute for Geoinformation. Vienna, Vienna University of Technology: 159. Russell, S. J. and P. Norvig (1995). Artificial Intelligence. Englewood Cliffs, NJ, Prentice Hall. Smith, B. and D. M. Mark (2001). "Geographical categories: an ontological investigation." International Journal of Geographical Information Science 15(7 (Special Issue - Ontology in the Geographic Domain)): 591-612. Spelke, E. S. (2000). "Core Knowledge." American Psychologist November 2000: 1233-1243.
Spelke, E. S., K. Breinlinger, et al. (1992). "Origins of knowledge." Psychological Review 99: 605-632. Spelke, E. S. and G. S. Van de Walle (1993). Perceiving and reasoning about objects: insights from infants. Spatial representations: problems in philosophy and psychology. N. Eilan, R. McCarthy and B. Brewer. Cambridge, Massachusetts, Blackwell: 132-161. von Hofsten, C., Q. Feng, et al. (2000). "Object representation and predictive action in infancy." Developmental Science 3(2): 193-205. Weiss, G. (1999). Multi-Agent Systems: A Modern Approach to Distributed Artificial Intelligence. Cambridge, Mass., The MIT Press.
Determining Optimal Critical Junctions for Realtime Traffic Monitoring for Transport GIS Yang Yue1 and Anthony G. O. Yeh2 Centre of Urban Planning and Environmental Management, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, P. R. China 1 [email protected] 2 [email protected]
Abstract. Traffic data are the most important component of any transport GIS. They are mainly collected by traffic sensors (detectors). However, the installation of sensors is quite expensive. There have been studies on finding the optimum locations of inductive loop detectors, the most widely used sensors of the past decades. With the advancement of new sensor technologies in recent years, however, many new sensors are now available for multi-lane and multi-direction traffic monitoring. Most of them are nonintrusive sensors, which are more suitable to be located at road junctions instead of on roadways. Thus, different from previous studies on link-based sensor location, this paper explores a method for determining critical road network junctions as the optimum locations of nonintrusive sensors for monitoring and collecting real-time traffic data. The objective is to select the least number of junctions while maximally covering the road network. Since the problem is NP-complete, a greedy-based heuristic method is proposed and a numerical experiment is conducted to illustrate its efficiency.
1. Introduction The most typical applications of transport GIS are road navigation systems, and numerous travelers have benefited from them. As the key component of navigation systems, real-time traffic information enhances their efficiency and reliability, not only for road users but also for traffic managers.
However, the quality of real-time traffic information heavily depends on the location and number of the sensors deployed on the road network. Due to the high installation and maintenance costs, saturation coverage of sensors on every road section of a road network is still prohibitive even with the drastic fall in the costs of electronic sensors in recent years (Perrin 2002). Thus, how to use the minimum number of sensors to obtain high-quality traffic information is a problem worthy of study. In practical applications, sensors are usually installed on the road links with the greatest volumes and the greatest variability (Klein 2001). This method is easy to implement, but it is often difficult to reach a systematic solution with it. Some researchers have recognized this issue and tried to solve the problem from a more theoretical point of view. Looking back at the research on the sensor location problem, most studies modelled it as a conventional facility location problem, which is used for determining the optimal locations of hospitals, gas stations, etc., for serving the maximum number of customers while minimizing the total transport distance. And since inductive loop detectors (ILDs), which are installed in the roadway subsurface, have been the most widely used traffic sensors for decades, these studies further concentrated on the link-based location problem, i.e. locating the minimum number of sensors on the links of a road network to cover the maximal area. However, with the advancement of new sensor technologies in recent years, the placement of traffic sensors is no longer limited to the roadways, especially for nonintrusive sensors, such as cameras associated with video image processors, microwave radar, passive and active infrared devices, ultrasound, and passive acoustic arrays. Many of them are more suitable to be installed at road junctions to perform multi-lane and even multi-direction monitoring (Klein 2001). They are beginning to serve as substitutes for ILDs because of their superior abilities. However, to the best of our knowledge, there is no related study yet on how to determine the optimal locations of these junction-based sensors. So, this paper will discuss the problem of determining optimal critical road network junctions for the purpose of collecting real-time traffic data in the context of traffic monitoring and further applications in transport GIS.
2. Related studies Aiming at the location of ILDs, Lam and Lo (1990), Yang and Zhou (1998), Bianco et al. (2001), and Yang and Miller-Hooks (2002) explored the link-based sensor location problem with different rules and criteria. Yang and Zhou (1998) adopted four rules (O-D covering rule, maximal flow
fraction rule, maximal flow-intercepting rule and link independence rule) to select links for locating sensors, with the objective function of maximizing the traffic flows covered. Besides giving priority to links according to their traffic flows, other criteria include “the frequency of a link used in O-D pairs” and “the Mean Absolute Error in estimation of the O-D matrix” (Lam 1990). Meanwhile, in terms of interpreting the influence area of a sensor, Bianco et al. (2001) used a combined cutset1 associated with the sensor to represent the area. Similar to the idea of Bianco et al., Yang and Miller-Hooks (2002) defined that a sensor located on a link can cover that link and all of its immediately adjacent link(s), and formulated the location problem as a dependent maximum arc-covering location problem based on this notion. However, the notions used to define the influence area of a sensor in the above studies may not be applicable for traffic flow monitoring and data collection, in which more emphasis is given to the change of flows and their distribution instead of the amount itself. This paper takes the view that what distinguishes the traffic sensor location problem, especially for flow-related sensors, from other location problems is the influence area of a sensor. The influence areas of facilities such as hospitals, gas stations, and schools are usually defined by a service radius. Both the cutset and the arc-covering definitions are similar to this kind of method, in which more than one link is contained in the influence area of a sensor. However, as we can see, traffic conditions are capricious. We can have congestion on a road link even without disturbances from entrances or exits. Yet, as Homburger (1992) pointed out, “vehicles neither appear from nowhere nor disappear into thin air. They can only join traffic flows at trip origins or at junctions, and can leave them only at trip destinations, or, again, at junction”. So, the amount of vehicles on a road link will not remain conserved when it meets interruption points, such as junctions and entry and exit points. These are herein all referred to as junctions, where some flows leave and new flows join. So, for any single flow-monitoring sensor, such as an ILD, its effect cannot extend beyond two junctions – it begins at a junction and ends immediately where the road link meets another junction(s), no matter the shape and the length of the road link (Fig. 1). As for its immediately adjacent road(s), their status can only be either detected or un-detected, and the only benefit they get is smaller estimation errors because of the relatively shorter propagation distance. Meanwhile, we should also realize that the rules for deploying speed or
1 A cutset (edge cutset) is a set of edges whose removal would disconnect the connected graph (a non-empty graph G is called connected if any two of its vertices are linked by a path in G).
density/occupancy-related sensors are different from the flow-related ones, because speed and density/occupancy may vary from point to point even over a very short distance, and the effective range of a speed or density/occupancy detection sensor is only the point at which it is located. In this paper, we only examine the flow-related sensors. Because of the limited influence area of ILDs, and other limitations, such as having to be buried under the roadway, new nonintrusive sensors have been introduced into the surveillance system. They can be used not only at construction sites and bridges, but also at places like junctions where, for example,
Fig. 1. The influence area of a traffic sensor in the area between two junctions
they are able to measure the intersection flow split directly, which used to be achievable only with the installation of ILDs on every lane of a junction. The nature of these new sensors makes them more suitable to be installed at junctions. So, the junction-based sensor location problem ought to be studied to find the optimal locations of these sensors, i.e. to select the least number of junctions that cover the maximum number of road links. In graph theory there is a related topic known as ‘vertex cover’. Vertex cover aims to find the least number of vertices that cover all the edges in a graph. This paper will use the vertex cover to explore the junction-based sensor location problem. In Section 3, we discuss the mathematical formulation of the vertex cover problem. In Section 4, we use a numerical example to illustrate the proposed method. Discussion and conclusions are presented in Sections 5 and 6, respectively.
3. Problem formulation In this section, a mathematical formulation of the vertex cover problem is given. The following symbols are defined:
E = set of edges in the network, e ∈ E
V = set of vertices in the network, v ∈ V
wij = weight of the edge (i, j)
Then, the network of roads is a graph G = (V, E): the vertices V of G stand for the transport junctions (‘nodes’ in GIS), the edges E of G correspond to road links with junctions situated at their endpoints but not in-between them, and wij is the traffic flow on the edge (i, j). A formal description of vertex cover is: given a graph G = (V, E), find a set of vertices of minimal size such that every edge (u, v) ∈ E is adjacent to at least one vertex in the set. The set of vertices is called a vertex cover, and the size of a minimum vertex cover is denoted by |V’|. In Fig. 2, vertex covers, indicated with black dots, are shown for a number of graphs.
Fig. 2. Vertex cover (Source: Weisstein 1999)
Let
\[ X_i = \begin{cases} 1 & \text{if vertex } v_i \text{ is chosen to be a critical vertex} \\ 0 & \text{otherwise} \end{cases} \]
Since our objective is to maximize the number of connected road links as well as the benefit of real-time information on the links, this is a multiobjective function: Maximize:
\[ \sum_{i \in V} X_i \sum_{(i,j) \in E} e_{ij} \quad \text{and} \quad \sum_{i \in V} X_i \sum_{(i,j) \in E} w_{ij} \tag{1} \]
Subject to:
\[ \min \sum_{i \in V} X_i \]
where X_i ∈ {0, 1} for all i ∈ V. However, finding the solution of this problem is an NP-complete problem2 (Dolan 1993), thus the search for an optimal solution is intractable. The simplest algorithm is a greedy-based heuristic algorithm – “selects the vertex with highest degree, adds it to the cover, deletes all adjacent edges, and then repeats until the graph is empty” (Diestel 2000; Beineke 1978).
VertexCover(G = (V, E))
  while (V ≠ ∅) do:
    select a vertex v ∈ V with the highest degree W(v)
    if several vertices have the same degree, select the one with the largest weight wij
    add v to the vertex cover, delete v from V
    delete all edges from E that are incident on v
The following section will use a numerical experiment to show how this algorithm is used for determining the optimal critical junctions.
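As a concrete illustration of this greedy step, the sketch below implements it in Python. The function name, the data layout (links as junction pairs, flows keyed by link) and the reading of the tie-break as "largest total flow on the still-uncovered incident links" are our own choices, not code from the paper.

```python
def greedy_vertex_cover(links, flows=None):
    """Greedy heuristic sketch for selecting critical junctions: repeatedly
    pick the junction incident to the most uncovered links (ties broken by
    the total flow, e.g. AADT, on those links), remove the links it covers,
    and repeat until every link is covered."""
    flows = flows or {}
    uncovered = [tuple(l) for l in links]
    cover = []
    while uncovered:
        degree, flow = {}, {}
        for (u, v) in uncovered:
            w = flows.get((u, v), flows.get((v, u), 0))
            for j in (u, v):
                degree[j] = degree.get(j, 0) + 1
                flow[j] = flow.get(j, 0) + w
        best = max(degree, key=lambda j: (degree[j], flow[j]))
        cover.append(best)
        uncovered = [(u, v) for (u, v) in uncovered if best not in (u, v)]
    return cover

# Small invented example: a triangle of junctions plus a pendant link.
links = [(1, 2), (2, 3), (1, 3), (3, 4)]
flows = {(1, 2): 6070, (2, 3): 4970, (1, 3): 4250, (3, 4): 820}
print(greedy_vertex_cover(links, flows))   # e.g. [3, 1]
```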
4. Numerical experiment Consider a network with 24 junctions and 69 links, as shown in Fig. 3; the AADT (Annual Average Daily Traffic) of each link is labeled. For each junction, the in- and out-traffic flows are calculated to represent its weight. Table 1 shows the degree and weight (total traffic flow amount) of each junction.
2 An NP-complete problem is a problem that is both NP (verifiable in nondeterministic polynomial time) and NP-hard (any other NP problem can be translated into it).
Fig. 3. The road network of the numerical example
Table 1. Vertex degrees and weights of the given network in Fig. 3
We now determine the critical junctions of this network by using the greedy heuristic algorithm stated above. In this network, Junction 10 has the largest degree among all the junctions, so this junction is selected first and we are able to monitor all the flows on its adjacent links. We then delete these 9 edges from the network and begin the next iteration, until all the edges are covered. In the last two iterations only two links are left, so selecting either endpoint of each of the two links will work. Table 2 shows the selected junction and the percentage of links and flows covered after each iteration. 15 iterations are used for solving the junction-based sensor location problem, i.e., 15 junctions are picked for the minimum vertex cover of the given network.
Table 2. Selected junctions and their covered links and flows in each iteration
However, the above procedure is a purely mathematical calculation. When applied to a road network, the traffic flow conservation rule can be taken into consideration, which could speed up the iteration procedure. Following the flow conservation rule, there should be:
\[ \sum_{(i,j) \in v} f_{ij} = \sum_{(j,m) \in v} f_{jm} \tag{2} \]
So, among all the in- and out-flows of a junction, one flow can be allowed to remain un-detected, because its status can be derived from (2). Therefore, at least two vertices in this example (the last two) can be discarded from the result set. Thus, at most 13 junctions will be enough to cover all the links in the network, as shown in Fig. 4.
Fig. 4. Optimal critical junctions of the network for a vertex cover
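To see how equation (2) lets one incident flow per monitored junction stay undetected, a minimal sketch of recovering that flow is given below; the function name and the example flow values are invented for illustration only.

```python
def recover_missing_flow(inflows, outflows):
    """Given all flows incident to a junction except one (the unknown
    flow marked as None), derive it from flow conservation:
    sum(inflows) == sum(outflows)."""
    if inflows.count(None) + outflows.count(None) != 1:
        raise ValueError("exactly one flow may be undetected")
    known_in = sum(f for f in inflows if f is not None)
    known_out = sum(f for f in outflows if f is not None)
    if None in inflows:
        return known_out - known_in      # the missing in-flow
    return known_in - known_out          # the missing out-flow

# Example: three detected in-flows, one detected out-flow, one unknown.
print(recover_missing_flow([1220, 820, 3750], [4090, None]))  # -> 1700
```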
On the other hand, this problem can also be considered as: given a percentage of links or flows that need to be covered, find the corresponding optimal critical junctions. For example, in this network, if only 80% of the links or flows need to be covered, then only 9 and 7 junctions are needed, respectively (Fig. 5). The non-detected flows can be estimated by applying a traffic flow estimation algorithm, for example a CA model (Chrobok 2001) or Turning Movement Estimation in Real Time (TMERT), to simulate or construct the traffic flow of the whole transport network. The accuracy of the latter method can be as high as 96% (Perrin 1999).
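The 80% scenario can be read as running the same greedy selection with an early stop. The following variant (again with invented names, reusing the structure of the earlier sketch) stops once a target share of the total flow is covered; it assumes every link has a positive flow value.

```python
def greedy_cover_until(links, flows, target_share):
    """Greedy junction selection that stops once the selected junctions
    cover at least target_share (e.g. 0.8 for 80%) of the total flow."""
    total = float(sum(flows.values()))
    uncovered = [tuple(l) for l in links]
    covered_flow, cover = 0.0, []
    while uncovered and covered_flow / total < target_share:
        degree, flow = {}, {}
        for (u, v) in uncovered:
            w = flows.get((u, v), flows.get((v, u), 0))
            for j in (u, v):
                degree[j] = degree.get(j, 0) + 1
                flow[j] = flow.get(j, 0) + w
        best = max(degree, key=lambda j: (degree[j], flow[j]))
        cover.append(best)
        covered_flow += flow[best]       # flows on newly covered links
        uncovered = [(u, v) for (u, v) in uncovered if best not in (u, v)]
    return cover

links = [(1, 2), (2, 3), (1, 3), (3, 4)]
flows = {(1, 2): 6070, (2, 3): 4970, (1, 3): 4250, (3, 4): 820}
print(greedy_cover_until(links, flows, 0.8))   # e.g. [3] covers ~62%, then [3, 1]
```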
Fig. 5. Percentage of links and flows covered against the number and percentage of critical junctions
5. Discussion The numerical experiment shows that, using the greedy-based heuristic algorithm, about half of the total number (54%) of the junctions of the transport network are able to cover all the links of the network. And if 80% coverage of the flows is acceptable, this is reduced to 7 junctions (29% of the total number of junctions), achieving a further saving of construction and maintenance cost by installing the sensors at these selected junctions rather than at all 24 junctions. Some studies (Crescenzi 2003; Skiena 1997) revealed that, with appropriate data structures, this greedy-based algorithm can be done in linear time, and that the size of the cover it finds
vertices, |V’|, will be between 2
log log V 2 log V
and 2
2 ln ln V ln V
457
(1 R (1)) .
However, in the case of ties or near ties, this heuristic algorithm can go seriously astray and, in the worst case, it can yield a cover that is O(log n) times optimal, where n is the number of vertices. Thus, another algorithm for vertex cover is suggested: repeatedly choose an edge and put both of its endpoints into the cover; throw these vertices and their adjacent edges out of the graph, and continue (a sketch is given at the end of this section). Since each edge that gets chosen must have one of its endpoints in the optimal cover, it should be apparent that this algorithm uses at most twice as many vertices as the optimal vertex cover (Mitzenmacher 2003). We also need to be aware that in the experiment we use AADT, which is a yearly-based value, to represent the flow on each link. In reality, however, flows vary from hour to hour and from day to day. When different flows, such as peak-hour data and off-peak data, are used, the fluctuation would certainly influence the optimal solution of the critical junctions. So, the stability of the optimal location under different traffic conditions should be further studied.
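The edge-picking algorithm described above is straightforward to state in code. The sketch below is our own illustration, not the authors' implementation; the small example also shows the worst case, where the result is exactly twice the optimum but never worse than that.

```python
def matching_vertex_cover(links):
    """2-approximate vertex cover: repeatedly pick an uncovered link, put
    BOTH of its endpoints into the cover, and skip every link already
    touching the cover. The result is at most twice the optimal size."""
    cover = set()
    for (u, v) in links:
        if u not in cover and v not in cover:   # link still uncovered
            cover.update((u, v))
    return cover

# Triangle plus pendant link: the optimum is {2, 3}, this returns 4 vertices.
print(matching_vertex_cover([(1, 2), (2, 3), (1, 3), (3, 4)]))  # {1, 2, 3, 4}
```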
6. Conclusions Real-time traffic data are essential for any transport GIS for road navigation and traffic monitoring systems. The problem of determining the optimal critical network junctions for collecting real-time traffic data for transport GIS can be solved by modelling it as a vertex cover problem. We have provided a mathematical formulation and a greedy-based heuristic approach to solve this problem. A numerical experiment using this method indicates that about half of the junctions will be able to cover all the links of the network, and if covering 80% of the flows is acceptable, only 29% of the junctions are needed. This will help to save an enormous amount of money in installing and maintaining traffic sensors for the collection of real-time traffic data. In future studies, we will compare the performances of the two algorithms on a real road network. Our study will further examine the robustness and stability of this method for the location of optimal critical junctions under different traffic conditions, for example, the AADT and peak-hour traffic flow.
Acknowledgement The authors would like to thank the anonymous reviewers for their invaluable suggestions for improving this paper. References Beineke L W (1978). Selected Topics in Graph Theory. Academic Press, London Bianco L, Confessore G, and Reverberi P (2001). A network based model for traffic sensor location with implications on O/D matrix estimates. Transportation Science 35(1): 50-60 Chrobok R, Wahle J, and Schreckenberg M (2001). Traffic forecast using simulations of large scale networks. 2001 IEEE Intelligent Transportation Systems Conference Proceedings, Oakland, CA. Crescenzi P, Kann V (2003). A compendium of NP optimization problems. Diestel R (2000). Graph Theory (2nd Edition). Springer, New York Dolan A (1993). Networks and algorithms: an introductory approach. J. Wiley & Sons, Chichester Homburger W S (1992). Fundamentals of Traffic Engineering. Institute of Transportation Studies, University of California at Berkeley, Berkeley Klein L A (2001). Sensor Technologies and Data Requirements for ITS. Artech House, Boston Lam W H K, Lo H P (1990). Accuracy of O-D estimates from traffic counts. Traffic Engineering and Control June: 358-367 Mitzenmacher M (2003). Computer Science 124 -- Data Structures and Algorithms. Perrin H J (1999). A Priority Method for Optimizing Network-Wide Traffic Detector Location. Salt Lake City, University of Utah. Perrin H J (2002). Real time flow estimation using TMERT. Traffic Engineering and Control March: 90-91 Skiena S S (1997). The Algorithm Design Manual. Springer, New York Weisstein E W (1999). Vertex Cover -- from MathWorld, http://mathworld.wolfram.com/VertexCover.html Yang B, Miller-Hooks E (2002). Determining critical arcs for collecting real-time travel information. Transportation Research Record 1783: 34-41 Yang H, Zhou J (1998). Optimal traffic counting locations for origin-destination matrix estimation. Transportation Research B 32(2): 109-126
Collaborative Decision Support for Spatial Planning and Asset Management: IIUM Total Spatial Information System Alias Abdullah1, Muhammad Faris Abdullah1 and Muhammad Nur Azraei Shahbudin2 1- Department of Urban and Regional Planning, Kulliyyah of Architecture and Environmental Design, International Islamic University Malaysia [email protected], [email protected] 2 - Bureau of Consultancy and Entrepreneurship, International Islamic University Malaysia [email protected]
Abstract Although the geographical information system (GIS) is being used as a tool in the Malaysian development process, its use has been limited to the planning and design stage of the process. There have been very few Malaysian initiatives to extend the usage of GIS into the operation and post-operation phases of the development process. This paper looks at the effort by the International Islamic University Malaysia (IIUM) in developing a GIS-based system to assist the management of her campus. Keywords: GIS, development process, asset management, I-SPACE
1 Introduction In line with the advancement in information technology (IT), and coupled with Government policies and initiatives (Alias Abdullah et al. 2003 & Lee 2000), GIS has found its way into the Malaysian built environment scene. Within the Malaysian development process, GIS is, to an extent, rigorously used as a tool to aid surveying, mapping, analyses, development plan preparation and development control activities (Azizan Mohd Sidin 2002). For instance, the Federal Department of Town and Country Planning has explicitly requested the application of GIS in the preparation of all development plans in West Malaysia. Additionally, GIS is also being extensively used by other government agencies and by utility companies such as Tenaga Nasional Berhad, the nation-wide electricity supplier.
Even though the benefits of GIS are widely acknowledged among Malaysian built environment professionals, its use is still largely limited to the planning and design stage of the development process (refer Figure 1). Attempts to apply GIS in the operation and post-operation stages of the development process are very few. One of the more notable examples is by Putrajaya Corporation, the local authority responsible for overseeing the development and management of the new Malaysian Government Centre (Putrajaya). They have embarked on developing an integrated urban management system known as SUMBER-PUTRA. Nevertheless, the system is still very much in its infancy and, thus, it is too early to gauge how effective or successful it will become.
Fig. 1. I-SPACE application in the development process (stages of the development process: planning & design, operation, post-operation; with the activities supported by I-SPACE at each stage, namely mapping, spatial analyses and plan preparation; asset maintenance, location and direction finding, tender and procurement, space planning and utilisation)
The following discussion in this paper looks at another Malaysian initiative to apply GIS further along the development process: the International Islamic University Malaysia (IIUM) Total Spatial Information System.
2 Background IIUM is a full-size university covering an area of roughly 700 acres and home to 13,000 students and 3,000 staff. To cater for the needs and requirements of her population, the University established a Development Division to oversee the planning and development of buildings and facilities within the campus area, and also to ensure proper management of all her assets. In the early period of its establishment, the Development Division had undertaken the tasks of planning and management using a paper-based system. Despite its extensive set up, by 2001 the Division realised that the paper-based system could no longer cope with the amount of tasks it had to undertake. Being a relatively new university (IIUM was established in
the 1980s but only recently moved to its present site), planning and design exercises were constantly being undertaken as new buildings needed to be constructed. The amount of assets accumulated is also increasing, both in terms of value and quantity, and management of these assets is becoming more and more laborious. Consequently, the Division decided to seek advice from the University’s Centre for Built Environment (a business wing of the Faculty of Architecture and Environmental Design) on how information technology (IT)-based systems could help to make the tasks of planning and management of the campus less burdensome and more effective and efficient. Although the Division was only interested in an IT-based facility management system, the Centre suggested that a GIS-based system would be more appropriate since planning and design exercises are involved. As a GIS-based system is able to perform analyses based on geographically-referenced spatial data, it would considerably improve the accuracy of those exercises. The Centre also suggested that, in order to reap the full benefit of such a system, it has to be applicable to many users. Therefore, the functions incorporated into the system must be reasonably wide-ranging and not solely for spatial planning and asset management purposes. Additionally, it was agreed that the system must also be ‘total’ in nature. This means that, in developing the system, emphasis must be given to the elements of organisation, procedure, technology and data (Alias Abdullah et al. 2003). By late 2001, a team comprising lecturers from the Faculty of Architecture and Environmental Design and officers from the Information Technology Division was set up to begin developing the system.
3 I-SPACE Overview In brief, IIUM Spatial Information System (known as I-SPACE) is an integrated planning and decision support system for spatial planning and asset management. Although the system largely sits on a GIS platform, it also combines several other systems including computer aided design (CAD), image management system (IMS), facilities management system (FMS), and database management system (DBMS) (refer Figure 2). The system allows IIUM campus planning as well as management of assets to be undertaken based on geographically-referenced data and projection. The system aims to reduce wastage, enhance efficiency and effectiveness, and create a better working environment through computerisation of tasks and procedures of development of plans, documentation, production of letters
and reports, procurement, space planning, inventory, and supplies (refer Figure 1).
Fig. 2. Components of I-SPACE (GIS + CAD + FMS + IMS + MMS)
The development of the system has been divided into several phases. The first phase was concluded in early 2002, and presently the system is undergoing the second phase of its development. Under the first phase, the main tasks were to develop the database structure, the graphical user interface (GUI) and basic analyses and queries. At the end of the first phase, the development team had come up with a prototype, which was presented to potential users for comments and feedback. Under the second phase, further enhancements are being carried out on the system, such as the integration of 3D data representation and the incorporation of enhanced functions. Future phases will involve full implementation of the system as well as maintenance and further enhancement.
4 I-SPACE Development In order to ensure that the benefits are wide-ranging and long-lasting, the system was developed incorporating the four essential elements of a total spatial information system (TSIS). 4.1 Organisation A TSIS must be developed at organisation or institution level. The decision to develop such a system must be made by the top management of the organisation, and the decision must be disseminated to departmental level. This is to ensure that everyone in the organisation is aware of the project and to garner full support for the development, operation and management of the system.
The development of I-SPACE was undertaken at organisation level, in this case IIUM. Although initiated by the Development Division of IIUM, the approval for the project came from the top management of the University. The funding for the project was approved by the University’s Standing Finance Committee. Presently, the system can be accessed by users from hubs located within the IIUM campus through a local area network (LAN) that supports high-bandwidth communication over short distances. Users’ access to the data and information is restricted through assigned usernames and passwords. Online registration is provided, where a username and password will be assigned to each user. Upon accessing the system, users are prompted to enter their username and password. Access to data and information is restricted according to the user and the sensitivity of the data. 4.2 Procedure A TSIS must be designed to cater for all the tasks or job scope of the organisation. Thus, prior to undertaking system development work, the development team conducted a user needs study in which a meeting was arranged with representatives of the users in order to inform them about the project and the benefits of the system, and also to get their feedback on what they expect from the system. Findings from this meeting became important inputs in devising an appropriate workflow for the system. During the second phase of I-SPACE development, the team began to engage in closer and more frequent discussions with the users in order to fine-tune the system to fully meet their requirements. These discussions also provided the avenue for the team to look at existing systems or databases maintained by the users and to discuss how I-SPACE could be integrated with them. 4.3 Technology Technology refers to the hardware and software used in TSIS operation. Appropriate hardware and software must be acquired to ensure the system can be used to its full potential. It is important to recognise that a TSIS is a ‘living’ system. It will require maintenance and system capacity building from time to time. Accordingly, sufficient funding must be made available for manpower to operate and maintain the system, and for capacity building, which includes hardware and software upgrades as well as the overall improvement of the system. In developing I-SPACE, the team realised that it is important to acquire hardware and software with the right capability and functionality to
perform the users’ required tasks. In terms of software, the primary desktop applications used in system development include MapInfo, AutoCAD, MapBasic, and Visual Basic. The system employs client-server network technology, where data are kept in a centralised database. For the database management system, the system uses Oracle as the main engine. The two key hardware components required by the system are personal computers, from which users access the system, and a network server, where the database is kept. Most personal computers available in the market would be able to access the system; the only apparent difference between using high-performance and low-performance computers to access the system would be the speed of data download, which would be significantly slower on low-performance machines. Besides providing a sufficient budget to acquire the required hardware and software for the initial development of I-SPACE, the University has also decided to allocate a budget for operation and future capacity building of the system. The operating budget will be used towards the recruitment and training of personnel to maintain and manage the system. The capacity building budget will be used for purchasing hardware and software upgrades when necessary and also for improvement of the system as a whole. Another network server may need to be purchased when the data storage capacity of the existing one reaches its maximum. New software or software upgrades may need to be acquired when more functions are to be included in the system. For instance, under the first phase of I-SPACE development, the system was only capable of performing 2D spatial analyses. However, due to the undulating landscape of the study area, it is necessary for 3D spatial analysis to be added to the system. Therefore, under the second phase of the system development, a new function is being added: 3D spatial analysis. As a result, new software has been purchased in order to generate 3D data from the existing 2D data. A new application tool is also being designed and developed to allow users to perform the 3D analysis. 4.4 Data The development process deals with both spatial and aspatial data. Thus, a TSIS must be able to handle both types of data successfully. This includes data storage, retrieval, analysis, update and sharing. Appropriate tools must be designed within the system to perform these data handling tasks.
Fig. 3. I-SPACE data model (spatial data layers for planning, architecture, landscape and engineering; the GPS inventory and spatial databases with their data, model and knowledge managers; and the user interface serving the core user divisions)
Data retrieved and analysed using TSIS can be used to assist decision-making. Thus, accuracy of data is also important. The more accurate the data, the more sound the decision made. Hence, it is necessary that the
accuracy of those data be verified on the ground, especially spatial data. Accurate spatial data not only lead to accurate and sound decisions, but can also save costs. Future development can be planned and designed straight from the data retrieved from the system database without the need to conduct another ground survey. I-SPACE utilises both spatial and non-spatial data. The data model of the system is shown in Figure 3. Spatial data collection involved two main stages. Firstly, spatial data were gathered from the construction and as-built drawings of IIUM. Secondly, the accuracy of the data obtained from the drawings was verified on the ground through an as-built survey. Any discrepancies between data obtained from the drawings and from the survey were rectified before those data were input into the database. Visual representations of the spatial data were captured in the form of photographs. The availability of photographs helps users in identifying the data on the ground and also in selecting the facilities they want to book or use. For instance, users who would like to use an IIUM seminar or conference room can check the photographs of suitable rooms before deciding on the one they prefer. The photographs give a clearer picture of, for example, the layout of the room, the lighting or furniture design, and the wall finishes. In the second phase of the system development, work is being undertaken to improve the visual representation of the spatial data by presenting them in 3-D form. At the moment, a 3-D model is being generated for the major buildings on the campus. A number of tools were designed and incorporated into I-SPACE to allow easy data retrieval and analysis. These tools are presented to users in a clear and easy-to-use GUI. Several customised query tools are also being developed for advanced users who wish to perform more complicated analyses on the data and information. Selected MapInfo tools, such as the query, ruler, and select tools, were also included in the system (Figure 4).
5 Experience developing I-SPACE: data accuracy
One of the main objectives of I-SPACE development is to allow management decisions to be made based on accurate, geographically-referenced spatial data. In fact, there have been proposals that data retrieved from I-SPACE should be fit for tender purposes. Thus, to the team, the accuracy of the spatial data is of utmost importance. During the early stages of the system development, it was thought that as-built drawings would be the main source of accurate spatial data since
these drawings were drawn based on pre-computation plans. Thus, the team began to gather and compile all the as-built drawings for all development projects within the IIUM campus. However, it was later discovered that not all drawings were available for use, since some (especially the old ones) had been badly damaged and some were already lost and could not be recovered. As a result, the team decided that it was necessary to conduct a comprehensive as-built survey.
Fig. 4. Main I-SPACE application window (map window with navigation map, standard MapInfo tools, building information and a photo of the selected building; the status bar shows the user ID, cursor location, date and time)
When the as-built survey was completed, the team tried to overlay the available as-built drawings with drawings generated from the as-built survey. However, the team found out that they did not match. Upon further inspection, it was found that many of the as-built drawings were actually inaccurate and did not reflect the actual constructed buildings on the ground. The overlay technique actually revealed the inaccuracies that persisted in the as-built drawings. It was rather fortunate that some of the as-built
drawings were missing or badly damaged; otherwise, the team would not have carried out the as-built survey and would not have discovered the inaccuracies within the as-built drawings. Further discussion with professionals in the built environment revealed that it is quite common for discrepancies to occur in as-built drawings. Therefore, to others who wish to develop a similar system, the team would, to an extent, recommend that a comprehensive as-built survey be conducted if the accuracy of spatial data is of significance in the context of the system development.
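Purely as an illustration of the kind of overlay check described in this section, the sketch below compares a building footprint digitised from an as-built drawing with the surveyed footprint. The use of the open-source Shapely library and all names and coordinates are our assumptions, not the tools or data used by the I-SPACE team.

```python
from shapely.geometry import Polygon

def footprint_discrepancy(drawing_coords, survey_coords):
    """Overlay a footprint from an as-built drawing with the surveyed
    footprint and report the mismatched area as a share of the surveyed
    area (0.0 means a perfect match)."""
    drawn = Polygon(drawing_coords)
    surveyed = Polygon(survey_coords)
    mismatch = drawn.symmetric_difference(surveyed).area
    return mismatch / surveyed.area

# Example: the drawing is offset by one metre relative to the survey.
drawing = [(0, 0), (20, 0), (20, 10), (0, 10)]
survey = [(1, 0), (21, 0), (21, 10), (1, 10)]
print(round(footprint_discrepancy(drawing, survey), 2))  # -> 0.1
```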
6 Concluding remarks It is hoped that this paper has been able to briefly demonstrate I-SPACE and the experience of the team in developing the system. Although the system is still far from complete, it has been receiving encouraging reviews from users. The aim now is to further enhance the system. In fact, several new ideas are already in the pipeline and will be developed and incorporated into the system under its future development phases. Finally, it is also hoped that the development of I-SPACE can contribute towards widening the scope of GIS usage in the Malaysian development process.
Acknowledgements The authors would like to thank the IIUM management (especially the IIUM Development Division) for granting the consultancy work on I-SPACE that made this research paper possible, and also acknowledge the generous support from Professor Dr. Ismawi Zen (Deputy Rector, Planning and Development) and Ir. Shaffei Mohamad (Director, Development Division) towards the I-SPACE project.
References Alias Abdullah, Muhammad Faris Abdullah, Mohd Fauzan Nordin, 2003, Managing urban development process by using spatial information system: a case study of I-SPACE. Journal of the Malaysian Institute of Planners, 1, 71-92. Azizan Mohd Sidin, Alias Abdullah, 2002, Strategi Perancangan DEGIS dan Pusat Geo-Data Negeri Selangor: Pengalaman dan Perspektif Pusat R&D Seksyen Penswastaan dan Ekonomi Negeri Selangor. Lee Lik Meng, Mohamed Jamil Ahmad, 2000, Local authority networked development approval system. Planning Digital Conference, Pulau Pinang, Malaysia, 28-29 March 2000.
Automatic Generation and Application of Landmarks in Navigation Data Sets Birgit Elias and Claus Brenner Institute of Cartography and Geoinformatics, University of Hannover, Appelstr. 9a, 30167 Hannover, Germany [email protected], [email protected]
Abstract Landmark-based navigation is the most natural concept for humans to navigate themselves through their environment. It is desirable to incorporate this concept into car and personal navigation systems, which are nowadays based on distance and turn instructions. In this paper, an approach to identify landmarks automatically using existing GIS databases is introduced. By means of data mining methods, building information of the digital cadastral map of Germany is analyzed in order to identify landmarks. In a second step, a digital surface model obtained by laser scanning is used to assess the visibility of landmarks for a given route.
1 Introduction and Related Work With the growing market for small mobile devices (PDAs, mobile phones), the need for useful applications increases. One very important application is a guiding component for personal use. Existing car navigation applications also need improvement to make them better adapted to humans and thus easier and more reliable to use. Today’s navigation systems give driving assistance in terms of instructions and distances, based on the current position and the underlying digital map, see Section 1.1. In contrast, research in the field of spatial cognition has shown that the use of landmarks is the most natural concept for humans to navigate themselves through an unfamiliar environment. Therefore, it is very important to provide landmarks for navigation purposes to simplify the system for the user by giving more natural instructions. The concept of landmarks is explained in Section 1.2. For the automatic determination of landmarks, different existing GIS databases are used, see Section 1.3. In this paper, we will outline an approach to determine landmarks automatically in GIS databases. For that purpose, we subdivide the generation of
landmarks in two different stages (see Section 2). First, we investigate the data for so-called potential landmarks using data mining techniques (Section 2.1). After that, we narrow the selection down to route-specific landmarks, considering their visibility from the point of view. Therefore, we need 3D data of the environment. Here, we use a DSM (Digital Surface Model) obtained from laser scanning data (Section 2.2). In Section 3 we discuss the results and give an outlook on future work. 1.1 Today’s Car Navigation Systems Modern car navigation systems were introduced in 1995 in upper-class cars and are now available for practically any model. They are relatively complex and mature systems able to provide route guidance in the form of digital maps, driving direction pictograms, and spoken language driving instructions (Zhao 1997). Looking back at the first beginnings in the early 1980s, many nontrivial problems have been solved, such as absolute positioning, provision of huge navigable maps, fast routing and reliable route guidance. The maps used by car navigation systems not only contain the geometry and connectivity of the road network but also a huge amount of additional information on objects, attributes and relationships. A good overview can be obtained from the European standard GDF, see e.g. (Geographic Data Files 3.0 1995). Of particular interest are points of interest (POI), which include museums, theaters, cultural centers, city halls, etc. Map data is acquired by map database vendors such as Tele Atlas or NavTeq and supplied to car navigation manufacturers in an exchange format (such as GDF). There, it is converted to the proprietary formats finally found on the map CD or DVD. This conversion is nontrivial since the data has to be transformed from a descriptive form into a specialized form supporting efficient queries by the car navigation system. 1.2 Basic Theory of Navigation with Landmarks The term landmark stands for a salient object in the environment that aids the user in navigating and understanding space. In general, indicators of a landmark can be a particular visual characteristic, a unique purpose or meaning, or a central or prominent location. Landmarks can be divided into three categories: visual, cognitive and structural. The more of these categories apply to a particular object, the more it qualifies as a landmark (Sorrows & Hirtle 1999). A study by Lovelace, Hegarty & Montello (1999) includes an exploration of the kinds and locations of landmarks used in route directions. Four groups can be distinguished: choice point landmarks (at decision points), potential choice point landmarks (at traversing intersections), on-route landmarks (along a path with no choice) and off-route landmarks
(distant but visible from the route). A major outcome of the study is that decision point and on-route landmarks are the ones most used in route directions for unfamiliar environments. The use of landmarks in street maps is discussed by Deakin (1996). The findings reveal that when supplemental landmarks are given, the navigation process is more successful and fewer errors occur. Landmark symbolization represented by geometric symbols or stereotype sketches was found to be equally effective. Even in car navigation systems using graphic and voice instructions instead of maps, landmarks added to directional instructions were helpful. The determination of which kind of landmark is useful for a specific navigation task has been investigated for the purposes of car navigation systems by Burnett (1998): especially ‘road infrastructure’, such as traffic lights and petrol stations, is considered to be important. Prospective attributes of such landmarks include permanence, visibility, location in relation to the decision point, uniqueness and brevity (Burnett, Smith & May 2001). The automatic generation of landmarks is a matter of ongoing research. One approach uses the landmark concept of Sorrows & Hirtle (1999) to provide measures that formally specify the landmark saliency of buildings: the strength or attractiveness of landmarks is determined by the components visual attraction (e.g. consisting of façade area, shape, color, visibility), semantic attraction (cultural and historical importance, explicit marks such as shop signs) and structural attraction (nodes, boundaries, regions). The combination of the property values leads to a numerical estimation of the landmarks’ saliency (Raubal & Winter 2002, Nothegger 2003). The concept was extended by a measure of advance visibility (Winter 2003): the visibility of façades while approaching a destination point is determined. Combined with the measure of saliency, this represents an approach to identify route-specific landmarks. 1.3 Used GIS Databases In our approach we use the digital cadastral map of Lower Saxony in Germany. This database is an object-oriented vector database of state-wide availability. The map includes information about parcels, land use, buildings and further administrative statements. We focus especially on building polygons and use their geometry as well as additional semantic attributes like building use (residential, public use) and building labels (name or function of the building, such as church or kindergarten). For providing the road network, we used ATKIS data. Finally, the digital surface model (DSM) for this study was obtained by airborne laser scanning using the TopoSys scanner (Lohr 1999). The original point cloud was interpolated to a 1 m × 1 m grid. Last pulse data was used in order to reduce the influence of vegetation. The DSM is used for computing the visibility of objects, as outlined below in Section 2.2.
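Because the DSM is a regular 1 m grid, the visibility of an object from a decision point can be sketched as sampling the surface heights along the line of sight. The code below is only an illustration of that idea; the grid layout, the assumed eye height and all names are ours, not the authors' implementation.

```python
import numpy as np

def is_visible(dsm, viewpoint, target, eye_height=1.7, step=1.0):
    """Line-of-sight check over a DSM grid (1 m cells, dsm[row, col] =
    surface height). viewpoint and target are (row, col) cell indices;
    the target is visible if no intermediate cell rises above the
    straight sight line between the two points."""
    (r0, c0), (r1, c1) = viewpoint, target
    z0 = dsm[r0, c0] + eye_height
    z1 = dsm[r1, c1]
    n = max(int(np.hypot(r1 - r0, c1 - c0) / step), 1)
    for i in range(1, n):                      # sample between the endpoints
        t = i / n
        r = int(round(r0 + t * (r1 - r0)))
        c = int(round(c0 + t * (c1 - c0)))
        sight_z = z0 + t * (z1 - z0)           # height of the sight line here
        if dsm[r, c] > sight_z:
            return False
    return True

# Toy DSM: flat terrain with a 10 m high "building" blocking one view.
dsm = np.zeros((50, 50))
dsm[20:25, 20:25] = 10.0
print(is_visible(dsm, (10, 10), (40, 40)))     # False (blocked by the block)
print(is_visible(dsm, (10, 40), (40, 40)))     # True
```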
2 Determination of Landmarks The identification of appropriate landmarks consists of two stages (see Figure 1). The first is the detection of all potential landmarks in the geo-database. That means the existing GIS database is searched for objects following our definition of a landmark: landmarks are assumed to be topographic objects which exhibit distinct and unique properties with respect to their local neighborhood. These properties determine the saliency of the objects, which in turn depends on different factors like size, height, color, time of the day or year, familiarity with the situation, direction of the route, etc. For the first stage, only the general geometric and semantic characteristics of the investigated objects in a certain neighborhood are needed. This computation step is independent of the particular route chosen (see Section 2.1). In a second step, the tentative selection has to be adapted to the route-specific needs, such as visibility of the object from the decision point and visibility while approaching this point. Finally, the position of the landmark relative to the route has to be considered in order to derive the appropriate route instruction, which can be integrated into navigation data sets (see Section 2.2).
Fig. 1. Generation of landmarks.
2.1 Discovering Potential Landmarks To make an automatic analysis process possible, we use data mining techniques to detect landmarks. Data mining methods are algorithms designed to analyze data, to cluster data into specific categories or identify regular patterns (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy 1996). Basic models of data mining are clustering, regression models, classification, summarization, link and sequence analysis. These procedures can be applied to data sets consisting of collected attribute values and relations for objects. In order to identify landmarks among buildings, all existing information about the buildings has to be extracted: information about semantics (use, function)
and geometry of the object itself (area, form, edges), but also information about topology, e.g. neighborhood relations to other buildings and other object groups (roads, parcel boundary etc.), and the orientation of the buildings (towards north, the next road, the neighbor) are collected in an attribute-value table. Because of the findings of Lovelace et al. (1999) (see Section 1.2), we only investigate (potential) decision points on the route for landmarks, that means all junctions in the underlying road network. For each potential decision point, the local environment for the investigation is determined by means of a 360 degree visibility analysis, which establishes which objects are visible from that point of view at all. All selected buildings (creating the local environment) are transferred to the data mining process to detect the object with distinct and unique properties with respect to all others. We used the classification algorithm ID3 (Quinlan 1986) and the clustering approach COBWEB (Witten & Eibe 1999), see (Elias 2003). Here, we present the results of the first test using a modified ID3 algorithm. Chosen Algorithm for Data Mining Originally, the ID3 algorithm is a method of supervised learning for inducing classification models, also called decision trees, from training data. The decision tree is built from a fixed set of examples, each of which has several attributes and belongs to a class (like yes or no). The resulting tree can be used to classify future samples. The decision nodes of the tree are chosen by use of an entropy-based measure known as information gain (Han & Kamber 2001). Applied to our data, the algorithm needs classified examples (the information which instances belong to the class ‘landmark = yes’). The resulting tree provides the shortest optimal description possible for a classification into the given classes. As in our study we do not know the landmarks beforehand, there are no classified examples. Thus, we use this procedure in a modified way: we enhance our data set with the class landmark (values: yes, no). Then, we iteratively hypothesize each building to be a landmark, whereas all the other buildings are not landmarks. Afterwards, we process all these data sets with the algorithm and compare the resulting decision trees with each other. The assumption is that true landmarks are identified by yielding the most simple (shortest) description, which means a position in the decision tree close to the root (for further details see (Elias 2003)). At the moment, we only consider instances with a resulting decision tree with one branch as potential landmarks; a sketch of this single-attribute test is given after Table 1. Preprocessing for Data Mining Before processing the data with the data mining algorithm, a few preprocessing steps are necessary: first, we have to provide the attribute values for all buildings in the analysis environment (see (Elias 2003)). The attribute values used divide into two different types: nominal (values are categories) and numeric (see Table 1).
Preprocessing for Data Mining

Before processing the data with the data mining algorithm, a few preprocessing steps are necessary: first, we have to provide the attribute values for all buildings in the analysis environment (see Elias 2003). The used attribute values divide into two different types: nominal attributes (values are categories) and numeric attributes (see Table 1).

Table 1. Types of Attributes.
Nominal attributes: building use (residential, public, ...); deviation of corners from being rectangular (yes, no); building function (code table); orientation to street (along, across, angular); orientation to neighbor (identical, angular); other parcel use than neighbors (yes, no).
Numeric attributes: number of corners; building area; length and width of the building and their ratio; orientation to north (angle); distance to road; number of neighbor buildings with direct contact; ratio of building to parcel area; built-up density around the building; number of buildings per parcel.

The derived attribute values have to be adapted to the chosen algorithm, because, depending on the underlying mathematical concept, only particular attribute types can be used. For example, the ID3 algorithm works with nominal attributes only; therefore, a transformation between the attribute types is necessary. We have used the data mining package WEKA (Witten & Eibe 1999), which offers an automatic conversion from numeric to nominal values (by dividing the numerical values into class ranges).

Processing of Potential Landmarks

We focus on the extraction of landmarks at decision points, that is, junctions that can be potential turning points in the later routing description. Therefore, we use the road network of the ATKIS database to provide all possible junctions in the neighborhood. For each decision point we determine its local environment and investigate whether there is a topographic object (in this approach, a building) with distinct, unique properties with respect to its local neighborhood, fulfilling our requirements for being a landmark. We use the results of the visibility analysis (Section 2.2) to determine which objects are in fact visible from the decision point (see Figure 2, left). The selected objects are used with their attribute values in the data mining process. As a result, all instances which lead to a positive classification into the class 'landmark = yes' in the first branch are proposed as potential landmarks. In the example of Figure 2 (right), the result of the modified ID3 processing is shown: three different buildings are highlighted as potential landmarks. They are chosen because each of them differs from the other buildings in one single attribute value: building number 1 has a different parcel use than its neighbors (because it is a high voltage transformer building next to public buildings and a parking place). Objects number 2 and 3 are singular because
of their function: one is the cafeteria of the university and the other is a day-care center for children. The other large buildings near the decision point might have been expected to be potential landmarks, but they are all classified as university buildings in the data set and are therefore not singular in their environment. Table 2 gives the degree of visibility (for a definition see Section 2.2), the distance to the decision point, and a short description for each of the three buildings. It is clearly visible that object number 2 is the best suited to be a landmark because of its high degree of visibility and its nearness to the decision point. This has to be considered in a further assessment of the results.

Fig. 2. Decision point (black triangle). Left: selection of all visible buildings. Right: three potential landmarks after processing.

Table 2. Degree of visibility.
No.  Visibility  Distance [m]  Description
1    4           216           high voltage transformer building
2    1278        60            cafeteria of the university
3    3           118           kindergarten
In Figure 3 a panoramic camera view from the decision point is shown. Only potential landmark number 2, the university cafeteria, can be clearly recognized, because it has a high degree of visibility (in the panorama the building is cut into two pieces, visible at the left and right margins).

Fig. 3. Panoramic camera view from the decision point.

2.2 Route-Specific Landmarks

Visibility Analysis

Landmarks can only be of use for navigation purposes if they are sufficiently visible during the actual navigation process. Although some conclusions on
visibility can be drawn from two-dimensional maps, important situations cannot be handled adequately. For example, Figure 4(a) shows such a case: the visibility of the tower on the right is not revealed because height is not taken into consideration. The optimal case is of course when a full three-dimensional city model is available. Then, visibility can be computed exactly, yielding even information on the visibility of single building faces. Buildings standing out behind other buildings are correctly identified (Figure 4(b)). Unfortunately, such three-dimensional city models are not always available, or only at substantial cost, which makes this approach impractical, especially with respect to the large areas typically covered in navigation databases. However, we can do better if we base the visibility analysis directly on the DSM from laser scanning. We will not obtain "beautiful" visualizations, but instead a sufficiently good estimate of which buildings can be seen from any given viewpoint (Figure 4(c)). We realized this approach as follows. For a given viewpoint, the position and viewing direction define the exterior orientation of a virtual camera of given horizontal and vertical viewing angle. This virtual camera represents the driver's view. The height is derived from the DSM itself, whereas the viewing angle can be obtained from the orientation of the corresponding street segment in the GDF or ATKIS dataset.
Fig. 4. Visibility analysis for buildings (gray boxes) standing along a street (black lines). The visibility cone is shown in dark gray. (a) Based on 2D ground plans. (b) Based on true 3D geometry. (c) Based on a discrete DSM.
The virtual image plane is then divided into a regular raster, each picture element (pixel) defining a ray in object space. All the rays are traced in object space to determine intersections with the DSM. For each hit, the corresponding object number is obtained by a lookup in an image containing ground plan id’s. Figure 5 shows an example. It represents the same decision point
as Figure 3. The objects appear in different shades of gray, which have been assigned randomly to the ground plan id’s.
Fig. 5. Visibility computation for the scene shown above. Left: top view, showing the DSM and the cone of visibility. Right: virtual panoramic view where each shade of gray corresponds to an object (building) number.
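The ray tracing just described can be illustrated with a rough sketch. All array names, the camera model and the step size below are simplified assumptions of ours, not the authors' implementation; the DSM and the ground-plan id raster are assumed to be co-registered grids.

import numpy as np

def panoramic_visibility(dsm, plan_ids, cam_xy, cam_z, cell_size,
                         n_cols=360, n_rows=40, v_fov=40.0, max_dist=300.0):
    """Count, per ground-plan id, how many pixels of a virtual panoramic
    view hit that object (a simple measure of its degree of visibility).

    dsm, plan_ids: 2D arrays on the same grid (heights and building ids, 0 = no building).
    cam_xy: camera position in grid coordinates (col, row); cam_z: camera height.
    """
    counts = {}
    azimuths = np.deg2rad(np.linspace(0.0, 360.0, n_cols, endpoint=False))
    elevations = np.deg2rad(np.linspace(-v_fov / 2, v_fov / 2, n_rows))
    for az in azimuths:
        for el in elevations:
            # March along the ray in steps of one grid cell.
            for step in np.arange(cell_size, max_dist, cell_size):
                x = cam_xy[0] + step * np.cos(az) / cell_size
                y = cam_xy[1] + step * np.sin(az) / cell_size
                col, row = int(round(x)), int(round(y))
                if not (0 <= row < dsm.shape[0] and 0 <= col < dsm.shape[1]):
                    break
                ray_z = cam_z + step * np.tan(el)
                if dsm[row, col] > ray_z:          # ray blocked: record the hit
                    obj = int(plan_ids[row, col])
                    if obj != 0:
                        counts[obj] = counts.get(obj, 0) + 1
                    break
    return counts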
For car navigation, a virtual camera should be used resembling a traditional (perspective) camera, as the driver's view is indeed constrained by the orientation of the car and the extent of the windshield. However, for personal navigation, the panoramic view shown here might be more appropriate, since the person to be guided can turn and look in all directions quite easily. For each viewpoint, a list of records is obtained showing which objects are visible in the scene and how many pixels of the virtual image plane they cover. We use this number of pixels as a direct measure for the degree of visibility. Since each pixel corresponds to a well-defined area in the field of view of the person to be guided, the number of pixels is proportional to the absolute area in the person's field of view. Since the objects in the virtual view are directly linked to the id's of the map objects, one can immediately infer the degree of visibility for all objects in the cadastral map.

Visibility Tracking

In the last section, visibility was computed for a single view. However, landmarks selected for a routing instruction should be visible during an entire manoeuvre. This can be checked by tracking the visibility of objects along the trajectory defined by the corresponding manoeuvre. To this end, our algorithm traces the entire trajectory, generating virtual views at equidistantly spaced positions (2 meters apart in this case) and in the orientation defined by the trajectory. For each such view, the area covered by each object on the virtual image plane is determined. Figure 6 shows a plot of all those areas along a trajectory passing the university cafeteria. One can see typical 'peaked' curves that are generated as objects appear, grow larger and finally disappear as the viewing position passes by. Those curves can be used to find out if an object is visible early
Fig. 6. Left: Location of a virtual trajectory (white line) passing the university cafeteria (marked with ’3’). Right: Plot of all pixel counts along this trajectory (each abscissa step corresponds to 2 m real world distance).
enough, if it is visible throughout the entire manoeuvre, and if it covers enough area in the virtual image plane. For example, in Figure 6, the first four peaks in the plot on the right correspond to the buildings marked with '1' to '4' in the top view on the left. Also, one can clearly distinguish the few, isolated peaks in the left half of the plot from the many, very dense peaks in the right half, caused by the second half of the trajectory, which passes along a narrow street with many buildings standing closely together (marked '5' in the top view).

Integration into Navigation Databases

If an object is identified as being a landmark and additionally fulfills all requirements regarding its visibility, it can be used in a driving instruction. It is important to note that all the required computations can be done beforehand and there is no need to do them online on the navigation system itself; i.e., no cadastral maps or digital surface models are required in the end device. This can be accomplished by automatically investigating all junctions and all possible manoeuvres associated with those junctions, i.e. 'turn right', 'keep straight', etc. For each of those manoeuvres, the suitability of landmarks can be assessed. In order to integrate landmark-based instructions into navigation systems, one simply has to extend the navigation matrices present in today's databases. Nowadays, those matrices identify which manoeuvres are allowed, which is done using a matrix of Boolean values. However, if we replace the entries with command codes which represent instructions such as 'turn right after the church', landmark-based guidance can be derived. It is worth noting that no structural change in the database is necessary to achieve this.
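To make the matrix extension concrete, here is a minimal sketch. The enum values, the landmark ids (4711, 815) and the tuple layout are purely illustrative assumptions, not the format of GDF, ATKIS or any deployed navigation database.

from enum import Enum

class Cmd(Enum):
    FORBIDDEN = 0                  # manoeuvre not allowed (was: False)
    PLAIN = 1                      # allowed, no landmark available (was: True)
    TURN_RIGHT_AFTER_LANDMARK = 2  # e.g. "turn right after the church"
    TURN_LEFT_AFTER_LANDMARK = 3
    STRAIGHT_PAST_LANDMARK = 4

# Navigation matrix of one junction: rows = incoming road, columns = outgoing road.
# Instead of True/False, each cell carries a command code plus an optional landmark id.
junction_matrix = [
    [(Cmd.FORBIDDEN, None), (Cmd.TURN_RIGHT_AFTER_LANDMARK, 4711), (Cmd.PLAIN, None)],
    [(Cmd.STRAIGHT_PAST_LANDMARK, 4711), (Cmd.FORBIDDEN, None), (Cmd.PLAIN, None)],
    [(Cmd.PLAIN, None), (Cmd.TURN_LEFT_AFTER_LANDMARK, 815), (Cmd.FORBIDDEN, None)],
]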
3 Conclusion and Future Work

In this paper, we have presented a concept for the automatic extraction of landmarks from GIS databases. The current approach of determining potential landmarks with a modified ID3 algorithm has some drawbacks: even though singular objects are identified effectively by the process, problems arise if there are several similar objects (for example two churches) and each would qualify as a potential landmark at a decision point. The algorithm detects only objects which exist just once in the analyzed neighborhood, so duplicated objects will never be proposed as potential landmarks. This is due to the fact that, at the moment, the comparison of the decision trees is done on a very simple basis: we only consider the first branching in the decision tree. Therefore, a more complex analysis of the resulting trees is needed to avoid the disadvantage of discarding similar objects from the list of potential landmarks. Furthermore, it has to be evaluated whether the attributes used are sensible and properly weighted for the analysis process. For example, the function of the building has a very dominant impact on the results because of the nominal code list of function types. Furthermore, the use of the visibility information has to be improved and fully integrated: it is not sufficient to provide a generic visibility of objects; the extent and distance of the object with respect to the decision point also have to be taken into account. In the near future, we will also investigate unsupervised clustering approaches and compare their results to the results obtained with ID3. Regarding the analysis of visibility, there are many possibilities for improvement. So far, we used the "virtual image size" to rate an object's visibility. However, from the virtual image, one can also obtain information on the distance, on whether the object sticks out behind another, closer object, whether it is part of the silhouette, and whether it is close to the center of view. First pulse laser scan measurements could be integrated to get a better approximation of the occlusion caused by trees. The DSM could also be used to feed additional information into the extraction; for example, small towers sticking out behind a larger building could be identified. The implementation of the visibility tracking could also use equidistant time sampling instead of space sampling, based on assumed vehicle speeds in the vicinity of intersections.
References

Burnett, G. E. [1998], Turn Right at the King's Head: Drivers' requirements for route guidance information, PhD thesis, Loughborough University.
Burnett, G., Smith, D. & May, A. [2001], Supporting the Navigation Task: Characteristics of 'Good' Landmarks, in: M. A. Hanson, ed., 'Contemporary Ergonomics 2001', Taylor and Francis, pp. 441-446.
Deakin, A. K. [1996], 'Landmarks as Navigational Aids on Street Maps', Cartography and Geographic Information Systems.
Elias, B. [2003], Extracting Landmarks with Data Mining Methods, in: W. Kuhn, M. Worboys & S. Timpf, eds, 'Spatial Information Theory: Foundations of Geographic Information Science', Springer Verlag, pp. 398-412.
Fayyad, U. M., Piatetsky-Shapiro, G., Smyth, P. & Uthurusamy, R., eds [1996], Advances in Knowledge Discovery and Data Mining, AAAI Press/The MIT Press, Menlo Park, California.
Geographic Data Files 3.0 [1995], Technical report, European Committee for Standardization, CEN TC 278.
Han, J. & Kamber, M., eds [2001], Data Mining: Concepts and Techniques, Morgan Kaufmann.
Lohr, U. [1999], High Resolution Laserscanning, not only for 3D-City Models, in: D. Fritsch & R. Spiller, eds, 'Photogrammetric Week 99', Wichmann Verlag, pp. 133-138.
Lovelace, K., Hegarty, M. & Montello, D. [1999], Elements of Good Route Directions in Familiar and Unfamiliar Environments, in: C. Freksa & D. Mark, eds, 'Spatial Information Theory: Cognitive and Computational Foundations of Geographic Information Science', Springer Verlag, pp. 65-82.
Nothegger, C. [2003], Automatic Selection of Landmarks, Master's thesis, Technical University of Vienna.
Quinlan, J. R. [1986], 'Induction of Decision Trees', Machine Learning.
Raubal, M. & Winter, S. [2002], Enriching Wayfinding Instructions with Local Landmarks, in: M. Egenhofer & D. Mark, eds, 'Geographic Information Science', Vol. 2478 of Lecture Notes in Computer Science, Springer Verlag, pp. 243-259.
Sorrows, M. & Hirtle, S. [1999], The Nature of Landmarks for Real and Electronic Spaces, in: C. Freksa & D. Mark, eds, 'Spatial Information Theory: Cognitive and Computational Foundations of Geographic Information Science', Springer Verlag, pp. 37-50.
Winter, S. [2003], Route Adaptive Selection of Salient Features, in: W. Kuhn, M. Worboys & S. Timpf, eds, 'Spatial Information Theory: Foundations of Geographic Information Science', Springer Verlag, pp. 320-334.
Witten, I. H. & Eibe, F. [1999], Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, San Francisco.
Zhao, Y. [1997], Vehicle Location and Navigation Systems, Artech House, Inc., Boston, London.
Towards a Classification of Route Selection Criteria for Route Planning Tools

Hartwig Hochmair
University of Bremen, Cognitive Systems Group, PO Box 330 440, D-28334 Bremen, Germany, [email protected]
Abstract

Route planners are tools that support the navigator in selecting the best route between two locations. Solving a route choice problem involves sorting and ranking of alternatives according to underlying evaluation criteria and decision rules. Using an appropriate classification of route selection criteria in the user interface is an important ingredient for user friendly route planners. The paper presents a method for assessing a hierarchical structure of route selection criteria for bicycle route planning tasks along with data from two empirical studies. The first study investigates route selection criteria that are relevant for bicycle navigation in urban environments. The second study reveals preferred classification schemata for these criteria. The presented methodology can be adopted for other transportation domains, such as car or pedestrian navigation.

Keywords. Route selection criteria, classification, spatial decision support, user interface design, bicycle navigation
1 Introduction

Most of the current bicycle route planners apply a fixed criterion optimization function (Ehlers et al. 2002; MAGWIEN 2004) or offer preference statement functionality only between a limited set of route selection criteria, such as short, fast, scenic, or avoiding slopes (Rad.RoutenPlaner 2003; MVEL 2004). Previous work gives evidence that human navigators are not exclusively shortest path or least time decision makers (Golledge 1997; Hyodo et al. 2000; Hochmair 2004). Thus, the user of a route planner should be offered the possibility to select from a larger range of route
selection criteria. We address route selection within the framework of multi-attribute decision making (MADM), which involves a single objective and a limited number of choice alternatives (Malczewski 1999). The objective "find best route" can be measured in terms of several evaluation criteria. The first study of this paper will reveal the relevance of various route selection criteria with respect to this objective. Despite the user's demand for additional route selection criteria in bicycle route planners, the number of offered criteria from which the user can select must be kept small due to limited human cognitive capacities in information processing (Miller 1956; Rosch 1978). Thus, designers of a user interface need to find a compromise between simplicity and more detailed functionality. An appropriate classification of route selection criteria provides the basis for intelligent user interfaces that adapt their functionality to the user's current demand for detail: preference statements between a small number of more general higher-level attributes will result in a good route suggestion after a small number of interactive steps. Additional preference statements between more detailed lower-level criteria would allow the decision maker to refine her query. The second empirical study in this work investigates how participants hierarchically structure a given set of 35 route selection criteria. These findings provide the starting point for describing a method which derives a single final classification from a set of given classification suggestions, where the final classification contains a reasonable number of criterion classes and provides a good "average" classification of the suggestions. The paper is structured as follows: Sections 2 and 3 describe two empirical studies about relevant route selection criteria for cyclists and their suggested classifications. Section 4 presents a guideline for deriving one single final classification from a set of given classification suggestions. Section 5 introduces a method for intra-class weighting of member attributes of a criterion class, and Section 6 summarizes the findings and presents directions for future work.
2 Study 1: Evaluation Criteria

The set of evaluation criteria included in a decision support system should be complete to cover all the important aspects of the decision problem (Keeny and Raiffa 1993; Malczewski 1999). So far route selection criteria for cyclists have only been roughly sketched in the context of very specific applications, such as urban planning (Hyodo et al. 2000), or Web based bicycle route planning for tourists (Ehlers et al. 2002). To provide a useful
classification of route selection criteria, a more comprehensive list of bicycle route selection criteria is required, which we achieved by an internet survey. The participants in the survey took the role of a cycling tourist in an unfamiliar city who wants to find the best route to a given restaurant. Participants were asked to enter the criteria they would consider in their route choice as free text in the questionnaire. The importance of each mentioned route selection criterion had to be stated by a score value between 1 (quite unimportant) and 4 (very important). Table 1 summarizes the 42 completed questionnaires, i.e., the mentioned route selection criteria ranked by their summed score values. The most prominent route selection criterion was "bike lane" (mentioned by 78% of the participants), followed by "short", "sights", and "avoid heavy traffic".

bike lane, short, sights, avoid heavy traffic, parks, side streets, avoid steep street segment, simple, fast, good signage, good street condition, lakes and rivers, prominent buildings and LM, few intersections, snack bar, safe area, few traffic lights, avoid pedestrian area, main road, no wrong enter of one-ways, lighted at night, safe, avoid tunnel, straight, avoid city center, shopping streets, city center, avoid public transport, nice bridges, avoid roundabout, avoid busy intersections, avoid controls by authorities, nice view, interesting route, avoid construction sites

Table 1. Route selection criteria for bicycle navigation in urban environments (ranked by summed score values)
The route selection criteria in Table 1 are of varying generality. Some denote more general demands, such as “safe” or “interesting” and can be split into several lower-level attributes. Other ones, such as “short” or “few intersections”, are more focused and describe a measurable effect.
3 Study 2: Classification Task

The decision maker's objective ("find best route") and the related route attributes form a hierarchical structure of evaluation criteria. We expect that an appropriate classification which pre-sorts the lower-level criteria with
their effects on a route into criterion classes will reduce the user's mental effort in stating her preferences. Finding an adequate value function over a set of route selection criteria, which is needed for the implementation of a route search algorithm that provides trade-off functionality, requires a complete set of measurable lower-level attributes. The presented classification study will provide such a comprehensive set as part of its results. This paper will not discuss the assessment of value functions, as the importance values between the involved attributes depend on the range of attribute scores of the choice alternatives at hand (Keeny and Raiffa 1993) and on the user's subjective thresholds for accepting attribute scores (Srinivasan 1988). Cluster analysis (Hartigan 1975) sorts cases (e.g., higher-level criteria) into groups (clusters) based on selected characteristics, so that the degree of association is strong between members of the same cluster. However, cluster analysis cannot replace empirical investigation, as the best number of criterion classes to be used in a route planner, as well as the cases that should be clustered, are not known a priori. Factor analysis (Backhaus et al. 1996) attempts to identify underlying variables (factors) that explain the patterns of correlations within a set of observed variables. The disadvantage of factor analysis is that the method would require explicit route suggestions to be evaluated by the participants, which would affect the results.

3.1 Task description

The list of route selection criteria from Table 1 was handed out to 12 participants who were asked to classify all criteria into either three, four, five, or six classes. The participants could either re-use criteria names as class names or create their own class names if none of the terms in the list matched their concept of a particular class. Further, the participants had to mark whether a lower-level criterion was positively (+) or negatively (-) oriented towards the class. As lower-level attributes could be assigned to several classes by each participant, an attribute could be stated as being positively oriented with one class and negatively oriented with another. For example, participants stated that the criterion "avoid pedestrian area" makes a route faster (positive orientation wrt. the class "fast") but at the same time less attractive (negative orientation wrt. the class "attractive").
3.2 Results

Participants suggested ten different class names in the classification study (Fig. 1a). The class names "fast" and "safe" were mentioned by all participants, followed by "simple" (67%) and "attractive" (58%). Most participants (33%) used four classes (Fig. 1b). Thus, four seems an appropriate number of classes for the final classification. It is interesting that none of the participants suggested the prominent criterion "short" as its own class. This finding may be explained by the fact that "short" cannot be decomposed into further attributes.
Fig. 1. Classes mentioned in the classification task (a), and distribution of used number of classes (b)
Next we analyzed for each class the membership structure of the included criteria. Fig. 2 shows a part of the membership structure for the classes "fast" and "safe". A value of 100% for an attribute a in class C means that in all classifications where class C has been mentioned, attribute a has been assigned to class C. It does therefore not necessarily mean that all participants assigned attribute a to class C (as not all participants may have mentioned class C).
Fig. 2. Membership structure for the classes “fast” and “safe”
The result in Fig. 2 is in principle independent of the global importance of a route selection criterion (Table 1) and therefore not bound to any specific wayfinding situation or task. However, the wayfinding task does
have a small impact on the found membership structure of a class, as only those criteria which have been mentioned in connection with a given case scenario (see study 1) were presented to the participants of the classification study.
4 Finding the Final Classification

4.1 Guideline

This section presents an informal guideline for obtaining a single representative classification from a set of classification suggestions, demonstrated with the data from the two previous studies. Whether a suggested class should be kept for the final classification or not depends on several factors. A major factor is the frequency with which a class has been mentioned in the classification study. Frequently mentioned classes represent intuitive higher-level criteria and should therefore be kept as such in the user interface. According to this rule, the classes "fast", "safe", "simple", and "attractive" are candidates for the final classification (see Fig. 1a). A second factor is the class size, where we suggest that classes that comprise a high number of lower-level attributes should be kept. The third factor concerns the similarity of the final classes: as one of the demands on a good classification of criteria is non-redundancy (Keeny and Raiffa 1993), pairs of criterion classes that share many common attributes should be avoided in the final classification and should be merged.

4.2 Class size

We define the size of a class C as the ratio between the attribute assignments to C actually made by the participants and the number of theoretically possible assignments to C. A (hypothetical) score of 100% for C thus means that all participants assign all 35 criteria from Table 1 (each used both as a positively and as a negatively oriented criterion) to C. The class size correlates with the class frequency. Fig. 3 shows the computed size of all 10 mentioned classes. According to the ranking of classes by class size, again the higher-level attributes "fast", "safe", "simple", and "attractive" should be kept for the final classification.
Fig. 3. Class sizes computed from the number of attribute assignments made to each class
4.3 Class similarity

We define the similarity of two criterion classes over the presence or absence of assigned criteria in each of these classes, i.e., over binary similarity measures. Because participants assigned only a small number of criteria to each class, many zeros appear in the membership tables. Therefore, we use a binary similarity measure that excludes double zeros, i.e., the Tanimoto (Jaccard) coefficient (Backhaus et al. 1996). Tversky's ratio model (Tversky 1977) defines a normalized similarity measure between objects as a linear combination of the measures of their common and distinctive features; setting α = β = 1 in the ratio model yields the Tanimoto coefficient. Fig. 4 shows the lower part of the symmetric similarity matrix containing the Tanimoto coefficient T for each pair of the ten suggested classes. Class pairs with T ≥ 0.20 are regarded as the most correlated ones. Ideally, final classes should be independent and not share any attributes. In this case changes in the stated intra-class weightings of one class would not affect the intra-class weightings of another class. This is in principle still possible with partly overlapping classes, but the decision maker then needs to mentally separate the effects of changing the weight of a lower-level criterion in one class from the other higher-level classes that contain the same attribute. Preference statements between uncorrelated higher-level criteria would allow the user to define precisely the direction of the objective "best route", which is more complex with correlated higher-level criteria.
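A small sketch of this similarity computation follows, assuming class memberships are represented as sets of (criterion, orientation) assignments; the helper function and the data layout are ours, not the paper's.

def tanimoto(assignments_a: set, assignments_b: set) -> float:
    """Jaccard/Tanimoto coefficient over two sets of criterion assignments.

    Each set contains e.g. ("avoid heavy traffic", "+") tuples; double zeros
    (criteria assigned to neither class) do not enter the measure by construction.
    """
    union = assignments_a | assignments_b
    if not union:
        return 0.0
    return len(assignments_a & assignments_b) / len(union)

# Hypothetical example: two small membership sets.
fast = {("short", "+"), ("few traffic lights", "+"), ("avoid heavy traffic", "+")}
safe = {("bike lane", "+"), ("avoid heavy traffic", "+"), ("lighted at night", "+")}
print(tanimoto(fast, safe))  # 1 shared of 5 distinct assignments -> 0.2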
The lower triangle of the symmetric similarity matrix (column order equals row order):

fast        1.00
safe        0.29  1.00
sights      0.01  0.04  1.00
comfort     0.11  0.13  0.14  1.00
simple      0.16  0.20  0.06  0.09  1.00
convenient  0.11  0.13  0.05  0.29  0.07  1.00
economic    0.01  0.00  0.00  0.00  0.00  0.00  1.00
attractive  0.15  0.23  0.22  0.24  0.11  0.20  0.00  1.00
aesthetic   0.06  0.08  0.37  0.22  0.05  0.12  0.03  0.35  1.00
stopover    0.01  0.01  0.03  0.14  0.00  0.15  0.00  0.05  0.07  1.00

Fig. 4. Similarity matrix showing the Tanimoto coefficient between all suggested classes
Each of the four preliminary favorite classes for the final classification ("fast", "safe", "simple", and "attractive") shares a similarity measure ≥ 0.2 with at least one other (Fig. 4). The highest similarity coefficient between favorite classes is found between "fast" and "safe" (0.29), which is partly caused by the fact that the lower-level criteria "parks" (6;7), "avoid heavy traffic" (6;7), and "avoid construction sites" (7;7), which are not the most characteristic members of either class (Fig. 2), have been assigned almost equally often to both classes. In general, since double zeros are not counted by the Tanimoto coefficient, it is not a high number of common attribute assignments but rather a similar pattern of assignments that increases the coefficient. Seen from the aspect of class similarity (Fig. 4), the two classes "fast" and "safe" should therefore be merged in the final classification. A strong argument for keeping these two classes separate is that both have been mentioned by all respondents (Fig. 1a) and that both have a large class size (Fig. 3). Similar considerations concerning class similarity, class size, and class frequency need to be made for the remaining three "favorite" classes. Finally we decide to keep "fast", "safe", "simple", and "attractive" as the classes of the final classification.
5 Class Structure

Once a final set of higher-level classes is found, the intra-class importance of class members may be used in combination with a threshold for show-
ing or hiding a route selection criterion in an adaptive user interface. If the user demands simple user interface functionality, preference statements should at least be possible between the higher-level criteria. With the user's increased demand for more detailed intra-class preference statement functionality, the order in which additional class member attributes are shown in the user interface should take into account several impacts, such as global importance (Table 1) or class memberships (Fig. 2). Further, it is relevant whether an offered route feature is actually part of one of the route alternatives at hand. Table 2 shows the result of a suggested function (Eq. 1) that ranks member attributes of the four final higher-level classes according to the two previously mentioned impacts, assuming that the attributes are actually found in the choice alternatives at hand. The relevance value r suggests a default intra-class importance measure for a member attribute with respect to the corresponding higher-level class; r̄ denotes the normalized relevance value.
fast: short 1.00; few traffic lights 0.50; avoid pedestrian area 0.33; few intersections 0.29; avoid heavy traffic 0.26; avoid city center 0.24; straight 0.24; main road 0.20; good street condition 0.14; avoid steep street 0.11

safe: bike lane 1.00; safe area 0.63; lighted at night 0.55; avoid heavy traffic 0.50; good street condition 0.46; avoid busy intersections 0.35; avoid public transport 0.28; no wrong enter of one-ways 0.28; avoid roundabout 0.25; avoid steep street 0.23

simple: good signage 1.00; few intersections 0.39; bike lane 0.36; prominent buildings and LM 0.33; straight 0.32; main road 0.19; short 0.14; avoid roundabout 0.12; lighted at night 0.12; no wrong enter of one-ways 0.11

attractive: sights 1.00; parks 0.77; lakes and rivers 0.58; nice view 0.40; nice bridges 0.37; city center 0.27; good street condition 0.22; snack bar 0.20; avoid tunnel 0.15; prominent buildings and LM 0.15

Table 2. Normalized relevance values (r̄) for the ten most relevant members of the four final classes
r_{a,h} = 0 if m_{a,h} = 0, and otherwise

    r_{a,h} = \sqrt[3]{\omega_a} \, \sum_{n=1}^{N} S_{n,h} \, m_{a,n}^{2}        (Eq. 1)

where n = 1, ..., N runs over all classes mentioned in the classification study.
The computation of the relevance value r for an attribute a with respect to a final higher-level class h considers the global importance of a (ω_a), the grade of membership of the attribute in all suggested classes n (m_{a,n}), and the Tanimoto coefficient T_{n,h} between all suggested classes and h (T_{n,h} is part of S_{n,h}). For attribute members which have a high degree of membership in eliminated classes only (e.g., the attribute "city center" as a member of the eliminated class "aesthetic") the similarity (i.e., the Tanimoto coefficient) between this eliminated class and the final classes needs to be considered. Otherwise the effect of such lower-level attributes would be underestimated in the final classification structure. However, if in the classification study the assigned degree of membership of a in h amounts to zero, we decided to keep this zero value despite any class similarities (first case in Eq. 1). To avoid double counting of effects that are caused by the same lower-level criterion in several final classes, we introduce S_{C1,C2}, a modified similarity measure between two classes C1 and C2: S_{C1,C2} equals 0 if C1 ≠ C2 and both C1 and C2 are classes of the final classification; otherwise S_{C1,C2} equals T_{C1,C2}. Testing various scaling factors for the impact of ω_a and m_{a,n} led to the intuitive finding that the impact of ω_a should be reduced compared to linear weighting (using the cubic root), whereas the impact of m_{a,n} should be increased (using the square). Although the global weights used (Table 1) refer to tourist navigation in urban environments, and the importance weights of criteria may change for a different wayfinding task or a different type of traveler (Bovy and Stern 1990), the distribution of relevance values in each of the final classes will not change dramatically due to the use of the cubic root for ω_a. The goal of the classification method was to find uncorrelated final classes that share only a small number of overlapping member attributes. Despite this goal, a few attributes, such as "bike lane", appear within the top ten in several final classes (Table 2). This is not problematic, as long as the number of these shared attributes is small enough that the user is able to mentally distinguish between the resulting effects on the route, i.e., with respect to the higher-level classes, during her preference statements.
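The relevance computation of Eq. 1 can be sketched as follows; the data structures (dictionaries keyed by class names) and the function name are illustrative assumptions, not the author's implementation.

def relevance(attr_class, global_weight, membership, tanimoto, final_classes):
    """Relevance r_{a,h} of an attribute for the final class attr_class (Eq. 1).

    global_weight: omega_a, the summed score of the attribute from study 1.
    membership[n]: degree of membership m_{a,n} of the attribute in suggested class n.
    tanimoto[(n, h)]: Tanimoto coefficient T_{n,h} between classes n and h.
    """
    if membership.get(attr_class, 0.0) == 0.0:
        return 0.0
    total = 0.0
    for n, m_an in membership.items():
        # Modified similarity S_{n,h}: 0 between two *different* final classes,
        # otherwise the Tanimoto coefficient (and S_{h,h} = T_{h,h} = 1).
        if n != attr_class and n in final_classes:
            s_nh = 0.0
        else:
            s_nh = tanimoto[(n, attr_class)]
        total += s_nh * m_an ** 2
    return global_weight ** (1.0 / 3.0) * total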
6 Conclusions and Future Work

Along with data from two empirical studies, this work presented a method for finding a set of higher-level criteria (factors) that cover the objective of finding a best bicycle route in urban environments. Though referring to
this specific domain, we expect that the presented approach can also be applied to the hierarchical structuring of the criterion space of other transportation domains (e.g., car or pedestrian navigation). The work presented an intuitive intra-class ranking of route selection criteria for the final classes by introducing a relevance measure. However, the question of which criteria should actually be shown in the user interface is tricky, as the actual importance of criteria also depends on the range of attribute scores of the alternatives at hand and on contextual parameters, such as the user's familiarity with the environment. The assignment of importance weights may even be impossible for the user if no score ranges are known, and it may lead to inconsistencies if too many attributes are presented at the same time to the decision maker (Morris and Jankowski 2000). Requesting the user preferences within an interactive dialogue (Robinson 1990) would have the advantage that the user would need to consider only a small number of criteria at a time, and that additional criteria presented to the user could be tailored to the results of previous screening phases. That is, unnecessary requests for user preferences that have no effect on the outcome of the route selection algorithm could be avoided (e.g., the request for the importance of bike lanes if there aren't any in the area of interest). Future work needs to develop context-dependent methods that hide irrelevant route selection criteria from the user interface and present only those functionalities that are of interest for the user at the current state of interaction. Hiding or offering route choice criteria is closely connected to the user's preferred sequence of interactive steps and the preferred level of detail in each subsequent step. Whenever a refined query is submitted, the user should be given relevant information about the resulting pre-screened choice alternatives in order to be able to build a conceptual model of the existing alternatives and to assign importance weights to each offered criterion. Dynamic updates and continuous feedback (similar to sensitivity analysis) will give the user the chance to assess the consequences of her changed preference statements and to make her choice under a higher degree of certainty.
Acknowledgements This research has been funded by the IQN grant #40300059 from the German Academic Exchange Service (DAAD).
Bibliography

Backhaus K, Erichson B, Plinke W, Weiber R (1996) Multivariate Analysemethoden. Berlin, Springer
Bovy PHL, Stern E (1990) Route choice: Wayfinding in transport networks. Dordrecht, Kluwer Academic
Ehlers M, Jung S, Stroemer K (2002) Design and Implementation of a GIS Based Bicycle Routing System for the World Wide Web (WWW). Spatial Data Handling 2002, Ottawa
Golledge RG (1997) Defining the Criteria Used in Path Selection. In: Ettema DF, Timmermans HJP (eds) Activity-Based Approaches to Travel Analysis. Elsevier, New York, pp 151-169
Hartigan JA (1975) Clustering algorithms. New York, John Wiley & Sons
Hochmair H (2004) Decision support for bicycle route planning in urban environments. In: Proceedings of the 7th AGILE Conference on Geographic Information Science. Crete University Press, Heraklion, Greece, pp 697-706
Hyodo T, Suzuki N, Takahashi K (2000) Modeling of Bicycle Route and Destination Choice Behavior for Bicycle Road Network Plan. Annual Meeting of Transportation Research Board
Keeny RL, Raiffa H (1993) Decision Making with Multiple Objectives: Preferences and Value Tradeoffs. Cambridge, UK, Cambridge University Press
MAGWIEN (2004) Magistrat Wien: Routensuche für Radfahrer. http://www.wien.gv.at/stadtentwicklung/radwege [accessed 08/04/2004]
Malczewski J (1999) GIS and Multicriteria Decision Analysis. New York, John Wiley
Miller GA (1956) The magical number seven, plus minus two: Some limits on our capacity for processing information. Psychological review 63: 81-97
Morris A, Jankowski P (2000) Combining fuzzy sets and fuzzy object oriented spatial databases in multiple criteria spatial decision making. Flexible Query Answering Systems: Recent Advances: 103-116
MVEL (2004) Ministerium für Verkehr, Energie und Landesplanung NRW: Radroutenplaner NRW. http://www.radroutenplaner.nrw.de [accessed 08/04/2004]
Rad.RoutenPlaner (2003) Software CD ROM. http://www.tvg-software.de [accessed 08/04/2004]
Robinson VB (1990) Interactive Machine Acquisition of a Fuzzy Spatial Relation. Computers and Geosciences 16(6): 857-872
Rosch E (1978) Principles of categorization. In: Rosch E, Lloyd BB (eds) Cognition and categorization. Erlbaum, Hillsdale, NJ, pp 27-48
Srinivasan V (1988) A Conjunctive-Compensatory Approach to the Self-Explication of Multiattributed Preferences. Decision Sciences 19: 295-305
Tversky A (1977) Features of Similarity. Psychological Review 84(4): 327-352
An Algorithm for Icon Labelling on a Real-Time Map

Lars Harrie¹, Hanna Stigmar¹, Tommi Koivula² and Lassi Lehto²
¹National Land Survey of Sweden, SE-801 82 Gävle, Sweden, [email protected], [email protected]
²Finnish Geodetic Institute, P.O. Box 15, FI-02431 Masala, Finland, [email protected], [email protected]
Abstract

An algorithm has been developed for icon labelling on a real-time map. The algorithm is based on a least-disturbing definition and positions the icons in an area where they obscure the cartographic data as little as possible. To find this area the algorithm performs a spiral search on a precomputed grid. The computational complexity is low, which makes it possible to use the algorithm in real-time applications. A case study has been performed which demonstrates the improvement in icon labelling achieved by the algorithm.

Keywords: Real-time map, label placement, icon, small-display cartography.
1 Introduction

The development of computers during recent decades, and the increasingly widespread use of computers and the Internet on a regular basis, has contributed to revolutionary developments in the field of mapping and map use. Screen maps provide possibilities that could never be offered by standard paper maps. With the aid of computers, it is possible to design maps to suit a specific purpose and integrate them into services, such as route finders and Yellow Pages. The rapid development of mobile terminals, as well as positioning techniques, has made a new category of position-dependent, map-based services available to mobile users (Nissen et al. 2003). The small size of the display in these devices limits the types of functionality and applications
that can be used. These problems are similar to the problems of larger-screen devices, but with decreasing screen size the problems become even more obvious. However, as there is a user demand for small-display devices, they are likely to remain in use (Öquist and Goldstein 2003). In order to provide a good user interface on these devices, map readability is an important factor. Resolution will probably increase with time while size will remain the same. To satisfactorily present cartographic data on a small display the following operations are needed (cf. Elvins and Jain 1998):
- Extraction of data from databases and their preparation for display. This includes specification of the visual attributes that help users to differentiate between certain pieces of information (such as colour, line thickness, etc.) (Mayhew 1992).
- Specification of the order for overlaying successive data on a map and, if necessary, generalising the cartographic data and integrating it with service data.
- Implementation of browsing, zooming, and panning functions.
A map that fulfils these three requirements is, in this paper, denoted a real-time map. This paper focuses on icon labelling on such a real-time map. When adding new features (such as icons) to a map, their position in relation to existing map objects is very important. If the new features are positioned so that they overlap existing map objects, interpretation of the objects may be impaired. The positioning of new features, such as icons, must therefore be carefully considered. The paper starts with a general introduction to icon labelling on a map. This section includes a short review of previous work related to icon labelling. Section 3 contains a definition of the least-disturbing position for an icon. The following section describes a computationally fast algorithm for positioning icons in accordance with this definition. In section 5, a system architecture is presented for the implementation of the algorithm for real-time map applications. The algorithm has been implemented and evaluated in a case study, which is described in section 6. The paper is concluded with a short discussion and conclusions.
2 Positioning icons on a map

Positioning icons can be regarded as a special case of map labelling. Good label placement helps a map user to interpret the information given by the map. It is, however, difficult to measure the impact of good label place-
ment, since it depends on the map user’s preferences. Some guidelines for label placement are: labels should be easy to read and locate, it should be obvious to which object a label belongs; labels should not interfere with other objects or labels and they should be positioned in the best position possible (Kakoulis and Tollis 1998). Figure 1 illustrates two examples of icon positioning on maps. The left example shows icon positioning that does not follow all of the guidelines described above. Here, the icons are positioned without taking the underlying map objects into consideration. Many real-time maps published on the Internet suffer from similar deficiencies. The right example shows a map where the icons are less disturbing.
Fig. 1. Two examples of real-time maps with icons added. (The right one was created by Nissen et al. 2003).
In cartographic applications three types of labelling are used: point labelling, line labelling, and area labelling. This paper considers point and line labelling, but methods similar to those presented here could also be used for area labelling. (Icon labelling for area features has similarities with positioning other types of information on maps, such as pie charts; see e.g. recent findings by van Kreveld et al. 2004.) Few studies have been made of algorithms for icon labelling in cartographic applications. As the problem is related to text labelling and, to some extent, also to cartographic generalisation, we will review some algorithms in these two fields. Extensive research has been carried out to create algorithms that implement general rules for text placement (see bibliography in Wolff 2003). Some of these algorithms include the possibility of evaluating whether the text conceals part of underlying cartographic symbols and not only other text. For example, Zoraster (1997) defines point text placement as a combinatorial optimisation problem. An objective function is defined using constraints on label-label and label-symbol overlaps. A search space is then constructed from the permitted positions of the labels. Finally, the
minimum of the objective function (i.e. the optimal label positions in conformity with the constraints) is computed using a minimising strategy (simulated annealing). Strijk and van Kreveld (2002) use a slider model for point text labelling. The slider model allows not only fixed positions for the labels (as in combinatorial optimisation), but also all positions that touch the point. The authors present a fast algorithm (suitable for a real-time map) that considers underlying cartographic objects in the positioning of the labels. In a real-time map it could be justifiable to move the cartographic objects to make space for the icon, since in many real-time maps the visual aspects are more important than the geometrical properties. If the cartographic objects are allowed to be moved, the point labelling problem is related to the displacement problem in cartographic generalisation. Several optimisation techniques have been proposed for performing automatic generalisation (e.g. Sester 2000, Bader 2001, Harrie and Sarjakoski 2002). All of these methods are based on a number of analytical constraints, where some are defined for single objects and others for groups of objects. The constraints define an objective function, which in turn defines a continuous optimisation problem. The solution to these optimisation problems then gives the optimal displacement in conformity with the constraints. A major problem associated with these methods is the computational complexity, which makes them difficult to use in real-time applications. It is, however, possible to use some of these text labelling and displacement methods for positioning icons. The choice of method is mainly dependent on the definition of good icon labelling. In this paper we define a least-disturbance principle which can be implemented by quite a simple algorithm (see section 4). Another definition of ideal icon labelling might require more advanced techniques, e.g. optimisation techniques.
3 A definition of the least-disturbing position of an icon on a map

There are several definitions of an optimal position for the icon label in relation to its referent. In this paper we use a least-disturbance definition that requires that the icon be placed in such a position that it obscures as little information as possible. There are many ways to define the least-disturbing position for an icon on a map. Our definition is based on the points that form the cartographic objects, denoted cartographic points below. The definition is as follows (cf. Figure 2):
The disturbance value is equal to the sum of the weighted cartographic points that are covered by the icon, where the weights are numbers that reflect the importance of the object to which the cartographic points belong. The best position of the icon is the position with the minimum disturbance value within a specific search distance from the original position of the icon (one possible formalisation is sketched after the list below). Some characteristics of this definition are:
- all positions within the search distance are given equal values (i.e., there is no weighting of the positions based on the distance the icon is moved),
- all directions of movement are considered to be equally good,
- possible overlaps of line segments or polygon areas are not taken into consideration, and
- the definition does not support movement of the cartographic objects to make space for the icons.
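One way to write this definition as a formula (the symbols D, P, w, d and (x_0, y_0) are our own shorthand, not notation used by the authors):

D(x, y) = \sum_{p \in P(x, y)} w_{c(p)}, \qquad (x^{*}, y^{*}) = \arg\min_{|x - x_0| \le d,\ |y - y_0| \le d} D(x, y)

where P(x, y) is the set of cartographic points covered by the icon when it is placed at (x, y), w_{c(p)} is the weight of the object class of point p, (x_0, y_0) is the original position of the icon, and d is the search distance.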
Fig. 2. Two restaurant icons are positioned on a simple map (containing only building objects). The icon position for the left restaurant contains two cartographic points which form building objects; i.e. the disturbance value will be twice the weight of the cartographic building points. The icon position for the right restaurant is optimal in the sense that the icon does not cover any cartographic points.
One objection to this definition is that a map consists not only of points but also contains line segments and polygon areas. The question arises as to whether the definition of disturbance value should also take into consideration the length of line segments and area of polygons covered by the icon. We believe that, generally, the algorithm should not consider line segments and areas. Experiments on object recognition have shown that the human mind easily interprets objects that have been degraded to the extent that deleted regions can be replaced by collinearity or smooth curvature. On the other hand, objects that have had their contours deleted at regions of concavity, and have altered vertices, are much more difficult to recover. See for example Biederman (1985) and Figure 3.
Fig. 3. Examples of two ways in which objects can be degraded. The left column shows the original intact object, the middle column shows objects that can easily be recovered, while the right column shows objects that are more difficult to recover. The objects in the middle column can be recovered by imagining collinearity or smooth curvature. The right column shows objects that are nonrecoverable because they have had their contours deleted at regions of concavity, and some vertices are altered. Reprinted from Biederman (1985, p. 57) with permission from Elsevier.
4 Algorithms for identification of the least-disturbing position for an icon

This section describes an algorithm for placing icons in the least-disturbing position. Here the least-disturbing position is defined as the place that gives the minimal sum of weights for the cartographic points. The algorithm requires the following information as input (the scalar and vector parameters refer to the pseudo-code below):
- the original position of the icon (i.e. the position of the referent; orginalX, orginalY),
- the size of the icon (below assumed to be a square with side length iconSideLength),
- the permitted movement of the icon from the original position (searchDistance),
- the resolution of the search space (i.e. the distance between two neighbouring search positions; p),
- the cartographic objects on the map (stored in the featureDataset), and
- the importance of each cartographic object type (defined in the weightDefinition).

A simple algorithm would involve moving the icon to each possible position and checking which cartographic points are covered by the icon for each position. The total number of possible positions for the icon is equal to n², where n is the number of permitted positions in each direction (which is about (2*searchDistance - iconSideLength)/resolution). Such an algorithm would have a computational complexity of O(n²*numPoints), where numPoints is the number of cartographic points. This algorithm will be slow if there are many cartographic points. It can be improved by sorting the cartographic points in the x and/or y directions or by using a point index structure such as the kd-tree (see, for example, de Berg et al. 1997), but here we use another approach.

The simple algorithm described above can be improved computationally by setting a constraint such that the resolution between the possible positions of the icon is an integer fraction (p) of the icon size. The only operations that are now necessary in order to find the "ideal" position for the icon are: (1) to define a grid (with grid size iconSideLength/p); (2) to sum the weights of the cartographic points in each pixel of the grid; and (3) to compute the total sum for each possible icon position by summing the weights of each pixel covered. In this paper we call this approach the grid algorithm. Below we present a function that implements this algorithm in object-oriented pseudo code. The function takes the parameters stated above as input and returns the least-disturbing position for the icon.

function gridAlgorithm(featureDataset, weightDefinition, iconSideLength,
                       orginalX, orginalY, searchDistance, p)
  numFeatures = featureDataset.getNumber()
  for i = 1:numFeatures
    feature = featureDataset.getFeature(i)
    if (feature.overlaps(orginalX, orginalY, searchDistance))
      weight = feature.getWeight(weightDefinition)
      addPointWeight(points, weights, feature, weight)
    end
  end
  numPoints = points.getNumber()
  step = iconSideLength/p
  n = floor((2*searchDistance-iconSideLength)/step)+1

  // Define the sum of weights for the grid
  for i = 1:numPoints
    a = floor((points(i).x - minSearchX)/step) + 1
    b = floor((points(i).y - minSearchY)/step) + 1
    gridValue(a,b) = gridValue(a,b) + weights(i)
  end
  // Find the position of the icon (specified by minX and minY)
  // with the smallest disturbance value
  [minX, minY] = integrateGridValues(gridValue, n, p)
  return [minX, minY]
end // End of function gridAlgorithm

where:
numFeatures is the number of features in the featureDataset,
numPoints is the number of cartographic points,
feature.overlaps is a method that returns the value true if the feature overlaps the search area,
points is a vector containing the cartographic points,
weights is a vector containing the weights for the cartographic points,
addPointWeight(…) is a function that adds the cartographic points and weights for the feature to the vectors points and weights,
gridValue(a,b) is a matrix that contains the total sum of point weights for pixel [a,b] in the grid,
minSearchX, minSearchY are the coordinates of the bottom left position of the search space,
[minX, minY] specifies the coordinates for the "ideal" position of the icon,
floor(z) is a function that returns the integer part of z, and
integrateGridValues(…) is a function that computes the disturbance value for each possible position of the icon (by summing values in the matrix gridValue) and returns the ideal position according to the least-disturbance principle.

The function utilises the fact that neighbouring window positions cover almost the same pixels (see Figure 4), and that the search is ordered as a spiral (see Figure 5). If a satisfactory position is found (disturbance value lower than a predefined threshold) the search is terminated. The spiral search method ensures that no other satisfactory positions can be found closer to the original position.
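As an illustration only (not the authors' implementation, which was written in Java as part of the GiMoDig API), the two core steps of the grid algorithm can be sketched in Python roughly as follows. The names mirror the pseudo-code above; a brute-force loop over candidate positions stands in for the spiral search and the incremental update described below.

import numpy as np

def grid_algorithm(points, weights, icon_side, original_x, original_y,
                   search_distance, p):
    """Sketch of the grid algorithm: weighted points are binned into a grid
    with cell size icon_side / p; each candidate icon position covers a
    p-by-p block of cells whose summed weight is its disturbance value."""
    step = icon_side / p
    n = int(np.floor((2 * search_distance - icon_side) / step)) + 1
    min_x = original_x - search_distance
    min_y = original_y - search_distance

    # steps (1)+(2): sum the weights of the cartographic points per grid cell
    grid = np.zeros((n + p - 1, n + p - 1))
    for (x, y), w in zip(points, weights):
        a = int(np.floor((x - min_x) / step))
        b = int(np.floor((y - min_y) / step))
        if 0 <= a < grid.shape[0] and 0 <= b < grid.shape[1]:
            grid[a, b] += w

    # step (3): evaluate every candidate position (no spiral order, no
    # incremental update here) -- the least-disturbing position wins
    best, best_pos = None, (original_x, original_y)
    for i in range(n):
        for j in range(n):
            disturbance = grid[i:i + p, j:j + p].sum()
            if best is None or disturbance < best:
                best = disturbance
                best_pos = (min_x + (i + p / 2) * step,
                            min_y + (j + p / 2) * step)
    return best_pos, best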
Figure 4. When the window is moved from one position (shown with dashed lines) to a new position (shown by the solid lines) only the grid values in the grey areas have to be added/removed to compute the disturbance value for the new icon position. Using this approach the function integrateGridValues(…) will run in O(n²*p) time.

Figure 5. The search starts in the original position of the icon (denoted by a double circle) and is then performed in a spiral fashion. The black dot denotes the current position of the icon and the icon is shown in the grey area (in this example a total of nine grid values are covered by the icon).

Figure 6. An icon can be placed along a line through repeated use of the grid algorithm. The first search is performed around the midpoint of the line (here shown by a box labelled 1). The following searches are then performed to the left and right along the line (boxes 2, 3, etc.).

For reasons of clarity the algorithm is somewhat simplified. For example, the spiral approach in the function integrateGridValues(…) requires that n be an odd number, which is not enforced in the pseudo-code. Furthermore, the code has only been written for one icon. If several icons are involved the code must be extended to determine their positions sequentially, and a subsequent icon is not permitted to overlap an icon that has already been positioned (this is implemented by assigning a large weight to the icon and adding the values to the matrix gridValue). The algorithm as stated above does not consider text. Icons should never overlap text; therefore disturbance values should be added to all cells in the matrix gridValue that are covered by a text label. Furthermore, the algorithm described here assumes that the icons are squares. The algorithm can be extended to include rectangular icons (but this implies that the search space must also be rectangular).

The function integrateGridValues(…) is the most computationally complex part of this algorithm. The total computational complexity of the grid algorithm is O(numPoints + n²*p) and the storage complexity is O(n²), i.e. the grid algorithm is numPoints/p times less complex than the simple algorithm.
For normal values of numPoints (about 100) and p (about 10), the grid algorithm is much faster than the simple algorithm described above. The grid algorithm should be fast enough to be used in most real-time map applications. The grid algorithm can also be used for positioning icons that represent line objects. The idea here is to perform repeated spiral searches along the line (Figure 6).
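The incremental update of Figure 4 can likewise be sketched (again only an illustration, assuming the grid array from the sketch in this section): when the p×p window moves by one cell, a single row or column of cells leaves the window and a single one enters it, so each move costs O(p) additions instead of O(p²).

def shift_window_sum(grid, current_sum, i, j, p, direction):
    """Update the disturbance value when the p-by-p window anchored at cell
    (i, j) moves one cell in +x ('right') or +y ('up'): subtract the cells
    that leave the window and add the cells that enter it."""
    if direction == 'right':
        current_sum -= grid[i, j:j + p].sum()        # column leaving
        current_sum += grid[i + p, j:j + p].sum()    # column entering
        return current_sum, i + 1, j
    if direction == 'up':
        current_sum -= grid[i:i + p, j].sum()        # row leaving
        current_sum += grid[i:i + p, j + p].sum()    # row entering
        return current_sum, i, j + 1
    raise ValueError('direction must be "right" or "up"')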
5 System architecture for implementation of the icon labelling algorithm

The grid algorithm should be implemented within a system architecture for real-time map services. The implementation requires cartographic data in vector format and icon data. In theory, the computations could be carried out by the client application, but since the processing capability of the client is often low, for instance in the case of mobile devices, icon positioning should preferably be done before the map is sent to the client. The study described in this paper is part of the EU project GiMoDig (Geospatial info-mobility service by real-time data integration and generalisation; GiMoDig 2004), which is intended to establish methods of distributing cartographic data from core databases at national mapping agencies to mobile devices (mainly following the Open GIS Consortium standards). Figure 7 is a simplified overview of the GiMoDig system architecture (see Lehto (2003) or Sarjakoski and Lehto (2003) for details). According to the GiMoDig service architecture, the client accesses the service on the Value-Added Service (VAS) layer. The VAS acquires Point Of Interest (POI) and Line Of Interest (LOI) data from a third-party source and includes it in the query to be sent to the GiMoDig Portal Service. The Portal Service translates the query and forwards it to the Data Processing Service (DPS). The main task of the DPS is spatial data generalisation. The DPS creates a Web Feature Service (WFS 2003) query to obtain the cartographic data in the form of a Geography Markup Language (GML) dataset (GML 2003). After the real-time generalisation, the results are still expressed as GML-encoded spatial data, also including the POIs and LOIs as GML features. After receiving the generalised dataset from the DPS, the Portal Service translates the data into a map image (e.g. a Scalable Vector Graphics, SVG, image). Finally, the VAS layer receives the resulting map and returns it to the client.
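The paper does not give the exact query issued by the DPS; purely as an illustrative sketch, a WFS 1.0.0 GetFeature request in key-value-pair form could be composed as follows (the service URL, feature type name and bounding box are invented placeholders, not GiMoDig values).

from urllib.parse import urlencode

params = {
    'SERVICE': 'WFS',
    'VERSION': '1.0.0',
    'REQUEST': 'GetFeature',
    'TYPENAME': 'topo:Building',              # hypothetical feature type
    'BBOX': '384000,6672000,385000,6673000',  # invented map extent
    'SRSNAME': 'EPSG:3067',                   # invented coordinate reference
}
wfs_url = 'http://example.org/gimodig/wfs?' + urlencode(params)
print(wfs_url)  # the response would be a GML dataset, as described above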
Fig. 7. A simplified illustration of the GiMoDig system architecture for POI and LOI integration.
6 Case study

The work presented in this paper has been integrated into the GiMoDig system architecture as a plug-in data processing operation in the DPS layer. The algorithm is used to place POI icons in real time so that they obscure the cartographic data as little as possible. The process also adds icon representations for LOIs. The grid algorithm for icon placement was implemented as part of a Java API for generalisation and integration of cartographic data. The Java API is based on the open source Java packages JTS Topology Suite (JTS), JTS Conflation Suite (JCS) and Java Unified Mapping Platform (JUMP). All three packages are available from Vivid Solutions (2003). JTS conforms to the Simple Features Specification for SQL (developed by the Open GIS Consortium) and contains a robust implementation of the most fundamental spatial algorithms (in 2D). JCS supplies methods for data integration and conflation. Finally, JUMP contains import and export functions for GML data as well as a viewer. In the GiMoDig project these packages have been extended with a number of other packages to create a Java API for real-time generalisation and integration (Harrie and Johansson 2003).
Figure 8 shows two examples taken from the GiMoDig real-time service (Sarjakoski and Nivala 2004). In the left-hand examples the icons are placed in the centre of the POI locations. In the right-hand examples the icons are placed at the least-disturbing position, as computed by the grid algorithm. Icons for the LOIs (denoted c in Figure 8) have been added in the right-hand figures and were also positioned by the grid algorithm. The following parameters were given to the grid algorithm via the query interface of the DPS layer:
- weights of feature types: buildings = 3, others = 2,
- disturbance value threshold = 0, and
- font size of the text labels.
The widths of the text labels were estimated using the font size and the number of letters in the text label. The resolution parameter (p) was set to 9 and the search distance to 5 times the icon size (these values were hard-coded in the case study). In the top-left map the major problem is that two of the icons (a and d) almost totally conceal the three underlying buildings. In the top-right map the grid algorithm has placed them so that the underlying buildings are visible.
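Purely as an illustration, these settings could be fed to the sketch function from section 4 as follows; the coordinates, icon size and point list are invented, and only the weights (buildings = 3, others = 2), p = 9 and the search distance of five icon sizes come from the case study.

# hypothetical cartographic points: (x, y) plus a feature-type weight
points  = [(120.0, 64.0), (121.5, 66.0), (130.0, 70.0)]
weights = [3, 3, 2]              # buildings = 3, other feature types = 2

icon_side = 2.0                  # invented icon size in map units
pos, disturbance = grid_algorithm(points, weights, icon_side,
                                  original_x=125.0, original_y=68.0,
                                  search_distance=5 * icon_side, p=9)
print('least-disturbing position:', pos, 'disturbance:', disturbance)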
Fig. 8. Example maps from the GiMoDig service. The left-hand maps illustrate the original positions of the icons. In the right-hand maps icons for LOIs (marked by c) have been added and all icons are positioned at the least-disturbing position, as computed by the grid algorithm.
In the bottom-left example the icons overlap the cartographic data and one of the icons (i) makes the name and the level of the lake unreadable. In the bottom-right map the grid algorithm has moved most of the symbols further away from the lake where there is plenty of space. Also, the icons no longer interfere with the text labels.
7 Discussion

The grid algorithm gave good visual results in the case study described above. However, there is room for improvement of the algorithm. As stated above, the icons are positioned sequentially, i.e. if one icon has already been positioned the following icons are not permitted to overlap it. This might not, however, always be the optimal solution. A better approach would be to store several possible positions for each icon (plus a disturbance value for each position). A final solution could then be computed by combinatorial optimisation in which possible overlaps are taken into account. The problem with this approach is to make it sufficiently computationally efficient for real-time applications. Another possible improvement of the algorithm would be to use a buffer zone around the icon. This could be implemented by extending the size of the icon and then finding the optimal position for this spatially extended icon. From a cartographic point of view, it might be preferable that an icon for a LOI should not overlap the LOI. This could be achieved by adding fictitious points on the LOI or adding values to each cell in the gridValue matrix that is overlapped by the LOI. The disturbance principle, as applied in this study, does not consider the distance from the original position of the icon as a parameter, as long as the distance is less than the search distance. It is easy to implement distance weighting (see the sketch below), and this should be evaluated to establish whether such weighting is desirable. Furthermore, the algorithm does not take into consideration whether there is an important cartographic object between the original position (referent) and the new position. An improvement to the algorithm would, for example, be not to allow the icon and its referent to lie on different sides of a street. In this study we did not consider the design of the cartographic symbols or the icons. It should be kept in mind that the design is important and that it may influence what is regarded as the optimal position for an icon.
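One possible reading of the distance-weighting idea mentioned above is sketched below; it is not something implemented or evaluated in the paper, and the balance factor k is invented.

import math

def distance_weighted_disturbance(disturbance, pos, original, k=0.1):
    """Add a penalty proportional to the icon's displacement from its referent;
    k (a hypothetical tuning factor) balances legibility against positional accuracy."""
    dx = pos[0] - original[0]
    dy = pos[1] - original[1]
    return disturbance + k * math.hypot(dx, dy)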
8 Conclusions

This paper presents a definition of a least-disturbance principle for positioning an icon on a real-time map. Briefly, the definition of this principle is that the weighted sum of points (covered by the icon) should be minimised. An algorithm that implements this principle was evaluated in a case study. It is shown that the principle gives good visual results. The algorithm is sufficiently computationally efficient to be integrated in a real-time map system.
Acknowledgements

The research described in this paper is part of the GiMoDig project, IST-2000-30090, which is funded by the European Union via the Information Society Technologies (IST) programme (GiMoDig 2003). We would like to thank our colleagues in the GiMoDig project and Qingnian Zhang for their cooperation.
References

de Berg, M., van Kreveld, M., Overmars, M., and Schwarzkopf, O., 1997, Computational Geometry: Algorithms and Applications (Springer).
Bader, M., 2001, Energy Minimization Methods for Feature Displacement in Map Generalization, Doctorate Thesis, Geographic Information System Division, Department of Geography, University of Zurich, Switzerland.
Barrault, M., 2001, A methodology for placement and evaluation of area map labels. Computers, Environment and Urban Systems, 25:1, pp.33-52.
Biederman, I., 1985, Human image understanding: recent research and a theory. Computer Vision, Graphics, and Image Processing, 32:1, pp.29-73.
Elvins, T.T., and Jain, R., 1998, Engineering a human factor-based geographic user interface. IEEE Computer Graphics and Applications, 18:3, pp.66-77.
GiMoDig, 2004. Geospatial info-mobility service by real-time data-integration and generalization, http://gimodig.fgi.fi/ (accessed 2004-05-03).
GML, 2003. Geography Markup Language, http://www.opengis.org/techno/documents/02-023r4.pdf (accessed 2003-09-12).
Hampe, M., Anders, K.-H., and Sester, M., 2003, MRDB applications for data revision and real-time generalization. In Proceedings of the 21st International Cartographic Conference (ICC), 10-16 August, Durban, South Africa, pp.192-202.
Harrie, L., and Sarjakoski, T., 2002, Simultaneous Graphic Generalization of Vector Data Sets. GeoInformatica, 6:3, pp.233-261.
Harrie, L. and Johansson, M., 2003, Real-time data generalization and integration using Java. Geoforum Perspektiv, Februar 2003, pp.29-34.
Kakoulis, K.G., and Tollis, I.G., 2003, A unified approach to automatic label placement. International Journal of Computational Geometry & Applications, 13:1, pp.23-59.
Lehto, L., 2003. GiMoDig system architecture. Available at http://gimodig.fgi.fi/deliverables.php (accessed 2003-09-10).
Nissen, F., Hvas, A., Munster-Swendsen, J., and Brodersen, L., 2003, Small-Display Cartography. GiMoDig project, IST-2000-30090, Deliverable D3.1.1, Public EC report, 66 p. An electronic version is available at http://gimodig.fgi.fi/deliverables.php (accessed 2004-05-03).
Mayhew, D.J., 1992, Principles and Guidelines in Software User Interface Design (Prentice Hall).
Öquist, G., and Goldstein, M., 2003, Towards an improved readability on mobile devices: evaluating adaptive rapid serial visual presentation. Interacting with Computers, 15:4, pp.539-558.
Sarjakoski, L. T. and A.-M. Nivala, 2004. Adaptation to Context – A Way to Improve the Usability of Mobile Maps. A chapter to be published in: Meng, L., Zipf, A. and T. Reichenbacher, 2004, Map-based mobile services – Theories, Methods and Implementations, Springer Verlag, 19 p.
Sarjakoski, T., and Lehto, L., 2003, Mobile Map Services Based on an Open System Architecture. In Proceedings of the 21st International Cartographic Conference (ICC), 10-16 August 2003, Durban, South Africa, pp.1107-1113.
Sester, M., 2000, Generalization Based on Least Squares Adjustment. International Archives of Photogrammetry and Remote Sensing, Vol. XXXIII, Part B4, Amsterdam, pp.931-938.
Strijk, T., and van Kreveld, M., 2002, Practical extensions of point labeling in the slider model. GeoInformatica, 6:2, pp.181-197.
Vivid Solutions, 2003. Java Topology Suite, http://www.vividsolutions.com/jts/jtshome.htm (accessed 2003-09-10).
WFS, 2003. Web Feature Service Implementation Specification, http://www.opengis.org/techno/specs/02-058.pdf (accessed 2003-09-10).
van Kreveld, M., Schramm, É., and Wolff, A., 2004. Algorithms for the Placement of Square and Pie Charts on Maps. Manuscript.
Wolff, A., 2003. The Map-Labeling Bibliography, http://i11www.ira.uka.de/~awolff/map-labeling/bibliography/ (accessed 2003-12-22).
Zoraster, S., 1997, Practical Results Using Simulated Annealing for Point Feature Label Placement. Cartography and Geographic Information Systems, 24:4, pp.228-238.
Semantically Correct 2.5D GIS Data – the Integration of a DTM and Topographic Vector Data

Andreas Koch and Christian Heipke
Institute of Photogrammetry and GeoInformation (IPI), University of Hannover, Nienburger Strasse 1, 30167 Hannover, Germany
[email protected], [email protected]
Abstract

This paper presents an approach for a semantically correct integration of a 2.5D digital terrain model (DTM) and a 2D topographic GIS data set. The algorithm is based on a constrained Delaunay triangulation. The polygons of the topographic objects are first integrated without considering the semantics of the objects. Then, those objects which contain implicit height information are dealt with. Object representations are formulated, and the object semantics are considered within an optimization process using equality and inequality constraints. First results are presented using simulated and real data.

Keywords: GIS, DTM, Integration, Adjustment, Modelling
1 Introduction
1.1 Motivation

The most commonly used topographic vector data, the core data of a geographic information system (GIS), are currently two-dimensional. The topography is modelled by different objects which are represented by single points, lines and areas with additional attributes containing information on the function and dimension of the object. In contrast, a digital terrain model (DTM) in most cases is a 2.5D representation of the earth's surface.
By integrating these data sets the dimension of the topographic objects is augmented; however, inconsistencies between the data may cause a semantically incorrect result of the integration process. Inconsistencies may be caused by different object modelling and different surveying and production methods. For instance, vector data sets often contain roads modelled as lines or polylines. The attributes contain information on road width, road type etc. If the road is located on a slope, the corresponding part of the DTM often is not modelled correctly. When integrating these data sets, the slope perpendicular to the driving direction is identical to the slope of the DTM, which does not correspond to the real slope of the road. Additionally, data are often produced independently. The DTM may be generated by using lidar or aerial photogrammetry. Topographic vector data may be based on digitized topographic maps or orthophotos. These different methods may cause inconsistencies, too. A semantically correct integration leads to consistent data. By considering the semantics of the objects it is also possible to verify the DTM. In many cases, topographic vector data are almost up-to-date because objects like roads and railways possess major priority in GIS. A DTM, however, may be more than ten years old. It is true that height changes appear less frequently than changes in the horizontal position of objects. Nevertheless, integration of both data sets considering the semantics of objects will show discrepancies, and will allow conclusions to be drawn about the quality of the DTM.

1.2 Related work

The integration of a DTM and 2D GIS data is an issue that has been tackled for more than ten years. Weibel (1993), Fritsch & Pfannenstein (1992) and Fritsch (1991) establish different forms of DTM integration: In the case of height attributing, each point of the 2D GIS data set contains an attribute "point height". By using interfaces it is possible to interact between the DTM program and the GIS system. Either the two systems are independent or DTM methods are introduced into the user interface of the GIS. The total integration or full database integration comprises a common data management within a database. The terrain data is often stored in the database in the form of a triangular irregular network (TIN) whose vertices contain X, Y and Z coordinates. The DTM is not merged with the data of the GIS. The merging process, i.e. the introduction of the 2D geometry into the TIN, has since been investigated by several authors (Lenk 2001; Klötzer 1997; Pilouk 1996). The approaches differ in the sequence of introducing the 2D geometry, the amount of change of the terrain morphology and the number
of vertices after the integration process. Among others, Lenk and Klötzer argue that the shape of the integrated TIN should be identical to the shape of the initial DTM TIN. Lenk developed an approach for the incremental insertion of object points and their connections into the initial DTM TIN. The sequence of insertion is object point, object line, object point etc. The intersection points between the object line and the TIN edges (Steiner points) are considered as new points of the integrated data set. Klötzer, on the other hand, first introduces all object points, then carries out a new preliminary triangulation. Subsequently, he introduces the object lines, determines the Steiner points, and adds both to the data set. Since the Delaunay criterion is re-established in the preliminary triangulation, the shape of the integrated TIN may deviate somewhat from the one of the initial DTM. The methods have in common that inconsistencies between the data are neglected and thus may lead to semantically incorrect results. Rousseaux & Bonin (2003) focus on the integration of 2D linear data such as roads, dikes and embankments. The linear objects are transformed into 2.5D surfaces by using attributes of the GIS data base and the height information of the DTM. Slopes and regularization constraints are used to check the semantic correctness of the objects. However, in the case of incorrect results the correctness is not established. A new DTM is computed using the original DTM heights and the 2.5D objects of the GIS data. Using a common algorithm such as those of Lenk (2001) and Klötzer (1997) may lead to a semantically incorrect integrated data set. We not only check constraints which are derived from the semantics of the objects, as Rousseaux & Bonin (2003) do, but also change the heights of the integrated data set to fulfill predefined constraints. Such an approach is comparable to the homogenization of different data sets. Hettwer (2003) and Scholz (1992) have investigated the homogenization of 2D cartographic data which possibly stem from different data sources and refer to different coordinate systems. Like us, they use a least squares adjustment: coordinates are introduced as direct observations and regularization constraints are formulated as pseudo observations. However, their investigations are restricted to 2D data and no inequation constraints are introduced.
2 Semantic correctness
2.1 Consequences of non-semantic integration

A digital terrain model (DTM) is composed of points with their coordinates X, Y, Z and an interpolation function to derive Z values at arbitrary positions X, Y. Mostly the DTM is a 2.5D representation of the topography, i.e. bridges, vertical walls and overhangs are not modelled correctly. In contrast, the topographic vector data we consider are two-dimensional. The topography is modelled by different objects which are represented by single points, lines and areas.
Fig. 1. The integration of a DTM and 2D topographic vector data without considering the semantics of the objects; left: lakes, right: road network, exaggerated terrain
Figure 1 shows two examples of an integration of a DTM and 2D topographic vector data without considering the semantics of the topographic objects lake and road. The integrated data set is represented by a triangular irregular network (TIN). The height values of the lakes do not show a constant height level. Several heights of the lakes near the bank are higher than the mean lake height. At the right side of figure 1 the roads are not correctly modelled in the corresponding part of the DTM. The slopes of the cross sections are identical to the mean slope of the hills. There are no breaklines at the left and right borders of the roads. Also, neighbouring triangles of the DTM TIN show extremely different orientations.
2.2 Correct integration If we have a look Table 1. Some topographic objects and their at the topography representation in the corresponding part and divide the of the terrain topography into Object Representation different topographic objects Horizontal plane Sports field (road, river, lake, Race track building, etc.), Runway like the data of a Dock GIS, there are Canal several objects Lake, pool which have a diRoad Tilted subplanes rect relation to Path the third dimenRailway, tramway sion. These obRiver jects contain imBridge Height relation plicit height Undercrossing, crossover information. For example, a lake can be described as a horizontal plane with increasing terrain at the bank outside the lake. Even if we do not know the real lake height, we have an idea of the representation of a lake in relation to the neighbouring terrain. To give another example, roads are usually nonhorizontal objects. We certainly do not know the mathematical function representing the road height, but we know from experience and from road construction manuals that roads do not exceed maximum slope and curvature values in road direction. Also, the slope perpendicular to the driving direction is limited. Of course, all other objects are related to the third dimension, too. But it is difficult and often impossible to define general characteristics of their three-dimensional shape. For example, an agricultural field can be very hilly. But it is not possible in general to define maximum slope and curvature values because these values vary from area to area. The objects containing implicit height information which need to be used for the semantically correct integration can be divided into three different classes (see Table 1). The first class contains objects which can be represented by a horizontal plane. The second class describes objects which can be composed of several tilted planes. The extent of the planes depends on the curvature of the terrain; the planes should be able to adequately approximate the corresponding part of the original DTM. The last class shown in Table 1 describes objects which have a height relation to
other objects. Bridges, undercrossings and crossovers may have a certain height relation to the terrain or water above or beneath. To integrate a DTM and a 2D topographic GIS data set in a semantically correct sense, the implicit height information of the mentioned topographic objects has to be considered. That means that after the integration the integrated data set must be consistent with our view of the topography. E.g. all height values of points of the bounding polygon of a lake and all heights situated inside the bounding polygon must have the same height level. The DTM points at the bank outside the lake must be higher than the lake height. In the case of roads, the slope and curvature values in road direction should not exceed maximum values, and the slope across the road must be nearly zero. This means that points of a road cross section should have nearly the same height value.
3 An algorithm for the semantically correct integration

The aim of the integration is a consistent data set with respect to the underlying data model, which needs to take care of the semantics of the topographic objects. Topographic objects which are modelled by lines but which have a certain width are first buffered. The buffer width is taken from the attribute "width" if available, otherwise a default value is used. Thus, the lines are transformed into elongated areas, the borders of which are further considered. The next step of the algorithm is an integration of the data sets without considering the semantics of the topographic objects. It is based on a constrained Delaunay triangulation (Lee & Lin, 1986) using all points of the DTM (mass points and structure elements) and the points of the topographic objects of the 2D GIS data (section 3.1). The linear structure elements from the DTM and the object borders are introduced as edges of the triangulation; the result is a triangular irregular network (TIN) – an integrated DTM TIN. Then, certain constraints are formulated and are taken care of in an optimization process (section 3.2). In this way, the topographic objects of the integrated data set are made to fulfill predefined conditions related to their semantics. The constraints are expressed in terms of mathematical equations and inequations. The algorithm results in improved height values and in a semantically correct integrated 2.5D topographic data set. A basic assumption of our approach is that the general terrain morphology as reflected in the DTM is correct and has to be preserved also in the neighbourhood of objects carrying implicit height information. Therefore,
any changes must be as small as possible. A second assumption is that inconsistencies between the DTM and topographic objects stem from inaccurate DTM heights and not from planimetric errors of the topographic objects.

3.1 Non-semantic data integration

There are several approaches for the integration of a DTM and 2D topographic GIS data based on a triangulation. Among others, Lenk (2001) and Klötzer (1997) argue that the shape of the integrated TIN should be identical to the shape of the initial DTM TIN. Lenk developed an approach for the incremental insertion of object points and their connections into the initial DTM TIN. The sequence of insertion is object point, object line, object point, etc.: after having inserted the first object point, Lenk introduces the object line between this point and the following one. The intersection points between the object line and the TIN edges are considered as new points of the integrated data set (Steiner points).
Fig. 2. Integration of a DTM and a topographic object “lake“, a) original DTM TIN and object “lake”, b) integrated data set
Subsequently, Lenk inserts the next object point, and so on, until the data set is complete. Klötzer, on the other hand, first introduces all object points, then carries out a new preliminary triangulation. Subsequently, he introduces the object lines, determines the Steiner points, and adds both to the data set. Since the Delaunay criterion is re-established in the preliminary triangulation, the shape of the integrated TIN may deviate somewhat from the one of the initial DTM. The advantage of Lenk's approach compared to that of Klötzer is that the shape of the integrated TIN is identical to the shape of the initial DTM
TIN. The disadvantage is that the approach results in a large number of Steiner points which lead to additional observation equations and/or inequation constraints (see section 3.2). Because computational aspects are not the subject of this paper, and because the changes to the original heights have to be as small as possible, the aforementioned condition has to be fulfilled. Thus, we use a variant of Lenk's method. First, a DTM TIN is created using the mass points and the structure elements in a constrained Delaunay triangulation. Second, the heights for the topographic objects are derived using the height information of the TIN by interpolating a height value for each object point. Then, the points and their connections are inserted into the initial DTM TIN: after insertion of the first object point, the connection between this point and the following one is inserted. This is done in such a way that the intersection points between the object line and the edges of the DTM TIN are introduced as new points (Steiner points). The edges of the DTM TIN and the lines of the object polygon are split. Here, the Delaunay criterion may locally not be fulfilled. Figure 2 shows an example of the integration of a DTM and an object "lake" of a 2D GIS data set. The original points of the bounding polygon of the lake are shown in grey. After the integration, the intersection points between the DTM TIN and the object polygon are new points of the
Fig. 3: Integration of a DTM and a topographic object “road“, a) original DTM TIN and object “road“, b) intersection between DTM TIN and object, c) buffered object, d) integrated data set
integrated data set (coloured black). Another example is given in figure 3. A road is an object modelled by lines which is buffered using an attribute "road width" (figure 3a). First, the middle axis of the road (black line) is introduced using a constrained Delaunay triangulation. All intersection points between the middle axis and the DTM TIN are introduced as new points. This is done because
every triangle has a different inclination and the middle axis should be fitted as well as possible to the terrain represented by the DTM TIN. The left and right sides of the buffered road, which contain as many points as the middle axis, are then introduced using another constrained Delaunay triangulation (figure 3b).

3.2 Optimization process

As mentioned, there are topographic objects of the 2D GIS data which contain implicit height information. Within the integrated data set these objects have to fulfill certain constraints which can be expressed in terms of mathematical equations and inequations. To fulfill these constraints, or to achieve semantic correctness, the heights of the DTM are changed. Up to now the horizontal coordinates of the polygons of the topographic objects are introduced as error-free. The heights of the topographic objects are estimated within an optimization process which is based on a least squares adjustment; these values are the unknown parameters. The heights of the corresponding part of the DTM are introduced as direct observations for the unknown heights at the same planimetric position. Equality constraints are introduced using pseudo observations. Thus, a Gauss-Markov adjustment model is used and the adherence to the constraints is controlled via weights for the pseudo observations. Furthermore, inequality constraints are introduced. The optimization process is solved using the linear complementary problem (LCP) (Lawson & Hanson, 1995; Fritsch, 1985; Schaffrin, 1981).

Basic observation equations

The heights of the DTM which correspond to the topographic objects of the 2D GIS data are introduced as:
$0 + \hat{v}_i = \hat{Z}_i - Z_i$    (1)
The height $Z_i$ refers to the original height of the DTM, the value $\hat{Z}_i$ denotes the unknown height which has to be estimated, and $\hat{v}_i$ is the residual of observation $i$. In order to be able to preserve the slope of an edge connecting two neighbouring points $P_j$ and $P_k$ of the DTM TIN (and thus to control the general shape of the integrated DTM TIN), additional equations are formulated. One of the two points is part of the polygon describing the object, the other one is a neighbouring point outside the object:
$Z_j - Z_k + \hat{v}_{jk} = \hat{Z}_j - \hat{Z}_k$    (2)
Equality and inequality constraints

Each class of object representation (see Table 1) has its own constraints which can be expressed in terms of mathematical equality and inequality constraints. These constraints will be derived in the following for each class of representation.

Horizontal plane

Heights of objects which represent a horizontal plane must be identical everywhere. This means that points $P_l$ with height $Z_l$ and planimetric coordinates $X_l, Y_l$ situated inside the object boundary (see Figure 4a, grey points) must all have the same value $\hat{Z}_{HP}$, which has to be estimated in the optimization process. These height values lead to the following observation equation:
$0 + \hat{v}_l = \hat{Z}_{HP} - Z_l$    (3)
The points of the bounding polygon of the topographic objects do not contain any height information, i.e. the heights have to be derived from the DTM. We use the mean height value of all points inside the object. Again, the height difference between the unknown object height and the calculated or measured original height is used to formulate an additional pseudo observation (see figure 4a, black points):
$0 + \hat{v}_m = \hat{Z}_{HP} - Z_m(Z_1, Z_2, \ldots, Z_n)$    (4)
Fig. 4: Equality and inequality constraints of a horizontal plane, topographic object “lake“, a) points inside the lake and points of the waterline, b) points of the neighbouring terrain
The neighbouring terrain of the horizontal plane is considered using the basic observation equations 1 and 2. If the object represents a lake, it is necessary to use a further constraint which represents the relation between the lake, in terms of a horizontal plane, and the bank of the lake, whose height values $\hat{Z}_i$ have to be higher than the height level of the lake $\hat{Z}_{HP}$:
$0 > \hat{Z}_{HP} - \hat{Z}_i$    (Inequation 1)
In figure 4b the points $\hat{Z}_i$ of inequation 1, which are points of the neighbouring terrain, are shown in black.

Tilted planes

The objects treated in this paper which can be composed of several tilted planes are elongated objects. In the longitudinal direction these objects are not allowed to exceed a predefined maximum slope value $s_{Max}$:
$s_{Max} \geq \dfrac{\hat{Z}_n - \hat{Z}_o}{D_{no}}$    (Inequation 2)
Fig. 5: Equation and inequation constraints, a) maximum slope and maximum slope difference, b) horizontal profile and points belonging to a plane

The example in figure 5 shows a road which is modelled by lines and then buffered using the attribute "road width" of the GIS data base. Here, $\hat{Z}_n$ and $\hat{Z}_o$ are the unknown height values of successive points $P_n$ and $P_o$ in the driving direction of the road (figure 5a). $D_{no}$ is the horizontal distance between these points.
In addition, the difference between two successive slope values, which is comparable to the curvature of the object, is restricted to the maximum value $ds_{Max}$:
$ds_{Max} \geq \dfrac{\hat{Z}_n - \hat{Z}_o}{D_{no}} - \dfrac{\hat{Z}_o - \hat{Z}_p}{D_{op}}$    (Inequation 3)
In the case of a road, the points $P_n$, $P_o$ and $P_p$ are successive points of the middle axis of the object, and $D_{no}$ and $D_{op}$ are the corresponding horizontal distances. Assuming a horizontal road profile in the direction perpendicular to the middle axis, the height values of corresponding points must be identical:
$0 + \hat{v}_{nq} = \hat{Z}_n - \hat{Z}_q$    (5)

The values $\hat{Z}_n$ and $\hat{Z}_q$ represent point heights of the centre axis and the left side, or of the centre axis and the right side, of the buffered object (figure 5b). These constraints are introduced for all cross sections whose centre point results from the intersection between the DTM TIN and the original object line. In figure 5b these cross sections are p1, p3 and p4. Those cross sections whose centre points are original points of the object middle axis are not used to form this kind of constraint, because at the original points the road may show a change in horizontal direction and slope (cross section p2). Consequently the cross section is not horizontal. Finally, the points of any two neighbouring cross sections and the points in between have to represent a plane:
$0 + \hat{v}_r = \hat{a}_0 + \hat{a}_1 X_r + \hat{a}_2 Y_r - \hat{Z}_r$    (6)
In figure 5b the points of the neighbouring profiles p3 and p4, as well as the points in between, each represent a point $P_r$ of equation 6. These points have to represent a plane with the unknown coefficients $\hat{a}_0, \hat{a}_1, \hat{a}_2$. $X_r, Y_r$ are the planimetric coordinates of point $P_r$, and $\hat{Z}_r$ is the height of $P_r$ which has to be estimated. A special case is the treatment of the points of a cross section involving an original object point of the 2D road centre axis. Equation 6 is set up twice, once for the points of the horizontal profile p1 and the centre point of profile p2 (and any point in between), and again for the points of the horizontal profile p3 and the centre point of profile p2 (and any point in between). After the optimization process the intersection straight line of the two neighbouring planes can be calculated. This straight line represents the non-horizontal profile p2, and the left and right points of the profile can be calculated, too.
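For illustration only, inequations 2 and 3 and equation 5 can be written as small residual functions over the estimated centre-axis heights; expressed in a "≥ 0" form they could be handed to a numerical solver. The function names and the solver interface are assumptions, not part of the paper.

def slope_ok(z_n, z_o, d_no, s_max):
    """Inequation 2 in '>= 0' form: s_max - (z_n - z_o) / d_no."""
    return s_max - (z_n - z_o) / d_no

def curvature_ok(z_n, z_o, z_p, d_no, d_op, ds_max):
    """Inequation 3 in '>= 0' form: ds_max minus the change of slope
    between two successive centre-axis segments."""
    return ds_max - ((z_n - z_o) / d_no - (z_o - z_p) / d_op)

def cross_section_residual(z_centre, z_side):
    """Equation 5: a horizontal cross section demands equal heights of the
    centre-axis point and the corresponding left/right border point."""
    return z_centre - z_side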
Height relation

Bridges, undercrossings and crossovers have a certain height relation to other objects (for example roads, railways, rivers, etc.). The height values of these objects must be higher or lower than those of related objects. The height difference $d$ is identical to the height of the bridge or the crossover, or the depth of the undercrossing.
$d \geq \hat{Z}_s - \hat{Z}_t$    (Inequation 4)
$\hat{Z}_s$ is the unknown height of the higher point, $\hat{Z}_t$ the one of the lower point.

Inequality constrained least squares adjustment

The basic observation equations and the equation and inequation constraints given above have to be introduced into the optimization process, which is based on an inequality constrained least squares adjustment. The stochastic model of the observations (basic observations and equation constraints) consists of the covariance matrix, which can be transformed into the weight matrix. Assuming that the observations are independent of each other, in general the diagonal of the weight matrix contains the reciprocal accuracies of the observations. To fulfill the equation constraints, the corresponding pseudo observation has to be assigned a very high accuracy, and the corresponding diagonal element of the weight matrix has to be very large. The solvability of the optimization process, i.e. the semantic correctness of the resulting integrated data set, depends on the choice of the individual weights. Because of this, the next section deals with investigations on weighting the observations. The algorithm is formulated as the linear complementary problem (LCP), which is solved using the Lemke algorithm (Lemke, 1968). For more details see Koch (2003); the LCP is explained in detail in Lawson & Hanson (1995), Fritsch (1985) and Schaffrin (1981).
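To make the structure of the adjustment concrete, the lake case can be sketched as a small numerical example. The paper solves the inequality constrained least squares problem as a linear complementary problem with the Lemke algorithm; the sketch below instead uses SciPy's general-purpose SLSQP solver as a stand-in, and all heights and weights are invented.

import numpy as np
from scipy.optimize import minimize

z_inside = np.array([10.02, 9.97, 10.05, 9.99])   # DTM heights inside the lake (invented)
z_bank   = np.array([10.01, 10.30, 10.20])        # DTM heights on the bank (invented)

# Unknowns x = [Z_HP, Zhat_bank_1..m]; the interior heights are tied to the
# single unknown lake level Z_HP via equations (3) and (4).
x0 = np.concatenate(([z_inside.mean()], z_bank))
w_obs, w_pseudo = 1.0, 1.0e6    # direct observations vs. equality pseudo-observations

def residuals(x):
    z_hp, z_b = x[0], x[1:]
    r = list(np.sqrt(w_pseudo) * (z_hp - z_inside))       # eq. (3): interior points share one level
    r.append(np.sqrt(w_pseudo) * (z_hp - z_inside.mean()))  # eq. (4): pseudo-observation for the mean
    r += list(np.sqrt(w_obs) * (z_b - z_bank))            # eq. (1): bank heights as direct observations
    return np.array(r)

def objective(x):
    r = residuals(x)
    return float(r @ r)

# Inequation 1: every bank height must stay at or above the lake level (>= 0 form).
cons = [{'type': 'ineq', 'fun': lambda x: x[1:] - x[0]}]

res = minimize(objective, x0, method='SLSQP', constraints=cons)
print('estimated lake level:', round(res.x[0], 3))
print('adjusted bank heights:', np.round(res.x[1:], 3))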
4 Results The results presented here were determined by using simulated and real data sets. Two different objects were used – a lake which can be represented by a horizontal plane and a road which can be composed of several tilted planes. The simulated data consist of a DTM with about 100 height
522
Andreas Koch and Christian Heipke
values containing one topographic object. The heights are distributed in a nearly regular grid with a grid size of about 25 meters. The real data were made available by the surveying authority of Lower Saxony, "Landesvermessung und Geobasisinformation Niedersachsen (LGN)". The data consist of the DTM ATKIS® DGM5, a hybrid data set containing regularly distributed points with a grid size of 12.5 m and additional structure elements. The 2D topographic vector data are objects of the German ATKIS® Basis-DLM. Three different lakes were used, bordered by polygons. The objects are shown on the left side of figure 1.

4.1 Simulated data

In the case of a lake, the unknown lake height is identical to the mean value of the heights inside the lake. This is true if the neighbouring heights outside the lake are higher than the mean height value, i.e. if the inequation constraints (inequation 1) are fulfilled before the optimization begins. It is also true if neighbouring heights outside the lake are somewhat lower than the mean height value and equations 3 and 4 have a very high weight. Here, equations 3 and 4 have a weight 10⁶ times higher than all other observations. After the optimization process the equation and inequation constraints are fulfilled, and thus the neighbouring heights outside the lake are higher than the estimated lake height. All heights inside the lake and at the waterline have the same height level; the integrated data set is consistent with our view of a lake. If the heights of the bank outside the lake have a high weight, the lake height is pushed down. Then the heights outside are nearly unchanged, but the original height difference between the lake and the points outside has changed.

The second simulated data set represents a road with five initial polyline points. The maximum height difference is 6 m, the road length is 160 m and the width is 4 m. The investigations were carried out by using different weights for the basic observation equations and the equality constraints. Equation 1 was used for all points of the bordering polygon and the points outside the object which are connected to the polygon points. Equation 2 represents the connections to the neighbouring terrain. Using the same weight for all observations results in a road with non-horizontal cross sections and differences to the tilted planes. The inequation constraints are fulfilled and the maximum differences between the initial DTM heights and the heights of the integrated data set are on the order of half a meter.
Fig. 6. Height differences between the original heights of the DTM and the estimated heights of the optimization process (vertical exaggeration factor: 30), object: lake
Using higher weights (10⁶ times higher than the other weights) for the basic equation 2 and the equation constraints 5 and 6 leads to horizontal cross sections and nearly no differences to the tilted planes. The maximum differences between the initial DTM heights and the heights of the integrated data set are somewhat bigger than the differences before. If the equation constraints 5 and 6 have a high weight, the equation and inequation constraints are fulfilled exactly. Compared to the results before, the terrain morphology has changed considerably. The results show that a compromise has to be found between fulfilling the equation constraints and changing the terrain morphology. Using a higher weight of 10⁶ leads to fixed observations, i.e. the equation constraints are fulfilled exactly. But then the terrain morphology is not the same as before.

4.2 Real data

The real data sets representing lakes consist of three ATKIS® Basis-DLM objects with 294 planimetric polygon points. The DTM contains 1,961 grid points with an additional 1,047 points representing structure elements (break lines). The semantically correct integration was carried out by using high weights for the equation constraints 3 and 4 and for the basic observation equation 1 (10⁶ times higher than the other weights). The number of basic observations and equation constraints is 2,754; 533 parameters had to be estimated and the number of inequation constraints is 530. The results show that all constraints were fulfilled after applying the optimization. The differences between the estimated lake heights and the initial mean height values are very small. The first mean height value is
reduced by 2 mm and the second one by 4 mm. The third lake is 3.7 cm lower than the original mean height value, which is caused by a higher number of heights at the bank which did not fulfill the inequation constraint (inequation 1). Figure 6 shows the residuals after the optimization process. The blue vectors correspond to height values which are lower than the original heights after the optimization. Red coloured vectors refer to heights which became higher. The figure shows that most of the heights inside the lakes became higher. Most of the points which became lower are situated at the border of the lakes. In contrast, a big part of the differences of the left lake became lower, too. Here, the corresponding part of the DTM seems to contain gross errors. The maximum differences between the original heights and the estimated heights are -1.84 m and +0.88 m, respectively. Figure 7 shows the result of the semantically correct integration (right side) compared with the result without considering the semantics of the lakes (left side). The semantically correct integrated data set shows that all constraints are fulfilled. The height values inside the lake and at the water line have the same level. The terrain outside the lake rises.
Fig. 7. Results of the integration, left: without considering the semantics of the topographic objects, right: semantically correct integration
5 Outlook

This paper presents an approach for the semantically correct integration of a DTM and 2D topographic GIS data. The algorithm is based on a Delaunay triangulation and a least squares adjustment taking into account inequality constraints. First investigations were carried out using simulated and real data sets. The objects used are lakes, represented by a horizontal plane with increasing terrain outside the lake, and roads, which can be composed of several tilted planes. The results, which are based on the use of different weights for the basic equations and equation constraints, are satisfying. Height
blunders or big differences to the equality and inequality constraints may cause a non-realistic improvement of the original height information of the DTM. Thus, in the future, blunders have to be detected and corrected prior to the overall adjustment. Furthermore, the planimetric coordinates of the topographic objects were introduced as error-free. This may cause an erroneous height level of the topographic objects. In addition, planimetric coordinates of structure elements have to be considered in the adjustment algorithm. Otherwise, structure elements inside the object polygon will be deleted and the morphology can be erroneous, too.
Acknowledgement

This research was supported by the surveying authority of Lower Saxony, Landesvermessung und Geobasisinformation Niedersachsen (LGN). We also express our gratitude to LGN for providing the data.
References

Fritsch, D., 1985. Some Additional Information on the Capacity of the Linear Complementary Algorithm, in: E. Grafarend & F. Sanso, Eds., "Optimization and Design of Geodetic Networks", Springer, Berlin, pp. 169-184.
Fritsch, D., Pfannenstein, A., 1992. Conceptual Models for Efficient DTM Integration into GIS. Proceedings EGIS'92, Third European Conference and Exhibition on Geographical Information Systems, Munich, Germany, pp. 701-710.
Hettwer, J., 2003. Numerische Methoden zur Homogenisierung großer Geodatenbestände. Publication of the Geodetic Institute of the Rheinisch-Westfälischen Technischen Hochschule Aachen. PhD Thesis, Aachen, Germany.
Klötzer, F., 1997. Integration von triangulierten digitalen Geländemodellen und Landkarten. Diploma thesis, Institute of Informatics, Rheinische Friedrich-Wilhelms-Universität Bonn, Germany, unpublished.
Koch, A., 2003. Semantically correct integration of a digital terrain model and a 2D topographic vector data set. Proceedings of the ISPRS workshop on challenges in geospatial analysis, integration and visualization II, Stuttgart, 8-10 September, CD.
Lawson, C. L., Hanson, R. J., 1995. Solving Least Squares Problems. Society for Industrial and Applied Mathematics, Philadelphia.
Lee, D.-T., Lin, A.-K., 1986. Generalized Delaunay triangulations for planar graphs. Discrete Comput. Geom., 1: 201-217.
Lemke, C. E., 1968. On complementary pivot theory, in: G. B. Dantzig, A. F. Veinott, Eds., "Mathematics in the Decision Sciences", Part 1, pp. 95-114.
Lenk, U., 2001. 2.5D-GIS und Geobasisdaten – Integration von Höheninformation und Digitalen Situationsmodellen. Wiss. Arb. Fachr. Verm. Universität Hannover Nr. 244 and DGK bei der Bayer. Akad. d. Wiss., Reihe C, Nr. 546. Diss., Univ. Hannover.
Pilouk, M., 1996. Integrated Modelling for 3D GIS. PhD Thesis. ITC Publication Series No. 40, Enschede, The Netherlands.
Rousseaux, F., Bonin, O., 2003. Towards a coherent integration of 2D linear data into a DTM. Proceedings of the 21st International Cartographic Conference (ICC), Durban, South Africa, pp. 1936-1942.
Schaffrin, B., 1981. Ausgleichung mit Bedingungs-Ungleichungen. AVN, 6, pp. 227-238.
Scholz, T., 1992. Zur Kartenhomogenisierung mit Hilfe strenger Ausgleichungsmethoden. Publication of the Geodetic Institute of the Rheinisch-Westfälischen Technischen Hochschule Aachen. PhD Thesis, Aachen, Germany.
Weibel, R., 1993. On the Integration of Digital Terrain and Surface Modeling into Geographic Information Systems. Proceedings 11th International Symposium on Computer Assisted Cartography (AUTOCARTO 11), Minneapolis, Minnesota, USA, pp. 257-266.
Generalization of integrated terrain elevation and 2D object models

J.E. Stoter¹, F. Penninga² and P.J.M. van Oosterom²

¹ International Institute for Geo-Information Science and Earth Observation (ITC), Department of Geo-Information Processing, Hengelosestraat 99, Enschede, the Netherlands
² Delft University of Technology, OTB, section GIS technology, Jaffalaan 9, 2628 BX Delft, the Netherlands
Abstract

A lot of attention has been paid to generalization (filtering) of Digital Elevation Models (DEMs), and the same is true for generalization of 2D object models (e.g. topographic or land use data). In addition there is a tendency to integrate DEMs with classified real-world objects or features; the result is sometimes called a Digital Terrain Model (DTM). However, there has not been much research on the generalization of these integrated elevation and object models. This paper describes a four-step procedure. The first two steps have been implemented and tested with real-world data (laser elevation point clouds and cadastral parcels). These tests have yielded promising results, as will be shown in this paper.
1 Introduction

There is a close relationship between Digital Elevation Models (DEMs, 2.5D), based on for example raw laser-altimetry point data, and the topographic objects or features embedded in the terrain. Feature extraction techniques aim to obtain the 2D geometry and heights for certain types of topographic objects such as buildings. There are methods for object recognition in TINs (Triangular Irregular Networks) based on point clouds in which the selection of an object (e.g. building roofs, flat terrain between buildings) corresponds to planar surfaces (Gorte, 2002). This technique can be used for 3D building reconstruction from laser altimetry. However, this will not be the topic of this paper.
On the other hand, 2D objects from another, independent source, such as a cadastral or topographic map, can explicitly be incorporated as part of the TIN structure, which represents a height surface (Lenk, 2001; Stoter and Gorte, 2003). In this case the TIN structure is based on both 2D objects and point heights. The data structure of the planar partition of 2D objects is embedded within the TIN. Within this data structure, the 2D objects are identifiable in the TIN and are obtainable from the TIN as a selection of triangles which yield 2.5D surfaces of individual 2D objects. This is the topic of this paper. For the study described in this paper, the following two data sets have been used (see figure 1).

Terrain height points

For the terrain elevation model we use a data set representing the DEM (Digital Elevation Model) of the Netherlands, i.e. the AHN (Actueel Hoogtebestand Nederland) (Van Heerd, 2000). The AHN is a data set of point heights obtained with laser altimetry with a density of at least one point per 16 square meters and, in forests, a density of at least one point per 36 square meters. The point heights are resampled in a regular tessellation at a resolution of 5 meters. Due to availability issues this regular data set is used, whereas the TIN approach is designed with irregular data in mind. The AHN contains only earth surface points: information such as houses, cars and vegetation has been filtered out of the AHN. The heights in the AHN have a systematic error of on average 5 cm and 15 cm RMSE.

Parcel boundaries

The parcels used are from the cadastral database of the Netherlands. In the cadastral database parcel boundaries are organised in a structure of geometrical primitives (boundaries or edges described by their polylines) and parcels are topologically stored via references to boundaries (Lemmen et al., 1998). The typical geometric accuracy is about 10 cm.

For this research four different types of TINs (Triangular Irregular Networks) were generated, all representing surface height models based on point heights obtained from laser altimetry, and the last three also including 2D parcels: unconstrained Delaunay TIN, constrained Delaunay TIN, conforming Delaunay TIN and refined constrained TIN. In section 2 the definition and creation of these TINs is described, together with their results when applied to the test data set. The TINs are stored in the Oracle DBMS, and from this information some spatial analyses, queries, and visualisation were performed in the context of the DBMS (section 3).
Fig. 1. Data sets used in this research; top: elevation (dark = low, light = high), bottom: cadastral parcels
One of the disadvantages of using a dense laser altimetry data set is the resulting data volume and, with that, the poor performance of queries. However, due to the 'sampling' nature of data obtained with laser altimetry, not all points are needed to generate an accurate elevation model (within an epsilon tolerance of the same order of magnitude as the accuracy of the original height model and cadastral data). Therefore we examined how the number of TIN nodes (and thereby the related edges and triangles) can be reduced by removing nodes that are not significant for the TIN, while at the same time maintaining the constraints of the parcel boundaries. Section 4 describes a method which can be used to generate an effective TIN, including a representation of 2D objects, in which only the relevant points are used. Part of this method has been implemented, specifically the first 'generalization' step: the filtering of non-significant elevation points. The results of this prototype implementation are presented in section 5. The paper ends with conclusions.
2 Integrated TINs of point heights and parcels To explore the possibilities of including a data set in a 2D planar partition in a TIN structure, four different types of TINs, all representing height models and the last three also including 2D objects, were generated: unconstrained Delaunay TIN (section 2.1), constrained Delaunay TIN (section 2.2), conforming Delaunay TIN (section 2.3) and refined constrained TIN (section 2.4) (also see Shewchuk, 1996). 2.1 Unconstrained TIN First a TIN was generated using only the point data. The triangulation was performed outside the DBMS since TINs (and triangulation) are not (yet) supported within DBMSs. The ideal case would be just storing the point heights and the parcel boundaries in the DBMS and to generate the TIN of the area of interest on user's request within the DBMS, without explicitly storing the TIN structure in the DBMS. The representation of the implicit TIN could then be obtained via a view. This is more efficient and less prone to decrease in quality because no data transfer (and conversion) is needed from DBMS to TIN software and back. In the future a distributed DBMS structure may be possible within the Geo-Information Infrastructure (GII). An integrated view, based on two different databases (as the different data sets are maintained by different organisations in different databases) may be feasible from the technical perspective. In our research we stored copies of all data sets in one single DBMS. In our test case, a TIN has been generated with Delaunay triangulation (Worboys, 1995). The Delaunay triangulation results in triangles, which fulfil the 'empty circle criterion', which means that the circumcircle around every triangle contains no vertices of the triangulation other than the three vertices that define the triangle. In general this results in good and numerically stable polygons. It should be noted that the Delaunay TINs are computed in 2D and may therefore be suboptimal for true elevation data. The z-value of points is not taken into account in the triangulation process, but added afterwards. This is perhaps a bit strange if one realises that the TIN
is computed for an elevation model in which the z-value is very important; see (Verbree, 2003) for better TIN construction for terrain elevation models.
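To make the triangulation step concrete, the sketch below builds an unconstrained Delaunay TIN from a point cloud outside the DBMS. It is not the software used in this study (the authors acknowledge the Triangle software); it is only a minimal illustration assuming Python with numpy and scipy. As described above, only x and y take part in the triangulation and z is attached afterwards.

import numpy as np
from scipy.spatial import Delaunay

# Synthetic stand-in for AHN laser points: N rows of (x, y, z).
points = np.random.rand(1000, 3) * np.array([1450.0, 800.0, 10.0])

# Triangulate in 2D only; the z-value plays no role in the triangulation itself.
tin = Delaunay(points[:, :2])

# Each row of tin.simplices holds the indices of one triangle's three vertices,
# so the 2.5D corners of a triangle are simply looked up in the point array.
first_triangle = points[tin.simplices[0]]
print(first_triangle)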
Fig. 2. A parcel surface (detail) based on an unconstrained TIN
Fig. 3. A parcel surface (detail) based on a constrained TIN
2.2 Constrained TIN The selection of triangles from the unconstrained TIN that (partly) overlap one parcel surface represents an area larger than the parcel itself, since triangles cross parcel boundaries (figure 2). Therefore, to obtain a more precise parcel surface, a constrained TIN was generated, using the parcel boundaries as constraints. We assigned z-coordinates to the nodes of the parcel boundaries by projecting them into the unconstrained TIN. In contrast with the unconstrained TIN, each triangle in the constrained TIN (figure 3) belongs to one parcel only, and therefore the selection of triangles exactly equals the area of a parcel. However, as can also be seen in figure 3, keeping the parcel boundaries (edges) undivided leads to elongated triangles near the location of parcel boundaries. This has two important drawbacks. First, the very flat elongated triangles may be numerically unstable (not robust, as small changes in the coordinates may cause errors) and the visualisation is unpleasant. Second, and maybe even more important, a long original parcel boundary will remain a straight line in 3D even when the terrain is hilly, because there are no intermediate points on the
parcel boundaries by which it is not possible to represent height variance across the parcel boundaries. 2.3 Conforming TIN Keeping the original edges in the constrained TIN undivided in the triangulation process leads to elongated triangles if parcel boundaries are much longer than the average distance between DEM points (5 meters) which is the case in using parcel boundaries with the AHN data set. An alternative to the constrained TIN may be the conforming TIN. The computation starts with a constrained TIN, but every constrained edge which has a triangle to the left or right not satisfying the empty circle condition is recursively subdivided by adding so-called Steiner points (and locally recomputing the TIN with the two new constrained edges). The recursion stops when all triangles, also the ones with (parts of) the constrained edges, satisfy the empty circumcircle criterion (the Delaunay property). The conforming TIN has both the Delaunay property and the advantage that all constrained edges are present, possibly subdivided in parts, in the resulting TIN. Figure 4 shows a conforming TIN, covering several parcels (different shades of grey). To improve visualisation the height has been exaggerated (10 times).
Fig. 4. Conforming TIN in which point heights and 2D planar partition of parcels (each with own shade of grey) are integrated.
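The z-coordinates assigned to the parcel boundary nodes in section 2.2 are obtained by projecting each node into the unconstrained TIN. A minimal sketch of that interpolation step is shown below, reusing the scipy Delaunay object from the earlier sketch; it illustrates the principle (barycentric interpolation within the containing triangle) rather than the implementation used by the authors.

import numpy as np
from scipy.spatial import Delaunay

def interpolate_z(tin, points, xy):
    """Linearly interpolate a z-value at 2D location xy from the TIN triangle
    that contains it (barycentric interpolation)."""
    simplex = int(tin.find_simplex(np.atleast_2d(xy))[0])
    if simplex < 0:
        raise ValueError("location lies outside the triangulation")
    vertices = tin.simplices[simplex]
    # Barycentric coordinates via the precomputed affine transform (scipy recipe).
    T = tin.transform[simplex]
    b = T[:2].dot(np.asarray(xy, float) - T[2])
    bary = np.append(b, 1.0 - b.sum())
    return float(bary.dot(points[vertices, 2]))

# Example (assumes 'points' and 'tin' from the previous sketch):
# z = interpolate_z(tin, points, (725.0, 400.0))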
2.4 Refined constrained TIN However, a (normal) conforming TIN also has its drawbacks compared to a constrained TIN. In the case of two very close 'near parallel' constrained edges, a large number of very small triangles are generated, because these constrained edges are split into many very small edges (see figure 5). Something similar can also happen when AHN points are very close to the constrained edges. These small triangles have no use, as they do not reflect any height differences (at least, the height differences cannot be derived from the AHN points) and they also do not carry additional object information.
Fig. 5. Conforming TIN results in very small triangles in case of two very close near parallel constrained edges or in case AHN points are very close to constrained edges, while no extra information is added
A solution for this is to split the constrained edges, before inserting them, into parts not larger than two or three times the average distance between neighbouring AHN points, and then to compute the (normal) constrained TIN. Figure 6 shows the refined constrained TIN for one parcel. The edges of the parcel boundaries were split into parts of at most 10 meters. These edges were then used as constraints in the triangulation, which resulted in a refined constrained TIN. This improves the shape of the triangles considerably (too flat and too small triangles are avoided). Moreover, since points are added on the parcel boundaries, for which the height has been deduced from the unconstrained TIN, it is possible to represent more variation in height across a parcel boundary. Also the problem of many very small triangles (in the case of close 'near parallel' constraints and input points close to constraints) in the conforming TIN is avoided.
Fig. 6. A parcel surface based on a refined constrained TIN
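The edge splitting used for the refined constrained TIN can be sketched as follows. This is a generic densification routine, not the authors' code: every boundary segment is cut into equal parts no longer than a chosen maximum length (10 meters in the test described above) before being fed to the constrained triangulation.

import numpy as np

def densify(polyline, max_len=10.0):
    """Split every segment of a 2D polyline into parts no longer than max_len."""
    out = [tuple(polyline[0])]
    for a, b in zip(polyline[:-1], polyline[1:]):
        a, b = np.asarray(a, float), np.asarray(b, float)
        n = max(1, int(np.ceil(np.linalg.norm(b - a) / max_len)))
        for i in range(1, n + 1):
            out.append(tuple(a + (b - a) * (i / n)))
    return out

# A 25 m parcel edge becomes three pieces of about 8.3 m each.
print(densify([(0.0, 0.0), (25.0, 0.0)], max_len=10.0))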
3 Analysing and querying parcel surfaces from the DBMS The actual extraction of a parcel surface is performed within the Oracle DBMS. A kind of topologically structured model is used in which the triangles are not explicitly stored, but are represented via references to their nodes. During the analysis of parcel surfaces, all triangles that are covered by one parcel are selected by means of a spatial query. To select these triangles, the geometries of the triangles first need to be realised. To illustrate the query to extract a parcel surface from the DBMS, the refined constrained TIN has been used. To speed up the query, a function-based spatial index (R-tree) was first built on the TIN table:

create table TIN_vertex (id number(10), location sdo_geometry, z number(10));

create table TIN_r (id number(10), v1 number(10), v2 number(10), v3 number(10));

insert into user_sdo_geom_metadata values ('TIN_R', 'return_geom(id)',
  mdsys.sdo_dim_array(
    mdsys.sdo_dim_element('X', 0, 254330, .001),
    mdsys.sdo_dim_element('Y', 0, 503929, .001)), NULL);

create index tin_idx on tin_r(return_geom(id)) indextype is mdsys.spatial_index;
The spatial query to find all points or triangles that are located within one parcel can be performed with a spatial function (sdo_geom.relate in the Oracle implementation). The query to select the triangles that are within a specific query parcel (number 4589, municipality GBG00, section D) using this spatial function is:

select id, return_geom(id) shape
from tin_r, parcels par
where parcel=' 4589' and municip='GBG00' and section=' D'
and sdo_geom.relate(par.geom, 'COVEREDBY+INSIDE', return_geom(tin_r.id), 1)='TRUE';
For the unconstrained TIN we used the option 'anyinteract', since otherwise we miss the triangles that cross parcel boundaries. 3D area of parcel surface The cadastral map is a 2D map containing projection of parcels. Consequently the cadastral map does not contain the true area of surface parcels. In mountainous countries the true area of parcels may be needed, since tax rates are based on the area of parcels. The integrated TIN based on height data and parcels can also be used for obtaining the true area of a parcel. The area of a parcel in 3D space can be computed by summing up the true area in 3D space of all triangles covering one parcel. DBMSs do not support 3D data types and consequently they also do not contain functions to calculate the area in 3D. Arens (2003) describes a research in which a 3D
primitive, together with 3D functions, has been implemented as an extension of the Oracle geometrical model. The implementation is based on a proposal of (Stoter and Van Oosterom, 2002). To be able to compute the area of all triangles covering one parcel in 3D, the function 'area3D' that was implemented as part of the research of (Arens, 2003) was used. The 3D area calculation can therefore be performed inside the DBMS. First, we calculated the 2D area of the original parcel polygon. The query parcel is the parcel with a small 'hill' on it (see figure 4):

select sdo_geom.sdo_area(geom, 0.1)
from parcels
where parcel=' 4589' and municip='GBG00' and section=' D';
The area in 2D is 6,737 square meters. The 3D area of the same parcel, which resulted in 6,781 square meters, is computed with the following query:

select sum(area3d(return_geom(id)))
from tin_r, parcels par
where parcel=' 4589' and municip='GBG00' and section=' D'
and sdo_geom.relate(par.geom, 'COVEREDBY+INSIDE', return_geom(id), 0.1)='TRUE';
As can be seen from these results, the difference between the projected area and the real area in 3D of this parcel is 44 square meters. Other queries can be performed as well, e.g. find the steepest triangle, find all triangles facing south, or find the highest (lowest) point in this parcel:

select max(z), min(z)
from tin_vertex, parcels par
where par.parcel=' 4589' and par.municip='GBG00' and par.section=' D'
and sdo_geom.relate(par.geom, 'COVEREDBY+INSIDE', location, 0.1)='TRUE';

MAX(Z)   MIN(Z)
------   ------
 14.24   10.027
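The area3d function itself belongs to the extension described by Arens (2003) and is not listed in this paper. For a single triangle the underlying computation is simply half the norm of the cross product of two edge vectors; the sketch below (an illustration, not the DBMS implementation) also shows how a tilted triangle has a larger true area than its 2D projection, just as the parcel above does.

import numpy as np

def triangle_area_3d(p1, p2, p3):
    """True (3D) area of a triangle given its (x, y, z) corners."""
    p1, p2, p3 = (np.asarray(p, float) for p in (p1, p2, p3))
    return 0.5 * np.linalg.norm(np.cross(p2 - p1, p3 - p1))

# A triangle whose 2D projection has an area of 50 square meters, but which
# rises 5 m along one edge: its true 3D area is about 55.9 square meters.
print(triangle_area_3d((0, 0, 0), (10, 0, 0), (0, 10, 5)))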
4 Generalization of the integrated height and parcel TIN Both the conforming TIN and the refined constrained TIN (with constraints based on subdivided parcel boundaries in order to avoid long straight lines) look promising: the triangles are well shaped (not too flat and, in the case of the conforming TIN, the Delaunay criterion is fulfilled) and points are added on the parcel boundaries in order to represent more height variance along them. However, after some analyses we suspected that far too many points are used to represent the surface TIN with the same horizontal and vertical accuracy as the input data sets (AHN points and cadastral map). Note that this is already the case in the AHN input data, but it has become somewhat worse in the conforming TIN and the refined constrained TIN. A problem of having huge data sets is the resulting data volume and, with that, poor performance of queries and analyses. Therefore filtering of the data set aiming at data reduction (generalization) is needed. This section describes two methods to improve the initial integrated height and object model: a detailed-to-coarse approach (section 4.1) and a coarse-to-detailed approach (section 4.2). In section 4.3 a more advanced generalization method of the integrated model is discussed (that is, based on more than height only).

4.1 Detailed-to-coarse approach The first method starts with the complete integrated model. From this model a number of non-relevant point heights are removed while the significant points are maintained, e.g. by removing the points where the normal vectors of the incident triangles have a small maximum angle. After removing such a point, the triangulation is locally corrected and it is explicitly checked whether the height difference at the location of the removed point in the new TIN is within a given tolerance. If so, the point was indeed not significant for the TIN and can be removed. In this process the parcel boundaries are still needed as constraints, since the aim is to be able to select a parcel surface from the TIN. The filtering (aiming at data reduction) is based on filtering the TIN structure and not the point heights themselves. The filtering can use the characteristics of the height surface: at locations with little height variance points can be removed, while at locations with higher variance points are maintained to define the variation in height accurately. Important advantages of data reduction in a TIN structure, compared to gridded structures, are that it can be used on irregularly distributed points and that locations with high height variance will remain as such in the new data set. The prototype implementation is based on this method and more details can be found in section 5. The result of the generalization is shown in figure 7. This may be considered the first step of a larger generalization process (see section 4.3).
Fig. 7. Conforming TIN in which point heights and 2D planar partition of parcels are integrated, before (top) and after (bottom) filtering
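A minimal sketch of the local significance test described in section 4.1 is given below, assuming the TIN is available as a vertex array and a list of triangles (vertex index triples) with consistent, counter-clockwise orientation. It illustrates the criterion only; it is not the ArcView/Avenue code used for the prototype of section 5.

import numpy as np

def unit_normal(p1, p2, p3):
    n = np.cross(p2 - p1, p3 - p1)
    return n / np.linalg.norm(n)

def is_significant(vertex, points, triangles, threshold_deg=4.5):
    """A vertex is significant (characteristic) if two of its incident triangles
    differ in orientation by more than threshold_deg degrees."""
    incident = [t for t in triangles if vertex in t]
    normals = [unit_normal(*points[list(t)]) for t in incident]
    for i in range(len(normals)):
        for j in range(i + 1, len(normals)):
            cosang = np.clip(np.dot(normals[i], normals[j]), -1.0, 1.0)
            if np.degrees(np.arccos(cosang)) > threshold_deg:
                return True
    return False

Points that fail this test would only be removed if, after local re-triangulation, the vertical difference at their location stays within the tolerance (0.25 m in the experiments of section 5).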
4.2 Coarse-to-detailed approach The procedure described above starts with all available details and then tries to remove some of the less relevant details, which is not always easy. An alternative method would be starting with a very low detail model and then adding points where the errors are the largest. The initial model could be just the constraints (with estimated z-values at every point of the parcel boundary) inserted in a (conforming/constrained) TIN. In the next step the AHN height point with the largest distance to this surface is located. If this point is within eps_vert distance from the surface (epsilon tolerance in the vertical direction), then the model already satisfies the accuracy requirements. If this point is not within the tolerance, then it is added to the TIN (and the TIN is re-triangulated under the TIN conditions). This procedure is repeated until all AHN height points are within the tolerance distance. This procedure is a kind of 2.5D counterpart of the well-known Douglas-Peucker (Douglas and Peucker, 1973) line generalization.
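The coarse-to-detailed loop can be sketched as a greedy insertion procedure, shown below under simplifying assumptions: an unconstrained surface rebuilt from scratch each iteration and scipy's linear interpolator, whereas the approach described above would keep the parcel boundaries as constraints and re-triangulate locally.

import numpy as np
from scipy.interpolate import LinearNDInterpolator

def coarse_to_detailed(seed_points, candidates, eps_vert=0.25):
    """Insert the candidate with the largest vertical error into the model until
    every remaining candidate lies within eps_vert of the interpolated surface.
    seed_points: initial (x, y, z) points, e.g. boundary nodes with estimated z."""
    model = [tuple(p) for p in seed_points]
    remaining = [tuple(p) for p in candidates]
    while remaining:
        arr = np.array(model)
        surface = LinearNDInterpolator(arr[:, :2], arr[:, 2])
        rem = np.array(remaining)
        errors = np.abs(surface(rem[:, :2]) - rem[:, 2])
        errors = np.where(np.isnan(errors), np.inf, errors)  # outside the hull
        worst = int(np.argmax(errors))
        if errors[worst] <= eps_vert:
            break
        model.append(remaining.pop(worst))
    return np.array(model)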
4.3 Integrated height and object generalization Until now, only the height was taken into consideration during the generalization process, both in the detailed-to-coarse and in the coarse-to-detailed approach. However, as the model is supposed to be an integrated model of height and objects, the objects should also participate in the generalization. Therefore the integrated height and object model could be further generalized by taking into account both the elevation aspect and the 2D objects at the same time. It is already possible to separately generalize the terrain model (Bottelier, 2000; Brugelmann, 2000; Jacobsen and Lohmann, 2003; Passini and Betzner, 2002) and 2D objects (Douglas and Peucker, 1973; Van Oosterom, 1995). However, the integration of the height and object model also makes it well suited for truly integrated generalization of both the terrain elevation model and the object model at the same time. This is a novel approach. Starting with the detailed-to-coarse approach, one could identify the following steps:

Step 0: Integrate the raw elevation model (AHN) and the objects (parcel boundaries) in a (conforming or refined constrained) TIN; see section 2.

Step 1: Improve the efficiency of the TIN created in step 0 by removing AHN points under the conditions of the TIN until this is no longer possible given the maximum tolerance value eps_vert_1 (as described in section 4.1). Note that this tolerance could be adjusted for different circumstances, but the initial value should be of the same size as the accuracy of the input data.

Step 2: Now also start generalization of the object boundaries, for example with the Douglas-Peucker line generalization algorithm, by removing those boundary points which do not contribute significantly to the shape of the boundary. This can be done in 2D (standard Douglas-Peucker), but it is better to apply this algorithm in 3D (see the sketch after this list). Keep on removing points until this is impossible within the given tolerance eps_hor_1. After this line generalization of the constraints, re-triangulate the TIN according to the rules of step 1 (for a conforming or refined constrained TIN).

Step 3: Finally, for multi-resolution purposes, also start aggregating the objects, in our case for example parcels to sections (and the next aggregation level would be sections to municipalities, followed by municipalities to provinces, etc.). In fact this means removing some of the constrained edges (original parcel boundaries) from the input of the integrated model. Repeat steps 1 and 2 with other values for the epsilon tolerances at every aggregation level: eps_vert_2, eps_hor_2 (at the section level), eps_vert_3, eps_hor_3 (at the municipality level), and so on.
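Step 2 calls for Douglas-Peucker in 3D. A minimal recursive sketch is given below; the only change from the standard 2D algorithm is that the point-to-chord distance is measured in 3D, so that height differences along a boundary also count as shape. It assumes distinct end points for each sub-line.

import numpy as np

def point_to_segment_3d(p, a, b):
    p, a, b = (np.asarray(v, float) for v in (p, a, b))
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return np.linalg.norm(p - (a + t * ab))

def douglas_peucker_3d(line, eps_hor):
    """Simplify a list of (x, y, z) boundary points, keeping the end points and
    every point that lies more than eps_hor from the chord of its sub-line."""
    if len(line) < 3:
        return list(line)
    dists = [point_to_segment_3d(q, line[0], line[-1]) for q in line[1:-1]]
    split = int(np.argmax(dists)) + 1
    if dists[split - 1] <= eps_hor:
        return [line[0], line[-1]]
    left = douglas_peucker_3d(line[:split + 1], eps_hor)
    right = douglas_peucker_3d(line[split:], eps_hor)
    return left[:-1] + right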
5 Prototype The first steps (step 0 and step 1) have been implemented in a prototype. During the development of the filtering algorithm, inspiration was drawn from cellular automata and, more specifically, from the Game of Life (Wojtowicz, 2004). Cellular automata is the generic name for a set of mathematical point operators which repeatedly change the state of a collection of cells in a chaotic order. For example, in the 'Game of Life' the starting point is a grid with cells that can be either 'on' or 'off'. A small set of rules defines the state of a cell in the next generation, based on the states of the directly neighbouring cells. John Conway, the mathematician who invented the best-known cellular automaton, the 'Game of Life', defined the following criteria for cellular automata (Wojtowicz, 2004):
- There should be no initial pattern for which there is a simple proof that the population can grow without limit.
- There should be initial patterns that apparently do grow without limit.
- There should be simple initial patterns that grow and change for a considerable period of time before coming to an end in three possible ways:
  - fading away completely (from overcrowding or from becoming too sparse),
  - settling into a stable configuration that remains unchanged thereafter, or
  - entering an oscillating phase in which they repeat an endless cycle of two or more periods.
The beauty of these procedures is their fuzzy-like, unpredictable behaviour, which has provided useful, stable results in modelling, amongst others, urban development and chemical, physical and biological dynamic processes. In the prototype the basic idea of cellular automata is adopted, as the points in the surface TIN are considered to have a boolean state, namely 'characteristic' or 'non-characteristic', which determines whether the point makes a significant contribution to the shape of the surface model. The analogy with cellular automata can be carried further in the decision criteria, i.e. two criteria are defined: one that defines whether a point's state changes from 'characteristic' to 'non-characteristic' and one for the reverse operation. Whether points are characteristic depends in the prototype on the following conditions (note that these conditions are, analogously to cellular automata, based on local characteristics):
- A point is characteristic if the angles of the neighbouring triangles of the point are significantly different (Brugelmann, 2000). To detect this, the normal vectors of the neighbouring triangles are determined and compared. If the difference is bigger than a given threshold angle, the point is defined as characteristic and is not removed from the TIN.
- Local minima and maxima are also characteristic points of a TIN. If two neighbouring triangles face in the same direction, the change in angle in the first condition is less important than when the change in angle demarcates a top or a valley. Therefore, a smaller threshold angle is used in case the specific point is a local minimum or maximum. A minimum or maximum occurs when the azimuths of two neighbouring triangles are opposite to each other, which can be determined by calculating the difference in the azimuths. If this difference is bigger than a given threshold value, the smaller threshold angle is used in the first condition. In the software used, the azimuth of a triangle is one of the automatically generated TIN attributes and is therefore available without additional calculations.
Based on these conditions a point's state is determined. If two neighbouring triangles of one point already fulfil one of the criteria, the point's state will be 'characteristic' and the point will therefore be maintained. All other points are marked 'non-characteristic' and are therefore removed from the surface TIN. Subsequently, after determining these two types of nodes, it is checked whether all non-characteristic nodes are really allowed to be removed (given the height tolerance). This is done by calculating the height difference, at the location of the removed point, between the original TIN and the new TIN that is generated from the reduced data set. If this difference is bigger than a threshold value, the state changes from 'non-characteristic' to 'characteristic' and the removed point is re-added. This completes one iteration of data reduction. After this step the data reduction is performed again, as the data reduction is an iterative process that continues until a more or less stable data structure is obtained. The prototype has been implemented in the 3D Analyst extension of ArcView (ESRI) using the macro language of ArcView (Avenue). In ArcView the TIN is recognised as an object and therefore the TIN data structure can be used directly in the reduction algorithm; in addition, the results can easily be visualised. Figures 8 and 9 clearly illustrate how the filtering maintains all terrain shapes but reduces the number of points
substantially (height is exaggerated ten times). This prototype already shows the possibility of data reduction on a TIN, but it should be implemented as part of the database in the future, once a TIN data structure is supported as a data type in a geo-DBMS.
Fig. 8. Detail of filtering results: before (left) and after (right) data reduction
Fig. 9. Detail of filtering results: before (left) and after (right) data reduction
For our initial test, the data reduction is performed on the unconstrained TIN. This means that the 2D objects and the point heights are kept separate during the data reduction process, in order to get a first impression of the achievable results. Incorporating the constraints at this stage would have made the data reduction process more complex. Figure 7 already showed the conforming TIN of our test data set, before and after filtering. We did experiments with different parameters. The parameters that showed the best results for an initial generalization (filtering) were: a minimum angle between two neighbouring triangles of 4.5 and 3 degrees, respectively, for a point to be characteristic (depending on whether or not the point is a top or a valley), and a difference in azimuth between two neighbouring triangles of 120 degrees to determine whether two triangles are opposite to each other. The maximum allowed difference in height to determine whether a removed point should be re-added was 0.25 meters. Apart from the minimum angle, the chosen parameter values are based on previous research (Penninga, 2002). The data set used in this example covers an area of 1,450 by 800 meters and contains 44,279 AHN points (maximum z-value 14.2 meters, minimum z-value 6.7 meters, mean 9.5 meters). Three iteration steps were
used to filter the data set. In the first iterative step, 34,457 AHN points were removed (9,822 were considered to be characteristic). The average height difference between the original and the new TIN was 0.09 meter. 3,243 points were re-added since they exceeded the height difference of 0.25 meter, which resulted in 13,065 points after the first step. After the second iteration step 8,697 points were determined as characteristic, the average height difference between the new and the original point was again 0.09 meter, 3,469 points were re-added and this all resulted in 12,166 points. After the third iteration step, 8,455 points were considered to be characteristic (average height difference 0.09 meter) and 3,529 points were re-added. After this step the data reduction process was stopped. The results of the data reduction process are listed in the following table:

Table 1. Results of iteration steps in the data reduction process

Iteration step   # input points   # rem. points   # char. points   # re-added points   Reduction rate
1                44,279           34,457          9,822            3,243               70%
2                13,065           4,368           8,697            3,469               73%
3                12,166           3,711           8,455            3,529               73%
After the total data reduction process 11,984 points from the original 44,279 points were maintained. This is a reduction of 73%. As can be seen from figure 10, points were removed from areas with little height variance, while density of point heights in areas with high height variance (e.g. on the dikes) is still high.
Fig. 10. Results of data set after reduction (points not removed are black)
6 Conclusions Incorporating the planar partition of 2D objects, e.g. the cadastral map, into a height surface makes it possible to extract the 2.5D surfaces of 2D objects and to visualise 2D maps in a 3D environment by using 2.5D representations. As described and discussed in section 2, it is not easy and straightforward to create a good integrated elevation and object model. Several alternatives were investigated: unconstrained Delaunay TINs, constrained TINs, conforming TINs, and finally refined constrained TINs. After some analyses, the most promising solution, the refined constrained TIN, was selected and applied with success to our test case with real-world data: AHN height points and parcel boundaries. The integrated model, however, contains too many AHN points which do not contribute much to the actual terrain description. Therefore we proposed a method to generalize the integrated model. We implemented the first step of this method in a prototype. In this prototype non-characteristic points are removed from the (unconstrained) TIN in an iterative filtering process based on cellular automata. The filtering was conducted directly on the surface instead of on the point attributes. As can be concluded from experiments with the prototype, it is possible to determine important terrain characteristics by using a simple criterion (the difference in angle of neighbouring triangles). With this method it is possible to reduce the data set considerably. The test data set contained about 4 times fewer points after generalization, while still staying within an epsilon tolerance of the same size as the quality of the original input data sets. At the same time, significant information on the height surface is still available in the TIN. The initial filtering therefore yielded a much improved integrated model. Finally, it was indicated that the integrated height and object model could be further generalized by taking into account both the elevation aspect and the 2D objects at the same time. This is a completely new research topic and no previous results are known. Of course, it is possible to generalize the terrain model and the 2D objects separately. However, integrated generalization of the height and object model should also make this model well suited for other resolutions (scales) or even for a multi-resolution context. An initial algorithm has been outlined. Future work with respect to the integrated generalization of the height and object model includes:
- Implement the TIN data structure and triangulation within the geo-DBMS, as well as the data reduction methods (removing the need to import and export data between the DBMS and TIN applications).
- Implement more of the proposed integrated model, especially the boundary line generalization and the object aggregation.
- Test the proposed integrated generalization with the original AHN point data (that is, not filtered and not resampled to a grid structure).
- Maintain the result of the generalization in a multi-scale data structure, as the costs of the computations are significant. A data structure similar to the BLG-tree (for line generalization based on the Douglas-Peucker algorithm (Van Oosterom, 1995)) or the GAP-tree (for storing the result of generalization of an area partitioning (Van Oosterom, 1995a)) should be developed for the integrated height and object model.
- As indicated in section 2, the current TIN computation takes place in the 2D plane. It may be better to compute the integrated height and object model in true 3D space, based on tetrahedrons (and then finding the proper surface within this tetrahedron network) (Verbree, 2003).
- Test with applications based on the integration of height and objects from domains other than the current cadastral parcels; examples are topographic, soil type, or land-use data sets. These objects, too, can be aggregated into larger objects during the generalization (and the result can be stored in the earlier mentioned GAP-tree, although this is purely 2D until now).
Acknowledgements We would like to thank RWS-AGI and the Netherlands' Kadaster for making respectively their AHN and cadastral data available for our research. Further we would like to thank the developers of the Triangle software for making their product available. Finally, we would like to thank our partners in the GDMC (Geo Database Management Center), Oracle and ESRI, again for making their software available. This publication is the result of the research programme 'Sustainable Urban Areas' (SUA) carried out by Delft University of Technology.
References
Arens, C., J.E. Stoter, and P.J.M. van Oosterom, 2003, Modelling 3D spatial objects in a GeoDBMS using a 3D primitive. In Proceedings AGILE 2003, Lyon, France, April 2003.
Bottelier, P., 2000, Fast Reduction of High Density Multibeam Echosounder Data for Near Real-Time Applications. The Hydrographic Journal, (98), 2000.
Brugelmann, R., 2000, Automatic break line detection from airborne laser range data. International Archives of Photogrammetry and Remote Sensing, 33(B3/1):109–116, 2000.
Douglas, D.H. and T.K. Peucker, 1973, Algorithms for the reduction of points required to represent a digitized line or its caricature. Canadian Cartographer, 10:112–122, 1973.
Gorte, B., 2002, Segmentation of TIN-structured surface models. In Joint Conference on Geo-spatial Theory, Processing and Applications, Ottawa, Canada, July 2002.
Heerd, van, R.M., et al., 2000, Productspecificatie AHN 2000. Technical Report MDTGM 2000.13, Rijkswaterstaat, Meetkundige Dienst, 2000 (in Dutch).
Jacobsen, K. and P. Lohmann, 2003, Segmented filtering of laser scanner DSMs. In ISPRS Working Group III/3 Workshop '3D Reconstruction from Airborne Laserscanner and InSAR Data', Dresden, Germany, October 2003.
Lemmen, C.H.J., E. Oosterbroek, and P.J.M. van Oosterom, 1998, New spatial data management developments in the Netherlands Cadastre. In Proceedings of the FIG XXI International Congress, Commission 3, Land Information Systems, pages 398–409, Brighton, UK, July 1998.
Lenk, U., 2001, Strategies for integrating height information and 2D GIS data. In Joint OEEPE/ISPRS Workshop: From 2D to 3D, Establishment and Maintenance of National Core Spatial Databases, Hannover, Germany, October 2001.
Oosterom, van, P.J.M., 1995, The GAP-tree, an approach to 'On-the-Fly' Map Generalization of an Area Partitioning. In J.C. Müller, J.P. Lagrange, and R. Weibel, editors, GIS and Generalization, Methodology and Practice, chapter 9, pages 120–132. Taylor & Francis, 1995.
Oosterom, van, P.J.M. and V. Schenkelaars, 1995a, The Development of an Interactive Multi-Scale GIS. International Journal of Geographical Information Systems, 9(5):489–507, 1995.
Oosterom, van, P.J.M., J.E. Stoter, C.W. Quak, and S. Zlatanova, 2002, The balance between geometry and topology. In D. Richardson and P.J.M. van Oosterom, editors, Advances in Spatial Data Handling, 10th International Symposium on Spatial Data Handling, Ottawa, Canada, July 2002.
Passini, R. and D. Betzner, 2002, Filtering of digital elevation models. In Proceedings FIG, ACSM/ASPRS, Washington D.C., USA, April 2002.
Penninga, F., 2002, Detectie van kenmerkende hoogtepunten in TIN's voor iteratieve datareductie (in Dutch). In Geo-informatiedag Nederland 2002, Ede, the Netherlands, February 2002.
Shewchuk, J.R., 1996, Triangle: Engineering a 2D Quality Mesh Generator and Delaunay Triangulator. In First Workshop on Applied Computational Geometry, pages 124–133, Philadelphia, Pennsylvania, USA, May 1996.
Stoter, J.E. and B. Gorte, 2003, Height in the cadastre, integrating point heights and parcel boundaries. In Proceedings FIG Working Week, Paris, France, April 2003.
Stoter, J.E. and P.J.M. van Oosterom, 2002, Incorporating 3D geo-objects into a 2D Geo-DBMS. In Proceedings FIG, ACSM/ASPRS, Washington D.C., USA, April 2002.
Verbree, E. and P.J.M. van Oosterom, 2003, The STIN method: 3D surface reconstruction by observation lines and Delaunay TENs. In ISPRS Workshop '3D Reconstruction from Airborne Laserscanner and InSAR Data', Dresden, Germany, October 2003.
Wojtowicz, M., 2004, http://psoup.math.wisc.edu/mcell/, 2004.
Worboys, M.F., 1995, GIS, a computing perspective. Taylor and Francis, London, 1995.
An Image Analysis and Photogrammetric Engineering Integrated Shadow Detection Model
Yan Li1, Peng Gong1,2 and Tadashi Sasagawa3
1 International Institute for Earth System Science, Nanjing University, China, [email protected]
2 Department of Environmental Science, Policy, and Management, University of California, Berkeley, [email protected]
3 PASCO Corporation, Tokyo, 153-0043, Japan
Abstract A model for automatically detecting building shadows in high-resolution aerial remote sensing images is introduced in this paper. The space coordinates of the shadows are first computed using photogrammetric engineering. To do this, a digital surface model (DSM) and the sun zenith and azimuth angles are used. Using the camera model, the scanning line and the camera position are calculated for each space shadow. A contour-driven height field ray tracing is proposed to determine the visibility of a shadow. For a visible shadow, its projection in the image, called the measured shadow, is calculated by collinearity equations. Then, by image analysis, the reference segmentation threshold is obtained from the intensity distribution of the measured shadows. Finally, image segmentation is applied to obtain the precise image shadow areas. Keywords: Shadow detection, remote sensing, DSM, ray tracing, photogrammetric engineering.
1 Introduction In high resolution aerial images the buildings in urban area generally create shadows on the ground or other buildings. In some applications, such as image matching and change detection, the shadows will affect the analysis and cause wrong results. The objective of our research is to detect
the shadow areas by image processing and photogrammetric engineering, and to restore their color and intensities. For this reason, we detect the shadow coordinates of the buildings using photogrammetric engineering on the DSM, and project them to the image plane. Then we use image processing to remove these image shadows. This paper focuses on a shadow detection model that integrates image analysis and photogrammetric engineering. Shadow processing is a critical problem in image processing. In some of the literature, binocular image computations have been used to remove shadows (Chen and Rau 1993), (Zhou et al. 2003). A mathematical model was proposed in (Zhou et al. 2003) to detect occlusion by visibility analysis and photogrammetric engineering. When the building model is given, it is projected to the image and the coordinates of its corners in the image are calculated. The shadow area is then filled with the same position of the slave image taken simultaneously with the master image. Ray tracing is widely used in computer graphics as a method to determine the intersection of a ray with a 3D surface. It is used in visibility analysis to decide whether the light between a pair of points is occluded (Paglierroni and Petersen 1994), (Paglierroni 1997), (Paglierroni 1999), (Bittner 1999). Height field ray tracing traces the ray incrementally along successively encountered cells. Incremental methods are inefficient when applied to oblique rays that traverse large distances over the surface before intersecting it. One approach is to apply the results of height field preprocessing to the ray tracing algorithm. This approach boosts ray tracing efficiency by reducing the number of ray tracing steps required (Paglierroni and Petersen 1994).
The image used in our research is an aerial photo of 20 cm resolution acquired by the line scanning sensor ADS40. The other data source is the digital surface model (DSM) of the same region, at 1 m resolution. We developed a model to automatically detect and segment the shadows of the buildings. It first computes the space coordinates of a building shadow from the DSM and the camera model by photogrammetry (Zhang and Zhang 1997), (Gong et al. 1996) and projects it to the image plane. Then the shadow area in the image is precisely segmented and labeled. In the shadow computation and ray tracing, we propose a building contour driven model. The ray tracing is based on the parameter plane transform (PPT). Experiments show that the model can precisely detect the shadow areas.
In the second part of this paper, the mathematical model of shadow detection and segmentation is described. The third part presents the experiments and the discussion.

2 Shadow Detection and Segmentation The photo scanned by the ADS40 is rectified through level 0 and level 1 rectification to create the pseudo-orthoimage used in this research. We have studied three approaches to detecting and segmenting the shadows in the image.

1. Detect and segment shadows using image analysis only. The results are correct in most cases. However, because of the complexity of the urban environment, some factors may affect the detection of shadows. For example, high-reflectivity ground or the glass wall of a building will make some of the corresponding shadows bright and cause their intensities to be close to those of uncovered areas. Such shadows may not be detected by image analysis techniques. Besides, the segmentation threshold is difficult to decide. As a result, the segmentation of the shadows is not very reliable or robust.

2. Compute the shadow locations in the RGB image by photogrammetric engineering, using the camera model, the digital surface model (DSM) and the image together. If the DSM were as precise as the camera model, the result would be perfect. However, for the time being, this is not the case. The primary locations of the buildings in the DSM are fine, but the resolution is lower than that of the image. In addition, the DSM loses many details of the buildings. Obviously, using this method alone, the shadows computed in the image will have errors.

3. Based on the above facts, we propose a new approach integrating the advantages of the two methods. In detail, since the localization of method 2 is reliable, the cast shadows are first computed by method 2, the shadow areas and their corresponding bright areas are labeled, and their statistics are calculated. The reference segmentation threshold is obtained from these statistics. Then the fine segmentation of the shadows is carried out by image analysis. This strategy ensures correct shadow localization, so that no false or missing shadows occur. Meanwhile, the details of the shadow shapes are preserved.
2.1 Coordinate of the shadow in 3D space
Fig. 1 Rotated DSM
The ADS40 camera model and the DSM are used in this stage to obtain the shadow locations in space. The DSM gives the heights of the surface of the ground, including the buildings and the trees. When the altitude of the sun is given, the shadows cast by the buildings and trees can be calculated. The DSM represents a two-dimensional height field: the rows of the DSM correspond to latitude and the columns to longitude. A position in the DSM plane is called a cell. The sun rays are assumed to have the same direction everywhere, which can be represented by the zenith angle z and the altitude angle a. The two angles are obviously independent. For convenience, we first rotate the DSM by an angle, namely the zenith angle, so that the sun direction points horizontally from left to right in the rotated DSM, as shown in Figure 1. Because a shadow must be created along the sun zenith direction, if a cell in some row of the rotated DSM creates a shadow, the shadow cell must be in the same row. Thus, we can compute the shadow coordinates in the one-dimensional case. Consider a row of the rotated DSM; it is now a one-dimensional height field. A cell in this row will, if possible, create a shadow along the horizontal direction. The cells between these two cells must be shadows too (when no other models are supplied besides the DSM), as illustrated in Figure 2. Since the study region is urban, the shadows are caused mostly by the buildings. Thus we adopt a mathematical model driven by the building contours. The height signal breaks at the cells corresponding to a building contour. Consequently, a contour cell with a negative height difference along the zenith direction will create a shadow, which is called a fore-end shadow. Only building contours and fore-end shadows are detected in our approach, not all the cells; the other shadows are determined directly from them. This approach speeds up the shadow detection procedure.
For a row of the rotated height field, the following steps are needed to compute the shadow locations:
1. Detect the edges of the one-dimensional height field using the Laplacian of Gaussian operator.
2. Let the current cell be c=C.
3. Determine whether the current cell c is a fore-end shadow using the simple height comparison below. If yes, go on to detect the shadows in the same section, i.e. the cells between the fore-end shadow cell and the first edge cell e to its left, and update the current cell as c=e-1; otherwise, let c=c-1.
4. If c=1, stop; otherwise, go to step 3.
The height comparison consists of the following steps:
1. Starting with the current cell c, sequentially search the cells c-1, c-2, .... If c=1, stop; otherwise, let the search step length be s=1.
2. If c-s=1, cell c is not a shadow. Otherwise, if the height of cell c-s, h[c-s], is greater than or equal to the height of the ray corresponding to c at cell c-s, tan(a)×s×r, where r is the resolution of the DSM, then cell c is a fore-end shadow; otherwise, let s=s+1 and go to step 2. (A sketch of this per-row computation follows Figure 2.)
Fig. 2 Shadow computation in one dimensional height field
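A minimal sketch of the per-row shadow test is given below. It is an interpretation of the description above, not the authors' code: the ray height at cell c-s is taken relative to the height of cell c (i.e. a blocker shadows cell c when h[c-s] >= h[c] + tan(a)*s*r), and every cell is tested rather than only contour cells, so the contour-driven speed-up is left out.

import numpy as np

def shadow_row(heights, altitude_deg, resolution=1.0):
    """Boolean shadow mask for one row of the rotated DSM; the sun shines from
    the left, along increasing cell index."""
    tan_a = np.tan(np.radians(altitude_deg))
    shadow = np.zeros(len(heights), dtype=bool)
    for c in range(1, len(heights)):
        for s in range(1, c + 1):
            if heights[c - s] - heights[c] >= tan_a * s * resolution:
                shadow[c] = True
                break
    return shadow

# A 5 m block at cell 2 with the sun at 45 degrees shadows the five cells behind it.
print(shadow_row(np.array([0, 0, 5, 0, 0, 0, 0, 0], dtype=float), 45.0))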
2.2 Ray tracing The objective of ray tracing here is to decide if a space shadow is visible for a scan line. A scanning line corresponds to a position of the camera at a moment. The so-called ray refers to the line between the shadow and its corresponding camera position. Figure 3(a) shows the ray between a shadow and the camera in a height field. (b) shows the heights in one dimension of the cells corresponding to the ray in (a). For a shadow in space, it is first determined to be scanned by a certain scanning line, supposing no occluding happens. This is implemented by the camera model (LH System 2001). Then ray tracing is carried out to determine if it is visible. If it is visible, the corresponding coordinate in the projecting image, i.e. the
measured shadow, is computed by collinearity equations (LH System 2001).
Fig. 3 Height field and one dimensional height field from camera to shadow
The fundamental ray tracing approach is the incremental method, in which the ray is traced at each cell (Paglierroni and Petersen 1994), (Paglierroni 1999). One way to boost the efficiency is height field preprocessing. The approach to height field preprocessing here is to apply the parameter plane transform to the height field surface, or DSM; see Figure 4 (Paglierroni and Petersen 1994). For a height field cell, the empty space above it can be represented by an inverted cone. Whatever the direction of the ray is, it will not intersect any volume within the cone. Cone 1 corresponds to cell 1, which is the current tracing cell. The intersection of the ray with cone 1 corresponds to cell 2, so the next tracing cell is cell 2. It can be seen that, using parameterized ray tracing, far fewer tracing steps are needed than with incremental ray tracing.
Fig. 4 Ray tracing steps for parameter plane transform method
The height field transform is performed before ray tracing. The parameter field is stored in memory and accessed during the ray tracing procedure, so for a large height field the efficiency of the computation and the memory requirement have to be considered. Actually, in an urban area, which is our case, if a shadow is occluded it is always occluded by a building. Furthermore, the ray must intersect the walls of the building that occludes the shadow. Thus, we improved the ray tracing algorithm to make it faster: the tracing is only performed for the contour cells of the buildings, and therefore only the contour cells need the parameter transform. When the intersection of the ray with the cone corresponding to the current cell is located at a cell before the current cell, the shadow is occluded and the iteration stops. Otherwise, the tracing goes on to the next contour cell. The shadow is visible if no occlusion occurs. Figure 5 illustrates the tracing procedure with an example. In a 1D height field starting from the current camera position and ending at the shadow point, cell c1 is the start of the tracing. It corresponds to the first cone and its parameter. The ray intersects the cone corresponding to cell c1, and the intersection corresponds to cell s1. Because s1 is after c1, the tracing goes on to the next contour cell c2. The ray intersects the cones corresponding to c2 and c3 at s2 and s3 respectively. Since s3 is before c3, the tracing stops and the shadow is occluded. If a shadow is visible, its projection at the average ground height is then computed according to the collinearity equation. Its projection in the image plane is obtained by rotating, scaling, and translating.
Fig. 5 Height field ray tracing steps
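The cone parameters themselves are not given in this paper, so the sketch below only shows the baseline incremental visibility test that the parameterized (cone-stepping) and contour-driven variants accelerate: walk the straight line from the camera position to the shadow cell and report occlusion as soon as the terrain rises above the ray. It assumes the DSM as a 2D array and positions given as (row, column, height).

import numpy as np

def visible(dsm, camera, shadow, samples_per_cell=2):
    """Baseline incremental test: True if no DSM cell between camera and shadow
    rises above the straight ray connecting them."""
    camera, shadow = np.asarray(camera, float), np.asarray(shadow, float)
    n = max(2, int(np.hypot(*(shadow[:2] - camera[:2])) * samples_per_cell))
    for t in np.linspace(0.0, 1.0, n, endpoint=False)[1:]:
        row, col, height = camera + t * (shadow - camera)
        if dsm[int(round(row)), int(round(col))] > height:
            return False
    return True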
Ray tracing is computationally expensive. If we performed ray tracing for every shadow, the computation would be unaffordable. In this paper, a simplified tracing model is proposed: we determine the visibility of a shadow based on the spatial relations between shadows. By doing this, quite a lot of shadows are assigned their visibility without ray tracing, and the shadow detection time is greatly reduced.
The principle of our simplified tracing model is described as follows. The scanning is carried out during the flight, so the sequence of the scanning lines is related to the flight direction. We define the relevant cell of a cell as its adjacent cell along the flight direction. Suppose that a shadow cell m is visible for a scanning line k. If the relevant cell m+1 of m is a shadow too, its visibility can be discussed in two cases. 1. If the height of cell m+1 is not less than that of cell m, it must be visible for some scanning line k+l (l>0). For, if it were occluded, there would have to be an object between the current camera position and cell m+1, located at a cell m-n (n>0) and intersecting the ray between the cell and the camera. Suppose that the heights of the intersections of this object with the rays corresponding to cell m+1 and cell m are h_{m+1} and h_m respectively, as shown in figure 6; it can be seen from the illustration that we must have h_{m+1} > h_m. This means that cell m would be occluded by the same object, contradicting its visibility.
Fig. 6 The relation between the visibility and the heights of a shadow cell and its relevant cell

Therefore, our rule is: for a visible shadow cell, if its relevant cell is a shadow with a height not less than that of the visible shadow cell, it is visible too. For example, in figure 7 the flight direction of the plane is horizontal, from right to left. The shadows in (a) are caused by the building in (a) and the sun is to the left of the building; the shadows in (b) are caused by the building in (b) and the sun is to the right of the building. These shadow cells are adjacent to each other and have the same heights. If the rightmost shadow cell is traced as visible then, according to our rule, the shadow cell to its left is also visible. Furthermore, all the other shadow cells are visible too, for both (a) and (b).
Fig. 7 Simplified tracing model
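A small sketch of this propagation rule is shown below, assuming the shadow cells of one row are given in flight order, a height lookup, and a trace(c) function that performs the full ray tracing of section 2.2; it illustrates the rule only and is not the authors' implementation.

def propagate_visibility(shadow_cells, height, trace):
    """Return the set of visible shadow cells, ray tracing only when the
    simplified rule cannot be applied.
    shadow_cells: cell indices ordered along the flight direction.
    height: maps a cell index to its surface height.
    trace: full ray-tracing visibility test for a single cell."""
    visible_cells = set()
    previous = None
    for c in shadow_cells:
        inherits = (previous is not None and previous in visible_cells
                    and c == previous + 1 and height[c] >= height[previous])
        if inherits or trace(c):
            visible_cells.add(c)
        previous = c
    return visible_cells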
2.3 Integrated shadow detection and location The shadows derived from the last stage do not fit very well with what we observe in the image. Because the DSM itself lacks details, such as the fine structure of the buildings, it is not practical to rely on the DSM alone to give a precise result. However, the ray tracing result gives approximately correct positions of the shadows, which is helpful for the subsequent shadow segmentation. Since the shadow area now contains most of the observed image shadows and only small errors, the statistics of the shadow area approximately reflect the distribution of the shadow intensities. Therefore, in the first stage, that is, from the result of the ray tracing, a reference segmentation threshold is obtained as the mean of the intensities of the shadow area. In the image histogram, the local minimum which is closest to the reference threshold is then taken as the threshold for shadow segmentation. Thus, the final segmentation gives a more precise shadow location.
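The threshold choice can be sketched as follows; the amount of histogram smoothing is an assumption of the sketch, not a value given in the paper.

import numpy as np

def shadow_threshold(gray, shadow_mask, bins=256):
    """gray: 2D intensity image; shadow_mask: boolean mask of the measured
    (ray-traced) shadows.  Returns the histogram local minimum closest to the
    mean intensity of the masked shadow pixels."""
    reference = gray[shadow_mask].mean()
    hist, edges = np.histogram(gray, bins=bins)
    hist = np.convolve(hist, np.ones(5) / 5.0, mode="same")   # light smoothing
    centers = 0.5 * (edges[:-1] + edges[1:])
    minima = [i for i in range(1, bins - 1)
              if hist[i] <= hist[i - 1] and hist[i] <= hist[i + 1]]
    if not minima:
        return reference
    return centers[min(minima, key=lambda i: abs(centers[i] - reference))]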
3 Results and Discussion We have made experiments using aerial image and the associated data to test our shadow detection model. The study field is Tsukuba, Japan. The resolution of the image is 20cm. Figure 8 shows some patches of the image and the effects of shadow detection. In Figure 8, (a) and (d) are two patches of the original image. (b) and (e) refer to the compound image by superposing the shadow area contour detected through photogrammetric engineering to the original image for (a) and (d) respectively. The closing white curves represent the detected shadow area contours. (c) and (f) refer
Fig. 8 Experiment images and the corresponding ones superposed with shadow area contours
to the compound images obtained by superposing the shadow contours detected by the integrated model on the original images (a) and (d), respectively. It can be seen that the detected shadow area contours in (b) and (e) do not fit very well with what we observe in the image, while those from the integrated model are both correct and precise, as shown in (c) and (f). The experiment on the whole image supports the same conclusion, i.e. the detection model proposed in this paper gives better results than the photogrammetric method alone. This paper proposes a model for shadow detection that integrates several data sources; by taking advantage of both the photogrammetric method and the image analysis method, a better result is obtained. Based on the shadow detection result, an image processing technique is used to restore the color and intensities of the shadow pixels in order to remove the shadow effects. The remaining work will be presented in another paper.
References
Bittner, J., 1999, Hierarchical techniques for visibility determination, Postgraduate Study Report, DC-PSR-99-05.
Chen, L.C. and Rau, J.Y., 1993, A unified solution for digital terrain model and orthoimage generation from SPOT stereopairs, IEEE Trans. on Geoscience and Remote Sensing, 31(6), 1243–1252.
Gong, P., Shi, P., Pu, R., et al., 1996, Earth Observation Techniques and Earth System Science (Science Press, Beijing, China).
Paglierroni, D.W. and Petersen, S.M., 1994, Height distributional distance transform methods for height field ray tracing, ACM Transactions on Graphics, 13(4), 376–399.
Paglieroni, D.W., 1997, Directional distance transforms and height field preprocessing for efficient ray tracing, Graphical Models and Image Processing, 59(4), 253–264.
Paglieroni, D.W., 1999, A complexity analysis for directional parametric height field ray tracing, Graphical Models and Image Processing, 61, 299–321.
Zhang, Z. and Zhang, J., 1997, Digital Photogrammetry (Press of WTUSM, Wuhan, China).
Zhou, G., Qin, Z., Kauffmann, P. and Rand, J., 2003, Large-scale city true orthophoto mapping for urban GIS application, ASPRS 2003 Annual Conference Proceedings.
ADS40 Information Kit for Third-Party Developers, 2001, www.gis.leica-geosystems.com
Understanding Taxonomies of Ecosystems: a Case Study
Alexandre Sorokine1 and Thomas Bittner2
1 Geography Department, University at Buffalo, [email protected]
2 Institute of Formal Ontology and Medical Information Systems, Leipzig, [email protected]
Abstract This paper presents a formalized ontological framework for the analysis of classifications of geographic objects. We present a set of logical principles that guide geographic classifications and then demonstrate their application on a practical example of the classification of ecosystems of Southeast Alaska. The framework has a potential to be used to facilitate interoperability between geographic classifications.
1 Introduction Any geographic map or a spatial database can be viewed as a projection of a classification of geographic objects onto space [1]. Such classification can be as simple as a list of objects portrayed on the map or as complex as a multi-level hierarchical taxonomy such as used in the areas of soil or ecosystem mapping [2]. However, in any of these cases classification is predicated on a limited set of rules that ensure its consistency within itself and what it is projected on. Classifications of geographic objects, if compared to classifications in general, have certain peculiarities because geographic objects inherit many of their properties from the underlying space. Classifications of geographic objects typically manifest themselves as map legends. The goal of this paper is to develop a formalized framework for handling of the structure of and operations on classifications of geographic objects. There are three groups of purposes for development of this framework: (1) such a framework would allow better understanding of the existing classification systems and underlying principles even for non-experts, (2) the framework can provide useful tips for developing new classification systems with improvements in terms of consistency and generality, (3) the framework would allow
Fig. 1. Interoperability through representation models

Each classification is a representation (representations are denoted by R1 and R2 in Fig. 1) of the real world W, created using a unique and finite chain of operations (o1 . . . on and q1 . . . qn respectively). The goal of our research is to describe the operations o1 . . . on and q1 . . . qn in a clear, formal and unambiguous manner. This knowledge will allow us to build a new representation R that can accommodate both R1 and R2, thus achieving interoperability (shown in the diagram as double arrows) between them and possibly with other representations.
2 Classification of Ecological Subsections of Southeast Alaska
To demonstrate our theory we will use the classification of ecological subsections of Southeast Alaska [3] as a running example (Fig. 2). This classification was developed by the USDA Forest Service; it subdivides the territory of Southeast Alaska into 85 subsections that represent distinct terrestrial ecosystems. The purpose of the classification is to provide a basis for practical resource management, decision making, planning, and research. The classification has three levels, which are depicted in Table 1. The first level (Roman numerals in Table 1) subdivides the territory into three terrain classes: active glacial terrains, inactive glacial terrains and post-glacial terrains. At the next level (capital letters in Table 1) the territory is subdivided according to its physiographic characteristics. The third level of the classification (numbers in Table 1) divides the territory by lithology and surface deposits.
Table 1 (excerpt):
  D. Lowlands
    1. Till Lowlands
    2. Outwash Plains
    3. Glaciomarine Terraces
    4. Wave-cut Terraces
III. Post-glacial Terrains
  A. Volcanics

Key: Roman numerals: terrain classes; capital letters: physiographic classes; numbers: geologic classes
3 A Formal Theory of Classes and Individuals
In this section we introduce the logical theories needed to formalize the relations behind ecosystem classifications and demonstrate their application using the classification of ecological subsections of Southeast Alaska (Table 1 and Fig. 2) as a running example. The theories are formalized in first-order predicate logic with variables x, y, z, z1, . . . ranging over individuals and variables u, v, w, w1, . . . ranging over classes. Predicates always begin with a capital letter. The logical connectors ¬, =, ∧, ∨, →, ↔, ≡ have their usual meanings: not, identical-to, and, or, 'if . . . then', 'if and only if' (iff), and 'defined to be logically equivalent'. We write (x) to symbolize universal quantification and (∃x) to symbolize existential quantification. Leading universal quantifiers are assumed to be understood and are omitted. A strict distinction between classes and individuals is one of the cornerstones of our theory. Typically classifications and map legends show only classes. However, the diagram in Fig. 2 shows a mix of classes and individuals. In our understanding ecological subsections, i.e., entities such as "Behm Canal Complex", "Summer Strait Volcanics", "Soda Bay Till Lowlands" and the other leaves of the hierarchy, are individuals. All other entities that are not leaves (e.g., "Active glacial terrains", "Granitics", "Volcanics", etc.) are classes. In the same sense, Table 1 shows only the classes of the classification.
3.1 The Tree Structure of Classes
Examples of classes are the class human being, the class mammal, the class ecosystems of the polar domain, the class inactive glacial terrains, etc. Classes are organized hierarchically by the is-a or subclass relation, in the sense that a male human being is-a human being and a human being is-a mammal, or, using our example, "Exposed Bedrock" is-a "Deglaciated Area". In the present paper the is-a or subclass relation is denoted by the binary relation symbol ⊑ and we use the symbol ⊏ for the proper subclass relation. We write u ⊑ v to say that class u stands in the subclass relation to class v, and we call v a superclass of u if u ⊑ v holds. The proper subclass relation is asymmetric and transitive (ATM1–2). It corresponds very closely to the common understanding of the is-a (kind-of) relation:

(ATM1) u ⊏ v → ¬(v ⊏ u)
(ATM2) (u ⊏ v ∧ v ⊏ w) → u ⊏ w

Axiom ATM1 postulates that if u is a proper subclass of v then v is not a proper subclass of u. Transitivity (ATM2) implies that all proper subclasses of a class are also proper subclasses of the superclasses of that class. In our example (Table 1) the class "Exposed Bedrock" is a proper subclass of "Recently Deglaciated Areas", which in turn is a proper subclass of "Active Glacial Terrains". Due to the transitivity of the proper subclass relation we can say that the class "Exposed Bedrock" is also a proper subclass of the class "Active Glacial Terrains". Next we define the subclass relation by D⊑. Unlike the proper subclass relation, the subclass relation allows a class to be a subclass of itself:

(D⊑) u ⊑ v ≡ u ⊏ v ∨ u = v

One can then prove that the subclass relation (⊑) is reflexive, antisymmetric and transitive, i.e., a partial ordering. Class overlap (O) is defined by DO:

(DO) O uv ≡ (∃w)(w ⊑ u ∧ w ⊑ v)

Classes overlap if there exists a class that is a subclass of both, e.g., in Table 1 the class "Icefields" overlaps with the class "Active Glacial Terrains". We now add the definitions of a root class and an atomic class (atom). A class is a root class if all other classes are subclasses of it (Droot). A class is an atom if it does not have a proper subclass (DA):

(Droot) root u ≡ (∀v)(v ⊑ u)
(DA) A u ≡ ¬(∃v)(v ⊏ u)
In our example the root class of the classification would be the class of all Southeast Alaska ecological subsections (Fig. 2). The geologic classes designated
with numbers in Table 1 are atoms because they do not have any proper subclasses. In practice, root classes are often not specified explicitly in classifications, but their existence is implied. For example, Table 1 does not contain a root class, but it can be inferred from the context that the root class is "Southeast Alaska ecological subsections". Since the hierarchy formed by the classes of Southeast Alaska ecological subsections is the result of a scientific classification process, we can assume that the resulting class hierarchy forms a tree. We are justified in assuming that scientific classifications are organized hierarchically in tree structures because scientific classification employs the Aristotelian method of classification. As [4] point out, in the Aristotelian method the definition of a class is the specification of the essence (nature, invariant structure) shared by all instances of that class. Definitions according to Aristotle's account are specified by (i) working through a classificatory hierarchy from the top down, with the relevant topmost node or nodes acting in every case as undefinable primitives. The definition of a class lower down in the hierarchy is then provided by (ii) specifying the parent of the class (which in a regime conforming to single inheritance is of course in every case unique) together with (iii) the relevant differentia, which tells us what marks out instances of the defined class or species within the wider parent class or genus, as in: human = rational animal, where rational is the differentia (see also [5] for more details). We now add axioms that enforce tree structures of the form shown in Fig. 3(a) and rule out structures of the kind shown in Figs. 3(b) and 3(c). These additional axioms fall into two groups: axioms which enforce the tree structure and axioms which enforce the finiteness of this structure. We start by discussing the first group. Firstly, we demand that there is a root class (ATM3). Secondly, we add an axiom to rule out cycles in the class structure: if two classes overlap then one is a subclass of the other (ATM4). This rules out the structure in Fig. 3(b), and it holds for our running example: of any two overlapping classes in Table 1, one is a subclass of the other. Thirdly, we add an axiom to the effect that if u has a proper subclass v, then there exists a class w such that w is a subclass of u and w and v do not overlap (ATM5). This rules out cases where a class has a single proper subclass or a chain of nested proper subclasses. Following [6] we call ATM5 the weak supplementation principle.

(ATM3) (∃u) root u
(ATM4) O uv → (u ⊑ v ∨ v ⊑ u)
(ATM5) v ⊏ u → (∃w)(w ⊑ u ∧ ¬O wv)

Upon inspection, the classification in Table 1 violates the weak supplementation principle (axiom ATM5) because the class "Post-glacial Terrains" has only a single subclass, "Volcanics". For this reason the classification in Table 1 is not a model of our theory. We will discuss this case in detail in Sect. 4 and show that the underlying classification does satisfy our axioms but that additional operations have been performed on these structures.
Fig. 3. Trees and non-trees: (a) proper tree, (b) overlaps, (c) multiple roots
Using Droot, the antisymmetry of ⊑, and ATM4 we can prove the uniqueness of the root class. This rules out the structure shown in Fig. 3(c). The second group of axioms, which characterize the subclass relation beyond the properties of a partial ordering, enforce the finiteness of the subclass tree. ATM6 ensures that every class has at least one atom as a subclass, so that no branch in the tree structure is infinitely long. Finally, ATM7 is an axiom schema which enforces that every class is either an atom or has only finitely many subclasses, so that class trees cannot be arbitrarily broad.

(ATM6) (∃x)(A x ∧ x ⊑ u)
(ATM7) ¬A u → (∃x1 . . . xn)((⋀1≤i≤n xi ⊑ u) ∧ (z)(z ⊑ u → ⋁1≤i≤n z = xi))

Here ⋀1≤i≤n xi ⊑ y is an abbreviation for x1 ⊑ y ∧ . . . ∧ xn ⊑ y, and ⋁1≤i≤n z = xi abbreviates z = x1 ∨ . . . ∨ z = xn.
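To make the tree axioms concrete, the following is a minimal, illustrative Python sketch (not part of the original formalization) that checks a finite class hierarchy against ATM3–ATM5. The class names and parent–child links are a hypothetical fragment modeled on Table 1; run on it, the check reports the weak supplementation violation caused by "Post-glacial Terrains" having the single subclass "Volcanics".

```python
# Illustrative check of ATM3-ATM5 on a finite class hierarchy (assumed fragment of Table 1).
from itertools import product

# parent -> list of immediate subclasses (hypothetical fragment)
children = {
    "Southeast Alaska ecological subsections": ["Active Glacial Terrains",
                                                "Inactive Glacial Terrains",
                                                "Post-glacial Terrains"],
    "Active Glacial Terrains": ["Icefields", "Recently Deglaciated Areas"],
    "Inactive Glacial Terrains": ["Rounded Mountains", "Hills"],
    "Post-glacial Terrains": ["Volcanics"],          # single subclass -> violates ATM5
}

classes = set(children) | {c for subs in children.values() for c in subs}

def proper_subclasses(u):
    """All v properly below u (transitive closure of the child relation)."""
    out, stack = set(), list(children.get(u, []))
    while stack:
        v = stack.pop()
        if v not in out:
            out.add(v)
            stack.extend(children.get(v, []))
    return out

def subclass(v, u):                      # v is-a u (reflexive closure)
    return v == u or v in proper_subclasses(u)

def overlap(u, v):                       # DO: some class below both
    return any(subclass(w, u) and subclass(w, v) for w in classes)

roots = [u for u in classes if all(subclass(v, u) for v in classes)]
print("ATM3 (root exists, unique):", len(roots) == 1, roots)

atm4 = all(subclass(u, v) or subclass(v, u)
           for u, v in product(classes, classes) if overlap(u, v))
print("ATM4 (overlap implies nesting):", atm4)

for u in classes:                        # ATM5: weak supplementation
    for v in proper_subclasses(u):
        if not any(subclass(w, u) and not overlap(w, v) for w in classes):
            print("ATM5 violated:", u, "has no subclass disjoint from", v)
```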
3.2 Classes and Individuals
Classes and individuals are connected by the relation of instantiation, InstOf xu, whose first parameter is an instance and whose second parameter is a class. InstOf xu is interpreted as 'individual x instantiates the class u'. From our underlying sorted logic it follows that classes and individuals form disjoint domains, i.e., there cannot exist an entity which is both a class and an individual. Therefore instantiation is irreflexive, asymmetric, and non-transitive. In terms of our theory each individual (an ecological subsection) instantiates a class of the subsection hierarchy. A single class can be instantiated by several individuals. For example, we can say that the individuals "Behm Canal Complex", "Berg Bay Complex" and others instantiate the class "Rounded Mountains". Axiom (ACI1) establishes the relationship between instantiation and the subclass relation. It tells us that u ⊑ v if and only if every instance of u is also an instance of v.
(ACI1) u ⊑ v ↔ (x)(InstOf xu → InstOf xv)
(TCI1) u = v ↔ (x)(InstOf xu ↔ InstOf xv)
From (ACI1) it follows that two classes are identical if and only if they are instantiated by the same individuals (TCI1). Finally we add an axiom guaranteeing that if two classes share an instance then one is a subclass of the other (AI2):

(AI2) (∃x)(InstOf xu ∧ InstOf xv) → (u ⊑ v ∨ v ⊑ u)

AI2 can be illustrated using the following example: the classes "Inactive Glacial Terrains" and "Rounded Mountains" share the instance "Kook Lake Carbonates" (Fig. 2), and "Rounded Mountains" is a subclass of "Inactive Glacial Terrains".
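The following small sketch (hypothetical instance data, not from the paper) reads ACI1 and AI2 operationally: given a table recording which classes each individual instantiates, the subclass relation can be recovered by comparing instance sets, and violations of AI2 show up as pairs of classes that share an instance without being nested.

```python
# Hypothetical instance table: individual -> set of classes it instantiates
# (each subsection instantiates its class and, via ACI1, all superclasses).
inst_of = {
    "Behm Canal Complex":      {"Rounded Mountains", "Inactive Glacial Terrains"},
    "Berg Bay Complex":        {"Rounded Mountains", "Inactive Glacial Terrains"},
    "Kook Lake Carbonates":    {"Rounded Mountains", "Inactive Glacial Terrains"},
    "Thorne Arm Granitics":    {"Hills", "Inactive Glacial Terrains"},
}

classes = {c for cs in inst_of.values() for c in cs}
instances = {c: {x for x, cs in inst_of.items() if c in cs} for c in classes}

# ACI1 read as a definition: u is a subclass of v iff every instance of u instantiates v.
def subclass(u, v):
    return instances[u] <= instances[v]

# TCI1: classes with identical instance sets should be identical (consistency check).
identical = [(u, v) for u in classes for v in classes
             if u < v and instances[u] == instances[v]]

# AI2: any two classes sharing an instance must be nested one way or the other.
ai2_violations = [(u, v) for u in classes for v in classes
                  if u < v and instances[u] & instances[v]
                  and not (subclass(u, v) or subclass(v, u))]

print("Rounded Mountains is-a Inactive Glacial Terrains:",
      subclass("Rounded Mountains", "Inactive Glacial Terrains"))
print("classes with identical extensions:", identical)
print("AI2 violations:", ai2_violations)
```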
4 Applying the Theory to Multiple Classifications
As mentioned in Sect. 3.1, the Southeast Alaska ecological subsections hierarchy (Fig. 2 and Table 1) does not satisfy one of the axioms of our classification theory: the weak supplementation principle (ATM5). It is easy to notice that the third level of the classification (Table 1) contains repeating classes. For example, the class "Granitics" can be found under the classes "Angular Mountains", "Rounded Mountains" and "Hills" in the class "Inactive Glacial Terrains". Given this, it is possible to interpret the classification in Fig. 2 as a product of two independent classification trees: a classification of terrains (terrain classes and physiographic classes in Table 1) and a classification of lithology and surface geology (geologic classes in Table 1). The hierarchies of classes that represent these two separate classifications are shown in Fig. 4(a) and Fig. 4(b) respectively. The product of these classifications is depicted in Table 2, with terrain classes as columns and geologic classes as rows. Each cell of the table contains the number of individuals that instantiate the corresponding classes of both hierarchies. The class hierarchies in Fig. 4 differ from the original hierarchy in Table 1 in two ways. The class "Volcanics", which violates the weak supplementation principle, was moved from the terrains hierarchy into the hierarchy of geologic classes (Fig. 4 and Table 2). This is a more natural place for this class because there is already a class with an identical name. Another problematic class is "Icefields". It is an atomic class that does not have any subclasses, and it does not represent any geologic class. To be able to accommodate the class "Icefields" in the product of classifications we have added a new class "Other" to the hierarchy of geologic classes (Fig. 4(b) and Table 2). By using a product of two classifications we have avoided the problem of having a class with a single proper subclass. The remaining part of this section describes how a product of two or more classifications can be formalized.
Terrain classes
  I. Active Glacial Terrains
    A. Icefields
    B. Recently deglaciated areas
    C. Mainland rivers
  II. Inactive Glacial Terrains
    A. Angular Mountains
    B. Rounded Mountains
    C. Hills
    D. Lowlands
  III. Post-glacial Terrains
    A. Volcanics
(a) Terrain classes (class "Volcanics" violates the weak supplementation principle)

Geologic classes
  Exposed Bedrock
  Unconsolidated Sediments
  Valleys
  Deltas
  Granitics
  Sedimentary, Carbonates
  . . .
  Other
  Volcanics
(b) Geologic classes (some classes are not shown; class "Volcanics" was added from the Terrain classes hierarchy)

Fig. 4. Classification trees
4.1 From Theory to Models
The theory presented in the previous section gives us a formal account of what we mean by a classification tree and by the notion of instantiation. In this section we consider set-theoretic structures that satisfy the axioms given above. This means that we interpret classes as sets in such a way that the instance-of relation between instances and classes is interpreted as the element-of relation between an element and the set it belongs to, and we interpret the is-a or subclass relation as the subset relation between sets. Sets satisfying our axioms are hierarchically ordered by the subset relation and can then be represented using directed graph structures in such a way that sets are nodes of the graph. Formally a graph is a pair T = (N, E), where N is a collection of nodes and E is a collection of edges. Let ni and nj be nodes in N corresponding to the sets i and j; then there is a directed edge e in E from ni to nj if and only if the set i is a subset of the set j. Since the sets we consider are assumed to satisfy the axioms given in the previous section, it follows that the directed graph structures constructed in this way are trees, and we call them classification trees. We are now interested in classification trees, operations between them, and the interpretation of those operations in our running example.
Table 2. The product of terrain and geology classifications. The columns are the terrain classes (grouped by glaciation phase and physiographic class); the rows are the geologic classes: Other; Exposed Bedrock; Unconsolidated Sediments; Valleys; Deltas; Granitics; Sedimentary, Noncarbonates; Sedimentary, Carbonates; Metasedimentary Complex; Sedimentary and Volcanics; Mafics, Ultramafics; Volcanics; Till Lowlands; Outwash Plains; Glaciomarine Terraces; Wave-cut Terraces. Each cell contains the number of individuals that instantiate the corresponding pair of classes.
In particular, we will use the notion of a classification tree in order to formalize the notion of a product of classifications discussed in the introduction of this section. Set theory also allows us to form higher-order sets, i.e., sets of sets. In what follows we will consider levels of granularity, which are sets of sets in the given interpretation. For example, the set of all leaves in a classification tree is a level of granularity. Since in our interpretation nodes in the tree structure correspond to sets, levels of granularity are sets of sets. Below we will introduce the notion of a cut in order to formalize the notion of a level of granularity. Notice that in the case of higher-order sets the element-of relation is not interpreted as an instance-of relation.
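As an illustration of this set-theoretic reading, the following Python sketch (hypothetical extensions, loosely modeled on the example) builds the directed graph T = (N, E) by drawing an edge from one class to another exactly when its extension is a subset of the other's, and then recovers the parent links of the tree.

```python
# Illustrative sketch: building T = (N, E) from a set-theoretic interpretation of classes
# (hypothetical class extensions, loosely modeled on the Southeast Alaska example).
sets = {
    "Terrain classes":            {"s1", "s2", "s3", "s4"},
    "Active Glacial Terrains":    {"s1"},
    "Inactive Glacial Terrains":  {"s2", "s3"},
    "Rounded Mountains":          {"s2"},
    "Hills":                      {"s3"},
    "Post-glacial Terrains":      {"s4"},
}

N = set(sets)
# Directed edge from n_i to n_j iff set i is a proper subset of set j.
E = {(i, j) for i in N for j in N if i != j and sets[i] < sets[j]}

# Immediate subclasses: edges not implied by transitivity, i.e. the tree's parent links.
def immediate(j):
    below = {i for (i, jj) in E if jj == j}
    return {i for i in below
            if not any((i, k) in E and (k, j) in E for k in below)}

for j in sorted(N):
    print(j, "->", sorted(immediate(j)))
```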
4.2 Cuts
Classification trees can be intersected at their cuts. To define a cut (δ), let us take a tree T constructed as described above and let N be the set of nodes in this tree.

Definition 1. A cut δ in the tree structure T is a subset of N defined inductively as follows [7, 8]: (1) {r} is a cut, where r is the root of the tree; (2) for any class z, let d(z) denote the set of immediate subclasses of z, and let C be a cut with z ∈ C and d(z) ≠ ∅; then (C − {z}) ∪ d(z) is a cut.

For example, the hierarchy of terrain classes (Fig. 4(a) without the class "Volcanics") has five different cuts. By Definition 1 the root class "Terrain classes" forms a cut. If we take the class "Terrain classes" as z and this cut as C, then z ∈ C. The immediate subclasses of the root class z are d(z) = {"I. Active Glacial Terrains", "II. Inactive Glacial Terrains", "III. Post-glacial Terrains"}. Then d(z) is the next cut because in this case (C − {z}) = ∅, and thus (C − {z}) ∪ d(z) leaves us with d(z). Repeated application of Definition 1 to the hierarchy of terrain classes (Fig. 4(a)) results in five cuts; examples are listed in Table 3.

Table 3. Examples of cuts in the hierarchy of terrain classes in Fig. 4(a)
1. the root class "Terrain classes"
2. "I. Active Glacial Terrains", "II. Inactive Glacial Terrains", "III. Post-glacial Terrains"
3. "A. Icefields", "B. Recently deglaciated areas", "C. Mainland rivers", "A. Angular Mountains", "B. Rounded Mountains", "C. Hills", "D. Lowlands", "III. Post-glacial Terrains"
Using Definition 1 and (ATM1–7) one can prove that the classes forming a cut are pairwise disjoint and that cuts enjoy a weak form of exhaustiveness, in the sense that every class node in N is either a subclass or a superclass of some class in the cut δ at hand [7].
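The following sketch (hypothetical code, not from the paper) enumerates all cuts of a small classification tree by repeatedly applying the expansion step of Definition 1; on the terrain-class hierarchy of Fig. 4(a) without "Volcanics" it yields the five cuts discussed above.

```python
# Enumerate all cuts of a classification tree per Definition 1 (illustrative sketch).
children = {
    "Terrain classes": ["I. Active Glacial Terrains",
                        "II. Inactive Glacial Terrains",
                        "III. Post-glacial Terrains"],
    "I. Active Glacial Terrains": ["A. Icefields", "B. Recently deglaciated areas",
                                   "C. Mainland rivers"],
    "II. Inactive Glacial Terrains": ["A. Angular Mountains", "B. Rounded Mountains",
                                      "C. Hills", "D. Lowlands"],
    "III. Post-glacial Terrains": [],   # "Volcanics" removed, as in the Fig. 4(a) discussion
}

def cuts(root):
    found = set()
    frontier = [frozenset({root})]          # rule (1): {root} is a cut
    while frontier:
        c = frontier.pop()
        if c in found:
            continue
        found.add(c)
        for z in c:                         # rule (2): expand any z with d(z) != empty
            d = children.get(z, [])
            if d:
                frontier.append((c - {z}) | frozenset(d))
    return found

all_cuts = cuts("Terrain classes")
print(len(all_cuts), "cuts")                # expected: 5
for c in sorted(all_cuts, key=len):
    print(sorted(c))
```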
4.3 Joining Classification Trees
Let δ1 and δ2 be cuts in two different classification trees T1 and T2. Cuts are sets composed of classes that satisfy the conditions of Definition 1. Let
δ1 = {u1, u2, . . . , un} and δ2 = {v1, v2, . . . , vm}. The cross product δ1 × δ2 of these cuts can be represented as the set of pairs that can be formed from classes in δ1 and δ2:

δ1 × δ2 = { (u1, v1) (u1, v2) · · · (u1, vm)
            (u2, v1) (u2, v2) · · · (u2, vm)
            . . .
            (un, v1) (un, v2) · · · (un, vm) }

The product of the two classification trees in Fig. 4 is depicted in Table 2. A product of two classification trees (or their cuts) will produce N = n × m pairs, where n and m are the numbers of classes in the respective cuts. In most cases N will be greater than the number of class pairs that are actually instantiated. In our example most of the cells of Table 2 are empty, indicating that there are no individuals that instantiate the corresponding pair of classes from the two classification trees. The reason is that certain higher-level classes do not exhibit as much diversity in the studied territory as others do. For example, the class "Post-glacial Terrains" is represented by only a single subclass "Volcanics", while the class "Inactive Glacial Terrains" contains a total of 20 subclasses (Table 1). To remove empty pairs of classes one has to normalize the product of the two classifications, i.e., one has to remove the pairs of classes that do not have instances. The normalized product δ1 ⊗ δ2 can be formally defined as the cross product of the levels of granularity restricted to those pairs whose sets have at least one element in common (D∗):

(D∗) (ui, vj) ∈ δ1 ⊗ δ2 ↔ (ui, vj) ∈ δ1 × δ2 ∧ (∃x)(x ∈ ui ∧ x ∈ vj)

In this case each individual instantiates several classes, each belonging to a different classification tree. In our example each individual instantiates one class from the classification tree of terrains and another class from the tree of geologic classes. For instance, the ecological subsection "Thorne Arm Granitics" instantiates the class "Hills" from the terrains classification tree and the class "Granitics" from the geologic classes.
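A minimal sketch of the cross product of two cuts and its normalized product follows; the instance data are hypothetical and only loosely based on the Southeast Alaska example (the class assignments of "Kook Lake Carbonates" and "Summer Strait Volcanics" are assumed here for illustration).

```python
# Illustrative cross product of two cuts and its normalized product (hypothetical data).
from itertools import product

terrain_cut  = ["Rounded Mountains", "Hills", "Post-glacial Terrains"]
geologic_cut = ["Granitics", "Sedimentary, Carbonates", "Volcanics", "Other"]

# individual -> (terrain class, geologic class) it instantiates
individuals = {
    "Thorne Arm Granitics":    ("Hills", "Granitics"),
    "Kook Lake Carbonates":    ("Rounded Mountains", "Sedimentary, Carbonates"),
    "Summer Strait Volcanics": ("Post-glacial Terrains", "Volcanics"),
}

cross = list(product(terrain_cut, geologic_cut))          # n x m pairs

# Normalized product: keep only pairs instantiated by at least one individual (D*).
normalized = [(u, v) for (u, v) in cross
              if any((tu, gv) == (u, v) for (tu, gv) in individuals.values())]

print(len(cross), "pairs in the cross product")           # 12
print(len(normalized), "pairs in the normalized product") # 3
print(normalized)
```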
5 Conclusions
In this paper we have presented an ontological framework to dissect and analyze geographic classifications. The framework is based on a strict distinction between the notion of a class and the notion of an individual. There are two types of relations in the framework: the subclass relation, defined between classes, and the instantiation relation, defined between individuals and classes. The subclass relation is reflexive, antisymmetric and transitive. Classes are organized into classifications that form finite trees (directed acyclic graphs). The latter is
achieved by requiring a classification to have a root class and by prohibiting loops and classes with a single proper subclass. We have demonstrated how a practical classification can be built using the outlined principles, taking the Southeast Alaska ecological subsection hierarchy as an example. A practical classification may require additional operations, such as taking a product of classifications and removing some classes due to redundancy. Even though most of the operations in our approach would seem obvious to geographers and ecologists, such operations have to be outlined explicitly if the goal of interoperating two datasets is to be achieved or if the information contained in classifications is to be communicated to non-experts in the area. The formalized theory presented above can be used to facilitate interoperability between geographic classifications. Interoperability between classifications can be achieved by creating a third classification capable of accommodating both of the original classifications. The formal theory of classes and individuals can be used to mark out the elements of classifications such as classification trees, cuts, products and normalized products. The original classifications have to be decomposed into these elements and then these elements have to be reassembled into a new and more general classification. In a hypothetical example, to interoperate the Southeast Alaska ecological subsections with a similar hierarchy for some other region, we must first perform the operations outlined above (mark out classification trees, cuts and their products) for both classifications. Then the trees from the different classifications would have to be combined. Territories with dissimilar geologic histories would possess different sets of classes, and more diverse territories will have a larger number of classes. For instance, let us assume that the second hierarchy in our example was developed for a territory only partly affected by glaciation. Glaciation-related classes from Southeast Alaska are likely to be usable in that territory too. However, classifications for that territory will contain many classes that would not fit into the class trees specific to Southeast Alaska. Those classes would have to be added to the resulting classification trees. Most of these would fall under the "Post-glacial Terrains" class of the Southeast Alaska hierarchy. The combined classification trees must satisfy axioms ATM1–7. Finally, a normalized product of the class trees will have to be created. This methodology still awaits testing in a practical context.
Acknowledgments
The authors would like to thank Dr. Gregory Nowaki from the USDA Forest Service for useful and detailed comments on our paper. Dr. Nowaki is one of the leading developers of the classification of Southeast Alaska ecological subsections [3]. Support for the second author from the Wolfgang Paul Program of the Alexander von Humboldt Foundation and from National Science
Foundation Research Grant BCS–9975557: Geographic Categories: An Ontological Investigation, is gratefully acknowledged.
References
[1] E. Lynn Usery. Category theory and the structure of features in geographic information systems. Cartography and Geographic Information Systems, 20(1):5–12, 1993.
[2] David T. Cleland, Peter E. Avers, W. Henry McNab, Mark E. Jensen, Robert G. Bailey, Thomas King, and Walter E. Russell. National hierarchical framework of ecological units. In Mark S. Boyce and Alan W. Haney, editors, Ecosystem Management Applications for Sustainable Forest and Wildlife Resources, pages 181–200. Yale University Press, New Haven, CT, 1997.
[3] Gregory Nowaki, Michael Shephard, Patricia Krosse, William Pawuk, Gary Fisher, James Baichtal, David Brew, Everett Kissinger, and Terry Brock. Ecological subsections of southeastern Alaska and neighboring areas of Canada. Technical Report R10-TP-75, USDA Forest Service, Alaska Region, October 2001.
[4] Barry Smith, Jacob Köhler, and Anand Kumar. On the application of formal principles to life science data: A case study in the Gene Ontology. In E. Rahm, editor, Database Integration in the Life Sciences (DILS 2004). Springer, Berlin, 2004. Forthcoming.
[5] Harold P. Cook and Hugh Tredennick. Aristotle: The Categories, On Interpretation, Prior Analytics. Harvard University Press, Cambridge, Massachusetts, 1938.
[6] Peter Simons. Parts, A Study in Ontology. Clarendon Press, Oxford, 1987.
[7] P. Rigaux and M. Scholl. Multi-scale partitions: Application to spatial and statistical databases. In M. Egenhofer and J. Herring, editors, Advances in Spatial Databases (SSD'95), number 951 in Lecture Notes in Computer Science, pages 170–184. Springer-Verlag, Berlin, 1995.
[8] T. Bittner and J.G. Stell. Stratified rough sets and vagueness. In W. Kuhn, M. Worboys, and S. Timpf, editors, Spatial Information Theory: Cognitive and Computational Foundations of Geographic Information Science. International Conference COSIT'03, pages 286–303. Springer, 2003.
Comparing and Combining Different Expert Relations of How Land Cover Ontologies Relate

Alexis Comber1, Peter Fisher1, and Richard Wadsworth2

1 Department of Geography, University of Leicester, Leicester, LE1 7RH, UK, Tel: +44 (0)116 252 3859, Fax: +44 (0)116 252 3854, email: [email protected]
2 Centre for Ecology and Hydrology, Monks Wood, Abbots Ripton, Huntingdon, Cambridgeshire, PE28 2LS, UK
Abstract
Expressions of expert opinion are being used to relate ontologically diverse data and to identify logical inconsistency between them. Relations constructed under different scenarios, from different experts, and with evidence combined in different ways identify different subsets of inconsistency, the reliability of which can be parameterised by field validation. It is difficult to identify one combination as being objectively "better" than another. The selection of specific experts and scenarios depends on user perspectives.
1 Introduction
Geographic data are necessarily a simplification of the real world. Decisions about what to record and what to omit result in differences between datasets even when they purport to record the same features. Some of these differences may be rooted in local practice, in institutions, in the technology used to record and measure the processes of interest, or in the (policy-related) objective of the study (Comber et al. 2003a). The net result is that objects contained in one dataset may not correspond to similarly named objects in another. Specifying which objects to measure, and how to measure them, is to describe an ontology, or a specification of a conceptualisation (Guarino 1995). In the case of a remotely sensed land cover dataset an ontology is more than a list of the class descriptions: as well as describing the botanical and ecological properties of the classes, it describes the epistemological aspects of data collection, correction and processing. The research reported here considers how different expert descriptions of the way that data relate may be combined and used to identify change and inconsistency
between them. Previous work has shown that using the opinions of a single expert it is possible to identify change in the land cover mappings of the UK (Comber et al. in press a). Different experts have different opinions, however, and the aim of the current work is to examine how evidence from multiple experts may be combined in order to more robustly identify change and inconsistency. In this paper we describe a method that addresses the issues of data conceptualisation, meaning and semantics by using a series of expert expressions of how the categories in different datasets relate. We asked three experts to describe the relationships between two mappings of UK land covers in 1990 and 2000 (introduced in the next section) using a table of pair-wise relations under three different scenarios. The experts can be characterised as a data User, a data Producer and a data Supplier. In the scenarios the experts described the semantics, common technical confusions and possible transitions of land covers, in an attempt to account for the differences in meaning and conceptualisation between the two datasets.
2 Background
2.1 Land cover mapping in the UK
The 1990 Land Cover Map of Great Britain (LCMGB) is a raster dataset that records 25 Target land cover classes classified from composite winter and summer satellite imagery using a per-pixel supervised maximum likelihood classification (Fuller et al. 1994). The classification was determined by scientists on the basis of what they believed they could reliably identify (Comber et al. 2003a). The UK Land Cover Map 2000 (LCM2000) upgrades and updates the LCMGB (Fuller et al. 2002). It describes the landscape in terms of broad habitats as a result of post-Rio UK and EU legislation and uses a parcel structure as the reporting framework. It includes extensive parcel-based meta-data, including processing history and the spectral heterogeneity attribute "PerPixList" (Smith and Fuller 2001). LCM2000 was released with a "health warning" against comparing it to LCMGB because of the thematic and structural differences.

2.2 Data integration
The integration of spatial data is an objective common to many research themes. The fundamental problem of integration is that the features of one dataset may not match those recorded in another. Intersecting the data provides measures of correspondence between the elements of each dataset which can be used to model the uncertainty between two datasets, for instance as membership functions using fuzzy sets or as probability distributions for Bayesian approaches. Such approaches allow multiple realisations of combined data to be modelled using the
general case provided by the correspondence matrix and, because there is no spatial component to the correspondences, they are assumed to be evenly distributed. This means that every object (pixel or parcel) of the same class is treated in the same way. For the typical case of the data object this is unproblematic. If one is interested in identifying atypical cases – for instance in order to identify change or inconsistency between different datasets – a more subtle approach is required. To this end a series of approaches originating in computing science have been suggested to address the data integration problem – interoperability (Bishr 1998; Harvey et al. 1999), formal ontologies (Frank 2001; Pundt and Bishr 2002; Visser et al. 2002) and standardisation (Herring 1999; OGC 2003) – all of which identify the need to overcome differences in the semantics of the objects described in different data.

2.3 Previous work
Comber et al. (2003b; 2003c; in press a) have proposed and developed an approach based on expert expressions of how the semantics of the elements (classes) of two datasets relate to each other, using land cover mapping in the UK as an example. An expert familiar with both datasets was asked to describe the pair-wise semantic relations between LCMGB and LCM2000 classes in a Look Up Table (LUT). The approach involved the following stages:
1. The area covered by each parcel was characterised in 1990:
   - For each parcel the number and type of intersecting LCMGB pixels was determined.
   - The distribution of LCMGB classes was interpreted via the Expert LUT. The descriptions were in terms of whether each LCM2000-LCMGB class pair is "Unexpected", "Expected" or "Uncertain" (U, E, Q). A triple for each parcel was generated by placing each intersecting pixel into one of these descriptions.
   - The U, E, Q scores were normalised by parcel area.
2. The parcel was characterised in 2000:
   - For each parcel the number and type of spectral subclasses in the spectral heterogeneity attribute were extracted.
   - These were interpreted via a Spectral LUT. This represented knowledge of expected, unexpected or possible spectral overlap relative to the parcel broad habitat class. This information accompanied the release of the LCM2000 data. These descriptions were used to generate a (U, E, Q) triple based on the percentage of each spectral subclass within each LCM2000 parcel.
3. Changes in U, E and Q were calculated for each parcel (ΔU, ΔE and ΔQ) and normalised to a standard distribution function for each class.
4. The normalised ΔU and ΔE provided beliefs for a hypothesis of change, combined using Dempster-Shafer, which was given further support from ΔQ.
5. A sample of parcels was surveyed in the field and assessments were made of whether the land cover matched LCM2000 and whether it had changed since 1990.

The headline result was that inconsistency between LCMGB and LCM2000 was identified in 100% of the parcels, with 41% of it attributable to change and 59% to data error (Comber et al., in press b). The aim of the current work was to explore the effects of different types of expert evidence and of different experts.
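A rough sketch of the parcel characterisation in steps 1–3 above follows; the class names, LUT entries and the assumed 2000 triple are hypothetical, not the authors' actual tables.

```python
# Illustrative (U, E, Q) characterisation of one parcel (hypothetical LUT and pixels).
# expert_lut[(lcm2000_class, lcmgb_class)] -> 'E' (expected), 'U' (unexpected), 'Q' (uncertain)
expert_lut = {
    ("Broadleaf woodland", "Deciduous woodland"): "E",
    ("Broadleaf woodland", "Coniferous woodland"): "Q",
    ("Broadleaf woodland", "Arable"): "U",
}

def ueq_triple(parcel_class, lcmgb_pixels):
    """Area-normalised (U, E, Q) scores from the intersecting LCMGB pixels."""
    counts = {"U": 0, "E": 0, "Q": 0}
    for pixel_class in lcmgb_pixels:
        # pairs not listed in the LUT are treated as uncertain in this sketch
        counts[expert_lut.get((parcel_class, pixel_class), "Q")] += 1
    n = len(lcmgb_pixels)
    return {k: v / n for k, v in counts.items()}

pixels_1990 = ["Deciduous woodland"] * 14 + ["Coniferous woodland"] * 3 + ["Arable"] * 3
triple_1990 = ueq_triple("Broadleaf woodland", pixels_1990)

# The 2000 triple would come from the Spectral LUT; a value is assumed here
# simply to show the change scores of step 3.
triple_2000 = {"U": 0.40, "E": 0.45, "Q": 0.15}
delta = {k: triple_2000[k] - triple_1990[k] for k in ("U", "E", "Q")}

print(triple_1990)   # {'U': 0.15, 'E': 0.7, 'Q': 0.15}
print(delta)
```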
3. Methodology
Three experts – a User, a Producer and a Supplier – completed tables of relations between LCMGB and LCM2000 under three scenarios:
- Semantic: based on their understanding of the links between the semantics of the two datasets;
- Change: describing the transitions between land cover classes;
- Technical: describing relations based on knowledge of where confusions may occur (e.g. spectral confusions between classes).
Each scenario from each expert was substituted in turn as the Expert LUT in the outline described above (Section 2.3). The beliefs from the different scenarios were combined in two ways for each expert: first, using Dempster-Shafer with a normalisation term to produce Aggregate combinations of belief; second, by adding the beliefs together to produce Cumulative combinations of belief. For each parcel 22 beliefs were calculated:
- Semantic, from the semantic Expert LUT (S);
- Technical, from the technical Expert LUT (T);
- Change, from the change Expert LUT (C);
- Semantic and Technical LUTs combined cumulatively (ST+);
- Semantic and Technical LUTs combined by aggregation (ST*);
- all three LUTs combined cumulatively (STC+);
- all three LUTs combined by aggregation (STC*).
This was done for each expert, and an overall belief (All) was calculated by combining the evidence from all experts and scenarios using Dempster-Shafer. The beliefs for the visited parcels were extracted from the data and thresholds were applied to explore how well the change and no-change parcels were partitioned by different combinations of expert relations.
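The threshold-based scoring of these beliefs against the field survey, reported in the next section as errors of omission and commission, can be sketched as follows; the beliefs and survey outcomes below are hypothetical, not the study data, and omission/commission are computed here as fractions of the changed and unchanged subsets respectively.

```python
# Illustrative omission/commission errors for a belief threshold (hypothetical data).
parcels = [  # (combined belief in change, changed according to field survey?)
    (1.00, True), (0.95, True), (0.60, True), (0.98, True),
    (1.00, False), (0.92, False), (0.40, False), (0.10, False),
]

def errors(parcels, threshold):
    changed   = [b for b, ch in parcels if ch]
    unchanged = [b for b, ch in parcels if not ch]
    omission   = sum(b < threshold for b in changed) / len(changed)
    commission = sum(b >= threshold for b in unchanged) / len(unchanged)
    overall    = sum((b >= threshold) == ch for b, ch in parcels) / len(parcels)
    return {"omission": omission, "commission": commission, "overall": overall}

for t in (1.0, 0.9):
    print("T =", t, errors(parcels, t))
```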
4. Results
4.1 Belief thresholds
Table 1 shows how the evidence from different experts partitions the visited parcels using thresholds (T) of 1 and of 0.9. The error of omission is the proportion of parcels with a combined belief below the threshold but found to have changed. The error of commission is the proportion of parcels with a belief greater than the threshold but found not to have changed. The overall figure is the proportion of all parcels from both subsets correctly partitioned by the threshold. In all cases where T = 1 the overall reliability is higher than for T > 0.9, due to lower errors of commission. However, more parcels that have changed are missed at the higher threshold. The lowest errors of omission are for the aggregate combinations, but these also have the highest errors of commission. Thus the overall reliability for cumulative (additive) combinations is higher than for aggregate (multiplicative) combinations, due to their lower errors of commission.

4.2 Experts
The results in Table 1 are described below in terms of each individual expert and the levels of the different types of errors caused by the selection of this threshold (T = 1). The Supplier partitioned the field data with the fewest errors of omission using the Semantic relations (S) for the single scenarios and the aggregated combination of the Semantic, Technical and Change relations (STC*). The fewest errors of commission were from the Change relations (C) and the additive Semantic, Technical and Change relations (STC+). The Producer partitioned the field data with the fewest errors of omission using the Semantic relations (S) for the single scenarios and either the aggregated combination of the Semantic, Technical and Change relations (STC*) or the Semantic and Technical relations (ST*). The fewest errors of commission were from the Change relations (C) and the aggregated Semantic, Technical and Change relations (STC*). The User partitioned the field data with the fewest errors of omission using the Semantic relations (S) for the single scenarios and the aggregated combination of the Semantic, Technical and Change relations (STC*). The fewest errors of commission were from the Technical relations (T) and the additive Semantic, Technical and Change relations (STC+).

4.3 Methods of combination
The two methods of combining belief, Cumulative and Aggregate, do so with different results. For example:
S = 0.946, T = 0.425, C = 0.953
Aggregate: ST* = 0.928, STC* = 0.996
Cumulative: ST+ = 0.685, STC+ = 0.775

The cumulative method pushes the distribution of beliefs away from the extremes. This is as expected if one considers that all the terms (scenarios) must have high beliefs for the cumulative belief to be high. For example, in the case of the Supplier the proportion of 'Change' parcels correctly partitioned falls as each scenario term is added:

S → ST+ → STC+: 0.65, 0.58, 0.53

However, the number of false positives (i.e. the proportion of 'No change' parcels erroneously partitioned) decreases correspondingly:

S → ST+ → STC+: 0.33, 0.33, 0.28

The aggregate (Dempster-Shafer) method pushes the distribution of beliefs away from the median towards the extremes. For each expert we see that the proportion of 'Change' parcels correctly partitioned increases as each scenario term is incorporated in the aggregate. For example, in the case of the Supplier:

S → ST* → STC*: 0.65, 0.70, 0.72

However, the number of false positives (i.e. the proportion of 'No change' parcels erroneously partitioned) increases correspondingly:

S → ST* → STC*: 0.33, 0.41, 0.44

Thus the All combined belief generated in this way from all the experts and scenarios partitions the 'Change' parcels most reliably (0.79), but it includes 50% of the 'No change' parcels in the partition.
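The two combination rules can be reproduced numerically under the assumption (ours, not stated explicitly in the paper) that each scenario belief b is treated as a mass assignment {change: b, no change: 1 - b} on a binary frame and that the cumulative (additive) combination is read as an average; with these assumptions the worked figures above are closely reproduced.

```python
# Aggregate (Dempster-Shafer with normalisation) vs cumulative (additive) combination.
# Assumes each belief b is a mass assignment {change: b, no_change: 1 - b}.

def dempster(b1, b2):
    """Dempster's rule of combination on the binary frame {change, no change}."""
    agree_change = b1 * b2
    conflict = b1 * (1 - b2) + (1 - b1) * b2
    return agree_change / (1 - conflict)      # normalised mass on 'change'

def aggregate(beliefs):
    out = beliefs[0]
    for b in beliefs[1:]:
        out = dempster(out, b)
    return out

def cumulative(beliefs):                      # additive combination read as an average
    return sum(beliefs) / len(beliefs)

S, T, C = 0.946, 0.425, 0.953
print(f"ST*  = {aggregate([S, T]):.3f}")      # reported in the text as 0.928
print(f"STC* = {aggregate([S, T, C]):.3f}")   # reported as 0.996
print(f"ST+  = {cumulative([S, T]):.3f}")     # reported as 0.685
print(f"STC+ = {cumulative([S, T, C]):.3f}")  # reported as 0.775
```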
5 Discussion and Conclusions
This work has explored the differences between multiple expert expressions of relations between datasets. Earlier work has shown that such approaches reliably identify inconsistency between two datasets (Comber et al. in press b). Various factors have been shown to influence the parcels that are partitioned: different experts, different scenarios, threshold selection and different methods for combining the beliefs. These generate different sets of change parcels and necessarily different error terms. The extent to which one set of results from the interaction of these factors is preferred to another centres on the acceptability of different types of error. The preference will depend on the question being asked of the data and by whom, which will determine the acceptability of different levels of error. This point is illustrated in Figures 1, 2 and 3, which show the different woodland parcels identified as having changed for an area of Sherwood Forest. The differences can be explained in terms of:
- different scenarios (Figure 1);
- different thresholds (Figure 1);
- different combination methods (Figure 2);
- different experts and the sequence of scenarios (Figure 3).

In this work LCMGB and LCM2000 have been analysed for change. They are separated by a 10-year interval and therefore the set of inconsistent parcels will contain change parcels. For datasets with a smaller interval separating them, the hypothesis of change may be less appropriate; however, this method would still identify inconsistency between the datasets. This is of interest to many areas of geographic research that are concerned with shifting paradigms in the way that geographic phenomena such as land cover are measured and recorded. In such cases, this method identifies where one dataset is considered to be inconsistent relative to the reporting paradigms of the other.
Fig. 1. Woodland parcels identified as having changed from different scenarios (Semantic, Technical, Change) using different thresholds (combined belief > 0.9; combined belief = 1) and evidence from the Producer expert
Fig. 2. Woodland change parcels identified from the two ways of combining beliefs, cumulative (+) and aggregate (*) (T > 0.9)
The problem of detecting change between thematic (land cover) maps was described by Fuller et al. (2003) with reference to the accuracy of LCMGB and LCM2000. This work has applied the recommendation made by Fuller et al. (2003), namely that change detection using LCMGB and LCM2000 may be possible using the parcel structure of LCM2000 to interrogate the LCMGB raster data. This has provided local (LCM2000 parcel) descriptions of LCMGB distributions, allowed the previous cover dominance for each LCM2000 land parcel to be determined and, when interpreted through the lens of the Expert LUT, identified change and inconsistency on a per-parcel basis. A complementary description of LCM2000 parcel heterogeneity was provided by one of the LCM2000 meta-data attributes, "PerPixList", and this in turn was interpreted in qualitative terms using the information on expected spectral overlap provided by the data providers.
Fig. 3. Parcels identified by different experts combining the beliefs cumulatively from the different scenarios, where T > 0.9
Whilst field validation of change provided error terms and confidences in the different combinations of experts, scenarios, thresholds and combination methods, it is difficult to say in any absolute sense that one combination of these factors is objectively better than another. Rather, the user must decide what aspects of the question they wish to ask of the data are the most important. It is hoped that techniques such as this are of interest to users in any instance where two datasets with non-matching semantics are used (possibly but not necessarily temporally distinct). Indeed, it should be noted that where the datasets are synchronous or the data is not expected to have changed, the method provides an approach to data verification and checking semantic consistency.
Acknowledgements
This paper describes work done within the REVIGIS project funded by the European Commission, Project Number IST-1999-14189. We would like to thank our collaborators, especially Andrew Frank, Robert Jeansoulin, Geoff Smith, Alfred Stein, Nic Wilson, Mike Worboys and Barry Wyatt.
References
Bishr, Y., 1998. Overcoming the semantic and other barriers to GIS interoperability. International Journal of Geographical Information Science, 12(4), 299-314.
Comber, A., Fisher, P. and Wadsworth, R. (in press a). Integrating land cover data with different ontologies: identifying change from inconsistency. International Journal of Geographic Information Science.
Comber, A.J., Fisher, P.F. and Wadsworth, R.A. (in press b). Assessment of a Semantic Statistical Approach to Detecting Land Cover Change Using Inconsistent Data Sets. Photogrammetric Engineering and Remote Sensing.
Comber, A., Fisher, P. and Wadsworth, R., 2003a. Actor Network Theory: a suitable framework to understand how land cover mapping projects develop? Land Use Policy, 20, 299-309.
Comber, A.J., Fisher, P.F. and Wadsworth, R.A., 2003b. A semantic statistical approach for identifying change from ontologically diverse land cover data. In AGILE 2003, 5th AGILE Conference on Geographic Information Science, edited by Michael Gould, Robert Laurini and Stephane Coulondre (Lausanne: PPUR), pp. 123-131.
Comber, A.J., Fisher, P.F. and Wadsworth, R.A., 2003c. Identifying Land Cover Change Using a Semantic Statistical Approach: First Results. In CD Proceedings of the 7th International Conference on GeoComputation, 8th-10th September 2003 (University of Southampton).
Frank, A.U., 2001. Tiers of ontology and consistency constraints in geographical information systems. International Journal of Geographical Information Science, 15(7), 667-678.
Fuller, R.M., Smith, G.M. and Devereux, B.J., 2003. The characterisation and measurement of land cover change through remote sensing: problems in operational applications? International Journal of Applied Earth Observation and Geoinformation, 4, 243-253.
Fuller, R.M., Groom, G.B. and Jones, A.R., 1994. The Land Cover Map of Great Britain: an automated classification of Landsat Thematic Mapper data. Photogrammetric Engineering and Remote Sensing, 60, 553-562.
Fuller, R.M., Smith, G.M., Sanderson, J.M., Hill, R.A. and Thomson, A.G., 2002. Land Cover Map 2000: construction of a parcel-based vector map from satellite images. Cartographic Journal, 39, 15-25.
Guarino, N., 1995. Formal ontology, conceptual analysis and knowledge representation. International Journal of Human-Computer Studies, 43, 625-640.
Harvey, F., Kuhn, W., Pundt, H., Bishr, Y. and Riedemann, C., 1999. Semantic interoperability: A central issue for sharing geographic information. Annals of Regional Science, 33(2), 213-232.
Herring, J.R., 1999. The OpenGIS data model. Photogrammetric Engineering and Remote Sensing, 65(5), 585-588.
OGC, 2003. OpenGIS Consortium. http://www.opengis.org/ (last date accessed: 10 June 2003).
Pundt, H. and Bishr, Y., 2002. Domain ontologies for data sharing: an example from environmental monitoring using field GIS. Computers and Geosciences, 28(1), 95-102.
Smith, G.M. and Fuller, R.M., 2002. Land Cover Map 2000 and meta-data at the land parcel level. In Uncertainty in Remote Sensing and GIS, edited by G.M. Foody and P.M. Atkinson (London: John Wiley and Sons), pp. 143-153.
Visser, U., Stuckenschmidt, H., Schuster, G. and Vogele, T., 2002. Ontologies for geographic information processing. Computers and Geosciences, 28, 103-117.
Representing, Manipulating and Reasoning with Geographic Semantics within a Knowledge Framework

James O'Brien and Mark Gahegan

GeoVISTA Center, Department of Geography, The Pennsylvania State University, University Park, PA 16802, USA. Ph: +1-814-865-2612; Fax: +1-814-863-7643; Email: [email protected], [email protected]
Abstract
This paper describes a programmatic framework for representing, manipulating and reasoning with geographic semantics. The framework enables automating tool selection for user-defined geographic problem solving and evaluating semantic change in knowledge discovery environments. The uses, inputs, outputs, and semantic changes of methods, data, and human experts (our resources) are described using ontologies. These ontological descriptions are manipulated by an expert system to select resources to solve a user-defined problem. A semantic description of the problem is compared to the services that each entity can provide to construct a graph of potential solutions. An optimal (least-cost) solution is extracted from these solutions and displayed in real time. The semantic change(s) resulting from the interaction of resources within the optimal solution are determined via expressions of transformation semantics represented within the Java Expert System Shell. This description represents the formation history of each new information product (e.g. a map or overlay) and can be stored, indexed and searched as required. Examples are presented to show (1) the construction and visualization of information products, (2) the reasoning capabilities of the system to find alternative ways to produce information products from a set of data, methods and expertise, given certain constraints, and (3) the representation of the ensuing semantic changes by which an information product is synthesized.
1 Introduction The importance of semantics in geographic information is well documented (Bishr, 1998; Egenhofer, 2002; Fabrikant and Buttenfield, 2001; Kuhn, 2002). Semantics are a key component of interoperability between GIS; there are now robust technical solutions to interoperate geographic information in a syntactic and schematic sense (e.g. OGC, NSDI) but these fail to take account of any sense of meaning associated with the information. Visser et al., (2002) describe how exchanging data between systems often fails due to confusion in the meaning of concepts. Such confusion, or semantic heterogeneity, significantly hinders collaboration if groups cannot agree on a common lexicon for core concepts. Semantic heterogeneity is also blamed for the inefficient exchange of geographic concepts and information between groups of people with differing ontologies (Kokla and Kavouras, 2002). Semantic issues pervade the creation, use and re-purposing of geographic information. In an information economy we can identify the roles of information producer and information consumer, and in some cases, in national mapping agencies for example, datasets are often constructed incrementally by different groups of people (Gahegan, 1999) with an implicit (but not necessarily recorded) goal. The overall meaning of the resulting information products are not always obvious to those outside that group, existing for the most part in the creators’ mental model. When solving a problem, a user may gather geospatial information from a variety of sources without ever encountering an explicit statement about what the data mean, or what they are (and are not) useful for. Without capturing the semantics of the data throughout the process of creation, the data may be misunderstood, be used inappropriately, or not used at all when they could be. The consideration of geospatial semantics needs to explicitly cater for the particular way in which geospatial tasks are undertaken (Egenhofer, 2002). As a result, the underlying assumptions about methods used with data, and the roles played by human expertise need to be represented in some fashion so that a meaningful association can be made between appropriate methods, people and data to solve a problem. It is not the role of this paper to present a definitive taxonomy of geographic operations or their semantics. To do so would trivialize the difficulties of defining geographic semantics.
2 Background and Aims This paper presents a programmatic framework for representing, manipulating and reasoning with geographic semantics. In general semantics refers to the study of the relations between symbols and what they represent (Hakimpour and Timpf, 2002). In the framework outlined in this paper, semantics have two valuable and specific roles. Firstly, to determine the most appropriate resources (method, data or human expert) to use in concert to solve a geographic problem, and secondly to act as a measure of change in meaning when data are operated on by methods and human experts. Both of these roles are discussed in detail in Section 3. The framework draws on a number of different research fields, specifically: geographical semantics (Gahegan, 1999 and Kuhn, 2002), ontologies (Guarino, 1998) computational semantics (Sowa, 2000), constraint-based reasoning and expert systems (Honda and Mizoguchi, 1995) and visualization (MacEachren, in press) to represent aspects of these resources. The framework sets out to solve a multi-layered problem of visualizing knowledge discovery, automating tool selection for user defined geographic problem solving and evaluating semantic change in knowledge discovery environments. The end goal of the framework is to associate with geospatial information products the details of their formation history and tools by which to browse, query and ultimately understand this formation history, thereby building a better understanding of meaning and appropriate use of the information. The goal of the framework is not to develop new theories about ontologies or semantics, but instead to use ontologies to specify a problem and to govern the interaction between data, methods and human experts to solve that problem. The problem of semantic heterogeneity arises due to the varying interpretations given to the terms used to describe facts and concepts. Semantic heterogeneity exists in two forms, cognitive and naming (Bishr, 1998). Cognitive semantic heterogeneity results from no common base of definitions between two (or more) groups. Defining such points of agreement amounts to constructing a shared ontology, or at the very least, points of overlap (Pundt and Bishr, 2002). Naming semantic heterogeneity occurs when the same name is used for different concepts or different names are used for the same concept. It is not possible to undertake any semantic analysis until problems of semantic heterogeneity are resolved. Ontologies, described below, are widely recommended as a means of rectifying semantic heterogeneity (Hakimpour and Timpf, 2002; Kokla and Kavouras, 2002; Kuhn, 2002; Pundt and Bishr, 2002; Visser et al., 2002). The framework presented in this paper utilizes that work and other ontological re-
Fig. 1. Interrelationships between different types of ontology.
search (Brodaric and Gahegan, 2002; Chandrasekaran et al., 1997; Fonseca, 2001; Fonseca and Egenhofer, 1999; Fonseca et al., 2000; Guarino, 1997a; Guarino, 1997b; Mark et al., 2002) to solve the problem of semantic heterogeneity. The use of an expert system for automated reasoning fits well with the logical semantics utilized within the framework. The Java Expert System Shell (JESS) is used to express diverse semantic aspects of methods, data, and human experts. JESS performs string comparisons of resource attributes (parsed from ontologies) using backward chaining to determine interconnections between resources. Backward chaining is a goal-driven problem-solving methodology, starting from the set of possible solutions and attempting to derive the problem. If the conditions for a rule to be satisfied are not found within that rule, the engine searches for other rules that have the unsatisfied condition as their conclusion, establishing dependencies between rules.
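To illustrate the style of reasoning described above, the following is a hypothetical Python sketch (not the authors' JESS rule base): a backward chainer is given a goal information product and a set of resource descriptions, each stating its inputs and output, and recursively searches for resources whose outputs satisfy the unsatisfied inputs, building a chain from available data to the requested product.

```python
# Hypothetical sketch of goal-driven (backward-chaining) resource selection.
# Each resource (method) is described by the inputs it needs and the output it provides.
resources = {
    "classify_imagery": {"inputs": ["satellite_imagery", "training_data"],
                         "output": "land_cover_map"},
    "overlay":          {"inputs": ["land_cover_map", "soil_map"],
                         "output": "suitability_map"},
    "digitise_soil":    {"inputs": ["paper_soil_survey"], "output": "soil_map"},
}
available_data = {"satellite_imagery", "training_data", "paper_soil_survey"}

def derive(goal, seen=frozenset()):
    """Return a plan (ordered list of resource names) that produces `goal`, or None."""
    if goal in available_data:
        return []
    for name, r in resources.items():
        if r["output"] == goal and name not in seen:
            plan = []
            for needed in r["inputs"]:            # back-chain on unsatisfied inputs
                sub = derive(needed, seen | {name})
                if sub is None:
                    break
                plan += sub
            else:
                return plan + [name]
    return None

print(derive("suitability_map"))
# e.g. ['classify_imagery', 'digitise_soil', 'overlay']
```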
2.1 Ontology
In philosophy, Ontology is the "study of the kinds of things that exist" (Chandrasekaran et al., 1997; Guarino, 1997b). In the artificial intelligence community, ontology has one of two meanings: as a representation vocabulary, typically specialized to some domain or subject matter, and as a body of knowledge describing some domain using such a representation vocabulary (Chandrasekaran et al., 1997). The goal of sharing knowledge can be accomplished by encoding domain knowledge using a standard vocabulary based on an ontology (Chandrasekaran et al., 1997; Kokla and Kavouras, 2002; Pundt and Bishr, 2002). The framework described here utilizes both definitions of ontology. The representation vocabulary embodies the conceptualizations that the terms in the vocabulary are intended to capture. Relationships described between conceptual elements in this ontology allow for the production of rules governing how these elements can be "connected" to solve a geographic problem. In our case these elements are methods, data and human experts, each with their own ontology. In the case of datasets, a domain ontology describes salient properties such as location, scale, date, format, etc., as currently captured in meta-data descriptions (see figure 1). In the case of methods, a domain ontology describes the services a method provides in terms of a transformation from one semantic state to another. In the case of human experts the simplest representation is again a domain ontology that shows the contribution that a human can provide in terms of steering or configuring methods and data. However, it should also be possible to represent a two-way flow of knowledge as the human learns from situations and thereby expands the number of services they can provide (we leave this issue for future work). The synthesis of a specific information product is specified via a task ontology that must fuse together elements of the domain and application ontologies to attain its goal. An innovation of this framework is the dynamic construction of the solution network, analogous to the application ontology. In order for resources to be useful in solving a problem, their ontologies must also overlap. Ontology is a useful metaphor for describing the genesis of the information product.

Fig. 2. The creation of an information product relies on data, methods, and human experts

A body of knowledge described using the domain ontology is utilized
in the initial phase of setting up the expert system. A task ontology is created at the conclusion of the automated process, specifically defining the concepts that are available. An information product is derived from data extracted from databases, methods, and knowledge from human experts, as shown in figure 2. By forming a higher-level ontology which describes the relationships between each of these resources it is possible to describe appropriate interactions. Fonseca et al. (2001) highlight the complexity of determining a solution and selecting appropriate methods and data while attempting to define a relationship between climate and vector-borne diseases (e.g. West Nile virus). The authors outline a problem where a relationship exists between climate and infectious diseases, as the disease agents (viruses, bacteria, etc.) and disease vectors (ticks, mosquitoes, rodents) are sensitive to temperature, moisture and other climatic variables. While climate affects the likely presence or abundance of an agent or vector, health outcomes also depend on other criteria: age demographics and other stressors such as air and water quality can determine risk, while land use and land cover are significant controls on ecosystem character and function, and
Fig. 3. A concrete example of the interaction of methods, data and human experts to produce an information product.
hence any disease dynamics associated with insects, rodents, deer, etc. (Fonseca et al., 2001). The lack of an ability to assess the response of the system to multiple factors limits our ability to predict and mitigate adverse health outcomes. Such a complex problem (figure 3) is obviously more difficult to constrain: a larger number of data sources, methods and human expertise is required, potentially with changes between data models, data scales, and levels of abstraction, and across multiple domains (climate, climate change, effects of environmental variation on flora, fauna, landuse/landcover and soils, epidemiology, and human health-environment interactions), potentially from fields of study with different conceptualizations of concepts. One interesting feature demonstrated in this example is the ability of human experts to gain experience through repeated exposure to similar situations. In this example a basic semantic structure is being constructed and a lineage of the data can be determined.

2.2 Semantics

While the construction of the information product is important, a semantic layer sits above the operations and information (figure 4). The geospatial knowledge obtained during the creation of the product is captured within this layer.
Fig. 4. Interaction of the semantic layer and operational layer.
The capture of this semantic information describes the transformations that the geospatial information undergoes, facilitating better understanding and providing a measure of repeatability of analysis, and
improving communication in the hope of promoting best practice in bringing geospatial information to bear. Egenhofer (2002) notes that the challenge remains of how best to make these semantics available to the user via a search interface. Pundt and Bishr, (2002) outline a process in which a user searches for data to solve a problem. This search methodology is also applicable for the methods and human experts to be used with the data. This solution fails when multiple sources are available and nothing is known of their content, structure and semantics. The use of pre-defined ontologies aids users by reducing the available search space (Pundt and Bishr, 2002). Ontological concepts relevant to a problem domain are supplied to the user allowing them to focus their query. A more advanced interface would take the user’s query in their own terms and map that to an underlying domain ontology (Bishr, 1998). As previously noted, the meaning of geospatial information is constructed, shaped and changed by the interaction of people and systems. Subsequently the interaction of human experts, methods and data needs to be carefully planned. A product created as a result of these interactions is dependent on the ontology of the data and methods and the epistemologies and ontologies of the human experts. In light of this, the knowledge framework outlined below focuses on each of the resources involved (data, methods and human experts) and the roles they play in the evolution of a new information product. In addition, the user’s goal that produced the product, and any constraints placed on the process are recorded to capture aspects of intention and situation that also have an impact on meaning. This process and the impact of constraint based searches are discussed in more detail in the following section.
3 Knowledge framework

The problem described in the introduction has been implemented as three components. The first, and the simplest, is the task of visualizing the network of interactions by which new information products are synthesized. The second, automating the construction of such a network for a user-defined task, is interdependent with the third, evaluating semantic change in a knowledge discovery environment, and both utilize functionality of the first. An examination of the abstract properties of data, methods and experts is followed by an explanation of these components and their interrelationships.
3.1 Formal representation of components and changes This section explains how the abstract properties of data, methods and experts are represented, and then employed to track semantic changes as information products are produced utilizing tools described above. From the description in Section 2 it should be evident that such changes are a consequence of the arrangement of data, computational methods and expert interaction applied to data. At an abstract level above that of the data and methods used, we wish to represent some characteristics of these three sets of components in a formal sense, so that we can describe the effects deriving from their interaction. One strong caveat here is that our semantic description (described below) does not claim to capture all senses of meaning attached to data, methods or people, and in fact as a community of researchers we are still learning about which facets of semantics are important and how they might be described. It is not currently possible to represent all aspects of meaning and knowledge within a computer, so we aim instead to provide descriptions that are rich enough to allow users to infer aspects of meaning that are important for specific tasks from the visualizations or reports that we can synthesize. In this sense our own descriptions of semantics play the role of a signifier—the focus is on conveying meaning to the reader rather than explicitly carrying intrinsic meaning per-se. The formalization of semantics based on ontologies and operated on using a language capable of representing relations provides for powerful semantic modelling (Kuhn, 2002). The framework, rules, and facts used in the Solution Synthesis Engine (see below) function in this way. Relationships are established between each of the entities, by calculating their membership within a set of objects capable of synthesizing a solution. We extend the approach of Kuhn by allowing the user to narrow a search for a solution based on the specific semantic attributes of entities. Using the minimal spanning tree produced from the solution synthesis it is possible to retrace the steps of the process to calculate semantic change. As each fact is asserted it contains information about the rule that created it (the method) and the data and human experts that were identified as resources required. If we are able to describe the change to the data (in terms of abstract semantic properties) imbued by each of the processes through which it passes, then it is possible to represent the change between the start state and the finish state by differencing the two. Although the focus of our description is on semantics, there are good reasons for including syntactic and schematic information about data and methods also, since methods generally are designed to work in limited circumstances, using and producing very specific data types (pre-conditions and post-conditions). Hence from a practical perspective it makes sense to
represent and reason with these aspects in addition to semantics, since they will limit which methods can be connected together and dictate where additional conversion methods are required. Additional potentially useful properties arise when the computational and human infrastructure is distributed e.g. around a network. By encoding such properties we can extend our reasoning capabilities to address problems that arise when resources must be moved from one node to another to solve a problem (Gahegan, 1998). Describing Data As mentioned in Section 2, datasets are described in general terms using a domain ontology drawn from generic metadata descriptions. Existing metadata descriptions hold a wealth of such practical information that can be readily associated with datasets; for example the FGDC (1998) defines a mix of semantic, syntactic and schematic metadata properties. These include basic semantics (abstract and purpose), syntactic (data model information, and projection), and schematic (creator, theme, temporal and spatial extents, uncertainty, quality and lineage). We explicitly represent and reason with a subset of these properties in the work described here and could easily expand to represent them all, or any other given metadata description that can be expressed symbolically. Formally, we represent the set of n properties of a dataset D as in Eq. 1 (Gahegan, 1996).
$D\langle p_1, p_2, \ldots, p_n \rangle$
(1)
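To make Eq. 1 concrete, the following is a minimal sketch, not the authors' implementation (which parses DAML+OIL/OWL ontologies into JESS): a dataset D is represented as a set of named properties drawn from its metadata record. The property names and values are illustrative assumptions.

```python
# Minimal sketch of Eq. 1: a dataset D described by properties p1..pn taken
# from an FGDC-style metadata record. Names and values are illustrative only.
landcover = {
    "theme": "landcover",                  # basic semantics
    "spatial_extent": "Western Australia",
    "temporal_extent": "1990",
    "scale": 100_000,                      # nominal scale 1:100,000
    "format": "raster",                    # syntactic property
    "projection": "UTM",
    "lineage": "classified from Landsat TM imagery",
    "uncertainty": "unknown",
}
```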
Describing Methods While standards for metadata descriptions are already mature and suit our purposes, complementary mark-up languages for methods are still in their infancy. It is straightforward to represent the signature of a method in terms of the format of data entering and leaving the method, and knowing that a method requires data to be in a certain format will cause the system to search for and insert conversion methods automatically where they are required. So, for example, if a coverage must be converted from raster format to vector format before it can be used as input to a surface flow accumulation method, then the system can insert appropriate data conversion methods into the evolving query tree to connect to appropriate data resources that would otherwise not be compatible. Similarly, if an image classification method requires data at a nominal scale of 1:100,000 or a pixel size of 30m, any data at finer scales might be generalized to meet this requirement prior to use. Although such descriptions have great practical
benefit, they say nothing about the role the method plays or the transformation it imparts to the data; in short they do not enable any kind of semantic assessment to be made. A useful approach to representing what GIS methods do, in a conceptual sense, centers on a typology (e.g. Albrecht’s 20 universal GIS operators, 1994). Here, we extend this idea to address a number of different abstract properties of a dataset, in terms of how the method invoked changes these properties (Pascoe & Penny, 1995; Gahegan, 1996). In a general sense, the transformation performed by a method (M) can be represented by pre-conditions and post-conditions, as is common practice with interface specification and design in software engineering. Using the notation above, our semantic description takes the form shown in Eq. 2, where Operation is a generic description of the role or function the method provides, drawn from a typology.

$M : D\langle p_1, p_2, \ldots, p_n \rangle \xrightarrow{\text{Operation}} D'\langle p_1', p_2', \ldots, p_n' \rangle$
(2)
For example, a cartographic generalization method changes the scale at which a dataset is most applicable, a supervised classifier transforms an array of numbers into a set of categorical labels, an extrapolation method might produce a map for next year, based on maps of the past. Clearly, there are any number of key dimensions over which such changes might be represented; the above examples highlight spatial scale, conceptual ‘level’ (which at a basic syntactic level could be viewed simply as statistical scale) and temporal applicability, or simply time. Others come to light following just a cursory exploration of GIS functionality: change in spatial extents, e.g. windowing and buffering, change in uncertainty (very difficult in practice to quantify but easy to show in an abstract sense that there has been a change).
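As a hedged illustration of Eq. 2, and under the assumption that datasets are property dictionaries as in the earlier sketch (not the framework's actual encoding), a cartographic generalization method can be written as a transformation with an explicit pre-condition and post-condition on the scale property:

```python
# Sketch of Eq. 2: a method M transforms D<p1..pn> into D'<p1'..pn'>.
# Here the Operation is cartographic generalization, which coarsens scale.
def generalize(dataset: dict, target_scale: int) -> dict:
    # Pre-condition: input data must be at a finer (smaller-denominator) scale.
    if dataset["scale"] >= target_scale:
        raise ValueError("input is not finer than the target scale")
    result = dict(dataset)                     # leave the input unchanged
    result["scale"] = target_scale             # post-condition: new scale
    result["lineage"] = dataset.get("lineage", "") + "; generalized"
    return result

dem = {"theme": "elevation", "scale": 100_000, "format": "raster", "lineage": "contours"}
coarse_dem = generalize(dem, target_scale=250_000)   # scale becomes 1:250,000
```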
Again, we have chosen not to restrict ourselves to a specific set of properties, but rather to remain flexible in representing those that are important to specific application areas or communities. We note that as Web Services (Abel et al., 1998) become more established in the GIS arena, such an enhanced description of methods will be a vital component in identifying potentially useful functionality. Describing People Operations may require additional configuration or expertise in order to carry out their task. People use their expertise to interact with data and methods in many ways, such as gathering, creating and interpreting data, configuring methods and interpreting results. These activities are typically structured around well-defined tasks where the desired outcome is known,
although, as in the case of knowledge discovery, they may sometimes be more speculative in nature. In our work we have cast the various skills that experts possess in terms of their ability to help achieve some desired goal. This, in turn, can be re-expressed as their suitability to oversee the processing of some dataset by some method, either by configuring parameters, supplying judgment or even performing the task explicitly. For example, an image interpretation method may require identification of training examples that in turn necessitate local field knowledge; such knowledge can also be specified as a context of applicability using the time, space, scale and theme parameters that are also used to describe datasets. As such, a given expert may be able to play a number of roles that are required by the operations described above, with each role described by Eq. 3, meaning that expert E can provide the necessary knowledge to perform Operation within the context of p1, …, pn. So to continue the example of image interpretation, p1, …, pn might represent (say) floristic mapping of Western Australia, at a scale of 1:100,000 in the present day.

$E : \xrightarrow{\text{Operation}} \langle p_1, p_2, \ldots, p_n \rangle$
(3)
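A corresponding sketch for Eq. 3, again with invented names and values: an expert role records the Operation the expert can steer and the context of applicability (theme, space, scale, time), and a simple match against a task's properties decides whether the expert can contribute.

```python
# Sketch of Eq. 3: expert E can provide the knowledge to perform Operation
# within the context p1..pn (illustrative names and values).
expert_role = {
    "operation": "image interpretation",
    "context": {"theme": "floristic mapping", "space": "Western Australia",
                "scale": 100_000, "time": "present day"},
}

def expert_applies(role: dict, operation: str, task: dict) -> bool:
    """True if the role offers the requested operation and every context
    property it states matches the corresponding property of the task."""
    return (role["operation"] == operation and
            all(task.get(k) == v for k, v in role["context"].items()))
```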
At the less abstract schematic level, location parameters can also be used to express the need to move people to different locations in order to conduct an analysis, or to bring data and methods distributed throughout cyberspace to the physical location of a person. Another possibility here, that we have not yet implemented, is to acknowledge that a person’s ability to perform a task can increase as a result of experience. So it should be possible for a system to keep track of how much experience an expert has accrued by working in a specific context (described as p1, …, pn). (In this case the expert expression would also require an experience or suitability score, as described for constraint management in section 3.2.) We could then represent a feedback from the analysis exercise to the user, modifying their experience score.

3.2 Solution Synthesis Engine

The automated tool selection process, or solution synthesis, relies on domain ontologies of the methods, data and human experts (resources) that are usable to solve a problem. The task of automated tool selection can be divided into a number of phases. First is the user’s specification of the problem, either using a list of ontological keywords (Pundt and Bishr, 2002) or in their own terms which are mapped to an underlying ontology
Fig. 5. Solution Synthesis Engine user interface
(Bishr, 1997). Second, ontologies of methods, data and human experts need to be processed to determine which resources overlap with the problem ontology. Third, a description of the user’s problem and any associated constraints is parsed into an expert system to define rules that describe the problem. Finally, networks of resources that satisfy the rules need to be selected and displayed. Defining a complete set of characteristic attributes for real-world entities (such as data, methods and human experts) is difficult (Bishr, 1998) due to problems selecting attributes that accurately describe the entity. Bishr’s solution of using cognitive semantics to solve this problem, by referring to entities based on their function, is implemented in this framework. Once again it should be noted that it is not the intention of this framework to describe all aspects of semantics, but to provide descriptions that are rich enough to allow users to infer aspects of meaning that are important for specific tasks from the visualizations or reports that we can synthesize. Methods utilize data or are utilized by human experts and are subject to conditions regarding their use, such as data format, scale or a level of human knowledge. The rules describe the requirements of the methods (‘if’) and the output(s) of the methods (‘then’). Data and human experts, specified by facts, are arguably more passive and the rules of methods are applied to or by them respectively. The set of properties, outlined above, describing data and human experts governs how rules may use them.
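The framework expresses these rules and facts in JESS; the toy sketch below only mimics the idea in Python. Rules state what a method needs (“if”) and what it yields (“then”), data and experts are facts, and the goal is satisfied by chaining backwards from the desired output to rules whose own requirements can in turn be satisfied. The resource names echo the vignette used later in the paper and are not the authors' actual rule base.

```python
# Toy backward-chaining sketch (the framework itself uses JESS, not this code).
RULES = [
    {"method": "viewshed",         "needs": ["candidate_sites", "dem"], "gives": "sunset_view_sites"},
    {"method": "raster_to_vector", "needs": ["suitable_raster"],        "gives": "candidate_sites"},
    {"method": "map_algebra",      "needs": ["dem"],                    "gives": "suitable_raster"},
]
FACTS = {"dem", "road_network"}          # data and human experts asserted as facts

def prove(goal: str, chain: list) -> bool:
    """Goal-driven search: find a rule whose conclusion matches the goal,
    then recursively try to satisfy that rule's requirements."""
    if goal in FACTS:
        return True
    for rule in RULES:
        if rule["gives"] == goal and all(prove(need, chain) for need in rule["needs"]):
            chain.append(rule["method"])     # record the method used for this goal
            return True
    return False

solution_chain = []
prove("sunset_view_sites", solution_chain)
print(solution_chain)    # ['map_algebra', 'raster_to_vector', 'viewshed']
```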
The first stage of the solution synthesis is the user specification of the problem using concepts and keywords derived from the problem ontology. The problem ontology, derived from the methods, data and human expert ontologies, consists of concepts describing the intended uses of each of the resources. This limitation was introduced to ensure the framework had access to the necessary entities to solve a user’s problem. A more advanced version of the problem specification using natural language parsing is beyond the scope of this proposal. This natural language query would be mapped to the problem ontology, allowing the user to use their own semantics instead of being governed by those of the system. The second stage of the solution synthesis process parses the rules and facts from DAML+OIL or OWL into rules and facts describing relationships between data, methods, and human experts. It is important to note that these rules do not perform the operations described; rather, they mimic the semantic change that would accompany such an operation. The future work section outlines the goal of running this system in tandem with a codeless programming environment to run the selected toolset automatically. With all of the solutions defined by facts and rules, the missing link is the problem. The problem ontology is parsed into JESS to create a set of facts. These facts form the “goal” rule that mirrors the user’s problem specification. The JESS engine now has the requisite components for tool selection. During the composition stage, as the engine runs, each of the rules “needed” is satisfied using backward chaining, the goal is fulfilled, and a network of resources is constructed. As each rule fires and populates the network, a set of criteria is added to a JESS fact describing each of the user criteria that limits the network. Each of these criteria is used to create a minimal spanning tree of entities. User criteria are initially based upon the key spatial concepts of identity, location, direction, distance, magnitude, scale, time (Fabrikant and Buttenfield, 2001), availability, operation time, and semantic change. Users specify the initial constraints via the user interface (figure 5) prior to the automated selection of tools. Once again using the vignette, a satellite image is required for the interpretation task, but the only available data is 6 months old and data from the next orbit over the region will not be available for another 4 weeks. Is it “better” to wait for that data to become available or is it more crucial to achieve a solution in a shorter time using potentially out-of-date data? In the case of landuse change, perhaps 6-month-old data is acceptable to the user; however, in a disaster management scenario, more timely data may be important. It is possible that the user will request a set of limiting conditions that are too strict to permit a solu-
tion. In these cases all possible solutions will be displayed allowing the user to modify their constraints. It is proposed that entities causing the solution to be excluded are highlighted allowing the user to relax their constraints. The user specified constraints are used to prune the network of proposed solutions to a minimal spanning tree that is the solution (or solutions) that satisfies all of the user’s constraints. The final stage of the solution synthesis is the visualization of the process that utilizes a self-organising graph package (ConceptVISTA). This stage occurs in conjunction with the composition stage. As each of the rules fires, it generates a call to ConceptVISTA detailing the source (the rule which fired it – the method) and the data and human experts that interacted with the method. This information is passed to ConceptVISTA, which matches the rules with the ontologies describing each of the entities, loads visual variables and updates the self-organising graph. As well as the visual representation in ConceptVISTA, an ontology is constructed detailing the minimal spanning tree. This ontology describes the relationships between the methods, data and human experts as constructed as part of the solution synthesis.
4 Results

This section presents the results of the framework’s solution synthesis and representation of semantic change. The results of the knowledge discovery visualization are implicit in this discussion as that component is used for the display of the minimal spanning tree (Figure 6). A sample problem, finding a home location with a sunset view, is used to demonstrate the solution synthesis. In order to solve this problem, raster (DEM) and vector (road network) data need to be integrated. A raster overlay, using map algebra, followed by buffer operations is required to find suitable locations, from height, slope and aspect data. The raster data of potential sites needs to be converted to a vector layer to enable a buffering operation with vector road data. Finally, a viewshed analysis is performed to determine how much of the landscape is visible from candidate sites. The problem specification was simplified by hard-coding the user requirements into a set of facts loaded from an XML file. The user’s problem specification was reduced to selecting pre-defined problems from a menu. A user constraint of scale was set to ensure that data used by the methods in the framework was at a consistent scale and appropriate data layers
were selected based on their metadata and format. With the user requirements parsed into JESS and a problem selected, the solution engine selected the methods, data and human experts required to solve the problem. The solution engine constructed a set of all possible combinations and then determined the shortest path by summing the weighted constraints specified by the user. Utilizing the abstract notation from above, with methods specifying change (see Eq. 4), the user weights were included and summed for all modified data sets (see Eq. 5). As a result of this process the solution set is pruned until only the optimal solution remains (based on user constraints).
$M_1 : D\langle p_1, p_2, \ldots, p_n \rangle \xrightarrow{\text{Operation}} D'\langle p_1', p_2', \ldots, p_n' \rangle$
(4)
$\sum \left( D_1'\langle u_1 p_1', u_2 p_2', \ldots, u_n p_n' \rangle, \ldots, D_n'\langle u_1 p_1', u_2 p_2', \ldots, u_n p_n' \rangle \right)$
(5)
Fig. 6. Diagram showing the simple vignette solution, derived from a thinned network
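A hedged reading of Eqs. 4 and 5 in code (the weights, properties, and candidate solutions are invented for illustration): each candidate's modified datasets are scored by a user-weighted sum over their changed properties, and the lowest-cost candidate is the one retained after pruning.

```python
# Sketch of the pruning step (cf. Eq. 5): sum user-weighted property changes
# over every modified dataset in a candidate solution and keep the cheapest.
# All numbers below are invented for illustration.
user_weights = {"scale_change": 0.5, "data_age_months": 0.3, "semantic_change": 0.2}

candidates = {
    "wait_for_next_orbit": [{"scale_change": 0.0, "data_age_months": 1.0, "semantic_change": 1.0}],
    "use_archived_image":  [{"scale_change": 1.0, "data_age_months": 6.0, "semantic_change": 1.0}],
}

def cost(solution):
    return sum(user_weights[prop] * value
               for dataset in solution
               for prop, value in dataset.items())

optimal = min(candidates, key=lambda name: cost(candidates[name]))
print(optimal, {name: round(cost(sol), 2) for name, sol in candidates.items()})
```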
5 Future Work

The ultimate goal of this project is to integrate the problem-solving environment with the codeless programming environment GeoVISTA Studio (Gahegan et al., 2002), currently under development at Pennsylvania State University. The possibility of supplying data to the framework and determining the types of questions which could be answered with it is also an interesting problem. A final goal is the use of natural language parsing of the user’s problem specification.
6 Conclusions

This paper outlined a framework for representing, manipulating and reasoning with geographic semantics. The framework enables visualizing knowledge discovery, automating tool selection for user-defined geographic problem solving, and evaluating semantic change in knowledge discovery environments. A minimal spanning tree representing the optimal (least-cost) solution was extracted from the resulting network, and can be displayed in real-time. The semantic change(s) that result from the interaction of data, methods and people contained within the resulting tree represent the formation history of each new information product (such as a map or overlay) and can be stored, indexed and searched as required.
Acknowledgements Our thanks go to Sachin Oswal, who helped with the customization of the ConceptVISTA concept visualization tool used here. This work is partly funded by NSF grants: ITR (BCS)-0219025 and ITR Geosciences Network (GEON).
References Abel, D.J., Taylor, K., Ackland, R., and Hungerford, S. 1998, An Exploration of GIS Architectures for Internet Environments. Computers, Environment and Urban Systems. 22(1) pp 7 -23. Albrecht, J., 1994. Universal elementary GIS tasks- beyond low-level commands. In Waugh T C and Healey R G (eds) Sixth International Symposium on Spatial Data Handling : 209-22. Bishr, Y., 1997. Semantic aspects of interoperable GIS. Ph.D Dissertation Thesis, Enschede, The Netherlands, 154 pp. Bishr, Y., 1998. Overcoming the semantic and other barriers to GIS interoperability. International Journal of Geographical Information Science, 12(4): 299314. Brodaric, B. and Gahegan, M., 2002. Distinguishing Instances and Evidence of Geographical Concepts for Geospatial Database Design. In: M.J. Egenhofer and D.M. Mark (Editors), GIScience 2002. Lecture Notes in Computing Science 2478. Springer-Verlag, pp. 22-37. Chandrasekaran, B., Josephson, J.R. and Benjamins, V.R., 1997. Ontology of Tasks and Methods, AAAI Spring Symposium.
Egenhofer, M., 2002. Toward the semantic geospatial web, Tenth ACM International Symposium on Advances in Geographic Information Systems. ACM Press, New York, NY, USA, McLean, Virginia, USA, pp. 1-4. Fabrikant, S.I. and Buttenfield, B.P., 2001. Formalizing Semantic Spaces for Information Access. Annals of the Association of American Geographers, 91(2): 263-280. Federal Geographic Data Committee. FGDC-STD-001-1998. Content standard for digital geospatial metadata (revised June 1998). Federal Geographic Data Committee. Washington, D.C. Fonseca, F.T., 2001. Ontology-Driven Geographic Information Systems. Doctor of Philosophy Thesis, The University of Maine, 131 pp. Fonseca, F.T., Egenhofer, M.J., Jr., C.A.D. and Borges, K.A.V., 2000. Ontologies and knowledge sharing in urban GIS. Computers, Environment and Urban Systems, 24: 251-271. Fonseca, F.T. and Egenhofer, M.J., 1999. Ontology-Driven Geographic Information Systems. In: C.B. Medeiros (Editor), 7th ACM Symposium on Advances in Geographic Information Systems, Kansas City, MO, pp. 7. Gahegan, M., Takatsuka, M., Wheeler, M. and Hardisty, F., 2002. Introducing GeoVISTA Studio: an integrated suite of visualization and computational methods for exploration and knowledge construction in geography. Computers, Environment and Urban Systems, 26: 267-292. Gahegan, M. N. (1999). Characterizing the semantic content of geographic data, models, and systems. In Interoperating Geographic Information Systems (Eds. Goodchild, M.F., Egenhofer, M. J. Fegeas, R. and Kottman, C. A.). Boston: Kluwer Academic Publishers, pp. 71-84. Gahegan, M. N. (1996). Specifying the transformations within and between geographic data models. Transactions in GIS, Vol. 1, No. 2, pp. 137-152. Guarino, N., 1997a. Semantic Matching: Formal Ontological Distinctions for Information Organization, Extraction, and Integration. In: M.T. Pazienza (Editor), Information Extraction: A Multidisciplinary Approach to an Emerging Information Technology. Springer Verlag, pp. 139-170. Guarino, N., 1997b. Understanding , building and using ontologies. International Journal of Human-Computer Studies, 46: 293-310. Hakimpour, F. and Timpf, S., 2002. A Step towards GeoData Integration using Formal Ontologies. In: M. Ruiz, M. Gould and J. Ramon (Editors), 5th AGILE Conference on Geographic Information Science. Universitat de les Illes Balears, Palma de Mallorca, Spain, pp. 5. Honda, K. and Mizoguchi, F., 1995. Constraint-based approach for automatic spatial layout planning. 11th conference on Artificial Intelligence for Applications, Los Angeles, CA. p38. Kokla, M. and Kavouras, M., 2002. Theories of Concepts in Resolving Semantic Heterogeneities, 5th AGILE Conference on Geographic Information Science, Palma, Spain, pp. 2. Kuhn, W., 2002. Modeling the Semantics of Geographic Categories through Conceptual Integration. In: M.J. Egenhofer and D.M. Mark (Editors), GIScience 2002. Lecture Notes in Computer Science. Springer-Verlag.
MacEachren, A.M., in press. An evolving cognitive-semiotic approach to geographic visualization and knowledge construction. Information Design Journal. Mark, D., Egenhofer, M., Hirtle, S. and Smith, B., 2002. Ontological Foundations for Geographic Information Science. UCGIS Emerging Resource Theme. Pascoe R.T and Penny J.P. (1995) Constructing interfaces between (and within) geographical information systems. International Journal of Geographical Information Systems, 9:p275. Pundt, H. and Bishr, Y., 2002. Domain ontologies for data sharing–an example from environmental monitoring using field GIS. Computers & Geosciences, 28: 95-102. Smith, B. and Mark, D.M., 2001. Geographical categories: an ontological investigation. International Journal of Geographical Information Science, 15(7): 591-612. Sotnykova, A., 2001. Design and Implementation of Federation of SpatioTemporal Databases: Methods and Tools, Centre de Recherche Public - Henri Tudor and Laboratoire de Bases de Donnees Database Laboratory. Sowa, J. F., 2000, Knowledge Representation: Logical, Philosophical and Computational Foundations (USA: Brooks/Cole). Turner, M. and Fauconnier, G., 1998. Conceptual Integration Networks. Cognitive Science, 22(2): 133-187. Visser, U., Stuckenschmidt, H., Schuster, G. and Vogele, T., 2002. Ontologies for geographic information processing. Computers & Geosciences, 28: 103-117.
A Framework for Conceptual Modeling of Geographic Data Quality

Anders Friis-Christensen¹, Jesper V. Christensen¹, and Christian S. Jensen²

¹ National Survey and Cadastre, Rentemestervej 8, CPH, Denmark, {afc|jvc}@kms.dk
² Aalborg University, Fredrik Bajers Vej 7E, Aalborg, Denmark, [email protected]
Abstract

The notion of data quality is of particular importance to geographic data. One reason is that such data is often inherently imprecise. Another is that the usability of the data is in large part determined by how “good” the data is, as different applications of geographic data require that different qualities of the data are met. Such qualities concern the object level as well as the attribute level of the data. This paper presents a systematic and integrated approach to the conceptual modeling of geographic data and quality. The approach integrates quality information with the basic model constructs. This results in a model that enables object-oriented specification of quality requirements and of acceptable quality levels. More specifically, it extends the Unified Modeling Language with new modeling constructs based on standard classes, attributes, and associations that include quality information. A case study illustrates the utility of the quality-enabled model.
1 Introduction We are witnessing an increasing use and an increasingly automated use of geographic data. Specifically, the distribution of geographic data via web services is gaining in popularity. For example, the Danish National Survey and Cadastre has developed an initial generation of such services [12]. This development calls for structured management and description of data quality. The availability of associated quality information is often essential when determining whether or not available geographic data are appropriate for a given application (also known as the fitness for use). When certain geographic objects from a certain area are extracted for use in an application, these should be accompanied by quality information specific to these objects and of relevance to the application. Quality information is necessary metadata, and there is a need for a more dynamic approach to quality management in which quality information is an integrated part of a geographic data model. This allows the relevant quality information to be retrieved together with the geographic data. This is in contrast to the use of traditional, general metadata reports that are separate from the geographic data itself.
We suggest that the specification of data quality requirements should be an integrated part of conceptual modeling. To enable this, we extend the metamodel of the Unified Modeling Language to incorporate geographic data quality elements. The result is a framework that enables designers and users to specify quality requirements in a geographic data model. Such a quality-enabled model supports application-specific distribution of geographic data, e.g., one that uses web services. Use of this framework has several advantages:
• It “conceptualizes” the quality of geographic data.
• It enables integrated and systematic capture of quality in conceptual modeling.
• It eases the implementation of data quality requirements in databases.
Fundamental elements of quality have been investigated in past work [3, 7, 8, 14], on which this paper builds. In addition, the International Standardization Organization’s TC211 is close to the release of standards for geographic data quality [10, 11]. Their work focuses on identifying, assessing, and reporting quality elements relevant to geographic data; they do not consider the integration of data quality requirements into conceptual models. We believe that conceptual modeling is important when making the management of geographic data quality operational. Constraints are an important mechanism for quality management. Constraints enable the specification of when data are consistent and enable the enforcement of the consistency requirements. An approach to ensure geographic data quality using constraints specified in the Object Constraint Language (OCL) has been investigated previously [2]. However, constraints and consistency issues are only relevant to a subset of geographic data quality; in this paper, we focus on geographic data quality from a more general point of view. The remainder of the paper is organized as follows. Section 2 presents the quality elements and investigates how to classify quality requirements. Section 3 develops a framework for modeling quality by extending UML to include quality elements. Section 4 discusses and evaluates the approach based on an example of how the extended model can be used. Finally, Section 5 concludes and briefly presents research directions.
2 Quality of Geographic Data This section describes quality and the requirements to geographic data quality. Two overall approaches may be used for describing quality: the user-based approach and the product-based approach [6]. The former emphasizes fitness for use. It is based on quality measures evaluated against user requirements dependent on a given application. The product-based approach considers a finite number of quality measures against which a product specification can be evaluated. While the product specification is based on user requirements, it is often a more general specification that satisfies the needs of a number of different customers. The ISO 9000 series (quality management) defines quality as: the totality of characteristics of a product that bear on its ability to satisfy stated and implied needs [9]. This definition is used in the ISO standards on quality of geographic data [10, 11], and it covers both the
user-based and the product-based approach. Geographic data quality is defined by a finite set of elements, which we term quality information. The quality elements that need to be included in a quality report are identified by ISO and others [11, 14] and they include: lineage, accuracy, consistency, completeness (omission and commission). In addition to the elements specified by ISO, we include precision and metric
Fig. 1. Quality Information: Quality Elements and Quality Subelements
consistency [8] because they are identified as necessary elements by the Danish National Survey and Cadastre. The difference between accuracy and precision is that precision is the known uncertainty, e.g., of the instrument used for data capture. All elements have additional subelements as specified in Figure 1. As quality is a relative measure, data users and database designers are faced with two challenges: how to specify their quality requirements and how to determine whether data satisfy the requirements. The first challenge concerns the specification of requirements, which are often not expressed directly. Introducing quality elements in conceptual models is helpful in meeting this challenge because this enables the capture of quality requirements in the design phase. The second challenge concerns the possible difference between actual and required data quality. A solution is to make it possible for users to evaluate their requirements against the data, by including quality information about each object, attribute, and association in the database. Section 3 provides an approach to meet these challenges. To be able to support quality requirements in conceptual models, we need to identify the characteristics of such requirements. In doing so, we divide the requirements into two kinds. First, it should be possible to specify which quality subelements are relevant for a given application. These requirements are termed Quality Element Requirements (QERs). Second, it should be possible to express requirements related to the values of the actual quality, i.e., express specifications or standards that the data should satisfy. These requirements are termed Acceptable Quality Levels (AQLs). The two types of requirements are at different levels and are similar to the notions of data quality requirements and application quality requirements [15].
The QERs can be characterized as the elements that are necessary to be able to assess and store quality. Thus, QERs reside on a detailed design level. On a more abstract level, we find requirements such as “90% of all objects must be present in the data,” which exemplifies the AQLs. It is important to be able to separate these two levels of requirements. The AQLs do not necessarily have to be specified, but if any quality information is needed, the QERs must be specified. Figure 2 shows the two levels. The requirements are specified as user-defined, which means that the designers are specifying the requirements, which should reflect the requirements from the application users. As seen in the figure, the quality
Fig. 2. Quality Requirements
elements require different methods for their assessment: internal and external methods. In the internal approach, quality is assessed using internal tests immediately when data are inserted into the database, and this testing is independent of external sources. For example, topological constraints are evaluated immediately after data are inserted into the database. Data that do not satisfy the constraints are rejected. In the external approach, data are checked against an external reference that is believed to be the “truth.” For example, if 90% of all forest objects are required to be present in the data, this has to be measured against some kind of reference. Quality elements assessed using external methods need to be assessed at a later stage, by means of an application running on top of the database. Finally, the quality elements are divided into an aggregation level and an instance level. These levels relate to whether quality is measured/specified for each instance of an object type or the quality is aggregated at, e.g., an object-type level. It should be possible to specify requirements related to both levels. As an example, a building object (instance level) has a spatial accuracy, which is different from the aggregated spatial accuracy of all buildings (aggregation level). Both levels may be important to an application.
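As an illustration of the two assessment routes (a sketch with invented names and values, not part of the framework): an internal check is evaluated immediately when an object is inserted, while an external check compares the stored objects against a reference believed to represent the universe of discourse; aggregation-level measures are then derived from the instance-level values.

```python
# Sketch: internal vs. external assessment, instance vs. aggregation level.
buildings = []

def insert_building(building: dict) -> None:
    """Internal assessment: a constraint checked at insertion time; objects
    that violate it are rejected (illustrative minimum-area rule)."""
    if building["area_m2"] <= 25:
        raise ValueError("rejected: violates domain constraint")
    buildings.append(building)

def completeness(stored_ids: set, reference_ids: set) -> tuple:
    """External assessment: omission and commission measured against a
    reference data set believed to be the truth."""
    omission = len(reference_ids - stored_ids) / len(reference_ids)
    commission = len(stored_ids - reference_ids) / len(reference_ids)
    return omission, commission

def aggregated_accuracy(objects: list) -> float:
    """Aggregation level derived from instance-level accuracy values."""
    return sum(o["accuracy_m"] for o in objects) / len(objects)
```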
3 Modeling Quality We proceed to present a framework for conceptual modeling of geographic data quality. We first describe how conceptual data models can be extended to support quality requirements. Then the various quality subelements are related to model constructs. This sets the stage for the quality-enabled model, which is covered last. A more thorough description of the quality-enabled model is given elsewhere [4].
3.1 Conceptual Data Models

Conceptual data models aim to enable domain experts to reflect their modeling requirements clearly and simply. We use the Unified Modeling Language (UML) [1] for modeling. The UML metamodel represents model constructs³ as classes, e.g., class, association, and attribute. We term these UML base classes. UML provides three built-in mechanisms for extending its syntax and semantics: stereotypes, tagged values, and constraints [1, 16]. Using these mechanisms, it is possible to adapt the UML semantics without changing the UML metamodel, and so these are referred to as lightweight extensibility mechanisms [13]. Another approach to extending UML is to extend the metamodel, e.g., by introducing new metaclasses. A metaclass is a class whose instances are classes [1], and it defines a specific structure for those classes. The new metaclasses can also define stereotypes, which are used when the new modeling constructs are required. We refer to this as the heavyweight extensibility mechanism [13].

3.2 Quality in Conceptual Data Models

Conceptual data models typically support constructs such as classes, associations, and attributes. No special notation is offered for the modeling of data quality, and database designs often do not model data quality, perhaps because data producers do not realize the importance of data quality information. This may reduce the applicability of a database: to be able to use data appropriately, it is important to have access to quality information. Integration of data quality subelements with the common model constructs is a significant step in making it possible to express data quality requirements in the conceptual design phase. This contributes to enabling subsequent access to quality information. Table 1 relates data model constructs to data quality elements, and is used to extend UML with data quality elements. As can be seen from the table, we divide quality subelements related to model constructs into an aggregation level and an instance level. The aggregation level concerns aggregated quality subelement measures for all objects of a class. The instance level concerns quality subelement measures for an individual object. As described in Section 2, quality requirements can reside at both levels. Furthermore, the quality subelements are classified according to how they are assessed. Completeness and accuracy subelements require external data to be assessed, whereas the remaining quality subelements do not. Precision and lineage information are closely related to the production process and are attached to instances of the model constructs. Precision can later be assessed at the aggregation level. The consistency requirements are related to the specification of the universe of discourse. They specify requirements that must be satisfied and are usually expressed as constraints in conceptual models. Consistency requirements can be specified for instances, but also for collections and sets (aggregation level).
³ Termed model elements in UML; here, we use “model constructs” to avoid confusion with “quality element.”
Table 1. Quality Related to Model Constructs
At the instance level, the omission subelement of completeness is not included for objects. This is because information cannot be associated with objects that do not exist in the database. On the other hand, commission errors can be stored; this means that an object in the data set does not exist in the universe of discourse. For attributes and associations, we can assess both omission and commission. When a class is instantiated, it is possible to determine whether or not an attribute has a value. The same applies for explicit associations, as they can be implemented as foreign keys or object reference attributes. Only spatial accuracy and precision of associations are possible. This is the relative distance, which may be relevant in certain situations. Based on our requirements, we do not find relative temporal and thematic accuracy/precision to be relevant. 3.3 Quality-Enabled Model The quality-enabled model presented in this section consists of several elements. First, the class, attribute, and association constructs are extended in the UML metamodel to contain quality subelements. They are used to describe the QERs at the instance and aggregation levels. Second, to express consistency requirements, the constraints are extended to include specific geographic constraints (e.g., topological). Third, optional tagged values are created and used to specify which QERs are relevant and need to be assessed at the instance and aggregation levels. Finally, to specify the AQLs, a specification compartment is used for both the instance and the aggregation level. Quality Element Requirements In Figure 3, the UML metamodel is extended to support geographic data quality based on Table 1. The metaclass specifies the structures of the quality subele-
ments of classes, attributes, and associations, and of their instances. Three new metaclasses are defined: ClassQuality, AttributeQuality, and AssociationQuality. These are subclasses of Class, Attribute, and Association, respectively, from the UML core metamodel. For the abstract metaclass AttributeQuality, we define three subclasses that represent the spatial, thematic, and temporal dimensions. As a subclass to the metaclass Association from the UML core, we define an AssociationQuality, which is used when information on omission and commission needs to be captured. We also define a subclass SpatialAssocQuality, which, in addition to omission and commission, stores information about relative accuracy and precision.
[Figure 3 (metamodel diagram text): Attribute (from Core) and Association (from Core) with subclasses AttributeQuality and AssociationQuality. Each class instance must have QualityInformationAttr of type: Lineage, Commission, Accuracy, Precision. Each attribute instance must have QualityInformationAttr of type: Lineage, Omission, Commission, Accuracy, Precision. Each association instance must have QualityInformationAttr of type: Lineage, Omission, Commission.]
The attributes accuracy, precision, omission, and commission, some of which are associated with metaclasses that carry quality information, are statically scoped, meaning that their values belong to the class and, hence, apply to all instances. They enable specification of quality requirements at the aggregation level. Quality subelements relevant at the instance level are modeled using the metaclass QualityInformationAttr. This metaclass inherits from Attribute from the UML core and defines a new attribute, which has a type and assessment method. The type of the attribute can be: Lineage, Accuracy, Precision, Omission, or Commission. ClassQuality, AttributeQuality, and AssociationQuality are all associated with instances of QualityInformationAttr, which describe various aspects of quality. They are instance-scoped attributes, which means that they are attributes of each instance of instantiated metamodel classes. The assessment method is optional; however, it can be useful when analyzing a quality report. The instance-scoped attributes are differentiated from the statically scoped attributes by having an initial capital letter.
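A rough analogue of this scoping distinction, sketched in Python rather than in the UML metamodel itself: class attributes stand in for the statically scoped, aggregation-level measures, and instance attributes (given an initial capital letter here, echoing the paper's convention) stand in for the instance-scoped QualityInformationAttr values. The attribute names are illustrative assumptions.

```python
# Sketch: statically scoped vs. instance-scoped quality attributes.
class Building:
    # Statically scoped (aggregation level): values belong to the class and
    # apply to all instances, e.g. aggregated accuracy of all buildings.
    accuracy = None
    precision = None
    omission = None
    commission = None

    def __init__(self, shape, lineage, accuracy_m):
        self.shape = shape
        # Instance scoped: quality information carried by each object,
        # written with an initial capital letter as in the paper.
        self.Lineage = lineage
        self.Accuracy = accuracy_m
        self.Commission = False    # set if the object has no real-world counterpart
```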
The extended metamodel and its metaclasses are elsewhere used to define new stereotypes [4]. The tagged values and acceptable quality level are explained later in this section. The string <<stereotype>> denotes a new stereotype, which can be used in a model. An example is that a class, the instances of which need to carry quality information, is associated with the stereotype <<ClassQuality>>. Since we change the structure of UML’s metamodel (adding new attributes to the metaclasses), we apply heavyweight extension to UML. It has to be noted that the stereotypes only express which quality subelements are relevant. Not all quality subelements have to be assessed if certain requirements from producers exist. A tagged value (see Section 3.3) can be used to specify which subelements need to be assessed. Constraints specify consistency requirements in a conceptual model. Constraints can be associated with all modeling constructs (e.g., classes, attributes, and associations), and all types are subclasses of the metaclass constraint in the UML metamodel. Constraints are specified in OCL, which means that OCL must be extended with new operators, e.g., the topological operator inside. An example domain constraint specified for an attribute follows: context Building inv domainConstraint: self.shape.area > 25. This constraint specifies that the area of a building should be greater than a predefined value (25). Four constraint types are classified and stereotypes are defined accordingly: <<DomainConstraint>>, <<FormatConstraint>>, <<MetricConstraint>>, and <<TopologicalConstraint>>. These constraints are further specified elsewhere [4].

Quality Subelements to be Assessed

If there is a need to reduce the number of quality subelements, we can use tagged values (qualityElementI and qualityElementA) for specifying which quality elements should be assessed and stored. It is important to note that tagged values can be seen as additional documentation details and do not have to be specified. A comprehensive quality report assesses all elements. The tagged value qualityElementI specifies which elements need to be assessed at the instance level and qualityElementA specifies which elements need to be assessed at the aggregation level. There is a difference in allowed subelements; for example, lineage information is only relevant at the instance level. The tagged values can be associated with classes, attributes, and associations. To assess quality at the aggregation level, the relevant information must exist at the instance level. As an example, we add a tagged value to a class Building, specifying the subelements that should be assessed at the aggregation level: qualityElementA = .

Acceptable Quality Level

Finally, to specify the AQLs, two specification compartments are introduced. They can be associated with classes (for classes and attributes) and associations. There is one compartment for specifying the AQLs for each of the instance level and the
aggregation level. The database designer specifies the AQLs. However, users of data can also express AQLs. Data can be evaluated against their requirements and the usability can be determined. Specification of AQLs is optional. The expressions stated in the specification compartments are simply constraints related to the assessed quality, which must be satisfied. The form of the expression is: <modelconstruct> : (). An example of the acceptable quality level is given in the next section.
4 Discussion of Approach

This section discusses the approach presented in the previous section. First, a case study elicits some quality requirements relevant for certain geographic entities. We then show how these requirements can be modeled using our approach. Finally, the approach is evaluated against other approaches.

4.1 Case Study

The case study consists of a map database with three classes, Road, Road Segment, and Municipality, that are depicted in Figure 4 using UML. It is a simplified model of a road network with associated municipalities. The objects in the model are intended for use in the scale of 1:10,000. The model contains no explicit information of quality requirements, although some are represented implicitly in the model.
[Figure 4 (UML diagram text): classes Road (name[1] : String), Road_Segment (extent[1] : Polyline), and Municipality (name[1] : String, shape[1] : Polygon); association “located in” between Road and Road_Segment and association “is within” between Road_Segment and Municipality, with multiplicities 1, 1..*, and 0..*.]
Fig. 4. Conceptual Model for Road Network
For example, the cardinalities of associations express consistency requirements. Apart from such implicit quality requirements, several additional quality requirements exist. As examples, we state five quality requirements that are to be included in the model.

1. For all roads, road segments, and municipalities, all quality information should be recorded (completeness, accuracy, precision, and lineage requirement).
2. All road segments must have an average root mean square error of maximum 1.7 meters (accuracy requirement).
3. The name of a road object should be correct (accuracy requirement).
4. At least 99% of all roads in the universe of discourse must be represented in the data, and no more than 1% of all roads must be represented in data without being in the universe of discourse (completeness requirement).
5. A road segment must not intersect a municipality border (consistency requirement).

The above list exemplifies the variety among quality requirements. We proceed to express these requirements in our conceptual model.
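Requirements 2 and 4 above can be read as machine-checkable acceptance tests. The sketch below, with invented assessed values, shows how such AQLs might be evaluated against a quality report; it is an illustration, not part of the framework.

```python
# Sketch: evaluating two of the stated acceptable quality levels (AQLs)
# against assessed quality values. The assessed numbers are invented.
assessed = {
    "road_segment_rmse_m": 1.4,   # average root mean square error of segments
    "road_omission": 0.006,       # fraction of real-world roads missing from the data
    "road_commission": 0.004,     # fraction of stored roads with no real-world counterpart
}

aql_satisfied = {
    "requirement 2": assessed["road_segment_rmse_m"] <= 1.7,
    "requirement 4": (assessed["road_omission"] <= 0.01
                      and assessed["road_commission"] <= 0.01),
}
print(aql_satisfied)   # {'requirement 2': True, 'requirement 4': True}
```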
4.2 Modeling Quality Requirements

We use the stereotypes specified above to model the QERs. Classes Road, Road Segment, and Municipality are all stereotyped with <<ClassQuality>> (as depicted in Figure 5). This is because we are interested in all available quality information about each class and associated instance (i.e., quality information at both aggregation and instance level). To the attribute name of Road, we associate the stereotype <<ThematicAttrQuality>> because thematic attribute quality information is required. To the attributes extent and shape of Road Segment and Municipality, we associate the stereotype <<SpatialAttrQuality>> because spatial attribute quality information is required. For the remaining thematic attribute name of Municipality, we do not require any quality information.
[Figure 5 (diagram text): «ClassQuality» Road with «ThematicAttrQuality» name[1] : String; «TopologicalConstraint» {Road_Segment.extent do not cross Municipality.shape}]
Fig. 5. Case with Quality Elements
AQLs appear in the specification compartments. An example is that for Road, we specify that commission and omission must be less than 1%, which means that less than 1% of all roads must be in excess compared to reality and less than 1% of the roads may be missing from the data. These are requirements on the data producers; if they fail, new registrations must be initiated. Different approaches can be used to develop a quality-enabled model. We have developed an extension to UML. Another approach is to add quality information to standard geographic data models using design patterns [5]. This approach has the advantage that extension to UML is not needed. An example is that the QERs at the aggregation level for both classes and attributes are specified using a quality class for each class and attribute for which quality information is required. Only one instance is allowed for each of these quality classes. This is similar to the design pattern Singleton [5]. The instance-level QERs for attributes can be specified using additional attributes. An example is that for a shape attribute, e.g., a lineageShape and an accuracyShape attribute are required for specifying the quality of the attribute. The AQLs can be specified at the application level on top of a given geographic database, or alternatively tagged values can be used to specify the required AQLs.
This approach complicates the schema unnecessarily, as numerous common quality attributes and classes must be associated with the classes, attributes, and associations that require quality information. The alternative solutions described above therefore do not meet our requirements. We require that quality information be an integrated part of standard UML constructs, such as classes and associations; thus, we have extended the properties of existing modeling constructs. The advantage of this approach is that, even though it may at first seem complex, it provides a standard interface for modeling the quality of geographic data. It enables designers to conveniently reflect all requirements related to the quality of geographic data in a conceptual design. Not all quality requirements necessarily have to be specified, and certain requirements can be visually omitted when the model is presented to users with no interest in quality requirements. An example is that the AQLs can be hidden from some users. In the design of a conceptual data model, there is always a balance between how much information to capture in the model and the readability of the model. If a model becomes overloaded with information, it is not easily read. This is an issue that designers have to consider when using our approach. A further advantage of our approach is that the formulation of quality requirements is "standardized," which is helpful in the design and implementation phases. Modeling quality requirements using standard UML does not ensure such a common approach.
5 Conclusions and Future Work

A main motivation for the work reported here is the ongoing change in the distribution and use of geographic data. The trend is towards online and automated access. This leads to an increasing need for the ability to select relevant and appropriate data. In this selection, proper quality information is essential; and since users are selective when it comes to geographic theme and extent, they should receive only the quality information relevant to their selection. This requires new approaches to quality information management. This paper presents a new framework for the integrated conceptual modeling of geographic data and its associated quality. First, we advocate a solution that captures quality information together with the data it concerns. Second, we present a notation for creating conceptual models in which quality requirements can be captured. The use of models based on our notation and framework enables application-oriented quality reports. An example illustrates that designers are given an approach for formulating more precisely the quality requirements that are relevant to a resulting geographic data model. Furthermore, the framework provides a systematic approach to the capture of quality requirements. The paper thus offers a significant step towards a common framework for quality modeling. Several interesting directions for future work exist. Some additional aspects of quality may be taken into consideration. Currently, it is not considered how quality information relates to the results of GIS operations. For example, if spatial data have errors that are spatially autocorrelated, then the data may still be useful for
measuring distances. Next, the development of a more formal specification of the quality-enabled model is of interest, as is the development of a quality evaluation prototype. How to transform conceptual models, created within the proposed framework, into logical models is also a topic for future research. Furthermore, there is a need to develop applications that can be used to assess quality and generate quality reports based on the specified quality requirements. Finally, applications that support the extended UML metamodel should be implemented.
References

[1] G. Booch, J. Rumbaugh, and I. Jacobson. The Unified Modeling Language User Guide. Object Technology Series. Addison-Wesley, 1999.
[2] M. Casanova, R. Van Der Straeten, and T. Wallet. Automatic Constraint Generation for Ensuring Quality of Geographic Data. In Proceedings of MIS, Halkidiki, Greece, 2002.
[3] M. Duckham. Object Calculus and the Object-Oriented Analysis and Design of an Error-Sensitive GIS. GeoInformatica, 5(3):261–289, 2001.
[4] A. Friis-Christensen. Issues in the Conceptual Modeling of Geographic Data. Ph.D. thesis, Department of Computer Science, Aalborg University, 2003.
[5] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
[6] D. A. Garvin. Managing Quality: The Strategic and Competitive Edge. Free Press, 1988.
[7] M. F. Goodchild and S. Gopal, editors. The Accuracy of Spatial Databases. Taylor & Francis, 1989.
[8] S. C. Guptill and J. L. Morrison, editors. Elements of Spatial Data Quality. Elsevier, 1995.
[9] ISO. Quality Management and Quality Assurance—Vocabulary. Technical Report 8402, International Standardization Organization, 1994.
[10] ISO. Geographic Information—Quality Evaluation Procedures. ISO/TC 211 19114, International Standardization Organization, 2001.
[11] ISO. Geographic Information—Quality Principles. ISO/TC 211 19113, International Standardization Organization, 2001.
[12] KMS. Map Service (in Danish), 2003. http://www.kms.dk/kortforsyning.
[13] OMG. White Paper on the Profile Mechanism. 99-04-07, 1999.
[14] H. Veregin. Data Quality Parameters. In P. A. Longley, M. F. Goodchild, D. J. Maguire, and D. W. Rhind, editors, Geographical Information Systems: Principles and Technical Issues, Vol. 1, pp. 177–189. 2nd Edition, John Wiley & Sons, 1999.
[15] R. Y. Wang, M. Ziad, and Y. W. Lee. Data Quality. Advances in Database Systems. Kluwer Academic Publishers, 2001.
[16] J. B. Warmer and A. G. Kleppe. The Object Constraint Language: Precise Modeling with UML. Object Technology Series. Addison-Wesley, 1999.
Consistency Assessment Between Multiple Representations of Geographical Databases: a Specification-Based Approach

David Sheeren1,2, Sébastien Mustière1, and Jean-Daniel Zucker3

1 COGIT Laboratory - IGN France, 2-4 avenue Pasteur, 94165 Saint Mandé, {David.Sheeren,Sebastien.Mustiere}@ign.fr
2 LIP6 Laboratory, AI Section, University of Paris 6
3 LIM&BIO, University of Paris 13, [email protected]
Abstract. There currently exist many geographical databases that represent the same part of the world, each with its own levels of detail and points of view. The use and management of these databases therefore sometimes requires their integration into a single database. The main issue in this integration process is the ability to analyse and understand the differences among the multiple representations. These differences can of course be explained by the various specifications, but can also be due to updates or errors during data capture. In this paper, we propose a new approach to interpret the differences in representation in a semi-automatic way. We consider the specifications of each database as the "knowledge" needed to evaluate the conformity of each representation. This information is extracted from existing documents but also from the data, by means of machine learning tools. The management of this knowledge is enabled by a rule-based system. Application of this approach is illustrated with a case study from two IGN databases. It concerns the differences between the representations of traffic circles.
Keywords: Integration, Fusion, Multiple Representations, Interpretation, Expert-System, Machine Learning, Spatial Data Matching.
1 Introduction

In recent years, a new challenge has emerged from the growing availability of geographical information originating from different sources: their combination in a consistent way in order to obtain more reliable, rich and useful information. This general problem of information fusion is encountered
in different domains: signal and image processing, navigation and transportation, artificial intelligence… In a database context, this is traditionally called "integration" or "federation" [Sheth and Larson 1990, Parent and Spaccapietra 2000]. Integration of classical databases has already been given much attention in the database community [Rahm and Bernstein 2001]. In the field of geographical databases, this is also subject to active research. Contributions concern the integration process itself (schema integration and data integration) [Devogele et al. 1998, Branki and Defude 1998], the development of matching tools [Devogele 1997, Walter and Fritsch 1999], the definition of new models supporting multiple representations [Vangenot et al. 2002, Bédard et al. 2002], and new data structures [Kidner and Jones 1994]. Some ontology-based approaches are now being proposed [Fonseca et al. 2002]. But some issues still need to be addressed in the process of unifying geographical databases, particularly the phase of data integration, i.e. the actual population of the unified database. Generally speaking, this phase is mainly thought of as a matching problem. However, it is also essential to determine whether the differences in representation between homologous objects are "normal", i.e. originating from the differences of specification. Some contributions exist to evaluate the consistency between multiple representations, especially the consistency of spatial relations [Egenhofer et al. 1994, El-Geresy and Abdelmoty 1998, Paiva 1998]. Most of the time, studies rely on a presupposed order between representations. This assumption may not be suited to the study of databases with similar levels of detail but defined according to different points of view. In this paper, we too address the issue of the assessment of consistency between multiple representations. The approach we suggest is based on the use of the specifications of the individual databases. We consider these specifications as the key point to understand the origin of the differences, and we suggest the use of a rule base to explicitly represent this knowledge and manage it. We make no assumptions on a hierarchy between the databases that need integrating. The paper is organised as follows: in section 2, we examine the origins of the differences and the specification-based approach to interpret them. Then we propose an interpretation process and the architecture of the system to implement it in section 3. The feasibility of the approach is demonstrated with a particular application in section 4. We conclude our study in section 5.
2 Specifications for Interpreting Differences
2.1 Origin of Differences Between Representations

Geographical databases are described by means of specifications. These documents describe precisely the contents of a database, i.e. the meaning of each part of the data schema, which objects of the real world are captured, and how they are represented (figure 1).

Fig. 1. Specifications govern the representation of geographical phenomena in databases
The differences between specifications are responsible for the majority of the differences between representations. These differences are completely normal and illustrate the diversity of points of view on the world. For example, a traffic circle may be represented by a dot in one database, or by a detailed surface in another. However, not all differences are justified. The data capture process is not free of errors, and differences can occur between what is supposed to be in the databases and what is actually in the databases. Other differences are due to differences in updates between databases. These differences are problematic because they can lead to inconsistent representations in the multi-representation system and, for that reason, they must be detected and managed in a unification process. More formally, we define hereafter the concepts of equivalence, inconsistency and update between representations.

Let O be the set of objects from a spatial database DB1 and O' the set of objects from a spatial database DB2. Let us consider a matching pair of the form (M, M'), where M is a subset of O and M' a subset of O'.

Definition 1 (equivalence). Representations of a matching pair (M, M') are said to be equivalent if these representations can model a world such that, at the same
time, M and M' respect their specifications and correspond to the same entity of the real world.

Definition 2 (update). Representations of a matching pair (M, M') are said to be of different periods if these representations can model a world such that M and M' respect their specifications and correspond to the same entity of the real world, but at different times.

Definition 3 (inconsistency). Representations of a matching pair (M, M') are said to be inconsistent if they are neither an update nor an equivalence. Thus either M or M' does not respect its specifications (error in the databases), or M and M' do not correspond to the same entity of the real world (matching error).
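Operationally, these three definitions partition the set of matching pairs into three outcomes; a minimal Java sketch of this classification (all names hypothetical) could look as follows.

    // Hypothetical sketch: the outcome of interpreting one matching pair (M, M').
    enum DifferenceInterpretation {
        EQUIVALENCE,    // both representations respect their specifications (Definition 1)
        UPDATE,         // both respect them, but at different times (Definition 2)
        INCONSISTENCY   // neither: database error or matching error (Definition 3)
    }

    // The interpretation process of Section 3 assigns one value per matching pair.
    interface MatchingPairInterpreter {
        DifferenceInterpretation interpret(Object m, Object mPrime);
    }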
The purpose of our work is to define a process to automatically detect and interpret differences between databases. This process, embedded in a decision support system, aims at guiding the management of differences during a unification process.

2.2 Knowledge Acquisition for the Interpretation of the Differences

The key idea of this approach is to make explicit, in an expert-system, the knowledge necessary to interpret the differences. As explained above, a great deal of this knowledge comes from the specifications. Nevertheless, it is rather difficult to draw knowledge from the specifications and to represent it. Actually, the documents are usually rich but voluminous, relatively informal, ambiguous, and not always organised in the same way. Moreover, part of the necessary knowledge comes from common geographical knowledge (for example: traffic circles are more often retained than discarded), and experts are rarely able to supply an explicit description of the knowledge they use in their reasoning. We are thus faced with the well-known problem of the "knowledge acquisition bottleneck". In our process, we try to solve it in three different ways.

The first technique is to split the reasoning involved in the problem solving into several steps (section 3). This is the approach of second-generation expert-systems [David et al 1993]. The control over the inferences that need to be drawn is considered as a kind of knowledge in itself, and is explicitly introduced in the expert-system.

The second technique is to develop rules by hand with the help of a knowledge acquisition process. We believe that such a process should rely on the definition of a formal model of specifications [Mustière et al 2003]. In the example developed in section 4, some of the rules managed by the expert-system have been introduced by hand, after formalising the actual specifications by means of a specific model. This is still ongoing research and it will not be detailed in this paper.
The last technique is the use of supervised machine learning techniques [Mitchell 1997]. These techniques are one of the solutions developed in the Artificial Intelligence field. Their aim is to automatically build some rules from a set of examples given by an expert. These rules can then be used to classify new examples introduced into the system. Such techniques have already been used to acquire knowledge in the geographical domain [Weibel et al 1995, Sester 2000, Mustière et al 2000, Sheeren 2003].
3 The Interpretation Process
3.1 Description of the Steps

In this section, we describe the interpretation process we have defined. It is decomposed in several steps which are illustrated in figure 2.

Fig. 2. From individual databases to the interpretation of differences
The process starts with one correspondence between classes of the schemas of the two databases. We presume that matching at the schema level has already been carried out. For instance, we know that the road class in DB1 tallies with the road and track classes in DB2. The task of specifications analysis is then the key step: the specifications are analysed in order to determine several rule bases that will be used to guide each of the ensuing steps. These rules primarily describe what exactly the databases contain, what differences are likely to appear, and under which conditions. This step is performed through the analysis of documents, or through machine learning techniques. The next step concerns the enrichment of each dataset. This is compulsory before the actual integration of the databases [Devogele et al 1998]. The purpose is to express the heterogeneous datasets in a more homogeneous
way. For this step, the particularity of geographical databases arises from the fact that they contain a great deal of implicit information on spatial relations through the geometry of objects. Their extraction requires specific analysis procedures. A preliminary step of control is then planned: the intra-database control. During this step, part of the specifications is checked so as to detect some internal errors and determine to what extent the data instances globally respect the specifications. This will be useful for the identification of the origin of each difference, but also for the detection of matching errors. Once the data of both databases has been independently controlled, it is matched. Matching relationships between datasets are computed through geometric and topologic data matching. We end up with a set of matching pairs, each one characterised by a degree of confidence. The next step consists in the comparison of the representations of the homologous objects. This is the inter-database control. This comparison leads to the evaluation of the conformity of the differences and particularly implies the use of specifications and expert knowledge. Results of the first control previously carried out are also exploited. At the end, the differences existing within each matching pair are expressed in terms of equivalence, inconsistency or update. After the automatic interpretation of all the differences by means of the expert-system, a global evaluation is supplied: the number of equivalencies, the number of errors and their seriousness, and the number of updates.

3.2 The Architecture of the System

An illustration of the structure of the system is given in figure 3. It is composed of two main modules: the experimental Oxygene GIS and the Jess expert-system. Oxygene is a platform developed at the COGIT laboratory [Badard & Braun 2003]. Spatial data is stored in the relational Oracle DBMS, and the manipulation of data is performed in Java code in the object-oriented paradigm. The mapping between the relational tables and the Java classes is done by the OJB library. A Java API exists to make the link between this platform and the second module, the Jess rule-based system. Jess is an open source environment which can be tightly coupled to code written in Java [Jess 2003]. The rules used by Jess originate directly from the specifications, or have been gathered with the learning tools.
Fig. 3. Architecture of the system
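As a rough illustration of how the two modules can be coupled, the fragment below embeds a Jess engine in Java, loads a rule file and asserts a fact before firing the rules. The file name, the fact template and the exact embedding calls are assumptions made for illustration (the Jess Java API may differ between versions); this is not the authors' actual code.

    import jess.JessException;
    import jess.Rete;

    // Illustrative sketch only: the rule file and the fact template are hypothetical.
    public class InterpretationEngine {
        public static void main(String[] args) throws JessException {
            Rete engine = new Rete();

            // Load the rule bases derived from the specifications.
            engine.executeCommand("(batch \"rules/traffic_circles.clp\")");

            // Assert a fact built from data read through OXYGENE.
            engine.executeCommand(
                "(assert (ComplexTrafficCircle (id 42) (diameterLength 35.0)))");

            // Fire the rules; conclusions can then be read back into Java.
            int fired = engine.run();
            System.out.println(fired + " rule(s) fired");
        }
    }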
4 Differences Between Representations of Traffic Circles: a Case Study

In this section, we study the differences existing between the traffic circles of two databases from the IGN (French National Mapping Agency): BDCarto and Georoute (figure 4). BDCarto is a geographical database meant in particular to produce maps at scales ranging from 1:100,000 to 1:250,000. Georoute is a database with a resolution of 1 m dedicated to traffic applications. The representation of traffic circles can differ from one database to another because the specifications are different. Our question is thus as follows: which differences are "normal", i.e. which representations are equivalent, and which differences are "abnormal", i.e. which representations are inconsistent? We detail the implementation of the process below.

Fig. 4. The road theme of the two geographical databases examined (Georoute and BDCarto)
Specifications analysis. Specifications of BDCarto and Georoute explicitly describe the representation of traffic circles. For both databases the representation can be simplified (corresponding to a node) or detailed (corresponding to connected edges and nodes). The modelling depends
on the diameter of the object in the real world, but also on the presence of a central reservation (figure 5).
Fig. 5. Some specifications concerning traffic circles of BDCarto and Georoute
Specifications introduce differences between datasets. They appear both at the geometry and attribute levels. The description also reveals a difficulty that has already been brought up: the gap existing between the data mentioned in the specifications and the data actually stored in the databases. The traffic circles are implicit objects made of several edges and nodes, but there is no corresponding class in the database. In the same way, the diameter of the objects and the direction of the cycle do not exist as attributes in the databases. It is thus necessary to extract this information in order to check the specifications and enable the comparison between the data. This enrichment is the subject of the next step.

Enrichment of the data. In the unification context, the enrichment of the databases concerns both geometrical data and schemas. In figure 6, we illustrate the new classes and relations created at the schema level. These classes can constitute federative concepts to put the two schemas of the databases in correspondence during the phase of creating the unified schema [Gesbert 2002]. At the data level, it is also necessary to extract implicit information and instantiate the new classes and relations created. Several operations have been carried out to achieve this, for the two databases (figure 7). First, we created the simple traffic circles and their relation with the road nodes. The simple traffic circles are road nodes for which the attribute nature takes the value 'traffic circle'. Concerning the complex traffic circles, the construction of a topological graph was first necessary. Faces were created and all the topological relations between edges, nodes and faces were computed. We then filtered each face in order to retain only those corresponding to a traffic circle. Several criteria were taken into account: the direction of the cycle, the number of nodes of each cycle and the value of Miller's circularity index. These criteria were embedded in rules and
combined with the decision support system. In doing so, only faces corresponding to a traffic circle were retained.

Fig. 6. Extract of the Georoute schema: new classes and relations are in dashed lines.
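One of the filtering criteria mentioned above is Miller's circularity index, which compares a face to a circle with the same perimeter (index = 4πA/P², equal to 1 for a perfect circle). The small sketch below shows how such a test might be written; the method names and the threshold are illustrative assumptions, not the values used by the authors.

    // Illustrative computation of Miller's circularity index for a candidate face.
    // index = 4 * PI * area / perimeter^2; it equals 1 for a perfect circle.
    final class CircularityFilter {

        static double millerIndex(double area, double perimeter) {
            return (4.0 * Math.PI * area) / (perimeter * perimeter);
        }

        // Hypothetical filtering rule: keep faces that are "circular enough".
        static boolean looksLikeTrafficCircle(double area, double perimeter,
                                              double threshold) {
            return millerIndex(area, perimeter) >= threshold;
        }
    }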
The enrichment phase is thus performed to extract the implicit information required for the specifications control, but also to bring the structure of the data and schemas closer to each other. Creation of the Topological
Characterization of each Face
Filtering of Faces (Jess rules)
Examples of created object
Input data
Output data
Fig. 7. The creation of complex traffic circles (extract of Georoute).
Intra-database Control. Two kinds of traffic circles were created in the previous step: simple traffic circles (nodes) and complex traffic circles (connected edges and nodes). At this level, the representations of the objects were checked to detect some internal errors. The control was automated thanks to several rules activated by the expert-system. These rules were developed and introduced by hand. For example (shown here in idiomatic Jess form, assuming a ComplexTrafficCircle fact template):

    (defrule control_diameter_georoute
      ;; a Georoute complex traffic circle wider than 30 m is conform
      ?tc <- (ComplexTrafficCircle (diameterLength ?d&:(> ?d 30)))
      =>
      (modify ?tc (diameterConformity "conform")))
Only part of the representations was controlled at that stage for each database: the complex traffic circles and the information associated with them (the diameter, the number of nodes, ...). The node representation was checked later, during the inter-database control, because of the lack of
information at that point. Some errors were identified during this process and the results were stored in specific classes for each database.

Spatial Data Matching. The matching tools used in this process are those proposed by [Devogele 1997]. They are based on the use of both geometric and topologic criteria. They have been enriched by using the polygonal objects created during the previous steps, in order to improve the reliability of the algorithms. A degree of confidence was systematically given for each matching pair, according to the cardinality of the link, the dimension of the objects constituting the link and the matching criteria used. Finally, we retained 89% of the matching pairs, for a total of 690 correspondences computed.
Fig. 8. Example of homologous traffic circles and roads matched
Inter-database Control. Some internal errors were already detected during the first step of control, but the representations of the two databases had not yet been compared. This comparison was the purpose of this step. It led to the classification of each matching pair in terms of equivalence and inconsistency (no updates were found for these datasets). The introduction of rules by hand to compare the representations was first considered but, because of the numerous possible cases and the complexity of some rules, we decided to use supervised machine learning. An example of a rule computed by the C5.0 algorithm [Quinlan 1993] is presented below. It enables the detection of an inconsistency:

    If the type of the traffic circle in Georoute = 'dot'
    And if the node type of the traffic circle in BDCarto = 'small traffic circle'
    Then the representations are inconsistent
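For illustration only, such a learned rule could also be encoded directly as a predicate over a matching pair before being translated into the expert-system's rule language; the class and attribute names below are hypothetical.

    // Hypothetical encoding of the learned rule shown above.
    final class TrafficCircleConsistencyRule {

        // True when the pair matches the inconsistency pattern learned by C5.0.
        static boolean isInconsistent(String georouteCircleType,
                                      String bdcartoNodeType) {
            return "dot".equals(georouteCircleType)
                && "small traffic circle".equals(bdcartoNodeType);
        }
    }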
A set of rules has been introduced in the expert-system and, finally, the set of matching pairs has been interpreted automatically. We computed 67% equivalencies and 33% inconsistencies. Various types of inconsistencies were highlighted: modelling errors, attribute errors and geometrical errors (the variation between the diameters of the detailed objects was sometimes too high). We noted that the errors were more frequent in BDCarto.
5 Conclusion and Future Work

This paper has presented a new approach to deal with the differences in representation during the data integration phase of geographical databases. The key idea of the approach is to use the specifications of each database to interpret the origin of the differences: equivalence, inconsistency or update. The knowledge is embedded in rules and handled by an expert-system. The rules are introduced in two ways: by hand and by means of supervised machine learning techniques. This approach opens up many new prospects. It will be possible to improve the quality and up-to-dateness of each analysed database. The specifications could also be enriched and described in a more formal way. The use of the specifications and representations of one database can indeed help make the capture constraints of the other database more precise. Finally, we think that the study of the correspondences between the data could help find the mapping between the elements at the schema level. Little research has been done in that direction.
References

Badard T. and Braun A. 2003. OXYGENE: an open framework for the deployment of geographic web services. In Proceedings of the International Cartographic Conference, Durban, South Africa, pp. 994-1003.
Bédard Y., Bernier E. and Devillers R. 2002. La métastructure vuel et la gestion des représentations multiples. In Généralisation et représentation multiple, A. Ruas (ed.), chapter 8.
Branki T. and Defude B. 1998. Data and Metadata: two-dimensional integration of heterogeneous spatial databases. In Proceedings of the 8th International Symposium on Spatial Data Handling, Vancouver, Canada, pp. 172-179.
David J.-M., Krivine J.-P. and Simmons R. (eds.) 1993. Second Generation Expert Systems. Springer Verlag.
Devogele T. 1997. Processus d'intégration et d'appariement de bases de données géographiques. Application à une base de données routières multi-échelles. PhD Thesis, University of Versailles, 205 p.
Devogele T., Parent C. and Spaccapietra S. 1998. On spatial database integration. International Journal of Geographical Information Science, 12(4), pp. 335-352.
Egenhofer M.J., Clementini E. and Di Felice P. 1994. Evaluating inconsistencies among multiple representations. In Proceedings of the Sixth International Symposium on Spatial Data Handling, Edinburgh, Scotland, pp. 901-920.
El-Geresy B.A. and Abdelmoty A.I. 1998. A Qualitative Approach to Integration in Spatial Databases. In Proceedings of the 9th International Conference on Database and Expert Systems Applications, LNCS n°1460, pp. 280-289.
Fonseca F.T., Egenhofer M., Agouris P. and Câmara G. 2002. Using ontologies for integrated Geographic Information Systems. Transactions in GIS, 6(3).
Gesbert N. 2002. Recherche de concepts fédérateurs dans les bases de données géographiques. Actes des 6ème Journées Cassini, École Navale, pp. 365-368.
Jess 2003. The Jess Expert-System, http://herzberg.ca.sandia.gov/jess/
Kidner D.B. and Jones C.B. 1994. A Deductive Object-Oriented GIS for Handling Multiple Representations. In Proceedings of the 6th International Symposium on Spatial Data Handling, Edinburgh, Scotland, pp. 882-900.
Mitchell T.M. 1997. Machine Learning. McGraw-Hill Int. Editions, Singapore.
Mustière S., Zucker J.-D. and Saitta L. 2000. An Abstraction-Based Machine Learning Approach to Cartographic Generalisation. In Proceedings of the 9th International Symposium on Spatial Data Handling, Beijing, pp. 50-63.
Mustière S., Gesbert N. and Sheeren D. 2003. A formal model for the specifications of geographic databases. In Proceedings of the 2nd Workshop on Semantic Processing of Spatial Data (GeoPro'2003), Mexico City, pp. 152-159.
Paiva J.A. 1998. Topological equivalence and similarity in multi-representation geographic databases. PhD Thesis, University of Maine, 188 p.
Parent C. and Spaccapietra S. 2000. Database Integration: the Key to Data Interoperability. In Advances in Object-Oriented Data Modeling, Papazoglou M., Spaccapietra S. and Tari Z. (eds). The MIT Press.
Quinlan J.R. 1993. C4.5: Programs for machine learning. Morgan Kaufmann.
Rahm E. and Bernstein P.A. 2001. A survey of approaches to automatic schema matching. Very Large Database Journal, 10, pp. 334-350.
Sester M. 2000. Knowledge Acquisition for the Automatic Interpretation of Spatial Data. International Journal of Geographical Information Science, 14(1), pp. 1-24.
Sheeren D. 2003. Spatial databases integration: interpretation of multiple representations by using machine learning techniques. In Proceedings of the International Cartographic Conference, Durban, South Africa, pp. 235-245.
Sheth A. and Larson J. 1990. Federated database systems for managing distributed, heterogeneous and autonomous databases. ACM Computing Surveys, 22(3), pp. 183-236.
Vangenot C., Parent C. and Spaccapietra S. 2002. Modeling and manipulating multiple representations of spatial data. In Proceedings of the International Symposium on Spatial Data Handling, Ottawa, Canada, pp. 81-93.
Walter V. and Fritsch D. 1999. Matching Spatial Data Sets: a Statistical Approach. International Journal of Geographical Information Science, 13(5), pp. 445-473.
Weibel R., Keller S. and Reichenbacher T. 1995. Overcoming the Knowledge Acquisition Bottleneck in Map Generalization: the Role of Interactive Systems and Computational Intelligence. In Proceedings of the 2nd International Conference on Spatial Information Theory, pp. 139-156.
Integrating structured descriptions of processes in geographical metadata

Bénédicte Bucher

Laboratoire COGIT, Institut Géographique National, 2 avenue Pasteur, 94165 St Mandé Cedex, France, [email protected]
Abstract. This paper elaborates on a category of information, the description of processes, and its relevance in metadata about geographic information. Metadata bases about processes are needed because processes are themselves resources to manage. Besides, specific processes participate in the description of other types of resources, like data sets. Still, current metadata models lack structured containers for this type of information. We propose a model to build structured descriptions of processes to be integrated in metadata bases about processes themselves or about data sets.
Keywords: metadata, process, task
1 Introduction

During the past five years, significant progress has been made in modelling metadata to enhance the management of geographical data. The release of the ISO19115 international standard was an important milestone. Metadata models initially aimed at resolving data transfer issues. Their application is now to support data exchange and cataloguing. The very scope of these models has widened from describing geographical datasets to describing aggregates of data sets, i.e. specifications of representation, and services. This paper elaborates on the necessity of enriching geographic information metadata with structured descriptions of processes. The first section of this paper details the need for this category of knowledge in metadata about geographical information. The next section is a brief state of the art of existing models to describe processes, including
a prior work of the author in this domain. The subsequent section describes our current approach, which focuses on describing data production and management tasks to enrich metadata bases within IGN.
2 The need for structured descriptions of processes in metadata

In this paper, the processes studied are manipulations of geographical data to meet an objective. The need for structured descriptions of processes in metadata is twofold.

2.1 Processes management metadata

Processes themselves are resources to manage, so that a metadata model is needed to describe them. Managing a process usually means tracking its running or assisting operators in performing it. For instance, the process of acquiring metadata about a data set could be designed as a process distributed between several people participating in the production of the data set. In this case, metadata associated with the data production process would take the following forms: flags warning to perform a metadata acquisition event, associated with certain events in the data production process, and guidelines to perform these metadata acquisition events.

Managing a process can also mean cataloguing it. The famous three stages of data cataloguing proposed by the Global Spatial Data Infrastructure Technical Committee (GSDI 00), discovery, exploration and exploitation, can be translated as follows for process cataloguing:
- discovery: What processes exist?
- exploration: Which processes are close to the process I need?
- exploitation: Can I use this process? Can I adapt it to my context? Or can I define a new process by reusing existing process patterns?

Processes that typically need cataloguing are repair processes. Repairing a given error consists in performing an existing repair process when the error is already referenced. When it is not, it may consist in designing a new repair process, possibly based on catalogued diagnosis and repair processes.
2.2 Data sets description metadata

Modelling processes in metadata is also needed to manage other types of resources in the description of which processes appear. Indeed, descriptions of processes are already present in the ISO19115 model, where the lineage entity is composed of sources and processes. This specific category of metadata, lineage, is essential to IGN sales engineers. They use it as a major source to assess the content and quality of a data product or data set. So far, they do not find this information in metadata bases but rather obtain it by contacting production engineers. The ISO19115 structure for process elements in the lineage entity is free text. Obviously, this information calls for a richer structure.

2.3 A need for a model of process types and instances

To conclude this section, let us summarize what structures are needed to describe processes in metadata, as illustrated in Figure 1. A model is needed to describe:
- process types, like data matching,
- process instances, like consistency checking for two specific data sets.

These structures are needed to build process management metadata. Descriptions of specific types of processes constitute a metadata base about these types of processes. Such a description could entail information like the name of the process, its generic signature, and its decomposition. Metadata bases about specific instances of a process can then be obtained by specifying elements in the description of the corresponding type of process, like the name of the operator who was responsible for performing a process instance and some decisions he made during the process. These structures are also needed to document the lineage metadata in data descriptions. A type of process can be the production of a type of data set. For instance, the Scan25® product at IGN is a data set series associated with a type of production process. The description of this type of process is lineage metadata for this specific aggregate. The lineage metadata for a data set belonging to this aggregate is then the description of the specific process instance whose output was this data set.
Fig. 1. Integration of structured descriptions of processes in metadata about geographic information.
3 Existing models to describe processes

Models to describe processes as such are to be found in the domain of business management. They integrate components like triggering events, decomposition and agents (Tozer 99). In the context of the Web, there exist models to describe Web services, which realise specific categories of processes. In this area, the most mature approach is that of OWL-S (OWL-S 03). OWL is a W3C effort to propose languages to build ontologies on the Web. A section of OWL is specifically dedicated to Web services. It proposes the following RDF-like model to describe services: a resource provides a service; a service presents a service profile, i.e. what the service does; a service is described by a service model, i.e. how it works; and a service supports a service grounding, i.e. how to access it. In the area of geographic information, there exist classifications of geographical Web services like that of OGC (OGC 02). A promising work is that of (Lemmens et al. 03), who extend the OGC Web service taxonomy schema using DAML-S (the former name of OWL-S).
All these models support the description of process instances or of very specific process types, such as an individual Web service. But they do not provide much description for more generic process types, apart from classifications. In the COGIT laboratory, the TAGE model has been proposed to enrich ISO19115 metadata with how-to-use knowledge (Bucher 03). This model describes application patterns such as "to locate an entity" or "to match data sets". These application patterns can be specified to obtain a description of a geographic application that should yield the result the user expects. TAGE relies on the classical concepts of tasks and roles. A task is a family of problems and their solutions. A task is a process type, generic or specific. It can be specified to a more specific process type or to a process instance. A process instance is a realisation of a task. The inputs and outputs of tasks are described through roles. A role is a variable with a name and a set of possible values. For instance, the output of the task "to locate an entity" is a role called "location". The set of possible values for this role includes: a geometry in a spatial reference system, coordinates in a linear reference system, a place defined by a relationship with a geographic feature, a symbol on a map, or route instructions. The role itself is task-dependent, whereas the elements describing its values are not task-dependent. These elements are called "domain elements". TAGE has been embedded in the TAGINE application to assist users in specifying a task whose realisation should meet their need. Tasks and roles are relevant concepts in the description of generic processes. The TAGE model supports the description of tasks whose decomposition varies depending on how they are specified, which is not supported by other decomposition models. Moreover, the use of roles and of specification rules embedded within the task allows for the consistent specification of the output depending on the specification of the input. This is far more precise than merely describing the signature of a process. Indeed, the effect of a process, as well as other properties of the process, may depend on the input data. The limitations of TAGE are the complexity of the model and the consequent difficulty of acquiring tasks to feed into the TAGINE database.
4 TAGE2: a simplified TAGE model

Our current approach to describe processes consists in simplifying the TAGE model to obtain a model and a database that are more readable and easy to maintain.
4.1 A shift of objective

The corresponding application does not meet the same objective as TAGINE. Our ambitions have shifted from enhancing external users' access to geographical information to enhancing internal users' access to geographical information. This takes place in a new context at IGN where people are more aware of the importance of metadata. This context is described in (Bucher et al. 03). Technically, TAGINE aimed at supporting the cooperative specification of a task. The new application only aims at: browsing tasks, editing a task and interactively specifying it, storing tasks and storing realisations of tasks.
4.2 The representation of tasks in TAGE2

In the initial TAGE model, tasks were to be modelled as instances of one class, similar to the ISO FC_FeatureType structure. In TAGE2, a task can be modelled as a class extending the class "Task" or as an instance of the class "TaskType", as shown on Figure 2. The same holds for domain elements with the classes "DomainElement" and "ElementType". We use this because the Java language does not support the representation of metaclasses, yet metaclass-like structures are needed, as explained in section 2.3: a specific task should be both an instance of another class and a class itself.

Fig. 2. Double view on tasks: a specific task can be described as a class extending the class Task or as an instance of the class TaskType.
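The double view of Figure 2 can be sketched in Java roughly as follows; the class bodies and the constructor signature are assumptions made for illustration, not the actual TAGE2 code.

    // Hypothetical sketch of the TAGE2 double view on tasks.

    // View 1: a specific task as a subclass of Task; its instances describe
    // realisations of the task (process instances).
    class Task {}
    class DataSetMatching extends Task {
        String operator;   // who performed this particular realisation
    }

    // View 2: the same task as an instance of TaskType, convenient for building
    // a metadata base that lists many task types.
    class TaskType {
        final String name;
        TaskType(String name) { this.name = name; }
    }

    class Example {
        public static void main(String[] args) {
            TaskType ttDSMatching = new TaskType("DataSetMatching"); // the type
            DataSetMatching dsm1 = new DataSetMatching();            // one realisation
            dsm1.operator = "operator-1";
            System.out.println(ttDSMatching.name + " realised by " + dsm1.operator);
        }
    }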
Describing a task as an instance of the class TaskType, or of a subclass of it, is useful to describe numerous tasks, typically to build a metadata
database about processes. Describing a task as a class is useful to focus on the model of the specific task. Instances of this task will then be descriptions of process instances. These needs were listed in Fig. 1. The use of TAGE2 structures to meet these needs is summarized in Fig. 3.

Fig. 3. The use of TAGE2 structures to meet the needs expressed in the first section
4.3 A prototype

To build a prototype of TAGINE2, we have chosen to describe a specific data management task: data set matching. This task is used by the data producer to assess the quality of a data set, to update a data set or to build a multi-representation data set. The domain elements for this task have been modelled according to the implementation of ISO standards in the laboratory platform OXYGENE (Badard and Braun 2003). We introduced the concept of data set and related it to features implementing the ISO FT_Feature interface. A data set is also associated with units of storage that can be XML files, Oracle tables or directories. It has a description that is an ISO19115 MD_DataDescription entity. These elements are illustrated in Figure 4. Other domain elements relevant to the data set matching task are feature catalogues, which are referred to in MD_Description entities.
Fig. 4. Domain elements representing a data set.
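A rough Java transcription of these domain elements is sketched below; the field names and types are assumptions based on Figure 4 and the surrounding text, not the classes actually implemented in OXYGENE.

    import java.util.List;

    // Hypothetical transcription of the domain elements of Figure 4.
    class Feature {}            // stands for objects implementing ISO FT_Feature
    class UnitOfStorage {}      // an XML file, an Oracle table, or a directory
    class MD_Description {}     // an ISO 19115 data description entity

    class DataSet {
        List<Feature> content;         // the features contained in the data set
        List<UnitOfStorage> container; // where the data set is physically stored
        MD_Description description;    // its ISO 19115 description
    }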
The overall mechanism of the task is to find minima of the distance between two representations of the same space, at the model level and at the data level. Its decomposition is the following. The first subtask is to match schemas. This consists in matching groups of elements in both schemas that represent the same category of objects in reality. In this task, it is also important to identify relationships and attributes that do not vary with the representation and to mark those that are identifiers. This task should rely on stored correspondences between feature types in feature catalogues and on a stored marking of non-varying attributes and relationships in each feature catalogue. In the future, these elements will be integrated in the domain. The next subtask is to rank the schema correspondences according to the evaluated ease and quality of data matching for the corresponding features. If two matched classes bear a non-varying identifier, then features corresponding to these classes should be matched first. If two matched classes bear numerous non-varying relationships, the corresponding features should be matched early. Indeed, these first results will then be used to match features related to them by these relationships. There exist numerous rules like these that should be applied to build a tree of correspondences. The last subtask is to match features. It is itself decomposed into three tasks: to select a feature in the reference data set (according to specific criteria),
to restrict the compared data set to features that may be matched to the selection and to use geometric algorithms to assess the correspondence.
Fig. 5. Browsing of the DataSetMatching task in the prototype. The interface is in French but some elements (the tasks) have been translated into English for the purpose of this paper.
Figure 5 shows the task browsing supported by the prototype. A task is described by its roles, a brief description of its model, and a summarized description of its decomposition. This decomposition can be browsed in more detail in another window. The "detail" buttons next to each role open windows describing the possible values for the role.
Perspectives

Ongoing work aims at providing a simple interface to edit tasks as well as domain elements. The next step will consist in acquiring tasks and realisations of tasks. A first method to acquire tasks consists in asking people to describe, through the TAGINE2 interface, a task they are familiar with, either by building a new instance of TaskType that will possibly reuse other instances, or by building a new subclass of Task that will possibly reuse other subclasses. Another method we plan to experiment with is the analysis of a set of log files corresponding to the same task to extract common patterns. Acquiring realisations of a specific task will rely on the description of this task as a specific class. We will focus on supporting this acquisition in a distributed context, i.e. when it is performed by several people. Again, this may be done through these people using the TAGINE2 interface or through the analysis of their log files. We intend to focus on the acquisition of data production tasks and on the integration of these descriptions in the ISO19115 lineage entity.
Acknowledgements

The author wishes to thank Sandrine Balley, Sébastien Mustière and Arnaud Braun from the COGIT laboratory for their help in modelling the Data Set Matching task, its roles and decomposition, and the corresponding domain elements.
References

(Badard and Braun 03) Thierry Badard, Arnaud Braun, OXYGENE: An Interoperable Platform Enabling the Deployment of Geographic Web Services, GISRUK Conference, London, 2003.
(Bucher 03) Bénédicte Bucher, Translating user needs for geographic information into metadata queries, 6th AGILE Conference, Lyon, 2003, pp. 567-576.
(Bucher et al. 03) Bénédicte Bucher, Didier Richard, Guy Flament, A metadata profile for a National Mapping Agency Enterprise Portal - PARTAGE, GISRUK Conference, London, 2003.
(GSDI 00) GSDI Technical Working Group, Developing Spatial Data Infrastructures: the SDI Cookbook, v1.0, Douglas Nebert (Ed), 2000.
(Lemmens et al. 03) Rob Lemmens, Marian de Vries, Trias Aditya, Semantic extension of Geo Web service descriptions with ontology languages, 6th AGILE Conference, Lyon, 2003, pp. 595-600.
(OGC 02) The OpenGIS Consortium, OpenGIS® OWS 1.2, 2002.
(OWL-S 03) The OWL Services Coalition, OWL-S 1.0, Semantic Markup for Web Services, 2003.
(Tozer 99) Guy Tozer, Metadata management for information control and business success, Artech House, Boston, 1999.
Toward Comparing Maps as Spatial Processes

Ferko Csillag1* and Barry Boots2

1 Department of Geography, University of Toronto, 3359 Mississauga Rd, Mississauga, ON, L5L 1C6, Canada, [email protected]
2 Department of Geography and Environmental Studies, Wilfrid Laurier University, Waterloo, ON, N2L 3C5, Canada, [email protected]
Abstract. We are concerned with comparing two or more categorical maps. This type of task frequently occurs in remote sensing, in geographical information analysis and in landscape ecology, but it is also an emerging topic in medical image analysis. Existing approaches are mostly pattern-based and focus on composition, with little or no consideration of configuration. Based on a web-survey and a workshop, we identified some key strategies to handle local and hierarchical comparisons and developed algorithms which include significance tests. We attempt to fully integrate map comparison in a process-based inferential framework, where the critical questions are: (1) Could the observed differences have arisen purely by chance? and/or (2) Could the observed maps have been generated by the same process?
Keywords: stochastic processes, spatial pattern, inference, local statistics, hierarchical decomposition
1 Introduction

Advances in digital data collection, processing and imaging technologies result in enormous databases, for example, in environmental and medical sciences. These data sets can be used in many ways, one of which is to determine categories that reflect some sort of human summary of the data (e.g., label land cover classes to pixels of a satellite image, identify tumors on a medical image, assign habitat types on a simulation model output). With recent demands on the usage of such databases, comparison of spatial data sets is becoming a more and more frequent task in various settings.
The relevant traditions in spatial data analysis are focused on (1) accuracy assessment, which is primarily concerned with the coincidence (or confusion) matrix accounting for compositional differences at each data site, and is most frequently used to characterize the match (or mismatch) between data sets with identical labels (Congalton 1994, Stehman 1997, Stehman 1999, Smith et al. 2002, Foody 2002); (2) change detection, where differences (either before or after labeling/classification) are interpreted against time (Richards and Xiuping 1999, Metternicht 1999, Rogerson 2002); (3) model comparison, where (predicted or simulated) model outputs are compared to either observed landscapes and/or to other model outputs (White et al. 1997, Hargrove et al. 2002); (4) landscape indices, which usually summarize some characteristics of spatial patterns by one (or a few) numbers and thus facilitate comparisons (Trani and Giles 1999, Turner et al. 2001, Rogan et al. 2002). There are recent efforts to introduce fuzzy approaches that allow for a level of uncertainty in categories or locations or both (Power et al. 2001, Hagen 2003). In general, however, these traditions have had much less impact on each other than would have been expected based on their strengths and weaknesses (Fortin et al. 2003). In particular, these approaches are, in large part, inconsistent with our understanding of the relationships between processes and patterns. Therefore, we refer to them as pattern-based approaches since they concentrate on the patterns shown on the maps without explicitly considering how the patterns were generated (e.g., it is a frequent assumption that data sites are spatially independent, or that the data-generating process is stationary). We suggest that the fundamental question in map comparison should be: "Could the observed differences have arisen purely by chance?". Considering the pattern-based approaches, it is virtually certain that we will encounter some differences between the maps in the sense that not every location will have identical values on the maps. The methods listed above do not provide any guidelines for users to evaluate whether the differences are significant (i.e., "surprising"). Such statistical inference would require us to evaluate the likelihood of each map (or certain values on a map), which in turn would allow us to answer the question: "Could the observed maps have been generated by the same process?". We report here the initial steps toward incorporating map comparison in a process-based inferential framework (Getis and Boots 1978). It is a challenging task because there is no (consensus on a) general strategy toward a theoretically and operationally feasible framework, in spite of the recognition of the need for the simultaneous use of compositional and configurational information and spatially variable (or adaptive) data description (Csillag and Boots 2003). Therefore, to assess the appropriateness of these two approaches in light of "what do users want?", we designed a web-based
test for map comparison and followed it up with a workshop. In the next section we report the major findings of these open consultations. We then outline a "top-down" (hierarchical) and a "bottom-up" (local) pattern-based map comparison procedure that both include an inferential component. This is followed by our preliminary results on process-based methods and some concluding remarks regarding the likely decisions related to these approaches and their advantages/disadvantages.
2 What You See Is What You Get?

Ultimately, the goal of map comparison is to assist users in decision-making. It is imperative, therefore, to account for users' perspectives (e.g., what is it that users see or what is it that users want?) even if they do not match some envisioned formal stochastic framework (D'Eon and Glenn 2000). Our primary goal in designing the test was to assess how users react to various aspects of differences in the data-generating process. We were also interested in whether these reactions varied by expertise and/or specific fields of expertise. For simplicity, we simulated 64-by-64 binary (black-and-white) isotropic stationary landscapes with fixed (known) parameters for composition (proportion of B/W) and configuration (first-order neighbours). We used four simulation algorithms as pattern generators: (1) conditional autoregression, where the configuration parameter is spatial autocorrelation (Cressie 1993, p. 407, Csillag et al. 2001); (2) fragmentation, where the configuration parameter is spatial clustering (Fahrig 1997, 1998); (3) inhibition (Upton and Fingleton 1985, pp. 18-22), where the configuration parameter is the maximum number of neighbouring pixels of the same colour; (4) a pixel-level inhomogeneous Poisson process (Cliff and Ord 1981, p. 89), where the configuration parameter is the probability that neighbouring pixels are positively correlated. We created 102 systematically chosen pairs from identical processes (36 with no difference between the parameters, 20 with a compositional difference, 20 with a configurational difference, 26 with differences in both) and posted the test on the World Wide Web on 11 April 2003 at http://eratos.erin.utoronto.ca/fcs/PRES/2MAPS/2maps_index.html
(Figure 1). The test consisted of 20 randomly selected pairs about which the only question asked was: "Are these two maps different?", and comments could be entered in a text window. At the beginning of the test we asked users to provide their names and e-mail address, some information about their experience and rate their expertise (0-10), and after completion
we asked them to comment on the test as well as to rate the difficulty in answering the question.
Fig. 1. Four sample pairs from the web-based map comparison test. The process names are followed by the composition-configuration parameter pairs and the percentage of respondents who found them different. Top-left: fragmentation [40-25, 40-25], 57%; top-right: CAR [75-12, 65-00], 86%; bottom-left: CAR [55-12, 45-12], 91%; bottom-right: inhom [20-01, 20-02], 51%.
On 24 May 2003 we organized a workshop at the GEOIDE Annual Conference in Victoria, BC. There were fourteen participants who repeated the test, listened to a brief presentation about the fundamental considerations and then interactively evaluated their findings. Here we report results based on the first 200 users. The average expertise was 5.75, with a bimodal distribution and a standard deviation of 2.49. For field of expertise we recorded "geography", "GIS", "remote sensing", "computing/statistics" and "landscape ecology" (anything else was grouped under "other"). Participants could list more than one field; "GIS" was mentioned by half of them. The average reported difficulty was 4.65, resembling a uniform distribution (standard deviation: 2.67). There was a weak (inverse linear) relationship between expertise and difficulty, no relationship between expertise and the number of correct answers (Figure 2), and a weak (linear) relationship between difficulty and the number of correct answers (not shown). In 74% of all pairs respondents found the pairs different (Figure 3), and 68.1% of the answers were correct. The results are not significantly different when grouped by field of expertise (Figure 4).
Both "raw score" evaluation and the analysis of the commentaries strongly indicate that respondents were, in general, much more sensitive to composition than configuration. In fact, composition was almost always considered first (and foremost). Many respondents chose to interpret the pairs in a specific context (e.g., "I tried to imagine these as vegetation maps..."). Most frequently references were made to "clusters", "clumping", "density" and "scale".
Fig. 2. Relationship between "expertise" and "difficulty" (left) and "expertise" and "number of correct answers out of 20" (right) for the first 200 participants of the web-test. The respondents were grouped by expertise; the centre of the circle represents the group-average, the area of the circle represents the size of the group.

                        CONFIGURATION
  COMPOSITION           same        different
  same                  54.7%       69.4%
  different             80.8%       89.9%

Fig. 3. Percentage of respondents finding pairs different by compositional and configurational parameter differences. (The difference between the lowest and highest cells is statistically significant at the 0.1 level.)
If and when composition was found the same, two types of 'ad hoc' strategies seem to dominate. One of these was "scanning" the landscape for differences in specific locations (e.g., "I started to count black cells from the corners", "The big black 'blob' in the center of the left image is missing on the right one"), and the other one was "zooming" in and out over the entire landscape revisiting sections at coarser/finer resolution (e.g., "I tried to find 'objects' in the images at coarser resolution", "They do not appear to be different unless one is interested in a sub region of the image"). Frequently, some combination of these strategies was employed (e.g., switching back and forth between "scanning" and "zooming"). There were several suggestions to include a scale for differences, but the term 'significant' was only mentioned nine times (in 200*21=4200 comments). Finally, it is
important to note that in more than half of the cases where the pair was generated (simulated) by an identical process (i.e., differences were purely due to chance) respondents found them different.
Fig. 4. Relationship between "expertise" and "difficulty" (left) and "expertise" and "number of correct answers out of 20" (right) for the first 200 participants of the web-test. Participants were grouped by self-identified field of expertise: LEC=landscape ecology, CST=computing/statistics, RSE=remote sensing, GIS=geographical information, GGR=geography, and OTH=other, and are represented by the group averages
During the workshop discussion, participants were introduced to the concept of both pattern-based and process-based map comparisons. The proportion of correct answers was only slightly higher than among the web participants, but a definite need was emphasized for testing the hypothesis that the observed differences were due to chance. Although process-based inferential statistics has a long tradition in geographical information analysis, we have not found any explicit reference to map comparison (Legendre and McArdle (1999) mentioned it but never explored the idea).
3 Pattern-Based Map Comparisons with Significance Test
We developed methods that include a significance test (or a series of tests) within the pattern-based framework and that meet the expectations formulated in our surveys. Below we illustrate one that follows the idea of evaluating differences at all locations within neighbourhoods on pairs of categorical maps on a regular grid, and another that implements a hierarchical approach. Both approaches lead to a series of comparisons, each with a significance test, rather than a single value to characterize the differences.
3.1 A local approach to map comparison
This approach makes use of local statistics for categorical spatial data (local indicators for categorical data – LICDs) (Boots, 2003).
LICDs are based on the two fundamental characteristics of categorical spatial data: composition, which relates to the aspatial characteristics of the different categories (e.g., colours), and configuration, which refers to the spatial distribution (or arrangement) of the categories. Further, it is argued that, when considered locally, configuration should be measured conditionally with respect to composition. While composition can be uniquely captured by measures based on category counts or proportions, no simple characterization is possible for configuration. Current results indicate that as many as five measures of configuration (number of patches, patch sizes, patch dispersion, join counts, and category eccentricity) are required in order to differentiate between all possible local categorical maps. Previously, local spatial statistics in general, and LICDs in particular, have been used to identify spatial variations within a data set (e.g., complementing global methods). Here, we use them for map comparison. For now, for simplicity, assume two binary raster maps A and B. A and B can be compared by computing LICDs in (n x n) windows centred on each pixel i in both maps. Let c_i^n be the number of black cells (composition) and f_{ji}^n be the value of the jth configuration measure in the (n x n) window centred on pixel i. Generate the distributions of c_i^n and the conditional distributions of f_{ji}^n | c_i^n for A and B, and test whether the corresponding pairs of distributions are significantly different. Thus, by using values of n = 3, 5, 7, 9, ..., we can determine at what spatial scales, and for which characteristics, A and B differ (Figure 5).
3.2 A hierarchical approach to map comparison
This approach is based on the measurement and pyramid-like hierarchical decomposition of the mutual information between two categorical maps on a regular grid (Csillag et al. 2003). The basic measures used are the mutual information, I(X1,X2) = H(X2) - H(X2|X1), the amount of the common part of information in X2 and X1 (which equals D[(X1,X2), (X1×X2)], the Kullback divergence between the actual joint distribution and the joint distribution with independent X1 and X2, usually obtained as the cross-product of the marginals), and the uncertainty coefficient, U = 100·I(X1,X2)/H(X2), the proportion of the common part; the significance of U can be assessed by the appropriate chi-square distribution (Freeman 1987). We encode the quadrants (1,...,4 for NW, NE, SW, SE, respectively) of the pyramid for a 2^L-by-2^L grid for each level of the pyramid, and we denote the variables associated with the levels as (coarsest to finest) Y1, Y2,..., YL. Let X denote the variable representing the (collection of) maps, and Z denote the variable for colours. The decomposition I(X,(Y,Z)) = H(Y,Z) - H((Y,Z)|X) is straightforward
(and note that H(Y) = log_2(#cells) at each level). The series of mutual information values forms a monotonic sequence, I(X,(Z,Y1)) ≤ I(X,(Z,Y1,Y2)) ≤ ..., and the (residual) uncertainty (and its significance) can again be computed for each step.
Fig. 5. The LICD-based map comparison software with a convenient graphical user interface.
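To make the local procedure of Section 3.1 concrete, the following minimal sketch in R compares two binary maps through window-based statistics. It is only an illustration of the idea, not the software shown in Figure 5: composition is the count of black cells in each (n x n) window, a single black-black join count stands in for the five LICD configuration measures, and the distributions are compared with Kolmogorov-Smirnov tests (the conditioning of configuration on composition is simplified to windows containing at least one black cell).

# Illustrative local comparison of two binary maps A and B (0 = white, 1 = black).
# Composition: number of black cells in each (n x n) window (n assumed odd).
# Configuration: black-black join count (horizontal + vertical), a stand-in
# for the full set of five LICD configuration measures.
local_stats <- function(M, n) {
  r <- (n - 1) / 2
  comp <- numeric(0); conf <- numeric(0)
  for (i in (r + 1):(nrow(M) - r)) {
    for (j in (r + 1):(ncol(M) - r)) {
      w <- M[(i - r):(i + r), (j - r):(j + r)]
      comp <- c(comp, sum(w))
      conf <- c(conf, sum(w[-n, ] * w[-1, ]) + sum(w[, -n] * w[, -1]))
    }
  }
  data.frame(comp = comp, conf = conf)
}

compare_maps <- function(A, B, n = 3) {
  sa <- local_stats(A, n); sb <- local_stats(B, n)
  list(
    composition   = ks.test(sa$comp, sb$comp),
    configuration = ks.test(sa$conf[sa$comp > 0], sb$conf[sb$comp > 0])
  )
}

set.seed(1)
A <- matrix(rbinom(64 * 64, 1, 0.4), 64, 64)   # random 64-by-64 test maps
B <- matrix(rbinom(64 * 64, 1, 0.4), 64, 64)
lapply(c(3, 5, 7, 9), function(n) compare_maps(A, B, n))   # scan across window sizes

Repeating the tests for n = 3, 5, 7, 9, ... reproduces the scale-by-scale character of the comparison; a full implementation would test each of the five configuration measures conditionally on every observed composition value.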
This approach, in principle, can be easily illustrated by an analogy to a digital camera. Imagine that we are looking at two completely random images with the camera out of focus (i.e., at very coarse resolution): the images appear similar. As we change the resolution to finer and finer quadrants, slight differences may show up, but the vast majority of the differences will be encountered in the very last step. Conversely, if we compare two images consisting of a few large patches with different colours, we encounter almost all the differences in the first step, and no more differences are found until we reach the finest resolution. Both the mutual information and the uncertainty coefficient can be plotted as functions of pyramid level, the significance of each level (conditioned on the coarser ones) can be tested, and the shape of these functions provides a visual characterization of the differences (Figure 6).
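The hierarchical decomposition can also be sketched in a few lines of R; this is an independent illustration, not the implementation behind Figure 6. Two binary maps on a 2^L-by-2^L grid are compared by cross-tabulating map identity against the joint variable (colour, quadrant codes up to level l); the mutual information, an uncertainty coefficient (here normalized by the entropy of the joint variable, one of several possible choices) and a chi-square p-value are reported for each level.

quad_codes <- function(L) {
  n <- 2^L
  idx <- expand.grid(row = 1:n, col = 1:n)      # cell coordinates, column-major like as.vector()
  sapply(1:L, function(l) {
    block <- 2^(L - l)                          # quadrant size at level l
    2 * (((idx$row - 1) %/% block) %% 2) + (((idx$col - 1) %/% block) %% 2) + 1
  })
}

mi_profile <- function(A, B, L) {
  Y  <- quad_codes(L)
  Z  <- c(as.vector(A), as.vector(B))           # colour of every cell in both maps
  X  <- rep(c("A", "B"), each = 2^(2 * L))      # map identity
  YY <- rbind(Y, Y)                             # both maps share the same pyramid structure
  t(sapply(1:L, function(l) {
    key <- apply(cbind(Z, YY[, 1:l, drop = FALSE]), 1, paste, collapse = "-")
    tab <- table(X, key)
    p   <- tab / sum(tab)
    px  <- rowSums(p); pk <- colSums(p)
    I   <- sum(ifelse(p > 0, p * log2(p / outer(px, pk)), 0))  # mutual information in bits
    U   <- 100 * I / (-sum(pk * log2(pk)))      # uncertainty coefficient (one possible normalization)
    c(level = l, I = I, U = U,
      p.value = suppressWarnings(chisq.test(tab)$p.value))
  }))
}

set.seed(2)
A <- matrix(rbinom(64 * 64, 1, 0.4), 64, 64)    # two 2^6-by-2^6 test maps
B <- matrix(rbinom(64 * 64, 1, 0.4), 64, 64)
mi_profile(A, B, L = 6)                         # mutual information is non-decreasing from coarse to fine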
Fig. 6. The hierarchical decomposition of mutual information and uncertainty along pyramid levels (software implementation in R).
4 Concluding Remarks: Toward Process-Based Map Comparison
Comparing categorical data sets in a process-oriented framework requires some formal assumptions about the interactions between locations and categories: a stochastic process in this sense is the description (parametrization) of the interactions. For instance, we may assume that there is no relationship between locations and colours on a map (i.e., the colour distribution in space is random), in which case we are able to test this hypothesis in various forms (see example-2 above), or we may assume that a given colour distribution does not change significantly from one data set to another (see example-1 above). Characterization of the interactions between locations and categories for cases other than random puts map comparison into the inferential framework, where we estimate the parameters of a model for two (or more) realizations and assess the difference(s) with some (required) level of confidence. This can be quite challenging, and here we briefly highlight the avenues we will pursue in the near future. The first challenge we face is that the generality of a model usually comes at the price of an extremely large number of parameters. To illustrate this, consider a broad class of stochastic processes generating a categorical map (C categories and N locations) which can be described by Markov Random Fields (MRFs). The number of potential parameters in the general case is prohibitive (~C²N) because, in principle, both compositional and configurational parameters can change at each location (Li 2001). Thus,
simplifying assumptions are necessary to make the model manageable, of which the most frequent one is stationarity, i.e., the parameters are assumed to be spatially homogeneous (Getis and Boots 1978, Cressie 1993). In the stationary case for MRFs the local conditional distributions of first-order neighbours fully characterize the joint distribution (Geman and Geman 1984, Besag 1986). In this case the joint likelihood can be written as l(P{s1,...,sn}; α, β, γ) ∝ α + Σi β(si) + Σi,j γ(si,sj), where α is the so-called partition function (to ensure integration to 1), β accounts for composition (~the probabilities of categories), and γ accounts for configuration (~the probability of given category combinations for given neighbours). Even for binary maps this means 1 parameter for composition and 16 (= 2^4) parameters for configuration. Furthermore, these parameters cannot be estimated independently (the proportion of B or W cells obviously influences the probabilities of BB, WW and BW neighbours). However, the local conditional probabilities (the elements of γ, e.g., the probability of a cell being B given that its four neighbours are B) and the corresponding frequencies (i.e., how many times it actually occurs that a B cell has four B neighbours) fully characterize the stochastic model. Thus, using Markov chain Monte Carlo (MCMC) methods, we can simulate realizations for any parameter set, and this allows us to derive the empirical distributions of widely used or newly proposed measures of map comparison. With stationary MRFs as stochastic processes, we can model the differences between two maps as, for example: (1) independent realizations with constant parameters (e.g., the data were produced by two different procedures), (2) independent realizations with changing parameters (e.g., forest harvesting, disease spreading, urbanization), (3) non-independent realizations with constant parameters (e.g., undisturbed forest conservation area at two different dates), (4) non-independent realizations with dependent parameters (e.g., pollution plume and fish distribution). A major challenge is the non-stationary case, when spatial heterogeneity of the parameters is allowed. This makes all single-value-based traditional comparisons highly suspect. While a general solution is nowhere in sight, some "wavelet-like" optimization seems to be feasible. The above illustrated non-parametric pattern-based approaches are also likely to be most useful in these situations where we can make no (or only limited) assumptions concerning the process(es) that generated the pattern.
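The following R sketch illustrates the simulation idea behind this process-based framework. The autologistic parametrization, with a composition parameter beta and a configuration parameter gamma, and the mismatch proportion used as the comparison measure, are illustrative choices rather than the exact parametrization above; two maps generated by the same stationary process are then declared "surprisingly" different when their mismatch falls in the tail of the simulated null distribution.

# Gibbs sampler for a stationary binary (autologistic) Markov random field.
# beta controls composition (overall tendency towards black), gamma controls
# configuration (attraction between like-coloured first-order neighbours).
# Small grid and few sweeps are used only to keep the illustration fast.
simulate_mrf <- function(n = 32, beta = -1, gamma = 0.5, sweeps = 50) {
  M <- matrix(rbinom(n * n, 1, 0.5), n, n)
  for (s in 1:sweeps) {
    for (i in 1:n) for (j in 1:n) {
      nb <- (if (i > 1) M[i - 1, j] else 0) + (if (i < n) M[i + 1, j] else 0) +
            (if (j > 1) M[i, j - 1] else 0) + (if (j < n) M[i, j + 1] else 0)
      eta <- beta + gamma * nb                      # log-odds of black given the neighbours
      M[i, j] <- rbinom(1, 1, 1 / (1 + exp(-eta)))  # Gibbs update of one cell
    }
  }
  M
}

mismatch <- function(A, B) mean(A != B)   # one simple map-comparison measure

# Empirical null distribution of the mismatch between two independent
# realizations of the same process, and a Monte Carlo p-value for an
# 'observed' pair (here also simulated, for illustration only).
set.seed(42)
null_mm  <- replicate(19, mismatch(simulate_mrf(), simulate_mrf()))
observed <- mismatch(simulate_mrf(), simulate_mrf())
(sum(null_mm >= observed) + 1) / (length(null_mm) + 1)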
With recent advances in digital spatial databases there is increasing demand for comparing such data sets (e.g., evaluating changes over time, assessing situations at two different locations, considering the advantages and disadvantages of new methods). We propose to use existing mathematical-statistical foundations for putting map comparison into an inferential framework. The necessary extra effort pays off by making it possible to ascertain whether observed differences could have been caused purely by chance. We demonstrated the preliminary steps toward drawing conclusions about such comparisons in terms of the significance of the differences. These developments are likely to change the way we view differences between maps.
Acknowledgement The authors gratefully acknowledge the financial support of the GEOIDE Network of Centres of Excellence (Canada) and the Natural Sciences and Engineering Research Council of Canada, and the constructive discussions with Sándor Kabos (Eötvös University, Budapest) about entropy-based measures of similarity.
References Besag, J.E. 1986: On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society B, 48:302-309. Boots, B. 2003. Developing local measures of spatial association for categorical data. Journal of Geographical Systems, 5(2), 2003, 139-160. Cliff, A.D. and Ord, J.K. 1981. Spatial Processes: Models and Applications. London: Pion. Congalton, R. (ed.) 1994: International Symposium on the Spatial Accuracy of Natural Resources Data Bases. ASPRS, Bethesda, MD. Csillag, F. and Boots, B. 2003 A statistical framework for decisions in spatial pattern analysis. Canadian Geographer (in review) Csillag, F., Boots, B., Fortin, M-J., Lowell, K and Potvin, F. 2001 Multiscale characterization of ecological boundaries. Geomatica 55: 291-307. Csillag, F., Remmel, T., Mitchell, S. and Wulder, M. 2003. Comparing categorical forest maps by information theoretical distance. Technical Report CFS-03, Department of Geography, University of Toronto, p.49. Cressie, N.A.C. 1993. Statistics for spatial data. New York: John Wiley & Sons. D'Eon, R.G., Glenn, S.M. 2000. Perceptions of landscape patterns: Do the numbers count? Forestry Chronicle 76 (3): 475-480. Fahrig, L. 1997. Relative effects of habitat loss and fragmentation on population extinction. Journal of Wildlife Management, 61(3), 603-610. Fahrig, L. 1998. When does fragmentation of breeding habitat affect population survival? Ecological Modelling, 105, 273-292. Foody, G. 2002: Status of land cover classification accuracy assessment. Remote Sensing of Environment 80:185-201.
Fortin, M-J., Boots, B., Csillag, F. and Remmel, T. 2002: On the role of spatial stochastic models in understanding landscape indices in ecology. Oikos 102:203-212. Freeman, D.H. 1987. Applied categorical data analysis. New York: M. Dekker. Geman, D. and Geman, S. 1984: Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE T. Pattern Analysis and Machine Intelligence 4: 721-741. Hargrove, W., Hoffman, F.M., Schwartz, P.M. 2002. A fractal landscape realizer for generating synthetic maps. Conservation Ecology 6 (1): Art. No. 2. Legendre, P. and McArdle, B.H., 1997. Comparison of surfaces. Oceanologica Acta 20: 27-41. Li, S.Z. 2001. Markov random field modeling in image analysis. New York: Springer. Metternicht, G. 1999. Change detection assessment using fuzzy sets and remotely sensed data: an application of topographic map revision. ISPRS Journal of Photogrammetry and Remote Sensing 54: 221-233. Power, C., Simms, A., White, R. 2001. Hierarchical fuzzy pattern matching for the regional comparison of land use maps. International Journal of Geographical Information Science 15: 77-100. Richards, J.A. and Xiuping, J. 1999: Remote sensing digital image analysis: an introduction. New York: Springer. Rogan, J., Franklin, J. and Roberts, D.A. 2002: A comparison of methods for monitoring multitemporal vegetation change using Thematic Mapper imagery. Remote Sensing of Environment 80:143-156. Rogerson, P.A. 2002. Change detection thresholds for remotely sensed images. Journal of Geographical Systems 4:85-97. Smith, J.H., Wickham, J.D., Stehman, S.V. 2002. Impacts of patch size and landcover heterogeneity on thematic image classification accuracy. Photogrammetric Engineering and Remote Sensing 68 (1): 65-70. Stehman, S.V. 1997. Selecting and interpreting measures of thematic classification accuracy. Remote Sensing of Environment 62:77-89. Stehman, S.V. 1999. Comparing thematic maps based on map value. International Journal of Remote Sensing 20:2347-2366. Trani, M.K. and Giles, R.H. 1999. An analysis of deforestation: Metrics used to describe pattern change. Forest Ecology and Management 114:459-470. Turner, M.G., Gardner, R.H., O'Neill, R.V. 2001. Landscape ecology in theory and practice. New York: Springer. Upton, G. and Fingleton, B. 1985. Spatial Data Analysis by Example. Volume 1: Point Pattern and Quantitative Data. Chichester: John Wiley & Sons. White, R., Engelen, G., Uljee, I. 1997. The use of constrained cellular automata for high-resolution modelling of urban land-use dynamics. Environment and Planning B 24: 323-343.
Integrating computational and visual analysis for the exploration of health statistics
Etien L. Koua and Menno-Jan Kraak International Institute for Geoinformation Science and Earth Observation (ITC), P.O. Box 6, 7500 AA Enschede, The Netherlands
Abstract One of the major research areas in geovisualization is the exploration of patterns and relationships in large datasets for understanding underlying geographical processes. One approach has been to use Artificial Neural Networks, a technology especially useful in situations where the data volumes are vast and the relationships are often unclear or even hidden. We investigate ways to integrate computational analysis based on the Self-Organizing Map with visual representations of derived structures and patterns in an exploratory geovisualization environment intended to support visual data mining and knowledge discovery. Here we explore a large dataset on health statistics in Africa. Keywords: Exploratory visualization, Data mining, Knowledge discovery, Self-Organizing Map, Visual exploration.
1 Introduction
The exploration of patterns and relationships in large and complex geospatial data is a major research area in geovisualization, as volumes of data become larger and data structures more complex. A major problem associated with the exploration of these large datasets is the limitation of common geospatial analysis techniques in revealing patterns or processes (Gahegan et al. 2001; Miller and Han 2001). New approaches in spatial analysis and visualization are needed to represent such data in a visual form that can better stimulate pattern recognition and hypothesis generation, allow for better understanding of the processes, and support knowledge construction.
Information visualization techniques are increasingly used in combination with other data analysis techniques. Artificial Neural Networks have been proposed as part of a strategy to improve geospatial analysis of large, complex datasets (Schaale and Furrer 1995; Openshaw and Turton 1996; Skidmore et al. 1997; Gahegan and Takatsuka 1999; Gahegan 2000), because of their ability to perform pattern recognition and classification, and because they are especially useful in situations where the data volumes are large and the relationships are unclear or even hidden (Openshaw and Openshaw 1997). In particular, the Self-Organizing Map (SOM) (Kohonen 1989) is often used as a means of organizing complex information spaces (Girardin 1995; Chen 1999; Fabrikant and Buttenfield 2001; Skupin 2003; Skupin and Fabrikant 2003). Recent effort in Knowledge Discovery in Databases (KDD) has provided a window for geographic knowledge discovery. Data mining, knowledge discovery, and visualization methods are often combined to try to understand structures and patterns in complex geographical data (MacEachren et al. 1999; Wachowicz 2000; Gahegan et al. 2001). One way to integrate the KDD framework into geospatial data exploration is to combine computational analysis methods with visual analysis in a process that can support exploratory and knowledge discovery tasks. We explore the SOM for such integration, to uncover the structure, patterns, relationships and trends in the data. Graphical representations are then used to portray derived structures in a visual form that can support understanding of the structures and the geographical processes, and facilitate human perception (Card et al. 1999). We present a framework for combining pattern extraction with the SOM and graphical representations in an integrated visual-computational environment, to support exploration of the data and knowledge construction. An application of the method is explored for a large socio-demographic and health dataset for African countries, to provide some understanding of the complex relationships between socio-economic indicators, locations and the burden of diseases such as HIV/AIDS. The ultimate goal is to support visual data mining and exploration, and to gain insight into underlying distributions, patterns and trends.
2 Visual data mining and knowledge discovery for understanding geographical processes
The basic idea of visual data exploration is to present the data in some visual form that allows users to gain insight into the data and draw conclusions (Keim 2002). Visual data mining is the use of visualization techniques to allow users to monitor, evaluate, and interpret the inputs and outputs of the data mining process. Data mining and knowledge discovery in general are one approach to the analysis of large amounts of data. The main goal of data mining is identifying valid, novel, potentially useful and ultimately understandable patterns in data (Fayyad et al. 1996). Typical tasks for which data mining techniques are often used include clustering, classification, generalization and prediction. The different applications of data mining techniques suggest three general categories of objectives (Weldon 1996): explanatory (to explain some observed events), confirmatory (to confirm a hypothesis), and exploratory (to analyze data for new or unexpected relationships). These techniques vary from traditional statistics to artificial intelligence and machine learning. Artificial Neural Networks are particularly used for exploratory analysis as non-linear clustering and classification techniques. Unsupervised neural networks such as the SOM are a type of neural clustering, and network architectures such as backpropagation and feedforward networks are neural induction methods used for classification (supervised learning). The algorithms used in data mining are often integrated into Knowledge Discovery in Databases (KDD), a larger framework that aims at finding new knowledge from large databases. This framework has been used in geospatial data exploration (Openshaw et al. 1990; MacEachren et al. 1999; Wachowicz 2000; Miller and Han 2001) to discover and visualize the regularities, structures and rules in data. The promises inherent in the development of data mining and knowledge discovery processes for geospatial analysis include the ability to reveal unexpected correlations and causal relationships. Since the dimensionality of the dataset is very high, it is often ineffective to search for patterns in such a high-dimensional space. We use the SOM algorithm as a data mining tool to project the input data into an alternative measurement space, based on similarities and relationships in the input data, that can aid the search for patterns. It becomes possible to achieve better results in such a similarity space than in the original attribute space (Strehl and Ghosh 2002).
3 The Self-Organizing Map and the exploration of geospatial data
3.1 The Self-Organizing Map
The Self-Organizing Map (Kohonen 1989) is an Artificial Neural Network used to map multidimensional data onto a low-dimensional space, usually a 2D representation space. The network consists of a number of neural processing elements (units or neurons), usually arranged on a rectangular or hexagonal grid, where each neuron is connected to the input. The goal is to group nodes close together in certain areas of the data value range. Each of the units i is assigned an n-dimensional weight vector m_i that has the same dimensionality as the input patterns. What changes during the network training process are the values of those weights. Each training iteration t starts with the random selection of one input pattern x_t. The activation of the units is calculated using the Euclidean distance between the weight vector and the input pattern. The resultant maps (SOMs) are organized in such a way that similar data are mapped onto the same node or onto neighboring nodes in the map. This leads to a spatial clustering of similar input patterns in neighboring parts of the SOM, and the clusters that appear on the map are themselves organized internally. This arrangement of the clusters in the map reflects the attribute relationships of the clusters in the input space. For example, the size of the clusters (the number of nodes allotted to each cluster) reflects the frequency distribution of the patterns in the input set. The SOM has a distribution-preserving property, the ability to allocate more nodes to input patterns that appear more frequently during the training phase of the network configuration. It also has a topology-preserving property, which comes from the fact that similar data are mapped onto the same node, or onto neighboring nodes in the map. In other words, the topology of the dataset in its n-dimensional space is captured by the SOM and reflected in the ordering of its nodes. This is an important feature of the SOM that allows the data to be projected onto the lower-dimensional space while roughly preserving the order of the data in its original space. Another important feature of the SOM for knowledge discovery in complex datasets is the fact that it is an unsupervised learning network, meaning that no category information accompanies the training patterns. Unlike supervised methods, which learn to associate a set of inputs with a set of outputs using a training data set for which both input and output are known, the SOM adopts a learning strategy in which the similarity relationships between the data and the clusters are used to classify and categorize the data.
The SOM can be useful as a tool in the knowledge discovery in databases methodology since it follows the probability density function of the underlying data.
3.2 Computational analysis and visualization framework
One of the advantages of the SOM is that the outcome of the computational process can easily be portrayed through visual representation. The first level of the computation provides a mechanism for extracting patterns from the data. As described in the previous section, the SOM adapts its internal structures to structural properties of the multidimensional input such as regularities, similarities, and frequencies. These properties of the SOM can be used to search for structures in the multidimensional input. The computational process provides ways to visualize the general structure of the dataset (clustering), as well as to explore relationships among attributes, through graphical representations of the resultant maps (SOMs). The graphical representations are used to enable visual data exploration, allowing the user to gain insight into the data and to evaluate, filter, and map outputs. This is intended to support visual data mining (Keim 2002) and the knowledge discovery process by means of interaction techniques (Cabena et al. 1998). This framework is informed by current understanding of the effective application of visual variables for cartographic and information design, developing theories of interface metaphors for geospatial information displays, and previous empirical studies of map and information visualization effectiveness.
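As a concrete illustration of the training procedure described in Section 3.1, the following minimal R loop trains a small rectangular SOM. The grid size, the decay schedules and the random stand-in data are assumptions made only for this sketch; packages such as kohonen provide complete implementations.

# Minimal SOM training loop: at each iteration one input pattern is drawn at
# random, the best-matching unit is found by Euclidean distance, and that unit
# and its grid neighbours are pulled towards the input.
train_som <- function(X, xdim = 6, ydim = 6, iters = 2000,
                      alpha0 = 0.5, radius0 = max(xdim, ydim) / 2) {
  X    <- scale(X)                                      # standardize the attributes
  k    <- xdim * ydim
  grid <- expand.grid(gx = 1:xdim, gy = 1:ydim)         # positions of the units on the map
  W    <- X[sample(nrow(X), k, replace = TRUE), , drop = FALSE]   # initial weight vectors
  for (t in 1:iters) {
    x      <- X[sample(nrow(X), 1), ]                   # random selection of one input pattern
    bmu    <- which.min(rowSums(sweep(W, 2, x)^2))      # best-matching unit (Euclidean distance)
    frac   <- t / iters
    alpha  <- alpha0 * (1 - frac)                       # decaying learning rate
    radius <- radius0 * (1 - frac) + 1e-6               # shrinking neighbourhood radius
    gdist  <- sqrt((grid$gx - grid$gx[bmu])^2 + (grid$gy - grid$gy[bmu])^2)
    h      <- exp(-gdist^2 / (2 * radius^2))            # neighbourhood function on the grid
    W      <- W + alpha * h * sweep(-W, 2, x, FUN = "+")  # pull units towards the input
  }
  list(weights = W, grid = grid)
}

set.seed(3)
X   <- matrix(rnorm(53 * 10), 53, 10)   # random stand-in for a countries-by-indicators table
som <- train_som(X)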
4 Application to the exploration of geographical patterns in health statistics
4.1 The data
The dataset consists of 74 variables on socio-demographic and health indicators for all African countries. Maps of a few attributes of the dataset are provided in figure 1. In this section the dataset is explored, and different visualization techniques are used to illustrate the exploration of (potential) multivariate patterns and relationships among the different countries.
Fig. 1. Examples of attributes of the test dataset: HIV prevalence rate at the end of 2001, HIV rate among commercial sex workers, total literacy rate, percentage of married women, birth rate, total death rate, life expectancy at birth, average age at first marriage, and GNI per capita 2001.
4.2 Exploration of the general patterns and clustering
The SOM offers a number of distance matrix visualizations to show the cluster structure and similarity (patterns). These techniques show distances between neighbouring network units. The most widely used distance matrix technique is the U-matrix (Ultsch and Siemon 1990). In figure 2a, the structure of the data set is visualized in a U-matrix. Countries having similar characteristics based on the multivariate attributes are positioned close to each other, and the distance between them represents the degree of similarity or dissimilarity. This representation of common characteristics can be regarded as the health standard for these countries. Light areas represent clusters (vectors are close to each other in the input space), and dark areas represent cluster separators (large distances between the neurons: a gap between the values in the input space). Alternative representations to the U-matrix visualization are discussed below: 2D and 3D projections (using projection methods such as Sammon's mapping and PCA), 2D and 3D surface plots, and component planes.
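A U-matrix of the kind shown in figure 2a can be computed directly from a trained SOM. The sketch below assumes the som object (codebook weights plus unit coordinates) produced by the training sketch in Section 3, and simply averages, for every unit, the distances to its immediate grid neighbours.

u_matrix <- function(som) {
  W <- som$weights; g <- som$grid
  sapply(seq_len(nrow(W)), function(i) {
    nb <- which(abs(g$gx - g$gx[i]) + abs(g$gy - g$gy[i]) == 1)   # rook neighbours on the unit grid
    d  <- sqrt(rowSums((W[nb, , drop = FALSE] -
                        matrix(W[i, ], nrow = length(nb), ncol = ncol(W), byrow = TRUE))^2))
    mean(d)                                                        # average distance to neighbours
  })
}

u <- u_matrix(som)
# light cells = small distances (clusters), dark cells = large distances (separators)
image(matrix(u, nrow = max(som$grid$gx)), col = rev(grey.colors(20)), axes = FALSE)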
Fig. 2. Representation of the general patterns and clustering in the input data: The unified distance matrix showing clustering and distances between positions on the map (a), projection of the SOM results in 3D space (c); 3D surface plot (d), and a map of the similarity coding extracted from the SOM computational analysis (b).
In figure 2c, the projection of the SOM offers a view of the clustering of the data with data items depicted as colored nodes. Similar data items are grouped together with the same type or color of markers. Size, position and color of markers can be used to depict the relationships between the data items. The clustering structure can also be viewed as 2D or 3D surfaces representing the distance matrix (figure 2d), using color value to indicate the average distance to neighboring map units. This is a spatialization (Fabrikant and Skupin 2003) that uses a landscape metaphor to represent the density, shape, and size or volume of clusters. Unlike the projection in figure 2c, which shows only the position and clustering of map units, areas with uniform color are used in the 2D and 3D surface plots to show the clustering structure and relationships among map units. In the 3D surface (figure 2d), color value and height are used to represent the regionalization of map units according to the multidimensional attributes.
4.3 Exploratory visualization and knowledge discovery
The correlations and relationships in the input data space can be easily visualized using the component planes visualization (figure 3). The component planes show the values of the different attributes for the different map units (countries) and how each input vector varies over the space of the SOM units. They are used to support exploratory tasks, to facilitate the knowledge discovery process, and to improve geospatial analysis. Compared with the maps in figure 1, patterns and relationships among all the attributes can be easily examined in a single visual representation using the SOM component planes visualization. Since the SOM represents the similarity clustering of the multivariate attributes, the visual representation becomes more accessible and easier to explore.
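The component-plane display and its correlation-based ordering can be sketched as follows, again building on the som object from the training sketch; ordering the planes by their loading on the first principal component of the attribute correlation matrix is one simple heuristic, not the exact ordering used for figure 3c.

component_planes <- function(som, attr_names = colnames(som$weights)) {
  W    <- som$weights
  ord  <- order(prcomp(cor(W))$x[, 1])     # crude seriation: correlated attributes end up adjacent
  xdim <- max(som$grid$gx)
  op <- par(mfrow = c(2, ceiling(length(ord) / 2)), mar = c(1, 1, 2, 1))
  for (k in ord)
    image(matrix(W[, k], nrow = xdim), axes = FALSE,
          main = if (is.null(attr_names)) paste("attribute", k) else attr_names[k])
  par(op)
}

component_planes(som)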
This kind of spatial clustering makes it possible to conduct exploratory analyses to help identify the causes and correlates of health problems (Cromley and McLafferty 2002) when overlaid with environmental, social, transportation, and facilities data. Such map overlays have been important hypothesis-generating tools in public health research and policymaking (Croner et al. 1992). In figure 3a, all the components are displayed, and a selection of a few of them is made more visible for the analysis in figure 3b. Two variables that are correlated will be represented by similar displays. The kind of visual representation (imagery cues) provided in the SOM component planes visualization can facilitate visual detection, and has an impact on knowledge construction (Keller and Keller 1992). As such, the SOM can be used as an effective tool to visually detect correlations among operating variables in a large volume of multivariate data. From the exploration of global patterns, correlations and relationships in figure 3a, hypotheses can be made and further investigation can follow in the process of understanding the patterns. To enhance visual detection of the relationships and correlations, the components can be ordered so that variables that are correlated are displayed next to each other (see figures 3c and 3d), in a way similar to the collection maps of Bertin (1981). It becomes easy to see, for example, that the HIV prevalence rate in Africa is related to a number of other variables, including the literacy rate and behavior (characterized in the dataset as high-risk sexual behavior and limited knowledge of risk factors), and other factors such as the high prevalence rate among prostitutes and the high rate of infection for other sexually transmitted diseases. As a consequence of the high prevalence rate in regions such as southern Africa, there seem to be a low birth rate and low life expectancy at birth, and a high death rate, heavily impacted by HIV infection. The low birth rate in the most affected regions seems to be a consequence of prevention measures, in particular the increased use of condoms among a large proportion of single women, who appear to use them for protection against HIV rather than for contraception. It is also observed through the component planes visualization that factors such as the percentage of married women, the percentage of sexually active single females, and the average age at first marriage in these countries are highly related to the prevalence rate.
Fig. 3. Detailed exploration of the dataset using the SOM component visualization: all the components can be displayed to reveal the relationships between the attributes for different spatial locations (countries) in (a). Selected components related to a specific hypothesis can be further explored (b). Component planes can be ordered based on correlations among the attributes to facilitate visual recognition of relationships among all the attributes (c) or selected attributes (d). The position of countries on the SOM grid is shown in (e).
4.4 The integrated visual-computational environment
We have extended the alternative representations of the SOM results used to highlight different characteristics of the computational solution and integrated them with other graphics into multiple views, to allow brushing and linking (Egbert and Slocum 1992; Monmonier 1992; Cook et al. 1996; Dykes 1997) for exploratory analysis (see figure 4). These multiple views are used to simultaneously present interactions between several variables over the space of the SOM, maps and parallel coordinate plots, and to emphasize visual change detection and the monitoring of variability through the attribute space. These alternative views on the data can help stimulate the visual thinking process characteristic of visual exploration, and support hypothesis testing, evaluation and interpretation of patterns, from the general patterns extracted to specific selections of attributes and spatial locations.
Fig. 4. The user interface of the exploratory geovisualization environment, showing the representation of the general patterns and clustering in the input data: unified distance matrix (b), projection of the SOM results in 3D space (c), 3D surface plot (e), map of the SOM similarity coding (a) and parallel coordinate plot (d), component planes (f) and map unit labels in (g).
5 Conclusion In this paper we have presented an approach to combine visual and computational analysis into an exploratory visualization environment intended to contribute to the analysis of large volumes of geospatial data. The approach focuses on the effective application of computational algorithms to extract patterns and relationships in geospatial data, and visual representation of derived information. A number of visualization techniques were explored. The SOM computational analysis was integrated with visual exploration tools to support exploratory visualization. Interactive manipulation of the graphical representations can enhance user goal specific querying and selection from the general patterns extracted to more specific user selection of attributes and spatial locations. The link between the attribute space visualization based on the SOM, the geographic space with maps representing the SOM results, and other graphics such as parallel coordinate plots, in multiple views provides alternative perspectives for
better exploration, hypothesis generation, evaluation and interpretation of patterns, and ultimately support for knowledge construction.
References Bertin, J. (1981). Graphics and Graphic Information Processing. Berlin, Walter de Gruyter. Cabena, P., P. Hadjnian, R. Stadler, J. Verhees and Z. Alessandro (1998). Discovering data mining: From concept to implementation. New Jersey, Prentice Hall. Card, S. K., J. D. Mackinlay and B. Shneiderman (1999). Readings in Information Visualization. Using Vision to Think. San Francisco, Morgan Kaufmann Publishers. Chen, C. (1999). Information visualization and Virtual Environments. London, Springer-Verlag. Cook, D., J. J. Majure, J. Symanzik and N. Cressie (1996). Dynamic Graphics in a GIS: Exploring and Analyzing Multivariate Spatial Data Using Linked Software. Computational Statistics: Special Issue on Computer-aided Analysis of Spatial Data 11(4): 467-480. Cromley, E. K. and S. L. McLafferty (2002). GIS and public health. New York, The Guilford Press. Croner, C., L. Pickle, D. Wolf and A. White (1992). A GIS approach to hypothesis generation in epidemiology. ASPRS/ACSM technical papers. A. W. Voss. Washington, DC, ASPRS/ACSM. 3: 275-283. Dykes, J. A. (1997). Exploring Spatial Data Representation with Dynamic Graphics. Computers & Geosciences 23(4): 345-370. Egbert, S. L. and T. A. Slocum (1992). EXPLOREMAP: An Exploration System for Choropleth Maps. Annals, Association of American Geographers 82(2): 275-288. Fabrikant, S. I. and B. Buttenfield (2001). Formalizing semantic spaces for information access. Annals of the Association of American Geographers 91(2): 263-280. Fabrikant, S. I. and A. Skupin (2003). Cognitively Plausible Information Visualization. Exploring GeoVisualization. M. J. Kraak. Amsterdam, Elsevier. Fayyad, U., G. Piatetsky-Shapiro and P. Smyth (1996). From data mining to knowledge discovery in databases. Artificial Intelligence Magazine 17: 37-54. Gahegan, M. (2000). On the application of inductive machine learning tools to geographical analysis. Geographical Analysis 32(2): 113-139. Gahegan, M., M. Harrower, T. M. Rhyne and M. Wachowicz (2001). The integration of geographic visualization with Databases, Data mining, Knowledge Discovery Construction and Geocomputation. Cartography and Geographic Information Science 28(1): 29-44.
Gahegan, M. and M. Takatsuka (1999). Dataspaces as an organizational concept for the neural classification of geographic datasets. Fourth International Conference on GeoComputation, Fredericksburg, Virginia, USA. Girardin, L. (1995). Mapping the virtual geography of the World Wide Web. Fifth International World Wide Web conference, Paris, France. Keim, D. A. (2002). Information Visualization and Visual Data Mining. IEEE Transactions on Visualization and Computer Graphics 7(1): 100-107. Keller, P. and M. Keller (1992). Visual clues: Practical Data Visualization. Los Alamitos, CA, IEEE Computer Society Press. Kohonen, T. (1989). Self-Organization and Associative Memory, Springer-Verlag. MacEachren, A. M., M. Wachowicz, R. Edsall, D. Haug and R. Masters (1999). Constructing knowledge from multivariate spatiotemporal data: integrating geographical visualization with knowledge discovery in databases methods. International Journal of Geographical Information Science 13(4): 311-334. Miller, H. J. and J. Han (2001). Geographic data mining and knowledge discovery. London, Taylor and Francis. Monmonier, M. (1992). Authoring Graphics Scripts: Experiences and Principles. Cartography and Geographic Information Systems 19(4): 247-260. Openshaw, S., A. Cross and M. Charlton (1990). Building a prototype geographical correlates machine. International Journal of Geographical Information Systems 4(4): 297-312. Openshaw, S. and C. Openshaw (1997). Artificial Intelligence in geography. Chichester, John Wiley & Sons. Openshaw, S. and I. Turton (1996). A parallel Kohonen algorithm for the classification of large spatial datasets. Computers and Geosciences 22(9): 1019-1026. Schaale, M. and R. Furrer (1995). Land surface classification by Neural Networks. International Journal of Remote Sensing 16(16): 3003-3031. Skidmore, A., B. J. Turner, W. Brinkhof and E. Knowles (1997). Performance of a Neural Network: mapping forests using GIS and Remote Sensed data. Photogrammetric Engineering and Remote Sensing 63(5): 501-514. Skupin, A. (2003). A novel map projection using an Artificial Neural Network. 21st International Cartographic Conference (ICC), 'Cartographic Renaissance', Durban, South Africa. Skupin, A. and S. Fabrikant (2003). Spatialization Methods: A Cartographic Research Agenda for Non-Geographic Information Visualization. Cartography and Geographic Information Science 30(2): 99-119. Strehl, A. and J. Ghosh (2002). Relationship-based clustering and visualization for multidimensional data mining. INFORMS Journal on Computing 00(0): 1-23. Ultsch, A. and H. Siemon (1990). Kohonen's self-organizing feature maps for exploratory data analysis. Proceedings International Neural Network Conference INNC'90, Dordrecht, The Netherlands. Wachowicz, M. (2000). The role of geographic visualization and knowledge discovery in spatio-temporal modeling. Publications on Geodesy 47: 27-35. Weldon, J. L. (1996). Data mining and visualization. Database programming and design 9(5).
Using Spatially Adaptive Filters to Map Late Stage Colorectal Cancer Incidence in Iowa Chetan Tiwari and Gerard Rushton Department of Geography, The University of Iowa, Iowa City, Iowa 52242, USA
Abstract Disease rates computed for small areas such as zip codes, census tracts or census block groups are known to be unstable because of the small populations at risk. All people in Iowa diagnosed with colorectal cancer between 1993 and 1997 were classified by cancer stage at the time of their first diagnosis. The ratios of the number of late-stage cancers to cancers at all stages were computed for spatial aggregations of circles centered on individual grid points of a regular grid. Late-stage colorectal cancer incidence rates were computed at each grid point by varying the size of the spatial filter until it met a minimum threshold on the total number of colorectal cancer incidences. These different-sized areas are known as spatially adaptive filters. The variances analyzed at grid points showed that the maps produced using spatially adaptive filters gave higher statistical stability in computed rates and greater geographic detail when compared to maps produced using conventional fixed-size filters.
1 Introduction With the availability of high quality geospatial data on disease incidences, disease maps are increasingly being used for representing, analyzing and interpreting disease incidents. They are being used to make important decisions about resource allocation and to study the interrelationships that might exist between spatially variable risk factors and the occurrence of disease. Several methods exist for mapping disease. It is common to determine a rate using disease cases as the numerators and the population-at-
risk as the denominators. Rates of disease occurrence or incidence that are computed for small areal units like zip codes, census tracts or block groups are known to be unstable because of the small populations at risk. This problem is particularly acute in rural areas, which tend to have low population densities. A common solution to this problem is to aggregate the data (not the rates) over some larger area to estimate the disease rate at a point. There are other methods of spatial smoothing, not of interest in this paper, that involve the fitting of smooth surfaces to point measures. These range from linear smoothers, in which the smoothing function is primarily dependent on the distance and weight of points within a defined neighbourhood, to non-linear methods like response-surface analysis, kriging, etc. Other methods like median polish and headbanging provide tools for local smoothing of data (Pickle 2002). Kafadar (1994) provides a comparison of several such smoothing methods. Methods that address smoothing of point-based data, which are available commonly as aggregates over some geographical unit or, less commonly, at the individual level, are of interest in spatial epidemiology (Lawson 2001). With the availability of high-resolution data, there has also been a move towards developing local statistics as opposed to global statistics, with a focus on point-based methods. Atkinson and Unwin (2002) developed and implemented point-based local statistical methods within the MapInfo GIS software. Bithell (1990) used density estimation techniques to study the relative risk of disease at different geographical locations using a dataset on childhood leukaemia in the vicinity of the Sellafield nuclear processing plant in the U.K. His study introduced the idea of 'adaptive kernel estimation', in which he proposed that the size of the bandwidth should vary according to the sample density. The use of such adaptive kernels would ideally provide greater smoothing in areas of low density and greater sensitivity in areas of higher density. Rushton and Lolonis (1996) used a series of overlapping circles of fixed width to estimate the rates of infant mortality in Des Moines, Iowa for points within a regular grid. However, such fixed-size spatial filters are not sensitive to the distribution of the base population, and hence this method produces unstable rates in areas with low population densities. Talbot et al. (2000) used a modified version of such fixed spatial filters to borrow strength from surrounding areas. They used spatial filters with constant or nearly constant population size to create smoothed maps of disease incidence. Other methods, like the Empirical Bayes method that shrinks local estimates towards the global mean, gained popularity in the 1990s. With increased computing capability, this approach led to the development of full Bayes methods that used
Markov Chain Monte Carlo (MCMC) methods to account for any spatial autocorrelation (Bithell 2000; Rushton 2003). In this paper we examine and statistically compare the properties of maps produced using fixed-size spatial filters and spatially adaptive filters.
2 Data
The data consisted of all people in Iowa diagnosed with colorectal cancer (n=7728) between 1993 and 1997. Each person was geocoded to their respective street address. In some cases, when this was not possible, the location of the centroid of their zip code was substituted. Each person was classified by stage of cancer at the time of their first diagnosis. Each eligible case had a morphology that was consistent with the adenoma-adenocarcinoma sequence. The stage of the disease was defined as the summary stage at the time of diagnosis using the SEER Site-Specific Summary Staging Guide. Late-stage CRC was defined as an AJCC stage of at least II. Rates of late-stage colorectal cancer incidence were calculated as the ratio of the number of late-stage colorectal cancer cases to the total number of colorectal cancer cases. These rates were calculated at all points within a regular grid laid out over Iowa. For computational efficiency, the distance between grid points was selected to be 5 miles. The proportion of persons found to have late-stage colorectal cancer at the time of first diagnosis is generally thought to correlate inversely with the proportion of people who are screened for this cancer. Areas with low late-stage rates are expected to have higher survival rates. Finding areas with high late-stage rates can lead to enhanced prevention and control efforts that involve improved rates of screening for this cancer.
3 Spatial Filtering Choropleth maps are examples of spatially filtered maps. However, their usefulness is affected by the different sizes, shapes and populations they represent as well as the property of spatial discreteness. Such maps produced from point-based data are known to produce different patterns when the shape and scale of the boundaries are changed. This is the Modifiable Areal Unit Problem (Openshaw and Taylor 1979; Amrhein 1995). Changing the scale at which the disease data are aggregated by choosing different types of areas like zip code areas, census block group areas or counties results in different patterns of disease incidence. To illustrate the aggrega-
tion/zoning effect, consider the maps in Fig. 1a through Fig. 1d. These maps are produced by shifting the boundaries of the zip codes in different directions. By doing this, we are keeping the size of the areal units approximately the same, but we are changing the structure of the boundaries. It is important at this stage to note that the distribution of the point-based data remains unaltered. As expected, the patterns of disease are quite different.
The problem of scale is easily solved when point-based data are available. The map in Fig. 2a was produced using fixed-size spatial filters in which the rates of late-stage colorectal cancer incidence were computed
for spatial aggregations around each point within a regular grid. The rates at each grid point were calculated as the number of late-stage colorectal cancer cases within a 20-mile circular filter divided by the number of all colorectal cancer cases within the same 20-mile radius. The disease rates computed at grid points were then converted into a surface map using spatial interpolation (Fig. 2b). The inverse distance weighted (IDW) method was used to create the surface maps. To minimize double smoothing of the results only the eight direct neighbors were used.
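A minimal version of this fixed-filter computation is sketched below in R. The cases data frame (projected x and y coordinates in miles plus a logical late flag), the study extent and the 5-mile grid spacing are synthetic stand-ins for the geocoded Iowa registry data, introduced only for illustration.

fixed_filter_rates <- function(cases, grid, radius = 20) {
  apply(grid, 1, function(g) {
    d      <- sqrt((cases$x - g["x"])^2 + (cases$y - g["y"])^2)
    inside <- d <= radius
    if (any(inside)) sum(cases$late[inside]) / sum(inside) else NA   # late-stage cases / all cases
  })
}

set.seed(4)
cases <- data.frame(x = runif(2000, 0, 300), y = runif(2000, 0, 200),
                    late = runif(2000) < 0.6)                 # synthetic stand-in for registry records
grid  <- expand.grid(x = seq(0, 300, by = 5), y = seq(0, 200, by = 5))   # 5-mile grid of points
grid$rate20 <- fixed_filter_rates(cases, grid[, c("x", "y")], radius = 20)

The grid-point rates can then be interpolated to a surface, for example with an inverse distance weighted scheme restricted to the eight direct neighbours, as described above.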
In this method, since the rates of disease incidence are computed for spatial aggregations around points within a regular grid and because this grid is independent of any administrative or other boundary, the use of spatial filters controls for both the issues that contribute to the MAUP. ‘Control’ in this context does not mean that the resulting pattern is insensitive to the choice of the size of the spatial filter being used. It does mean, however, that the resulting pattern changes in predictable ways as the size of the spatial filter changes. The size of the spatial filter used affects the reliability and amount of geographic detail that is portrayed by the map. Larger spatial filters produce maps with high levels of reliability, but low geographic detail. Alternatively, smaller filters produce maps of high geographic detail, but low reliability. As a result, in some cases, fixed-size spatial filters have problems of oversmoothing and undersmoothing disease rates in relation to the distribution of the population at risk. In other words, fixed-size spatial filters may be using large filter sizes in areas where a smaller filter size could have been used as effectively, thereby resulting in loss of geographic detail; or they may be using smaller filter sizes that produce unreliable estimates in areas with sparse populations at risk.
Spatially adaptive filters overcome these problems by using variable width spatial filters that consider the base population at risk when computing disease rates. In the following sections of this paper, maps produced using both these methods are compared and we expect to see high geographic detail and reliability in maps that were produced using spatially adaptive filters.
4 Spatially Adaptive Filters In the spatially adaptive filters method, late-stage rates were computed for spatial aggregations centered on each grid point within a regular grid. GIS and other supporting software were used to compute these rates by varying the size of the spatial filter until it met a minimum threshold value on the total number of colorectal cancer incidences (denominator of the rate). These different-sized areas are known as spatially adaptive filters.
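The adaptive version differs from the fixed filter only in that the radius grows until the denominator threshold is met. The sketch below reuses the synthetic cases and grid objects from the previous illustration (an assumption of this sketch, not part of the original analysis).

adaptive_filter_rates <- function(cases, grid, threshold = 75, radii = seq(2, 100, by = 2)) {
  apply(grid, 1, function(g) {
    d <- sqrt((cases$x - g["x"])^2 + (cases$y - g["y"])^2)
    for (r in radii) {                        # widen the filter until the denominator is large enough
      inside <- d <= r
      if (sum(inside) >= threshold) return(sum(cases$late[inside]) / sum(inside))
    }
    NA                                        # threshold not reached within the largest radius tried
  })
}

grid$rate_adaptive <- adaptive_filter_rates(cases, grid[, c("x", "y")], threshold = 75)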
Fig. 3a and 3b are examples of two filter sizes (a fixed 12-mile filter on the left and a fixed 36-mile filter on the right) that were used to create maps of late-stage colorectal cancer incidence in Iowa. In Fig. 3a we see high geographic detail, and in Fig. 3b we see an extremely smooth map. Fig. 3d shows large variability in rates for the 12-mile filter map and low variability in rates for the 36-mile map. To overcome this problem, we use spatially adaptive filters to create maps of disease incidence with the maximum possible spatial resolution but without an increase in the variability of rates (Fig. 3c). As can be seen in Fig. 3d, the variability of rates in Fig. 3c is less than in Fig. 3a, yet considerable geographic detail is preserved compared with Fig. 3b.

Correlations between rates computed at each grid point using different-sized fixed spatial filters and different threshold values for spatially adaptive filters were examined (Fig. 4). The x-axis corresponds to the size of the fixed spatial filter used to compute rates at each grid point. The y-axis indicates the correlation between the rates computed at each grid point with the fixed filter method and with the spatially adaptive filter method. The curves correspond to different threshold values on the denominator. As we increase the size of the fixed spatial filter, we expect it to pull in a larger number of colorectal cancer cases. The correlation curves in Fig. 4 follow this trend, and we notice greater correlation as fixed-filter sizes and threshold values increase. For each threshold value there is a fixed filter size with which it has the highest correlation, and as the threshold value increases, so does that filter size. Each threshold value therefore has a correlation pattern in the form of an inverted “U”, with the maximum moving to the right as the threshold value increases. This indicates some similarity between maps produced using the fixed-filter method and the spatially adaptive filters method. The more uniform the population density in the mapped area, the higher the correlation between a fixed-filter map and the spatially adaptive filter map with which it most closely corresponds. It follows that in an area with variable population density there will be some parts of the map where the results of the fixed filter and the spatially adaptive filter are the same and other parts where they differ.

In terms of the amount of geographic detail that is obtained, the maps produced using spatially adaptive filters perform better. In this paper we illustrate this argument by comparing two maps, one produced using a fixed spatial filter of 20 miles and one produced using spatially adaptive filters with a minimum threshold value of 75 colorectal cancer cases.
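The comparison summarized in Fig. 4 can be reproduced from the grid-point rates themselves. The sketch below assumes the fixed-filter and adaptive-filter rates have already been computed at the same grid points and stored in dictionaries keyed by filter size and by threshold; these names are illustrative, not part of the authors' software.

```python
import numpy as np

def r_squared(a, b):
    """Squared Pearson correlation between two rate arrays, ignoring grid
    points where either method produced no estimate."""
    ok = ~(np.isnan(a) | np.isnan(b))
    return np.corrcoef(a[ok], b[ok])[0, 1] ** 2

def correlation_curves(fixed_rates_by_size, adaptive_rates_by_threshold):
    """One curve per threshold: R^2 against every fixed filter size."""
    return {thr: {size: r_squared(fixed, adaptive)
                  for size, fixed in sorted(fixed_rates_by_size.items())}
            for thr, adaptive in adaptive_rates_by_threshold.items()}
```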
Fig. 4. Similarity in late-stage colorectal cancer (CRC) incidence rates computed using different-sized fixed spatial filters and spatially adaptive filters with varying threshold values (x-axis: filter size in miles; y-axis: R²; one curve per denominator threshold of > 10, > 50, > 75 and > 100 cases)

Fig. 5. Distribution of rates produced using (a) the 20-mile fixed filter method and (b) the spatially adaptive filters method with threshold > 75
Fig. 5 shows that the overall distributions of rates of late-stage colorectal cancer incidence in Iowa produced using these two methods are very similar. Fig. 6, however, shows that considerable differences are found between the rates at specific grid locations. As illustrated in Fig. 7, the map produced using the spatially adaptive filters method provides greater geographic detail than the map produced using the fixed filter method.
Fig. 6. Comparison of rates at each grid point from the 20-mile fixed filter and the adaptive filter with threshold > 75
Fig. 7. Late-stage colorectal cancer incidence in Iowa: Comparison of geographic detail in maps produced using (a) the 20-mile fixed spatial filter and (b) the spatially adaptive filter with a threshold > 75
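The two comparisons in Figs. 5 and 6, similar overall distributions but noticeable differences at individual grid points, can be quantified directly from the two rate arrays. A minimal sketch, with illustrative names and bin count:

```python
import numpy as np

def compare_rate_maps(fixed_rates, adaptive_rates, bins=10):
    """Histogram both rate maps on common bins (Fig. 5) and summarize the
    paired grid-point differences (Fig. 6)."""
    ok = ~(np.isnan(fixed_rates) | np.isnan(adaptive_rates))
    f, a = fixed_rates[ok], adaptive_rates[ok]
    hist_f, edges = np.histogram(f, bins=bins)
    hist_a, _ = np.histogram(a, bins=edges)   # same bin edges for both maps
    diffs = a - f
    return {"hist_fixed": hist_f, "hist_adaptive": hist_a,
            "bin_edges": edges, "mean_abs_diff": float(np.abs(diffs).mean())}
```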
In the spatially adaptive filters method, approximately 70% of the filters used were smaller than 20 miles and approximately 10% were larger than 20 miles (Fig. 8). The variances on the resulting maps, however, as noted in Fig. 5, were the same.

Fig. 8. Amount of over-smoothing in the fixed 20-mile filter map (frequency and cumulative percentage of the adaptive filter sizes, in miles, used at the grid points)
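The tally behind Fig. 8 is a simple summary of the radii chosen by the adaptive method. A minimal sketch, assuming radii_used holds the radius selected at each grid point (for example, the second value returned by the adaptive_filter_rate sketch above):

```python
import numpy as np

def filter_size_summary(radii_used, fixed_radius=20.0):
    """Frequency and cumulative percentage of adaptive filter sizes, plus the
    shares smaller and larger than the fixed 20-mile filter."""
    radii_used = np.asarray(radii_used, dtype=float)
    sizes, counts = np.unique(radii_used, return_counts=True)
    cumulative_pct = 100.0 * np.cumsum(counts) / counts.sum()
    smaller = 100.0 * (radii_used < fixed_radius).mean()   # ~70% in this study
    larger = 100.0 * (radii_used > fixed_radius).mean()    # ~10% in this study
    return sizes, counts, cumulative_pct, smaller, larger
```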
The map produced using spatially adaptive filters used spatial filter sizes ranging from 12 miles to 36 miles. Fig. 9 illustrates a small sample (approximately 5%) of the filter sizes that were used in the spatially adaptive filters method. Notice that smaller filter sizes are used in urban areas (central part of the state) and larger filter sizes are used in rural areas and in areas close to the state boundary. Larger filter sizes were required close to the map boundaries because of known problems relating to edge effects (Lawson et al. 1999).
Fig. 9. Sample of filter sizes used in the spatially adaptive filters method, showing every twentieth grid point (filter-size classes in miles: 12-14, 14-18, 18-22, 22-26, 26-30; scale bar 0-100 miles)
5 Conclusions

These results demonstrate the advantages of spatially adaptive filters for obtaining stable maps of disease incidence compared with maps produced using fixed-size spatial filters. We found that in most cases the spatially adaptive filters used smaller filter sizes than a fixed-filter map while giving results with high reliability in rates and better geographic detail. Many cancer registries throughout the world are now able to geocode cancer incidences to very fine levels of geographic resolution. Although access to such data is restricted because of privacy and confidentiality concerns, an important question is whether this geographic detail is needed to produce more accurate maps of cancer rates. Future research will explore this question using both disaggregated and aggregated cancer data. As part of the research project that supported this work, a new version of the Disease Mapping and Analysis Program is being developed at The University of Iowa that uses spatially adaptive filters for creating disease maps. Its current status can be viewed at http://www.uiowa.edu/~gishlth/DMAP.
Acknowledgements

We thank Ika Peleg, Michele West, Geoffrey Smith and the Iowa Cancer Registry for their assistance. Partial support for this work was provided by NIH Grant # R01 CA095961, “A GIS-based workbench to interpret cancer maps”.
References

Amrhein CG (1995) Searching for the elusive aggregation effect: evidence from statistical simulations. Env and Pl A 27:105-199
Atkinson PJ, Unwin DJ (2002) Density and local attribute estimation of an infectious disease using MapInfo. Comp & Geosci 28:1095-1105
Bithell JF (1990) An application of density estimation to geographical epidemiology. Stat in Med 9:691-701
Bithell JF (2000) A classification of disease mapping methods. Stat in Med 19:2203-2215
Kafadar K (1994) Choosing among two-dimensional smoothers in practice. Comp Stat & Data Analysis 18:419-439
Lawson AB (2001) Statistical methods in spatial epidemiology. John Wiley & Sons, West Sussex
Lawson AB, Biggeri A, Dreassi E (1999) Edge effects in disease mapping. In: Lawson AB, Biggeri A, Bohning D, Lesaffre E, Viel JF, Bertollini R (eds) Disease mapping and risk assessment for public health. John Wiley & Sons, pp 83-96
Openshaw S, Taylor P (1979) A million or so correlation coefficients: three experiments on the modifiable areal unit problem. In: Wrigley N (ed) Statistical applications in the spatial sciences. Pion, London, pp 127-144
Pickle LW (2002) Spatial analysis of disease. In: Beam C (ed) Biostatistical applications in cancer research. Kluwer Academic Publishers, Tampa, pp 113-150
Rushton G (2003) Public health, GIS, and spatial analytic tools. Ann Rev of Pub Hlth 24:43-56
Rushton G, Lolonis P (1996) Exploratory spatial analysis of birth defect rates in an urban population. Stat in Med 15:717-726
Talbot TO, Kulldorff M, Forand SP, Haley VB (2000) Evaluation of spatial filters to create smoothed maps of health data. Stat in Med 19:2399-2408