Lecture Notes in Geoinformation and Cartography Series Editors: William Cartwright, Georg Gartner, Liqiu Meng, Michael P. Peterson
Markus Jobst Editor
Preservation in Digital Cartography Archiving Aspects
Editor
Dr. techn. Markus Jobst
Austrian Federal Office for Metrology and Surveying
Informationmanagement I1/INSPIRE
Schiffamtsgasse 1-3
A-1020 Vienna
Austria
[email protected]
ISSN 1863-2246 e-ISSN 1863-2351
ISBN 978-3-642-12732-8 e-ISBN 978-3-642-12733-5
DOI 10.1007/978-3-642-12733-5
Springer Heidelberg Dordrecht London New York
Library of Congress Control Number: 2010936135
© Springer-Verlag Berlin Heidelberg 2011

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer. Violations are liable to prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Cover design: deblik, Berlin
Layout and Prepress: Jobstmedia, Austria (www.jobstmedia.at)
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
This book, “Preservation in Digital Cartography: Archiving Aspects”, gives an overview of how to preserve digital cartographic applications and geospatial data in a sustainable way. Its intention is to shape the opinion of affected parties and to bring together various disciplines. The chapters therefore deal with information technologies, Service-Oriented Architectures, cybercartography, reproduction and historic cartography, all of which can be subsumed under a prospective cartographic heritage. The survival of this digital cartographic heritage will rest on long-term preservation strategies that combine extensive dissemination on the one hand with sustainable digital archiving methods on the other. This involves a major shift of paradigm from “store-and-save” to “keep-it-online”. The “store-and-save” paradigm applies mainly to analogue masters, which consist of a storage medium, such as vellum, and its visible content. Protecting the storage media from degradation in climate-controlled areas helps to keep the content accessible. In the digital domain, the strong interdependency of storage media, formats, devices and applications leads to the “keep-it-online” paradigm, which, for example, describes the migration to new storage devices. In fact, this shift of paradigm means that the digital domain calls for ongoing action in order to preserve cartography for the long term. The topics within this book span from the complexity of a prospective cartographic heritage and aspects of geospatial preservation to the problems of keeping digital cartography online and pragmatic considerations for a prospective cartographic heritage. The contributions describe and help to identify the main foci of preservation in digital cartography, supported by state-of-the-art practices and experience reports. The first section focuses on the complexity of a prospective digital cartographic heritage.
In particular, the field of cybercartography, which uses interactive, dynamic, multisensory formats employing multimedia and multimodal interfaces, and new developments in cartographic communication, especially neo-cartography with its massive use of geo- and map services in Service-Oriented Architectures (SOA), lead to unsolved archiving topics that necessitate new methods, structures and technologies.
The second section deals with sustainability in terms of geospatial preservation. Sustainability concerns various aspects of cartographic heritage, where the change of media carrier from analogue to digital and the accessibility of content in terms of various formats, encodings and (hardware and software) requirements have to be addressed. In addition, this section highlights the main difficulties in the long-term accessibility of digital media and possible methods to ensure the long-term accessibility of digital cartography. Standardization initiatives, such as the Data Preservation Working Group of the Open Geospatial Consortium (OGC), are one key factor for the sustainable preservation of geospatial data. Sustainability by means of accessibility focuses on the needs, formats and relationships for enabling long-term data collections on the one hand and data mining (finding the right content) on the other. The third section presents experience reports on keeping digital cartography online, which is an important part of preservation in digital cartography. Only the knowledge and awareness of digital cartographic heritage allow appropriate preservation strategies to be designed and the expense of their implementation to be justified in order to secure this heritage for the future. The selected examples give an introduction to the viewpoints of a digital library and a cartographic archive. The section concludes with a chapter on user-friendly access to geospatial data collections, which highlights the importance of WYSIWYG interfaces. The fourth section discusses pragmatic considerations, which span from reproduction quality to legal issues in Service-Oriented Architectures. The chapter on digital reproduction compares the advantages of direct digitization versus hybrid methods in terms of long-term preservation. The Arcanum project highlights difficulties caused by administrative change and their influence on cartographic heritage applications.
Finally, the development of distributed digital map libraries leads to an intellectual property rights approach that should be considered in future distributed applications. The editor would like to acknowledge the support of the following institutions, which helped to realize this book:
• The Research Group Cartography at the Vienna University of Technology and the Hasso Plattner Institute at the University of Potsdam encouraged the editor in his interest in preserving digital cartography with several discussion and working opportunities, although this was not their specific research area.
• The Institute of Cartography at the Technical University of Dresden granted access to personal knowledge resources, material and
a reproduction camera for comparative studies in the field of reproduction quality.
• The Commission on “Digital Technologies in Cartographic Heritage” of the International Cartographic Association (ICA) provided an adequate forum for discussing and presenting acquired experiences, ongoing ideas and developments in digital cartographic heritage. As one result of these discussions, several members have contributed their experiences to this book.
Finally, I would like to thank my family for their support, understanding and patience during the last months, whenever I had to work on this book on weekends and holidays. To Beatrix and Paula...
“My heritage has been my grounding, and it has brought me peace.” Maureen O'Hara
“Heritage is our legacy from the past, what we live with today, and what we pass on to future generations. Our cultural and natural heritage are both irreplaceable sources of life and inspiration.” World Heritage Definition of UNESCO
September 2010, Markus Jobst
About the Contributors
Renate Becker is CEO of GIS-Service GmbH, which she founded in 1999. The company focuses on services for environmental agencies and mining companies, providing expertise in the fields of data preparation, management and analysis, particularly for hydrological and mining applications.

Miguel Ángel Bernabé-Poveda has a BSc in Land Surveying, an MSc in Fine Arts and a PhD in Education. He is currently Professor of Cartography at the Technical University of Madrid and Head of the MERCATOR Research Group.

Uwe M. Borghoff holds Diploma and Doctoral degrees in Computer Science from the Technische Universität München, Munich, Germany. In 1993, he was awarded the postdoctoral university lecturing qualification (Habilitation) in Computer Science. He worked at the Technische Universität München for seven years as a research scientist before joining the Xerox Research Centre Europe (formerly Rank Xerox Research Centre) at the Grenoble Laboratory, France, in 1994. At Xerox, he was Senior Scientist, project leader and group leader of the coordination technology area. In 1998, he joined the faculty of the Universität der Bundeswehr München, Munich, Germany, where he is a full professor of Computer Science at the Institute for Software Technology.

Jens Bove was born in 1969 in Minden, Germany. From 1991 to 1996 he studied Art History, German Literature and Media Sciences in Marburg. Since 1993 he has been working for the German Documentation Center of Art History - Bildarchiv Foto Marburg, and he finished his PhD in 2001 on the work of the Conceptual Pop artist Richard Hamilton. From 2002 to 2003 he was Managing Director of Foto Marburg. Since 2003 he has been Head of the Deutsche Fotothek in the Saxon State Library.

Manfred F. Buchroithner (born in 1950) is Full Professor of Cartography at and Director of the Institute for Cartography (IfC) of the Dresden University of Technology (TUD). He holds degrees in both Geology & Paleontology (Graz, Austria) and Cartography and Remote Sensing (ITC, NL)
and obtained his PhD in 1977. Of his more than 300 articles, more than 65 have been published in reviewed journals. He has written three books and edited three volumes on remote sensing. His major research interests cover true-3D geodata visualisation and high-mountain cartography.

William Cartwright is President of the International Cartographic Association and a Federal Councillor of the Mapping Sciences Institute, Australia. He is Professor of Cartography and Geographical Visualization in the School of Mathematical and Geospatial Sciences at RMIT University, Australia. He holds a Doctor of Philosophy from the University of Melbourne and a Doctor of Education from RMIT University. He has six other university qualifications, in the fields of cartography, applied science, education, media studies, information and communication technology, and graphic design. He joined the University after spending a number of years in both the government and private sectors of the mapping industry. His major research interest is the application of integrated media to cartography and the exploration of different metaphorical approaches to the depiction of geographical information.

Pilar Chías is Professor at the Technical School of Architecture and Geodesy of the University of Alcalá (Spain). After a long tradition of implementing GIS for cultural heritage in a wide sense, her particular interest has focused on historical cartography as a main source for research on the historical evolution of the Spanish territories and landscapes. She is the author of many books and academic journal articles, such as Los caminos y la construcción del territorio en Zamora. Catálogo de puentes (2004), Eduardo Torroja. Obras y Proyectos (2005), and ‘Las vías de comunicación en la cartografía histórica de la Cuenca del Duero: construcción del territorio y paisaje’, Ingeniería Civil, no. 149 (2008).
Efrén Díaz-Díaz has an LLB from the University of Navarre, and is currently a lawyer at the law firm ‘Mas y Calvet’ (Madrid) and a member of the Working Group of the Spatial Data Infrastructure of Spain. He is interested in the legal aspects of new technologies applied to urban and rural properties, the Land Registry and the Cadastre.

Robert Dixon-Gough commenced his professional career as a cartographer, working with a multi-disciplinary group evaluating future water management policies in England and Wales. He has been a lecturer and researcher at the University of East London (formerly North East London Polytechnic and the Polytechnic of East London) since 1974 and during this period has published extensively on issues relating to cartography, remote sensing and land management. Over the past two decades he has cooperated with a number of research institutes and universities across Europe and is also an active member of the European Faculty of Land Use and Development. He was the founder of the International Land Management Series (Ashgate Publishing Ltd.) and his current research is centred on the processes, actors and actions related to the evolution of cultural landscapes.

Alberto Fernández-Wyttenbach has a BSc in Land Surveying, and is currently finishing an MSc in Geodesy & Cartography at the Technical University of Madrid. He is interested in new technologies applied to cartographic heritage and digital libraries.

Georg Gartner is Professor of Cartography and Geo-Mediatechniques at the Research Group of Cartography at the Vienna University of Technology. He holds graduate qualifications in geography and cartography from the University of Vienna and the Vienna University of Technology. He serves as a Vice-President of the International Cartographic Association (ICA) and as a member of the Executive Board of the Austrian Geographic Society.

Rudolf Haller is Head of Geoinformation and ICT of the Swiss National Park. He received his Masters in Geography from the University of Zurich. His main interests are in the application of GI technologies in nature protection and conservation. Recently he has been involved in several international projects on location-based services and virtual globe implementations in protected areas, as well as international habitat studies.

Józef Hernik graduated from the University of Agriculture in Krakow and the Jagiellonian University.
He is a lecturer and researcher at the University of Agriculture in Kraków and carries out research on land management and the cultural landscape. He was the coordinator of a number of projects on cultural landscape, and he initiated and edited a series of monographs on the subject.
Lorenz Hurni is Professor and Head of the Institute of Cartography of ETH Zurich. Under his lead, the multimedia “Atlas of Switzerland” as well as a new interactive version of the “Swiss World Atlas”, the official Swiss school atlas, are being developed. The emphasis of his research lies in cartographic data models and tools for the production of printed and multimedia maps. Another focus of his research covers interactive, multidimensional multimedia map representations. These new possibilities are being explored in international, interdisciplinary projects and imparted to a broad audience in lectures and courses for students and practitioners.

Stephan Imfeld is a senior researcher at the GIScience Centre of the Department of Geography at the University of Zurich and at the Angiology Department of the University Hospital in Basel. He received a PhD in Natural Sciences from the University of Zurich and an MD from the University of Basel, Switzerland. He is currently responsible for research and development for the GIS of the Swiss National Park. His main interests are in methodological aspects of GIS applications in the biological sciences and in applied medical research.

Helen Jenny is a scientific collaborator at the Institute of Cartography of ETH Zurich, Switzerland. She received an MS in Physical Geography and GIS from the University of Stuttgart, Germany.

Bernhard Jenny is a scientific collaborator at the Institute of Cartography of ETH Zurich. He received an MS in Geomatics from EPFL Lausanne, Switzerland.

Markus Jobst technically coordinates the INSPIRE directive at the Austrian Federal Office for Metrology and Surveying and lectures on cartographic interfaces and geospatial data infrastructures at the Vienna University of Technology and the Hasso Plattner Institute in Potsdam. He finished his PhD, focusing on semiotics in 3D maps, at the Vienna University of Technology in 2008.
After being a research assistant at the Institute of Geoinformation and Cartography at the Vienna University of Technology from 2003 to 2007, he worked on geospatial Service-Oriented Architectures as a post-doc research fellow at the Hasso Plattner Institute at the University of Potsdam. Over the years, Markus Jobst has acquired in-depth knowledge of digital photography and cross-media production processes. His main foci in scientific work are the
communication of spatially related data as well as cartographic heritage, especially within spatial data infrastructures (SDIs) and digital cartographic presentation methods.

Wolf G. Koch (born 1943; Diploma in Cartography 1969, Doctorate 1974, Habilitation 1989) has been head of the course of studies “Cartography” since 1990 and was Professor for Theoretical Cartography and Map Design at the Dresden University of Technology, Germany, from 1992 to 2008, since then emeritus. He was Director of the Institute for Cartography 1992-1994 and 2000-2003, and leader of the commission “Cartographic Terminology” of the German Society for Cartography 1993-2005; he is co-editor of the dictionary “Lexikon der Kartographie und Geomatik” (2001/2002) and was adviser for map design for the National Atlas of Germany (1998-2006). International activities: corresponding member of the ICA Commissions “Theoretical Cartography” and “Maps and Graphics for Blind and Visually Impaired People”. Fields of interest: theoretical cartography, map design, tactile cartography, history of cartography.

Nico Krebs has held a Diploma degree in Computer Science from the Universität der Bundeswehr München since 2004. After his studies, he worked as a trainer for computer systems before returning to the Universität der Bundeswehr, where he is now a research assistant. His field of research is long-term archiving, with a strong focus on emulation techniques.

Tracey P. Lauriault is a doctoral student at the GCRC, where she participates in activities and represents the Centre on topics related to the access to and preservation of geospatial data. She led the Cybercartographic Atlas of Antarctica Case Study and General Study 10, Archival Policies of Scientific Data Portals, for the International Research on Permanent Authentic Records in Electronic Systems (InterPARES) 2.

Steve Morris is Head of Digital Library Initiatives at North Carolina State University Libraries, where he leads the development of new digital projects and services.
Previously, Steve led the GIS data services program at NCSU. From 2004 to 2010 he was principal investigator on a geospatial data digital preservation project funded through the Library of Congress National Digital Information Infrastructure and Preservation Program
(NDIIPP). He is currently Chair of the Data Preservation Working Group within the Open Geospatial Consortium.

Andreas Neumann leads the GIS Competence Center of the City of Uster and is a former scientific collaborator of the Institute of Cartography of ETH Zurich. He received an MA in Geography from the University of Vienna, Austria.

Peter L. Pulsifer is a postdoctoral fellow with the Geomatics and Cartographic Research Centre, Department of Geography, Carleton University. He has been an active member of what is now the SCAR Standing Committee on Antarctic Geographic Information and the Standing Committee on Antarctic Data Management. At present, he is leading the data and knowledge management aspects of the International Polar Year Inuit Sea Ice Use and Occupancy Project (http://gcrc.carleton.ca/isiuop/) while actively developing a research program focused on geographic information management for polar environments.

Dalibor Radovan is Head of the R&D sector at the Geodetic Institute of Slovenia. He also lectures in cartography at the Faculty of Civil Engineering and Geodesy, Department of Geodesy. He holds a B.Sc. and an M.Sc. in geodesy from the University of Ljubljana, and a Ph.D. in geoinformatics from the Technical University of Vienna. He works in the fields of topography, cartography, geodesy and GPS, maritime hydrography, GIS, navigation and LBS.

Peter Sandner holds a Dr. phil. (University of Frankfurt, 2002). He is an archivist in the Central Archives of the federal state of Hesse (Hessisches Hauptstaatsarchiv) in Wiesbaden, Germany, and a member of the working group ‘Electronic Systems in Law and Administration’ (Elektronische Systeme in Justiz und Verwaltung – AG ESys) and of the ‘IT Committee’ (IT-Ausschuss) of the Conference of Archival Officials of the Federation and the Federal States of Germany (Konferenz der Archivreferentinnen und Archivreferenten des Bundes und der Länder – ARK).
Renata Šolar is Head of the Map and Pictorial Collection of the National and University Library of Slovenia. She holds a B.Sc. and an M.Sc. in geography, and a Ph.D. in information sciences. She works in the field of map librarianship.
Zdeněk Stachoň is a member of the research team of the Laboratory on Geoinformatics and Cartography at Masaryk University, Brno. He focuses on topics of thematic cartography, history of cartography, semiotics and toponomastics.

D.R. Fraser Taylor is Director of the GCRC, Distinguished Research Professor of Geography and Environmental Studies, and a Fellow of the Royal Society of Canada. His main research interests in cartography lie in the application of geomatics to the understanding of socio-economic issues. He also has a strong interest in the theory and practice of cybercartography and in cartographic education. He was a collaborator in the InterPARES 2 project and is currently a member of the International Cartographic Association Working Group on Open Data Access and Intellectual Property Rights and Chair of the International Steering Committee for Global Mapping. He is also a Board Member of the OGC Interoperability Institute.

Axel Thomas is a geographer with a focus on Geographical Information Systems/Science. He has held a position as Professor of Geography at Gießen University and is currently Privatdozent (assistant professor) at Mainz University, Germany. He works as a senior R&D manager at GIS-Service GmbH, Wackernheim, Germany, specializing in application development for civil services and industry.

Georg Zimmermann, born in 1955, was first educated as a professional cartographic technician (1972-1975). He holds both a Diploma (1983) and a PhD (1986) in Cartography from the Dresden University of Technology, Germany. Since 1986 he has been Head of the Map Collection of the former Library of the Free State of Saxony, today the Saxon State and University Library Dresden. He is the founder and contact person of the Kartenforum Sachsen (Map Forum Saxony).
Since 2009 he has been managing the research project “Innovative Access to Spatial Graphic Information: Exemplary Digitizing, Processing and Presentation of Historical Maps and Vedutes for the Map Forum”, funded by the German Research Foundation (DFG).
Table of Contents

Preface .......... V
About the Contributors .......... IX

Section I: Understanding Complexity of a Prospective Cartographic Heritage .......... 1

1 From Plan Press to Button Push: the Development of Technology for Cartographic Archiving and Access .......... 3
William Cartwright
1.1 Introduction .......... 3
1.2 Archiving Historical Cartographic Artefacts .......... 5
1.3 Installations .......... 6
1.4 Hypercard .......... 7
1.5 Microfiche .......... 10
1.6 Videodisc .......... 10
1.7 CD-ROM .......... 12
1.8 Internet applications .......... 14
1.9 Mobile applications .......... 19
1.10 Conclusion .......... 21

2 The Preservation and Archiving of Geospatial Digital Data: Challenges and Opportunities for Cartographers .......... 25
Tracey P. Lauriault, Peter L. Pulsifer, D.R. Fraser Taylor
2.1 Introduction .......... 26
2.2 Contemporary Cartography .......... 26
2.3 Geospatial Data and Portals .......... 28
2.4 Maps, Data, Technological Systems and Infrastructures Leave Historical Traces .......... 29
2.5 Maps and Archives .......... 31
2.6 The Rescue and Salvage of the Canada Land Inventory .......... 32
2.7 Evaluating Progress in Preserving Cartographic Heritage .......... 33
2.8 How Can Today’s Maps, Data and Technologies be Preserved? .......... 34
2.9 Multidisciplinary Archival Research .......... 35
2.10 What is being done? .......... 43
2.11 Conclusion .......... 47

3 Structural Aspects for the Digital Cartographic Heritage .......... 57
Markus Jobst, Georg Gartner
3.1 Introduction .......... 57
3.2 Prominent Features of Digital Cartography .......... 58
3.3 Structural Considerations for the Preservation of Digital Cartography .......... 64
3.4 Conclusion .......... 73

4 Archiving the Complex Information Systems of Cultural Landscapes for Interdisciplinary Permanent Access – Development of Concepts .......... 77
Józef Hernik, Robert Dixon-Gough
4.1 Introduction .......... 78
4.2 History of Archives in the Respective Countries .......... 83
4.3 Complex Information Systems of Cultural Landscapes .......... 87
4.4 Archiving Cultural Landscapes .......... 90
4.5 Need for Permanent Interdisciplinary Access .......... 93
4.6 Conclusions .......... 95

Section II: Sustainability in Terms of Geospatial Preservation .......... 99

5 State-of-the-Art Survey of Long-Term Archiving – Strategies in the Context of Geo-Data / Cartographic Heritage .......... 101
Nico Krebs, Uwe M. Borghoff
5.1 Introduction .......... 101
5.2 Strategies of Long-Term Digital Preservation .......... 103
5.3 Which Strategy for which Kind of Data .......... 113
5.4 Long-Term Preservation of Geodata .......... 116
5.5 Conclusion .......... 123

6 Preservation of Geospatial Data: the Connection with Open Standards Development .......... 129
Steven P. Morris
6.1 Introduction .......... 130
6.2 Standards Work of the Open Geospatial Consortium (OGC) .......... 131
6.3 Key OGC Standards and Specifications: Dynamic Geospatial Data .......... 133
6.4 Key OGC Standards and Specifications: Data Files and Network Payloads .......... 135
6.5 The OGC Data Preservation Working Group .......... 137
6.6 Points of Intersection Between Geospatial Standards and Data Preservation Efforts .......... 138
6.7 Conclusion .......... 143

7 Pitfalls in Preserving Geoinformation - Lessons from the Swiss National Park .......... 147
Stephan Imfeld, Rudolf Haller
7.1 Introduction .......... 148
7.2 Archiving needs in a wilderness area .......... 150
7.3 Hardware threats .......... 151
7.4 Software considerations .......... 153
7.5 Data volume .......... 156
7.6 Brainware .......... 156
7.7 Documentation .......... 157
7.8 Integration .......... 158
7.9 Conclusions .......... 159

8 Geospatialization and Socialization of Cartographic Heritage .......... 161
Dalibor Radovan, Renata Šolar
8.1 Introduction .......... 162
8.2 The Geolibrary Paradigm .......... 162
8.3 Cartographic Heritage and GIS .......... 165
8.4 Geospatial Applications with Cartographic Heritage .......... 167
8.5 Social Implications of Digital Cartographic Heritage .......... 174
8.6 Conclusions and Open Questions .......... 176

Section III: Keep It Online and Accessible .......... 179

9 More than the Usual Searches: a GIS Based Digital Library of the Spanish Ancient Cartography .......... 181
Pilar Chías, Tomás Abad
9.1 Introduction .......... 181
9.2 Digital Approaches to the Cartographic Heritage .......... 182
9.3 The Cartographic Databases .......... 185
9.4 Conclusion .......... 202

10 Map Forum Saxony. An Innovative Access to Digitized Historical Maps .......... 207
Manfred F. Buchroithner, Georg Zimmermann, Wolf Günther Koch, Jens Bove
10.1 Introduction .......... 208
10.2 Access to Spatial and Graphic (Historical-Cartographic) Information and Concept of the Contents .......... 209
10.3 Structure of and Access to the Digital Collection .......... 213
10.4 Requirements for Internet Presentations of Historical Maps. Potential of Zoomify .......... 215
10.5 Recent Developments .......... 216
10.6 Prospects .......... 217
10.7 Concluding Remarks .......... 218

11 A WYSIWYG Interface for User-Friendly Access to Geospatial Data Collections .......... 221
Helen Jenny, Andreas Neumann, Bernhard Jenny, Lorenz Hurni
11.1 Introduction: Common Types of Geospatial Data Collection Interfaces .......... 222
11.2 The WYSIWYG User Interface .......... 227
11.3 Conclusion .......... 237

Section IV: Pragmatic Considerations .......... 239

12 Considerations on the Quality of Cartographic Reproduction for Long-Term Preservation .......... 241
Markus Jobst
12.1 Introduction .......... 242
12.2 Importance of Map Content as Cartographic Heritage .......... 242
12.3 Overall Considerations on Sustainability of Digital Technologies .......... 244
12.4 Digital Reproduction Techniques .......... 246
12.5 Hybrid Reproduction Issues .......... 247
12.6 The Test Case and its Comparison .......... 250
12.7 Conclusion and Further Aspects .......... 254

13 Issues of Digitization in Archiving Processes .......... 257
Zdeněk Stachoň
13.1 Introduction .......... 257
13.2 Preservation Strategies .......... 259
13.3 Chosen Issues of Digitization .......... 261
13.4 Conclusions .......... 270

14 Digitized Maps of the Habsburg Military Surveys – Overview of the Project of ARCANUM Ltd. (Hungary) .......... 273
Gábor Timár, Sándor Biszak, Balázs Székely, Gábor Molnár
14.1 Introduction .......... 274
14.2 Overview of the ARCANUM Project .......... 274
14.3 The Rectification Methods .......... 277
14.4 The User Interface (GEOVIEW) .......... 280
14.5 Legal Issues .......... 280
14.6 Conclusions .......... 281

15 The Archiving of Digital Geographical Information .......... 285
Peter Sandner
15.1 The Statutory Framework for Archiving .......... 286
15.2 Strategies in the Public Archives of the German States .......... 287
15.3 Conclusion .......... 292
X XII
Table of Contents
16 An Intellectual Property Rights Approach in the Development of Distributed Digital Map Libraries for Historical Research.....................................................................295 A. Fernández-Wyttenbach , E. Díaz-Díaz, M. Bernabé-Poveda 16.1 Introduction.................................................................................296 16.2 From Digital Map Libraries to Virtual Map Rooms....................296 16.3 A New Legal Landscape in Geographical Information...............299 16.4 Digital Rights Management........................................................302 16.5 Rights Expression Languages.....................................................304 16.6 Geospatial Digital Rights Management Reference Model..........304 16.7 Future Considerations.................................................................306 16.8 Conclusions.................................................................................306
Section I Understanding Complexity of a Prospective Cartographic Heritage
1 From Plan Press to Button Push: the Development of Technology for Cartographic Archiving and Access..........................3
William Cartwright
2 The Preservation and Archiving of Geospatial Digital Data: Challenges and Opportunities for Cartographers...........................25
Tracey P. Lauriault, Peter L. Pulsifer, D.R. Fraser Taylor
3 Structural Aspects for the Digital Cartographic Heritage..............57
Markus Jobst, Georg Gartner
4 Archiving the Complex Information Systems of Cultural Landscapes for Interdisciplinary Permanent Access – Development of Concepts................................77
Józef Hernik, Robert Dixon-Gough
1 From Plan Press to Button Push: the Development of Technology for Cartographic Archiving and Access
William Cartwright RMIT University, Melbourne, Australia
[email protected]
Abstract Properly archiving cartographic artefacts once involved little more than providing plan presses (or drawers) within which to store paper maps, drawings and diagrams. Access to the archive was made via manual processes that involved physically handling the documents during retrieval, inspection and return to storage. This methodology, whilst still in use, has, in many instances, been replaced or complemented with electronic counterparts. The method for storing cartographic artefacts has changed from the use of a plan press to a button press (via a computerised cartographic archive system). This chapter traces the development of technology-based cartographic archiving systems – from microfiche, to videodisc, to CD-ROM, to the Internet, the Web and Web 2.0. It provides examples of cartographic collections that have been built and delivered using these different media.
1.1 Introduction
The use of computers and optical storage media is now commonplace in museums and historical map libraries. Scanned map collections are available via automated systems within libraries or accessed through Web
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_1, © Springer-Verlag Berlin Heidelberg 2011
4 W. Cartwright
portals. Researchers and the general public alike are able to view and use these repositories of facsimile or virtual map collections without the need to handle the originals or even to visit the collection in person. Such immediate access to collections of historical cartographic artefacts has not always been so easy: those interested in historical cartographic documents once had to go to a library in person to view the maps they needed. Then, technology was applied to record collections, initially on film or microfilm, and later using high-definition scanners and frame-grabbers. Retrieving historical documents moved from having to physically open drawers and cupboards to clicking a computer mouse. This chapter looks at systems used to record and store historical cartographic artefacts using film, computers and optical media. It has been written to contribute to the understanding of the lineage of current systems and methodologies that are now in use.
Fig. 1.1. Map Library at Kent State University. Source: http://www.library.kent.edu/page/10647
From plan press to button push 5
Fig. 1.2. Map and atlas display at the State Library of Victoria. Photograph: William Cartwright.
1.2 Archiving Historical Cartographic Artefacts
Archiving historical cartographic artefacts once involved storing paper maps in map drawers. Map libraries contained hundreds of drawers from which these artefacts were retrieved when needed for consultation or research. Figure 1 shows Kent State University Map Library’s metal map drawers that are used to hold its map collection. Other artefacts like atlases, globes and models are stored in cupboards or displayed in glass cabinets. The example in figure 2 shows the method used to display historical maps and atlases at the State Library of Victoria, Melbourne, Australia. Here, purpose-built display furniture is used to house and protect the collection, whilst allowing visitors to the Library to view this important part of the Library’s collection.
Fig. 1.3. Panorama display at the State Library of Victoria. Photograph: William Cartwright.
1.3 Installations
As well as storing maps in drawers and having display cabinets, some libraries incorporated installations into their public areas. One example of this type of exhibition method is the system used in the State Library of Victoria to show panoramas of the City of Melbourne. The Library holds many
Fig. 1.4. Control panel of the panorama display at the State Library of Victoria. Photograph: William Cartwright.
Fig. 1.5. Detail of the control panel of the panorama display at the State Library of Victoria. Photograph: William Cartwright.
fine panoramas of the city, including the ‘Cyclorama’ of early Melbourne. This is one of the earliest 360-degree views, showing Melbourne in the 1840s (Colligan, 1990). Scenic artist John Hennings painted the Cyclorama in 1892, based on Samuel Jackson's Panoramic Sketch of Port Phillip of 30 July 1841 (State Library of Victoria, 2010). To enable visitors to the library to view panoramas of the city, and thus to appreciate its development, a computer installation (shown in figure 3) provides access to six panoramas of the city, presented in chronological order. This collection includes Jackson’s sketch of Melbourne. Visitors interact with the system using a trackball and buttons to select which of the six available panoramas is to be viewed. The interaction panel is shown in figure 4. As noted earlier, the system offers six panoramas of the city. Detail of the choices provided to users via buttons is shown in figure 5.
1.4 HyperCard
‘Hypermapping’ arrived with the introduction of the Apple Macintosh and Apple’s HyperCard. Users could develop interactive products by creating HyperCard ‘stacks’ – collections of black-and-white ‘cards’ that could be viewed in any order and ‘hyperlinked’ to other ‘cards’ in the stack. An early cartographic product was HYPERSNIGE (Camara and Gomes, 1991), a hypermedia system, which included Portugal’s national, regional and sub-regional maps and information. Parsons (1995), in order to evaluate his virtual worlds-based interfaces, developed the Covent Garden prototype. The user was presented with a
Fig. 1.6. The Covent Garden area prototype interface. (Parsons, 1995, p. 207)
‘through the window’ view of the market via a 3-D view in perspective. Users could then navigate around the package using conventional cursor controls and mouse clicks on directional arrows indicating movement directions. As well as current information, the package included scanned historical artefacts. This interface is depicted in figure 6. Education was also an early innovator with HyperCard; packages like ‘Medieval Florence’, developed jointly by the Department of Educational Sciences and the Department of History of the University of Florence, are typical of the educational material produced with this medium. It used the map metaphor as a navigation method. A hypermapping application that contained historical maps was developed at the Norwegian Computing Centre for the Humanities. Map examples from the University Museum of Antiquities in Oslo, dating from 1880 and 1910, were incorporated into an architectural history project (Ore, 1994). Two sets of maps were used – the 1882 1:1000 and the 1910 1:2000 maps of Bergen. This was part of a pilot study on the use of digital maps in the humanities that was published using HyperCard. It incorporated text, photographs, architectural drawings and maps generated from reports of archaeological digs. The maps were scanned, geocoded and incorporated into the HyperCard package. Figure 7 shows the schema for hyperlinking documents in the package.
Fig. 1.7. Architectural history system. Source: Ore, 1994, p. 288.
Fig. 1.8. MICROCOLOUR Map Display System™. Source: http://www.microcolour.com/mci040.htm.
1.5 Microfiche
Microformats were used as a medium for the storage of maps during the late seventies and early eighties. Exploratory work was carried out by a research team led by Massey (Massey, Poliness and O'Shea, 1985), who produced an 'Atlas' depicting the socio-economic structure of Australia on thirteen monochrome microfiche. Each microfiche featured one of Australia's capital cities, states or territories. The US Bureau of the Census also produced colour microfiche depicting census data that had previously been made available on colour paper maps (Meyer, Broome and Schweitzer, 1975). The example shown in figure 8 is the MICROCOLOUR Map Display System™, used by A-Z Maps and Atlases (United Kingdom) to store colour images of their street mapping. The system utilised colour microfiche to record the map imagery. This example is a high-resolution, 148mm x 105mm (6 x 4 inches) colour transparency recorded on Ilfochrome (formerly Cibachrome) colour microfiche. The reduction factor is 7.5X to 17.5X.
1.6 Videodisc
Philips made the videodisc publicly available in 1979. The medium stores analogue video signals and can be controlled by programs executed on a computer to which the videodisc player is attached. Two types of videodisc were used – CAV (Constant Angular Velocity) for interactive applications and CLV (Constant Linear Velocity) for applications like linear movies (Cotton and Oliver 1994). Each side of a 12” videodisc holds 52,000 frames, which can be viewed as stills, video or animation. The basic design approach was to develop code in any of a number of programming languages, and then use commands to guide the laser reader in the videodisc player to specific frames. The program controlled the display of frames and access to a database that might reside on the controlling computer or be embedded on the videodisc itself. Programs could be developed as generic code, which could then be used to control other videodiscs produced to similar guidelines. Video laserdiscs became a standardised product through NATO, and a specification (STANAG 7035) was set for the Worldwide Defense Mapping Agency (DMA) database. The Canadian Department of National Defence (DND) introduced videodisc mapping in 1987 (Bilodeau, 1994). It
became the interim geographical information package of preference for the Canadian Forces due to its large storage capacity and rapid retrieval (Bilodeau and Cyr, 1992; Bilodeau, 1994). In Australia the New South Wales Government Printing Office produced four videodiscs containing the State's archival photographic collection in 1986. These videodiscs were later archived at the State Library of New South Wales. The State Library later digitized the contents of the videodiscs (NSW Government Printing Office, 1988). An Australian videodisc prototype for mapping, the Queenscliff Video Atlas, was produced in 1987 (Cartwright 1989a, 1989b). It stored historic maps and images. The system consisted of a Philips videodisc player, a Sony composite (analogue and digital) screen and a PC. Programs were written in Turbo Pascal to control access to frames on the videodisc. Figure 9 shows images from the package – at left the DOS initial interface and at right a typical historical map display. (Note: these images are photographs from the screen and thus are slightly distorted and show some reflections.) Figure 10 illustrates historical aerial photography stored in the package.
Fig. 1.9. Queenscliff Video Atlas - DOS interface and map component.
Fig. 1.10. Queenscliff Video Atlas – aerial photography – vertical and oblique.
Fig. 1.11. Queenscliff Videodisc – map selection and ‘zoom’ facility.
When using the videodisc package, users are presented with a menu that shows the contents available. Users can select maps and ‘zoom’ to further detail. Zooming is facilitated by storing many images of each map at different scales and then moving to the particular frame that contains the appropriate image when the user chooses the zoom option. These interaction outcomes needed to be incorporated into the initial product design, as all elements of videodisc packages had to be pre-determined and images captured as individual analogue video frames. Figure 11 illustrates choosing a map from the menu and then ‘zooming’ into a chosen part of the map (bottom two frames).
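The pre-determined frame logic described above can be sketched in a few lines. The original control programs were written in Turbo Pascal; the sketch below uses Python, and the map names, frame numbers and seek command are invented for illustration only.

```python
# Hypothetical sketch (not the original Turbo Pascal code) of the
# frame-addressing logic behind a videodisc 'zoom': every map/scale
# combination was mastered in advance as a single analogue video frame,
# so interaction reduces to looking up a frame number and seeking to it.

FRAME_INDEX = {
    # (map_name, zoom_level) -> frame number (one of 52,000 per disc side)
    ("town_map", 0): 1040,   # full extent
    ("town_map", 1): 1041,   # first zoom step
    ("town_map", 2): 1042,   # most detailed image mastered on the disc
}

def next_zoom_frame(map_name, current_level):
    """Return the frame for the next zoom level, or None if no more
    detailed image was captured when the disc was produced."""
    return FRAME_INDEX.get((map_name, current_level + 1))

print(next_zoom_frame("town_map", 0))  # -> 1041
print(next_zoom_frame("town_map", 2))  # -> None (zoom limit reached)
```

Because every reachable view had to exist as a frame on the disc, all zoom paths were fixed at mastering time – which is why all interaction outcomes had to be incorporated into the initial product design.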
1.7 CD-ROM
Sony of Japan and Philips of The Netherlands jointly developed CD-ROM in 1982. Initially, the potential of the large storage capacity of CD-ROMs for the distribution of geographical information fostered interest in publishing digital maps using this new medium (Rystedt 1987, Siekierska and
Fig. 1.12. CD-ROM introductory ‘page’. Source: The British Library, 1994.
Palko 1986). Products like the Digital Chart of the World (DCW) and the World Vector Shoreline (Lauer 1991) were some of the first products to be made available. Multimedia maps stored on CD-ROM were used as a means to extend the impact of exhibitions. Visitors to exhibitions could take home a complementary CD-ROM package. The CD-ROM, The Image of the World: An interactive exploration of ten historic world maps, was developed as part of The Earth and the Heavens: the art of the mapmaker exhibition
Fig. 1.13. Screen image of the maps described in the CD-ROM. Source: British Library, 1994.
held in the British Library in 1995. It contains historical images of ten world maps that were also part of the physical exhibition, dating from the 13th to the late 20th centuries. The CD-ROM charts the development of maps from the medieval period to current times. Users move to the mapping era of interest via a menu (figure 12). One example map (figure 13) is a medieval Christian view of the world circa 1250. At the bottom of the map are options for navigation and detailed information retrieval. An audio narration describes each map example as a flashing highlight moves over the map to indicate the part of the map being described in the narrative. The introductory text option provides voluminous textual descriptions of the project itself and the maps described by the audio narrations. Text from the Resource Directory can be printed out or copied to a computer disc for personal use or study (as can the map images). The Image of the World is an exemplar of the use of multimedia for mapping, making the maps available after the exhibition had closed. The expert narrations and comments make the maps and their makers more readily understandable.
1.8 Internet applications
Cartography has shown great interest in using the Internet and the Web for providing access to map collections. An early innovative project was the Alexandria Digital Library (ADL) prototype (Smith 1996), based at the
Fig. 1.14. Broer Map Library. Source: http://www.broermapsonline.org/members/
Fig. 1.15. Perry-Castañeda Library Map Collection user interface. Source: http://www.lib.utexas.edu/maps/historical/history_texas.html
University of California at Santa Barbara. This Web-based project provided the means for a consortium of researchers, developers and educators to explore the use of digitally-referenced geographical subject matter and to access a digital library of spatially indexed information. It focused on the provision of spatially indexed information via the World Wide Web. This was followed by many early products, which, in general, just provided hyperlinks to scanned documents. The Web pages were text-heavy, and it was only later that more interactive interfaces appeared. Even on some current on-line map library applications this is still the case. For example, the interfaces to both the Broer Map Library historical map collection (figure 14) and the University of Texas at Austin’s Perry-Castañeda Library Map Collection (figure 15) are still text-focussed. At Oxford University’s Bodleian Library, a repository of numerous historical artefacts related mainly to Oxford and Oxfordshire, a number of rare map facsimiles are made available via the Web (figure 16). In some instances scholars may use these high-resolution scanned images without the need to formally request copyright clearance. As well, digitized images from the Bodleian Libraries Special Collections that contain maps and map-related objects can also be accessed. These include “Maps in script and print”, “Gough Map - gateway to Medieval Britain”, “Bringing Laxton to Life” and the Sheldon Tapestry Map of Gloucestershire. To deliver the “Maps in script and print” project the Library is scanning around 1,000 existing 35mm slides and filmstrips. These will be added to the ‘Map Room’. The “Gough Map - gateway to Medieval Britain” provides a digital archive of this fourteenth-century map. This is the oldest surviving road map of Great Britain (around 1360). It was discovered by Richard Gough in 1774 and donated to the Library in 1809 (Crowe, 2010). The digitizing project is a joint effort between the Bodleian Library and Queen’s University, Belfast, under the ‘umbrella’ of the ‘Mapping the Realm’ research project funded by the British Academy (Lilley, 2010).
Fig. 1.16. Interface to the Bodleian Library map collection. Source: http://www.bodley.ox.ac.uk/guides/maps/
Fig. 1.17. Selected sections of the Bodleian Libraries Special Collections Web access pages. Source: http://www.odl.ox.ac.uk/digitalimagelibrary/index.html.
The “Bringing Laxton to Life” project provides on-line the digitized Mark Pierce 1635 manuscript map of the Village of Laxton (Beckett, 1989). The interest in this village is that it retains an unchanged pattern of feudal agricultural land use. As well as the map, the project also provides a digital replication of the ‘terrier’, which accompanied the map and which describes each of the thousands of strips of land on the map. The terrier includes information about who farmed each strip and its size. The Sheldon Tapestry Map of Gloucestershire (1590s) (BBC, 2010) is part of the Library’s collection of Tudor tapestry maps that were woven in wool and silk (British Library 2009; Reus, 2008). Selected sections of the Library Web access pages are shown in figure 17. A Vision of Britain Through Time was developed at the University of Portsmouth for the Great Britain Historical GIS Project. The site is ‘underpinned’ with GIS technology and delivers the results of GIS analysis through a Web portal. Maps in the system are from 1801-2001. The project was developed from the larger Great Britain Historical Geographical Information System (GBHGIS) that contains information from census reports, historical gazetteers, travellers' tales and historic maps (A Vision of Britain Through Time, 2010).
Fig. 1.18. A Vision of Britain Through Time – British War Office AMS 1202, Sheet 19 - Central Europe. Source: http://www.visionofbritain.org.uk/maps/
Fig. 1.19. Mash-up of historical and current maps of Manhattan. Source: http://www.webmonkey.com/2009/05/historical_map_mashups_turn_cities_into_ glass_onions_of_time/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+wired%2Findex+Wired%3A+Index+3+Top+Stories+2
1.8.1 Web 2.0
Relatively recently, maps have begun to be published on the Web by producer-users through a process called ‘mash-ups’, enabled by Web 2.0 and Social Software. Web 2.0 is the use of the Web by individuals and groups of individuals to provide and share information, including geographical information. It provides a new model for collaborating and publishing. Historical Web mash-ups are part of this new cartography. ‘Hypercities’ provides mash-ups of historical maps overlaid (or onion-skinned) atop current maps. This allows users to compare the contemporary city to the historic. The mash-up in figure 19 shows two maps of Lower Manhattan, New York – an 1891 street map overlaid with a 2006 map of the subway system. At the time of writing the site has 39 maps of New York City available, from 1766 to 2009. As many maps as needed can be simultaneously overlaid (Calore, 2009).
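Overlaying a scanned historical map on a current one presupposes that the scan has been georeferenced, so that pixel positions translate to map coordinates. A minimal sketch of the six-parameter affine (world-file style) transform commonly used for this follows; all coefficient values here are invented for illustration.

```python
# Sketch of pixel-to-coordinate georeferencing, the kind of step needed
# before a scanned historical map can be draped over a modern web map.
# The six coefficients follow the 'world file' convention; the example
# values below are invented for illustration only.

def pixel_to_geo(col, row, a, d, b, e, c, f):
    """Affine transform: image pixel (col, row) -> map coordinates (x, y).
    a, e: pixel size in x and y (e is usually negative: rows grow down);
    b, d: rotation/skew terms; c, f: coordinates of the origin pixel."""
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

# Example: a north-up scan where one pixel covers 0.5 m and the
# upper-left pixel is centred at easting 300000, northing 5810000.
x, y = pixel_to_geo(100, 200, a=0.5, d=0.0, b=0.0, e=-0.5,
                    c=300000.0, f=5810000.0)
print(x, y)  # -> 300050.0 5809900.0
```

Mash-up platforms typically also warp the image into the web map’s projection, but this affine step is the core of aligning an old scan with a current basemap.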
1.9 Mobile applications
A big change to telecommunications came with the arrival of the cellular telephone. Users are now able not only to converse using the system, but also to access the Internet whilst mobile and to use applications downloaded from sites like Apple’s iTunes store. The biggest impact has been applications (‘apps’) for Apple’s iPhone. Downloadable software apps are available for many purposes, including navigation and mapping. This section provides information on typical iPhone applications. Historic Earth (http://emergencestudios.com/historicearth/) was created by Emergence Studios in conjunction with Historic Map Works. It is based on Old Map App, an earlier prototype application. It contains geocoded historic maps, which can be overlaid on top of current maps. At the time of writing 32,000 high-resolution images were available, covering several U.S. cities and states.
Fig. 1.20. Historic Earth. Source: http://emergencestudios.com/historicearth/
Users can access historical maps at their current location when the onboard GPS of the iPhone is utilised. When the application is used on the Apple iPhone 3GS, users are able to rotate the map so that it aligns to their orientation. Figure 20 shows screen images from this application. HISTORY: Maps of the World (http://seungbin.wordpress.com/historymaps-of-world/) is an iPhone app that focuses on historical maps. Twenty historical maps of the world are delivered on the Apple iPhone. Maps range from the US Collective Defense Arrangements of 1967 to the South Pole in 1894. There is a world map from 1598 and a NATO map from 1970 (Reisinger, 2009). See figure 21 for screen images from the application.
Fig. 1.21. HISTORY: Maps of the World. Source: http://images.macworld.com/appguide/images/300/403/629/ss1.jpg
1.10 Conclusion
The use of technology to store and deliver historic cartographic artefacts has provided the means for scholars and the general public alike to access and use cartographic products that were once relatively hard to obtain and use. Technology has been applied through the use of photography (microfiche), computers (installations), interactive media (HyperCard), analogue and digital optical media (videodisc and CD-ROM), communications systems (the Internet, the Web and Web 2.0) and consumer electronics (personal digital assistants and smartphones like the Apple iPhone). Users perhaps now take for granted that they can immediately access historical cartographic artefacts in the workplace, in schools and universities, at home and when on the move. However, much innovative research and development in the application of technology to storing and delivering cartographic artefacts has taken place. This has ensured that we can access and appreciate the wealth of knowledge that is represented and delivered through historic cartographic representations of the Earth.
1.11 References
A Vision of Britain Through Time, 2010. www.visionofbritain.org.uk
BBC, 2010, “Sheldon Tapestry Map of Warwickshire”, A History of the World. http://www.bbc.co.uk/ahistoryoftheworld/objects/zfQEm8NfRZ-AzovZiwFhpw
Beckett, J. V., 1989, A History of Laxton: England’s Last Open Field Village. New York: Basil Blackwell.
Bilodeau, P., 1994, "Video disk mapping in Canada", Proceedings Canadian Conference on GIS, vol. 2, pp. 1157-1162.
Bilodeau, P. and Cyr, D., 1992, "Video Disk Mapping – An Interim GIS for the Canadian Military", Proceedings Canadian Conference on GIS, Ottawa, vol. 1, pp. 696-699.
British Library, 2010, “Bringing Laxton to Life: a unique insight into feudal England”. http://www2.odl.ox.ac.uk/gsdl/cgi-bin/library?site=localhost&a=p&p=about&c=mapsxx01&ct=0&l=en&w=iso-8859-1. Accessed 27 March 2010.
British Library, 2009, The Sheldon Tapestry Map of Gloucestershire. http://www.bodley.ox.ac.uk/boris/guides/maps/sheldon.html. Accessed 27 March 2010.
Calore, M., 2009, “Historical Map Mashups Turn Cities Into Glass Onions of Time”, WebMonkey, 21 May 2009. http://www.webmonkey.com/2009/05/historical_map_mashups_turn_cities_into_glass_onions_of_time/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+wired%2Findex+Wired%3A+Index+3+Top+Stories+2. Accessed 23 March 2010.
Calvani, A., 1990, “Hypermedia: Interactive Exploration of a Medieval Town”, Innovations in Education and Teaching International, vol. 27, no. 1, pp. 51-57. http://dx.doi.org/10.1080/1355800900270107
Camara, A. and Gomes, A. L., 1991, "HYPERSNIGE: A Navigation System for Geographic Information", Proceedings of EGIS '91, Second European Conference on Geographical Information Systems, Brussels, Belgium, April 2-5 1991, pp. 175-179.
Cartwright, W. E., 1989a, "The use of videodiscs for mapping applications", Proceedings of the Australian Optical Disc Conference, Melbourne, Australia.
Cartwright, W. E., 1989b, "Videodiscs as a Medium for National and Regional Atlases", paper presented at the 14th International Conference on GIS, Brussels, April 2-5, pp. 175-179.
Colligan, M., 1990, “Samuel Jackson and the Panorama of Early Melbourne”, La Trobe Journal, no. 45, Autumn 1990. http://nishi.slv.vic.gov.au/latrobejournal/issue/latrobe-45/latrobe-45-012.html
Cotton, B. and Oliver, R., 1994, The Cyberspace Lexicon – An Illustrated Dictionary of Terms from Multimedia to Virtual Reality. London: Phaidon Press Ltd.
Crowe, J., 2010, “The Gough Map”, The Map Room: A Weblog About Maps. http://www.mcwetboy.net/maproom/2005/09/the_gough_map.php. Accessed 27 March 2010.
Damasio, A., 1999, The Feeling of What Happens. San Diego, CA: Harcourt.
NSW Government Printing Office, 1988, "Priceless Pictures from the remarkable NSW Government Printing Office Collection 1870-1950".
Lauer, B. J., 1991, "Mapping Information on CD-ROM", Technical Papers of the 1991 ACSM-ASPRS Annual Convention. Baltimore: ACSM-ASPRS, vol. 2, pp. 187-193.
Lilley, K., 2010, “Mapping the Realm: English cartographic constructions of fourteenth century Britain”. http://www.qub.ac.uk/urban_mapping/gough_map/. Accessed 27 March 2010.
Massey, J., Poliness, J. and O'Shea, B., 1985, "Mapping the socio-economic structure of Australia – A microfiche approach", The Globe, no. 23, pp. 32-38. Melbourne: The Australian Map Circle.
Meyer, Broome and Schweitzer, 1975, "Colour statistical mapping by the U.S. Bureau of the Census", The American Cartographer, vol. 2, no. 2, pp. 100-117.
Ore, E. S., 1994, “Teaching new tricks to an old map”, Computers and the Humanities, vol. 28, nos. 4-5, pp. 283-289.
Parsons, E., 1995, “GIS visualisation tools for qualitative spatial information”, in Innovations in GIS 2, ed. Peter Fisher. London: Taylor and Francis, pp. 201-210.
Reisinger, D., 2009, “Useful educational iPhone apps for students”, CNET News. http://news.cnet.com/webware/?tag=rb_content;overviewHead
Reus, B., 2008, “Sheldon Tapestry Map Goes On Display At Oxford's Bodleian Library”, Culture24, 12 January 2008. http://www.culture24.org.uk/history+%2526+heritage/art53379. Accessed 28 March 2010.
Rystedt, B., 1987, "Compact Disks for Distribution of Maps and Other Geographic Information", Proceedings 13th ICC, Morelia, Mexico: ICA, vol. IV, pp. 479-484.
Siekierska, E. M. and Palko, S., 1986, "Canada's Electronic Atlas", Proceedings AutoCarto, London: International Cartographic Association, vol. 2, pp. 409-417.
Smith, T. R., 1996, “A Digital Library for Geographically Referenced Materials”, US Digital Library Initiative. http://www.library.ucsb.edu/untangle/frew.html
State Library of Victoria, 2010, Cyclorama. http://www.slv.vic.gov.au/collections/treasures/cyclorama1.html
2 The Preservation and Archiving of Geospatial Digital Data: Challenges and Opportunities for Cartographers
Tracey P. Lauriault, Peter L. Pulsifer, D.R. Fraser Taylor Geomatics and Cartographic Research Centre (GCRC), Department of Geography and Environmental Studies, Carleton University
[email protected],
[email protected],
[email protected]
Abstract In terms of preserving our digital cartographic heritage, the last quarter of the 20th century has some similarities to the dark ages. In many cases, only fragments or written descriptions of the digital maps exist. In other cases, the original data have disappeared or can no longer be accessed due to changes in technical procedures and tools. Where data has not been lost, as with the Canada Land Inventory, the cost of recovery has been high. Based on experience gained through participation in a major research project focused on preservation, the development of several digital cartographic frameworks, systems and artifacts (e.g. maps and atlases), and multidisciplinary work with archivists, data preservationists, data librarians, public officials and private sector cartographers, the authors discuss possible strategies toward the preservation of maps, geospatial data, and associated technologies – cartographic heritage. The chapter also discusses the findings of two International Research on Permanent Authentic Records in Electronic Systems (InterPARES 2) studies: Case Study 06 The Cybercartographic Atlas of Antarctica and General Study 10 on Preservation Practices of Scientific Data Portals in the natural and geospatial sciences. The chapter concludes with an overview of some of the questions and research opportunities that are emerging from the discussion.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_2, © Springer-Verlag Berlin Heidelberg 2011
“the mottling, wavy lines, and occasional offset in lines on this map bear evidence to several layers of correction sheets applied sequentially to the original. Over an extended period of time, these maps became quite weighty palimpsest of urban development in the most literal sense”
Robert R. Churchill (2004, p.16), describing historical urban maps of Chicago
Quod non est in actis, non est in mundo What is not in the records does not exist Old Latin Proverb
2.1 Introduction In terms of preserving our digital cartographic heritage, the last quarter of the 20th century has some similarities to the dark ages. In many cases, only fragments or written descriptions of the digital maps exist. In other cases, the original data have disappeared or can no longer be accessed due to changes in technical procedures and tools. As new technologies and approaches to data collection and cartographic production are established, new challenges in preserving and archiving geospatial digital data and maps are emerging. This chapter examines some of these emerging challenges and presents possible strategies for the preservation and archiving of contemporary digital maps, geospatial data, and associated technologies. Developments in contemporary cartography are presented along with an identification of key issues related to preservation and archiving. This discussion is followed by an elaboration on these issues, informed by historical review and applied research. The chapter concludes with an overview of some of the questions and research opportunities that are emerging from the discussion.
2.2 Contemporary Cartography All of the geospatial sciences, both natural and social, are making increasing use of Internet technology. Internet mapping, such as Google Maps and Google Earth, is increasingly popular with the general public. Many other Web 2.0 tools such as online photo, audio and video sharing services, blogs and geowikis (e.g., the Visible Past Initiative) are geo-enabling their content by including the ability to add a georeference and/or by providing geographically encoded object (e.g., GeoRSS) capabilities. The United States Library of Congress, for instance, is now making some of its photo collections available on Flickr, a popular photo sharing service (Library of Congress 2009), allowing these photos to be used as multimedia content in a variety of historical thematic maps and atlases. National mapping organizations (NMOs) are primarily producing geospatial data that are born digital, while continuing to digitize and scan older paper maps and airphotos. NMOs are now rendering their data in online maps and atlases (e.g., The Atlas of Canada) and distributing framework (e.g., GeoBase) and other key datasets using interoperable data services. The data used to render maps not only come from many sources, but are also now being rendered, in some cases in real time, from myriad distributed databases via a number of sharing protocols (e.g., http://gcrc.carleton.ca/isiuop-atlas). Concurrently, many datasets are being registered in online discovery portals using international metadata standards (e.g., ISO 19115) and being distributed under new licenses (Wilson and O'Neil 2009; GeoBase; GeoGratis; Science Commons). Cybercartography, a term coined by D. R. Fraser Taylor, is a new way of approaching the theory and practice of map making. It is "the organization, presentation, analysis and communication of spatially referenced information on a wide variety of topics of interest and use to society in an interactive, dynamic, multimedia, multisensory and multidisciplinary format" (Taylor 1997, 2003).
Cybercartographic atlases, for example, are increasing the complexity of map making while also enriching the way topics, issues and stories are rendered and conveyed (e.g., the Living Cybercartographic Atlas of Indigenous Perspectives and Knowledge, the Atlas of Canadian Cinema or the Atlas of the Risk of Homelessness). Geospatial data are also increasingly being incorporated into, and are inseparable from, models and simulations (e.g., general circulation models). Regardless of who is creating geospatial data and maps, and irrespective of their increasingly varied and complex forms and functions, the growing volume of digital maps has one thing in common: few of these new products are being effectively preserved and many are being permanently lost. There is a generally naïve belief and practice among digital map creators that back-up and storage techniques are enough to ensure preservation. Nothing could be further from the truth. Without a clear strategy for ensuring that the data will be accessible using future technologies, and
that storage media remain intact, simple backup and storage is not sufficient. We are losing digital spatial data as fast as we are creating them.
2.3 Geospatial Data and Portals Maps are knowledge representations and, in pragmatic terms, data representations. Cartography cannot exist without data, which come in many forms and are cartographically rendered in an increasing number of new ways such as cybercartography, distributed online mapping, and municipal enterprise systems. Databases are often thought of “as relatively new forms of records, [but] the essential concept of structured information gathering has existed for thousands of years” (Sleeman 2004, p.174). For example, the “Ptolemaic census, written in demotic, unearthed by the archaeologist Flinders Petrie in Rifeh in the early twentieth century…was similar to modern-day census data, tabular in structure with data divided into columns and equally as indecipherable to the naked eye” (Sleeman 2004, p.173). Data are therefore more than just “facts, ideas, or discrete pieces of information” (Pearce-Moses 2007). Geospatial data can also be “numbers, images, video or audio streams, software and software versioning information, algorithms, equations, animations, or models/simulations” (National Science Foundation 2005, p.18) which have a spatial referent. Geospatial data, according to one definition by the US Environmental Protection Agency (EPA), “identifies, depicts or describes geographic locations, boundaries or characteristics of Earth's inhabitants or natural or human-constructed features. Geospatial data include geographic coordinates (e.g., latitude and longitude) that identify a specific location on the Earth; and data that are linked to geographic locations or have a geospatial component (e.g., socio-economic data, land use records and analyses, land surveys, environmental analyses)” (EPA 2009). With advancements in scientific methods, computer technology and cartographic innovations, we can infer more from data than ever before.
Cumulative sets of data can assist with understanding trends, frequencies and patterns, and can form a baseline upon which we can develop predictions; the longer the record, the greater the confidence we can have in conclusions derived from it (National Research Council 1995). Thus, preservation of geospatial data is important, as these data can provide the raw materials required for future unanticipated uses, especially as technology
advances. The assembled record of geospatial and scientific data “has dual value: it is simultaneously a history of events in the natural world and a record of human accomplishments. The history of the physical world is an essential part of our accumulating knowledge, and the underlying data form a significant part of that heritage” (National Research Council 1995, p.11). These data also portray a history of our geographic, scientific and technological development, while databases “constitute a critical national resource, one whose value increases as the data become more readily and broadly available” (National Research Council 1995, p.50). It is cost-effective to maximize the returns on these investments by preserving them for the future and disseminating them widely. In addition, the costs of preserving and archiving data are relatively small in comparison with the costs of re-acquisition. Repairing a lost and abandoned dataset yields only partial results at significant cost, as seen in the Canada Land Inventory (CLI) example discussed later in this chapter. There is also an argument to be made that publicly funded data should be accessible data, now and for future generations.
2.4 Maps, Data, Technological Systems and Infrastructures Leave Historical Traces Maps have been collected in archives and libraries for nearly 2500 years (Ehrenberg 1981, p.23), and today's digital cartographers and geomatics practitioners may perhaps be unintentionally disrupting that historical practice. Maps, regardless of who created them, are “an integral part of the record of a nation's history, and any national archives should include a rich cartographic collection” (Kidd 1981-1982, p.4). Maps and atlases picture a time; they are part of our memory and provide evidence of the past while also creating continuity and a sense of belonging. Their role as artifacts is also to “inform and reform”, as Churchill reminds us in his historical work on their influence in shaping urban Chicago (2004, p.13). Maps are socially shaped and in turn are also social shapers. They “often are made not on the basis of the territory itself but on some preconceived sense or vision of the territory. Informed by these maps, subsequent actions move the territory toward the vision” (Churchill 2004, p.11), eventually changing the maps themselves. Those visions were consolidated in maps which in turn molded our physical world: today's cities, boundaries, the nation. They illustrate how we thought and provide an argument with a distinct point of view, and these views provide a context in which to understand or read representations of the social construct of places. The data used to render those maps are no less important: “legacy data, archived data or data used for independent research is very valuable: valuable in terms of absolute cost of collection but more importantly, as a resource for others to build upon” (Wilson and O'Neil 2009, p.2). Historical and contemporary maps inform decisions. Newsworthy, cataclysmic and obvious examples are the information resources accumulated to respond to disasters. For example, the information gathered in response to Hurricane Katrina, including geospatial data and maps, is being preserved by the US Federal Emergency Management Agency (FEMA), and the value of this information is recognized: “one of the best resources we have for preparing for the next major event are the lessons and data accumulated from this catastrophic experience. If we do not preserve this data and use it for research purposes, then we have wasted time and energy and done a great disservice to those who will be affected by the next major hurricane” (Curtis et al., 2006a in Warren Mills et al., 2008, p.477). FEMA is not a national mapping organization; nonetheless, it requires specialized data and maps for its ongoing work in emergency preparedness, during an emergency, and in post-emergency work. Maps, beyond being intellectual material for historians and content for decision makers, are inextricably tied to the innovative technologies that created them (Kinniburgh 1981, p.91). They are social-technological systems (Hughes 2004) that represent the evolution of map making, and “effort should be made to preserve the apparatus of cartography: the tools and the machinery” (Kinniburgh 1981, p.95), or in present-day terms: code, software, metadata models and systems.
Mapping and data related technologies have changed the way we do things, and “while datasets can be selected for the important data they hold reflecting government policy and administration, they also represent interesting innovations, either technological or organizational…Computer systems that changed what was possible, rather than just re-implemented manual processes, are of great historical interest” (Sleeman 2004, p.183), and can be learning tools for the future. Today's programmers, especially those involved in the development of open source technologies, are continuously building on previous technological innovations, and there is merit in preserving those, particularly since they can enable archivists to view today's content tomorrow. Based on the experience of the authors, many archivists are already thinking in this way. They argue that this requires keeping maps and data “in the information systems in which they were created”. The ‘records continuum’ concept is built upon this approach: “the basic idea is that records can function both actively in the organization in which they were created and passively as part of an archive” (Doorn and Tjalsma 2007, p.9).
2.5 Maps and Archives Maps, data, technologies and their related infrastructures “are a product of society’s need for information, and the abundance and circulation of documents reflects the importance placed on information in society. They are the basis for and validation of the stories we tell ourselves, the story-telling narratives that give cohesion and meaning to individuals, groups, and societies” (Schwartz and Cook 2002, p.13). Accordingly, the function of an archive in a society “must deal with two intimately related, but separately conceived themes: ‘knowledge and the shaping of archives’ and ‘archives and the shaping of knowledge’” (Schwartz and Cook 2002, p.14). Contemporary cartographers should reflect on their role in ensuring their content is preserved in the archive, thereby shaping future knowledge, while also working with archivists to transform the archive so that it can actually ingest their artifacts. This is not to be taken lightly, as “memory, like history, is rooted in archives. Without archives, memory falters, knowledge of accomplishments fades, pride in a shared past dissipates. Archives counter these losses. Archives contain the evidence of what went before. This is particularly germane in the modern world” (Schwartz and Cook, 2002, p.18). The following section provides an account of a ‘rescue’ process that recovered an important Canadian geospatial dataset. This account provides insight into many of the issues related to heritage preservation: lack of clear policy and stewards; technical and methodological obsolescence; and the time and effort required for data rescue. Given the historical and practical importance of the Canada Land Inventory data, this account also highlights that valuable heritage may be at risk.
2.6 The Rescue and Salvage of the Canada Land Inventory The Canada Land Inventory (CLI) story demonstrates the importance for cartographers of considering preservation as they are creating maps, compiling data and programming systems. If our cartographic digital heritage is to be preserved, it is important that cartographers consider archiving and preservation as an integral part of the life cycle of the creation of new digital geospatial products and maps. The CLI was part of the Canadian Geographic Information System (CGIS), developed in 1963 at the Department of the Environment in Canada. The CGIS was in fact the world's first GIS. It was a revolutionary idea at the time, and “one of the principal driving forces behind the Canadian Geographic Information System (CGIS) was the idea that the CLI maps could be interpreted and analyzed in a myriad of ways if the information could be manipulated by computers” (Schut 2000). It was established “as a joint federal-provincial project to guide the development of policy on the control and management of land-based resources” (Ahlgren and McDonald 1981, p.61). The CGIS ultimately grew to contain the equivalent of thousands of maps and unknowingly became a technology that “spawned an industry that today is worth billions of dollars” (Schut 2000). To demonstrate how innovative the CGIS was, Library and Archives Canada (then known as the Public Archives) did not hire its first computer systems specialist with the “responsibility to develop the automation requirements for a National Map Collection intellectual and physical control system” until 1977-78 (Kidd 1981-1982, p.17). The CLI was an incredibly ambitious federal-provincial program that mapped 2.6 million square kilometers of Canada, and “the original cost of the program was in the order of 100’s of millions of dollars in the 1970's” (Wilson and O'Neil 2009, p.6).
The CGIS was both a set of electronic maps and the “computer programs that allowed users to input, manipulate, analyze, and output those maps” (Schut 2000). By the late 1980s, the CGIS was no longer being used. Priorities changed, people retired, and institutional memory was being lost. Numerous boxes of tapes and racks of documentation were left behind, with only a few computers left in Ottawa capable of reading 9-track tapes, let alone running the programs (Schut 2000). In 1995, an informal trans-organizational group of individuals from Statistics Canada, the National Atlas of Canada, Archives Canada and a private sector programmer came together
to restore the CLI. They knew this was a valuable heritage dataset, they needed some of the layers, and they had the skills, know-how and, more importantly, the will to restore the CLI. All had either formerly worked on the CGIS or had an interest in its preservation. Their work consisted of converting into a modern coordinate system data from a technology that encoded each point as a relative offset (distance and direction) from the previous point, that did not use discrete tiles, and in which the entire country of Canada was coded as one enormous database (Schut 2000). On June 18, 1998, the agriculturally relevant portions of the CLI were handed over on one CD, and it worked flawlessly on the analytical tools built in anticipation of the new format (Schut 2000). The data were eventually distributed on Natural Resources Canada's GeoGratis site for free, with documentation and some text to help interpret the content. The CLI “rapidly became their most popular product” (Schut 2000). Both an updated version and the original are also available through the CanSIS website. Not all of the CLI data were saved, and the cost of the effort described above was very substantial. The CLI demonstrates the archival adage that “where the information and form of the record are so tenuously related, archivists must appraise, acquire, preserve, and control whole systems of information within which various physical media may exist” (Ahlgren and McDonald 1981, p.64).
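The relative-offset encoding described above is part of what made the rescue difficult: no point can be located without decoding the entire chain from its predecessor. The idea can be illustrated with a small sketch (hypothetical values and a simplified planar model, not the actual CGIS format):

```python
import math

def decode_relative_offsets(start, offsets):
    """Turn a chain of (distance, bearing) offsets into absolute planar
    coordinates. Each point is stored only as a displacement from the
    previous one, so decoding must walk the whole chain in order --
    one reason such data resist tiling and random access.
    Bearings are in degrees clockwise from north; units are arbitrary."""
    x, y = start
    points = [(x, y)]
    for distance, bearing in offsets:
        rad = math.radians(bearing)
        x += distance * math.sin(rad)  # easting component
        y += distance * math.cos(rad)  # northing component
        points.append((round(x, 6), round(y, 6)))
    return points

# A unit square traced as four offsets: east, south, west, north.
# The chain closes back on the starting point.
square = decode_relative_offsets((0.0, 0.0), [(1, 90), (1, 180), (1, 270), (1, 0)])
```

A single unreadable record corrupts every point that follows it, which helps explain why migration of such data had to happen while the original tapes and software could still be read.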
2.7 Evaluating Progress in Preserving Cartographic Heritage The preceding sections have provided an overview of developments in contemporary cartography, issues related to preserving and archiving the resulting artifacts, and the importance of preserving our geospatial information and cartographic heritage. Unfortunately, the technological, institutional and organizational issues related to the long-term preservation of data remain largely unresolved. The basic digital data upon which we depend to inform decisions on planning, health, emergency preparedness, industrial exploration and research are rarely being effectively archived and preserved and, as a result, much is being lost, some of it permanently. John Roeder, a researcher on both International Research on Permanent Authentic Records in Electronic Systems (InterPARES) projects, discovered that one-fifth of the data generated by the 1976 Viking space probe exploration of Mars (Cook 1995 and Harvey 2000), the entire 1960 U.S. Census (Waters and Garret 1996) and the works of nearly half of digital music
composers (Longton 2005) and one-quarter of digital photographers (Bushey and Brauen 2005) have been lost or threatened by technological obsolescence or inadequate preservation strategies. It has been argued that “in archiving terms the last quarter of the 20th century has some similarities to the dark ages. Only fragments or written descriptions of the digital maps produced exist. The originals have disappeared or can no longer be accessed” (Taylor, Lauriault and Pulsifer 2005). It has also been noted that “indeed digital technology is responsible for much of the loss, as storage technology has given a false sense of security against loss and obsolescence” (Strong and Leach 2005, p.13) and that “an unprecedented firestorm is incinerating Canada’s digital research wealth” (SSHRC 2002).
2.8 How Can Today’s Maps, Data and Technologies be Preserved? Researchers from the Geomatics and Cartographic Research Centre (GCRC) participated in the InterPARES 2 Project precisely to try to answer the question posed in the heading above. InterPARES 2 (IP2) was a research initiative led by the University of British Columbia. The goal of IP2 “was to ensure that the portion of society's recorded memory digitally produced in dynamic, experiential, and interactive systems in the course of artistic, scientific and e-government activities can be created in accurate and reliable form, and maintained and preserved in authentic form, both in the short and the long term, for the use of those who created it and of society at large, regardless of digital technology obsolescence and media fragility” (Duranti 2007, p.115). The GCRC led two IP2 studies in the Science Focus: i) a Case Study of the Cybercartographic Atlas of Antarctica (Lauriault and Hackett 2005) and ii) a General Study examining the preservation practices of scientific data portals (Lauriault and Craig 2007; Lauriault, Craig, Pulsifer & Taylor 2008). The following sections provide a summary of the results of these studies, with a focus on the challenges faced, possible strategies for meeting these challenges, and research opportunities emerging from the studies.
2.9 Multidisciplinary Archival Research
2.9.1 Case Study 06 (CS06) Cybercartographic Atlas of Antarctica (CAA) The Cybercartographic Atlas of Antarctica (CAA) research project was designed to contribute to developing the theory and practice of cybercartography and emerging forms of geographic information processing. The first phase of the project (completed fall 2007) resulted in the development of a series of chapters or modules that examine and explore topics of interest to both Atlas users and researchers alike. The project was a collaborative effort developed as a project under the Scientific Committee on Antarctic Research’s (SCAR) geographic information program. The model used to develop the CAA includes the use of multimedia cartography and distributed data sources. For more information, the reader is directed to https://gcrc.carleton.ca/confluence/x/XAc.
2.9.1.1 CS06 Research Methodology
The primary information-gathering tool for CS06 was the InterPARES 2 case study questionnaire, comprising 23 questions (InterPARES 2 2003). Two sets of semi-structured interviews at two different development stages of the CAA Project were conducted to answer questions of interest to the archival community (Lauriault and Hackett 2005). Concurrently, these interviews helped GCRC researchers make explicit some implicit, tacit assumptions in the production of the CAA. Reflections on production processes identified both shortcomings and strengths in archival terms.
2.9.1.2 CS06 Observations
The archival research inquiry revealed a number of issues that may prevent effective preservation and archiving. First, much of the data used in the project did not have persistent identifiers. While data records used to create maps and other types of representation often had an identification number or ‘primary key’, there was no long-term strategy to ensure that this value would not change over time; during a database migration, for example. Lack of a persistent identifier may introduce ambiguity that prevents subsequent users or archivists from effectively establishing data
provenance. Second, custom software was developed to create the atlas. This software uses a markup language combined with processing algorithms to define and integrate data resources. While this software is open source and uses many standard approaches (e.g., XML schema), the software system as a whole was not comprehensively documented. Thus, while archivists and future generations may be able to understand the components of the atlas, establishing an operational version of the CAA may be difficult. GCRC researchers realized the importance of detailed documentation, including the possibility of creating training courses and of capturing processes in the CAA's online forums and wikis. However, dedicating resources to these activities in a research environment that places priority on peer-reviewed publication presents a challenge. And while the primary funding for the project stipulated a requirement for preservation, no guidelines existed for how this was to be done and, more importantly, no funding was available to do it! To mitigate the impact of limited documentation, the researchers used well-documented open standards, established software development methods, a source code versioning system (to document the evolution of the software) and open source licensing (allowing others to contribute to documentation). In considering source data preservation and archiving, the origin and provenance of the data are important considerations. The Cybercartographic Atlas of Antarctica (CAA) was endorsed by the Scientific Committee on Antarctic Research (SCAR), the most reputable body in Antarctic science. The Atlas uses data from authoritative and reputable scientific organizations, accompanied by standards-compliant metadata. Data reliability, in archival terms, is therefore assured by the quality of the base data used and by the methods applied by content creators.
These data were provided in formats based on open standards, and so the risk of effectively ‘locking’ future generations out of the data is small. Some concerns remain with respect to the use of proprietary formats for some multimedia objects. While the Nunaliit Framework has an open source license and generates high-performance, standards-based Web applications, some of the multimedia content in the Atlas is encoded in proprietary formats. While the Nunaliit-generated application may outlive the multimedia in terms of preservation, it is recognized that practical decisions, such as the use of video content compressed with proprietary software, may at times result in information loss.
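One way to mitigate the persistent-identifier problem noted earlier in this case study is to derive an identifier from a record's content rather than from a database-assigned primary key, so that the value survives migrations between systems. A minimal sketch (the record layout here is hypothetical, not the CAA's actual data model):

```python
import hashlib
import json

def persistent_id(record):
    """Derive a stable, content-based identifier for a data record.
    The record is serialized canonically (sorted keys, fixed separators)
    so the same content always yields the same identifier, regardless of
    how a particular database happens to store or order the fields."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

# Hypothetical observation record; field names are illustrative only.
record = {"station": "ANT-042", "lat": -77.85, "lon": 166.67, "variable": "air_temperature"}
pid = persistent_id(record)
```

Content-derived identifiers are only one option; registry-based schemes such as DOIs serve the same goal. Either way, the archival requirement is that the identifier be managed independently of any single storage system.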
The CAA project included standard intellectual property concerns, although the terms of the Antarctic Treaty system allow much of the data used in the creation of the CAA to be used at little or no cost. The CAA also follows typical license agreements, use rights to objects and data, and copyright, while its software, created at the GCRC, is distributed under an open BSD license. The atlas itself includes caveats and disclaimers (e.g., the CAA is intended for information, not navigation, purposes), and the project must adhere to the requirements of the funding agency, research clearances and the Nunaliit License (Nunaliit 2006). The CAA production process was considered adequate to meet the challenge of technological obsolescence. The use of open source software was thought by archivists to make the CAA more sustainable than if proprietary products were being used. If, for example, a popular open source software project is discontinued, the source code may still be available for use in preservation and archiving activities; moreover, in such rare cases, communities typically emerge to develop backwards compatibility that supports access to legacy data. Concurrently, complete and available documentation of proprietary formats is also considered important. The use of XML (an open standard) for the content modules should make the CAA easily translatable (via new compilers) into future markup languages. The CAA also adheres to other open standards such as the OGC interoperability specifications (2006) and the ISO 19115 Geomatics standard (2003). Although a strong foundation for preservation of the CAA exists, effort is required to ‘package’ these elements in a way that would promote preservation and archiving. While the components of the CAA may be suitable for archiving, a ‘map’ of the project as a whole does not yet exist.
Such a map is required to document the various components of the CAA and the relationships between and among these components. Working with archivists was beneficial to the CAA production process in terms of considering preservation at the point of creation. Preservation issues were considered early in the development process, thus reducing cost and disruptions in development. The case study revealed that the CAA development processes were adequate for preservation and archival purposes. The focus on interoperability, adherence to open standards, documentation through metadata creation, use of professional software development practices, and establishment of data quality standards are all strong features of the atlas in terms of its suitability for preservation and archiving. See Pulsifer et al. (2005, 2008) for additional theoretical and technical details related to the production of the CAA. The InterPARES 2 project also benefited by gaining an increased understanding of cybercartography and collaborative science, practices and processes. InterPARES 2 researchers also learned that the fields of geomatics and producers of scientific data have very rigorous metadata descriptions, excellent standards, and professional data gathering and maintenance procedures that can be used as models for records created in the arts and in e-government, which are other IP2 focus priorities.
2.9.1.3 CAA Preservation Challenges
The greatest challenges limiting the long-term preservation of the CAA are neither technological nor procedural. The greatest roadblock is simply the fact that no Canadian archival institution is currently in a position to ingest the CAA. This is a major problem not only for the Atlas but for the preservation of similar digital products in Canada. The GCRC is holding ongoing discussions with members of the Data Library at Carleton University to attempt to archive the CAA, as required by its funder. The Data Library potentially has the policy and human resource capacity to archive the CAA, but neither the technology nor the mandate to do so. Discussions are ongoing with Library and Archives Canada (LAC), the National Research Council and GeoConnections. The CAA project is now complete (CAA 2009), and to date there is no explicit transfer plan in place. The GCRC is fortunate to be located in the nation's capital, as it has ready access to Canada's top officials in key organizations that could assist with resolving this problem, but as yet no obvious solutions have presented themselves. The GCRC awaits the creation of an Institutional Repository (IR) at Carleton University that can ingest more than text-based research material, the creation of a cartographic and geospatial data Trusted Digital Repository or data archive, and/or the development by LAC of the capabilities and mandate to ingest the output of publicly funded research on complex digital mapping in Canada.
2.9.2 General Study 10 (GS10) Preservation Practices of Scientific Data Portals
Geospatial and science data are increasingly being discovered and accessed in data portals (i.e., data repositories, clearinghouses, catalogues,
The Preservation and Archiving of Geospatial Digital Data 39
archives, geo-libraries and directories). In this context, a portal can be defined as a user interface that acts as a starting point for finding and accessing geospatial and scientific data. Portals can provide all or some of the following services: search and retrieval of data, item descriptions, display services, data processing, a platform to share models and simulations, and the collection and maintenance of data. Much but not all of the data derived from portals are raw in nature and require the user to interpret, analyze and/or manipulate them. The reasons for their creation are one-stop shopping, distributed responsibility over data sets, discoverability, and reduction in cost, as data are stored once and used many times (Lauriault 2003). Data portals are the technical embodiments of data-sharing policies. Individuals within organizations, research projects, or scientific collaborations register their data holdings in the portal via an online form organized according to a metadata standard, and then choose to make their data available for free, sale, viewing or downloading (Lauriault 2003). Metadata standards "establish the terms and definitions to provide a consistent means to describe the quality and characteristics of geospatial data" (Tosta and Domaratz 1997, p.22), and the ISO 19115 metadata standard (ISO 2003) has become an international standard in the field of geomatics. Thus, portals and the data resources they connect can be seen as collective geospatial information artifacts. The GS10 study examined portals to reveal issues related to their preservation and archiving.
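The descriptive fields that such a metadata standard mandates can be sketched informally. The record below is a hypothetical, simplified illustration in Python of the kind of information an ISO 19115-style description carries (the field names are plain-English labels, not the normative ISO element names), together with the sort of completeness check a portal might apply before indexing a data set:

```python
# A minimal sketch of the descriptive fields an ISO 19115-style metadata
# record carries. Field names are illustrative labels, not the normative
# ISO 19115 element names, and the data set described is invented.
dataset_metadata = {
    "title": "Soil Landscapes of Canada, 1:1M",   # hypothetical data set
    "abstract": "Generalized soil landscape polygons for Canada.",
    "topic_category": "geoscientificInformation",
    "spatial_extent": {"west": -141.0, "east": -52.6,
                       "south": 41.7, "north": 83.1},
    "reference_system": "EPSG:4326",
    "lineage": "Digitized from 1:1M source maps; generalized in 1995.",
    "point_of_contact": "data custodian name and address",
    "distribution_format": "GML",
    "date_stamp": "2009-04-01",
}

def is_discoverable(record):
    """A portal-style completeness check: the fields a search interface
    minimally needs in order to index and describe a data set."""
    required = ("title", "abstract", "spatial_extent", "date_stamp")
    return all(record.get(k) for k in required)

print(is_discoverable(dataset_metadata))  # prints True
```

A registration form on a real portal would validate many more elements; the point here is only that a structured, standard-conforming description is what makes a holding discoverable at all.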
2.9.2.1 GS10 Research Methodology
The GS10 study included an extensive literature review of publications from national and international scientific organizations, government and research funding bodies, together with empirical evidence from a selection of IP2 Case Studies and 32 scientific data portals, most of which included geospatial data (Lauriault and Craig 2007; Lauriault, Craig, Pulsifer & Taylor 2008). A GS10 survey was undertaken to collect information about actual practices, standards and protocols (Lauriault and Craig 2007).
2.9.2.2 GS10 Observations
The portals selected pertained to different communities of practice in geomatics and other sciences that are thematically heterogeneous, and each adheres to that community's specific methodologies, tools, technologies, practices and norms. As expected, portals are rich repositories of data and
40 T. P. Lauriault et al.
information that serve the needs of many types of users. The architecture of data portals varies: some are a single enterprise-sponsored portal (like a national library), a network of enterprises (like a federation of libraries) or a loose network connected by protocols (like the Web) (NRC 1999). Distributed data portals have datasets described according to a given standard; when a request is sent to them by a given site, a search agent executes the search to access the data or render them into a map or some other form. GRID¹ portals and those using Web Map Services are examples of these. A collection-level catalogue/portal identifies a data custodian's holdings and uses them to direct searches (e.g. Z39.50, Ocean Biogeographic Information System – Spatial Ecological Analysis of Megavertebrate Populations). A unified catalogue exists in one place: data custodians submit metadata for each data set to a central site, which makes them available for searching, and the record directs the user to the data set (e.g. GeoConnections Discovery Portal). Digital collections/portals can be housed in a single physical location (e.g. Statistics Canada), or they may be virtual (e.g. Earth Systems GRID), housed in a set of physical locations and linked electronically to create a single, coherent collection (e.g. Global Change Master Directory, International Comprehensive Ocean Atmospheric Dataset). The distinction between centralized, distributed or unified portals may have funding, policy and preservation implications. Data collections may also differ because of the unique policies, goals and structure of the funding agencies. There are three functional data collection/portal categories: research data collections; resource or community data collections; and reference data collections (AIP 2007). These are not rigid categories. Research data collection portals contain the results of one or more focused research projects and data that are subject to limited processing.
Data types are specialized and may or may not conform to community standards, adhere to metadata standards, or follow content access policies. Data collections vary in size but are intended to serve a specific scientific group, often limited to immediate participants. These collections are supported by relatively small budgets, often through research grants funding a specific project, and therefore do not have preservation as a priority (e.g., Indiana University Bio Archive, National Virtual Observatory (NVO)). Resource or community data collections serve a single science, geomatics or engineering community.
¹ Grid computing refers to the automated sharing and coordination of the collective processing power of many widely scattered, robust computers that are not normally centrally controlled, and that are subject to open standards.
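The distributed request pattern described above can be illustrated with the OGC Web Map Service interface the text mentions. The sketch below builds a WMS 1.1.1 GetMap request URL; the endpoint and layer name are hypothetical, but the query parameters follow the published WMS key-value convention:

```python
from urllib.parse import urlencode

# Sketch of the request a distributed portal issues to render remote data
# as a map image via OGC WMS. Endpoint and layer name are hypothetical.
endpoint = "https://example.org/wms"  # hypothetical WMS server
params = {
    "SERVICE": "WMS",
    "VERSION": "1.1.1",
    "REQUEST": "GetMap",
    "LAYERS": "antarctica:coastline",  # hypothetical layer identifier
    "STYLES": "",
    "SRS": "EPSG:4326",
    "BBOX": "-180,-90,180,-60",        # lon/lat bounding box, minx,miny,maxx,maxy
    "WIDTH": "800",
    "HEIGHT": "400",
    "FORMAT": "image/png",
}
getmap_url = endpoint + "?" + urlencode(params)
print(getmap_url)
```

Because the interface is an open specification rather than a product, any conforming client can issue this request against any conforming server, which is precisely the interoperability property the chapter returns to in its archiving argument.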
These digital collections are often large enough to establish community-level standards, either by selecting from among pre-existing standards or by bringing the community together to develop new standards where they are absent or inadequate. The CanCore Learnware metadata standard is an example of this type of community standard. The budgets for resource or community data collections are moderate and often supported by a government agency. Preservation is contingent on departmental or agency priorities and budgets (e.g. Canadian Institute for Health Information (CIHI), Southern California Earthquake Center (SCEC), National Geophysical Data Center (NGDC - NOAA)). Reference data collections are intended to serve large segments of the scientific, geomatics and education community. These digital collections are broad in scope and serve diverse user communities, including scientists, students, policy makers and educators from many disciplines, institutions and geographical settings. Normally they have well-established and comprehensive standards, which often become either de jure or de facto standards, such as the geomatics ISO 19115 metadata standards. Budgets supporting these are often large and come from multiple sources in the form of direct, long-term support, and the expectation is that these collections will be maintained indefinitely (e.g. Canadian Geospatial Data Infrastructure (CGDI), Global Change Master Directory – Global Change Data Center).
2.9.2.3 GS10 Conclusions
There are three types of issues relating to portals and data quality: i) those related to the portal's operation and its design, management and long-term viability; ii) those related to the accuracy of the individual datum and data sets; and iii) those related to the relationship between the portal, its data and services, and the individual or corporate user – essentially those issues that emerge from a history of interaction that builds trust and comfort with the user. The issues related to the portal itself are those linked to maintaining an authentic memory, especially of the sources of the data, their management or changes over time, and their connections to contributors or sources. Building sites and services that continue to be what they purport to be, and whose changes and transitions over time are visible and knowable to a user, builds conditions of trust. The InterPARES 1 project developed benchmarks that could be used by portals to ensure that their data continue to be authentic over time.
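One concrete, low-level control that underpins such authenticity benchmarks is fixity checking: recording a checksum at ingest and recomputing it at each audit to show that the stored bitstream has not changed. The sketch below is illustrative of that general practice, not a prescription taken from the InterPARES 1 benchmarks themselves:

```python
import hashlib

def fixity(path, algorithm="sha256", chunk_size=65536):
    """Compute a checksum (a "fixity value") for a stored file, reading
    in chunks so that large geospatial files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Record the value at ingest, then recompute during each audit:
#   manifest[path] = fixity(path)
#   ... later ...
#   assert fixity(path) == manifest[path], "bitstream has changed"
```

Fixity demonstrates only bit-level integrity; the archival notion of authenticity also covers provenance, identity and context, which no checksum alone can establish.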
Science and geomatics are heterogeneous domains, and each field and subfield has its own culture, methods, quality measures and ways of explaining what it does. Formal ontologies are an emerging method used to help mediate the myriad metadata standards and to facilitate the production of meaningful ways to represent the world and preserve the data. Data portals reflect the policies, funding agencies and technologies chosen by the organizations that create and manage them. Organizational, technological, metadata and data quality considerations affect appraisal decisions and pose challenges for archivists. Science is a collaborative endeavour premised on the notion of knowledge sharing, dissemination, reproducibility, verification, and the possibility that new methods will yield new results from old data. Therefore, there is an argument to be made that publicly funded collections of data should be made available to the citizens who paid for them, and that they should be made available to future generations for the advancement of knowledge. The IP2 research showed that interoperability is a problem with the rapidly increasing number of digital databases that need to interact if the challenges of knowledge integration are to be met. The Cybercartographic Atlas of Antarctica was faced with the challenge of using information from different databases in different countries and, in order to do this, adopted an open source and open standards approach using OGC specifications. This decision was taken primarily for production reasons but has had beneficial effects in archiving and preservation terms, as it helps overcome the problem of technological obsolescence. The IP2 case studies demonstrate that a lack of interoperability can lead to data that cannot be archived in the form the creator had intended.
Indeed, it can be argued that interoperability is a key element in archiving all digital data, and that an open source, open standards and specifications approach should be a major facet of any archival strategy. For the scientific and geomatics disciplines, trust will continue to rest on specific norms of scientific work. Trusted repositories, whose data are kept reliable, accurate and authentic over time, will need to be established, managed and funded on a continuing basis. The problems lie on three levels: organizational stability, data and metadata management processes, and technological handshaking across generations. Established archival repositories that are mandated (and funded) to guarantee the continuing availability of the scientific data records and information that support administrative, legal and historical research are needed. Although there are digital repositories for social science data, true digital scientific data archives are few and far between. The IP2 General Study on data portals demonstrated that there are numerous excellent initiatives in place to make data discoverable and accessible. However, few of these data portals archive their data. The few portals that are government funded in the US and simultaneously housed in government departments do have preservation as a mandate or are considered to be government archives, but most portals do not have this type of financial or institutional stability. At particular risk are the repositories that are distributed and leave issues of data quality to the data custodians or creators. Therefore, much government-funded science is not enveloped in any data preservation or archiving process. This is quite troubling, considering the investment taxpayers have made in these endeavours, let alone the loss of knowledge dissemination and knowledge-building opportunities. All stakeholders, including the scientists who create the information, research managers, major user groups and of course the archivists, should be involved in the appraisal decisions on what is to be archived and by whom. This appraisal should be an ongoing process from the point of creation and is best carried out in a project-specific fashion, in collaboration with those most knowledgeable about the data. It is recommended that archivists build on existing data portals and extend these activities with archival policies, techniques and technologies. These data have already been appraised as being worthy, or else they would not be in the portals. Also, portal creators and maintainers need to consider seriously adding preservation to their mandate, since it is highly unlikely in the immediate future that an archive would be able to ingest these holdings. There is merit in having data preserved at their source where that is feasible.
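One pragmatic expression of this open-standards argument is exporting holdings to open, self-describing text formats that any future software can parse. The sketch below writes a feature to GeoJSON using only the Python standard library; the feature content is invented for illustration:

```python
import json

# Sketch: serialize a vector feature to GeoJSON, an open, self-describing
# text format, as one hedge against format obsolescence. The feature
# content here is invented for illustration.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-75.69, 45.42]},
    "properties": {"name": "Geomatics and Cartographic Research Centre"},
}
collection = {"type": "FeatureCollection", "features": [feature]}

# sort_keys and indentation make the archival copy stable and human-readable
archival_copy = json.dumps(collection, indent=2, sort_keys=True)
```

Because the result is plain structured text with a published grammar, it can be read without the software that produced it, which is the essence of the interoperability claim made above.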
2.10 What is being done?

There are some promising international initiatives, particularly in the European Union (e.g. Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval (CASPAR), Digital Repository Infrastructure Vision for European Research (DRIVER), the UK Data Archive (UKDA) and the Data Archiving and Networked Services (DANS) in the Netherlands) and in the US (e.g. the Cyberinfrastructure Project and the National Geospatial Digital Archive), and a number of thematic initiatives (e.g., the Sierra Nevada ecosystem Digital Spatial Data Archive, the European Digital
Archive of Soil Maps and the LSU Geographic Information System (GIS) Clearinghouse Cooperative (LGCC)). As previously discussed, a number of portals, data archives and GRID computing systems were examined as part of the GS10 Portal Study. The Open Geospatial Consortium (OGC) Data Preservation Working Group was created in December 2006 to address technical and institutional challenges posed by data preservation, to interface with other OGC working groups that address technical areas affected by the data preservation problem, and to engage in outreach and communication with the preservation and archival information community. This is a very promising initiative, as the OGC is dedicated to interoperability, open standards and open specifications that help overcome many of the issues of platform dependency. The OGC has also done excellent work on the production of the de facto standards of Internet mapping internationally, and this working group is dedicated to developing prototypes and testbeds with software vendors. The CODATA Working Group on Archiving Scientific Data has been holding symposia and workshops on the topic, and the Canadian National Committee for CODATA has been active in documenting and reporting scientific data activities. The Preserving Access to Digital Information (PADI) initiative based in Australia provides excellent practical resources for cartographers and data producers who wish to gain practical information on how to go about the preservation of their resources. In the US, the Earth Institute at Columbia University portal for Geospatial Electronic Records also includes a number of recommendations regarding the management and preservation of geospatial data. Finally, the International Council for Scientific and Technical Information (ICSTI) annual conference Managing Data for Science will be hosted in Ottawa in June 2009 and will focus on issues of data access and preservation.
In Canada there is much discussion but to date very little concrete action. GeoConnections is the Government of Canada agency mandated to create the Canadian Geospatial Data Infrastructure (CGDI). GeoConnections conducted a study on Archiving, Management and Preservation of Geospatial Data, which provided a well-rounded analysis of preservation issues in the field of cartography, such as technological obsolescence, formats, storage technologies, temporal management and metadata. The study also provides a list of technological preservation solutions with their associated advantages and disadvantages, and a list of proposed institutional and national actions. A number of studies, reports and committees have made high-level recommendations and provided strategies for improving the
archiving of digital data in Canada, as they all recognize the poor state of Canada's digital data resources. The SSHRC National Data Archive Consultation Report discussed the preservation of data created in the course of publicly funded research projects, identified important institutions, infrastructures, management frameworks and data creators, and called for the creation of a national research data archive. The report Toward a National Digital Information Strategy: Mapping the Current Situation in Canada indicates that "the stewardship of digital information produced in Canada is disparate and uncoordinated" and that in "the area of digital preservation, which involves extremely complex processes at both the organizational and technical levels, comprehensive strategies are not yet being employed. Many feel that much of the digital information being created today will be lost forever." The Final Report of the National Consultation on Access to Scientific Data, developed in partnership with the National Research Council Canada (NRC), the Canada Foundation for Innovation (CFI), the Canadian Institutes of Health Research and NSERC, expressed concern about "the loss of data, both as national assets and definitive longitudinal baselines for the measurement of changes over time." This report also provides a comprehensive list of recommendations, covering ethics, copyright, human resources and education, reward structures and resources, toward the creation of a national digital data strategy and archive. In December 2006, Library and Archives Canada hosted a National Summit on a Canadian Digital Information Strategy. The challenges of the new Web 2.0 social computing environment, open access, interoperability and licensing, among numerous other topics, were discussed. While progress is being made and many discussions are under way, the implementation of concrete solutions lags far behind the rhetoric. The problem is not confined to Canada.
Few nations are developing comprehensive digital data (i.e., science, research and geospatial) strategies, let alone preservation strategies. National mapping organizations (NMOs) and governmental geospatial data producers have a head start over the private, academic and not-for-profit sectors, since archival accessioning rules are often in place. There is no guarantee, however, as seen with the CLI example, that governments will preserve these artifacts; nevertheless, a framework exists and some resources to take action are available. Research cartographers and data producers may be able to rely on institutional repositories (IRs), provided of course that these exist in their home countries. Such repositories must, however, be able to ingest more than text-based research, and the custodians must be willing and have the resources
(i.e., technical, skills, financial and mandate) to carry out this function. An IR is "a specific kind of digital repository for collecting, preserving, and disseminating – in digital form – the intellectual output of an institution, particularly a research institution" (Glick 2009). The Registry of Open Access Repositories keeps a list of open access IRs (ROAR 2007). IRs are, however, not archives. An "archive is normally understood as being a trusted steward and repository. But note, it is not just for objects that are digital – it may acquire these and many are actually doing so – but its status as an 'archive' and the trust reposed in its work is largely related to its mission, organizational transparency, and of course, that niggling issue of long-term viability" (Craig 2009). Trusted digital repositories (TDRs) are works in progress, and none have yet been certified; these are IRs and archives which could be seen as "an archives of digital objects only" (Craig 2009). A TDR is any kind of digital repository that meets the requirements of the Trusted Repository Audit Checklist (Centre for Research Libraries 2008). The audit checklist "brings together existing best practice and thought about the organizational and technical infrastructure required to be considered trustworthy and capable of certification as trustworthy. It establishes a baseline definition of a trustworthy digital repository and lays out the components that must be considered and evaluated as a part of that determination" (Glick 2009). Additional information can be found in the Research Libraries Group (RLG) and Online Computer Library Center (OCLC) Trusted Digital Repositories: Attributes and Responsibilities (OCLC 2002), or in the RLG and US National Archives and Records Administration (NARA) Audit Checklist for Certifying Digital Repositories (2005).
Finally, around the world there are some map libraries, such as McGlamery's networked Map and Geographic Information Center at the University of Connecticut, that ingest digital maps, and cartographers and geospatial data producers are encouraged to spearhead initiatives of their own and to begin a conversation with digital map librarians, archivists, content creators and the curators of IRs. Such an approach will be challenging and will require vision and leadership, but it is better than the current state of affairs, in which our map, data and technological system heritage is disappearing, leaving a large 'blank spot in history' not unlike the one found in the Jedi Archive by Obi-Wan Kenobi (Ketelaar 2002)! As Information and Communication Technology evolves and new forms of information exchange and computing emerge, new challenges for preservation and archiving will materialize. Of current interest are the implications of using distributed systems such as the networks established by
Spatial Data Infrastructure programs, GRID computing systems and, a more recent development, cloud computing. Cloud computing is a nebulous term used to describe any number of distributed models being established by mainstream industry (as opposed to GRID computing in the sciences). Some key issues and concerns in relation to GRID and cloud computing are i) establishing information provenance in a highly distributed environment, and ii) establishing a discrete archival record that can be captured and managed over time. Doorn and Tjalsma identify the challenges faced: "A descriptive way to explain computational grids is by analogy with the electric power grid. The latter provides us with instant access to power, which we use in many different ways without any thought as to the source of that power. A computational grid is expected to function in a similar manner. The end user will have no knowledge of what resource they used to process their data and, in some cases, will not know where the actual data came from. Their only interest will be in the results they can obtain by using the resource. Today, computational grids are being created to provide accessible, dependable, consistent and affordable access to high performance computers and to databases and people across the world. It is anticipated that these new grids will become as influential and pervasive as their electrical counterpart." (Doorn and Tjalsma 2007, p.16) While these trends present challenges, they simultaneously provide research opportunities. Given the nature of the issue, it is clear that solutions will require multi- and interdisciplinary collaboration that can address the technical, cartographic, preservation, archival and larger social issues implicated in preserving these new information and knowledge phenomena.
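The provenance concern identified above can be made concrete with a small sketch: each processing step in a distributed workflow appends a record of what was done, by which resource, to which inputs, so that the trail can travel with the derived data set into an archive. All names and field choices here are invented for illustration:

```python
from datetime import datetime, timezone

# Sketch of a provenance trail for a distributed (GRID/cloud) workflow:
# every step records the action, the resource that performed it, and the
# inputs it consumed. Field names and identifiers are invented.
def record_step(trail, action, resource, inputs):
    trail.append({
        "action": action,
        "resource": resource,        # which node or service did the work
        "inputs": list(inputs),      # identifiers of the source data
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return trail

trail = []
record_step(trail, "reproject to EPSG:3031", "grid-node-17", ["tile-042"])
record_step(trail, "mosaic", "cloud-worker-03", ["tile-042", "tile-043"])
# The completed trail accompanies the derived data set so that an archive
# can later establish where the data came from and how they were transformed.
```

In a real grid environment the user may never know which resource processed the data, which is exactly why the trail must be generated by the infrastructure itself rather than recorded by hand.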
2.11 Conclusion

If we acknowledge that "remembering (or re-creating) the past through historical research in archival records is not simply the retrieval of stored information, but the putting together of a claim about past states of affairs by means of a framework of shared cultural understanding" (Schwartz and Cook 2002, p.3), then as cartographers and geospatial data producers we need to ensure that our artifacts inform that cultural framework. As has been discussed in this chapter, much has been lost. Also, many are beginning to take seriously the fact that "archivists no longer have the luxury of waiting for thirty years to make appraisal decisions. Selection has to be made very near to, if not at, the time the record is created" (Sleeman 2004,
p.180). In other words, creators will need to work collaboratively with archivists, librarians and technology specialists to design cartographic artifacts that will stand the test of time, and to build them accordingly (Doorn and Tjalsma 2007; Kinniburgh 1981; Sleeman 2004; Schut 2000; Wilson and O'Neil 2009; Ahlgren and McDonald 1981). The development process of cybercartographic atlases such as the Antarctic Atlas exemplifies this practice. On the positive side, existing efforts can be built upon, such as the existing science and geospatial data portals, where appraisal, cataloguing and issues of data quality have already been addressed. The next step will be to transform these into trusted digital data repositories. Concurrently, we need national digital strategies to identify "future research needs and the establishment of mechanisms that allow stakeholders to consider the potential gains from cooperation in planning the data resources required to meet these needs" (Doorn and Tjalsma 2007, p.13). Finally, "records are not only a reflection of realities as perceived by the 'archiver'. They constitute these realities. And they exclude other realities" (Ketelaar 2002, pp.222-223). The map is not the territory; it creates a record of the territory and it occasions it. It is not just a recording: it constitutes the event. Fortunately, many of the 20th century's digital cartographers are still alive and can therefore be part of the preservation process: they can discuss the history of the mapmaking process behind their digital artifacts, and they can speak for the technology that created them (Kinniburgh 1981). Solving the problems of preservation of, and access to, the remarkable explosion of digital cartography and the emerging variety and volume of geospatial products is one of the greatest challenges of the 21st century. It is hoped that this book will make a significant contribution by cartographers to meeting those challenges.
2.12 Acknowledgements

Much of the content of this chapter is the result of two Canada Social Sciences and Humanities Research Council (SSHRC) funded research projects: InterPARES 2 at the University of British Columbia and Cybercartography and the New Economy at Carleton University.
2.13 References

Agriculture and Agri-Food Canada. 2009. Canadian Soil Information System (CanSIS). [Online] (Updated 27 November 2008) Available at: http://sis.agr.gc.ca/cansis/ [Accessed 1 April 2009].
Ahlgren, Dorothy & McDonald, John. 1981. The Archival Management of a Geographic Information System. Archivaria 13, pp.59-65.
American Institute of Physics (AIP). 2001. AIP Study of Multi-Institutional Collaborations: Final Report. Highlights and Project Documentations. [Online] Available at: http://www.aip.org/history/publications.html [Accessed 17 August 2007].
Ballaux, Bart. 2005. CS26 Most Satellite Mission: Preservation of Space Telescope Data Report. Vancouver: InterPARES 2.
Bushey, Jessica & Braun, Marta. 2005. Survey of Record-Keeping Practices of Photographers Using Digital Technology Final Report. Vancouver: InterPARES 2.
Caquard, Sebastien. 2009. Cybercartographic Atlas of Canadian Cinema. [Online] Available at: http://www.atlascine.org/iWeb/Site/atlasen.html [Accessed 1 April 2009].
Centre for Research Libraries. 2008. Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist. [Online] Available at: http://www.crl.edu/content.asp?l1=13&l2=58&l3=162&l4=91 [Accessed 1 April 2009].
Churchill, Robert R. 2004. Urban Cartography and the Mapping of Chicago. The Geographical Review, 94(1), pp.1-22.
CODATA. 2009. Canadian National Committee for CODATA. [Online] Available at: http://www.codata.org/canada/ [Accessed 1 April 2009].
CODATA. 2009. Preservation of and Access to Scientific and Technical Data in Developing Countries. [Online] Available at: http://www.codata.org/taskgroups/TGpreservation/index.html [Accessed 1 April 2009].
Cook, Terry. 1995. It's Ten O'clock, Do You Know Where Your Data Are? Technology Review, 98, pp.48-53.
Craig, Barbara. [email protected] 2009. IR or TDR vs Archive. [E-mail]. Message to Tracey P. Lauriault ([email protected]). Sent March 3 2009 10:50 AM.
Data Archiving and Networked Services (DANS). 2009. Home Page. [Online] (Updated 23 March 2009) Available at: http://www.dans.knaw.nl/en/ [Accessed 1 April 2009].
Doorn, Peter & Tjalsma, Heiko. 2007. Introduction: Archiving Research Data. Archival Science, 7, pp.1-20.
DRIVER (Digital Repository Infrastructure Vision for European Research). 2009. Digital Repository Infrastructure Vision for European Research Home Page.
[Online] (Updated 2009) Available at: http://www.driver-repository.eu/ [Accessed 1 April 2009].
Duranti, Luciana. 2007. Reflections on InterPARES. The InterPARES 2 Project (2002-2007): An Overview. Archivaria 64, pp.113-121.
Earth Institute at Columbia University. 2005. Center for International Earth Science Information Network (CIESIN). Geospatial Electronic Records web site. [Online] Available at: http://www.ciesin.columbia.edu/ger/ [Accessed 1 April 2009].
Ehrenberg, Ralph E. 1981. Administration of Cartographic Materials in the Library of Congress and National Archives of the United States. Archivaria 13, pp.23-29.
Environmental Protection Agency (EPA). 2009. Environmental Information Exchange Network & Grant Program Glossary. [Online] (Updated 25 September 2007) Available at: http://www.epa.gov/Networkg/glossary.html#G [Accessed 28 March 2009].
European archive on the soil maps of the World (EuDASM). 2009. Overview of EuDASM. [Online] Available at: http://eusoils.jrc.ec.europa.eu/esdb_archive/EuDASM/EUDASM.htm [Accessed 1 April 2009].
European Commission - Sixth Framework Programme. 2009. Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval (CASPAR) User Community. [Online] (Updated 2006) Available at: http://www.casparpreserves.eu/ [Accessed 1 April 2009].
GeoConnections. 2005. Archiving, Management and Preservation of Geospatial Data. [Internet] Ottawa: Government of Canada (Published 2005) Available at: http://www.geoconnections.org/publications/policyDocs/keyDocs/geospatial_data_mgt_summary_report_20050208_E.pdf [Accessed 17 January 2007].
Geomatics and Cartographic Research Centre (GCRC). 2009. Living Cybercartographic Atlas of Indigenous Perspectives and Knowledge. [Online] Available at: https://gcrc.carleton.ca/confluence/display/GCRCWEB/Living+Cybercartographic+Atlas+of+Indigenous+Perspectives+and+Knowledge [Accessed 1 April 2009].
Glick, Kevin. [email protected]. 2009. IR or TDR vs Archive. [E-mail]. Message to Tracey P. Lauriault ([email protected]). Sent March 3 2009 5:54 PM.
Gupta, A.; Ludascher, B. & Martone, M.E. 2002. Registering Scientific Information Sources for Semantic Mediation. Lecture Notes in Computer Science, no. 2503, pp.182-198.
Harvey, Ross. 2000. An Amnesiac Society? Keeping Digital Data for Use in the Future. LIANZA 2000 Conference, New Zealand.
Hughes, Thomas P. 2004. Human Built World: How to Think About Technology and Culture. Chicago: The University of Chicago Press.
International Council for Scientific and Technical Information (ICSTI). 2009. 2009 Conference Managing Scientific Data. [Online] Available at: http://www.icsti2009.org/02-program_e.shtml [Accessed 1 April 2009].
International Standards Organization (ISO). 2003. Geographic information – Metadata, ISO 19115. [Online] Available at: http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=26020 [Accessed 10 January 2006].
InterPARES 1. 2002. Authenticity Task Force Report. Vancouver: InterPARES 1. [Internet] Available at: http://interpares.org/display_file.cfm?doc=ip1_atf_report.pdf [Accessed 1 April 2009].
InterPARES 2. 2003. 23 Case Study Questions that the researchers should be able to answer at the completion of their investigation. [Online] Available at: http://interpares.org/display_file.cfm?doc=ip2_23_questions.pdf [Accessed September 2006].
InterPARES 2. 2006. International Research on Permanent Authentic Records in Electronic Systems Glossary. [Online] Available at: http://interpares.org/ip2/display_file.cfm?doc=ip2_glossary.pdf&CFID=137165&CFTOKEN=15668692 [Accessed September 2006].
InterPARES 2. 2009. International Research on Permanent Authentic Records in Electronic Systems Internet Site. [Online] Available at: http://interpares.org/ [Accessed 2 April 2009].
Ketelaar, Eric. 2002. Archival Temples, Archival Prisons: Modes of Power and Protection. Archival Science, 2, pp.221-238.
Kidd, Betty. 1981-1982. A Brief History of the National Map Collection at the Public Archives of Canada. Archivaria 13, pp.3-22.
Kinniburgh, Ian A.G. 1981. The Cartography of the Recent Past. Archivaria 13, pp.91-97.
Lauriault, Tracey P. 2003. A Geospatial Data Infrastructure is an Infrastructure for Sustainable Development in East Timor. Master's Thesis. Ottawa: Carleton University.
Lauriault, Tracey P. & Craig, Barbara. 2007. GS10 - Preservation Practices of Scientific Data Portals. [Online] Available at: http://interpares.org/ip2/ip2_case_studies.cfm?study=34 [Accessed 1 April 2009].
Lauriault, Tracey P. & Hackett, Yvette. 2005. CS06 Cybercartographic Atlas of Antarctica Case Study Final Report. [Online] Available at: http://interpares.org/ip2/ip2_case_studies.cfm?study=5 [Accessed October 2006].
Lauriault, Tracey P.; Caquard, Sebastien; Homuth, Christine & Taylor, D. R. Fraser. 2009. Pilot Atlas of the Risk of Homelessness. [Online] Available at: https://gcrc.carleton.ca/confluence/display/GCRCWEB/Pilot+Atlas+of+the+Risk+of+Homelessness [Accessed 1 April 2009].
Lauriault, Tracey P.; Craig, Barbara; Pulsifer, Peter L. & Taylor, D. R. Fraser. 2008. Today's Data are Part of Tomorrow's Research: Archival Issues in the Sciences. Archivaria 64, pp.123-181.
Library and Archives Canada (LAC). 2007. Toward a Canadian Digital Information Strategy: National Summit. Ottawa: Library and Archives Canada. [Internet] Available at: http://www.collectionscanada.ca/cdis/012033-601-e.html [Accessed 27 January 2007].
Library of Congress. 2009. Library of Congress Flickr pilot. [Online] Available at: http://www.flickr.com/photos/library_of_congress/collections/72157601355524315/ [Accessed 1 April 2009].
Longton, John. 2005. Record Keeping Practices of Composers Survey Report. [Online] Available at: http://www.interpares.org/ip2/ip2_general_studies.cfm?study=27 [Accessed 23 August 2007].
MacDonald, John & Shearer, Kathleen. 2005. Toward a National Digital Information Strategy: Mapping the Current Situation in Canada. Ottawa: Library and Archives Canada. [Internet] Available at: http://www.collectionscanada.gc.ca/obj/012033/f2/012033-300-e.pdf [Accessed 1 April 2009].
National Digital Information Infrastructure and Preservation Program (NDIIPP). 2007. Digital Preservation: The National Digital Information Infrastructure and Preservation Program. [Online] Available at: http://www.digitalpreservation.gov/index.html [Accessed 27 January 2007].
National Geospatial Data Archive (NGDA). 2009. Home Page. [Online] (Updated 2 March 2009) Available at: http://www.ngda.org/ [Accessed 1 April 2009].
National Library of Australia. 2009. Preserving Access to Digital Information (PADI). [Online] Available at: http://www.nla.gov.au/padi/about.html [Accessed 1 April 2009].
National Research Council (NRC). 1995.
Commission on Physical Sciences, Mathematics, and Applications, Preserving Scientific Data on Our Physical Universe: A New Strategy for Archiving the Nation's Scientific Information Resources [Online] Available a: http://www.nap.edu/catalog.php? record_id=4871 [Accessed 23 August 2007]. National Research Council (NRC). 1999. Spatial Information Resources, Distributed Geolibraries. [Online] Available at: http://www.nap.edu/openbook.php? isbn=0309065402 [Accessed 23 August 2007]. National Science Foundation (NSF). 2005. Report of the National Science Board: Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century. [Online] Available at: http://www.nsf.gov/pubs/2005/nsb0540/ [Accessed 23 August 2007]. National Science Foundation (NSF). 2008. NSF Wide Investment Cyberinfrastructure. [Online] (Updated July 10 2008) Available at: http://www.nsf.gov/news/priority_areas/cyberinfrastructure/index.jsp [Accessed April 1 2009]
The Preservation and Archiving of Geospatial Digital Data 53 Natural Resources Canada. 2006. GeoBase Unrestricted Use Licence Agreement. [Online] (Updated 2 may 2006) Available at: http://www.geobase.ca/geobase/en/licence.jsp [Accessed 1 April 2009]. Natural Resources Canada. 2009. Geogratis Splash Page. [Online] Available at: http://geogratis.cgdi.gc.ca/ [Accessed 1 April 2009]. Natural Resources Canada. 2009. The Atlas of Canada. [Online] (Updated 22 May 2007) Available at: http://atlas.nrcan.gc.ca/site/english/index.html [Accessed 1 April 2009]. Nunaliit.2006. The Nunaliit Cybercartographic Framework License. [Online], Available at: http://nunaliit.org/license.html [Accessed 1 April 2009]. Nunaliit.2009. The Nunaliit Cybercartographic Framework Internet Site. [Online] Available at: http://nunaliit.org/index.html [Accessed 1 April 2009]. Open Geospatial Consortium (OGC). 2006. OGC Specifications (Standards). [Online] Available at: http://www.opengeospatial.org/standards [Accessed September 2006] Open Geospatial Consortium (OGC). 2009. Data Preservation Working Group. [Online] Available at: http://www.opengeospatial.org/projects/groups/preservwg [Accessed April 1 2009]. Preston, Randy.
[email protected]. IR or TDR vs Archive. [E-Mail]. Message to Tracey P. Lauriault (
[email protected]). Sent March 2 2009 6:44 PM. Pulsifer, P.L., Parush, A., Lindgaard, G., & Taylor, D. R. F. (2005). The Development of the Cybercartographic Atlas of Antarctica. In D. R. F. Taylor (Ed.), Cybercartography: Theory and Practice (pp. 461-490). Amsterdam: Elsevier. Pulsifer, P. L., Caquard, S. & Taylor, D.R. F. 2007. Toward a New Generation of Community Atlases - The Cybercartographic Atlas of Antarctica. In Cartwright, W., M. Peterson and G. Gartner eds. Multimedia Cartography, Second Edition, Springer-Verlag. Ch.14. Pulsifer, P.L., Hayes, A., Fiset, J-P., & Taylor, D. R. F. (2008). An Open Source Development Framework in Support of Cartographic Integration. In Peterson, M. (Ed.). International Perspectives on Maps and the Internet (pp.165-185). Berlin: Springer. Reagan, Brad. 2006. The Digital Ice Age: The documents of our time are being recorded as bits and bytes with no guarantee of future readability. As technologies change, we may find our files frozen in forgotten formats. Will an entire era of human history be lost? (Popular Mechanics) [internet] (Published 2006) Available at: http://www.popularmechanics.com/technology/industry/4201645.html [Accessed April 1 2009]. Records Library Group (RLG) and the Online Computer Library Center (OCLC). 2002. Trusted Digital Repositories: Attributes and Responsibilities. An RLGOCLC Report. (OCLC Pulbication ) [Internet] Available at: http://www.oclc.org/programs/ourwork/past/trustedrep/repositories.pdf [Accessed April 1 2009].
54 T. P. Lauriault e t al. Records Library Group (RLG) and the US National Archives and Records Administration (NARA). 2005. Audit Checklist for Certifying Digital Repositories (OCLC RLG) [Internet] Available at http://worldcat.org/arcviewer/ 1/OCC/2007/08/08/0000070511/viewer/file2416.pdf [Accessed April 1 2009]. Registry of Open Access Repositories (ROAR). 2007. Home Page. [Online] Available at: http://roar.eprints.org/index.php [Accessed April 1 2009]. Schut, Peter. 2000. Back from the Brink: the story of the remarkable resurrection of the Canada Land Inventory data. [Online] Available at: http://www.igs.net/ ~schut/cli.html [Accessed 27 March 27 2009]. Schwartz, Joan M. & Cook, Terry. 2002. Archives, Records, and Power: The Making of Modern Memory. Archival Science, 2, pp.1-19. Science Commons. 2009. The Science Commons Page. [Online] Available at: http://sciencecommons.org/ [Accessed 1 April 2009]. Sierra Nevada ecosystem Project (SNEP) Digital Spatial Data Archive. 2009. Home Page. [Online] Available at: http://ceres.ca.gov/snep/data.html [Accessed April 1 2009]. Sleeman, Patricia. 2004. It’s Public Knowledge: The National Digital Archive of Datasets. Archivaria 58, pp.173-200. Social Science and Humanities Research Council (SSHRC). 2001. National Data Archive Consultation Report, Phase 1: Needs Assessment Report. (SSHRC Report) [Internet] Ottawa, Available at: http://www.sshrc.ca/site/aboutcrsh/publications/da_phase1_e.pdf [Accessed April 1 2009]. Social Science and Humanities Research Council (SSHRC). 2002. Final Report of the SSHRC National Consultation on Research Data Archiving, Building Infrastructure for Access to and Preservation of Research Data. [Online] Available at: http://www.sshrc.ca/web/about/publications/da_finalreport_e.pdf [Accessed 23 August 2007]. Social Science and Humanities Research Council (SSHRC). 2002. Research Data Archiving Policy. [Online] Available at: http://www.sshrc.ca/web/apply/policies/edata_e.asp [Accessed 23 August 2007]. 
Strong, David F. & Leach, Peter B. 2007. The Final Report of the National Consultation on Access to Scientific Data, [Online] Available at: http://ncasrdcnadrs.scitech.gc.ca/NCASRDReport_e.pdf [Accessed 5 January 2007]. Taylor, D. R. F. 1997. Maps and Mapping in the Information Era, Keynote address at 18th ICA Conference, Stockholm, Sweden. In Ottoson, L. ed., Proceedings, 1, Swedish Cartographic Society. Taylor, D. R. Fraser, Lauriault, Tracey P. & Peter L. Pulsifer. 2005. Preserving and Adding Value to Scientific Data: The Cybercartographic Atlas of Antarctica. In PV2005: Ensuring Long-Term Preservation and Adding Value to Scientific Technical Data. Edinburgh. Taylor, D.R. Fraser. 2003. The Concept of Cybercartography. In Peterson, M. ed. Maps and the Inernet, Elsevier. pp.413-418.
The Preservation and Archiving of Geospatial Digital Data 55 Taylor, D. R. F., Pulsifer, Peter L.and Lauriault, Tracey P. 2005. Preserving and Adding Value to Scientific Data: The Cybercartographic Atlas of Antarctica. PV2005, Ensuring Long-term Preservation and Adding Value to Scientific and Technical Data. Edinburgh, Scotland. Tosta, Nancy & Domaratz, Michael. 1997. The U.S. National Spatial Data Infrastructure. In Geographic Information Research: Bridging the Atlantic, ed. Massimo C. Craglia and Helen Couclelis, London, 1997. UK Data Archive (UKDA). 2009. About Page. [Online] (Updated April 1 2009) Available at: http://www.data-archive.ac.uk/about/about.asp [Accessed April 1 2009]. Visible Past initiative. 2009. Home Page. [Online] Available at: http://www.visiblepast.net/home/ [Accessed 1 April 2009]. Warren Mills, Jacqueline; Curtis, Andrew; Pine, John C.; Kennedy, Barrett; Jones, Farrell; Ramani, Ramesh & Bausch, Douglas. 2008. The clearinghouse concept: a model for geospatial data centralization and dissemination in a disaster. Disasters, 32(3), pp.467-479. Waters, Donald & Garrett, John. 1996. Preserving Digital Information, Report of the Task Force on Archiving of Digital Information. [Online] Available at: http://www.rlg.org/en/page.php?Page_ID=114 [Accessed 11 January 2007]. Wilson, C. and Robert A. O'Neil, 2009, GeoGratis: A Canadian Geospatial Data Infrastructure Component that Visualises and Delivers Free Geospatial Data Sets.
3 Structural Aspects for the Digital Cartographic Heritage
Markus Jobst, Georg Gartner Research Group Cartography, Vienna University of Technology,
[email protected],
[email protected]
Abstract The preservation of digital cartography may result in a digital cartographic heritage in the future. One main requirement is an understanding of structural aspects in terms of technological and semiotic dependencies. These dependencies become clear when prominent features of modern digital cartography are discussed; their core characteristics lead to structural considerations for describing the content of a digital heritage. On this basis a conceptual cartographic heritage architecture can be designed, which should help to understand the technical complexity involved in the preservation of digital cartography.
3.1 Introduction

Historic geospatial content forms an important part of current planning, documentation and cartographic applications. Spatial planning takes historic developments or states into account and therefore uses historic maps, cartographic applications and geo-referenced data. The main dilemma with historic geospatial content occurs when the required content can no longer be accessed, understood or (geo-)referenced, as a result of technological dependencies, loss of semiotic descriptions and loss of metadata. Analogue maps, such as paper maps, offer a visible depiction at
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_3, © Springer-Verlag Berlin Heidelberg 2011
any time, and merely require a legend and a reference frame in order to be spatially usable. Even simple digital maps need much more: besides an application that offers some interaction with the digital map, the format, the data's reference frame, the transmitting media and that media's characteristics and requirements all need to be supported before a visible map can be rendered at all. The same complexity applies to (primarily) digital geoinformation, which is ultimately stored as bits and bytes. Ongoing investigation in the field of cartographic heritage focuses on the latest technical developments of modern maps, which lead to neo-cartographic environments and related archiving concepts. The steps from digital maps to multimedia, web and service-oriented maps result in real-time content that is shaped by user participation in ubiquitous environments. It becomes obvious that such distributed, interactive, multimedia and real-time maps can hardly be archived by following the old archiving paradigm of keeping the application or content in a safe place forever. Instead, new methods have to be developed to keep digital content "online" and accessible, and thus to secure modern cartographic heritage, which is the historic application of tomorrow. Alongside technical methods, legal issues and the interdisciplinary understanding of archiving have to be adapted to the prospective historic use of digital geoinformation and modern maps. This is a starting point for removing the main barrier to a prospective history of modern cartography. This contribution lists prominent features of digital cartography that play an important role in the preservation of these maps. In combination with the technical core elements of modern maps and their structural aspects, it expresses the complexity of sustainable preservation in digital cartography in terms of technical dependencies.
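The dependency chain sketched above — a raster image that is unusable as a map without its format and reference frame — can be illustrated with the ESRI world file convention, in which a six-line sidecar text file supplies the affine transform that georeferences an otherwise "blind" scan. The sketch below is illustrative only; the coordinate values are hypothetical:

```python
# Illustrative sketch: a scanned map sheet stored as a plain raster has no
# spatial reference of its own. A six-line ESRI "world file" (e.g. map.tfw)
# holds the affine coefficients that make pixels addressable as coordinates;
# losing this small sidecar file (or the CRS description) breaks the map.

def parse_world_file(lines):
    """Return the six affine coefficients A, D, B, E, C, F (world file order)."""
    a, d, b, e, c, f = (float(v) for v in lines)
    return a, d, b, e, c, f

def pixel_to_world(col, row, coeffs):
    """Map a pixel (col, row) to world coordinates via the affine transform."""
    a, d, b, e, c, f = coeffs
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y

# A hypothetical world file for a scan with 0.5 m pixels, no rotation,
# upper-left pixel centred at (500000, 4800000):
sample = ["0.5", "0.0", "0.0", "-0.5", "500000.0", "4800000.0"]
coeffs = parse_world_file(sample)
print(pixel_to_world(0, 0, coeffs))      # (500000.0, 4800000.0)
print(pixel_to_world(100, 200, coeffs))  # (500050.0, 4799900.0)
```

Note that the world file carries only the transform, not the coordinate reference system itself — a separate metadata record is still required, which is exactly the kind of dependency that long-term preservation has to capture.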
3.2 Prominent Features of Digital Cartography

Modern cartography is heavily influenced by digital approaches. Reproduction as well as dissemination procedures make use of digital mechanisms that mostly enhance traditional processes. These technical changes lead to new and extended applications and use cases. Access to geospatial information thus becomes public, and even the creation of maps can be done by the public, as the OpenStreetMap initiative (www.openstreetmap.org) shows. In terms of cartographic heritage, these neo-geographic technological developments result in new challenges for enabling a sustainable cartographic heritage for the future. The latest cartographic developments and their core characteristics lead to a more complicated framework for preservation in digital cartography.

3.2.1 Cartographic developments in the digital domain
The latest cartographic developments make extensive use of digital approaches. New areas of cartography thus span geovisualization, web mapping, geospatial services, location-based services, locative media, volunteered geography and neo-cartography. Geovisualization includes the scientific visualization of geospatial content, which is mostly derived from data analysis (Andrienko et al 2007). It focuses on the use of computer graphics to create visual images that aid the understanding of complex, often massive numerical representations of geospatial concepts or results. It emphasizes knowledge creation through different forms of information visualization and therefore extends knowledge storage. The combination of GIS (analysis) and geovisualization allows for a more interactive exploration of data, with base functionalities such as map layer exploration, zooming, alteration of visual appearance and digital interfaces (Jiang et al 2003). Further advantages of geovisualization concern the ability to render changes in time and space in real time, to expand visual exploration to n dimensions and to let users adjust the abstraction of mapped data in real time (MacEachren et al 1997). Geovisualization uses digital transmission media, such as computer displays, for information presentation. The resolution and content complexity of a single view is therefore restricted at the moment; the main advantage is the dynamic character of the presentation of geoinformation. Web mapping describes creation and dissemination processes for maps that make use of the Internet. These processes cover the design, implementation, generation and delivery of maps via the World Wide Web (Peterson 2003). Besides the technological issue of how to establish Internet maps, theoretical studies of web mapping address the usability of web maps, the optimization of techniques and workflows, and even social aspects (Kraak et al 2001).
Web mapping therefore serves as a presentation medium with an increasing amount of analytical capability (Web GIS, Web Services). In addition, evolving client devices such as PDAs, hand-helds and mobile phones expand web mapping into ubiquitous cartography, in which maps are available independently of time and space (Raper et al 2007).
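Ubiquitous web maps of this kind are commonly delivered as pre-rendered map tiles. As a small illustration of the underlying mechanics, the widely used slippy-map tile scheme (as employed by OpenStreetMap) addresses Web Mercator tiles by zoom level and x/y index; the sketch below applies the standard conversion formulas:

```python
# Illustrative sketch of the slippy-map (XYZ) tile addressing used by
# OpenStreetMap-style web maps: WGS84 degrees are mapped to Web Mercator
# tile indices at a given zoom level. Valid for latitudes within the
# Web Mercator range (approx. +/- 85.05 degrees).
import math

def deg_to_tile(lat, lon, zoom):
    """Convert WGS84 degrees to slippy-map tile indices (x, y) at a zoom level."""
    n = 2 ** zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    y = math.floor((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

# A point near Vienna (48.0 N, 16.0 E) at zoom level 10:
print(deg_to_tile(48.0, 16.0, 10))  # (557, 355)
```

A tile server then resolves such an index to a URL of the form `.../10/557/355.png` — which also hints at a preservation problem: the rendered map exists only as millions of small, service-dependent fragments.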
Geospatial Web Map Services (WMS) focus on server-side processing of geospatial information, with the general aim of rendering simple images and sending these to the client. Variants of such Web Services (WFS, WCS, WPS, WPVS, ...; www.opengeospatial.org) even enable the direct manipulation and analysis of geospatial database contents, which approximates full GIS functionality via the Internet. As a result, these services have a great impact on the user experience of experts and laypersons alike, due to their accessibility via the Internet. Location Based Services (LBS) build on Web Services by utilizing the geographical position of a device and offering location- and task-relevant information and entertainment services. LBS thus incorporate services to locate the device, to access content gazetteers that unlock various kinds of information, and possibly to track user moods. Examples of LBS include personalized weather services and location-based games (Reichenbacher 2004, Schiller 2004, Gartner et al 2008). Locative media are media of communication that are bound to a location. Supported by the technical possibilities of Web Services and LBS, digital presentation media (pictures, video, sound, ...) can be virtually attached to locations, which can trigger real social interactions. Although mobile positioning technologies such as GPS and mobile phones enable the wide spread of locative media, these technologies are not the main aim of ongoing development in this field. Rather, the social component, which provides information on the relationship of consciousness to a place and to other people, forms the framework for actively engaging with, discussing and shaping spatially bound topics in a very wide public environment (Galloway et al 2005). The notions volunteered geography and neo-geography subsume the public use and creation of geospatial data. The public use of web technologies is clearly a major development in cartography that opens new opportunities.
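A WMS GetMap request of the kind described above can be sketched with the standard library alone. The endpoint and layer name below are placeholders; the parameter set follows the OGC WMS 1.3.0 convention (note that EPSG:4326 in WMS 1.3.0 uses latitude/longitude axis order for the BBOX):

```python
# Hedged sketch: composing an OGC WMS 1.3.0 GetMap request URL.
# The server endpoint and layer name are hypothetical placeholders.
from urllib.parse import urlencode

def build_getmap_url(endpoint, layer, bbox, width, height,
                     crs="EPSG:4326", fmt="image/png"):
    """Assemble a GetMap URL; bbox is (min1, min2, max1, max2) in CRS axis order."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return endpoint + "?" + urlencode(params)

# EPSG:4326 with WMS 1.3.0: latitude first, so roughly Austria here.
url = build_getmap_url("https://example.org/wms", "topo",
                       (46.0, 9.0, 49.0, 17.0), 800, 600)
print(url)
```

The returned image is ephemeral: the request URL, the service version and the server-side data together constitute the "map", which is precisely why such service-based cartography resists the store-and-save archiving paradigm.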
Neo-geography, a notion for "new geography", is based on public access to geospatial data and public participation in geographic applications (Turner 2006). Access to geospatial data takes place via the Internet and various Web Services (such as Google Earth). Participation in geographic applications refers to the user's ability to record and share geospatial data of special personal or individual importance. Beyond the public recording and exchange of geospatial data, the notion of neo-cartography combines neo-geographic characteristics with ubiquitous cartography and geo-media techniques. Besides time- and space-independent access to maps and modification of geospatial data, neo-cartography takes into account the characteristics of the transmitting media, the impact of the information content and user needs for the presentation of geospatial information. The new aspects of neo-cartography indicate the possibility of directly accessing mental imagery through user input. The ubiquitous existence of maps and public participation develop a social imagery of space that should be used for its abstracted and simplified presentation. These cartographic developments in the digital domain point to the core characteristics of digital cartography that matter for preservation activities, which are explained in the next section.

3.2.2 Important core characteristics for preservation
The core characteristics expand modern cartography in terms of interaction, availability and participation. Whereas maps on "traditional" transmitting media focus on a precise or even artistic documentation of the Earth's surface and its best usability, many modern maps aim at an expressive and effective presentation with immersive interfaces and easily distributable, accessible technologies. The difficulty for cartographic heritage lies in the complexity that comes along with digital base media, time and space independence, and participation in content supplementation. There is no longer one transmitting medium with its precisely tuned geospatial content, but various interfaces of differing resolution that are accessed for any reason or context and are even completed by map users according to their needs. The core characteristics of modern cartography therefore relate to the user interface, to independence of time, space and device, and to user participation.

3.2.2.1 User Interface (UI)
The user interface in modern cartography varies from paper to displays, e-paper or hard- and softcopy holograms. The characteristics of these technologies differ widely and thus deliver various resolution qualities. Whereas paper calls for a resolution above 1200 dpi (dots per inch) for an expressive visualization of "binary" graphics, most displays offer only between 72 and 96 dpi. This resolution variation of the information-transmitting interface leads to a reduction of information depth in order to keep the information content perceivable (Brunner 2001). The information content thus has to be adapted to the transmitting medium in use. If a recipient uses a mobile phone with a display of 320 x 240 pixels instead of a PC display with 1024 x 768 pixels, the content of the map has to be reduced to stay within the perceptual limit. On the other hand, digital displays may deliver higher immersion with appropriately adapted geospatial content. It becomes easy to generate virtual 3D environments and thereby invoke an intuitive transmission of the third dimension of topography (Bülthoff 2001). This expansion of usable interfaces in modern cartography, and the consequently varying information depth (depending on the interface), necessitates modelling these aspects for cartographic heritage. The reduced geospatial information content prepared for a mobile device will make no sense visualized on a high-resolution display, e.g. the large display of a library system. The original operating range, information environment and context scope are not preserved just by storing or archiving the software application: this media-relevant information has to be embedded, for example in specific metadata.

3.2.2.2 Independence by time and space
One main characteristic of modern cartography, and especially of ubiquitous cartography, is its independence of time and space. Cartographic applications and geospatial content can be accessed at any time, everywhere and for any occasion. The Internet and wireless transmission technologies make this access possible. Highly context-sensitive content, covering all possible eventualities, is therefore the foundation of the most usable ubiquitous cartographic applications (Reichenbacher 2004, Nivala 2005, Gartner et al 2008). The content and application should be usable in any situation, such as a leisure or business trip, which may also change during a journey. From the viewpoint of cartographic heritage, this content flexibility leads to a complex archiving structure that needs ongoing investigation. On the one hand, masses of information await combination and appropriate use; on the other hand, distributed networks, ad-hoc connections and active as well as passive sensors establish a ubiquitous environment. Neither emulation nor migration, as used for archiving information and applications today, can help to archive a ubiquitous environment. Such a cartographic application, which depends on distributed networks and ad-hoc connections to build up a ubiquitous environment, cannot be isolated, captured in an image and kept functional
for the purposes of long-term archiving. The development of appropriate archiving strategies therefore remains a task for future investigation.

3.2.2.3 User participation
In addition to device variety and time and space independence, user participation is the third big aspect of modern cartography. Using special devices and sensors and distributed networks (or, more generally, ubiquitous environments), map users are able to collect geospatial data and even leave their "moods" for specific places behind. These collections are then incorporated into the geospatial environment or application, where they may be used by others. The same social mechanism and technology is used to collect geodata for establishing freely available geodatasets. As a consequence of this "social mapping activity", socially interesting areas and topics can be identified, which is essential information for a range of user-oriented cartographic tasks. In terms of cartographic heritage it becomes obvious that "saving" this dynamic content currently raises more questions than it answers. How can we deal with the notions of completeness and consistency? What does currency mean for such socially based data? Can incremental archiving procedures be created? In fact, the social image of geospatial facts will also be influenced by individual knowledge and perspectives, social preferences, politically motivated interests and so on. Perhaps social interest can even be mobilized so that sustainable archiving is solved by a distributed volunteer network in the future? These heritage-relevant core characteristics of modern cartography show the increasing complexity of digital heritage architectures. In addition to digital requirements, individual and social interests are increasingly embedded in geospatial structures, which leads to as yet unexplored archiving processes. In the end, the question of which map content and applications deserve attention arises. What are the historical values worth saving? Is it enough to keep the original data (first model data), or do we have to save the map product, or even entire virtual environments, to keep our cartographic heritage?
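One conceivable answer to the question of incremental archiving procedures is to store the first state of a user-contributed dataset in full and then archive only the differences between successive snapshots. The following sketch is purely illustrative; all names and records are hypothetical:

```python
# Illustrative sketch of incremental (diff-based) archiving of a dynamic,
# user-contributed dataset: the initial state is archived in full, and each
# later snapshot is reduced to the records added and removed since the last.

def make_increment(previous, current):
    """Record what changed between two dataset states (sets of records)."""
    return {"added": current - previous, "removed": previous - current}

def replay(initial, increments):
    """Rebuild the latest state from the initial archive plus its increments."""
    state = set(initial)
    for inc in increments:
        state |= inc["added"]
        state -= inc["removed"]
    return state

# Three hypothetical snapshots of volunteered point data:
v1 = {"bench@48.2,16.4", "fountain@48.3,16.3"}
v2 = {"bench@48.2,16.4", "kiosk@48.1,16.5"}        # fountain removed, kiosk added
v3 = {"bench@48.2,16.4", "kiosk@48.1,16.5", "tree@48.0,16.2"}

log = [make_increment(v1, v2), make_increment(v2, v3)]
assert replay(v1, log) == v3  # the increment log reproduces the latest state
```

Such a scheme addresses storage growth, but it deliberately leaves the harder questions above open: it defines neither completeness nor the currency of any reconstructed intermediate state.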
3.3 Structural Considerations for the Preservation of Digital Cartography

The complexity of geospatial datasets, geoinformation and geospatial applications is growing exponentially. Comparison with homogeneous datasets, such as those resulting from scanning, clearly shows additional requirements when not only the file size of datasets grows but interrelations of features and attributes, for example concerning cultural landscapes, also become important. In addition, the methodology for accessing these complex structures needs further investigation; the UNESCO convention for intangible cultural heritage (http://www.unesco.org/culture/ich/) explains the importance of a sustainable handling of knowledge concerning nature, the universe and culturally developed techniques. Here the safeguarding of interoperability, consistent archives and the ad-hoc analysis of complex applications are the main foci for further methodical development. This section presents structural considerations for the preservation of digital cartography, spanning save-worthy values, media dependency, content description, a conceptual cartographic heritage architecture and the characteristics of SOA-based structures.

3.3.1 Save-worthy values
Historical values in the context of maps generally cover cultural, geometric and informative values. Cultural values show how the map content was actually understood or used and thus became abstracted and depicted in the map graphics. For example, political influences become obvious when map graphics were distorted for specific propaganda purposes. Geometric values describe the precision of the map projection on the one hand and document the relation to reality on the other, as when topographic elements show the location of real-world objects. Informative values mainly reveal the influence of technically driven processes: with technical developments, map production technologies change, and new dissemination procedures and interaction possibilities develop alongside. Ultimately, all facets of historical values can be assigned to a content-based and an artistic-based part. The content-based part of historic values covers topographic, topological and thematic aspects. It describes the geometry of elements, the relation of elements to each other and the relation to non-geometric but geospatially anchored information. Additionally the topological aspect also supports
an overall map picture, which can emphasize "important" statements. The strategies immanent in maps, which help to keep all the content perceivable, thus evoke a certain emphasis of information according to context, task or political importance. These strategies inevitably document a map's usage and status. The artistic-based part covers social pictures and technically driven processes. Social pictures rest on the semiotic acceptance of society, which means that graphical coding and design follow a general taste. The interplay of appeal and disfavour creates an impact that can be used to specifically anchor a map, its content or parts of it in mental knowledge. This social acceptance depends mainly on geographical region, cultural community and educational level, and it can be heavily influenced by mass media (Kroeber-Riel 2000), as the advertising industry demonstrates. Technically driven processes make their influence felt on map quality, information depth and the extent of dissemination. Depending on the production processes of a map, its currency and dissemination extent will vary. In addition, the quality and resolution of the information carrier allow for a specific information depth and require adapted coding mechanisms. The combination of interaction, multimedia and virtual reality expands the publicly acceptable presentation forms; its effectiveness is exploited for highly immersive geospatial transmissions. Today's potpourri of technical developments and current techniques sets itself apart from former techniques, so maps are easily associated with the corresponding time slot of their creation and use, which is an important point for assessing a map, its content and its importance. However, the historic values and their content- and artistic-based parts readily show that a sustainable cartographic heritage depends on a complete anthology of the involved parts.
Which means that information content, transmitting media as well as multimedia methods are related in order to make a geospatial statement or serve a specific task best. Although it may be easy to archive first model data (originating data for map production), their transformation to a second model with the preparation, embedding, usage and immanent characteristics express main historic values that will document time, space and context ranges and may be of importance in future. Thus it can be stated that a holistic application “map” defines cartographic heritage, whereas the holistic map means that all components are stored/saved at (or at least accessible from) one place. This brings up difficulties in the new field of ubiquitous cartography, where services, time-
66 M. Jobst, G. Gartner
and space independence, distributed networks and interface flexibility prevent a holistic approach to cartographic heritage. The intent to create a first conceptual structure for a digital cartographic heritage leads to the definition of the main components of digital cartography. This technical fundament can describe the main difficulties in digital map archiving and plays an important role for the preservation of digital cartography in the future. 3.3.2
Media-dependency
It must first be determined whether geospatial data existed in analogue form and were digitized afterwards (analogue-born) or were digitally created and have no tangible original (digital-born). Whereas analogue-born data can follow the predominant paradigm of archiving (store and save), digital-born data call for new archiving methods. The reason is that analogue-born data can be re-created from the original template at any time, often with much higher quality due to improvements in digitizing technologies. Digital-born data have no original master that could be used for digitization; instead, concepts for long-term preservation are needed in order to access these digital originals. Digitization of analogue data enables easy dissemination, access and professional analysis. However, the step of digitization alone cannot be seen as an "archiving procedure" for the original material! In fact the original is needed for a renewal of the digital representation. Conversely, the reconstruction of an original (the rebuilding of an analogue master) from a digital representation calls for appropriate quality: in the case of maps, which mainly consist of graphics and line art, reproduction specifications require about 1200 dpi. Geospatial data and applications that only exist as digital-born artwork need new archiving strategies, such as migration or emulation. These copying and accessing methods try to keep digital-born content readable and accessible although the technical framework changes. 3.3.3
Heritage's content description
From an archiving viewpoint, the main components of digital cartography form a foundation for cartographic heritage. If one of the components cannot be successfully archived, access to the whole cartographic application is endangered. Thus these components are closely related to each other.
Structural Aspects for the Digital Cartographic Heritage 67
The components can be classified into content, format, application, device and storage, which together imply specific strategies for sustainable preservation. 3.3.3.1
Content
The content of a cartographic application is the information it mediates. This means that the selection of information according to a context or map use, and the graphical coding of this selection, build up the content. For this reason semiotic rules have to be considered, which help to make content understandable and usable (Chandler 2002). Semiotic rules cover semantics, syntax and pragmatics. In terms of cartographic heritage it becomes important to clearly associate the semantics, syntax and pragmatics of a content with its operational area. Then its statement and intended application become clear and can be used for alternative use or further interpretation. In general, a careful selection of metadata facilitates powerful search algorithms for the content and provides additional background knowledge about it; for example, metadata on information acquisition and precision document data quality. 3.3.3.2
Format
The format describes the structure of the digital document, which is to be read/used by an application. For cartographic heritage the documentation of this structure is inevitably important, because it helps to rebuild an application in case of inaccessibility. For example, the old word-processing format WordPerfect can no longer be opened by most of the widespread word-processing software. This leads to a loss of file content if no appropriate software for reading the file format exists anymore and cannot be rebuilt. In addition to the format documentation, the kind of format is of importance. Two kinds of format can be identified: binary and ASCII. While a binary format is direct machine code, which cannot be read by humans, an ASCII format can be read directly with any text editor. Thus ASCII formats should be favoured in terms of cartographic heritage. Format standards like the Extensible Markup Language (XML) are also ASCII-based. By these means the content of an Open Document Format file (ODT), which belongs to the XML family, can always be accessed with a simple text editor even if the word-processing application does not exist anymore. For this reason the extensive usage and standardization of GML (one important XML standard for the geospatial domain) supports geospatial preservation. 3.3.3.3
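This readability argument can be illustrated with a minimal sketch. The GML-style fragment and the City element below are invented for the example (only the gml namespace URI is the standard one); the point is that the content survives without the originating cartographic application, using nothing but a generic XML parser.

```python
import xml.etree.ElementTree as ET

# A tiny GML-like fragment (invented for illustration). Because it is
# plain ASCII/XML, any text editor or generic XML parser can inspect it,
# independent of the software that originally produced it.
gml_text = """<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <City name="Vienna">
      <gml:Point><gml:pos>48.2082 16.3738</gml:pos></gml:Point>
    </City>
  </gml:featureMember>
</gml:FeatureCollection>"""

root = ET.fromstring(gml_text)
ns = {"gml": "http://www.opengis.net/gml"}

# Generic tools recover the content without the originating application.
city = root.find(".//City")
pos = root.find(".//gml:pos", ns)
print(city.get("name"), pos.text)
```

A binary format would offer no such fallback: without documentation of its structure, the bytes remain opaque.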
Application
The application forms the processing part of a multimedia map and is some kind of software. Generally an application has specific hardware requirements: playing sound, for example, requires a sound card and speakers, while virtual 3D environments need graphics cards, appropriate drivers and specific displays. Therefore the application is closely related to the device. Similar to the format, an application can be encapsulated in a compiled proprietary form or readable as an open-source application. Usually the software has to be compiled for a specific operating system and processing unit. A proprietary application that is not well documented and lacks an API (Application Programming Interface) or similar programmable extensions can hardly be adapted to new operating-system and hardware environments. If the source code of an application is accessible, the software can be adapted to any new hardware environment. Numerous Open Source initiatives prove that open source code enables individual extensions and adaptations to operating systems and hardware environments; the open GIS software GRASS (www.grass.itc.it), for example, now runs on virtually any operating system, even on PDAs. 3.3.3.4
Device
The device is the interface between the computer and the human being. This interface plays a central role in an expressive and effective transmission of information. For cartographic applications the interface plays the role of the information carrier: it makes geospatial information accessible to the human sensory system. Depending on the graphical resolution of the interface, the geospatial information depth of the map has to be adapted/prepared in order to keep the content perceptible. If a specific interface no longer exists, the original information content cannot be transmitted again. For example, stereoscopic information prepared for lenticular displays will appear as illegible fragments on a standard display, because the information is split up according to the lenticular lenses. If such specific information is to be sustainably archived, the interface itself is also needed for a correct presentation.
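The dependency between device resolution and information depth can be made concrete with a small calculation. The 0.3 mm legibility threshold below is an assumed value for illustration, not a cartographic standard; the 1200 dpi figure is the reproduction resolution mentioned in the media-dependency section.

```python
def min_feature_mm(pixels: int, ppi: float) -> float:
    """Physical size of a feature rendered with `pixels` device pixels
    at a resolution of `ppi` pixels per inch (1 inch = 25.4 mm)."""
    return pixels / ppi * 25.4

# Assume a symbol needs at least 0.3 mm to stay perceptible (illustrative).
THRESHOLD_MM = 0.3

# On a 96 ppi desktop display, a 2-pixel line is about 0.53 mm wide: legible.
desktop = min_feature_mm(2, 96)

# Kept at 2 dots on a 1200 dpi reproduction master, the same feature
# shrinks to about 0.04 mm: imperceptible, so the information depth must
# be re-adapted whenever the carrier changes.
print_master = min_feature_mm(2, 1200)

print(round(desktop, 2), round(print_master, 3))
```

The same symbol specification is legible on one carrier and invisible on another, which is exactly why the device belongs to the core archiving components.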
3.3.3.5
Storage
Storage is the most important aspect of cartographic heritage, as it is of any heritage topic: only as long as an element remains accessible in the future can it be called heritage. For digital cartographic heritage, with its increasing amount of data, information, applications and so forth, new storage procedures have to be developed. The old paradigm of keeping and saving is only partly true for digital media, which also require constant temperature and atmosphere; moreover, the lifetime of most digital media does not exceed 20-30 years (Borghoff et al 2003). Therefore these kinds of media call for repeated copying to newer storage media whenever the next migration period is reached. Within these copying processes, mistakes in writing the content and the table of contents have to be avoided: as soon as the table of contents no longer fits the data, the archive is destroyed, because the table of contents describes the allocation of the data and is what makes these data detectable and accessible. It is obvious that storage is still one of the most important components within the cartographic heritage architecture, as it was with analogue media. If storage does not work, neither content nor format, application or device can be usefully accessed. Therefore storage is the basis of cartographic heritage. It relates to the storage device, an application, the format and the content. Depending on the characteristics of all four components, storage strategies have to be developed which sustainably enable reconstruction in case of disaster. Furthermore, storage is responsible for the long-term accessibility of the whole construct, the comprehensive digital map, which also includes its functionality, usability and semantics. 3.3.4
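The copy errors warned about above can be guarded against with fixity checksums. A minimal sketch, assuming plain files on ordinary storage (the function names are invented for this example, not taken from any archiving standard): a checksum is computed before migration and verified after it, so that silent write errors are detected before the old medium is retired.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Fixity checksum of a file, read in chunks to handle large rasters."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def migrate(src: Path, dst: Path) -> str:
    """Copy `src` onto newer storage at `dst`; refuse a corrupted copy."""
    expected = sha256_of(src)
    shutil.copyfile(src, dst)
    if sha256_of(dst) != expected:
        raise IOError(f"fixity check failed for {dst}")
    return expected  # record this digest in the archive's table of contents
```

Recording the digest alongside the table of contents lets the next migration period re-verify both the data and their allocation.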
Conceptual cartographic heritage architecture
The archiving components of digital cartography can be considered in an isolated way, each component for itself. In fact this isolated approach would not be sufficient, because each component is related to the others. Thus the whole framework leads to a first conceptual cartographic heritage architecture.
Fig. 3.1. A first conceptual cartographic heritage architecture showing main dependencies of considered core components.
On the one hand this graphic shows the degree of digitization; on the other hand the cartographic heritage depth can be defined. The degree of digitization starts with content and its storage media. In principle this categorization begins with analogue media, like a paper map with paper as storage medium and a printed content. As soon as a digital content has to be processed, the device and the format of the data become important; consequently the processing application and its dependencies have to be considered. Cartographic heritage depth covers the content-based and artistic-based parts of historic values. Thus a very first description of cartographic heritage starts with the storage media, their material, fabrication and condition. In a next step the content, with its syntax, pragmatics and semantics, adds to the storage media. Storage media and content form an artefact, which suggests the resulting application and usability framework. Finally, full cartographic heritage depth for a digital cartographic tool additionally covers device, format, application and most of all the interface of the map. The format, application and device form a triangle, which depicts the very close relation of these components. The format has to be processed by an application and therefore understood by this piece of software. In addition the format has to make sense for the device, which means that e.g. sound formats will not be usable by graphical devices. The application needs to support a format and the device. Normally the application transforms a file's content for a specific device; for example, a mixture of 2D geodata is processed by an application to show a virtual 3D city model. Accordingly, the specific device, including its processing hardware, has to understand the output of an application and present it in a most appropriate
way (which includes that the interface is capable of and best suited for this specification). For example, the rendering of a virtual 3D city model will deliver a more immersive information transmission on stereoscopic interfaces than on standard displays. The content, with its semantic, pragmatic and syntactic dimensions, rests on the triangle of format, application and device. Looking at the graphic bottom-up, the dimensions of content are placed according to device, application and format. In terms of cartography the syntax of the content follows characteristics of the device (resolution, immersion, ...) and the application (interaction, dynamics, ...). For example, the resolution of a device defines the information depth and therefore affects the preparation of information. The pragmatic dimension relates to device, application and format; all three parts together ensure appropriate usability in specific situations. The semantic dimension mainly pertains to the format, which can additionally map the meaning of a content. For example, an object-oriented structure may map the meaning and relations of natural element structures. 3.3.5
SOA-based structures
The archiving of SOA structures does not only concern technological means. In point of fact, the responsibility and process methodology for the protection of geodata and maps have to be discussed in order to structure a technological framework. Given a working archiving framework, such as specific services on the Internet, but no one who is responsible for parts of the cartographic heritage structure (content, meaning, format standards, ...), the digital content and entire sequences within the SOA network will definitely be lost. This loss can be observed, for example, in the picture database Flickr (www.flickr.com), when images are removed and links become outdated. In order to overcome this barrier to cartographic heritage, the main aspects that exist today and are needed for a prospective archiving of SOA have to be identified. 3.3.5.1
SOA characteristics
The concept of Service-Oriented Architecture (SOA) requires loose coupling of services with operating systems and other application-based technologies. Loose coupling means that the coupled classes have almost no knowledge of each other: besides a clear definition of interfaces, operations and attributes of classes are exchanged and adapted for the coupling at run-time. Services maintain a relationship that minimizes dependencies and only requires that they maintain an awareness of each other. The coupling, or interfacing/binding, of SOA services generally makes use of XML, though this is not required (Bell 2010). SOA services are separate modules of data and functions that are accessible via a network in order to be combined and reused for the production of applications. The interoperability and accessibility of these services follow the main principle of SOA: the publish-find-bind principle. Before services can be bound into an application, they have to be found in the network. Therefore established services have to be published, which means that their existence is maintained in online registries and their capabilities and descriptions are stored in accessible meta-databases. This basic principle of SOA leads to its main components: services, metadata and registries. Furthermore, services may access data, so these resources have to be online and accessible as well. From this point of view a lot of questions arise when thinking about service-oriented architectures: how can we archive service-oriented applications that depend on the Internet, communication protocols and ad-hoc connections? The conceptual architecture may help to keep the main dependencies in view when going into more detail.

Fig. 3.2. The conceptual cartographic heritage structure in a distributed (Service-oriented) architecture. In addition to the complex single cartographic applications, the network structure, or at least its functionality, protocols, actuality and integrity, needs to be mapped.

3.3.5.2
SOA preservation approaches
There are many initiatives for archiving the WWW. These web-archiving initiatives collect portions of the WWW and store them in archives; web crawlers are used for automated collection because of the massive size (Brown 2006, Brügger 2005). Although all links are stored in the archive, these copies miss the dynamic character of the Internet and can only document one single point in time. Other approaches make use of SOA for digital preservation. These approaches build up the single modules that are needed within digital preservation, such as format conversion or migration services. Finally, the preservation architecture can be accessed by individuals as well as archives at the application layer (Ferreira 2007) in order to fulfil preservation tasks. Digital cartography that relies on SOA makes heavy use of a corporate network or the Internet. Such a framework does not use SOA modules for preservation processes, but calls for preserving the SOA structure, interfaces and contents themselves. This means that a mapping of the network structure, or at least of its functionality, protocols, actuality and integrity, has to be achieved.
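The publish-find-bind principle described above, and the integrity mapping called for here, can be sketched with a toy registry. All class, service and endpoint names below are invented for illustration; the reachability check stands in for a real network probe.

```python
from typing import Callable, Dict, List

class Registry:
    """Toy SOA registry: services publish themselves, clients find and bind."""
    def __init__(self) -> None:
        self.services: Dict[str, dict] = {}

    def publish(self, name: str, endpoint: str, capabilities: List[str]) -> None:
        # Metadata describing the service is kept in the registry.
        self.services[name] = {"endpoint": endpoint, "capabilities": capabilities}

    def find(self, capability: str) -> List[str]:
        # Clients discover services by capability before binding to them.
        return [n for n, meta in self.services.items()
                if capability in meta["capabilities"]]

    def audit(self, reachable: Callable[[str], bool]) -> List[str]:
        """Integrity mapping: list services whose endpoints no longer answer.
        These are the broken links that silently destroy a SOA-based map."""
        return [n for n, meta in self.services.items()
                if not reachable(meta["endpoint"])]

registry = Registry()
registry.publish("wms-topo", "http://example.org/wms", ["GetMap"])
registry.publish("wfs-roads", "http://example.org/wfs", ["GetFeature"])

# Simulate the archive audit: the feature service has disappeared (link rot).
alive = {"http://example.org/wms"}
print(registry.find("GetMap"), registry.audit(lambda url: url in alive))
```

Even this toy shows why a snapshot of the data alone is insufficient: the registry state and endpoint integrity are part of the application and have to be mapped along with the content.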
3.4 Conclusion

This contribution has presented a conceptual cartographic heritage architecture for digital processes that is based on a broad understanding of modern cartography, follows heritage-relevant core aspects and should support future cartographic heritage concepts. In particular, the latest developments in the field of service-oriented cartographic applications call for intensive investigation of archiving and sustainable access. The broad understanding of modern cartography embraces new notions that emerge from recent technologies. These notions range from geovisualization to neo-cartography and concern new ways of making geospatial information accessible. Nowadays this development is shaped by geospatial map services, ubiquitous cartography and user participation, which leads to the core aspects of modern cartography: the user interface, time- and space independence and user participation. All three aspects make the increasing complexity of cartographic heritage architectures clear. In addition to the difficulties of archiving digital environments, the embedding of ubiquitous information and social interests in geospatial structures leads to unsolved archiving procedures and an uncertain cartographic heritage in the future.

A possible cartographic heritage of the future is built today: each digital cartographic application that remains accessible in future times creates the cartographic heritage of the future. In order to draw on today's geospatial knowledge and identified spatial processes, which are mostly embedded in service-oriented architectures for dissemination purposes, these geospatial data, cartographic information and services have to be kept sustainably active. As a first step, the archiving components of digital cartography define core elements for a digital cartographic heritage. Their interrelated dependencies build up a framework for a first conceptual cartographic heritage structure. The aim of this structure is to understand the main dependencies of the considered core elements and to keep them in mind during further development of archiving procedures.
3.5 References

Andrienko, G., Andrienko, N., Jankowski, P., Keim, D., Kraak, M.-J., MacEachren, A.M., and Wrobel, S. (2007) Geovisual analytics for spatial decision support: Setting the research agenda. International Journal of Geographical Information Science, 21(8), pp. 839-857.
Bell, M. (2010) SOA Modeling Patterns for Service-Oriented Discovery and Analysis. Wiley & Sons, 390 pp. ISBN 978-0470481974.
Bergeron, B.P. (2002) Dark Ages II: When the Digital Data Die. Prentice Hall PTR, New Jersey.
Borghoff, U.M., Rödig, P., Scheffczyk, J., Schmitz, L. (2003) Langzeitarchivierung – Methoden zur Erhaltung digitaler Dokumente. dpunkt Verlag, Heidelberg.
Brown, A. (2006) Archiving Websites: a practical guide for information management professionals. Facet Publishing, London. ISBN 1-85604-553-6.
Brügger, N. (2005) Archiving Websites. General Considerations and Strategies. The Centre for Internet Research, Aarhus. ISBN 87-990507-0-6. http://www.cfi.au.dk/en/publications/cfi.
Bülthoff, H.H., van Veen, H.A.H.C. (2001) Vision and Action in Virtual Environments: Modern Psychophysics in Spatial Cognition Research. In: Vision and Attention. Springer, New York, Berlin, Heidelberg. ISBN 0-387-95058-3.
Chandler, D. (2002) Semiotics – the basics. Routledge, New York. ISBN 0-415-35111-1.
Galloway, A. and Ward, M. (2005) Locative Media as Socialising and Spatialising Practices: Learning from Archaeology. Leonardo Electronic Almanac, MIT Press (forthcoming).
Gartner, G. (2008) Location based services and teleCartography – From sensor fusion to context models. International Conference on Location Based Services and TeleCartography, Salzburg. Springer, Berlin Heidelberg.
Hagedorn, B., Döllner, J. (2007) High-Level Web Service for 3D Building Information Visualization and Analysis. In: ACM 15th International Symposium on Advances in Geographic Information Systems (ACM GIS), Seattle, WA.
Jiang, B., Huang, B., and Vasek, V. (2003) Geovisualisation for Planning Support Systems. In: Geertman, S., and Stillwell, J. (Eds.) Planning Support Systems in Practice. Springer, Berlin.
Kraak, M.-J. and Brown, A. (2001) Web Cartography – Developments and prospects. Taylor & Francis, New York. ISBN 0-7484-0869-X.
Kroeber-Riel, W., Esch, F.-R. (2000) Strategie und Technik der Werbung, 5th ed. Kohlhammer Verlag, Stuttgart.
MacEachren, A.M. and Kraak, M.J. (1997) Exploratory cartographic visualization: advancing the agenda. Computers & Geosciences, 23(4), pp. 335-343.
Nivala, A.-M. (2005) User-centred design in the development of a mobile map application. Licentiate Thesis. Helsinki University of Technology, Department of Computer Science and Engineering, Helsinki, 74 p.
Peterson, M.P. (ed.) (2003) Maps and the Internet. Elsevier. ISBN 0080442013.
Raper, J., Gartner, G., Karimi, H., Rizos, C. (2007) A critical evaluation of location based services and their potential. Journal of Location Based Services, 1(1), pp. 5-45. Taylor & Francis, London.
Reichenbacher, T. (2004) Mobile Cartography – Adaptive Visualisation of Geographic Information on Mobile Devices. PhD thesis, Munich University of Technology.
Schiller, J. (2004) Location-based services. Morgan Kaufmann, San Francisco (The Morgan Kaufmann series in data management systems). ISBN 1-55860-929-6.
Turner, A.J. (2006) Introduction to neogeography. O'Reilly Media. ISBN 978-0-596-52995-6.
4 Archiving the Complex Information Systems of Cultural Landscapes for Interdisciplinary Permanent Access – Development of Concepts
Józef Hernik 1, Robert Dixon-Gough 2

1 University of Agriculture in Kraków, Faculty of Environmental Engineering and Land Surveying, [email protected]

2 University of East London, School of Computing, Information Technology and Engineering, [email protected]
Abstract European countries are characterised by valuable cultural landscapes that have gradually evolved through the interaction of people and the natural landscape, their survival being protected by traditional ways of land use. As the traditional forms of land use have changed, particularly over the past four decades, these valuable cultural landscapes have become threatened and in places are facing extinction. Such landscapes often cannot compete with more urgent needs at the level of planning and the implementation of important infrastructural developments, together with the needs and requirements of modern agriculture. This contribution traces the ad hoc concept of selective historic landscape characterisation programmes and, on the basis of empirical studies, develops the concept of systematically archiving the complex information systems of cultural landscapes for permanent interdisciplinary access. Much of the data acquired is related directly to specific infrastructural projects and largely neglects the requirements of similar projects that might arise in the future, such as flood protection or climate change adaptation. It is considered that the interdisciplinary archiving of landscape elements and entire cultural landscapes is an opportunity to combine practice, through economic drivers, effects, and implications, with science through the development of an interdisciplinary approach to cultural landscapes.

M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_4, © Springer-Verlag Berlin Heidelberg 2011

78 J. Hernik, R. Dixon-Gough
4.1 Introduction

European countries are characterised by valuable cultural landscapes (Figs. 4.1, 4.2 and 4.3) which have evolved over a considerable period of time through traditional forms of land use that have provided both continuity and a sustainable form of land use, usually in sympathy with the landscape, geology, and climate of the region. As those landscapes matured by the end of the nineteenth century, through the completion of transportation networks and the establishment of industry and agriculture, it became possible to study rural landscapes in two ways: as an entity based upon farming artefacts, or as a planned landscape (Claval 2005). Up to the beginning of the First World War, most rural areas had populations in occupations primarily related to the land, and even in areas with other industrial activities the majority of the land was used for farming. The occupants fell into four main categories: landowners, tenant farmers, small independent farmers, and farm labourers. This traditional landscape comprised artefacts based partly on the need to use the land efficiently for agricultural production (such as fields – both arable and pasture – grazing land, woods, farms, villages, etc.), but also expressions of the society that lived in the area. A number of examples have been provided for England and Wales through, for example, the work of Winchester (1987), Williamson (2002, 2003), Martins (2004), and Martins and Williamson (1999). In addition, the landscape provided researchers with a surface that can be used to analyse and evaluate the existence of 'other worlds' and the processes responsible for the development of the landscape (Claval 2005). There have been many successive and major landscape changes in the past for which there is very little physical evidence (Antrop 2005).
However, much of the rural landscape of Europe has been the result of systematically planned initiatives, albeit on a local basis, that took place during the Middle Ages, including land clearance and drainage, and various forms of piecemeal enclosure. Other changes in the landscape over the last two centuries have been more regulated and include the drainage and reclamation of land (Cook & Williamson 1999), and formal actions such as land consolidation and, in the case of England and Wales, Parliamentary
Archiving the Complex Information Systems of Cultural Landscapes 79
Enclosure Acts and Awards. This period coincided with the industrial revolution, which in urban areas generated landscape transformations that were devastating and threatening for both the environment and the landscape, yet nevertheless were instrumental in generating a new form of urban cultural landscape. It has only been during the second half of the twentieth century that integrated landscape management has developed and, with the revival of landscape ecology since the 1980s, a holistic approach to the landscape has slowly emerged (Antrop 2005). The ultimate aim of that integrated approach is trans- or multi-disciplinarity, in which fundamental and applied research can be related to policy implementation. Regrettably, the rapid changes in both rural and urban environments over the past two decades have threatened many cultural landscapes: the need to modernise to meet societal and economic perceptions and needs has led to the erosion of traditional land use, to the extent that valuable cultural landscapes are threatened in general and in some locations are facing complete extinction.

Fig. 4.1. Typical landscape of Wiśniowa commune (source: Józef Hernik).

Fig. 4.2. Typical landscape of Miechów commune (source: Józef Hernik).

To counteract such threats and to preserve landscapes under threat, the Council of Europe initiated the European Landscape Convention (European Landscape Convention 2000), which recognised that landscapes should be given legal status and should be recognised as a basis for the quality of life, for shaping regional and local consciousness, and for expressing natural and cultural diversity. Furthermore, landscape protection, management and planning shall be guaranteed, no matter whether a landscape is natural, cultural or urban, intact or degraded, of outstanding beauty or an everyday landscape (Stöglehner & Schmid 2007). In practice, however, the nature of cultural landscapes is underestimated, particularly at the local level, and cultural landscapes often cannot compete with more urgent needs at the level of planning, forecasting and implementation. One of the major problems that exists in both Poland and the UK in this context is the lack of suitably qualified professionals in the discipline of cultural and historic landscapes. For example, in a study of English Local Authorities, Baker and Chitty (2002: 5) discuss the problems of historical environments in the context of those at the forefront of the decision-making related to their management, the Local Authorities. In many such areas up to a third of all local planning decisions have some form of implication for historic environments, although English Heritage estimated that up to 20 percent of county and district authorities have no appropriately qualified specialist advisors. In this contribution, based on the comparative experience of Historic Landscape Characterisation (adopted by English Heritage and CADW, the corresponding Welsh body, in parallel with the European Landscape Convention) and the international project 'Protecting Historical Cultural Landscapes to Strengthen Regional Identities and Local Economies (Cultural Landscapes)' (www.cadses.ar.krakow.pl), the authors introduce results of studies concerning the importance of archiving the complex information systems of cultural landscapes. The primary aim of this contribution is to reaffirm the importance of documenting the complex juxtaposition of the multi-temporal landscape elements that comprise the cultural landscape, whilst emphasising the importance of archiving those complex information systems for permanent interdisciplinary access.
Fig. 4.3. Typical landscape of Cumbria looking towards the head of Great Langdale; the farm in the distance is Stool End, which certainly dates from the medieval period but possibly from the ninth-century Norse occupation (source: Robert Dixon-Gough).
Fig. 4.4. Llanberis Pass in the Snowdonia National Park, North Wales; the image is taken from one of the levels of the disused Dinorwic Slate Quarry; it is an example of an historic, post-industrial and cultural landscape, and the quarry workings now form part of a country park maintained by the local community (source: Robert Dixon-Gough).
The thesis of this contribution is to demonstrate the need for comprehensive archiving of information on past human impacts on cultural landscapes, with the aim of preserving and protecting them for future generations. Mander et al. (2004), in their research on the impact of human influence, consider that multi-disciplinary analysis of various aspects of human impact on different landscapes gives a basis for the planning and development of further sustainable landscapes. Whilst throughout Europe cultural landscapes have largely been determined by agriculture, other factors can play a key role: in industrial areas, for instance, the landscapes are influenced not only by mining and restoration activities but also by polluted air and water fluxes (Mander & Jongman 1998), factors that are also true for urbanisation. The history of these landscapes differs from that of natural landscapes (Meeus et al. 1990), since both the degree and the intensity of disturbances are greater; the decisions made by man are the main influence on land-use patterns, while ecological features and ecological processes play a less dominant role (Fig. 4.4). Through the interdisciplinary archiving of
Archiving the Complex Information Systems of Cultural Landscapes 83
these cultural landscapes there is the opportunity to combine practice with science in the development of an interdisciplinary approach to cultural landscapes. In any comparison between the UK and another country there is always the invidious problem that, whilst it is possible to make some generalised comments about the United Kingdom, it should always be remembered that there are three separate jurisdictions that have evolved quite separately. Therefore, for the sake of this contribution, the majority of the comments will relate to the jurisdiction of England and Wales and, in particular, to England.
4.2 History of Archives in the Respective Countries

In both countries, the earliest archives were based upon monastic and ecclesiastic records. Those of England can be said to date back to the Domesday Survey conducted in the years following the Norman Conquest of 1066; although it is likely that this was based upon Saxon land records (Cahill 2002: 21), it created a single record of land ownership patterns. The first archives in Poland date back to the end of the 12th century and were maintained by the Church (diocesan and monastic), the towns, provincial rulers and magnates. Archiwum Koronne (the Crown Archive, also called the Kraków Archive) was established in the middle of the 14th century, while Archiwum Metryki Koronnej (the Archive of the Crown Register), which subsequently became the central archive of the Commonwealth, was founded towards the end of that century. Following the Norman Conquest of England, the nation entered a period of relative stability, with much of the land distributed between the supporters of the Crown (the Manors) and the Church, and with the records shared between those two authorities as Manorial and Ecclesiastic Records. This period was referred to by Cahill (2002: 20) as the First Great Land Grab, although nothing really changed other than who actually owned the land and maintained the records. Up to the beginning of the twentieth century, the Partitions of Poland influenced the fate of Polish archives, which varied depending on the policies pursued by the partitioning powers: Tsarist Russia, Prussia and Austria. Also during this period, England subjugated and absorbed Wales and underwent two more ‘Land Grabs’: Henry VIII’s dissolution of the monasteries in the 1530s and the reallocation of their land and property,
and the redistribution of land following the institution of Cromwell’s Republic in 1649. In both instances, these actions led to the loss of land records. However, for much of the period between 1087 and the beginning of the twentieth century, comprehensive land records were maintained throughout England and Wales, first as separate nations and later as the single jurisdiction of England and Wales. At a basic level, some 13,000 manors maintained land records, many of which are still in existence in various archives (the Manorial Records). The monasteries also maintained records, some of which were lost during the dissolution of the monasteries, although many of these records were replicated at a parish level. Records were also maintained by the Counties and the Exchequer (the treasury). Some of the most comprehensive of these land records were maintained by the Church of England, which through its parishes created the parish tithe maps showing every parcel of land and dwelling within each parish; these were used as the basis for church taxation throughout England and Wales until 1929, a process formalised under the Tithe Commutation Act of 1836, which led to the production of accurate maps of all parishes. In addition, there remains one further set of land records in England and Wales, those of the Enclosure Acts and Awards, which showed how the land was enclosed, the general boundaries of each parcel, and who owned each individual parcel. During the twentieth century, a further significant land record was added to those of England and Wales in the form of the 1910 Land Valuation Survey, conducted as part of a programme of land reform and land taxation instigated by the Liberal government of the time, for which every dwelling, business and parcel of land was identified on Ordnance Survey 1:2,500 maps (1:10,000 in mountainous areas) and referenced to a Land Book, in which the ownership, occupier and value of each was listed.
In Poland, following the regaining of independence in November 1918, the state archives of the Republic of Poland were established. These archives were charged with the gathering, custody, research and the provision of access to the national archival collection. In independent Poland, several archival centres were established in Warsaw: the General Archive of Historical Records, the Archive of Historical Records, the Treasury Archive, the Archive of Public Education, and the Army Archive. These were transformed, in 1930, into the Archive of New Records. Polish archives suffered heavy losses during World War Two. The worst affected were the Warsaw archives, which were 95 per cent destroyed (all the central archives and the Archive of the Capital City of
Warsaw were almost totally destroyed). Similarly, but to a much lesser extent, many of the land records for England and Wales were destroyed during this period, though on a much more random basis, since at that time there was little centralisation of records, and those records that had been centralised had been moved to ‘safe’ locations. To counteract the problem in Poland, a Decree on State Archives was issued on March 29, 1951, which introduced a new organisation of archival administration as well as the notion of the state archival resources (comprising the archives produced by state organs and institutions, the records of dissolved private enterprises, land estates, parties and organisations, as well as of families and private persons who played a noteworthy historical role). As a result of changes in the socio-political system, self-government, family and economic archives were incorporated into the state archives. The Head Office of State Archives was established as the body in charge of archive collections. The 1983 Act of Parliament, replacing the 1951 decree, introduced the notion of the national archival resource, comprising the entire body of archival material preserved and produced on Polish territory irrespective of the nature of ownership, as well as those records which, in line with international law and customs, should have belonged to Poland even if they exist beyond the country's frontiers. The General Director of State Archives became the central organ of state administration in all matters pertaining to archive keeping (www.archiwa.gov.pl). Throughout England and Wales, the organisation of records proceeded during this period on an ad hoc basis using a combination of national (for both England and Wales) and local resources. To a great extent, this arrangement still exists, with (in the case of England) a Public Record Office located in London linked to regional offices on the basis of administrative counties.
At present, there are three central archives in Poland, with headquarters in Warsaw:
• the Archiwum Główne Akt Dawnych (Central Archives of Historical Records), preserving the records of the central, and partially provincial, authorities as well as the archives of families of all-Polish importance produced prior to 1918;
• the Archiwum Akt Nowych (Central Archives of Modern Records), keeping the records of the central authorities, and institutions and associations of national importance, as well as the documents and papers of outstanding political and social leaders produced after 1918; and
• the Narodowe Archiwum Cyfrowe (National Digital Archives), preserving photographic and phonographic records as well as film documentation produced since the beginning of the 20th century.
Unfortunately, these archives provide little information relating to the preservation of cultural landscapes, and the process itself needs to be formalised and made accessible in a more thematic manner. The situation in England is similarly fragmented, but most of the records are accessible via digital archives or, if not, are at least linked through search engines that make it possible to locate the records and to apply thematic search criteria. The main archives for land records are, at a national level:
• The National Archives, through the Public Record Office located in London, accessible through search engines to locate geographically and thematically relevant records; this is in turn linked through the same search engine to the county archives; and
• The National Monument Record Centre, part of English Heritage, a recognised place of deposit under the Public Records legislation with high environmental standards for the storage of photographs and other archives. This includes the former Ordnance Survey Archaeological Record, the former National Buildings Record, the National Library of Air Photographs and the archives and information created and acquired by the former Royal Commission on the Historical Monuments of England.
On a more local basis are, for example:
• The County Archives, which collate information on a local basis and include such material as Manorial Records, Enclosure Acts and Awards, Tithe Surveys, and Land Valuation books and maps; and
• The Historic Environment Records (HERs), mainly local authority-based services used for planning and development control. They also operate a public service and fulfil an educational role. These records were previously known as Sites and Monuments Records (SMRs).
The name has changed to reflect the wider scope of the information they now contain or aspire to maintain. Irrespective, however, of the nature of the archives and the land-related information contained within them, it is of critical importance that the available data are interpreted in an appropriate manner.
4.3 Complex Information Systems of Cultural Landscapes

Most European countries, and indeed particular regions within those countries, have their own peculiar and unique historical and cultural landscapes. However, this diversity is increasingly being placed under pressure and endangered as a result of general neglect in both the legislative framework and its implementation in environmental protection and conservation, the need to intensify and industrialise agriculture, the provision of more capacity for tourism and leisure, together with the constant pressure of changing socio-economic processes. For example, in the project ‘Protecting Historical Cultural Landscapes to Strengthen Regional Identities and Local Economies (Cultural Landscapes)’ (www.cadses.ar.krakow.pl), the primary aim was the protection and development of cultural landscapes for preservation within CADSES (Central, Adriatic, Danubian and South-Eastern European Space), with the objective being the development of programmes for sustainable regional development. The systems used within the UK have similar aims but have been developed over a longer timeframe on a systematic basis and with slightly different objectives, largely related to rural and urban planning. The process of Historic Landscape Characterisation (HLC) within the UK has developed since the 1960s through the concept of ‘character’ that was promoted in the Conservation Area legislation of 1967 (Clark et al. 2004: 1). This evolved further through the 1990s with Landscape Character Assessment and the English Heritage Historic Landscape Project of 1992-4 (Fairclough et al. 1999). English Heritage’s strategy on landscape had emphasised important policies such as Countryside Stewardship, Environmentally Sensitive Areas (ESAs), Areas of Outstanding Natural Beauty (AONB) Management Plans, Conservation Area Appraisals, Development Plans and Development Control as methods for landscape management (Fairclough et al. 1999: 19).
These ideas were incorporated in 1994 and further endorsed by the Department of the Environment’s Planning Policy Guidance notes ‘PPG7 The Countryside – Environmental Quality and Economic and Social Development’ and ‘PPG15 Planning and the Historic Environment’. The final stage in the evolution of HLC came through the European Landscape Convention (ELC), for which it has become the principal method of achieving the Convention’s objectives through an integrated, holistic and multi-disciplinary management of spatial data. In a similar manner to HLC, the ELC promotes landscapes (particularly cultural landscapes) as an important aspect of a common heritage in both rural
and urban communities and requires a comprehensive understanding of the actions and actors influencing sustainable change through the democratic participation of all interested parties. This is summarised succinctly by Fairclough et al. (1999: 56) as: ‘for example developing awareness of local identity, academic understanding, designations and planning policies, development appraisal, management or grant assessment’. The current programme of work by English Heritage on Historic Landscape Characterisation effectively began during the early part of the 1990s as a means of integrating the historical landscape into a framework of general landscape assessment work. Prior to this, much of the available data was effectively point data related to landscape elements such as specific sites, ancient monuments, and listed buildings, without relating them to a larger spatial entity such as a historic or cultural landscape. With the introduction of the Department of the Environment’s (DoE) Planning Policy Guidance Note 15 (PPG15), it became necessary to develop a suitable tool that addressed the specific spatial planning and conservation needs of a historic and culturally important landscape. Through a period of consultation, this concept led to a methodology based upon a universal character assessment that would serve conservation objectives together with those of sustainability (Fairclough et al. 2002: 71). The results of this consultation project were published as ‘Yesterday’s World, Tomorrow’s Landscape’ (Fairclough et al. 1999), which emphasised the role of landscape characterisation in helping to influence decisions about the future appearance of the landscape, and to inform them historically and archaeologically, rather than trying to prevent all change in a few areas. It also carries the message that landscape, conceptually, only exists in the here-and-now or in whatever form we choose in the future (Fairclough et al. 2002: 72).
In this context, English Heritage identified the following guidance. Firstly, it recognised that all landscapes are historic and cultural but, in addition, that the character of a landscape is based upon a combination of factors such as ecology and scenic values, which in turn influence the settlement patterns, social structures, industry, communication networks, and economic structure. In this respect, it has been recognised that all landscapes have some form of historical character, often based upon some dominant factor; cultural landscapes per se do not form part of this characterisation. Furthermore, all landscapes are the product of change, some more dynamic than others, and the patterns and inter-relationships that exist within the landscape are factors of evolution, continuity, and change. The methodology adopted by English Heritage works on a county-wide basis, rather than the smaller administrative unit of the parish, and is based upon chronological, spatially-related data and evidence, predominantly derived from maps and aerial photography. This HLC work undertaken on a county-wide basis has been complemented by intensive, large-scale urban surveys in towns and cities throughout the UK (Winterburn 2008). One of the first of these was the Lancashire Historic Towns Survey (Anon 2005). The relative merit of this approach is that whilst rapid progress can be made over large regions, it is often necessary to make more detailed surveys at a larger scale using the more traditional methods of historical research, often over long time periods and usually only in small areas, with the local and regional administration bearing the greatest burden. In this respect, Grover et al. (2000) reported that most local authorities have at least one dedicated conservation officer and many have dedicated teams drawn from a variety of relevant backgrounds, typically from professions such as architecture, planning, history, and archaeology. One of the principal factors related to the composition and size of the teams is the perceived importance of the conservation area or historic and cultural landscape within the respective areas of responsibility. Ball et al. (2006) identified one of the main problems of historic and cultural landscapes across the UK: that of the farm buildings that form one of the basic elements of these landscapes. Over the past 50 years many traditional farm buildings have become redundant as a result of changes in farming practices and farm structures. In some cases, those buildings have fallen into disrepair, been demolished, or been converted to other uses (residential or industrial).
To what extent do the ‘new’ uses form part of the landscape, and should records be maintained of those buildings that have been lost, either through cartographic or photographic evidence? These changes will occur throughout Europe as the income of the farming community falls; in the UK it fell by some 60% in real terms between 1995 and 2005 (Defra 2005). Such is the fragility of these important landscape elements that English Heritage has commissioned a series of studies on historic farmsteads (see, for example, Lake et al. 2006a, 2006b) to identify the best examples and those at most risk, based upon the methodology developed by Gaskell and Owen (n.d.). One of the principal questions that must be asked of the system developed specifically throughout England and Wales is who, therefore,
should take the financial and intellectual responsibility of maintaining a national archive of historic and cultural landscapes across the jurisdictions of the UK: the European Union through its Landscape Convention; the national government, which defines what should be archived; the county-level administration, which applies a small-scale coverage; or the local authorities, which are responsible for the large-scale investigations of historic or cultural landscapes? The system developed as part of ‘Protecting Historical Cultural Landscapes to Strengthen Regional Identities and Local Economies (Cultural Landscapes)’ (hereafter ‘the project’), however, provides a more systematic inventory of the landscape elements which comprise the cultural landscape. Based on an intersectoral approach, including the protection of nature and heritage as well as the development of rural areas, and on international research structures, the project has developed and integrated some of the best examples of the implementation of the European Landscape Convention. Increasingly, as the result of international research and co-operation, the knowledge of issues concerning cultural landscapes, such as the digital land register and the widely available web portal Landscape Wikipedia, evolves and develops and must, therefore, be included in this material. These research and co-operation programmes have the common aim of integrating landscapes with regional development either systematically or through pilot projects, for example in the fields of agriculture, tourism, regional markets, and renewable sources of energy. Moreover, the project, coordinated in Poland, also aims at the development of integrated strategies for cultural landscape protection and establishes a programme of development at regional, national and CADSES levels through the creation of an international network to facilitate the implementation of the European Landscape Convention in the CADSES area.
This form of cooperation is undoubtedly relevant to the European landscape since cultural landscapes transcend international boundaries; whereas the system adopted in the UK might be suited to ‘island states’, it might prove difficult to transpose such a system across Europe.
4.4 Archiving Cultural Landscapes

It is possible to identify distinct differences yet significant similarities between the approaches used throughout the UK and those suggested by the project. In both approaches, the complex information systems of cultural landscapes should be based on a balanced principle: what one uploads equals what one downloads. However, this principle should only be examined across a longer time period, since the principle of balance will only apply over that longer period. In the short term, there are likely to be forms of evolution that are unbalanced, such as changes in agricultural production, the development of new communication links, and changes in land ownership patterns. On the basis of experience gained during the successful completion of the project, it would appear that all data about cultural landscapes should be archived, and that all archived data about cultural landscapes have the potential to be used in the future, for example for studies and works in the fields of flood protection or climate change. This principle is also true in the UK since, as the use and development of HLC increases, so too does the pool of knowledge, the platform of experience, and the methodology evolved through more advanced interpretative approaches and more complex classifications (Fairclough et al. 2002: 78). The generation of databases of cultural and historic landscapes has evolved from paper-based projects, maps and images towards a more integrative approach utilising ontological studies and definitions and their representation through GIS.
The evolution of the use of GIS in HLC was summarised by Aldred and Fairclough (2003), who defined it as successive waves of activity:
• Wave 1, characterised by an intensive use of historical maps and documents and a reconstruction of the landscape using GIS mainly as a CAD-style drawing and inventory tool rather than utilising the full functionality of GIS;
• Wave 2, which developed further the use of GIS to introduce time-depth analysis and to establish it as a practical approach, with digitised records (from paper-based documents) and attribute values attached to polygons that increased the range and scope of products;
• Wave 3, whose developments included the potential of GIS to characterise and analyse spatial datasets with separate attributes, through multiple attribute data, and to develop model concepts for spatial and temporal analysis (Dyson-Bruce et al. 1999; Dyson-Bruce 2002a, 2002b); whilst
• Wave 4 has benefited from better digital map bases such as the Ordnance Survey’s MasterMap (see http://www.ordnancesurvey.co.uk/oswebsite/products/osmastermap/), an improved level of data consolidation and attribute data, and the possibility of improved queries and interpretation.
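The ‘Wave 2/3’ idea of attaching attribute values, including time-depth, to character polygons can be illustrated with a minimal sketch. All field names, identifiers and records below are invented for illustration; a real HLC database would hold geometries and many more attributes in a GIS, not plain Python dictionaries.

```python
# Hypothetical sketch of attribute-based HLC records with time-depth:
# each "polygon" carries a present character type, a period, and the
# earlier character it replaced. All values are invented examples.

hlc_polygons = [
    {"id": "HLC001", "type": "enclosed fieldscape", "period": "post-medieval",
     "previous_type": "open field", "source": "tithe map 1840"},
    {"id": "HLC002", "type": "conifer plantation", "period": "modern",
     "previous_type": "moorland", "source": "OS 1:2,500 first edition"},
    {"id": "HLC003", "type": "ancient woodland", "period": "medieval",
     "previous_type": None, "source": "estate survey 1610"},
]

def by_period(polygons, period):
    """Time-depth query: ids of all character polygons of a given period."""
    return [p["id"] for p in polygons if p["period"] == period]

def changed_from(polygons, earlier_type):
    """Ids of polygons whose recorded earlier character matches earlier_type."""
    return [p["id"] for p in polygons if p["previous_type"] == earlier_type]

print(by_period(hlc_polygons, "medieval"))     # ['HLC003']
print(changed_from(hlc_polygons, "moorland"))  # ['HLC002']
```

The same attribute-query pattern, applied to polygon layers rather than dictionaries, is what distinguishes the later waves from the CAD-style drawing of Wave 1.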
For example, in the case of the HLC for the county of Hampshire (southern England) discussed by Fairclough et al. (2002), the basic approach has been tiered. At the bottom of the tier is the County Structure Plan, the main strategic document for spatial planning throughout England, which specifically states that historic landscapes need to be identified to allow planners and landscape managers to assess all parts of the county’s historic landscape in context rather than focussing upon specific sites and locations. In addition, more detailed and specific HLC plans at a District level have been drawn up that address specific landscape elements, such as remnants of longstanding areas of medieval hunting forest with distinctive settlement and field patterns. Finally, at a local level, HLC is carried out in connection with developments (such as a new road or the extension of an airport), specific groups of fields or settlement patterns, parts of villages, towns, and conservation areas (for examples see EH/EP 2006); in combination, this provides a detailed web of information relating to the historic and cultural nature of both the rural and urban landscape. This pool of knowledge will be systematically increased and, given good archiving systems and a constant updating of both the systems and the spatially-related data, its versatility will increase and improve. One tool that can provide the integration, integrity, and comparability of such datasets across a wider international and interdisciplinary network is that of ontologies. The term ontology has been used for different purposes across different learned communities but, in a mapping sense, is normally taken to refer to a shared vocabulary or conceptualisation of a specific subject matter.
In this respect, ontologies may be used to make the conceptualisation of a domain (for example, geography or a cultural landscape) more explicit, thereby removing the possibility of ambiguities arising in definition and context. When encoded within a suitable software language, ontologies may be used in the following relevant areas (Visser & Schlieder 2003: 100):
• Systems engineering: to identify the requirements and inconsistencies of the design of a system, thereby assisting in the acquisition of and search for the available information, together with any extensions to that information;
• Information integration: to provide semantic interoperability, such as the exchange of information between different systems and the integration of systems around a defined core domain for which the vocabulary of the system must be explicit; and
• Information retrieval: normally dependent upon the specific encoding of the available data through such methodologies as fixed classification codes or simply full-text analysis; in these instances, an ontological system serves as a common factor to match queries against potential results on a semantic basis.
In order to achieve the relationship between ontologies and historic and cultural landscapes, it is necessary to model static knowledge, such as parcels and buildings, whilst simultaneously dealing with processes and abstract concepts such as rights, relationships, and visual interpretation. Thomson & Béra (2007) have extended this concept by investigating the linkage between human perceptions of spatial relationships with respect to land use and landscape characteristics, in order to construct an ontology that may be used to infer land-use information for topographic maps. However, the greatest difficulty encountered was the definition of boundaries between landscape types, caused by the ‘fuzzy’ nature of where the respective landscapes start and end, particularly when attempting to archive spatially-related data. While carrying out the project it has also been realised how important all spatially-related archived data are for a wide range of differing functions, one of the most pertinent being environmental purposes such as flood protection. However, acquiring and storing such a wide spectrum of archival data is costly, both in terms of the storage of the data and the maintenance of the archival systems, and in terms of the cost of human resources. It should always be remembered that this issue should be costed and evaluated from the perspective of long time frames; unfortunately, however, this will be a decision made within a political system in which competing demands are made upon the economics of the state, which makes the system developed in the UK more sustainable since the cost and responsibility are shared across administrative units.
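The ontology-based retrieval described above can be sketched very minimally: a query for a broad class is expanded through subclass relations before matching records. The class names, relations and record identifiers here are invented for illustration; a production system would use RDF/OWL tooling rather than plain tuples.

```python
# A toy ontology as subject-predicate-object triples. "subClassOf" encodes
# the class hierarchy; "documentedBy" links landscape records to record
# types. All names are hypothetical examples, not a real vocabulary.

triples = {
    ("TitheMap", "subClassOf", "HistoricMap"),
    ("EnclosureAward", "subClassOf", "HistoricMap"),
    ("HistoricMap", "subClassOf", "LandRecord"),
    ("HLC001", "documentedBy", "TitheMap"),
    ("HLC002", "documentedBy", "EnclosureAward"),
}

def subclasses(cls):
    """All classes that are (transitively) subclasses of cls, plus cls itself."""
    found = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "subClassOf" and o in found and s not in found:
                found.add(s)
                changed = True
    return found

def retrieve(record_class):
    """Semantic retrieval: records documented by any subclass of record_class."""
    classes = subclasses(record_class)
    return sorted(s for s, p, o in triples if p == "documentedBy" and o in classes)

print(retrieve("LandRecord"))  # ['HLC001', 'HLC002']
```

A query for ‘LandRecord’ thus finds records indexed only as tithe maps or enclosure awards, which is precisely the semantic matching of queries against potential results that a fixed classification code or full-text search cannot provide.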
These considerations are, however, beyond the scope of this contribution.
4.5 Need for Permanent Interdisciplinary Access

Data concerning historic and cultural landscapes must be easily accessible, not only to professionals, researchers and academics but to the general public, in different forms that will satisfy their individual needs and requirements. The main tool developed for HLC within the UK has been Geographical Information Systems, linked to existing maps and relational databases to allow the analysis, evaluation, and availability of data either in printed form (reports disseminated through the internet as *.pdf files) or, in some instances, through direct availability of the data. Within such data structures, it is not only possible to access the data stored but also to use these data for spatial planning purposes, although the scales and general nature of the county-based HLCs must be recognised and, where necessary, larger-scale HLCs implemented. Within historic and cultural landscapes, there is also the possibility of archiving other forms of information of a truly cultural nature, such as anecdotal history, local music and customs, the distinctive sounds of the region or parts of the region, and even, in the future, the smell of villages and the taste of local food. Much of this is currently difficult to document and will certainly increase the cost, the time taken to implement the system, and the complexity of the system. However, it is perfectly viable to use the historic and cultural landscape information system as a hub linking together diverse and varied databases. It should never be forgotten that people are a part of those landscapes and that they have, over many centuries, successfully transferred social behaviour, traditions and local customs from one generation to another, whilst photographs have now been taken for more than 150 years recording many scenes, both landscapes and customs, that should form an integral part of this information system. The collection of such data should be both systematic and community-based, again linked to the main hub of the system. One further element, alluded to above, is the concept of auditory historic and cultural landscapes.
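The hub idea sketched above amounts to keying separately maintained cultural databases (photographs, oral history, sound recordings) to the same landscape-area identifiers used by the spatial records. A minimal illustration, with all area names, file names and records invented for the purpose:

```python
# Hypothetical sketch of a landscape information system acting as a hub:
# character records keyed by area id, with diverse cultural databases
# joined on the same key. All data are invented examples.

hlc_areas = {
    "LLAN01": {"name": "Llanberis Pass", "type": "post-industrial quarry landscape"},
    "LANG01": {"name": "Great Langdale", "type": "medieval farmed valley"},
}

photo_archive = [
    {"area": "LLAN01", "file": "dinorwic_1890.tif", "year": 1890},
    {"area": "LANG01", "file": "stool_end_1935.tif", "year": 1935},
]

oral_history = [
    {"area": "LLAN01", "title": "Quarrymen's recollections", "medium": "audio"},
]

def linked_records(area_id):
    """Gather everything the hub knows about one landscape area."""
    return {
        "area": hlc_areas[area_id],
        "photographs": [p for p in photo_archive if p["area"] == area_id],
        "oral_history": [o for o in oral_history if o["area"] == area_id],
    }

record = linked_records("LLAN01")
print(record["area"]["name"], len(record["photographs"]), len(record["oral_history"]))
```

In practice the join key would be a spatial reference rather than a string id, but the design point is the same: the cultural databases remain independently curated, and the landscape information system only links them.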
For example, Mills (2005) investigated the significance of auditory archaeology in reconstructing the influence and significance of sound in present and past daily life. Although this project was primarily aimed at how planned developments might impact upon the auditory senses and how the ‘tranquillity’ of cultural and historic landscapes might be adversely affected by the acoustic component of developments, it nevertheless gives indications of how such techniques can be integrated within landscape studies.
4.6 Conclusions

The present system of archiving historic and cultural landscapes across Europe is incomplete and generally inadequate for future use. On the one hand, we have considered the system of HLC used across the UK, in particular in England and Wales, which forms a tier in the spatial planning network yet whose principles may be applied at a variety of different scales for different purposes and functions, linked together by databases using GIS as an enabling system. In comparison, the Polish institutions responsible for archiving do not cover cultural landscapes comprehensively, and there is a distinct need to widen research and work within the archiving of cultural landscapes for many future studies and developments, especially those related to the environmental fields of climate change and flood protection. It is also desirable and necessary to monitor all agricultural, urban and forest areas. In addition, there is the possibility of forming an institution (a museum) that would be central to the archived cultural heritage at a regional level; it would be possible to create this on the basis of the existing Archaeological Museum. Taking this into account, the proposed archiving of the complex information system of historic and cultural landscapes for permanent interdisciplinary access could be more complete, and certainly more effective, than the present state of archiving cultural landscapes.
4.7 Acknowledgements

The elements of this contribution relating to the Polish situation were prepared on the basis of results and experience gained within the international project “Protecting Historical Cultural Landscapes to Strengthen Regional Identities and Local Economies (Cultural Landscapes)” within the European Union programme Interreg III B CADSES (www.cadses.ar.krakow.pl). Dr. Hernik would, in particular, like to express his deep and sincere gratitude to Dipl.-Ing. Horst Kremers for his detailed and constructive comments, and for his important support throughout this contribution.
96 J. Hernik, R. Dixon-Gough
4.8 References

Aldred O. and Fairclough G. 2003. Historic Landscape Characterisation: Taking Stock of the Method. The National HLC Method Review 2002, carried out for English Heritage by Somerset County Council. Available at english-heritage.org.uk. Retrieved February 9, 2009.
Anon. 2005. Lancashire Historic Towns Survey. Available at: http://www.lancashire.gov.uk/environment/archaeologyandheritage/historictowns/index.asp. Retrieved February 13, 2009.
Antrop M. 2005. “Why landscapes of the past are important for the future.” [In] Landscape and Urban Planning, 70, pp. 21–34.
Baker D. and Chitty G. 2002. “Heritage Under Pressure: A Rapid Study of Resources in English Local Authorities.” Prepared by Historic Environment Conservation/Hawkshead Archaeology and Conservation for English Heritage, London. Available at: http://www.english-heritage.org.uk/server/show/nav.001003005001002/chooseLetter/H. Retrieved February 11, 2009.
Ball D., Edwards R., Gaskell P., Lake J., Mathews M., Owen S. and Trow S. 2006. Living Buildings in a Living Landscape: Finding a Future for Traditional Farm Buildings. University of Gloucestershire in association with English Heritage and the Countryside Agency. Available at www.helm.org.uk/ruraldevelopment. Retrieved February 11, 2009.
Cahill K. 2002. Who Owns Britain: The Hidden Facts Behind Land Ownership in the UK and Ireland. Canongate, Edinburgh.
Clark J., Darlington J. and Fairclough G. 2004. Using Historic Landscape Characterisation: English Heritage’s review of HLC Applications 2002–03. English Heritage & Lancashire County Council.
Claval P. 2005. “Reading the rural landscape.” [In] Landscape and Urban Planning, 70, pp. 9–19.
Cook H. and Williamson T. 1999. “Introduction: landscape, environment and history.” [In] Cook and Williamson (eds.). Water Management in the English Landscape. Edinburgh University Press, pp. 1–14.
Defra. 2005. Agriculture in the United Kingdom 2005. Department of Environment, Food and Rural Affairs, London.
Dyson-Bruce L. 2002a. “Historic Landscape Assessment – The East of England Experience.” [In] Archaeological Informatics: Pushing the Envelope. Computer Applications and Quantitative Methods in Archaeology Conference 2001, Gotland, BAR International Series, 1016, pp. 35–42.
Dyson-Bruce L. 2002b. “Historic Time Horizons in GIS: Historic Landscape Assessment East of England Project.” [In] Kidner D., Higgs G. and White S. (eds.). Innovations in GIS 9: Socio-Economic Applications of Geographical Information Science, pp. 107–118, ESRI UK.
Dyson-Bruce L., Dixon P., Hingley R. and Stevenson J. 1999. Assessing Historic Landuse Patterns. Report of the pilot project 1996–98. Research Report, Historic Scotland & RCAHMS, Edinburgh.
EH/EP. 2006. Graylingwell Hospital, Chichester: Historic Landscape Characterisation. Final draft, 16 August 2006, English Heritage/English Partnerships, Swindon.
European Landscape Convention. 2000, October 20. Florence. http://www.coe.int/t/e/cultural_co%2Doperation/environment/landscape/presentation/9_text/02_Convention_EN.asp#TopOfPage
Fairclough G., Lambrick G. and McNab A. (eds.). 1999. Yesterday’s World, Tomorrow’s Landscape: the English Heritage Landscape Project 1992–94. English Heritage, London.
Fairclough G.J., Lambrick G. and Hopkins D. 2002. “Historic Landscape Characterisation in England and a Hampshire case study.” [In] Fairclough and Rippon (eds.). Europe’s Cultural Landscape: Archaeologists and the Management of Change. Europae Archaeologiae Consilium & English Heritage, Brussels & London, pp. 69–83.
Gaskell P. and Owen S. n.d. Constructing the evidence base. University of Gloucestershire in association with English Heritage and the Countryside Agency. Available at www.englishheritage.org.uk/hc/upload/pdf/Historic_farm_buildings_full.pdf. Retrieved February 10, 2009.
Grover P., Thomas M. and Smith P. 2000. Local Authority Practice and PPG15: Information and Effectiveness. Report prepared for English Heritage, Institute of Historic Building Conservation, and the Association of Local Government Archaeological Officers, Environmental Design & Conservation Research Group, School of Planning, Oxford Brookes University, Oxford.
Lake J., Edwards R. and Wade Martins S. 2006a. Historical Farmsteads: Preliminary Statement, South East Region. University of Gloucestershire in association with English Heritage and the Countryside Agency. Available at www.helm.org.uk/ruraldevelopment. Retrieved February 11, 2009.
Lake J., Edwards R. and Wade Martins S. 2006b. Historical Farmsteads: Preliminary Statement, North West Region. University of Gloucestershire in association with English Heritage and the Countryside Agency. Available at www.helm.org.uk/ruraldevelopment. Retrieved February 11, 2009.
Mander Ü. and Jongman R.H.G. 1998. “Human impact on rural landscapes in central and northern Europe.” [In] Landscape and Urban Planning, 41, pp. 149–154.
Mander Ü., Palang H. and Ihse M. 2004. “Development of European landscapes.” [In] Landscape and Urban Planning, 67, pp. 1–8.
Martins S.W. 2004. Farmers, Landlords and Landscape: Rural Britain, 1720–1870. Windgather Press, Macclesfield.
Martins S.W. and Williamson T. 1999. “Roots of Change: Farming and the Landscape in East Anglia, c1700–1870.” [In] The Agricultural History Review, Supplement Series 2. British Agricultural History Society, Exeter.
Meeus J.H.A., Wijermans M.P. and Vroom M.J. 1990. “Agricultural landscapes in Europe and their transformation.” [In] Landscape and Urban Planning, 18, pp. 289–352.
Mills S. 2005. Applying Auditory Archaeology to Historic Landscape Characterisation: A pilot project in the former mining landscape of Geevor and Levant Mines, West Penwith, Cornwall. A report for English Heritage by the Cardiff School of History and Archaeology, Cardiff University. Available from www.cardiff.ac.uk/hisar/people/sm/aa_hlc/Text/AA_HLC_Report.pdf. Retrieved February 9, 2009.
Stöglehner G. and Schmid J. 2007. “Development of Cultural Landscapes – Austrian Situation and Future Perspectives in the light of ELS.” [In] Hernik J. and Pijanowski J.M. (eds.). Cultural Landscape – Assessment, Protection, Shaping. Wyd. AR Kraków, pp. 59–68.
Thomson M.-K. and Béra R. 2007. “Relating Land Use to the Landscape Character: Toward an Ontological Inference Tool.” [In] Winstanley A.C. (ed.). Proceedings of the GIS Research UK Conference, GISRUK’07, Maynooth, Ireland, April 11–13, 2007, pp. 83–87.
Visser U. and Schlieder C. 2003. “Modelling real estate transactions: the potential role of ontologies.” [In] Stuckenschmidt H., Stubkjær E. and Schlieder C. (eds.). The Ontology and Modelling of Real Estate Transactions, pp. 99–113, International Land Management Series, Ashgate Publishing Ltd., Aldershot.
Williamson T. 2002. The Transformation of Rural England: Farming and the Landscape, 1700–1870. University of Exeter Press, Exeter.
Williamson T. 2003. Shaping the Medieval Landscape: Settlements, Society and Environment. Windgather Press, Macclesfield.
Winchester A.J.L. 1987. Landscape and Society in Medieval Cumbria. John Donald Publishers Ltd., Edinburgh.
Winterburn E. 2008. “Historic Landscape Characterisation in Context.” FORUM Ejournal for Post Graduate Studies in Architecture, Planning and Landscapes, 8(1), pp. 33–46. Newcastle University (available at http://research.ncl.ac.uk/forum/). Retrieved February 13, 2009.
www.archiwa.gov.pl. Website of the State Archives. Retrieved January 22, 2009.
www.cadses.ar.krakow.pl. Website of the project Cultural Landscapes. Retrieved January 22, 2009.
www.ordnancesurvey.co.uk/oswebsite/products/osmastermap/. Retrieved February 26, 2009.
Section II Sustainability in Terms of Geospatial Preservation
5 State-of-the-Art Survey of Long-Term Archiving – Strategies in the Context of Geo-Data / Cartographic Heritage (Nico Krebs, Uwe M. Borghoff) 101
6 Preservation of Geospatial Data: the Connection with Open Standards Development (Steven P. Morris) 129
7 Pitfalls in Preserving Geoinformation – Lessons from the Swiss National Park (Stephan Imfeld, Rudolf Haller) 147
8 Geospatialization and Socialization of Cartographic Heritage (Dalibor Radovan, Renata Šolar) 161
5 State-of-the-Art Survey of Long-Term Archiving – Strategies in the Context of Geo-Data / Cartographic Heritage
Nico Krebs, Uwe M. Borghoff
Institute for Software Technology, Universität der Bundeswehr München, Germany
[email protected]
Abstract Long-term preservation of digital artifacts is a rather young discipline in computer science whose relevance is constantly growing. Most approaches to long-term preservation rely either on the migration of the digital artifacts or on the emulation of their rendering environment. The popular migration approach is used to archive simple or automatically transformable document formats like texts and pictures. In recent years, acceptance of the emulation approach has grown significantly. It appears to be particularly well suited for archiving complex digital objects that highly depend on the rendering environment they were designed for, including complex databases, entire applications, and even graphical, real-time computer games. This article gives an overview of both approaches and explores their suitability for the long-term preservation of geo-data in particular.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_5, © Springer-Verlag Berlin Heidelberg 2011

5.1 Introduction

Within living memory, the preservation of cultural heritage has been an ongoing task, documented by surviving artifacts like the Rosetta Stone. With the help of these artifacts, historical events can be reconstructed and past cultures can be researched. But what will survive of our digital world? How do we preserve our cultural heritage for the generations that follow? Rothenberg’s (1999) statement is still true: “There is as yet no viable long-term strategy to ensure that digital information will be readable in the future.”
Geospatial data will be of great interest to future generations. Today’s cartographic artifacts could be used for historical research and contemporary planning, to comprehend human-induced changes to the land, and even to make predictions about the future. Our generation in particular creates geospatial data of high accuracy, and the volume of this data grows continuously. This prospective cartographic heritage will be invaluable to future generations, provided our generation is able to solve the problem of preserving it. The sheer amount of available data makes it almost impossible to perform extensive “data archaeology” on the whole digital heritage; structured archives should help to store and retrieve data in a target-oriented and efficient way. To ensure the accessibility of our data to future generations, measures have to be taken to adapt our data or its representation to emerging technologies. One major approach is to migrate digital objects. It is mostly used to archive simple or automatically transformable document formats like texts and pictures. For more complex scenarios, the emulation approach might be more suitable. Its key idea is to emulate the genuine rendering environment of an entire system; applications can be found among complex databases, entire applications, and even graphical, real-time computer games. A further class of archiving strategies relies on hard copies such as microfilm, nickel plates, and even sugar cubes. Microfilm has been used for decades to store analog data like historical documents. It offers the unrivaled advantage of being directly perceptible to the human eye for at least 500 years (Borghoff 2006, pp. 40-47).
Nickel plates are more durable: engraved information could last for millennia, while offering immense storage capacity and the advantage of being eye-readable with a simple magnifying lens (Sisson 2008). Denz (2008) describes a new, long-lasting holographic mass storage medium that allows up to one terabyte of data to be stored in the size of a sugar cube. Although these are interesting research topics, this article will focus on the software-related issues of long-term archiving strategies. The article is divided into three parts. First, we will present the migration and emulation approaches in more detail, together with the related concept of computer museums. Second, we will discuss which strategy
might be more appropriate for different kinds of digital objects. Third, the article sketches how entire geographical information systems (GIS) can be preserved from a long-term perspective.
5.2 Strategies of Long-Term Digital Preservation

The life-cycle of most documents and digital objects is very short compared to the periods of time relevant to questions with a historical background. Today, we are not able to foresee which data will be of interest to historians in 100 years or more. But even on a shorter time scale, measures have to be taken to ensure the availability of our data. The recent digital revolution has been characterized by frequently changing data formats and systems, each available for only a few years. The fundamental concepts for ensuring the availability of our data heritage are the migration of the data and the emulation of its genuine environment. Both approaches will be discussed in the following sections. Afterwards, we will present the concept of the computer museum as an important part of a long-term preservation strategy.

5.2.1 Migration
What does the migration of digital objects mean? An often cited definition of migration was given in the final report of the Task Force on Archiving of Digital Information (TFADI 1996): “Migration is the periodic transfer of digital materials from one hardware/software configuration to another or from one generation of computer technology to a subsequent generation. The purpose of migration is to preserve the integrity of digital objects and to retain the ability for clients to retrieve, display, and otherwise use them in the face of constantly changing technology.” The Reference Model for Open Archival Information Systems (OAIS) identifies four aspects of migration: refreshment, replication, repackaging and transformation, which will be described in the following sections. (Consultative Committee for Space Data Systems (CCSDS): Reference Model for an Open Archival Information System (OAIS). Technical report, 2002.)
5.2.1.1 Refreshment
Storage media have limited life-cycles: physical, chemical, and magnetic influences affect the lifetime of any kind of storage medium. The refreshment of digital media is a key concept for guaranteeing data preservation independent of media wear-out. The basic idea is to copy all data from one storage medium to another in good time. Note that this copy process is a pure duplication: the source and the target medium have to belong to the same class of storage media. The fact that the data is just copied is a major property of the refreshment process; the stored data itself is not affected. Refreshment should be an automatic process, transparent to the user, as part of the archive maintenance process. Media refreshment has two major drawbacks. First, storage capacity grows exponentially, roughly doubling every two years (cf. Moore 1965). Consequently, it is not very likely that the same type of storage medium will remain available over a longer period of time, and it is not efficient to store the data of a smaller device on a larger one. Packing several media onto a single larger medium lies in the domain of repackaging, and results in the labeling problem (see Section 2.1.2). The second drawback of media refreshment is the fact that several media types do not have a linear quality degradation curve. Hard disks, for example, are known to have a high infant mortality rate (Yang and Sun 1999). Even if a drive outlasts this first lifetime period, its failure probability is not predictable, as large empirical tests have shown (Schroeder and Gibson 2007). Several techniques have been proposed to improve the predictability of hard disk failures. The best known among these is SMART (Self-Monitoring, Analysis and Reporting Technology), widely implemented in modern hard disk drives and still under development (Hughes et al. 2002). However, up to now, many drive failures remain unpredictable (Pinheiro et al. 2007).
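The duplication property of refreshment lends itself to automatic verification: before the old medium is retired, the copy is compared bit for bit with the original, for example via a cryptographic checksum. The following Python sketch illustrates the idea (the function names are our own illustration, not taken from any particular archive system):

```python
import hashlib
import shutil

def sha256_of(path):
    """Return the SHA-256 digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def refresh(source, target):
    """Refresh a stored object: duplicate it onto a fresh medium and
    verify that the copied bit-stream is identical to the original."""
    shutil.copyfile(source, target)
    digest = sha256_of(source)
    if sha256_of(target) != digest:
        raise IOError("refreshment failed: copy differs from original")
    return digest
```

In practice the returned digest would also be stored as fixity metadata, so that later refreshment cycles can detect silent corruption of the medium itself.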
Even non-magnetic media like compact discs have a significant quality degradation curve, depending on environmental parameters. In this context, a design feature of compact discs turns into a major drawback: the data on the disc is stored using error-correcting Reed-Solomon codes, which keep the player from refusing to read a slightly damaged disc. This desired feature hides a beginning wear-out until the player is no longer able to correct the errors (Costello et al. 1998). Up to now, no technology like SMART is available that reports beginning read errors while the code is still able to correct them. Special drives with corresponding reporting techniques would have to be designed to prevent a sudden death of the information on a compact disc.

5.2.1.2 Replication
As already stated in the last section, drive capacity evolves quickly. Moreover, media types come and go constantly. For many years, floppy disks were the ubiquitous medium of data exchange. Although the basic design remained the same (Engh 1981), several floppy disk formats were released, starting with an 8-inch drive in 1971, moving to 5.25-inch drives in 1976, and on to 3.5-inch drives in 1984. Apart from their shrinking physical size, the density of the data stored on the disks increased, leading to a capacity of 240 MB in the last 3.5-inch disk revision, LS-240, released in 1997 (compared to 79 KB for the first floppy disk). Today, most computers no longer have a floppy disk drive; compact discs, DVDs, and flash storage devices are state of the art. It can be assumed that storage media will evolve further, leading to media with increased capacity, higher speed, greater convenience, and lower price (Schurer 1998). It therefore suggests itself to periodically copy one's data onto newly available media for efficiency reasons, a process called replication. In this context, some major issues must be considered. First, the new medium will likely have a different internal data structure. Today, this is usually hidden from the applications (and therefore from the user) by the hardware driver within the operating system. However, it should be taken into account that some software, e.g. large database systems, still accesses the physical media at a very low level for performance reasons. In this case, the replication might be unsuccessful. On a higher level of abstraction, side effects can arise, too. For example, the move to file systems with case-sensitive file names can prevent applications from accessing the correct file if, during the replication process, a mapping to all upper-case or all lower-case file names had to be performed.
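The case-folding side effect just described can at least be detected before a replication run. The short Python sketch below (our own illustration, not part of any archiving tool) flags archive paths that would collapse to the same name on a case-insensitive target file system:

```python
from collections import defaultdict

def case_collisions(paths):
    """Group paths that become identical once case is folded; any
    group with more than one member would overwrite itself when
    replicated to a case-insensitive target file system."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.lower()].append(path)
    return {key: names for key, names in groups.items() if len(names) > 1}
```

For example, `case_collisions(["data/Map.tif", "data/map.TIF", "notes.txt"])` reports the two TIFF names as a collision, while the unique path passes silently.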
A possible approach to meet this challenge is to store the old file systems as media images, and to emulate the original device at the software level (see Section 2.2.3). Besides these issues, every replication process implies the labeling problem: hundreds of disks can be stored on a single CD, and every media migration forces the new media to be labeled. A lot of information stored on the old labels has to be condensed onto one single label for the new medium, which leads to information loss. Therefore, the old labels have to be stored within the archive, too, and must be included in further replication steps.

5.2.1.3 Repackaging
Archives themselves are also subject to continuous evolution. When the archive system is updated, all data stored within the archive has to be moved to the new system, a process called repackaging. This process should be performed automatically, and losses of information have to be avoided by careful planning of the repackaging process.

5.2.1.4 Transformation
The migration strategies proposed so far can ensure the long-term availability of the original bit-streams. On the other hand, the question arises whether these bit-streams can still be interpreted in the long term. The last decades have seen a massive evolution of data formats; even former de-facto standards like Lotus 1-2-3 have disappeared completely. It is therefore quite uncertain whether documents stored in today's standard formats can still be opened in around 100 years.
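A first, modest line of defense against this uncertainty is format identification: an archive that can still recognize what a bit-stream is has a far better chance of finding a way to interpret it. Format registries such as PRONOM build on file signatures, so-called magic numbers. The Python sketch below shows the principle with a toy subset of signatures:

```python
# A toy subset of file signatures ("magic numbers") for illustration.
MAGIC = {
    b"%PDF": "PDF document",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"GIF8": "GIF image",
    b"PK\x03\x04": "ZIP container (used by ODF and OOXML office formats)",
}

def identify(data):
    """Guess the format of a bit-stream from its leading bytes."""
    for magic, name in MAGIC.items():
        if data.startswith(magic):
            return name
    return "unknown"
```

A real registry holds thousands of signatures plus version information; the point here is only that identification must precede any interpretation or transformation.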
The transformation of documents is basically a lossy conversion. Beyond their basic features, today's formats differ in the feature sets they support. Conversion therefore often sacrifices subtleties (fonts, footnotes, cross-references, color, numbering, and so on); at worst, the migration process leaves out entire segments (such as graphics, imagery, and sound) or produces meaningless garbage. Apart from the differing feature sets, undisclosed document formats make the transformation of documents a difficult task. In recent years, great efforts have been made to push software companies to use open standards, or to open their document formats. However, the sheer complexity of a format's documentation easily leads to lossy or faulty conversions: the Office Open XML standard used by Microsoft Word, for example, has more than 7000 pages (ISO 29500). The most common approach to transforming documents is to use the export or import filters of the applications that handle the corresponding formats. One should take into account that export filters often lead to lossy conversions, as software providers do not want their customers to be able to switch easily to competing products. Import filters for third-party formats are more attractive because they support user transition towards the provider's own product (Borghoff 2006, p. 48). For standardized formats, conversion tools can be created that allow transformation in both directions. However, recent experiments show that no current tool is mature enough to be relied upon completely (ODF Alliance). To measure the quality of a transformation, the transformed document is converted back into the former format. A good conversion tool should ensure the equality of the original document with the re-transformed one. A formal approach to ensuring the correctness of a transformation has already been presented (Triebsees 2007). However, a document format may contain implicit information.
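The round-trip measure can be sketched in a few lines. The two converters below are hypothetical stand-ins for real export and import filters: the forward conversion keeps only the words of each line and discards the fixed-width layout, so the round trip exposes exactly this kind of loss:

```python
def to_rich(text):
    """Hypothetical export filter: keep the words of each line,
    discard the fixed-width layout (runs of spaces, indentation)."""
    return [line.split() for line in text.splitlines()]

def to_text(rich):
    """Hypothetical import filter: rejoin words with single spaces."""
    return "\n".join(" ".join(words) for words in rich)

def round_trip_ok(doc, forward, back):
    """Round-trip quality check: a lossless conversion pair must
    reproduce the original document exactly."""
    return back(forward(doc)) == doc
```

Here `round_trip_ok("a b", to_rich, to_text)` holds, while a document relying on double spaces or indentation fails the check. Note that the caveat discussed next still applies: a 100% round trip does not by itself prove that the intermediate representation was acceptable.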
For example, simple text documents usually have a fixed character width. This width is often used for the implicit structuring of texts, e.g. in source code for indentation, or as so-called ASCII art. If one of these documents is transformed into a modern office document with variable character width, the transformation result is not acceptable. Interestingly, after re-transformation into a text document the original layout is restored, so that the presented quality measure would report a quality ratio of 100%. To cope with this problem, it might be useful to store the original version of the document as well, especially if it is likely that the original could later be restored by “data archaeologists” (Borghoff 2006, p. 49). Another approach is to store a document in all available data formats, in order to reconstruct lossy conversion steps later on (Nestor 2009). If many transformations are performed, however, this leads to a massive storage and administration overhead, which makes the approach unusable in such cases. To avoid an explosion of transformation cases, it is important to focus on a few document formats within an archive. This results in fewer transformation steps and makes transformation planning easier. Document transformations must be planned precisely with a long-term perspective; international open document standards are a key prerequisite for lossless transformations.

5.2.1.5 Advantages and Disadvantages
One of the most important organizational measures to be taken by archives is restricting the diversity of formats. This forces authors to provide their work in a format suitable for the archive, and reduces the number of later migration steps. Permanent monitoring of new technologies is an important task for ensuring the sustainability of the archive. As future developments cannot be foreseen, deciding on a document format resembles a bet on horses: one might bet on the wrong horse. However, a commitment to international standards greatly increases the probability of having made the right decision. Even if not, it is likely that other archives are caught in the same trap, so that new migration steps might be far more affordable through combined efforts. Each transformation step should ideally be checked automatically (Triebsees 2007). If approved automatic checking is not available, the correctness has to be verified by a sample survey. This is a time-consuming and error-prone process, but no alternatives are known. If possible, the authors themselves should check the transformation; in the long term, it should be ensured that experts of the corresponding domain examine the transformation results. Saving the last migration step is a valid fall-back strategy. However, this raises the next problem, as it doubles the size of the archive; saving more migration steps is even less efficient in terms of space. It should be pointed out that archived objects can also be improved during migration steps. For example, the noise in scanned images can reliably be eliminated today, which was not possible some years ago. Also, techniques like optical character recognition (OCR) have become more mature in recent years (Borghoff 2006, p. 34).

5.2.2 Emulation
As already stated, the migration of documents and whole archives can have unwanted side effects, mainly because a migrated document lives outside its native environment. A completely different concept for ensuring the long-term availability of data is to emulate the native environment of a comprehensive system. The basic idea is that architectures like the PC architecture or, on another level of abstraction, the Windows architecture do not evolve as quickly as end-user software systems do. As shown in Figure 2.1, emulation can basically be performed on the level of the application, the operating system, or the hardware (Granger 2000). These approaches will be discussed in the following sections, together with the concept of virtual computers; their advantages and disadvantages will be discussed afterwards.

5.2.2.1 Emulation of Software
Software emulators try to imitate an original program that is no longer available. As already mentioned in Section 2.1.4, Lotus 1-2-3 was the de-facto standard spreadsheet application of the 1980s. Even if transformation software were available, it might be more appropriate to develop a new spreadsheet application that simply imitates all the habits of Lotus 1-2-3. This can be very attractive, as graphical capabilities have evolved quickly in the meantime, and the 1980s tweaks used to obtain graphical output are not supported in modern spreadsheet applications.

5.2.2.2 Emulation of Operating Systems
The emulation of software usually requires knowledge of how the original software behaved, which features were supported, and so on. It can therefore only be performed as long as access to the original software (and its hardware environment) is available, e.g. in a computer museum (see Section 2.3). Besides this, creating an emulator is a complex task that might be too cost-intensive for single applications. To allow the use of the different programs written for the same operating system, it is more appropriate to emulate the whole operating system, especially as operating system platforms persist for quite a long time compared to most software systems. This approach adds a new dimension to archiving documents: instead of just the documents and their meta-data, the corresponding software system also has to be stored within the archive. This can be a challenging and complex task. While former operating systems were quite monolithic (like that of the C64, or DOS), one of the key attributes of modern operating systems like Windows or Linux is their modularity and extensibility. In the context of operating system emulation, this requires a detailed description of the operating environment of a software system. Standards like ITIL or ISO 20000 give detailed guidelines for ensuring a comprehensive documentation of the software architecture (OGC 2000; ISO 20000).

5.2.2.3 Emulation of Hardware
The emulation of software and operating systems usually abstracts from the hardware layer. The emulation of the original hardware system operates at the lowest abstraction layer: the idea is to emulate the original hardware so that an operating system can be installed on top of the emulation layer. With a comprehensive hardware emulator, the whole software ecosystem of a former computer architecture can survive, which makes this a very general approach. As with the emulation of operating systems, a prerequisite for hardware emulators is a detailed documentation of the environment to be emulated. Additionally, the documents, the required software, and the operating system have to be stored within an archive to guarantee the accessibility of the archived documents. Interestingly, emulation environments are not only used to make old systems accessible; emulation has also become a basic concept during the development of new hardware systems, as it allows the corresponding software to be developed even when the final hardware design is not yet available. The archiving of these emulators should be encouraged, as they usually implement the complete feature set provided by the corresponding hardware.

5.2.2.4 Virtual Computers
The concept of virtual computers derives from the emulation approve. Instead of emulating a concrete hardware system, a virtual computer provides access interfaces to standard devices. This virtual computer is implemented on a software level, and will run as host on top of an operating
State-of-the-Art Survey of Long-Term Archiving 111
system. The virtual machine has to be developed for each operating system separately. One of these known virtual computers is the Java Virtual Machine (JVM), developed by Sun Microsystems (Lindholm 1999). Every Java program will run on top of a JVM, and is independent from the hardware the JVM is running on. The last years showed two interesting developments: first, the virtual machines have become decoupled from the physical hardware (Casselman 1993; Grimshaw and Wulf 1997). Today, these systems are called grids (Foster and Kesselman 2003). Second, emulators have been developed on top of virtual computers (JaC64). This ensures the sustainability of the emulator, as only the virtual machine has to be adapted to future developments. Apart from these approaches, there the idea exists to create a universal virtual computer (Lorie 2001). This computer should be an extension of the well-known Turing machine, which acts as basis of all modern computers (Turing 1936). The basic idea is to compile a software program to be used on this universal virtual computer, which would ensure the steady usability of the program, as long an implementation of the universal virtual computer exists (Gladney and Lorie 2005). 5.2.2.5
5.2.2.5 Advantages and Disadvantages
The main advantage of the emulation approach is that the digital objects within the archives do not need to be modified. Provided that authenticity has been verified against the original environment, the emulation result should match the appearance of the genuine system. Another advantage is that only the emulator has to be adapted to current developments; the transformation losses common to frequent conversion steps (see Section 2.2) are avoided. Furthermore, this adaptation will likely be less cost-intensive than the development of converters for all object types available on a platform. As the digital objects are not expected to be updated, the emulation approach allows them to be stored on highly reliable and durable devices, which are usually read-only. Since the emulator itself must be adapted over time, this assumption does not hold for the emulation environment; the documentation of the genuine environment, however, can indeed be stored on durable media. A drawback of the emulation approach is the cost of emulator development. As modern computer architectures become more and more
112 N. Krebs, U. M. Borghoff
complex, the development of an emulation environment becomes less affordable. On the other hand, the standardization of systems advances gradually, so the number of competing systems might decrease. Besides the complexity of emulation, one should consider the overhead it generates: as emulation is an intermediary layer, performance will naturally be lower within the emulated environment. However, according to Moore's law, capacity doubles roughly every two years, leading to increased performance of the available systems (Moore 1965); the emulation overhead is therefore acceptable. The need for constant adaptation to current systems is a major drawback of the emulation approach. The development of new technologies, e.g. new input/output systems, raises the need to map the genuine environment onto the available technologies. To lower the effort involved, it might be an option to cascade different emulators. In this way, the mapping decisions must only be made between two successive generations of technology, instead of defining mappings from all former systems to the current one. This approach has not yet been evaluated in depth, as emulation is a rather young field of research.
5.2.3 Computer Museum
The basic idea of a computer museum is to keep the original hardware and software systems available for a medium-term period. Hardware defects should be repaired using original parts if possible, or else reproduced parts, so that the original condition of the system is conserved. If the original parts cannot be reproduced, alternative parts have to be used, as the operability of the system has the highest priority.
5.2.3.1 Entering a Dead End
For a medium-term period, the computer museum is a feasible and mostly efficient archiving solution. Refreshing (see Section 2.1.1) is usually less cost-intensive than a migration of the data. In the longer term, however, the availability of replacement parts decreases, and the costs for the reproduction of original parts increase dramatically. A second issue is the operation of these systems: people trained on them retire from working life and the knowledge disappears, especially as a computer museum typically takes over the systems only after their phase-out.
Apart from these issues, a running antiquated system will likely not be able to offer a workable interface to a modern computer system for the exchange of data (Rothenberg 1999).
5.2.3.2 Computer Museums are an Important Component of Long-Term Archiving
Even if the concept of a computer museum is a dead end from a long-term perspective, it can be an important component in a comprehensive strategy. First, the computer museum provides the most reliable authenticity of media input and output. The authenticity of the original output is an important aspect of cultural heritage (Borghoff 2006, p. 14): the first disk record or the first telegraph message is priceless, not for the data it contains but for its cultural importance. Second, the correctness of emulators is difficult to evaluate if the original environment is not available (Rothenberg 1999). Third, the computer museum opens a time window in which emulators can be developed and tested even after the phase-out of the original system.
5.3 Which Strategy for Which Kind of Data
The question of the right strategy is a main research topic in the context of long-term archiving. It is widely accepted that the migration approach is an inadequate solution; nevertheless, it remains the most widely used approach worldwide. Migration may be a useful interim approach while a true long-term solution is being developed (Rothenberg 1999). Other approaches have great potential but are not available yet; the universal virtual computer (see Section 2.2.4) is an example. They may never be implemented as planned, primarily due to the expected costs (Granger 2000). It must be stressed that none of the presented strategies can solve the problem of long-term archiving alone. From today's perspective, long-term archiving can only succeed if a comprehensive approach is used that combines all given strategies, tailored to the type of the digital objects at hand. The first part of this section shows apparent correlations between different kinds of digital data and the approaches to long-term archiving presented above. The second part motivates possible combinations of different approaches.
5.3.1 Apparent Correlations
A criteria catalog that recommends an archiving approach for a given type of digital object does not exist. As shown above, every approach has pros and cons. These pros and cons can, however, be used for a pre-selection; the following correlations are derived from them.
5.3.1.1 Frequency of Access
The more often an archived object will be accessed, the less applicable analog storage media like microfilm are. To avoid damage to the original medium, all access has to be performed on copies. In cases of frequent access, directly accessible storage media offering high capacity, such as hard disks and DVDs, should be used. Long-lived storage media like microfilm can be inadequate here, because they usually need special storage conditions that forbid frequent access.
5.3.1.2 Latency of Access
The quicker access has to be, the more important the latency of the storage medium becomes. The most cost-effective solutions available today are hard disks. Despite their relatively short lifetime, hard disks offer both high capacity and quick accessibility. An ongoing replication in the sense of migration is necessary to avoid the loss of information (see Section 2.1.2).
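The ongoing replication mentioned here amounts to copying archived objects to fresh media and trusting the copy only after bit-for-bit verification. The following is a minimal sketch using checksums, not a complete migration tool; the function names are our own.

```python
import hashlib
import shutil

def sha256(path):
    """Return the SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def refresh(source, target):
    """Copy an archived object to a new storage location and verify
    its integrity against the original before trusting the new copy."""
    shutil.copy2(source, target)          # preserves timestamps as well
    if sha256(source) != sha256(target):
        raise IOError("refresh failed: checksum mismatch for %s" % source)
    return sha256(target)
```

In practice the digest would itself be stored in the archive's metadata, so that later refresh cycles can detect bit rot on the old medium as well.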
5.3.1.3 Relevancy for Current Workflow
The more relevant the archived data is for the current workflow, the more important are its high availability and direct access using current data formats. Only the migration approach fulfills this requirement. An emulator that runs a program to render the requested information might be very reliable, but the displayed information would have to be ported to currently used programs, for which interfaces are usually not available. Under certain conditions, copy-and-paste functions via the graphical user interface are available, but this approach is not practical for complex tasks.
5.3.1.4 Dependencies
The more the archived data depends on other data stored within the same archive, the higher the availability of that data has to be. For example, if an archived website is accessed, the embedded pictures should be available, too. To access data stored within an archived database, the corresponding database has to be available to ensure a valid query result. The migration approach can be used here, with a possible loss of information. The future will probably facilitate the use of emulators that emulate the original environment. The decision between these approaches should depend on the relevancy to the current workflow (see Section 3.1.3) and on the emulation techniques available.
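The completeness requirement implied here (serve an archived object only if everything it transitively references is also in the archive) can be sketched as follows. The dictionary layout is purely illustrative and not a real archive format.

```python
def missing_dependencies(archive, name, seen=None):
    """Return the set of (transitively) referenced members that are
    absent from the archive; an empty set means the object is complete."""
    seen = set() if seen is None else seen
    missing = set()
    for dep in archive.get(name, {}).get("refs", []):
        if dep in seen:
            continue                     # avoid cycles and repeated work
        seen.add(dep)
        if dep not in archive:
            missing.add(dep)
        else:
            missing |= missing_dependencies(archive, dep, seen)
    return missing

archive = {
    "index.html": {"refs": ["map.png", "style.css"]},
    "style.css":  {"refs": []},
}
print(missing_dependencies(archive, "index.html"))  # {'map.png'}
```

Such a check is typically run at ingest time, so that incomplete packages are rejected before they enter the archive.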
5.3.1.5 Hardware Dependencies
Several digital objects depend on the hardware they were originally designed for: their functionality depends directly on the hardware or on the original rendering environment, respectively. For example, text documents using fonts provided by the operating system could be displayed incorrectly elsewhere. To ensure the availability of such a digital object, the computer museum is a temporary solution, while the emulation approach would provide a long-term solution.
5.3.1.6 Authenticity
If authenticity is the main criterion, the migration approach is not suitable. The emulation approach and even the use of analog storage media like microfilm are viable solutions. But full authenticity will most likely never be reached: a historic painting preserved as a copy on microfilm cannot be analyzed regarding the consistency of the oil colors used by the painter, and likewise an emulator would not be able to emulate a rough-running keyboard or a flickering computer monitor. The goal is to freeze and preserve a high level of authenticity.
5.3.2 Hybrid Approaches
Each approach is best suited to certain situations. To avoid the disadvantages of a single approach, it is a viable solution to combine different approaches. In recent years, hybrid approaches have gained considerable popularity.
Conflicts can arise if, for example, historical documents have to be preserved on the one hand and have to be accessible to a larger audience on the other. The more frequent the access to archived objects, the higher the wear; and the higher the wear, the shorter the expected lifetime of these documents. A hybrid approach can solve this conflict: after the historical documents have been microfilmed, they can be digitized and stored on highly available storage media. Afterwards, the microfilms are stored under ideal conditions, while the digitized copy is continuously migrated to current formats. Even in the case of a loss of information, the digitization process could be restarted from the stored microfilms; the original historic document would never be touched again in this process. In the case of universal virtual computers, an analog archiving strategy also becomes relevant. The specification of the virtual computer, the source code of the emulators, or the programs used for presentation could be stored in analog form on long-lived media, as they do not need to be updated. Together with an ontology capturing the semantic meaning of the specification of the virtual machine, the machine would be self-explaining and understandable even far in the future. For everyday access, these materials should be stored on a highly accessible digital medium, too. In the case of complex environments, a combination of different approaches can be a working strategy as well. For huge databases, the approach of data migration might be an appropriate solution, while for the data access part of the environment (mostly the end-user applications), emulation would help to ensure viability. Until a reliable emulator is available, the original hardware has to be preserved in a computer museum, to be able to prove the correctness and authenticity of an emulator developed later on.
5.4 Long-Term Preservation of Geodata
Geographic information systems (GIS) are integrated environments that allow the user to query spatial information using geographic measures (Martin 1996). They contain, amongst others, geodata stored in databases, GIS applications to enter queries and display analyses, and potentially distributed components connected by a network (Longley 2005). GIS have become a major pillar of many applications, as they provide a view of the real world around us.
In this part we discuss some issues of the long-term archiving of geodata stored in databases. First we focus on the special aspects of geodata. The second part accentuates the need for GIS that allow access to geodata gathered at different points in time, which directly entails the need for a long-term archiving strategy for this kind of data. Finally, we discuss aspects of the long-term archiving of whole GIS.
5.4.1 Special Aspects of Geodata
5.4.1.1 Wide Range of Usage
It is common practice to store nearly all geodata within GIS. However, this does not imply a common structure of the data sets used (Janée 2008). Commonly, the data sets correspond to defined points on Earth, and significant differences can arise between the coordinate representations used. The more precisely a location has to be specified, the more specialized the reference system should be. It is possible to translate between different reference systems, but the results do not meet high standards in every case (Fotheringham 1994). When implementing a GIS, the reference system used, the precision needed, and the metadata model to implement depend heavily on the intended kind of usage. Usually, neither interoperability nor the long-term archiving of the stored data plays a primary role.
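Translations between reference systems are possible because each system is defined by a small set of published constants and formulas. As an illustration, the following sketch converts WGS84 geodetic coordinates to Earth-centered Cartesian coordinates using the standard ellipsoid parameters; it is not tied to any particular GIS.

```python
import math

# WGS84 ellipsoid constants
A  = 6378137.0               # semi-major axis [m]
F  = 1 / 298.257223563       # flattening
E2 = F * (2 - F)             # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert WGS84 latitude/longitude/height to Earth-centered,
    Earth-fixed (ECEF) Cartesian coordinates in metres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = A / math.sqrt(1 - E2 * math.sin(lat) ** 2)  # prime vertical radius
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - E2) + h) * math.sin(lat)
    return x, y, z

print(geodetic_to_ecef(0.0, 0.0, 0.0))  # (6378137.0, 0.0, 0.0)
```

The inverse transformation, and transformations between two geodetic datums, require further parameters and are precisely where the precision losses mentioned above can occur.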
5.4.1.2 Huge Amount of Data
Size matters in the context of a GIS: only comprehensive systems can answer complex questions through database queries and deliver meaningful results. As a consequence, modern GIS are becoming more and more complex. Nowadays, it has become accepted practice to implement a GIS using so-called clusters, groups of computers linked together through fast networks, just to be able to handle the vast amounts of data.
5.4.1.3 Format Diversity
There are many manufacturers of GIS, and up to now none of them has gained a monopoly position (Teege 2001). This keeps GIS affordable, but also leads to highly diverse data formats. One of the most popular GIS data formats is the Drawing Interchange Format (DXF), which was originally intended as a transfer format between CAD software systems. Like most vector-oriented formats, it describes objects using points, lines, and polygons. Only a few formats allow additional metadata to be associated with the spatial objects (Rigaux 2002). Apart from DXF, GML has emerged as the de facto standard for GIS data exchange (see Section 4.3.1). Metadata itself can be stored in different formats, for example as simple text or using XML structures. Major efforts have been made to define universally accepted metadata models, leading to international standards (ISO 19115, ISO 19139). But photographs, videos, and even 3D objects of a virtual reality environment can act as metadata, too (Hosse 2005).
5.4.1.4 Distributed Data
Many institutions record data just for their own purposes. External persons have no or only restricted access to this data, which is often stored in highly custom data formats, and combining data coming from different GIS turns out to be difficult (Hosse 2005). To facilitate interoperability between GIS, standardization efforts have been made, mostly by the International Organization for Standardization (ISO) and the Open Geospatial Consortium (OGC). They define interfaces to interchange data and establish rules for database structures. An interface defined by the OGC enables the connection of a web map server to a GIS (de la Beaujardiere 2006). Through this interface, maps from different servers can be requested easily and uniformly through a web browser. Web map servers support well-established bitmap graphic (JPEG, GIF, PNG) and vector (WebCGM, SVG) formats, and the different requested maps can be accurately overlaid to produce a composite map (Teege 2001). This interface enables users to combine the different kinds of information stored in different GIS databases. The web map server interface could be suitable as an interim solution for long-term archiving: as in the domain of paper-based maps, digital maps could be stored in widely accepted standard data formats and archived. However, maps produced by such a web map server will likely contain just a subset of the information available through GIS applications. It is nearly impossible to create all combinations of available geodata to ensure long-term archiving using this strategy, especially as no one can truly say which information will be of interest to future generations.
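A request to such a web map server is, in essence, a parameterized URL. The following sketch assembles one using the parameter names of the OGC WMS 1.3.0 GetMap operation; the server URL and layer names are invented placeholders.

```python
from urllib.parse import urlencode

def getmap_url(base, layers, bbox, width, height,
               crs="EPSG:4326", fmt="image/png"):
    """Build a WMS 1.3.0 GetMap request URL.

    bbox is (min, min, max, max) in the axis order of the given CRS
    (for EPSG:4326 under WMS 1.3.0 this is latitude before longitude).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return base + "?" + urlencode(params)

url = getmap_url("https://example.org/wms", ["topography", "rivers"],
                 (46.0, 9.0, 49.0, 17.0), 800, 600)
print(url)
```

Because the request is fully described by these few parameters, archived request URLs plus the returned images form a simple, if lossy, snapshot of what a server offered at a given time.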
5.4.2 For What Long-Term Archiving Can Be Used
The archiving of geodata is as yet practiced only occasionally. Only a few archives own very old maps of their own; in most cases, national libraries hold these treasures, like the maps of Columbus's four voyages, as this information is part of our cultural heritage. The current knowledge represented in geodata should likewise be kept accessible for future generations. This is not only a matter of preserving historical events: first, the impact of most events becomes apparent only later on; second, the computer gives us the opportunity to analyse vast amounts of data, allowing us to track the changes humans have made to the soil. This, too, is part of our heritage and should be preserved for future generations.
5.4.2.1 Archiving Geological Data
In many countries it is required to document changes in the soil; underground mining is an important example. Old mines have to be documented in perpetuity, and this documentation has to remain available for future generations until the previous state is re-established, e.g., until the mine is filled again. Nowadays, the necessary data is gathered and saved in digital form. However, the standards used by the various institutions differ greatly, and as mentioned before, it is difficult to harmonize the different data sources (see Section 2.1). Our mining example shows the consequences: future generations might plan to construct new buildings above these mines, or to continue the mining process, or to use an old mine as a long-term storage facility (in some cases, old mines provide perfect storage for conserved microfilms). All of this makes a long-term archived documentation necessary, allowing the different information sources to be queried uniformly. Sometimes geological analysis data has no immediate relevance, which can change dramatically after a few decades; e.g., whether a discovered oil field will be exploited depends highly on the technology available and the current oil price. Keeping such analysis data viable means avoiding cost-intensive re-exploration. For insurance companies, geodata is important, too: their risk assessments for floods and earthquakes are largely based upon geodata, and the company with the best assessments should be the most cost-effective provider of insurance. Keeping such geodata viable thus helps to secure the existence of insurance companies.
5.4.2.2 Keeping Our Cultural Heritage
The archiving of our cultural heritage is mainly the responsibility of museums and libraries. Their task is to keep the artifacts in the best condition possible and to make them accessible to scientists and interested citizens. The digital preservation of our cultural heritage allows the protection of the original artifacts and quick, worldwide access (Hosse 2005). Historically significant artifacts stored in archives include old painted maps or copper engravings of historical buildings that no longer exist. Today, historians have to base their knowledge on that kind of artifact; in the future, GIS could be used to provide the information stored in historical artifacts to a larger audience. Recording the information contained in old artifacts in a GIS is a difficult procedure. Historical data does not have a standardized format; it is often artistically decorated and uses different scales and changing precision (these kinds of artifacts are usually part of our cultural heritage, too). The representation also differs, e.g., of streets, vegetation, buildings, and places. This information has to be adapted to current data formats, which means it has to be translated in the sense of migration, with a possible loss of information. In the future, the life of historians could be much easier: if we can keep our geographic heritage viable, future generations will be able to access precise, vectorized maps with altitude information, enriched with detailed metadata such as city and street names, vegetation, and borders. Nowadays, even 3D models of culturally important buildings are created using methods of photogrammetry and laser scanning, as shown, e.g., by a project visualizing a culturally important church (Hosse 2005).
5.4.3 Approaches to the Long-Term Archiving of Geodata
General solutions to the problem of the long-term archiving of geodata do not yet exist. Practiced strategies can only be found for the small part of the data volume that is counted among our digital heritage; geodata in general has not been covered up to now. The reason is not that the need for long-term archiving is underestimated; rather, the prerequisites for long-term archiving are not yet fulfilled.
5.4.3.1 Geography Markup Language (GML)
The Geography Markup Language (GML) is an XML grammar defined by the OGC (ISO 19136) and used as an interchange format between databases. The current structures of data sets differ significantly, and standards like GML can help to harmonize them: the more standardized these data structures are, the easier the data interchange between the connected databases becomes. As a side effect, stored data no longer has to be copied and transformed frequently. If such standards are enriched with a suitable ontology describing the semantic meaning of the data, archiving for eternity could become possible.
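As a sketch of what such standardized, self-describing data looks like, the following builds a minimal GML point geometry following the common Point/pos pattern; the coordinate values and srsName are example values.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
ET.register_namespace("gml", GML)   # serialize with the conventional prefix

# A single point with an explicit coordinate reference system.
point = ET.Element("{%s}Point" % GML, srsName="urn:ogc:def:crs:EPSG::4326")
pos = ET.SubElement(point, "{%s}pos" % GML)
pos.text = "48.2082 16.3738"        # latitude longitude (Vienna, as an example)

xml = ET.tostring(point, encoding="unicode")
print(xml)
```

The value of such an encoding for archiving is that geometry, reference system, and structure are all explicit in the document itself, rather than implicit in a vendor's binary format.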
An attempt to integrate geographic information based on its semantic content is described by Fonseca (2002).
5.4.3.2 Separating Data and Analysis Programs
The emulation of a distributed and heterogeneous GIS does not seem feasible: the costs of developing emulators for such an environment would be prohibitive, and the necessary performance is unlikely to be available for years. It could also be impossible to evaluate an emulator's fidelity to the original GIS, because the original may already be out of service by the time the emulator becomes available. To enable the applicability of different long-term archiving approaches, a strict separation of the data layer and the analysis programs is indispensable. If the analysis programs of a GIS query the data from a database through fixed interfaces, it becomes possible to migrate the database independently of the programs, and the wide experience with the migration approach can be ported to the GIS domain. However, such a separation is not given in many GIS implementations. Perhaps the reason for this design flaw is the wish to increase performance, or even to protect one's own market position against competing system providers. If the analysis programs are separated from the underlying database, the emulation approach becomes applicable, too. In this case, it is important that such programs be written for a large class of computer systems.
5.4.3.3 Separation from Hardware
The best case in the sense of long-term archiving is software that is completely independent of the underlying hardware. The software has to be written for virtual environments, which exist for most current systems. Thus, the software is able to run on current and future platforms for which the virtual environment is available; it does not have to be changed and could outlast generations of computers, without forgoing the benefits of their increasing performance (Figure 4.1).
5.4.3.4 Modelling of Time
Most GIS manage large amounts of static data, meaning that only current data sets have to be managed. This is an adequate approach for a GIS designed for a constant amount of data, but for continuously growing databases, every GIS will eventually reach its maximum. From a historical point of view, the worst case is when newer data sets overwrite older ones to avoid continuous growth: if every update to the database causes the loss of older data sets, historical changes cannot be traced in the future. The modelling of time enables a GIS to attach time-specific data, such as time stamps or lifetimes, to each data set. These additional descriptions (usually stored within the metadata) can be used as a criterion for how relevant a data set is at a given time. A GIS database then does not have to grow continuously, because data can be moved out to an archive continuously over time; if necessary, archived data sets can be re-imported into the database again. Because the database accounts for time-specific metadata, these data sets do not overwrite other data. "GIS has been used for decades now, so users – especially younger users – are starting to expect that older content will exist" (Morris 2006). Applications that allow spatiotemporal data to be analyzed over epochs are imaginable, and indeed already exist, e.g., in Google Earth 5.0. Such temporal GIS could be used in geology and archaeology, scientific disciplines that need data sets spanning decades, centuries or even millennia (Bill 1999). A contribution to planning procedures like landscape planning or regional development is possible, too (Hosse and Schilcher 2003). Moving data sets out and archiving them implies the migration approach: the huge amount of data has to be transformed periodically to current data formats. This is especially important for data stored in binary large objects (blobs) – a common design decision in GIS with underlying database systems. The need for manual correction can arise in this case, and a loss of information is possible. Such a huge amount of data makes it necessary to establish policies for estimating the quality of data sets before archiving them.
Only data sets fulfilling these requirements will be archived; Lyon (2007) describes this for scientific primary and secondary data. An important criterion is met if the data cannot be reconstructed from earlier archived data with acceptable effort.
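The time modelling described in this section, a validity interval attached to each data set and evaluated at query time, can be sketched as follows. The field names and example records are invented for illustration.

```python
from datetime import date

def valid_at(records, when):
    """Return the records whose [valid_from, valid_to) interval contains
    the query date; valid_to=None means the record is still current."""
    return [r for r in records
            if r["valid_from"] <= when
            and (r["valid_to"] is None or when < r["valid_to"])]

# Two versions of the same parcel: the newer one does not overwrite
# the older one, it merely closes the older one's validity interval.
parcels = [
    {"id": 1, "use": "farmland", "valid_from": date(1990, 1, 1),
     "valid_to": date(2005, 6, 1)},
    {"id": 1, "use": "building land", "valid_from": date(2005, 6, 1),
     "valid_to": None},
]
print([r["use"] for r in valid_at(parcels, date(2000, 1, 1))])  # ['farmland']
```

Records whose intervals have closed can be moved out to the archive and re-imported later without conflicting with current data, exactly because the metadata distinguishes them.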
5.5 Conclusion
Long-term archiving of digital objects and systems is a challenging task. We have presented the main strategies used to meet this challenge, and have shown that each of them has major drawbacks. One of the most difficult aspects in the design of an archive for digital data is the fact that future developments cannot be foreseen and that the digital revolution has most likely not ended yet. For all that, great efforts have to be made to enable the preservation of our digital heritage. In the context of geodata, the large amount of data and the tight coupling of data with the corresponding applications hinder straightforward archiving. Great efforts have been made to define generally accepted data formats like GML, which can be ingested into an archive without problems. For the applications, open standards and comprehensive (technical) documentation are usually not available, which also holds for the databases used in the context of GIS; these applications and databases will therefore likely have to be preserved using emulators. We believe that hybrid approaches combining emulation and migration will open the door to a feasible long-term archiving of geodata stored in databases, together with its query, representation and analysis functionality. Future work has to focus on application-tailored archiving strategies and their evaluation.
5.6 References
ANSI Art, viewed 12 January 2009
Borghoff, U M, Rödig, P, Scheffczyk, J & Schmitz, L 2006, Long-Term Preservation of Digital Documents, Springer, Berlin
Casselman, S 1993, Virtual computing and the Virtual Computer, Proceedings of the IEEE Workshop on FPGAs for Custom Computing Machines, Napa, CA
Consultative Committee for Space Data Systems (CCSDS) 2002, Reference Model for an Open Archival Information System (OAIS), Technical report
Costello, D J, Hagenauer, J, Imai, H & Wicker, S 1998, Applications of Error-Control Coding, IEEE Transactions on Information Theory, vol. 44, no. 6
De La Beaujardiere, J 2006, Web Map Service, OGC reference number OGC 06-042
Denz, C 2008, Nonlinear Photonics, Competence-Center for Nanoanalytics in Münster, viewed 12 January 2009, <http://www.centech.de/programme/download.php?FL=ccn_presentation.pdf>
Engh, J 1981, The IBM Diskette and Diskette Drive, IBM Journal Res. Develop., vol. 25, no. 5
Fonseca, F, Egenhofer, M, Agouris, P & Câmara, G 2002, Using Ontologies for Integrated Geographic Information Systems, Transactions in GIS, vol. 6, no. 3, pp. 231-257
Foster, I & Kesselman, C 2003, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann
Fotheringham, A S 1994, Spatial Analysis and GIS, Taylor & Francis Ltd
Gladney, H & Lorie, R 2005, 'Trustworthy 100-Year Digital Objects: Durable Encoding for When It's Too Late to Ask', ACM Transactions on Information Systems, vol. 23, no. 3, pp. 299-324
Granger, S 2000, 'Emulation as a Digital Preservation Strategy', D-Lib Magazine, vol. 6, no. 10
Grimshaw, A S & Wulf, W A 1997, The Legion vision of a worldwide virtual computer, Communications of the ACM, vol. 40, issue 1
Hendley, T 1998, Comparison of Methods and Costs of Digital Preservation, British Library Research and Innovation Report 106, CimTech Ltd., Univ. of Hertfordshire
Hosse, K & Schilcher, M 2003, Temporal GIS for Analysis and Visualisation of Cultural Heritage, Proceedings of the CIPA XIX International Symposium, Commission V, WG5, Antalya
Hosse, K 2005, Objektorientierte Modellierung und Implementierung eines temporalen Geoinformationssystems für kulturelles Erbe, Dissertation, Technische Universität München
Hughes, G, Murray, J, Kreutz-Delgado, K & Elkan, C 2002, Improved Disk-Drive Failure Warnings, IEEE Transactions on Reliability, vol. 51, no. 3
International Organization for Standardization 2003, Geographic Information – Spatial Schema, ISO/IEC 19107
International Organization for Standardization 2005, Geographic Information – Metadata, ISO/IEC 19115
International Organization for Standardization 2007, Geographic Information – Geography Markup Language (GML), ISO/IEC 19136
International Organization for Standardization 2007, Geographic Information – Metadata – XML schema implementation, ISO/IEC 19139
International Organization for Standardization 2005, Information technology – Service management, ISO/IEC 20000
International Organization for Standardization 2008, Information technology – Document description and processing languages – Office Open XML File Formats, ISO/IEC 29500
JaC64, Java Based Commodore C64 Emulation, viewed 12 January 2009, http://www.jac64.com
Janée, G, Mathena, J & Frew, J 2008, 'A Data Model and Architecture for Long-term Preservation', JCDL '08: Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, ACM, pp. 134-144
Lindholm, T & Yellin, F 1999, Java Virtual Machine Specification, Addison-Wesley Longman Publishing Co., Inc., Boston, MA
Longley, P, Goodchild, M, Maguire, D & Rhind, D 2005, Geographic Information Systems and Science, John Wiley & Sons, Ltd.
Lorie, R 2001, 'Long Term Preservation of Digital Information', JCDL '01: Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries, ACM, pp. 345-352
Martin, D 1996, Geographic Information Systems: Socioeconomic Applications, Routledge Chapman & Hall
Moore, G 1965, Cramming more components onto integrated circuits, Electronics, vol. 38, no. 8, April 19
Morris, S P 2006, Geospatial Web Services and Geoarchiving: New Opportunities and Challenges in Geographic Information Services, Library Trends, vol. 55, no. 2, pp. 285-303
Nestor 2008, Handbuch: Eine kleine Enzyklopädie der digitalen Langzeitarchivierung, viewed 12 January 2009, http://nestor.sub.unigoettingen.de/handbuch/nestor-handbuch.pdf
ODF Alliance, MS Office 2007 Service Pack 2 With Support for ODF: How Well Does It Work?, viewed 28 September 2009, http://odfalliance.org/resources/fact-sheet-Microsoft-ODFsupport.pdf
Office of Government Commerce 2000, Service Support, The Stationery Office, ISBN 0-11-330015-8
Pinheiro, E, Weber, W-D & Barroso, L A 2007, Failure Trends in a Large Disk Drive Population, Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST'07), San Jose, CA
Rigaux, P, Scholl, M & Voisard, A 2002, Spatial Databases: With Application to GIS, Morgan Kaufmann, San Francisco
Rothenberg, J 1999, Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation, Council on Library and Information Resources, Washington, DC
Schroeder, B & Gibson, G 2007, Disk failures in the real world: What does an MTBF of 1,000,000 hours mean to you?, Proceedings of 5th USENIX Conference on File and Storage Technologies (FAST'07), San Jose, CA Schurer, K 1998, `The Implications of Information Technology for the Future Study of History´, History and Electronic Artefacts, Edward Higgs, Clarendon Press, Oxford, pp. 155-158 Sisson, E 2008, Monuments, Memorials, and Spacecraft: A Test-Case in the Treatment of a Spacecraft as a Semiotic Artifact, available at SSRN: http://ssrn.com/abstract=1319376 Teege, G 2001, `Geodaten im Internet: Ein Überblick´, Informatik-Spektrum, Springer, vol. 24, no. 4, pp. 193-206
State-of-the-Art Survey of Long-Term Archiving 127 TFADI 1996, Preserving Digital Information. Report of the Task Force on Archiving of Digital Information, The Commission on Preservation and Access, Washington, DC Triebsees, T & Borghoff, U 2007, `Towards Constraint-Based Preservation in Systems Specification´, Quesada-Arencibia, Proc. 11th Int. Conf. on Computer-Aided System Theory (Eurocast 2007), Springer LNCS, vol. 4739 Turing, A, 1936, On computable numbers, with an application to the Entscheidungsproblem, Proceedings of the London Mathematical Society, Series 2, vol. 42 Yang, J & Sun, F-B, 1999, A Comprehensive Review of Hard-Disk Drive Reliability, Proceedings of the IEEE Annual Reliability and Maintainability Symposium, Washington D.C.
6 Preservation of Geospatial Data: the Connection with Open Standards Development
Steven P. Morris Digital Library Initiatives, NCSU Libraries [email protected]
Abstract Efforts to preserve geospatial data stand to benefit from the development and implementation of open standards which ensure that data are interoperable not only across software systems but also across time. The Open Geospatial Consortium (OGC) is an international industry consortium of companies, government agencies and universities that work together to develop publicly available interface specifications. OGC specifications support interoperable solutions that “geo-enable” the Web, wireless and location-based services, and mainstream information technology. Examples of OGC specifications include the Web Map Service (WMS), Web Feature Service (WFS), Web Coverage Service (WCS), Geography Markup Language (GML), and OGC KML. Within the OGC, a Data Preservation Working Group has been formed to address technical and institutional challenges posed by data preservation, to interface with other OGC working groups whose technical areas are affected by the data preservation problem, and to engage in outreach and communication with the preservation and archival information community. Key points of intersection between the data preservation problem and the OGC standards development activity include, but are not limited to, the following: the use of Geography Markup Language (GML) in archival data, development of content packaging schemes, management of data versions, persistent identification of data objects, inclusion of archival use cases in rights management schemes, and preservation of data representations. The attention of the OGC is particularly focused on service-oriented environments and, in this context, the challenge of data preservation will increasingly intersect with the general problem of bringing a temporal component, including data persistence, to services and associated applications.

M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_6, © Springer-Verlag Berlin Heidelberg 2011
6.1 Introduction

Digital geospatial data, which encompasses a range of content types including vector data (point/line/polygon), raster data or imagery, and spatial databases, poses numerous preservation challenges. Geospatial data does not typically exist as one definable set of record material; instead it often consists of the combination of a type, such as vector, and a format, often existing within a specific software environment (McGarva 2009). Some key data preservation challenges for geospatial data include:

• Routine overwrite of data: Many data resources are subject to routine update, and yet older versions of these data are most often either not retained or not made available for access. Additionally, the diffusion of data production points renders daunting the task of assembling such data into centrally managed archives.

• Absence of a widely supported non-proprietary archival format for vector data: Geospatial vector data is most commonly available, in its static form, as a file or set of files adhering to a specific proprietary format. Long-term support of these formats is open to question. Spatial databases, which may include vector data as well as other content types, pose significant preservation challenges.

• Persisting data state and context in web services: A shift towards web services- and API-based access to data may threaten the formation of secondary archives, as it becomes customary to point to services rather than acquire data resources in bulk. It is often secondary archives, rather than the data producer, that maintain content over longer periods of time. Furthermore, when interacting with web services it is often possible to save service context in a client environment, but it is typically not possible to save data state, since the data underlying a service may be subject to change.
• Preserving data representations: The true counterpart to the old paper map is not the geospatial dataset but rather the combination of various components: one or more datasets (or services) together with modeling, symbolization, classification, and annotation. Preserving this amalgam of components in anything other than a “desiccated” form presents a formidable technical challenge.

• Time-versioned data: Geospatial metadata standards and traditions have tended to treat datasets as static entities. The relationship between dataset instances at particular points in time and the larger notion of the dataset as a serial entity is not well established in metadata practice.

• Semantic challenges: Understanding the meaning of older data will require not just adequate metadata but also adequate means to interpret attribute and classification information, the meaning of which is grounded in the context of a particular time period (Morris 2008).

Many preservation challenges relate to, or are exacerbated by, the absence of open and widely adopted standards addressing such topics as archival representations of vector data, persistent cartographic representations, and content packaging. Development of open standards providing for interoperability of tools and data provides a foundation for efforts to develop technical approaches to data preservation that will stand the test of time and support not just preservation of data but also “permanent access” to data. It will increasingly be important to inform standards development efforts with concerns that relate to long-term and persistent access to data and data derivatives.
6.2 Standards Work of the Open Geospatial Consortium (OGC)

The lead organization for creation of international standards in the geospatial realm is the Open Geospatial Consortium, an international industry consortium of over 400 companies, government agencies and universities that work together to develop publicly available standards and specifications. OGC specifications support interoperable solutions that geo-enable the Web, wireless and location-based services, and mainstream information technology. Examples of OGC specifications include the Web Map Service (WMS), Web Feature Service (WFS), Geography Markup Language (GML), and OGC KML. The OGC has a close relationship with
ISO/TC 211, which addresses standardization in the field of digital geographic information, and a subset of OGC standards are now ISO standards. The OGC also works with other international standards bodies such as the W3C, OASIS, WfMC, and the IETF (McKee 2001).

6.2.1 OGC Standards and Specifications
The main products of the OGC are standards and specifications, which are technical documents that detail interfaces or encodings that have been developed by the membership to address specific interoperability challenges. These documents are built and referenced against the OGC’s Abstract Specification, which provides the conceptual foundation for most OGC specification development activities (OGC 2010 a). Standards and specifications are written for a more technical audience and detail the interface structure between software components (OGC 2010 b). The OGC also publishes Best Practices Documents containing discussion of best practices related to the use or implementation of an adopted OGC standard. Best Practices Documents are an official position of the OGC and represent an endorsement of the content of the paper (OGC 2010 c). In addition, the OGC publishes Discussion Papers, which present technology issues being considered in the Working Groups of the OGC Technical Committee. These papers have the purpose of creating discussion within the geospatial information industry on specific topics (OGC 2010 d).

6.2.2 Interoperability Program
The OGC Interoperability Program (IP) complements the Specification Program by focusing on development, testing, demonstration, and promotion of the use of OGC specifications. The Interoperability Program organizes and facilitates interoperability initiatives, which include:

Test beds: Rapid, multi-vendor collaborative efforts to define, design, develop, and test candidate interface and encoding specifications, which are subsequently reviewed, revised, and, potentially, approved in the OGC Specification Program.

Pilot Projects: Application and testing of OGC specifications in real-world applications using standards-based, commercial, off-the-shelf products that implement those specifications, helping users understand how to best implement interoperable geo-processing while also helping to identify gaps for further work.

Interoperability Experiments: Short-spanned, low-overhead, formally structured and approved initiatives, led and executed by OGC members to achieve specific OGC technical objectives (OGC 2010 e).

OGC Web Services (OWS) test beds involve a set of sponsor organizations, such as government agencies and commercial solutions providers, that are seeking open standards for their interoperability requirements. Focus areas for each OWS initiative are based on sponsor requirements. The OWS-7 initiative, under way in 2010, is organized around the threads of Sensor Fusion Enablement (SFE), Feature and Decision Fusion (FDF), and Aviation (OGC 2010 f). The OGC has published a wide range of standards and specifications pertaining to geospatial data and associated services, addressing functional areas such as web mapping, location services, imagery encoding, and sensor observation and modeling. While most OGC standards exist in the context of service-oriented architectures, there are standards that address geospatial data in both dynamic and static form.
6.3 Key OGC Standards and Specifications: Dynamic Geospatial Data

Geospatial web services allow end-user applications as well as server applications to make requests for sets of data over the web. Requests might also be made for particular data processes, such as finding a route or locating a street address. In web map service client applications, data is drawn from one or possibly many different sources and presented in map form to the user. These mapping environments take the burden of data acquisition and processing away from the end user or client application.

Web Map Service (WMS): The OGC WMS specification was released in 2000 (now in version 1.3) and by virtue of its simplicity gained wide adoption and vendor support. WMS is a lightweight web service at the core of which is the GetMap request, which allows the client application to request an image representation of a specific data layer. Requests can be made from individual clients such as desktop GIS software and web browsers, as well as from other map servers, which might blend data sources from a number of different servers (de La Beaujardiere 2006).
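To make this concrete, a GetMap request is simply a set of key-value parameters appended to a service endpoint URL. The Python sketch below assembles such a request for WMS 1.3.0; the endpoint and layer name are hypothetical placeholders, not part of any real service.

```python
from urllib.parse import urlencode

def build_getmap_url(endpoint, layer, bbox, width=800, height=600,
                     crs="EPSG:4326", fmt="image/png"):
    """Assemble the key-value parameters of a WMS 1.3.0 GetMap request."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "CRS": crs,                          # WMS 1.3 uses CRS; 1.1 used SRS
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return endpoint + "?" + urlencode(params)

# Hypothetical endpoint and layer for illustration only.
url = build_getmap_url("https://example.org/wms", "topo",
                       (40.0, -80.0, 41.0, -79.0))
```

A client fetches the resulting URL to receive a rendered map image; note that archiving such images captures only a “desiccated” view of the underlying data, which is precisely the preservation concern discussed below.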
The Web Map Context specification was developed to formalize how a specific grouping of one or more maps from one or more WMS services can be described in a portable, platform-independent format (Sonnet 2005). Additionally, the Styled Layer Descriptor (SLD) profile of the Web Map Service provides a means of specifying the styling of features delivered by a WMS using the Symbology Encoding (SE) language (Lupp 2010). WMS tiling efforts came as a response to the experience of Google Maps and other commercial map services, which demonstrated the speed with which static tiled imagery could be presented in user applications. The Web Map Tile Service (WMTS) Interface Standard was developed in order to provide a standard approach to accessing static map tiles (Masó 2010).

Web Feature Service (WFS): Web Feature Services (now in version 1.1), which handle vector data, stream the actual data in the form of GML. WFS, first released as a standard in 2002, has not been implemented on as wide a scale as WMS, partly due to its higher level of complexity. WFS could be used to automate data harvests, using the Transactional Web Feature Service (WFS-T) to transfer data updates from distributed sources to a central archive (Vretanos 2005).

Other OGC Web Services: Many other web services specifications have been released by the OGC, including the Web Coverage Service (WCS), which addresses distributed access to content such as satellite images, digital aerial photos, digital elevation data, and other phenomena represented by values at each measurement point (Whiteside 2008). OGC members are also specifying a variety of interoperability interfaces and metadata encodings, such as the Sensor Model Language (SensorML) and the Sensor Observation Service (SOS), that enable real-time integration of sensor webs into the information infrastructure.
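Returning to WFS for a moment: unlike a GetMap request, which yields an image, a GetFeature request returns the vector features themselves, encoded as GML. It can be sketched in the same key-value style; the endpoint and feature type name below are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Sketch of a WFS 1.1.0 GetFeature request; a real harvest workflow
# would page through results and store the returned GML payloads.
params = {
    "SERVICE": "WFS",
    "VERSION": "1.1.0",
    "REQUEST": "GetFeature",
    "TYPENAME": "topp:roads",   # hypothetical feature type name
    "MAXFEATURES": 100,
}
wfs_url = "https://example.org/wfs?" + urlencode(params)
```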
Data Preservation Challenges for Dynamic Geospatial Data

Data preservation challenges associated with dynamic data services are mostly related to the problem of retaining data state in an application or end-user representation. While it is typically possible to save service state (e.g., map area or view, zoom level, which data layers are shown), it is typically not possible to save the state of the data within the service, creating a preservation challenge with regard to capturing such interactions. Additionally, it may be necessary to preserve technical components that relate to the data service but are not part of the data itself. For example, if preservation of the cartographic representation of a map delivered by a WMS is important, then it may be necessary to preserve the associated SLD (McGarva 2009).
6.4 Key OGC Standards and Specifications: Data Files and Network Payloads

While most OGC standards focus on components of service-oriented architectures, some standards also address content that can exist either as a network payload or as a static data file.

GML: Geography Markup Language (GML) is a standard first introduced in 2000 by the OGC (now available in version 3.2.1) and subsequently published as ISO standard 19136 (Portele 2007). The GML specification declares a large number of elements and attributes intended to support a wide range of capabilities. Since the scope of GML is so wide, profiles of GML that deal with a restricted subset of GML capabilities have been created in order to encourage interoperability within specific domains that share those profiles. For example, a Point Profile has been developed for applications using point geometric data without the need for the full GML grammar. While GML can be used for handling file-based data, it has wider use in web services-oriented environments. A prominent example of a national GML implementation is UK Ordnance Survey MasterMap, based on GML 2.1.2. Notable domain implementations of GML include CityGML, implemented as an application schema for the representation, storage and exchange of virtual 3D city and landscape models (Groger 2008), and the Aeronautical Information Exchange Model (AIXM), which utilizes GML and is designed to enable the management and distribution of Aeronautical Information Services (AIS) data in digital format (AIEM 2010). GML is in some cases used as an enabling component within another technology. For example, the GML in JPEG 2000 for Geographic Imagery Encoding Standard defines the means by which GML is used within JPEG 2000 images for geographic imagery (Kyle 2006). GML is often used as a network payload in a web services context, such as in the case of data transfers through WFS.
GML also plays a role in providing spatial data types in other standards, such as the Geospatial eXtensible Access Control Markup Language (GeoXACML).
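To make the encoding concrete, the sketch below builds and reads back a minimal GML point using only Python's standard library. It assumes the GML 3.1 namespace and the simple gml:pos coordinate encoding; real data would follow a community application schema rather than this bare geometry.

```python
import xml.etree.ElementTree as ET

# Standard GML namespace (GML 3.2 uses .../gml/3.2 instead).
GML_NS = "http://www.opengis.net/gml"

# A minimal point geometry, roughly in the spirit of the Point Profile.
sample = (
    f'<gml:Point xmlns:gml="{GML_NS}" srsName="EPSG:4326">'
    '<gml:pos>51.5 -0.1</gml:pos>'
    '</gml:Point>'
)

def read_point(xml_text):
    """Parse a gml:Point and return its coordinate pair."""
    root = ET.fromstring(xml_text)
    pos = root.find(f"{{{GML_NS}}}pos")
    lat, lon = (float(v) for v in pos.text.split())
    return lat, lon

coords = read_point(sample)
```

The point to note for preservation is that the geometry itself is trivially readable with generic XML tooling; the hard part, as the chapter argues, is the application schema that gives attributes their meaning.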
KML: KML, formerly known as Keyhole Markup Language and now just referred to as OGC KML, took its original name from the company that created the product that eventually became known as Google Earth following the acquisition of Keyhole by Google in 2004. KML is an XML language focused on geographic visualization, including annotation of maps or images in digital globe or mapping environments. KML initially found use for data visualization within the Google Earth digital globe environment but is now supported in a range of environments both for three-dimensional and two-dimensional functionality. In April 2008 KML version 2.2 was approved as an international implementation standard by the OGC (Wilson 2008). KML provides support for both feature data, in the form of points, lines, and polygons, and image data, in the form of ground and photo overlays. Additional functionality for presentation via icons and informational "balloons" is also supported. While KML is primarily used for data visualization, the Extended Data feature makes it possible to include attribute information in the form of arbitrary XML, untyped name/value pairs, and typed data adhering to a schema. In version 2.2 the AtomPub metadata elements "author", "name", and "link" were added to facilitate addition of author and source information. In addition to being used to represent features on the earth’s surface for visualization purposes, KML can also be used to annotate content in a spatial context for discovery within geospatial search environments. Tools for conversion of preexisting geospatial data into KML are now available in other commercial and open source software environments. KML files may be associated with images, models, or textures that exist in separate files. KMZ files are compressed ZIP archive files which allow one or more KML files to be bundled together along with other ancillary files required for the presentation. 
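The KMZ convention can be illustrated with Python's standard zipfile module; the placemark content and ancillary file below are placeholders invented for the example.

```python
import io
import zipfile

# A minimal KML document; by convention the main document in a KMZ
# archive is named doc.kml.
kml = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<kml xmlns="http://www.opengis.net/kml/2.2">'
    '<Placemark><name>Sample point</name>'
    '<Point><coordinates>-78.6,35.8,0</coordinates></Point>'
    '</Placemark></kml>'
)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as kmz:
    kmz.writestr("doc.kml", kml)
    kmz.writestr("files/icon.png", b"...")  # placeholder ancillary resource

# Reading the bundle back lists the packaged members.
with zipfile.ZipFile(buf) as kmz:
    names = kmz.namelist()
```

Because a KMZ is an ordinary ZIP archive, its members remain accessible with generic tools, which is one reason such simple packaging conventions are attractive from a preservation standpoint.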
KML files may refer to external resources and other KML files via network links, which are used to link related data files and to facilitate data updates. Large data resources such as imagery datasets may be divided into a large number of smaller image files which are then made available via network links on an as-needed basis.

Data Preservation Challenges for File-Based Data and Network Payloads

Numerous preservation challenges accrue to file-based data resources even if those data are encoded using an open standard. Each GML implementation will raise its own preservation challenges in terms of schema evolution, ongoing tool support, and dependencies on any data resources or content that might be externally referenced. In the case of KML, one problem is that presentations using network links pose a preservation challenge in that data available via the links may no longer be available in the future (McGarva 2009).
6.5 The OGC Data Preservation Working Group
Engagement with the Open Geospatial Consortium (OGC) specification development and initiative process has been one important objective within the data preservation effort. While most OGC activities focus on service-oriented scenarios, inserting a temporal component into those services will be important in order to enable interoperability across time, not just across systems. Furthermore, geospatial web services may provide a means to develop archives in a more efficient, automated fashion. Engagement with the OGC, leading to the eventual formation of a Data Preservation Working Group, occurred along two simultaneous tracks. In the first case, NCSU Libraries teamed with EDINA (University of Edinburgh) to present on the intersection of preservation issues with the OGC specification development space at the November 2005 OGC Technical Committee Meeting in Bonn. At this event, a set of seven points of intersection between the digital preservation problem and existing OGC specification development activities was outlined (Robertson 2005). A second thread of contact with the OGC originated from a discussion within the U.S. Federal Geographic Data Committee (FGDC) Historical Data Working Group, which conducted a series of discussions related to the use of GML for archiving data in connection with the publication of National Archives and Records Administration (NARA) submission guidelines (NARA 2004). As a consequence of this early engagement, in December 2006 the OGC Data Preservation Working Group was formed to address technical and institutional challenges posed by data preservation, to interface with other OGC working groups that address technical areas that are affected by the data preservation problem, and to engage in outreach and communication with the preservation and archival information community (OGC 2010 g).
A goal of the Working Group has been to create a dialog with the broad spectrum of geospatial community and archival community constituents that have a stake in addressing data preservation issues. To date the work of the group has focused on identifying points of intersection between data preservation issues and OGC standards efforts, and on introducing temporal data management use cases into OGC discussions. In the future it is possible that the Working Group might conceive, design, coordinate, and implement demonstration, pilot, and production projects that demonstrate technical approaches to data preservation within the context of the OGC suite of technologies and interoperability initiatives. The group might also serve as a forum for the development of specification profiles and application schemas for archival purposes.

Limitations on Scope of the Data Preservation Working Group

Although some mechanisms such as the OGC Network and special topic focus days are in place to facilitate participation of the general public, for intellectual property rights reasons direct participation in OGC working groups is limited to OGC members. This restriction on participation limits the potential for the Data Preservation Working Group to function as a venue involving general participation of members of archival organizations, which are typically not OGC members. A second factor limiting the ability of the Working Group to function as a venue for discussion of broader geospatial data issues is the fact that many of the real data preservation problems faced by the archival and data custodian community involve proprietary technologies and will, at least in the near term, involve proprietary solutions. This is particularly true in the case of spatial database management.
6.6 Points of Intersection Between Geospatial Standards and Data Preservation Efforts

OGC specifications and initiatives have promoted interoperability of data and services across vendor and organizational boundaries. An increased focus on data preservation and data persistence would make it possible to address the challenge of supporting interoperability of data and services across time. The issue of data preservation intersects with a range of existing or potential OGC specification and technology development areas, including:
6.6.1 GML for Archiving
One area of discussion has been the possibility of creating an archival profile of GML, drawing from the experience of PDF/A (the archival profile of PDF) in terms of creating an archival profile for complex content while engaging software vendors in the process. While GML would appear to provide a promising alternative for data preservation, there are a number of complicating factors. GML is not so much a single format as an XML language for which there are a wide range of different community- or domain-specific implementations, embodied by specific GML profiles associated with specific GML versions, and for which different application schemas might be available. While GML provides a means to encode vector data in an open, non-proprietary manner, “permanent access” is at risk unless the GML data adheres to an application schema which continues to be widely supported by software tools at a later point in time. In 2006 the OGC released the Simple Features Profile, a constrained subset of GML designed to lower the barrier to implementation (Vretanos 2005). While the Simple Features Profile, due to its reduced complexity, might provide the basis for creation of a supportable archival profile of GML, roughly analogous to PDF/A, there may still be a question of quality and functionality tradeoffs, including the potential for data loss that might constitute the cost of transferring data into a sustainable GML-based archival format (McGarva 2009).

6.6.2 Content Packaging
One preservation-related opportunity relates to the potential development of a content packaging scheme to support routine exchange of content, automated enforcement of rights or policies vis-à-vis static data files, and automated formulation of data repository ingest packages. Geospatial data frequently consists of complex, multi-file, multi-format objects, including one or more data files as well as: geo-referencing files, metadata files, styling or legend files, attribute data, licensing information, and other ancillary documentation or supporting files. The absence of a standard scheme for content packaging can make transfer and management of these complex data objects difficult both for archives and for users of the data. In some information industries, complex (often XML-based) wrapper formats or content packaging standards have been developed. Examples include
METS (digital libraries), MPEG-21 DIDL (multimedia), XFDU (space data), and IMS-CP (learning technologies), yet no similar activity has occurred in the geospatial industry. In practice, archive formats such as Zip commonly function in the geospatial community as rudimentary content packages for multi-file datasets or groups of related datasets. Such archive files typically lack structured data intelligence about file relationships and functions within the data bundle. However, formalized approaches to the use of Zip files are beginning to appear. For example, KMZ files are used to package KML files and their ancillary components. Another example is the Metadata Exchange Format (MEF), developed for use in the open source GeoNetwork catalogue environment. MEF uses Zip as the basis for a formalized packaging of geospatial metadata as well as associated data and ancillary components (OSGeo 2007 a). MEF is explicit in its packaging of metadata, elements of which are encoded in a file manifest, but non-explicit in its packaging of the actual data and ancillary components, which are simply grouped into subordinate subdirectories within the package. MEF might provide a starting point for exploration of geospatial data packaging solutions.

6.6.3 Preserving Data State in a Services Context
Another preservation opportunity and challenge relates to the documentation, for later review, of interactions with distributed services and applications in a decision support context (including recreating output from earlier interactions with services). This would involve saving data state and not just service state in service interactions. WMS client applications, for example, can save service state, and the Web Map Context specification was developed by the OGC to formalize how a specific grouping of one or more maps from one or more map servers can be described in a portable, platform-independent format for storage in a repository. Yet data state within a referenced service cannot be saved, making it difficult or impossible to document the basis for decisions made when utilizing web services that are built upon changing data.

One opportunity to preserve elements of WMS services could conceivably lie in the capture of static WMS outputs such as those created in WMS tiling schemes. WMS tiling efforts came as a response to the experience of Google Maps and other commercial map services, which demonstrated the speed with which static tiled imagery could be presented in user applications. Later, the NASA WorldWind application demonstrated how pre-generated, tiled representations of WMS services could be used to speed response to requests for WMS content. In response, the open source community developed a draft Tile Map Service specification (OSGeo 2007 b). These efforts further inspired development of the OGC Web Map Tile Service specification as a counterpart to the WMS specification. Further development of tile map services might, in the future, create a basis for capturing WMS state in a decision-support context if earlier versions of tile sets were to be archived.
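For orientation, the arithmetic behind the common “XYZ” tiling scheme popularized by slippy-map services can be sketched as follows; this is an illustrative convention, and actual WMTS tile matrix sets may define different origins, scales, and axis orders.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile indices covering a point at a zoom level.

    Assumes the Web-Mercator XYZ convention: 2**zoom tiles per axis,
    origin at the top-left (northwest) corner of the projected world.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y
```

Because each tile address is deterministic, an archive of a tile set at a given date is effectively a snapshot of the rendered service state, which is what makes versioned tile sets interesting for decision-support documentation.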
6.6.4 Preserving Complex End-User Representations as Documents

Geospatial PDF documents present an approach to capturing, in static form, representations that result from interactions with services as well as outputs from GIS desktop software environments. PDF is commonly used to provide end-user representations of data in which multiple datasets may be combined and other value-added elements may be added, such as annotations, symbolization and classification of the data according to data attributes. While these finished data views, typically maps, can be captured in a simple image format, PDF provides some opportunity to add additional features such as attribute value lookup, added annotations, and toggling of individual data layers. GeoPDF, which specifies a method for geopositioning of map frames within a PDF document, originated as a proprietary technology developed by TerraGo Technologies. GeoPDF, which leverages complex features that exist within the core PDF standard, has proven to be an effective approach for presentation of complex geospatial content to diverse audiences that are not familiar with geospatial technologies. In September 2008 the GeoPDF encoding specification for geo-registration was introduced to the OGC standards process to make it an open standard, and it is now published as a Best Practices document. In parallel, Adobe introduced its own method for geo-registration into the ISO standards process for PDF (Adobe 2008). The preservation challenges that accrue to complex PDF documents will accrue to geospatial PDF documents as well. While the PDF/A specification has been developed to define an archive-friendly version of PDF (LoC 2007), some of the more advanced functionality that is put to use in geospatial implementations may not be supported by the PDF/A specification. The history of complex geospatial PDF documents is still rather short, and risks associated with external dependencies (e.g., fonts) and reliance on specialized software will require close attention over time. Standardization related to authorship of geospatial PDF documents may benefit efforts to preserve complex representations in static form (McGarva 2009).

Additional points of intersection between geospatial standards efforts and digital preservation include:

• Persistent Identification: Persistent identifier schemes support long-term access to data and enable stable linkages between datasets, schemas, services, and metadata. The OGC has already developed a URN (Uniform Resource Name) namespace that is utilized for naming persistent resources published by the OGC (Reed 2008), as well as Definition Identifiers to be used within that namespace (Whiteside 2009), although there has been some shift in sentiment towards favoring URIs (Uniform Resource Identifiers), which are directly addressable, over URNs, which require creation and maintenance of a name resolution service.

• Geosemantics: Understanding the meaning of older data will require not just adequate metadata but also adequate means to interpret attribute and classification information, the meaning of which is grounded in the context of a particular time period. The OGC Geosemantics Domain Working Group has been working to establish an interoperable and actionable semantic framework for representing the geospatial knowledge domains of information communities as well as mediating between them (OGC 2010 h).

• Metadata: Metadata support for versioned data will be needed, allowing metadata related to temporal instances of data to be managed and discovered in concert with the “series set” corresponding to the individual data snapshot.
Geospatial metadata might also be extended to include preservation-related information such as technical dependencies, detailed provenance and versioning history, and annotation (Shaon 2010). Metadata elements for such an extension might be found in the PREMIS data dictionary (PREMIS 2008). The OGC Metadata Domain Working Group coordinates metadata activities
Preservation of Geospatial Data 143
within the OGC, maintaining a close correspondence between the ISO/TC 211 metadata standards (ISO 2010) and the manner of addressing metadata within OGC specifications.
• Rights Management Support for archival use cases (e.g., the right to create copies for archival purposes, the right to maintain data in a “dark archive” context, and the handling of derivative data) will need to be part of any rights management scheme used to manage access to data. The OGC Geo Rights Management (GeoRM) Domain Working Group has been working to coordinate and mature the development and validation of digital rights management work for the geospatial community (OGC 2010 i).
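The URN-to-URI shift noted under Persistent Identification can be made concrete. The sketch below is a minimal illustration assuming the published OGC pattern for definition identifiers, `urn:ogc:def:{type}:{authority}:{version}:{code}`, and the corresponding `http://www.opengis.net/def/...` layout; it shows why directly addressable URIs need no separate resolution service:

```python
def ogc_urn_to_uri(urn):
    """Map an OGC definition URN to its directly addressable http URI.

    Follows the published OGC pattern
      urn:ogc:def:{type}:{authority}:{version}:{code}
      -> http://www.opengis.net/def/{type}/{authority}/{version}/{code}
    An empty version field is conventionally rendered as "0".
    """
    parts = urn.split(":")
    if parts[:3] != ["urn", "ogc", "def"] or len(parts) != 7:
        raise ValueError(f"not an OGC definition URN: {urn!r}")
    obj_type, authority, version, code = parts[3:]
    return "http://www.opengis.net/def/{}/{}/{}/{}".format(
        obj_type, authority, version or "0", code)
```

For example, the well-known CRS identifier `urn:ogc:def:crs:EPSG::4326` maps to `http://www.opengis.net/def/crs/EPSG/0/4326`, which a client can dereference directly.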
6.7 Conclusion

Many challenges await efforts to preserve geospatial data. While many of these challenges will continue to arise in situations that involve proprietary technologies and solutions, the continued development of open standards, and the resulting improvements in data transparency, should aid the data preservation effort. Specific opportunities for standards development may lie in areas such as archival profiles for data, content packaging for transfer and repository submission, the addition of preservation components to metadata standards, and the development of standards-based methods for creating persistent representations (cartographic or otherwise) of services. With regard to geospatial standards development in general, it will be increasingly important to inform those efforts with concerns that relate to long-term and persistent access to data and data derivatives. In general, OGC standards-based services will encounter data persistence challenges related to schema evolution, URI/URN persistence, and access to superseded resources. Due to the ephemeral nature of data in service-oriented environments, new challenges in maintaining data persistence will continue to arise.
6.8 References

Adobe, 2008. Adobe Supplement to the ISO 32000: BaseVersion 1.7, ExtensionLevel 3. Available at: http://www.adobe.com/devnet/acrobat/pdfs/adobe_supplement_iso32000.pdf (Accessed 25 June 2010).
AIXM 2010: Aeronautical Information Exchange Model, 2010. Available at: http://www.aixm.aero/public/subsite_homepage/homepage.html (Accessed 25 June 2010).
de La Beaujardiere, J., 2006. OpenGIS Web Map Service (WMS) Implementation Specification. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=14416 (Accessed 25 June 2010).
Gröger, G. et al., 2008. OpenGIS City Geography Markup Language (CityGML) Encoding Standard. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=28802 (Accessed 25 June 2010).
ISO 2010: International Organization for Standardization, 2010. ISO/TC 211: Geographic Information/Geomatics. Available at: http://www.isotc211.org/ (Accessed 25 June 2010).
Kyle, M. et al., 2006. OpenGIS GML in JPEG 2000 for Geographic Imagery Encoding Specification. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=13252 (Accessed 25 June 2010).
LoC 2007: Library of Congress, 2007. PDF/A-1, PDF for Long-term Preservation, Use of PDF 1.4. Available at: http://www.digitalpreservation.gov/formats/fdd/fdd000125.shtml (Accessed 25 June 2010).
Lupp, M. OpenGIS Styled Layer Descriptor Profile of the Web Map Service Implementation Specification. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=22364 (Accessed 25 June 2010).
Masó, J., Pomakis, K. and Julià, N., 2010. OpenGIS Web Map Tile Service Implementation Standard. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=35326 (Accessed 25 June 2010).
McGarva, G., Morris, S. and Janée, G., 2009. Technology Watch Report: Preserving Geospatial Data.
Available at: http://www.ngda.org/docs/Pub_McGarva_DPC_09.pdf (Accessed 25 June 2010).
McKee, L., 2001. OGC’s Role in the Spatial Standards World. Available at: http://portal.opengeospatial.org/files/?artifact_id=6207&version=1&format=pdf (Accessed 25 June 2010).
Morris, S., Nagy, Z. and Tuttle, J., 2008. North Carolina Geospatial Data Archiving Project: Interim Report. Available at: http://www.lib.ncsu.edu/ncgdap/documents/NCGDAP_InterimReport_June2008.pdf (Accessed 25 June 2010).
NARA 2004: National Archives and Records Administration, 2004. Expanding Acceptable Transfer Requirements: Transfer Instructions for Permanent Electronic Records. Available at: http://www.archives.gov/records-mgmt/initiatives/digital-geospatial-data-records.html (Accessed 25 June 2010).
OGC 2010 a: Open Geospatial Consortium, 2010. OGC Abstract Specifications. Available at: http://www.opengeospatial.org/standards/as (Accessed 25 June 2010).
OGC 2010 b: Open Geospatial Consortium, 2010. OpenGIS Standards. Available at: http://www.opengeospatial.org/standards/is (Accessed 25 June 2010).
OGC 2010 c: Open Geospatial Consortium, 2010. OGC Best Practices. Available at: http://www.opengeospatial.org/standards/bp (Accessed 25 June 2010).
OGC 2010 d: Open Geospatial Consortium, 2010. OGC Discussion Papers. Available at: http://www.opengeospatial.org/standards/dp (Accessed 25 June 2010).
OGC 2010 e: Open Geospatial Consortium, 2010. OGC Interoperability Program. Available at: http://www.opengeospatial.org/ogc/programs/ip (Accessed 25 June 2010).
OGC 2010 f: Open Geospatial Consortium, 2010. OGC Web Services, Phase 7. Available at: http://www.opengeospatial.org/projects/initiatives/ows-7 (Accessed 25 June 2010).
OGC 2010 g: Open Geospatial Consortium, 2006. OGC Data Preservation Working Group. Available at: http://www.opengeospatial.org/projects/groups/preservwg (Accessed 25 June 2010).
OGC 2010 h: Open Geospatial Consortium, 2010. OGC Geosemantics WG. Available at: http://www.opengeospatial.org/projects/groups/semantics (Accessed 25 June 2010).
OGC 2010 i: Open Geospatial Consortium, 2010. OGC Geo Rights Management (GeoRM). Available at: http://www.opengeospatial.org/projects/groups/geormwg (Accessed 25 June 2010).
OSGeo 2007 a: Open Source Geospatial Foundation, 2007. GeoNetwork Open Source: The Complete Manual. Available at: http://www.fao.org/geonetwork/docs/Manual.pdf (Accessed 25 June 2010).
OSGeo 2007 b: Open Source Geospatial Foundation, 2007. Tile Map Service Specification. Available at: http://wiki.osgeo.org/wiki/Tile_Map_Service_Specification (Accessed 25 June 2010).
Portele, C., 2007. OpenGIS Geography Markup Language (GML) Encoding Standard. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=20509 (Accessed 25 June 2010).
PREMIS Editorial Committee, 2008. PREMIS Data Dictionary for Preservation Metadata: version 2.0. Available at: http://www.loc.gov/standards/premis/v2/premis-2-0.pdf (Accessed 25 June 2010).
Reed, C., 2008. A URN Namespace for the Open Geospatial Consortium (OGC). Open Geospatial Consortium. Available at:
http://portal.opengeospatial.org/files/?artifact_id=27357 (Accessed 25 June 2010).
Robertson, A. and Morris, S., 2005. Long-term Preservation of Digital Geospatial Data: Challenges for Ensuring Access and Encouraging Reuse. Available at: http://www.lib.ncsu.edu/ncgdap/presentations/Architecture WG OGC Bonn 9th Nov Robertson Morris public.ppt (Accessed 25 June 2010).
Shaon, A. and Woolf, A., 2010. Long-Term Preservation for INSPIRE: A Metadata Framework and Geo-Portal Implementation. Available at: http://inspire.jrc.ec.europa.eu/events/conferences/inspire_2010/get_details.cfm?urlx=inspire.jrc.ec.europa.eu/events/conferences/inspire_2010/abstracts/55.html&KeepThis=true&TB_iframe=true&height=550&width=620 (Accessed 25 June 2010).
Sonnet, J., 2005. OpenGIS Web Map Service (WMS) Implementation Specification. Available at: http://portal.opengeospatial.org/files/?artifact_id=8618 (Accessed 25 June 2010).
Vretanos, P., 2005. OpenGIS Web Feature Service (WFS) Implementation Specification. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=8339 (Accessed 25 June 2010).
Whiteside, A., 2008. Web Coverage Service (WCS) Implementation Standard. Available at: http://portal.opengeospatial.org/files/?artifact_id=27297 (Accessed 25 June 2010).
Whiteside, A., 2009. Definition Identifier URNs in OGC Namespace. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=30575 (Accessed 25 June 2010).
Wilson, T., 2008. OGC KML. Open Geospatial Consortium. Available at: http://portal.opengeospatial.org/files/?artifact_id=27810 (Accessed 25 June 2010).
7 Pitfalls in Preserving Geoinformation - Lessons from the Swiss National Park
Stephan Imfeld, Rudolf Haller University of Zürich and Swiss National Park, Switzerland, [email protected]
Abstract

In the Swiss National Park (SNP) long-term research is one of the main objectives. Scientific data has been collected for almost 100 years, much of it in a geographical context within this alpine environment. The digital era started in 1992, when a geographical information system (GIS) was established. Over the years a considerable amount of analogue and digital data has accumulated, which the SNP is obliged to keep available for future generations of researchers. As one of the few undisturbed reference areas for global change with such long-term data series, the preservation of the source data is of high importance, since the value of the data increases as the time series are continuously extended. When preserving long-term scientific data in digital form one inevitably faces a multitude of problems not found in traditional (non-digital) archiving. We identified a set of obstacles and pitfalls which can have catastrophic effects on both archive creation and maintenance in a small organisation like the SNP. There are three main classes of threats to digital data and their archives: hardware, software, and brainware. The rapid developments in hardware and software are well-known aspects that make archiving of data, and especially geodata, a challenge in itself. One of the biggest threats to geodata is the so-called brainware, i.e. users, data maintainers, system managers and, in the long term, the heirs clearing the garret of their deceased ancestors, once brilliant researchers, unaware of the value of their data collections.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_7, © Springer-Verlag Berlin Heidelberg 2011
The SNP has encountered multiple pitfalls in its effort to preserve geodata. The development of the GIS software, with its regular updates and enhancements, resulted in several losses of data due to format incompatibilities, despite sticking with the largest GIS software vendor throughout the years. In comparison, media selection in traditional non-digital archives seems idyllic, as paper has been known for centuries as the most stable and durable storage medium. The mere volume of data handled at our geodata centre amounts to several terabytes, posing considerable problems in terms of backup, archiving and especially archive maintenance. Metadata is an accepted key requirement, without which data must be considered useless. Especially in long-term data collections, however, metadata by itself is not sufficient. Current concepts for finding metadata focus on search engines with either unstructured or highly structured search strategies, unfortunately disregarding the fact that data must be embedded in knowledge retrieval systems spanning from traditional knowledge repositories (publications, libraries) to current digital content management systems and metainformation on data lineage. Such a meta-meta database system was developed for two research hotspots in nature reserves in Switzerland and currently contains roughly 3000 items. Further research is urgently needed in digital geodata archiving, both to improve technical and administrative aspects and to build a strong awareness in the scientific disciplines that archiving not only publications but also source data is a key component of trying to better understand our changing world.
7.1 Introduction

Disseminating information across time and large spatial scales is a unique property of mankind. Language, and especially the recording of language, has contributed enormously to cultural evolution. Describing spatial characteristics is also possible, but usually results in lengthy wordings without the required accuracy. With the invention of maps, most information on complex and detailed spatial facts was depicted using cartographic representations on paper, resulting in precious but at the same time rare, costly and limited map productions. Around the turn of the 20th century, surveying and remote sensing steadily became more efficient, up to the point where data collection became so efficient and widespread that digital means of handling it became indispensable (Allgöwer and Bitter, 1992).
Geographical information systems (GIS) were developed and became the primary form of data representation in the past decade, even in relatively traditional national mapping agencies. This transition from a strictly paper-based information medium to primarily digital data handling, with only small portions finally being represented in paper form, opens up new and challenging problems. How can we counteract the threat of losing today’s spatial data for upcoming generations? The value of preserving and archiving GIS data was recognised twenty years ago (Hunter, 1988), but did not receive much attention until recently. The change to primarily digital forms of data is not confined to spatial data; it has also become a major research and development topic in many traditional libraries and archives. In Switzerland, the federal archives are being challenged by new legislation requiring all business processes, including publications and documents, to be solely in digital form by 2012; hence archiving must also be performed in digital form (‘Schweizerisches Bundesarchiv’, 2008). Many libraries are developing workflows and infrastructure to keep up with the digital era, as for example the PADI initiative (Preserving Access to Digital Information) of the National Library of Australia (National Library of Australia, 2009). This initiative is also noteworthy in that it is working towards an integration of digital geographical data in its processes. Other initiatives such as the North Carolina Geospatial Data Archiving Project (Morris et al., 2006) have focused primarily on geographical data, as it is inherently more complex than written texts or standard databases and much more prone to being irrecoverably lost. The Swiss National Park (SNP), founded in 1914 as the first national park in the Alps, anticipated its role in long-term research from the beginning (Schröter 1920, Braun-Blanquet 1940, Baer 1962, Allgöwer et al.
2006), enabling us today to observe environmental changes over almost 100 years in an exquisite reference area in our constantly changing world. The above-mentioned transition towards an (almost) purely digital data handling for spatial information has forced the SNP to search for and develop means of preserving this primarily digital data as well as historic data from the past 100 years. As research data has always been collected collaboratively by national and international groups, raw data was for many years often regarded as a private matter, until recently, when the loss of such data became evident and the scientific council decided that all research data must be made available and archived at the national park’s headquarters.
Today, several thousand datasets are maintained at our institution, most of them spatially referenced. The total data volume is reaching several terabytes of storage, which not only requires an adequate infrastructure for maintenance but also poses considerable problems for archiving purposes. Archiving digital data should cover storage in terms of media as well as physical storage facilities, and also the creation and maintenance of retrieval systems and catalogues for finding the data in the future. One must be well aware that archiving digital data includes not only the “pure” archiving aspects of safely storing coded text for future generations in a safe building and using specialized cataloguing systems for retrieval. Analogue archives have identified paper as the major long-term medium for preservation and are used to putting it in a safe place on a shelf. Digital archives must encompass many more aspects, including the selection of appropriate media and hardware, as well as making sure that the appropriate software for later access will remain available. Parts of these aspects must often also be considered in other information technology processes, but they are rarely of high importance as long as long-term conservation is not a core requirement. Only a few projects have investigated the preservation of primarily non-spatial research or governmental (relational, table-oriented) data (‘Schweizerisches Bundesarchiv’, 2008, Grobe et al., 2006). GI technology for archiving must go even further, because access to the highly complex data structures requires specialized software which must also be preserved. In recent years, mostly research and development projects have been carried out for GIS data archiving (Hoebelheinrich & Banning, 2008), and a few productive systems have been implemented (e.g. Morris et al., 2006).
We present an analysis of threats to the long-term preservation of digital spatial data, as well as the resulting concepts, ideas and procedures applied in the SNP for preserving data, notably an area under continuous development that is almost neglected by software vendors and many important national data centres.
7.2 Archiving needs in a wilderness area

Why do we need to think about archiving data in highly protected natural areas, where the main goal is to keep out human influences and let nature act by its own rules? In an article in Nature in 1923, Schröter pointed out the basic ideas behind the creation of the Swiss National Park, saying that '...absolute protection is secured for scenery, plants and animals; Nature alone is dominant. [...] By the work of successive generations of investigators, it will be possible to follow the truly natural successions and changes occurring within the area. [...]. In this unique laboratory, the naturalists of Switzerland will find themselves united in a common work.' This trilateral idea of protection, research, and information remains true and is written down in the national laws as an obligation of the park and its research (‘Bundesgesetz über den Schweizerischen Nationalpark im Kanton Graubünden’ 1980). It is clear that over 100 years research methods change, new insights generate new hypotheses, and new analytical methodologies such as geostatistics or genetics open up completely new dimensions in the understanding of nature (Burger 1950, Schütz, Krüsi & Edwards 2000, Saether et al. 2002). Moreover, research in the Swiss National Park became part of international research activities and contributed to global ecological knowledge (Grabherr, Gottfried & Pauli 1994, Camenisch 2002, Risch et al. 2004). To be able to make use of these new technologies, we require access to the original data and material. For these reasons the Swiss National Park, and in fact all natural protection areas and their research communities, have the obligation to collect and preserve all relevant information to achieve the goals set up almost 100 years ago (Filli, 2006). The term digital archive as we use it does not cover all digital data, but is confined to information that is either 'born digital' (created and disseminated primarily in electronic form) or for which the digital version is considered to be the primary medium (Hodge 2000). For these data, we can distinguish three main areas of threats: hardware, software, and brainware.
7.3 Hardware threats

Hardware in its general sense encompasses computer hardware such as media, controllers, and systems, but also includes non-computer aspects up to the maintenance of, and threats to, the surrounding buildings. The latter can be devastating to data archives, but such events are luckily very rare, and we are happy not to have encountered data loss due to fire, water, earthquake or similar events to date. Guarding against such major events requires replicating data off-site to unconnected buildings. In the Swiss National Park replication is currently performed to a second system
in a building 100 m away; the main working data pool is additionally mirrored to a second computing centre in Zurich. Computer systems cannot be considered very long-lasting assets. We have experienced four major hardware switches, i.e. one change every four to five years. As long as hard disk migration is looked after, online data should not be in danger. By contrast, offline media, media controllers and reading devices are at much greater risk of not being available after a major hardware exchange. Media might not seem to be a major problem initially, but in our experience over the past 16 years it is one of the biggest threats to data. We have encountered and used thirteen different types of storage media, of which only five can still be accessed to retrieve data. Today, we regret the loss of two datasets, due to the inability to read the media and the unavailability of the reading hardware, respectively. These two cases occurred during the collection of distributed datasets from different research laboratories which had not implemented data preservation plans. Professional data recovery services offer their expertise to restore data from unreadable media; unfortunately, in our research-oriented environment a direct monetary return on recovering lost data is hardly ever to be expected, and without it the costs are mostly considered too high compared to dropping the media into the bin and ignoring their scientific value. Some of the media used in the past have theoretical durability qualities (e.g. DVD) that might lead to the temptation of using them for archiving purposes. Nevertheless we do not recommend these media as the only means of archiving: in several cases a few weeks of sunlight exposure, while discs lay on a shelf awaiting migration to an underground archiving silo, rendered DVDs unreadable.
Tab. 7.1. Types of media storage used in the Swiss National Park in the last 16 years: paper, punch card (1928!), tapes, floppy disk, CD, QIC, Exabyte, DAT, DLT, ZIP/JAZ, LTO2, LTO3 and DVD. All thirteen types have been used; only a subset remains in use today, and only five can still be read.
Media which do not have automatic mechanisms to immediately report media failure should not be used as the only means of storing archival data. Currently the only medium with proven long-term capabilities for archival purposes is paper, even though certain handling and environmental conditions must be assured. Today, online hard disks are probably the second-best alternative, as long as immediate response by the administrators is guaranteed and crash recovery is implemented. In large enterprises, robotic tape libraries with automated media checking and built-in redundancy could in certain cases replace online disk systems, though for medium-sized and smaller organisations they are often not affordable. Offline media without mechanisms for immediate failure detection should not be used as the only means of archiving. Smaller datasets should be printed on paper and stored in appropriate traditional archives. At least one copy of an archival set should be stored on online media, i.e. currently on monitored online hard disk systems.
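One way to give offline or near-line copies the failure-reporting mechanism argued for above is a periodic fixity audit. The following sketch is our own illustration (file layout and function names are assumptions; any checksum scheme with scheduled re-verification would serve): it records SHA-256 checksums at archiving time and later reports missing or silently corrupted files:

```python
import hashlib
from pathlib import Path

def build_manifest(root):
    """Record a SHA-256 checksum for every file under `root`."""
    root = Path(root)
    return {str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(root.rglob("*")) if p.is_file()}

def audit(root, manifest):
    """Compare the archive against a stored manifest.

    Returns (missing, corrupt): files that disappeared and files whose
    current checksum no longer matches the recorded one.
    """
    current = build_manifest(root)
    missing = [f for f in manifest if f not in current]
    corrupt = [f for f in manifest
               if f in current and current[f] != manifest[f]]
    return missing, corrupt

# A manifest written at archiving time (e.g. a manifest.json stored with
# the data and off-site) can be re-audited on a schedule, turning silent
# media decay into a reported, repairable event.
```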
7.4 Software considerations

Software is one of the critical points in preserving data over many decades. Data as complex as geographical data is hardly usable without the appropriate software. In theory, good documentation and description of the data could be of use at a later time, but if the data models are complex its value is limited by the large effort needed to write specific reading software.
With the above-mentioned regular hardware upgrades and exchanges, operating systems are exchanged just as regularly, showing a strong interconnection between hardware and software. This means that the operating systems required to run certain software releases also have hardware dependencies, so one can be prevented from reading proprietary data formats by missing software, operating system or hardware, whichever becomes unavailable first. As an example, ArcInfo version 5 data cannot be read by versions later than version 6, yet some of this data may have been migrated physically to newer hardware during platform upgrades or exchanges. ArcInfo version 5 does not run on current hardware and operating systems, and hence data in this format is lost once the old hardware (and software) has been disposed of. Data archiving must be performed as independently of specific software as possible. The expenditure of maintaining old software, and in turn the hardware it requires, is affordable only in exceptional cases, if at all. At the same time we must realize that every conversion into a different environment or data format bears the risk of changes to geometry and attributes, indicating that these processes need to be limited as much as possible. Format conversion software (e.g. FME, GDAL/OGR) can be of great help, but we have to make sure that the result of the conversion process is checked for completeness and accurate transformation, a highly difficult task for both geometry and topology in GIS datasets. Major software releases are often accompanied by relevant changes in the data formats. Geographical information systems for handling spatial data are highly specialized systems dealing with different problems in representing and handling 2-, 3-, or even 4-dimensional data. Solutions for the representation of, for example, topology or fields have been implemented in many different ways.
The developments by commercial GIS vendors have been tremendous, resulting in yearly software releases and major software revisions about every two to three years. In the past 17 years the GIS of the Swiss National Park has had to deal with five major GIS software releases, each accompanied by underlying data format changes: pre-version-5 INFO, version 6 INFO, the advent of the shapefile format, personal geodatabases, pre-9.1 geodatabases, 9.2 geodatabases, 9.2 geodatabases with terrains, and file-based geodatabases. Unfortunately software vendors do not, and sometimes, due to fundamental software changes and improvements, probably cannot, maintain data format compatibility over more than a few major releases.
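Vendor-independent format identification is one defence against this dependency chain. As an illustration, the fixed 100-byte shapefile header is publicly documented in the ESRI Shapefile Technical Description and can be parsed without any GIS software installed; the sketch below reads its magic number, version, shape type and bounding box:

```python
import struct

def read_shapefile_header(path):
    """Parse the fixed 100-byte header of an ESRI shapefile (.shp).

    Field layout follows the published ESRI Shapefile Technical
    Description: big-endian file code (9994) and file length (in 16-bit
    words), then little-endian version (1000), shape type and bounding box.
    """
    with open(path, "rb") as f:
        header = f.read(100)
    file_code, = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile: bad magic number")
    file_length_words, = struct.unpack(">i", header[24:28])
    version, shape_type = struct.unpack("<ii", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {"version": version, "shape_type": shape_type,
            "length_bytes": file_length_words * 2,
            "bbox": (xmin, ymin, xmax, ymax)}
```

A small identification step like this lets an archive confirm, decades later and without the original vendor software, that a file still is what its extension claims.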
As has been clearly shown during the past decades, GIS data formats from different vendors exhibit incompatibilities on multiple levels. Incompatibilities ranging from data models (e.g. object-oriented versus relational) to real-world representations, down to the level of restrictions in the underlying databases in column naming or differences in precision, can all lead to major obstacles in migrating large GIS databases from one system to another. The multitude of data conversions needed over a longer time period, such as 100 years, could render the data itself unusable. Reading historic versions of a competitor's data formats is purely illusory, with the exception of cases where a data format became a de facto standard. Migrations and upgrades of data formats even within one vendor's systems can introduce unwanted geometrical changes to the data. Migrating from the geodatabase format of ESRI's spatial database engine (SDE) to the widely used shapefile format results in changes to the underlying geometry, with shifting of data and changes such as the approximation of arcs by multiple linear elements. Performing such migrations and upgrades, as theoretically required whenever a major software release is accompanied by an underlying format change, may therefore introduce unwanted changes to the data, defeating one of the primary goals of archiving: maintaining the current state of data over long time spans. In the Swiss National Park we have lost at least one dataset from the early days to incompatibilities of pre-version-5 formats with ESRI's current software suite. Hence we are facing the apparently unsolvable contradiction between required upgrades, with their inherent data changes, and the preservation of original data formats which will not be readable in the future. The role of documentation, exchange and interoperability standards (e.g.
ISO/TC 211, OGC) may become more important as they are implemented in current software, even though they only become standards when a majority adopts them in daily business. Data formats are critical in maintaining data. Data should be archived in non-proprietary formats or in open, publicly described formats. Data models should be kept as simple as reasonable, increasing the chance of being able to migrate, upgrade, maintain and restore the original data after decades.
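Checking a conversion for completeness and accurate transformation, as demanded above, can at least be approximated automatically. The sketch below is a deliberately simple plausibility check of our own devising, not a full geometry or topology comparison: it verifies that no features were dropped and that each feature's bounding box stayed within a tolerance, while allowing vertex counts to differ, e.g. where arcs were densified into line segments:

```python
def bbox(coords):
    """Axis-aligned bounding box of a sequence of (x, y) vertices."""
    xs = [x for x, y in coords]
    ys = [y for x, y in coords]
    return min(xs), min(ys), max(xs), max(ys)

def conversion_ok(src_features, dst_features, tol=1e-6):
    """Cheap plausibility check after a format migration.

    Features are given as lists of (x, y) vertex sequences, in the same
    order in both datasets. Fails if features were dropped or if any
    feature's bounding box moved by more than `tol`.
    """
    if len(src_features) != len(dst_features):
        return False
    for s, d in zip(src_features, dst_features):
        if any(abs(a - b) > tol for a, b in zip(bbox(s), bbox(d))):
            return False
    return True
```

In practice the vertex sequences would be extracted from the source and target datasets with conversion tooling such as GDAL/OGR; a passed check makes gross migration errors unlikely but does not prove topological equivalence.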
7.5 Data volume The size of today’s geoinformation databases start to exceed feasibility barriers of current technologies in small to medium organisations in terms of backup as well as long term storage. New concepts and technologies are urgently required which are on the one hand capable of storing massive volumes of data at minor costs (with preferably no energy consumption) and on the other hand guarantee readability of the data over decades. Today the SNP maintains over 4000 datasets requiring large disk systems with several dozens of terabytes storage capacity. With such amounts of data, traditional backup systems become more and more unaffordable at least for smaller organisations. New mechanisms of securing data must be implemented. Archiving data might help solving the aspect of maintaining data for many years, but the physical archive itself, when becoming large, inherits the same threats and difficulties as maintaining a normal system without any history. Data volumes become too large for affordable backup media, datasets too numerous to be maintained by hand. Hence a movement towards disk based backups, snapshot oriented file systems has started in recent years. Data mirroring has become a cornerstone for crash recovery and survival due to the large data volumes as terabytes of data can not be restored from tape systems within short times rendering an organisation unmanageable when 24h hour data access is mission critical. In the SNP such critical data in terms of access times does not exists, nevertheless restore times of several weeks could cause major problems and have to be prevented. Until new storage concepts become available, we assume that the current movement of all data even including backup data will more and more be migrated to disk based systems. Storing archiving versions on paper is sometimes not possible due to the massive volumes. 
For smaller datasets it might seem old-fashioned at first, but paper may well be one of the most durable ways of preserving the information, provided that all metadata, information on relationships and lookup tables are also attached.
Pitfalls in Preserving Geoinformation 157

7.6 Brainware

The threats and problems mentioned above are often considered the major threats to maintaining data. In our experience they certainly do play a role and must be carefully considered in every institution, but they concentrate on all the interesting technical details that keep hardware and software manufacturers in business, neglecting the biggest threat to data: humans.

Unintended changes to, or even deletions of, data are probably more often the cause of data loss than hardware and software problems. Humans also tend to forget details over the years, and with this fading memory, knowledge of and about data can be lost. At first this might seem a ridiculous argument, but the Swiss National Park is currently putting much effort into reconstructing the history of its research, and thereby of its data, making the difficult experience that from 95 years of research tradition, the largest amount of data has been lost due to the following blunders:
• undocumented data,
• loss of knowledge of where the data is,
• loss of knowledge that the data exists,
• loss of knowledge due to the (natural) deaths of those responsible for projects,
• unwillingness to share data,
• missing awareness of the importance of certain datasets.

The situation of the SNP is quite a classical one, at least in European national parks and protected areas and presumably also in most other regions of the world, in that a large part of the geodata originates from a large distributed network of researchers working at various institutions. Until recently, responsibility for data maintenance was, or should have been, the individual researcher's duty. Today it is recognised that central data management is a key element in preserving long-term data, especially digital data, as it requires not only awareness of the topic's importance but also specialized knowledge, infrastructure, personnel, and hence financial resources.
7.7 Documentation

A dataset without the appropriate metadata describing its origin and meaning must be considered of no value (Doucette & Paresi 2000). Knowledge about datasets must be transferred from the producer to prospective users; in the case of the SNP this transfer must often span almost a century. It is thus clear that data descriptions must be an inherent part of an archiving process.
158 S. Imfeld, R. Haller
Several standards for meta-information exist for spatial as well as non-spatial data (Technical Committee ISO/TC 211 2003). In our view, the choice of standard is of minor importance as long as it integrates a complete description allowing a thorough understanding by other users. In our experience, producing metadata in such a way is often too difficult for the majority of researchers, who thus need professional assistance. Describing data in detail includes not only the documentation of spatial and temporal aspects, semantics and quality measurements, but also meticulous descriptions of database fields and coding schemes, including the respective references (Veregin 1999). The time needed for the complete documentation of a medium-sized project in the SNP can range from a few hours to several days. Compared to the initial time and resources invested in planning, acquiring and analysing the data, a few days is only a marginal but inevitable effort to guarantee its usability at a later time. The amount of effort needed can often be reduced by integrating published and sometimes unpublished material describing the motivation, purpose and methods used for producing the data.
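The kind of record such a description amounts to can be sketched as a small data structure, loosely modelled on the core elements of ISO 19115 (title, abstract, originator, extents, lineage) plus the field-level descriptions argued for above. The class and field names here are illustrative, not the SNP's actual schema, and the example values are invented.

```python
# Sketch of a dataset documentation record, loosely following ISO 19115 core
# elements. Names and values are illustrative, not the SNP's actual schema.
from dataclasses import dataclass, field

@dataclass
class FieldDescription:
    name: str
    meaning: str   # semantics of the attribute
    coding: dict   # code -> meaning, with a reference to the coding scheme

@dataclass
class DatasetRecord:
    title: str
    abstract: str
    originator: str          # producer; this knowledge must survive staff turnover
    temporal_extent: tuple   # (start_year, end_year)
    spatial_extent: tuple    # (xmin, ymin, xmax, ymax) in the national CRS
    lineage: str             # how the data was produced and digitised
    fields: list = field(default_factory=list)

record = DatasetRecord(
    title="Permanent vegetation plots (hypothetical example)",
    abstract="Species cover on permanent plots since 1917.",
    originator="Historic field campaigns; digitised by the research team",
    temporal_extent=(1917, 2008),
    spatial_extent=(810000, 170000, 815000, 175000),
    lineage="Field releves, digitised 2002; methods in the project report.",
    fields=[FieldDescription(
        "cover", "cover-abundance code",
        {"r": "solitary", "+": "few individuals", "1": "<5% cover"})],
)
print(record.title)
```

Note that the coding scheme travels with the record: a bare attribute value such as `"+"` is meaningless a century later without exactly this kind of attached description.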
7.8 Integration

When organising thousands of digital datasets, their meta-information and all required auxiliary information, an integrating framework for storage and retrieval is required. The SNP designed and implemented a meta-meta database for the purpose of data cataloguing, search and retrieval of current and historic data, and especially for describing the interrelationships between data, publications, institutions, projects and metadata. The system was developed in such a way that multiple starting points are available when searching for data. When researchers are looking for data that was used for a specific publication, they can be guided to the corresponding project, further on to the metadata, and finally to the location where the data in question is to be found. Other starting points, such as an organisational name, a project, or even a full-text search, can also be used. The system further integrates a large bibliographic database and includes, where available, digital versions of the corresponding publications; it is thus slowly but steadily becoming one of the main retrieval systems for the academic community in the SNP. One of the special features of the system is its ability to depict dependencies in terms of the lineage of information. It is publicly available via http://www.parcs.ch/mmds.
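The "multiple starting points" idea can be sketched as a small relational catalogue in which datasets, projects and publications are linked, so that a query may begin at any of them. The sketch below uses SQLite; the table layout, names and the archive path are illustrative assumptions, not the actual SNP schema.

```python
# Minimal sketch of a "meta-meta" catalogue: datasets, projects and
# publications linked so a search can start from any of them.
# Table layout and all values are illustrative, not the SNP's schema.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE project     (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dataset     (id INTEGER PRIMARY KEY, name TEXT, location TEXT,
                          project_id INTEGER REFERENCES project(id));
CREATE TABLE publication (id INTEGER PRIMARY KEY, title TEXT,
                          project_id INTEGER REFERENCES project(id));
""")
con.execute("INSERT INTO project VALUES (1, 'Ungulate research')")
con.execute("INSERT INTO dataset VALUES "
            "(1, 'ibex_telemetry', '/archive/gis/ibex', 1)")
con.execute("INSERT INTO publication VALUES "
            "(1, 'Stochastic population dynamics of ibex', 1)")

# Starting point: a publication. Follow the links via the project to the
# dataset and the location where the data is to be found.
row = con.execute("""
    SELECT d.name, d.location
    FROM publication p
    JOIN project pr ON pr.id = p.project_id
    JOIN dataset d  ON d.project_id = pr.id
    WHERE p.title LIKE '%ibex%'
""").fetchone()
print(row)  # ('ibex_telemetry', '/archive/gis/ibex')
```

The same joins run in the opposite direction equally well, which is exactly what makes an organisational name, a project or a full-text hit usable as alternative entry points.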
7.9 Conclusions

Organising and archiving the SNP's digital research and general data pool is, as outlined above, a challenging task. The peculiarities of digital geographical information require special mechanisms and procedures when archiving is required, which is far more complicated than usual information archiving or the long-term storage of standard (relational) databases. Today no established and accepted standards exist for this purpose. Although we are facing an incredible increase in the volume of digital data produced, we are confronted with the paradox that this data will be lost much faster than data originating from before the digital era.
7.10 References

Allgöwer, B & Bitter, P (1992), 'Konzeptstudie zum Aufbau eines geographischen Informationssystems für den Schweizerischen Nationalpark (GIS-SNP)' (Jahresbericht GIS-SNP 1992), Wissenschaftliche Nationalparkkommission, Nationalparkdirektion.
Allgöwer, B, Stähli, M, Bur, M, Koutsias, N, Koetz, B, Morsdorf, F, Finsinger, W, Tinner, W & Haller, R (2006), 'Long-term fire history and high-resolution remote sensing based fuel assessment: Key elements for fire and landscape management in nature conservation areas', Forest Ecology and Management, vol. 234, pp. 212-222.
Baer, J (1962), 'Un demi-siècle d'activité scientifique dans le Parc national suisse', Actes Soc Helv Sc Nat, pp. 50-62.
Braun-Blanquet, J (1940), 'Vingt années de botanique au Parc National Suisse', Act Soc Helvet Sc Nat, pp. 82-87.
Bundesgesetz über den Schweizerischen Nationalpark im Kanton Graubünden (1980), Bern, viewed 09 February 2009.
Burger, H (1950), 'Forstliche Versuchsflächen im schweizerischen Nationalpark', Mitt Schweiz Anst forstl Verswes, vol. 26, pp. 583-634.
Camenisch, M (2002), 'Veränderungen der Gipfelflora im Bereich des Schweizerischen Nationalparks: ein Vergleich über die letzten 80 Jahre', Jber. Natf. Ges. Graubünden, vol. 111, pp. 27-37.
Doucette, M & Paresi, C (2000), 'Quality management in GDI', in Geospatial Data Infrastructure, eds Groot, R & McLaughlin, J, Oxford University Press, Oxford.
Filli, F & Suter, W (eds) (2006), Huftierforschung im Schweizerischen Nationalpark, Nat.park-Forsch. Schweiz, vol. 83, p. 241.
Grabherr, G, Gottfried, M & Pauli, H (1994), 'Climate effects on mountain plants', Nature, vol. 369, p. 448.
Grobe, H, Diepenbroek, M, Dittert, N, Reinke, M & Sieger, R (2006), 'Archiving and distributing earth-science data with the PANGAEA information system', in Antarctica, eds Fütterer, DK, Damaske, D, Kleinschmidt, G, Miller, H & Tessensohn, F, pp. 403-406.
Hoebelheinrich, N & Banning, J (2008), 'An investigation into metadata for long-lived geospatial data formats', technical report, Stanford University Libraries.
Hodge, GM (2000), 'Best practices for digital archiving: An information life cycle approach', D-Lib Magazine, vol. 6(1), viewed 09 February 2009.
Hunter, GJ (1988), 'Non-current data and geographical information systems: A case for data retention', Int. J. Geographical Information Systems, vol. 2(3), pp. 281-286.
Morris, SP, Tuttle, J & Farrell, R (2006), 'Preservation of state and local government digital geospatial data: The North Carolina Geospatial Data Archiving Project', Proceedings of Archiving, pp. 45-48.
National Library of Australia, Preserving Access to Digital Information (PADI) initiative, viewed 09 February 2009.
Risch, AC, Schütz, M, Krüsi, B, Kienast, F & Bugmann, H (2003), 'Long-term empirical data as a basis for the analysis of successional pathways in subalpine conifer forests', Austrian Journal of Forest Science, vol. 120, pp. 59-64.
Risch, A, Schütz, M, Krüsi, B, Kienast, F, Wildi, O & Bugmann, H (2004), 'Detecting successional changes in long-term empirical data from subalpine conifer forests', Plant Ecology, vol. 172(1), pp. 95-105.
Saether, BE, Engen, S, Filli, F, Aanes, R, Schröder, W & Andersen, R (2002), 'Stochastic population dynamics of an introduced Swiss population of the ibex', Ecology, vol. 12, pp. 3457-3465.
Schütz, M, Krüsi, BO & Edwards, JP (eds) (2000), Succession research in the Swiss National Park: from Braun-Blanquet's permanent plots to models of long-term ecological change, Nat.park-Forsch. Schweiz, vol. 89, p. 259.
Schröter, C (1920), 'Der Werdegang des schweizerischen Nationalparks als Total-Reservation und die Organisation seiner wissenschaftlichen Forschung', Denkschriften der Schweizerischen Naturforschenden Gesellschaft, vol. 1.
Schröter, C (1923), 'The Swiss National Park', Nature, vol. 112, pp. 478-481.
Schweizerisches Bundesarchiv (2008), SIARD Formatbeschreibung, viewed 09 February 2009.
Schweizerisches Bundesarchiv (2008), Elektronische Geschäftsverwaltung im Bund, viewed 09 February 2009.
Technical Committee ISO/TC 211 (2003), ISO 19115:2003, Geographic information -- Metadata, ISO, Bruxelles.
Veregin, H (1999), 'Data quality parameters', in Geographical Information Systems, eds Longley, PA, Goodchild, MF, Maguire, DJ & Rhind, DW, John Wiley & Sons, New York, pp. 177-189.
8 Geospatialization and Socialization of Cartographic Heritage
Dalibor Radovan¹, Renata Šolar²
¹ Geodetic Institute of Slovenia, Ljubljana, Slovenia, [email protected]
² National and University Library, Map and Pictorial Collection, Ljubljana, Slovenia, [email protected]
Abstract Digital and geospatial technologies have changed the appearance, the media and the role of cartographic heritage. Once available only as hardcopy maps, it is nowadays transformed into diverse virtual forms and combined with other data and information. This chapter shows how geodesy, cartography and librarianship were merged to bring cartographic heritage closer to users and to the education process. In the context of the geolibrary, the chapter presents several GIS, web and multimedia applications based on cartographic heritage and geodetic data:
• a dynamic presentation of settlement growth, produced as a time series of interchanging 3D historical maps populated with 3D buildings of the respective era;
• a 3D presentation of the WWI front line on the river Soča, using contemporary and old data on the Google Earth platform;
• a database of all Slovenian place names, automatically replicated in all six grammatical cases of the Slovenian language so that each name can be geoparsed;
• a pilot web-based application representing a virtual collection of geolocated contemporary and historical maps, postcards, portraits, ancient panoramic views and audio clips.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_8, © Springer-Verlag Berlin Heidelberg 2011
Finally, we discuss the applications and the potentials of cartographic heritage from the aspect of social media.
8.1 Introduction

The essence of the library has not changed over the centuries. It is still a collection of textual and graphic materials arranged for easy use, cared for by an individual or individuals familiar with the arrangement, and accessible to at least a limited number of persons (Tedd, Large 2005). Cartographic heritage is no exception; we can even observe that collections of old maps are less visited by library users than, for example, the book or newspaper sections. Recent developments in information and communication technologies, especially the internet and the web, have brought significant changes in the ways we generate, archive, distribute, access and use cartographic information. The new digital era and the emerging geospatial technologies are also changing the paradigms of librarianship (Šolar, Radovan 2008).

One of the most important contributions of web technology has been the creation of digital libraries, which allow users to access digital information resources from virtually anywhere in the world. Additionally, the introduction of geocoding into library collections has resulted in digital geolibraries, where position, place and space become the cues for exploring the contents. The aim of this chapter is to present the evolution of cartographic heritage from static spatial information to a living lab for education, simulation, browsing and visualization of library collections on different media and communication channels. From the four applications described, we also learn about the social implications of libraries and archives in the very near future.
8.2 The Geolibrary Paradigm

The idea of accessing library collections by spatial location was put forward by Goodchild (1998, 2004), one of the initiators of the Alexandria Digital Library project. Parallel to the world-wide initiative to digitize library items, geoinformation scientists were thinking of using the mass of references to spatial locations hidden within the contents of
old maps, books and other hardcopy documents. Goodchild suggested that libraries should become geolibraries - a 'spatially oriented' type of digital library filled with georeferenced information, with locations acting as the primary basis for the representation and retrieval of information. The changing paradigms have a different impact on academic and on national map libraries.

8.2.1 Academic map geolibraries
Academic map libraries increasingly collaborate with experts in the geosciences, GIS and digital cartography, who have introduced their policies and activities into the traditional frames of librarianship. In supporting students, scholars and researchers, academic map libraries have to recognize the changing nature of the library field as well as the information formats that they support (Goodchild 2000). In addition to scanned maps, which are part of their collections (so-called digital map libraries), they also provide various GIS and geolocation facilities.

The usage of GIS in map libraries began in 1992 with the American Research Libraries (ARL) GIS Literacy Project. The Project sought to provide a forum for libraries to experiment and engage in GIS activities by introducing, educating, and equipping librarians with the skills needed to provide access to digital spatial data (French 1999). The goals of the Project were to provide the tools and expertise necessary to ensure that digital government information can be used effectively and remain in the public domain. The Project increased knowledge of GIS among librarians, and in the 1999 survey among 121 Project participants, 89% were already offering GIS services (French 1999).

Since the incorporation of GIS services into academic map libraries began in the nineties, changes and improvements have been achieved. Opinions on the novel technology and services have been presented in several articles (Ferguson 2002, Kowal 2002, Martindale 2004, Jablonski 2004, Dixon 2006). They accentuate technology and user access to GIS data and resources, point to the needs of academic users, and strive to meet those needs in a service sense. It has become academic map library policy to collect, manage and disseminate geospatial data.
Fig. 8.1. Four sample images from the video of settlement growth
8.2.2 National map geolibraries

In contrast to academic map libraries, national library map collections collect, preserve and archive the national cartographic heritage. They are deeply involved in questions of legal deposit and its changing forms (from printed to digital maps), the archiving of digital cartography, copyright law, etc. They too recognize the changing nature of librarianship (Bäärnhielm 1998, Campbell 2000, Dupont 1998, Fleet 2000, 2004, Häkli 2002, Kildushevskaya and Kotelnikova 2000, 2002), but owing to their history and rigid hierarchical structures they are slower to implement the needed changes. Their essential interest lies in the presentation of, and easy access to, cultural heritage. This is why scanned historical maps are the main content of national library map collection web pages. In general, these let users easily manipulate the maps, magnify them and zoom in on specific sections. The high number of hits for the National Library of Scotland and the Library of Congress demonstrates a strong user interest in historical maps and cartographic heritage. While offering GIS support and services is not yet standard practice in national map libraries, especially the smaller ones, things are starting to change (Kotelnikova and Kildushevskaya 2004). These changes depend mostly on the structure of the user community. Moreover, GIS applications are usually not included on national map library web pages, with the exception of projects at the National Library of Scotland (Fleet 2004) and the British Library.
8.3 Cartographic Heritage and GIS

Maps, charts and atlases have been part of library collections ever since libraries were established. Nowadays, with the advent of digital cartography, geospatial databases and multimedia, the role and activities of map libraries have changed. In the National Library of Slovenia, maps and other cartographic material belong to a collection that also encompasses postcards, drawings, portrait images, posters and panoramic views. The majority of the pictorial items incorporate spatial locations or geographic footprints, which remain hidden and unexploited in a traditional, non-digital library. New technologies such as the web and GIS enable their use.

Fig. 8.2. Flight along the pair of front lines in the high mountains.

Spatial access to diverse library holdings has not yet been investigated and applied in national library portals. The reasons can be found in GIS characteristics, spatial illiteracy and/or collection policies and tradition. Fortunately, GIS is slowly spreading from highly exclusive professional usage (in geography and geodesy) towards general public use. At the moment there are no defined GIS standards for the public domain: on the one hand there are specialized GIS applications, and on the other functionally impoverished browsers for cartographic (not GIS) contents. In addition, GIS is not user friendly for those who are not familiar with the geosciences and for all who do not "think graphically" (Fleet 2006).

Fig. 8.3. The topographic source for the details of the battlefield from the war archives.
8.4 Geospatial Applications with Cartographic Heritage

Since 2004, the Map Collection of the Slovenian National Library, in cooperation with the Geodetic Institute of Slovenia, has tried to explore and use the advantages of GIS in ways not previously seen in national digital library presentations. We have tried to change our role from a passive geospatial data access point to the active creator of a map-based access portal, enabling users to access materials by spatial location. The following subsections present four applications which enable spatial browsing, learning and virtual travel through geolocated library holdings:
• The dynamic presentation of settlement growth, produced as a time series of interchanging 3D historical maps populated with 3D buildings of the respective era.
• The 3D presentation of the WWI front line on the river Soča, where the biggest mountain battle in the history of mankind occurred, using contemporary and old data on the Google Earth platform.
• The database of all Slovenian place names, automatically replicated in all six grammatical cases of the Slovenian language so that each name can be geoparsed, e.g. within a digital book, a newspaper or a gazetteer.
• The pilot web-based application, which represents a virtual collection of diverse geocoded library holdings. The web portal enables users to access materials by spatial location: they can simultaneously overlay contemporary and historical maps of the same geographic area, coupled with hyperlinks to geolocated postcards, portraits, ancient panoramic views and audio clips, all within the same coordinate system.
The applications were developed with the following objectives (Šolar, Radovan 2005, Šolar et al. 2007):
• To integrate and analyze historical maps with modern geographical data in digital form, in terms of accuracy, cartographic projection,
cartographic presentation techniques, development of settlements and changes of toponymy.
• To explore the possibilities of GIS as a tool for creating a virtual collection compounded of diverse materials on a map-based access model, enabling users to access materials by spatial location.
• To provide users with an interactive, dynamic environment for exploring, manipulating and transforming collection holdings, which was not possible with traditional print media.
• To bring together the conservation and promotion of selected materials for education and research purposes by web users.

8.4.1 Dynamic visualization of settlement growth
For the presentation of a history book about the settlement of Medno (400 inhabitants) near Ljubljana, the capital of Slovenia, a video clip was produced. It contains a dynamically rotating 3D model of the landscape showing the settlement growth as a time series of interchanging 3D historical maps populated with 3D buildings of the respective era (Figure 1). Eight different geolocated documents were used, ranging from 18th-century topographic and cadastral maps to a contemporary digital orthophoto. All scanned maps and orthophotos were transformed approximately into the national coordinate system and draped onto the digital elevation model. Selected buildings of the respective era were taken from the contemporary 3D cadastre of buildings owned by the Surveying and Mapping Authority. As the camera flies circularly around the settlement, the underlying maps are interchanged and the number of buildings on top of the digital elevation model grows. Since the scale of the video clip did not allow the representation of details, the buildings are shown as simple blocks with generalized roofs. Likewise, no research has been done at this stage on the actual historical form of the buildings. In the same way, temporal changes of topography and vegetation could be visualized and studied, or old and new toponyms compared. From the cartographic point of view, the development of cartographic techniques could be presented to students or to ordinary library users interested in maps.
Fig. 8.4. Entrance to the web-based application showing preview of old and new maps.
8.4.2 Visualization of the WWI front line based on the Google Earth platform

The valley of the river Soča in the west of Slovenia, near the border with Italy, was the scene of the biggest mountain battle in the history of mankind, known internationally as the Battle of the Isonzo (the Italian name of the Soča river). Between 1915 and 1917 there were wrathful clashes of arms between two major alliances, the Entente and the Central Powers. The nearly 90 km long front, spanning from the centre of the Julian Alps to the Adriatic coast, was in fact the scene of a series of twelve battles.

A 3D visualization of the part of the Julian Alps centred on the Soča valley and the high mountains above it was produced (Figure 2). The 3D modelling was partly done with web GIS applications using the XML-based KML language, which is the standard for managing the display of 3D geospatial data in Google Maps and Google Earth. The 3D visualization was finally rendered as video files presenting a virtual flight over the terrain covered with different thematic data, contemporary and old (Figure 3). The research project was financed by the Slovenian Ministry of Defence under the name "For the freedom of homeland: The war in the Julian Alps 1915-1917" (Klanjšček, Radovan, Petrovič 2008).

Fig. 8.5. Information about the panoramic view, marked on the old plan of Ljubljana.

The first film presents a virtual flight over the entire area, where a high-resolution digital elevation model (DEM, 5 m grid) was used as the base for draping the satellite and aerial imagery over the model. The national topographic map at 1:50.000 was also used as a reference (Figure 2). The front lines were taken from historical maps and are shown as vectorized polygons in the national coordinate system. Since the two lines were very close, the details of the front course were carefully harmonized with the ridges and valleys of the digital terrain model.

Three additional films represent detailed views of three important parts of the front: the mountains of Krn and Batognica (both over 2000 m high, 20 sqkm), the mountain of Mrzli vrh (12 sqkm) and the ridge of Kolovrat (40 sqkm). All three videos show the historical positions of the trenches of both fighting sides. The data were gathered from different sketches and old maps from the archives. The national topographic map at 1:50.000 with hill shading draped on the DEM was used for the detailed views. Toponyms of nearby objects appear when flying over them. Textual descriptions and music accompany the videos. The video material is planned to be published on a DVD as an addition to a traditional history book. It will also become part of a permanent exhibition in the army history museum.

Fig. 8.6. Scanned and geolocated panoramic view.

8.4.3 Toponymic gazetteer for geoparsing
The Slovenian Surveying and Mapping Authority is the owner of the Register of geographical names. It contains geolocated geographical names of the entire territory of Slovenia (about 200.000 names on 20.000 sqkm). They are distinguished by the object type, thus the names pertain to the classes of toponyms, hydronyms, oronyms, horonyms, etc. The names
which appear on the official geographic maps of Slovenia at the scales of 1:250.000 and 1:1.000.000 are nationally standardized.

From the register, all Slovenian place names (i.e. toponyms) were extracted by the Geodetic Institute of Slovenia. Following the grammatical rules, all toponyms were automatically replicated in all six grammatical cases of the Slovenian language; a minor part of the names was corrected manually for exceptions. The resulting toponymic gazetteer is accompanied by geolocations (i.e. coordinates) in the national coordinate system, taken from the official register. It enables geoparsing of any toponym in any grammatical case, e.g. within a digital book, a digital newspaper or any other digital document. Searching for a specific name across various textual documents can then tell us:
1. which documents contain the name,
2. where in a document the name occurs,
3. how frequently the name occurs,
4. the coordinates of the place the document is talking about,
5. where this place is on the map,
6. which places are encountered in a document,
7. and many other interesting facts related to the geolocations described in the document.
A pilot application has been developed for geoparsing in digital books. Several problems had to be addressed when searching for toponyms in digital text, for example:
• some generic terms in the text can be identical to toponyms,
• different places can share the same toponym,
• a specific place can have more than one toponym,
• some places can have bilingual or archaic toponyms,
• books can contain toponyms with grammatical errors,
• longer toponyms can appear shortened in the digital text,
• some toponyms found in a book do not exist in the database.
Obviously, various extensions of the geoparsing application have broad potential in digital media, archives, geography and linguistic research.
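The core mechanism of such geoparsing can be sketched as an inverted gazetteer: every inflected form of a toponym maps back to the base name and its coordinates, and the text is scanned word by word against that lookup. The case forms and coordinates below are invented for illustration and do not come from the official register.

```python
# Minimal geoparsing sketch with an inflected gazetteer. Every grammatical
# case of a toponym maps back to the base name and its coordinates.
# Case forms and coordinates below are illustrative, not register values.
import re

# base name -> (inflected forms, (x, y) in a national coordinate system)
GAZETTEER = {
    "Ljubljana": (["Ljubljana", "Ljubljane", "Ljubljani", "Ljubljano"],
                  (462000, 101000)),
    "Bled": (["Bled", "Bleda", "Bledu", "Bledom"], (431000, 136000)),
}

# Invert the gazetteer: inflected form -> (base name, coordinates)
LOOKUP = {form: (base, xy)
          for base, (forms, xy) in GAZETTEER.items()
          for form in forms}

def geoparse(text):
    """Return (base toponym, coordinates, character position) per match."""
    hits = []
    for match in re.finditer(r"\w+", text, re.UNICODE):
        entry = LOOKUP.get(match.group())
        if entry:
            hits.append((entry[0], entry[1], match.start()))
    return hits

# 'The road from Ljubljana towards Bled is beautiful.'
hits = geoparse("Pot iz Ljubljane proti Bledu je lepa.")
print(hits)
```

Even this toy version exhibits the problems listed above: a word that happens to coincide with a toponym is reported as a place, and a name missing from the gazetteer is silently skipped, so a production system needs disambiguation and error handling on top of the bare lookup.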
8.4.4 The cartographic heritage portal using GIS and multimedia

The prototype of the web-based portal for the Map and Pictorial Collection of the Slovenian national library was developed in two parts, in 2004 and 2007. In the first phase, diverse materials from the library collection holdings, such as maps, portraits, views, audio clips and manuscripts, were selected from the same historical period (the 19th century) for the portal prototype. Contemporary items, such as a section of the modern digital topographic map of Slovenia at 1:100.000 and a digital city map of Ljubljana at 1:20.000, were also included to allow comparison with the 19th-century items. Coordinate system conversion was required to overlay old and modern maps; the rectification was performed on several ground control points. For non-spatial items, i.e. panoramic views and portraits, coordinates were added to the bibliographic metadata, and the items were marked as hot spots on interactive maps. Additional hypertext describing the view, the portrait and the author was added to the picture images for informative and educational purposes. A prototype gazetteer for selected places (settlements) was also georeferenced, enabling users to find and compare current toponyms with historic Slovenian and German names.

In the second phase, the application was amended and added to the European Library portal. The European Library (TEL) project, initiated with European funding, is an online service that went live in March 2005. Its website allows searching through the resources of 47 national libraries in 20 languages. New initiatives are currently supported with European co-funding, and they continue to build upon TEL and Europeana. By integrating new services, TEL is seeking ways to make its contents more usable, to enhance search results and to explore novel ways of presenting them, mainly to make the user experience more engaging and interesting.
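The rectification on ground control points mentioned above can be sketched as a least-squares affine fit from scan pixel coordinates to national grid coordinates. The GCP pixel and grid values below are invented for illustration; a real rectification would also report residuals and often use higher-order or rubber-sheeting transforms for old maps.

```python
# Sketch of rectifying a scanned map onto ground control points (GCPs):
# a least-squares affine transform from scan pixels to grid coordinates.
# All GCP values are invented for illustration.
import numpy as np

# (pixel_x, pixel_y) on the scan  ->  (E, N) in the national system
gcps = [
    ((100, 100), (460000, 102000)),
    ((900, 120), (468000, 101800)),
    ((880, 950), (467800, 93500)),
    ((120, 930), (460200, 93700)),
]

# Solve [px, py, 1] @ A = [E, N] for the 3x2 affine matrix A.
src = np.array([[px, py, 1.0] for (px, py), _ in gcps])
dst = np.array([en for _, en in gcps], dtype=float)
A, residuals, _, _ = np.linalg.lstsq(src, dst, rcond=None)

def to_national(px, py):
    """Transform a scan pixel into national grid coordinates."""
    e, n = np.array([px, py, 1.0]) @ A
    return float(e), float(n)

print(to_national(500, 500))  # roughly the centre of the map sheet
```

With more than three GCPs the system is overdetermined, so the least-squares residuals give a direct measure of how well the old map fits the modern coordinate system.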
The cooperation of librarians with geoinformation scientists has led to a web-based portal prototype serving as an external service to which the European Library was linked (Angelaki et al. 2007). The web-based portal content can be explored from the collection of geolocated postcards, which serve as a search start point for the user. By clicking on the collection icon, the user enters the application and is then invited to
interact with the other elements of the application, thereby exploring Slovenian history (Figure 4). The user can explore the map by clicking directly on the map signs, or can use the interface. By clicking on a map sign the user gets information about the depicted view, postcard, portrait or place name (Figure 5). The user interface also allows searching by various criteria: spatial coverage, title, author, date and publisher. The spatial location of the building, square or street, together with the angle from which the object was photographed, is displayed on the map. Search results (pictorial items) are displayed in a separate window simultaneously with their specific locations on the map; metadata and locations are used for this purpose. For further details about the application see also Šolar, Radovan (2005) and Šolar et al. (2007).

In 2008, over 150 additional old panoramic views of settlements were added to the collection of geolocated imagery. Each was produced in three image resolutions: icon, preview and facsimile-quality scan. Some items were too large for the biggest available scanner format, so they were scanned piecewise; colour tints and shades were equalized between tiles, which were then digitally glued into a complete image. The panoramic views were geolocated with geographical coordinates at the ancient central position of the settlement, which in most cases still represents the present-day centre (usually a church or a castle) (Figure 6). The expanding digital collection will have significant educational potential within the web-based application when collocated views from different eras show a timeline of settlement development.
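The map-based access model described above boils down to a spatial query: pictorial items carry coordinates in their metadata, so a clicked map window becomes a bounding-box search, optionally filtered by item type. The items, names and coordinates in this sketch are illustrative, not actual library records.

```python
# Sketch of the map-based access model: geolocated pictorial items are
# retrieved by a bounding-box query over their metadata coordinates.
# All items and coordinates are illustrative, not actual library records.
from dataclasses import dataclass

@dataclass
class PictorialItem:
    title: str
    kind: str    # 'postcard', 'portrait', 'panoramic view', ...
    x: float     # location in the national coordinate system
    y: float

ITEMS = [
    PictorialItem("Ljubljana castle, 1890", "postcard", 462300, 100900),
    PictorialItem("View of Bled", "panoramic view", 431100, 136200),
    PictorialItem("Town square, Ljubljana", "postcard", 462100, 101100),
]

def search_bbox(items, xmin, ymin, xmax, ymax, kind=None):
    """Items whose location falls inside the map window, optionally by type."""
    return [it for it in items
            if xmin <= it.x <= xmax and ymin <= it.y <= ymax
            and (kind is None or it.kind == kind)]

# A map window around Ljubljana, restricted to postcards:
found = search_bbox(ITEMS, 461000, 100000, 463000, 102000, kind="postcard")
print([it.title for it in found])
```

The same query shape supports the other search criteria (title, author, date, publisher) simply as additional filters alongside the spatial one.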
8.5 Social Implications of Digital Cartographic Heritage
Cartographic heritage figures in historical, social, educational, linguistic and other research, and can be coupled with various library contents. Cartographic graphics can be mixed with photos, text and pictures, and cartographic heritage can be combined with contemporary contents and with future predictions or simulations. Currently, the use of GIS in libraries is rare and limited mostly to digitizing, data management and visualization, although spatial browsing and analysis are of keen interest to researchers. The applications described above have demonstrated that map and geo-related library collections can be presented to multiple users on the Internet, or even to mobile phone users.
Geospatialization and Socialization of Cartographic Heritage 175
Items that in the past were accessed only by individual professionals or enthusiasts are now open to different social groups more easily than ever before. Geospatial information can be shared and discussed in near real time. Most user interactions with the digital geolibrary do not require extensive knowledge of GIS or programming, nor any specialized training. The change of library paradigms has brought the library closer to social media. A national library can now be accessed by users worldwide without physical or explicit mediation by the institution itself, and access is cheap or even free of charge. In the near future we can expect users to generate and share their own content on collaborative web sites (i.e. wikis) containing maps, sketches, postcards or photos related to locations. User-generated content could add value to physical libraries. Sharing cartographic and geolocated content combined with other archive documents can raise interest in social data mining such as genealogy, the study of various lineages, property tracing, and toponymic and linguistic studies. Social networking will result from communication between individuals sharing the same interest in the library (and archive) contents. Many digital library contents will become available as hardcopy on demand. Nowadays, informal references (toponyms) prevail in digital library contents. Geoparsing, optical character recognition from books and the conversion of cartographic heritage to vectors could provide massive "smart" access to geolocated heritage. The increased use of formal references to library contents (coordinates) will enable linking to measured GPS (satellite-based) positions in real time. The Galileo Green Paper on Satellite Navigation Applications predicts that by the year 2020, three billion satellite navigation receivers will exist all over the world. Smart phones, computers, vehicles, machines and other objects and living beings will be positioned.
These positions could be shared and exchanged in real time between individuals, so positions are becoming part of social networks, revealing travel and navigation patterns, shopping habits, points of interest, etc. In this context, cartographic heritage has the potential to be introduced, e.g., into augmented reality systems, where past scenes, locations and imagery could be overlaid on a real-time video feed of the surrounding environment on a digital camera screen. Cartographic heritage will be enriched with contemporary and historic points of interest (POIs).
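Geoparsing informal toponyms against a gazetteer, as mentioned above, is a first step toward "smart" geolocated access. The sketch below is a deliberately minimal assumption-laden illustration: the gazetteer entries and the substring matching strategy are illustrative only, not the method of any system described here.

```python
# Minimal gazetteer-based geoparser: informal place names found in a text
# are resolved to formal references (coordinates).
# Gazetteer entries below are illustrative examples.
GAZETTEER = {
    "ljubljana": (46.0569, 14.5058),
    "maribor":   (46.5547, 15.6459),
}

def geoparse(text):
    """Return (toponym, (lat, lon)) pairs for gazetteer names in the text."""
    found = []
    lowered = text.lower()
    for name, coords in GAZETTEER.items():
        if name in lowered:
            found.append((name, coords))
    return found

matches = geoparse("A postcard sent from Ljubljana in 1912.")
```

Real geoparsing must additionally disambiguate homonymous place names and handle historical spellings, which is precisely where OCR'd heritage texts are hardest.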
8.6 Conclusions and Open Questions
The applications described represent an interdisciplinary presentation of the national library's treasures together with geodetic data. GIS creates new opportunities to display and link traditionally static prints. One of the open questions is why GIS is not broadly used in national map library portals. Georeferencing and the use of GIS require a basic knowledge of programming, of databases and of GIS technology. Thus, libraries will have to either provide adequate training in GIS for their staff or hire GIS specialists, a solution that has become widely used in map collections in the USA. Map librarians have taken advantage of cross-institutional cooperation with various spatial research institutions. Dragland (2005) wrote about the future of georeferenced digital libraries: "There will be a golden age for georeferenced digital libraries if people learn to solve problem in the spatial paradigm, and if they are aided by georeferenced digital libraries and georeferenced collections that are publicly available and easy to comprehend and use." Clearly, cartographic heritage librarians, supported by geoscientists, have to make their contributions. However, public availability does not only mean free disposal of the library materials to the users, but also communication and active classification of contents among the users. Sharing information and social tagging could thus provide new ways to find and retrieve information from media, archives, libraries and geo-oriented wikis (collaborative geo-content). The barriers between cartographic heritage and other information sources will become blurred. Many social, legal and contextual dilemmas still hinder the widespread use of geolibrary contents. One of the conceptual questions is what a library item, or a cartographic heritage item, actually is in the digital geo-age. Are, for example, aerial photos, e-mails with maps, 3D visualizations and old updates of topographic databases cartographic heritage? Are old periodic updates of geodetic data a future part of libraries? Will today's GIS applications be stored as virtual archive documents within a geolibrary? Will hardcopy libraries still exist in their present form, or will they coexist in symbiosis with softcopy and virtual libraries? Maybe the answer lies in the way information sharing will evolve through the social interactions that individuals create.
8.7 References
Alexandria Digital Library. In digital form, http://www.alexandria.ucsb.edu/
Angelaki, G., Šolar, R., Janssen, O., Verleyen, J. (2007). Old postcards of Ljubljana: a small feasibility study to examine a possible application and availability of Geographic Information Systems with a view to integrating this approach into The European Library: Project Report. In digital form, http://www.edlproject.eu/membersonly/wp1.php
Bäärnhielm, G. (1998). Digital cartography in the Royal Library - National Library of Sweden. In digital form, http://liber-maps.kb.nl/articles/baarn11.htm
Campbell, T. (2000). Where are map libraries heading? Some route maps for the digital future. In digital form, http://liber-maps.kb.nl/articles/12campbell.html
Dixon, J.B. (2006). Essential collaboration: GIS and the academic library. Journal of Map & Geography Libraries 2(2): 5-20.
Dragland, K.T. (2005). Adding a local node to a global georeferenced digital library. In digital form, http://www.diva-portal.org/diva/getDocument?urn_nbn_no_ntnu_diva-645-1__fulltext.pdf
Dupont, H. (1998). Legal Deposit in Denmark - the new law and electronic products. In digital form, http://liber-maps.kb.nl/articles/dupont11.htm
EDL project. In digital form, http://www.edlproject.eu/about.php
Europeana project. In digital form, http://www.europeana.eu/portal/
The European Library. In digital form, http://www.theeuropeanlibrary.org/portal/index.html
Ferguson, A.W. (2002). Back talk - GIS induced guilt. Against the Grain 14(5): 94.
Fleet, C. (1998). Ordnance Survey digital data in UK legal deposit libraries. In digital form, http://liber-maps.kb.nl/articles/fleet11.htm
Fleet, C. (2002). The legal deposit of digital spatial data in the United Kingdom. In digital form, http://liber-maps.kb.nl/articles/13fleet.html
Fleet, C. (2004). Web-mapping applications for accessing library collections: case studies using ESRI's ArcIMS at the National Library of Scotland. In digital form, http://liber-maps.kb.nl/articles/14fleet.html
Fleet, C. (2006). 'Locating trees in the Caledonian forest': a critical assessment of methods for presenting series mapping over the web. e-Perimetron 1(2): 99-112. In digital form, http://www.e-perimetron.org/Vol_1_2/Vol1_2.htm
French, M. (1999). The ARL Literacy Library GIS Project: Support for Government Data Services in the Digital Library. In digital form, http://iassistdata.org/publications/iq/iq24/iqvol241french.pdf
Galileo Green Paper on Satellite Navigation Applications. In digital form, http://ec.europa.eu/dgs/energy_transport/galileo/green-paper/index_en.htm
GIS prototype. In digital form, http://www.theeuropeanlibrary.org/portal/?coll=collections:a0246&q=postcards
Goodchild, M.F. (1998). The Geolibrary. Innovations in GIS 5: 59-68.
Goodchild, M.F. (2000). Cartographic perspectives on a digital future. Cartographic Perspectives 36: 1-19.
Goodchild, M.F. (2004). The Alexandria Digital Library Project. In digital form, http://www.dlib.org/dlib/may04/goodchild/05goodchild.html
Häkli, E. (2002). Map collections as national treasures. In digital form, http://liber-maps.kb.nl/articles/13hakli.html
Jablonski, J. (2004). Information literacy for GIS curricula: an instructional model for faculty. Journal of Map & Geography Libraries 1(1): 41-58.
Kildushevskaya, L., Kotelnikova, N. (2002). Problems of preservation and accessibility of cartographic publications in the National Libraries of Russia. In digital form, http://liber-maps.kb.nl/articles/13rus.html
Klanjšček, M., Radovan, D., Petrovič, D. (2008). 3D-visualization of the mountain battlefield on the Soča front line. In: Perko, D. et al. (eds.), Geographical information systems in Slovenia 2007-2008, Geographical Institute Anton Melik ZRC SAZU, Ljubljana, pp. 331-339.
Kotelnikova, N., Kildushevskaya, L. (2000). Electronic maps and atlases in the Russian State Library and the Russian National Library. In digital form, http://liber-maps.kb.nl/articles/12kotelnikova.html
Kotelnikova, N., Kildushevskaya, L. (2004). Development of geographic information systems and their use in national libraries of Russia. In digital form, http://liber-maps.kb.nl/articles/14kotelnikova.html
Kowal, K.C. (2002). Tapping the Web for GIS and mapping technologies: for all levels of libraries and users. Information Technology and Libraries 21(3): 109-114.
Martindale, J. (2004). Geographic Information Systems librarianship: suggestions for entry-level academic professionals. The Journal of Academic Librarianship 30(1): 67-72.
Šolar, R., Radovan, D. (2005). Use of GIS for presentation of the Map and Pictorial Collection of the National and University Library of Slovenia. Information Technology and Libraries 24(4): 196-200.
Šolar, R., Janežič, M., Mahnič, G., Radovan, D. (2007). Spatial querying of geocoded library resources on the Internet. In: XXIII International Cartographic Conference, 4-10 August, Moscow.
Šolar, R., Radovan, D. (2008). The change of paradigms in digital map libraries. e-Perimetron 3(2): 53-62. In digital form, http://www.e-perimetron.org
Tedd, L.A., Large, A. (2005). Digital Libraries: principles and practice in a global environment. München: K.G. Saur.
Section III Keep It Online and Accessible
9 More than the Usual Searches: a GIS Based Digital Library of the Spanish Ancient Cartography .............................. 181
Pilar Chías, Tomás Abad
10 Map Forum Saxony. An Innovative Access to Digitized Historical Maps .............................. 207
Manfred F. Buchroithner, Georg Zimmermann, Wolf Günther Koch, Jens Bove
11 A WYSIWYG Interface for User-Friendly Access to Geospatial Data Collections .............................. 221
Helen Jenny, Andreas Neumann, Bernhard Jenny, Lorenz Hurni
9 More than the Usual Searches: a GIS Based Digital Library of the Spanish Ancient Cartography
Pilar Chías¹, Tomás Abad²
¹ Professor, Technical School of Architecture and Geodesy, University of Alcalá, Spain, [email protected]
² Researcher, Technical School of Architecture and Geodesy, University of Alcalá, Spain, [email protected]
Abstract Within the framework established by the Council of the European Union for Digital Libraries, we are creating a complete, multilingual database of Spanish ancient cartography. It gathers documents held in different archives, which even today remain dispersed and in part unknown. We are also implementing an open Geographic Information System that goes beyond the usual capabilities of traditional multiformat databases. By allowing more complex relationships to be established among all this information, the GIS broadens the range of queries and searches and delivers more personalised information online. This methodology has been created with the aim of being implemented throughout the European Union. Based on the study of old cartographic documents, the system will allow searches and analyses of the historical evolution of territories and landscapes.
9.1 Introduction
According to the strategies of the Council of the European Union, the European Digital Libraries (Commission 2005) have been conceived as a common multilingual access point to Europe's digital cultural heritage.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_9, © Springer-Verlag Berlin Heidelberg 2011
182 P. Chías, T. Abad
Assuming that ancient maps and plans are important cultural materials, several cartographic databases have been created in the last decade that allow efficient online access. These libraries are defined as organised collections of digital contents made available to the public. They are composed of analogue materials that have been digitised, as well as of born-digital materials. They must also follow several main strands:
• Online consultation, stressing the importance of exchanging information and publishing the results, in order to maximise the benefits that users can draw from the information (Council Conclusions on the Digitisation and Online Accessibility of Cultural Material and Digital Preservation, 2006/C 297/01).
• The preservation and storage of these digital collections, to ensure that future generations can access the digital material and to prevent loss of contents.
Maps represent an important part of the richness of Europe's history and of its cultural and linguistic diversity. They can increasingly be accessed through local library websites based on open-access models, founded on the principles of free, worldwide access to information and voluntary sharing. The online presence of this cartographic material will make it easier for citizens to appreciate their own cultural heritage, as well as the heritage of other European countries, and to use it for study, work or leisure. It will be a rich source of raw materials to be re-used in different sectors, for different purposes and for technological developments.
9.2 Digital Approaches to the Cartographic Heritage
The First International Workshop on Digital Approaches to Cartographic Heritage, held in Thessaloniki on 18-19 May 2006 (Livieratos 2006), was the inaugural meeting of the near-eponymous ICA Working Group. Founded in La Coruña (Spain) in July 2005, the Working Group's main target is to bring together scientific and technical approaches to the history of cartography with humanistic and historical methodologies. It also aims to gather and synthesize disparate fields through a clearly articulated focus on digital technology, in the context of Cultural Heritage.
More than the Usual Searches: a GIS Based Digital Library 183
This Working Group complements the Working Group on the History of Colonial Cartography, as well as the five existing ICA Commissions:
• Maps and the Internet.
• Theoretical Cartography.
• Education and Training.
• Map Projections.
• Visualisation and Virtual Environments.
In Spain, the interdisciplinary Working Group on Cartographic Heritage (GTI PC-IDE) within the SDI (Spatial Data Infrastructures), inside the Comisión Especializada en Infraestructuras de Datos Espaciales (CE IDE) of the Consejo Superior Geográfico, aims to promote the publication of historical geographic documents. It is composed of several members of the IBERCARTO Working Group of Spanish and Portuguese Map Libraries, whose first meeting was recently held at the Institut Cartogràfic de Catalunya in Barcelona (Spain) on 25 June 2008. Among its main targets are:
Fig. 9.1. Antonio Plo, Plano de la villa de Talavera, sus campos, bosques y valdíos segun la situación de sus principales partes y pueblos vecinos, en que se manifiestan los regadios que se pueden hacer, tomando las aguas de los rios Tajo y Alverche, para fertlizar sus tierras. Madrid, 15 de noviembre de 1767. Madrid, Biblioteca Nacional de España
184 P. Chías, T. Abad
• The publication of geographical and historical data, as well as documents, metadata and registers, on the Internet through the SDI strategy, seeking interoperability between the MARC format and the ISO 19115 metadata format.
• The optimisation of the Web services for localisation, visualisation, downloading, transformation and geoprocessing.
• The generation of data and documents (with the associated metadata).
• The client applications that allow the use of these services.
• The provision of a general framework.
Among its activities are:
• To detect and support lines of research that are of interest to the Group and can be developed inside it.
• To lead activities of diffusion and divulgation.
• To send recommendations and proposals to the Consejo Superior Geográfico through the IDEE Working Group.
Also in Spain, the AVANZA plan for the development of the Information Society has started a first digitisation and online publishing programme for the 2006-2010 period, according to the national standards, in an OAI-PMH-compatible system. The main Spanish cultural offices are making a strong effort to digitise the public collections of historical documents, which at the beginning of 2006 included 109 collections of Spanish libraries. But the particular case of ancient maps raises problems such as those posed by the documents' different locations, techniques, sizes and preservation conditions, as well as the high costs that are delaying their diffusion. Another problem, associated with the difficulty of finding those maps, is that they are frequently included in other documents or inside bundles of old papers, and thus remain undiscovered. Among the important initiatives we will emphasize the digital libraries created by the Institut Cartogràfic de Catalunya, the Instituto Geográfico Nacional, and those of the Portal de Archivos Españoles (PARES).
They are participating in the MICHAEL project, and not only display free low-resolution images of each map, but also provide an accurate description of the document and of its conditions of use.
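The interchange between MARC and ISO 19115 sought by the Working Group amounts in practice to a field-by-field crosswalk. The sketch below maps a handful of MARC 21 cartographic fields (245 $a title; 034 $d-$g bounding coordinates) onto ISO 19115-style elements; the field selection, the flat-dictionary input and the nested-dictionary output are simplifying assumptions for illustration, not a complete mapping.

```python
def marc_to_iso19115(record):
    """Map a few MARC 21 cartographic fields onto ISO 19115-style
    metadata elements. Simplified sketch: a real crosswalk covers
    many more fields, repeatable subfields and coordinate formats."""
    return {
        "MD_Metadata": {
            "identificationInfo": {
                "citation": {"title": record.get("245a")},
                "extent": {
                    "EX_GeographicBoundingBox": {
                        # MARC 034: $d west, $e east, $f north, $g south
                        "westBoundLongitude": record.get("034d"),
                        "eastBoundLongitude": record.get("034e"),
                        "northBoundLatitude": record.get("034f"),
                        "southBoundLatitude": record.get("034g"),
                    }
                },
            }
        }
    }

# Illustrative record with decimal coordinate strings (values assumed)
record = {
    "245a": "Plano de la villa de Talavera",
    "034d": "-5.17", "034e": "-4.50",
    "034f": "40.00", "034g": "39.80",
}
iso = marc_to_iso19115(record)
```

In a real exchange the ISO 19115 side would be serialised as XML against the standard's schema, but the structural correspondence is the core of the mapping.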
9.3 The Cartographic Databases
9.3.1 Main targets
To define the contents of our cartographic database we have decided to apply the ICA Working Group's broad definition of cartographic heritage as "anything of cultural value inherited from maps and accessible to a broad public community", as well as the wide-sense concept of a cartographic document of Harvey (1980, p. 7) and Harley and Woodward (1987, p. xvi), which includes all kinds of maps, plans and charts at different scales (architectural, urban and territorial), as well as pictures and bird's-eye views (Kagan 1986, p. 18; De Seta 1996), with no restrictions as to technique, function or origin. Ancient cartography, like old pictures, drawings and photographs, has not traditionally been used as a reliable source of information about the history and evolution of the landscape and the townscape. These graphic materials have usually been considered 'second order' documents, mainly because of the difficulties their interpretation can sometimes involve (Harley 1968), owing to the different conventions applied in each case by the cartographer. But there are other problems, related to the difficulty of locating and visualising the original maps, that have to be considered in order to understand why cartographic materials are so seldom used in historical research. It is not easy to access an original large-format, small-scale map that is sometimes composed of several printed sheets; and it is also hard to read the symbols and texts employed in the map when researchers must handle a reduced hardcopy or a low-resolution digital image. Although an exhaustive knowledge of the context of each map is not essential to a meaningful interpretation of it (Skelton 1965, p. 28; Andrews 2005), it is necessary to grasp some basic concepts of the theory of cartographic expression and design (map projections, symbols or the representation of relief, for instance).
The lack of such concepts can hinder the correct interpretation of the document and distort the results of the investigation (Vázquez Maure & Martín López 1989, p. 1). Even nowadays, "digital cartography and the history of cartography are not yet comfortable bedfellows" (Fleet 2007, p. 102).
Our project of a digital cartographic database accessed through GIS is related to, but distinct from, the history of cartography. It aims to integrate digital technologies into the cartographic heritage, and to provide new approaches to, and new audiences for, the history of cartography. The methodology we designed applies these digital technologies to the history of cartography and helps to establish new relationships between maps. It also provides easy access to images for analysis and comparison, shows the distribution of maps across the different archives, and finally allows the historical landscapes and the history of the territory to be reconstructed through old maps. This is one of the main targets of our project, together with disseminating the old cartographic treasures that compose a relevant part of the Spanish Cultural Heritage and yet remain unknown to the public, and even to a great number of specialists (Chías & Abad 2006; 2008).
Briefly, our main targets are the following:
• To disseminate and relate the contents of the different Spanish archives, in order to give an overall view of old cartography. This can be directly applied to the study of the historical evolution of the territory and the landscape at many different scales.
• To enlarge the available information about the ancient cartographic documents, not only through the metadata of each image, but also with other contents located in private or non-digitised collections.
• To enlarge the possibilities of traditional database searches through standard GIS queries, including such interesting issues as metric and geometric accuracy.
• To use the new technologies to study and disseminate the cartographic heritage through the Internet.
9.3.2 The use of the new information technologies
Digital technologies were formerly used by other disciplines (archaeology, historical geography) and had practical non-academic applications (town planning, librarianship). Building on the possibilities of digital images, the new computerized methods and digital technologies have brought an explosion in the scope and potential of digital cartography. They allow and encourage interaction with early maps, with the aim of furthering our understanding of their various contents. But they also introduce new ways to connect early maps with other kinds of information, inviting us to use new forms of presentation and facilitating speedier transmission of images worldwide. This is particularly interesting as it is inserted into the European Space of Information.
Fig. 9.2. The 'Cartography' table.
On the other hand, digital cartography is a useful tool in traditional scholarly research (Fleet 2007, p. 100). How digital technologies can be useful to historians of cartography remains a central theme of discussion, touching on subjects such as:
• How digital technologies already provide access to early maps (and related materials) through a range of methods that include improved reproduction, electronic facsimiles, websites, new forms of presentation and integration, and new forms of digital preservation and archiving (for instance, using photogrammetric techniques to seam together images of large maps and create more authentic facsimiles); dynamically integrating maps with other information using the web; applying new ways of visualising and presenting early mapping; and associating new metadata, as structured summary information about a cartographic source, to encode data on and about historical maps.
• How digital technologies provide new ways of understanding the content of early maps. They allow the digital analysis of map geometry and the use of digital transparency techniques that focus on the cartometric analysis of early maps. They also make it possible to check the projections used in 15th- and 16th-century nautical charts against studies of navigational practices, technologies and texts, and to test the precise methods and mathematics behind the transformations of old maps in various georeferencing projects.
• How digital technologies, and particularly Geographic Information Systems (GIS), supply new ways of integrating early maps with other information. They make 3-D visualisation more accessible, realistic and impressive.
They can also combine historical maps with associated textual and numerical information, for instance to obtain a spatial analysis of agricultural productivity related to landownership through the integration of cadastral and statistical information. The Gregoriano Cadastre (Orciani et al. 2007) and the old cadastral maps of Utrecht (Heere 2006) are examples that focus on reconstructing the evolution of land properties.
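The mathematics behind the transformations of old maps in georeferencing projects often begins with fitting an affine transformation to control points identified on both the scanned map and a modern coordinate system. A minimal least-squares sketch (the control-point values are illustrative):

```python
import numpy as np

def fit_affine(src, dst):
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f by least squares
    from paired control points: src on the scanned map (pixels),
    dst in the target coordinate system."""
    src = np.asarray(src, float)
    dst = np.asarray(dst, float)
    A = np.column_stack([src, np.ones(len(src))])      # shape (n, 3)
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)   # shape (3, 2)
    return coeffs.T                                    # rows: (a,b,c), (d,e,f)

# Illustrative control points: pixel positions -> map coordinates.
# Here the points differ by a pure translation of (10, 20).
src = [(0, 0), (100, 0), (0, 100), (100, 100)]
dst = [(10, 20), (110, 20), (10, 120), (110, 120)]
T = fit_affine(src, dst)
```

With more than three control points the residuals of the fit give a first quantitative measure of an old map's planimetric accuracy; cartometric studies usually go further, to polynomial or projective models.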
Fig. 9.3. Contents of the ‘References’ table.
9.3.3 Contents and structure of the databases
As we must restrict the temporal as well as the geographical scope of the cartographic databases, we first decided to include all historic documents drawn before 1900, because during the 20th century cartographic production and techniques changed so much, in so many senses, that their study should be carried out separately. Secondly, at a first stage the spatial restriction of the cartographic database concerns the present Spanish territories (Chías & Abad 2008a). The first stages of our work focus on finding, studying and cataloguing all kinds of cartographic documents preserved in the main Spanish collections, archives and libraries. Among them are the Biblioteca Nacional de España (Madrid), Biblioteca del Palacio Real (Madrid), Biblioteca del Real Monasterio de San Lorenzo de El Escorial (Madrid), Biblioteca de la Universidad de Barcelona, Biblioteca de la Universidad de Salamanca, Real Academia de la Historia (Madrid), Real Academia de Bellas Artes de San Fernando (Madrid), Archivo Histórico Nacional (Madrid), Archivo General de Simancas (Valladolid), Archivo de la Real Chancillería (Valladolid), Archivo del Centro Geográfico del Ejército (Madrid), Instituto de Historia y Cultura Militar (Madrid), Instituto Geográfico Nacional (Madrid), Museo Naval (Madrid) and the Archivo de Viso del Marqués (Ciudad Real). But we also search local and ecclesiastical archives.
Fig. 9.4. Elements of the ‘Libraries, Archives and Map Collections’ table.
The next step constructs the relational databases, using a compatible commercial platform. They are designed to be multilingual (there are already English and Spanish versions) and open. This will allow new records to be included in the future, and even new fields or tables to be added, updating and adjusting the contents to new needs. Moreover, the concept 'relational' implies combining data from different tables and reducing their size, for easier data management and querying. Accordingly, our methodology includes three main tables:
• 'Cartography', which includes the records concerning the cartographic documents, following the ISBD cataloguing norms.
• 'References', which includes the complete bibliographical references cited in the Bibliography field of the 'Cartography' table.
• 'Libraries, Archives and Map Collections', which provides the complete references of the collections that have been visited; these appear just as an acronym in the Collection and Signature fields of the 'Cartography' table.
The three tables have been designed to share at least one field, to allow the data files to be crossed and to economize data length in the databases. The design of the 'Cartography' table joins the descriptive and the technical data of each document, combining the perspectives of the historian and the cartographer. The items included are the following:
• Place or Subject (text field): refers to the geographical place represented in the document and to its province, as defined since the Spanish administrative reform of 1831. To define the territorial limits clearly, the old councils or boundaries are also included. And to determine the original uses of the map, it is also specified whether it is a general, thematic
Fig. 9.5. A detail of the cartographic base of the GIS, showing the different areas covered by some ancient maps. In red: 17th-century maps; in blue: 18th-century maps. The background reproduces a composite map of the same areas dating from the second half of the 19th century, taken from the First Series Mapa Topográfico Nacional 1:50.000, from the sets in the Instituto Geográfico Nacional, Madrid. Reference grid: 1° latitude and 1° longitude.
(geological, military, statistic, cadastral, etc.) or a topographical map or plan, or a chart.
• Date (numerical field): as precise as the document can be dated. Estimated dates are written between square brackets.
• Kind of Cartographic Document (text field): defines whether it is a map or plan, a chart, a portolan, a view, or even a terrestrial globe.
• Size (text): width and height of the image in mm; the total size of the sheet(s) (or other supporting materials) and the number of pieces or sheets that compose the document are also included.
• Collection and Signature (text): the collection that preserves the document and its signature; the first is quoted through an acronym and the second is abbreviated according to the norms (its full form can be consulted in the 'Libraries, Archives and Map Collections' table); when possible, links to other e-libraries or references are included.
• Original Title (text): quoted between inverted commas if literal, as written in the document; otherwise it is described between square brackets through its main features.
• Author(s) (text): names of the author(s) if the map is signed; in case of attribution, the name(s) appear between square brackets; the author can also be 'unknown'.
• Scale (text): defined graphically or as a fraction, detailing the different units employed; when no scale is defined, 'without scale' appears.
• Projection (text): details the projection employed with its different elements: grid, references, orientation. The use of additional projections is also referenced, for instance an added profile or section, or axonometric or perspective views, with their own distinctive elements, and even the case of large-scale plans.
• Technique (text): distinguishes between manuscript and printed maps, as well as the drawing surfaces and techniques, specifying the uses of colour.
• Short History (of the map, text): place of edition and editor, or whether it is part of a larger compilation or atlas. Its provenance, precedent owners and date of purchase are also mentioned.
• References (of the map, text): abbreviated, following the international system for scientific quotations ISO 690-1987.
• Image (object/container field): a low-resolution raster image of the cartographic document in highly compressed JPEG format. By clicking on the image, a high-resolution TIFF can be displayed that allows the details to be seen and the texts to be read. If the map is composed of several sheets, each one can be seen separately (and composed apart) (see below, The GIS implementation).
• Other Remarks (text): in the case of a printed map, includes other collections that hold a copy, or variations of the plate, as well as manuscript notes, etc.
• Date (of the catalogue, aut.).
• Operator (for future updates).
The table ‘References’ fully defines the abbreviations and acronyms used in the other tables. The quotations follow the ISO 690-1987 norm. The fields in this case are:
More than the Usual Searches: a GIS Based Digital Library 193
• Author(s) (text).
• Date (of edition, numerical).
• Title (of the book, text).
• Article’s Title (text).
• Periodical or Book (in case of articles or book chapters, text or link).
• Publisher (text).
• Place (text).
• Volume (numerical).
• Pages (text).
• Quotation (text): as it appears in the other tables.
Finally, the table ‘Libraries, Archives and Map Collections’ makes it possible to identify the acronyms used in the Collection and Signature field. It contains the following fields, which complete the location of the documents:
• Library (Abbreviation) (text).
• Collection (Extended Name) (text).
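The three linked tables described above form a small relational schema. The following is a minimal illustration of how the catalogue, references and libraries tables relate; the SQL and the sample record are ours (SQLite stands in for the project's actual ACCESS©/FileMaker© databases, and the field names are simplified):

```python
import sqlite3

# In-memory stand-in for the project's attribute databases.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE libraries (          -- 'Libraries, Archives and Map Collections'
    acronym TEXT PRIMARY KEY,     -- Library (Abbreviation)
    name    TEXT                  -- Collection (Extended Name)
);
CREATE TABLE catalogue (          -- one record per ancient map
    id         INTEGER PRIMARY KEY,
    title      TEXT,              -- Original Title
    author     TEXT,
    map_date   TEXT,              -- estimated dates kept in square brackets
    collection TEXT REFERENCES libraries(acronym),
    signature  TEXT
);
CREATE TABLE refs (               -- 'References', quoted per ISO 690-1987
    quotation TEXT PRIMARY KEY,   -- as it appears in the other tables
    author    TEXT, year INTEGER, title TEXT
);
""")
con.execute("INSERT INTO libraries VALUES ('BNE', 'Biblioteca Nacional de España')")
con.execute(
    "INSERT INTO catalogue VALUES (1, '[Mapa del Duero]', 'unknown', "
    "'[c. 1750]', 'BNE', 'Mss. 12')")

# Resolve a catalogue record's collection acronym to its full name.
row = con.execute("""
    SELECT c.title, l.name FROM catalogue c JOIN libraries l
    ON c.collection = l.acronym""").fetchone()
print(row)  # ('[Mapa del Duero]', 'Biblioteca Nacional de España')
```

The join mirrors how a filing card resolves the acronym in the Collection and Signature field against the libraries table.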
9.3.4 The GIS implementation
The open GIS is supported by a commercial platform (GeoGraphics©) that includes a complete, easy-to-use computer-aided mapping module in a vector format (MicroStation©) and offers maximal compatibility as a de facto standard, although the connection with the database (ACCESS©, FileMaker©) must be established through an ODBC protocol. We have tried other possible GIS platforms that:
• Integrate both the databases and the computer-aided mapping (ArcInfo©).
• Or import the cartography file by using an exchange and export format such as DXF (MapInfo©).
The first systems were the most complete, but their databases were not easy to manage or to restructure. The second group of GIS platforms allows an easier and more flexible management of the databases, but problems appear with the import of the vector cartographic base. This is due to the exchange formats, which always entail a loss of information and of the integrity of the graphic elements, leading to a laborious process of validation of the digital vector cartography (Chías 2004; 2004a).
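Because both back-ends are reached through ODBC, the platform-specific part of the connection reduces to a driver name plus a data source. A hypothetical helper sketches this; the driver strings vary by installation and are illustrative only, not the project's actual configuration:

```python
def odbc_conn_str(driver: str, **params: str) -> str:
    """Assemble an ODBC connection string from a driver name and
    keyword parameters such as DBQ (file path) or DSN."""
    parts = [f"DRIVER={{{driver}}}"]
    parts += [f"{key.upper()}={value}" for key, value in params.items()]
    return ";".join(parts)

# Illustrative driver names -- the exact strings depend on the
# ODBC drivers installed on the machine.
access = odbc_conn_str("Microsoft Access Driver (*.mdb)",
                       dbq=r"C:\maps\catalogue.mdb")
filemaker = odbc_conn_str("FileMaker ODBC", dsn="AncientMaps")
print(access)    # DRIVER={Microsoft Access Driver (*.mdb)};DBQ=C:\maps\catalogue.mdb
print(filemaker) # DRIVER={FileMaker ODBC};DSN=AncientMaps
```

Such a string would then be handed to an ODBC client (e.g. `pyodbc.connect(access)`); from the GIS side, the ACCESS© and FileMaker© back-ends look identical behind this interface.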
The possibility of drawing our own vector cartography inside the GIS not only avoids the problems derived from importing graphic files through the DXF format, but also provides an easy definition of the vector base maps. It is then necessary to define clearly the graphic features that must be digitised, and the strategies to be followed to compose the base maps. We decided not to use the available digital cartography, because it would have required a hard process of cleaning and verifying the topology. It would have been harder to extract and add the features of interest to our GIS from the available digital cartographic bases than to draw a new vector base specifically designed for our GIS. The new vector base has been structured in several data layers and sheets at a 1:200,000 scale. We have selected the graphic formats, the georeference and the symbology according to the guidelines of the Instituto Geográfico Nacional of Spain, trying to ensure a proper understanding of both topographic and planimetric data, as well as an easy connection to other existing or future GIS. Due to our choice of GIS platform, the compatibility of the cartographic, graphic and attribute data storages is achieved through an ODBC protocol. Both ACCESS© and FileMaker© are easy to handle and do not need particular training. But the first creates ‘heavier’ databases than the second, which is a disadvantage when managing big amounts of data. FileMaker© also brings another important advantage: fields are not limited to 256 characters, allowing more information to be entered if required. Each ancient map is referred to a point, a line or a centroid, depending on the territory it represents. Points are generally used for town plans or views; lines are used fundamentally to identify roads, railways and other linear structures. Centroids are mainly used to identify maps that represent wide territories.
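The point/line/centroid convention just described amounts to a simple dispatch rule. A sketch, with the category names paraphrased from the text rather than taken from the project's actual data model:

```python
def geometry_for(doc_kind: str) -> str:
    """Return the reference geometry used for an ancient map,
    following the convention described in the text."""
    kind = doc_kind.lower()
    if kind in ("town plan", "view"):
        return "point"      # plans and views reference a single point
    if kind in ("road", "railway", "lineal structure"):
        return "line"       # linear infrastructure maps follow a line
    return "centroid"       # maps of wide territories get a centroid

print(geometry_for("Town Plan"))  # point
print(geometry_for("railway"))    # line
print(geometry_for("province"))   # centroid
```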
Obviously, a single centroid can be related to more than one map, and the different possibilities are simultaneously highlighted on a single screen; afterwards the different documents can be examined separately. The contours of the different documents are drawn and referred to their own centroids. It is easy to see how maps and plans overlap, and to perceive rapidly the portions of land that are covered by different documents across the historical periods. As mentioned, we have implemented three main datasets: the cartographic vector base, the tables that include the relevant features to
define each ancient map, and the sets of images, divided into high- and low-resolution images of the maps. The filing cards access the low-resolution image sets directly, to prevent problems in handling too ‘heavy’ registers and tables. Only in exceptional cases, and only when the copyright owner allows it, are the high-resolution images available. The datasets of the ancient maps already include more than 8,000 files, which are continuously updated and distributed across the servers by geographical units. This system and structure is the most useful and fits the strategies that were previously planned in order to obtain the most varied and personalised information from the GIS. Our methodology has increased the possibilities of the usual queries that a GIS offers, by formulating them to the different databases separately (graphic, numerical or textual), or even by crossing them. To ensure the proper display of the cartographic information, we have designed a filing card as one of the main printable outputs. This filing card also includes the adequate links to access other e-libraries or references, as mentioned above. Obviously, traditional outputs such as thematic mapping, statistics, lists or reports are always available (Spence 2007; Tufte 2005). The possibilities that the hypermedia concept brings in obtaining personalised information from the different data sets are an added value to the traditional query system.
9.3.5 Digitisation, preservation and diffusion

9.3.5.1 Digitisation
Digitisation has sometimes been used primarily to preserve existing original analogue materials (which may be degrading). In our project, digitisation, preservation and diffusion are strongly interrelated, and therefore have to be considered together. Especially in the case of rare works (as several maps are), online consultation of the digital copy can replace the physical manipulation of the original, which adds to its longevity. The risk of losing digital material has been taken into account in our digitisation programme, because digitisation without a proper preservation strategy may become a wasted investment.
Some of the technical specifications about digitisation used in the project are the following:
• The name of the image files: FFFXX_NNNNNNNN_T.EXT, where:
• FFF: Archive ID, max. length 5 characters; provided by each archive.
• XX: internal ID of the collection; length 2 characters.
• NNNNNNNN: file ID that includes a geographical reference (province); variable length from 1 to 20 characters, from A to Z or from a to z, without accents; it may include the hyphen (-) but no other special characters, including the underscore (_), which is reserved to separate the different ID groups of the image.
• T: image format on screen: V = illustration, T = 1/3 screen, P = full screen, 2 = high definition (2,000 x 3,000 pixels), 4 = high definition (4,000 x 6,000 pixels), H = high definition (more than 4,000 x 6,000 pixels).
• EXT: extension of the format: JPG, GIF, TIFF, PCD…
• Example for an image of the Biblioteca Nacional de España: BNEM_002348CR_V.jpg
• Image formats: open formats based upon norms and standards whose specifications are public. We have only considered separate preservation and diffusion formats when the images are born-digital material of the University of Alcalá and can be diffused in a high-definition version because there are no copyright problems. We always indicate the file format, including the version (for instance, TIFF version 6). JPG and other compressed formats are preferred as a way to lighten the current databases.
• Image file metadata: a dataset that describes other data in order to support their search, management and preservation (Dublin Core norm). Metadata can be descriptive (Subject, Description, Author, etc.), technical (Format, Digitisation options, etc.) and administrative (Copyright, etc.). There are two possibilities of relating the metadata to the image file: external storage, or inclusion of the metadata in the file.
We decided to combine both approaches by maintaining the essential metadata (such as the image ID, Title, Archive and Date of digitisation) inside the digital file, while the rest of the metadata are stored in external databases.
• Storage supports and infrastructure: we have chosen optical devices because of their capacity, durability, reliability, accessibility, volume,
stability and cost, among other qualities. These external devices guarantee access to the databases even in the case of a web failure.
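The file-naming convention given in the digitisation specifications above can be checked mechanically. A minimal sketch follows; since the archive ID has variable length, the archive and collection parts are kept together as one prefix (the published example BNEM_002348CR_V.jpg does not let us split them unambiguously), and the regex and helper name are ours:

```python
import re

# FFFXX_NNNNNNNN_T.EXT -- archive+collection prefix, file ID,
# on-screen size code, extension.
PATTERN = re.compile(
    r"^(?P<prefix>[A-Za-z]{3,7})"        # FFF (up to 5 chars) + XX (2 chars)
    r"_(?P<file_id>[A-Za-z0-9-]{1,20})"  # geographic file ID; the published
                                         # example also contains digits
    r"_(?P<size>[VTP24H])"               # V, T, P, 2, 4 or H
    r"\.(?P<ext>[A-Za-z]{2,4})$"
)

def parse_image_name(name: str) -> dict:
    """Split an image file name into its ID groups, or raise ValueError."""
    m = PATTERN.match(name)
    if not m:
        raise ValueError(f"not a valid image file name: {name!r}")
    return m.groupdict()

print(parse_image_name("BNEM_002348CR_V.jpg"))
# {'prefix': 'BNEM', 'file_id': '002348CR', 'size': 'V', 'ext': 'jpg'}
```

Such a check is cheap to run over a whole collection and catches names that would later break the separation of ID groups on the underscore.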
9.3.5.2 Preserving digital content
A digital copy of a document does not necessarily guarantee its long-term survival: all digital material has to be maintained in order to keep it available for use. The main causes for the loss of digital content are the succession of hardware generations that can render files unreadable (although a solution is the development of systems capable of accessing old disks using emulation techniques), the rapid succession and obsolescence of computer programmes, and the limited lifetime of digital storage devices. Unless data are migrated to current programs, or care is taken to preserve the original source code, retrieval of information may become very costly, if not impossible. This is especially hard for ‘closed’ data formats (those whose specifications are not publicly known), which we have avoided. That is the reason for using two platforms (PC and Mac) simultaneously, with software compatible with both, and for designing an open system, always according to the PREMIS (PREservation Metadata: Implementation Strategies) Working Group Report (version 2.0).
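One standard safeguard against the migration risks just described is fixity checking: recording a cryptographic checksum before a migration and verifying it afterwards (PREMIS provides semantic units for recording exactly this kind of fixity information). A generic sketch, not the project's actual tooling:

```python
import hashlib
from pathlib import Path

def fixity(path: Path) -> str:
    """SHA-256 checksum of a file, read in chunks so that
    multi-gigabyte map scans do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_migration(original: Path, migrated: Path) -> bool:
    """True if the migrated copy is bit-identical to the original."""
    return fixity(original) == fixity(migrated)
```

Storing the checksum alongside the external metadata lets the integrity of every digital object be re-verified after each migration to a new storage device.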
9.3.5.3 Online accessibility and diffusion
Nowadays web-based digital resources are quite frequent as a way to preserve and diffuse the cartographic heritage, as well as to access modern cartography (Zentai 2006; Livieratos 2008). Previous experiences, such as the one implemented in the Greek region of Macedonia (Jessop 2006) or the GIS-Dufour (Egil & Flury 2007), have shown the potential of GIS and its accessibility through the web. Under current EU law and international agreements, material resulting from digitisation can only be made available online if it is in the public domain (which, in a narrow sense, refers to information resources that can be freely accessed and used by all, for example because copyrights have expired) or with the explicit consent of the right holders. The transparency and clarification of the copyright status of works is very relevant to us. As a matter of fact, our digital library is in principle focused on public-domain material, and as digital preservation implies copying and migration, it has always been considered in the light of IPR legislation (Commission 2005). The digitised funds of other libraries are precisely quoted, and
respect the conditions that the right holders have established for consulting the documents; we neither set supplementary cautions that restrict access to the different data sets, nor establish different access levels. We also provide links to the documents included in other cartographic databases, which can likewise be accessed through the Internet.
9.4 Conclusion

According to the initiative of the Council of the European Union on the European Digital Libraries as a common multilingual access point to Europe’s digital cultural heritage, and considering ancient maps and plans as important cultural materials, we have developed an innovative GIS-based methodology for ancient cartographic documents, whose essential values are:
• To create new relational, multiformat and multilingual cartographic databases that organise and unify the information that different archives and libraries have elaborated about their funds. They also incorporate the dispersed and unknown documents that belong to non-digitised collections. This new information follows the ISBD norm, and joins and completes the different approaches of the librarian, the historian (Edney 2007) and the more technical one of the cartographer.
• The new databases join both digitised materials and new information that we have produced in digital format. Our project provides mechanisms that facilitate digitising maps, identifying problems and monitoring bottlenecks (such as those that appear when handling large-size maps).
• They preserve the original materials, which are usually fragile.
• The open GIS surpasses the usual operability of traditional multiformat databases, as it enlarges, through the queries, the ways to access the different kinds of data. We have also designed a new and personalised way to access high-resolution digital images by applying the hypermedia concept.
• Our methodology provides an easy and successful electronic integration of metadata and text, and of graphic and numerical information about the ancient maps.
• Online accessibility and diffusion through the Internet, as a response to a real demand among citizens and within the research community,
always paying full respect to the international legislation in the field of intellectual property.
• This new methodology has been created aiming to be an open system that can be implemented in all countries of the European Union.
The essential challenges that we have undertaken during the development of the project were:
A/ Those that have impacted the pace and efficiency of digitisation:
• Financial challenges: digitisation is labour-intensive and costly, and as it is impossible to digitise all relevant material, choices have to be made on what is to be digitised and when.
• Organisational challenges: a ‘digitise once, distribute widely’ strategy benefits all the organisations involved in digitisation projects; in this sense, ours not only shows the duplicate maps in the different collections, but also allows finding the various states of the plates, following the history of the different prints and editions of a map or the collections, and their origins.
• Technical challenges: we have tried to improve digitisation techniques in order to make digitisation cost-efficient and affordable. To digitise the non-digitised funds we apply both contact scanning and non-contact photographic methods (Tsioukas, Daniil and Livieratos 2006), trying to minimise distortion problems by digitising each sheet separately, although this process cannot eliminate other problems in the final assemblage of the mosaic image.
• Legal challenges: digitisation presupposes making a copy, which can be problematic in view of intellectual property rights (IPR). We have considered Directive 2001/29/EC on the harmonisation of certain aspects of copyright and related rights in the information society (European Parliament and Council directive of 22 May 2001, OJ L 167, 22.6.2001, p. 10), which foresees an exception for specific acts of reproduction by publicly accessible libraries, educational establishments, museums or archives.
For the maps that are as yet unpublished or remain unknown (for instance, those of private collectors), provided there is no legal obstacle to showing them, we offer a link to a 1:1 high-resolution image that makes it possible to see every detail and to read every name in order to analyse the map properly.
B/ The basic challenges that we have found for digital preservation are:
• Financial challenges: the real costs of long-term digital preservation are not yet clear and depend on storage costs and the number of migrations
needed over time, including the efforts necessary to check the integrity of the digital object after migration. Once again, choices have to be made as to which material should be preserved (according to its archival or historical value, use, etc.).
• Organisational challenges: an added value can be found in ensuring complementarities and an exchange of good practices.
C/ Technical challenges:
• How to preserve the content so that it can be accessed, trusted and reused in the future; how to preserve high volumes of rapidly changing distributed information; and how to develop tools, methods and technologies to preserve dynamic content (which changes as a result of user interactions or the addition of new data), tools for automatic analysis and indexing, and optimisation of GIS tools.
• How to improve cost-efficiency and affordability.
D/ Legal challenges: the traditional model of library services is not easily transferred to the digital environment, and as digital preservation depends on copying and migration, it has raised a set of new issues, such as:
• The introduction of technological protection measures to prevent copying.
• The possibility, which we are already studying, of setting up a digital rights management system restricting access to digital material, with the aim of ensuring that IPR mechanisms maintain a balance between enabling access and use and respecting the rights of the creators.
As a final conclusion, we can state that the pilot experience on Spanish ancient cartography, accessible through an open GIS and diffused through the Internet, that we present here is a successful example of the full application of the methodology in all its stages.
9.5 Acknowledgements

This paper is a result of two main research projects:
• The Project EH-2007-001-00 “Las vías de comunicación en la cartografía histórica de la Cuenca del Duero: Construcción del territorio y paisaje”, financed by the Centro de Estudios Históricos de Obras Públicas y Urbanismo (CEHOPU-CEDEX) of the Ministerio de Fomento (Spain).
• The Project PAI08-0216-9574 “La cartografía histórica de la Comunidad de Castilla-La Mancha en los principales archivos españoles”, financed by the Consejería de Educación y Ciencia de la Junta de Comunidades de Castilla-La Mancha.
Both fall within our research group’s guidelines on the investigation of the cultural heritage through the application of the most innovative technologies, such as GIS and multiformat databases, which set up an essential basis for the knowledge of the history of the territory, the landscape and the town. For a decade our team has been engaged in setting up different useful methodologies that are being implemented in the Technical School of Architecture and Geodesy of the University of Alcalá.
9.6 References

Andrews, JH 2005, ‘Meaning, knowledge and power in the philosophy of maps’, in JB Harley, The new nature of maps. Essays in the history of cartography, The Johns Hopkins University Press, Baltimore, Maryland, pp. 21-58.
Chías, P 2004, Bases de datos y gestores de bases de datos para los sistemas de información geográfica, Publicaciones de la Escuela Técnica Superior de Arquitectura, Universidad Politécnica, Madrid.
Chías, P 2004a, La imagen de los fenómenos geográficos en un sistema de información geográfica, Publicaciones de la Escuela Técnica Superior de Arquitectura, Universidad Politécnica, Madrid.
Chías, P & Abad, T 2006, ‘A GIS in Cultural Heritage based upon multiformat databases and hypermedia personalized queries’, ISPRS Archives, no. XXXVI-5, pp. 222-226.
Chías, P & Abad, T 2008, ‘Las vías de comunicación en la cartografía histórica de la cuenca del Duero: construcción del territorio y paisaje’, Ingeniería Civil, no. 149, pp. 79-91.
Chías, P & Abad, T 2008a, ‘Visualising Ancient Maps as Cultural Heritage: A Relational Database of the Spanish Ancient Cartography’, in Information Visualisation, IEE, London, pp. 453-457.
Commission of the European Communities 2005, i2010: Digital Libraries, Brussels.
De Seta, C (ed) 1996, Città d’Europa. Iconografia e vedutismo dal XV al XIX secolo, Electa, Napoli.
Edney, MH 2007, ‘Maps and “other awkward materials”: critical reflections on the nature and purpose of cartobibliography’, in Paper and poster abstracts of the 22nd International Conference on the History of Cartography, ed M Oehrli, Verlag Cartographica Helvetica, Murten, p. 119.
Egil, HR & Flury, P 2007, ‘GIS-Dufour: historical maps as base in a geographical information system’, in Paper and poster abstracts of the 22nd International Conference on the History of Cartography, ed M Oehrli, Verlag Cartographica Helvetica, Murten, p. 196.
Fleet, C 2007, ‘Digital Approaches to Cartographic Heritage: The Thessaloniki Workshop’, Imago Mundi, vol. 59, no. 1, pp. 100-104.
Harley, JB 1968, ‘The evaluation of early maps: Towards a methodology’, Imago Mundi, no. 22, pp. 68-70.
Harley, JB & Woodward, D 1987, ‘Preface’, in The History of Cartography: Cartography in Prehistoric, Ancient, and Medieval Europe and the Mediterranean, eds JB Harley & D Woodward, The University of Chicago Press, Chicago, Illinois, vol. I, pp. xv-xxi.
Harvey, PDA 1980, Topographical maps. Symbols, pictures and surveys, Thames and Hudson, London.
Heere, E 2006, ‘The use of GIS with property maps’, e-Perimetron, vol. 1, no. 4, pp. 297-307, viewed 2 July 2008, <http://www.e-perimetron.org/Vol_1_4/Vol1_4.htm>.
Jessop, M 2006, ‘Promoting cartographic heritage via digital resources on the Web’, e-Perimetron, vol. 1, no. 3, pp. 246-252, viewed 2 July 2008, <http://www.e-perimetron.org/Vol_1_3/Vol1_3.htm>.
Kagan, RL 1998, Imágenes urbanas del mundo hispánico, 1493-1780, Ediciones El Viso, Madrid.
Livieratos, E (ed) 2006, Digital Approaches to Cartographic Heritage, National Centre for Maps and Cartographic Heritage, Thessaloniki.
Livieratos, E 2008, ‘The challenges of Cartographic Heritage in the Digital World’, in Third International Workshop ‘Digital Approaches to Cartographic Heritage’, Barcelona, viewed 10 July 2008.
Orciani, M, Frazzica, V, Colosi, L & Galletti, F 2007, ‘Gregoriano cadastre: transformation of old maps into Geographical Information System and their contribution in terms of acquisition, processing and communication of historical data’, e-Perimetron, vol. 2, no. 2, pp. 92-104, viewed 2 July 2008, <http://www.e-perimetron.org/Vol_2_2/Vol2_2.htm>.
Skelton, RA 1965, Looking at an early map, University of Kansas Library, Lawrence, Kansas.
Spence, R 2007, Information Visualization. Design for Interaction, Pearson, London.
Tsioukas, V, Daniil, M & Livieratos, E 2006, ‘Possibilities and problems in close range non-contact 1:1 digitization of antique maps’, e-Perimetron, vol. 1, no. 3, pp. 230-238, viewed 2 July 2008, <http://www.e-perimetron.org/Vol_1_3/Vol1_3.htm>.
Tufte, ER 2005, Visual Explanations. Images and Quantities, Evidence and Narrative, Graphics Press, Cheshire, Connecticut.
Vázquez Maure, F & Martín López, J 1989, Lectura de mapas, Instituto Geográfico Nacional, Madrid.
Zentai, L 2006, ‘Preservation of modern cartographic products’, e-Perimetron, vol. 1, no. 4, pp. 308-313, viewed 2 July 2008, <http://www.e-perimetron.org/Vol_1_4/Vol1_4.htm>.
10 Map Forum Saxony. An Innovative Access to Digitized Historical Maps
Manfred F. Buchroithner1, Georg Zimmermann2, Wolf Günther Koch1, Jens Bove3

1 Technische Universität Dresden, Institute for Cartography, Germany; [email protected], [email protected]
2 Saxon State and University Library (SLUB), Dresden, Germany; [email protected]
3 Deutsche Fotothek (German Photothek), Dresden, Germany; [email protected]
Abstract

In 2006, on the occasion of the 800-year anniversary of the Saxonian capital Dresden, a web presentation of selected maps and historical vedute stored in the Saxonian State and University Library (Sächsische Landesbibliothek – Staats- und Universitätsbibliothek, SLUB) in Dresden, Germany, was initiated. Since then this digital map collection, named “Kartenforum Sachsen” (“Map Forum Saxony”), has been significantly extended. The map forum represents an information portal to both Saxon and foreign libraries, museums and archives, managed by the Deutsche Fotothek (German Photothek) of the SLUB. Currently it offers roughly 5,300 of the most important historical cartographic media (maps and vedute) of all the collections involved. The digitising meets the highest resolution requirements. The article describes the analogue/digital conversion of the originals, the data processing (by means of Zoomify), and the zooming and scrolling possibilities. The further extension of the map forum is under way until 2011 within the scope of a project funded by the German Research Foundation (DFG). Among other things, it is planned to realise map-based searching for larger maps and map series by means of an open standard interface (Google Maps API).
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_10, © Springer-Verlag Berlin Heidelberg 2011
208 M. F. Buchroithner et al.
10.1 Introduction

Historical1 and topographic maps as well as vedute are among the most precious stocks of great libraries, whose collections, within Europe, mostly reach back to the 16th century. The Saxonian State and University Library (SLUB) in Dresden, Germany, also holds one of the greatest and most significant collections of this kind. Because of their great historical and aesthetic value, old maps and vedute are also objects of interest for private collectors worldwide. Historical maps comprise hand drawings (manuscript maps), original prints, as well as facsimile prints. Often they captivate with their beauty, because they are documents of ancient arts. First and foremost, they document the contemporary topographic and thematic structure of landscapes. In particular, they are analysed and systematically evaluated by the discipline of Historical Geography in order to reconstruct former states of a nation’s culture and to compare them with recent information. For this purpose, digital techniques based on geoinformation systems (GIS) are nowadays used (cf. i.a. Hackner 2002, Witschas 2002, and Walz & Berger 2003). In the broadest sense they enable a spatially determined insight into municipal, regional and global history (cf. Große & Zinndorf, without date; Hackner 2002; Leppin et al. 2000; and Witschas 2002). Furthermore, they represent the main source for historical map research. Access to and distribution of such comprehensive sets of spatial information has always been one of the major objectives of the Saxonian State and University Library. The utilisation of new media such as analogue-to-digital and digital-to-analogue conversion, highly efficient storage media, and especially the Internet provides further potential for users. Thus, in the future, time-consuming and expensive searches will be replaced by "virtual access".
1
Whether a map is called "historical" depends on the type of map and various other factors. According to Stams (2001), maps published before 1945 (and thus produced using "traditional" analogue methods) are historical maps. The German database for "old maps", IKAR, which has been built up since 1985, only considers "historically precious map stocks until 1850". The SLUB in Dresden even delimits its cartographic "old stock" with the year 1800. The authors of this paper basically consider all maps "historical" if they were either published under a bygone political or social regime or if there exists a more recent edition than the one in question.
Fig. 10.1. Portal of the Map Forum Saxony as it has presented itself since 2008: digitised maps and vedute of Dresden.
Cataloguing and digitising of historical material provide a basis for convenient, quick, and precise access to the stock of selected historical maps and vedute. The website “Kartenforum Sachsen” (“Map Forum Saxony”), established on the occasion of the 800-year anniversary of the Saxonian capital Dresden in 2006 and maintained by the German Photothek (www.deutschefotothek.de), contributes to the preservation of the historical stocks and increases their worldwide utilisation (Figure 10.1).
10.2 Access to Spatial and Graphic (Historical-Cartographic) Information and Concept of the Contents

According to the statutes of the SLUB, public access to the cartographic library stocks has to be granted. In spite of their historical value, maps should not be insulated from the public like other historical collection objects (e.g. books) stored in non-accessible archives of a library. "Stocks
Fig. 10.2. Seutter, Matthäus: City Map of Dresden, scale approx. 1:6,000, coloured copperplate print, about 1755.
should not only be preserved for ensuing ages but also be kept for current research, education, and for all scientific and cultural interests" (Haupt 1980). Loaning originals outside libraries is not possible because of their value, but also because of the bulky size of many maps. Until a few years ago, access to maps was only provided via map reading rooms and the time-consuming preparation of reproductions (facsimiles). The transition to the digital age within librarianship since the middle of the 1980s has yielded new possibilities in archiving and accessing historical maps (e.g. the establishment of digital catalogues). Since the beginning of the 1990s, analogue-digital conversion has also been realisable in the new German (eastern) federal states. Therefore, the digitising of selected (particularly precious, frequently requested, especially typical) maps and pictures was tackled. Due to economic reasons, digitising of the present total stocks of
the SLUB, about 170,000 maps and vedute, might not be accomplishable in the foreseeable time, and is not even necessary. After the corresponding conceptual preparatory activities, the digitising of selected maps and vedute began in the year 2005. The selection was based on the aim of showing the most important and most valuable maps and vedute of Dresden and its surrounding areas for the 800-year anniversary of Dresden. Old city maps of Dresden, as well as hand drawings, copper engravings, lithographs, and prints are, with their cartographic artwork, equally attractive and informative (Figure 10.2). The digitised city maps and vedute from five centuries deliver not only information about the development of landscape and city but also allow a comparison with present times. Users interested in history may find answers to questions regarding the dimension of the moats and battlements of the old residence of Dresden and their economic and cultural influence. The perception of the city through time, derived from thematic maps and vedute, is invaluable for urban historians, monument conservators, and Dresden enthusiasts, but also for today’s architects and city planners. The knowledge of former settlement structures, watercourses, and land use
Fig. 10.3. Pictograms of the Map Forum Saxony zoom window.
Fig. 10.4. The Cruse large-format flat-bed scanner of the German Photothek Dresden.
are the basis for every landscape architect. Time series of Dresden map sheets (equidistance maps and ordnance survey maps) at a scale of 1:25,000 offer a descriptive presentation of historical relations and urban development from 1882 to 1941. They allow one to trace the extension of the city, the changes in the street network, and the relocation of watercourses. Thus, the landscape changes within the 19th and 20th centuries are well illustrated. Historical maps are unique sources for gaining historical background information. Based on the archive inventory of Dresden, the intention was to provide access to the maps and vedute of Saxony via the Internet. First, in 2006, the “Sächsische Meilenblätter” (“Saxonian Mile Sheets”), hand-drawn maps from the First Saxonian Triangulation (1780-1806) at the scale of 1:12,000, and several hand drawings of Saxonian cities from around 1800 at the scale of 1:3,000 were digitised. The international use of the digitised sheets is increasing, as is the positive response in scientific publications and at international conferences. The digitised maps have, i.a., already been used by architecture students of the University of Delft for a project dealing with historical building research.
Map Forum Saxony. An Innovative Access to Digitized Historical Maps 213
10.3 Structure of and Access to the Digital Collection
10.3.1 Search and Selection of Maps, Interactive Navigation

Until November 2009 about 5,300 historical maps and vedute from Dresden and its surrounding areas were digitised and published for Internet use. Access to the “Kartenforum Sachsen” is provided either via the “Deutsche Fotothek” ("German Photothek"; www.deutschefotothek.de) using the tab “Kartenforum” or by searching directly for “Kartenforum Sachsen” with a search engine. The menu allows interactive navigation (e.g. zoom, pan) through the maps in any web browser (presently a Flash plug-in is required; Figure 3). For selected important maps and vedute, additional texts offer information on the historical cartographic context and the history of Dresden and Saxony. Besides the “Kartenforum Sachsen”, the digitised maps can be queried via the Online Public Access Catalogue (OPAC) of the SLUB and the Südwestdeutscher Bibliotheksverbund (SWB) using the advanced search capabilities (check box “Online Resources"). Other libraries and map collections have their own concepts of search and use, customised for local conditions and needs.2

10.3.2 Scanning

Large historical maps were digitised using the large-format scanner “CS 220 SL 450” of Cruse GmbH, Digital Imaging Equipment, Rheinbach, Germany. This device has a geometric resolution of 1,100 dpi (dots per inch) with a colour depth of 24 bit (3 x 8 bit RGB). The map is placed
2 Two examples: The State and University Library Bremen, Germany, had by 2005 digitised its whole stock of "old maps" (however, only 3,800 sheets) and furnished it with geographic (regional) and thematic keywords for Internet use. The search can also take place using graphical navigation in general maps. Scrolling and zooming (not continuous!) are possible; map extracts can also be downloaded. The digital collection of historical maps of the Library of Congress in Washington, D.C., U.S.A., is certainly one of the most comprehensive in the world. However, it only allows stepwise enlargements up to 16x, the respective map detail being indicated as a red box on a thumbnail representation of the whole map. Dynamic panning is not possible.
on a horizontal suction plate, the originals being up to 125 cm by 185 cm in size. A light bar is moved stepwise over the digitising plate while the scanner simultaneously captures the image information. This way of lighting guarantees minimal stress for the documents: sensitive originals are exposed to up to ten times less light than with conventional technologies. Two geometric resolutions are available: 7,000 x 10,500 pixels and 10,000 x 15,000 pixels. Colour alignment, especially white balance, can be performed using various media. To digitise ultra-large-format master copies, image partitioning and image mosaicing have to be applied. A few very precious maps and vedute, up to 125 cm x 185 cm in size, were reproduced on colour slides of either 13 cm x 18 cm or 9 cm x 12 cm that were subsequently digitised using a flat-bed scanner, the Purup Eskofot “EskoScan F14”. It achieves a geometric resolution of 5,400 dpi with a colour depth of 42 bit (3 x 14 bit RGB). For further information regarding non-contact scanning of historical maps the reader is referred to Leppin, Rausch and Zinndorf (2000).
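The relation between the scanner's fixed pixel raster and its nominal resolution can be made concrete with a small back-of-the-envelope computation. This is our own illustration; the source does not state how the 1,100 dpi figure relates to the maximum original size:

```python
# Sketch: relating pixel dimensions to effective scan resolution (dpi).
# Pixel counts and original sizes are taken from the text; the
# interpretation that dpi varies with original size is our own.

CM_PER_INCH = 2.54

def effective_dpi(pixels: int, size_cm: float) -> float:
    """Dots per inch when `pixels` samples cover `size_cm` of original."""
    return pixels / (size_cm / CM_PER_INCH)

# Largest original (125 cm x 185 cm) at the scanner's larger raster:
dpi_w = effective_dpi(10_000, 125)   # roughly 200 dpi across the width
dpi_h = effective_dpi(15_000, 185)   # roughly 200 dpi along the height

# The nominal 1,100 dpi is reached only for much smaller originals:
max_width_cm = 10_000 / 1_100 * CM_PER_INCH   # about 23 cm
```

The computation suggests that the quoted 1,100 dpi applies to small originals, while wall-sized maps are captured at a correspondingly lower density per centimetre of original.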
Fig. 10.5. Located main villages and towns of Saxonian Mile Sheets in the Map Forum.
10.3.3 Data Processing

To present the historical maps and vedute on the Internet, the software suite “Zoomify” is used. Zoomify is a product of the company of the same name, Zoomify, Inc., Santa Cruz, California, U.S.A. This program creates tiles of the image, adds zoom effects and navigation buttons, and finally converts the image into the Flash format SWF. The software works from the archival master version, which allows access to the full resolution of the previously digitised image. Moreover, a very high image quality along with short loading times can be achieved.
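Zoomify-style viewers serve an image as a pyramid of fixed-size tiles, loading only the tiles visible at the current zoom level. The sketch below is our own simplification, not Zoomify's actual code; the 256-pixel tile edge is an assumption:

```python
import math

TILE = 256  # typical tile edge length in pixels (assumption)

def pyramid_levels(width: int, height: int, tile: int = TILE):
    """Return (w, h, tiles_x, tiles_y) per zoom level, from the level at
    which the whole image fits into one tile up to full resolution."""
    levels = []
    w, h = width, height
    while True:
        tx = math.ceil(w / tile)
        ty = math.ceil(h / tile)
        levels.append((w, h, tx, ty))
        if tx == 1 and ty == 1:
            break
        # each coarser level halves the image dimensions
        w = math.ceil(w / 2)
        h = math.ceil(h / 2)
    return list(reversed(levels))  # index 0 = coarsest level

# A 10,000 x 15,000 pixel scan (the scanner's larger raster):
for level, (w, h, tx, ty) in enumerate(pyramid_levels(10_000, 15_000)):
    print(level, w, h, tx * ty)
```

Because a view only ever fetches a handful of small tiles, full-resolution access and short loading times are compatible, which is the property the text attributes to the Zoomify presentation.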
10.4 Requirements for Internet Presentations of Historical Maps. Potential of Zoomify

In order to access spatial graphic information based on digitised maps and vedute, high demands on their visibility and usability have to be met. These demands apply irrespective of the Internet, because maps can also be distributed offline on storage media like CD-ROMs and DVDs. The information system of the SLUB is Internet-based.3 Although an Internet presentation provides “ubiquitous” utilisation, it is hard to determine the user needs. In the following, these needs are summarised on the basis of expert knowledge derived from four decades of academic research at the Dresden University of Technology, in particular from the lecture “History of Cartography” (Koch, 2002). The program “Zoomify” fulfills all these needs, the main aspects being the following three:

1. Presentation of the whole map to gain a synoptic impression of layout, map style and possible peculiarities: Because of the limited screen size, many maps must be displayed with some limitations, since down-scaling is unavoidable. However, by scrolling through the map (with automatic reload of the screen parts becoming blank) a total overview can be obtained.

2. Visualisation, analysis and reproduction of details, legend, cartouches etc.:
3 Today several historical maps and map series are commercially available on CD-ROM and/or DVD, e.g. from swisstopo the complete Siegfried Map 1:25,000/1:50,000 and the complete Dufour Map 1:100,000, the latter provided with superimposition functions over the recent topographic map of Switzerland 1:100,000.
The desired part of the map can be chosen in the overview window and displayed at original scale in the map window. By means of functional buttons (arrow symbols) and the symbols “+” and “-” it is possible to zoom in or out in 7 steps.4 An optional function is to zoom continuously using a scroll bar. The different parts of the map can be downloaded with suitable software (also in colour) and thus used and prepared for external applications (e.g. presentations, publications in journals etc.). Another possibility is to order the desired maps on CD-ROM.

3. Visualisation, analysis and reproduction of individual map symbols, figures etc.: The zooming capabilities of Zoomify permit very high magnifications at the highest graphical resolution. Compared to hardcopies, the softcopies are naturally limited (older maps are mostly copper engravings or lithographs).5 The extraction of single symbols, indispensable for detailed semiotic analyses, is thus possible.
10.5 Recent Developments

In order to expand the possibilities for conveying spatial graphic information and to make them still more efficient, at the beginning of 2007 the source material offered by the Map Forum Saxony was integrated into the newly structured image database of the German Photothek. Thus, the whole information offer of the Map Forum can be explored separately or in the context of architectural drawings, historical photographs and aerial photographs, in the sense of combined spatial-historical knowledge. Besides an open full-text search, there exists the possibility of targeted queries concerning title keywords, countries/places, buildings, artists/architects, dates, and scales. In addition, all sheets are retrievable by means
4 The maximum magnification reachable with the Zoomify functions provided by the German Photothek is, regarding graphic resolution, equal to that of other Internet presentations of map collections in the German-speaking domain and beyond (cf. Collection Ryhiner, Bern, Switzerland).
5 Due to the (basically not manipulable) lower resolution of digital screens (in comparison to analogue maps), the representation of certain map symbols and letters can lead to graphic blurring and semantic fuzziness. Presently, topographic and thematic digital screen display still requires particular map graphics (cf. Neudeck, 2001 and 2005).
of search engines like Google. Furthermore, since October 2008 an OAI interface is available, which gives Europeana, and later also the Deutsche Digitale Bibliothek (German Digital Library), access to the maps. Apart from easy-to-handle search possibilities, the image/map database is primarily characterised by a high degree of transparency regarding its contents. In order to optimise access to the more complex map series, SLUB, in cooperation with the Department of Surveying and Cartography of the Dresden University of Applied Sciences (Hochschule für Technik und Wirtschaft Dresden [FH]), realised visualisation models for the Saxonian Mile Sheets (see above) that permit the targeted retrieval of individual sheets via map indices or general maps as well as by districts or sheet-line systems. Other map series of Central Europe are available through similar entry points. In addition to these query interfaces offered to the users, the German Photothek looked for an automatable way to visualise the complex geographic relations. This has been realised by means of database-driven access to the open programming interface (API) of Google Maps. It allows the major places indicated in the historical maps, which were digitally recorded during cataloguing, to be localised in recent maps and satellite imagery. This is a very simple and (spatially) well-structured way to convey, for example, which places are covered by city maps.
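A database-driven link between catalogue records and a web-map API can be sketched as follows. This is a hypothetical illustration: the record schema, field names, identifiers and coordinates are invented, and the Photothek's actual implementation is not described in the source:

```python
# Hypothetical sketch: turning catalogued place records into marker data
# that a web-map front-end (e.g. the Google Maps API) could consume.
# All names, identifiers and coordinates below are invented examples.

from dataclasses import dataclass

@dataclass
class PlaceRecord:
    name: str          # place name recorded during cataloguing
    lat: float
    lng: float
    map_ids: list      # identifiers of historical maps covering the place

def to_markers(records):
    """Build one marker per catalogued place for display on a recent map."""
    return [
        {"title": r.name,
         "position": {"lat": r.lat, "lng": r.lng},
         "maps": len(r.map_ids)}
        for r in records
    ]

catalogue = [
    PlaceRecord("Dresden", 51.05, 13.74, ["map_0001", "map_0002"]),
    PlaceRecord("Meissen", 51.16, 13.47, ["map_0003"]),
]
markers = to_markers(catalogue)
```

Clicking such a marker in the front-end would then list the historical maps covering that place, which matches the text's description of conveying which places are covered by city maps.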
10.6 Prospects

Beginning in 2009, the successful approach described above is being extended into a basic cartographic database for all scientific disciplines, covering the whole of Germany. This development is funded by the German Research Council (DFG) within the project “Innovative Access to Spatial Graphic Information”. The ambitious objective of this project is to enlarge the source database, which until recently comprised only originals from the stock of the Saxonian State Library SLUB and its partner libraries dealing with Dresden and Saxony. This concerns both the width (geographic coverage) and the depth (timespan) of the maps. Supra-regional relevance is to be achieved by digitising the 674 sheets of the Karte des Deutschen Reiches (Map of the German Empire; 1880-1919) at a scale of 1:100,000 as well as the 6,000 sheets of the ordnance survey map 1:25,000 for the whole German Empire (1860-1935). Furthermore, SLUB aims to cover the whole spectrum of spatial-historical aspects by digitising (almost) all relevant maps, vedute and other graphic landscape depictions of all Euroregions neighbouring Saxony, from the recent past to 500 years back. During the period of the aforementioned project some 12,000 historical maps are envisaged to be processed. It can be expected that this undertaking in the field of digitising cartographic information will acquire model character. The approach pursued by the Map Forum offers considerably more than other collections and portals: full online exploitation, access for search engines, a clear structure, and query possibilities for specific disciplines, also for libraries other than SLUB. The chosen method for the georeferencing of historical maps and views, which is an actual georeferencing of individual places and landmarks, follows a rather pragmatic approach. It permits a comprehensive basic supply to be provided to the user in a well-structured way. For the first time a map-based inquiry of large map collections has been materialised, making use of an open standard interface (the Google Maps API). It can be used intuitively without installing proprietary software. A timeline for the interactive visualisation of temporal, i.e. historical, relations is also planned, e.g. the depiction of all street maps of the 19th century or of all maps and views of a certain place between 1785 and 1815.
10.7 Concluding Remarks

In the age of spatial data mining and information engineering it seems indispensable to process geospatial information in digital graphic formats in order to provide convenient public access via the Internet. In this context the “Map Forum Saxony” was enhanced with new geographical and thematic search functions, in order to improve the use of the German Digital Library (Deutsche Digitale Bibliothek) (cf. Bove 2005). The principle of “open access” has been applied to permit the fast acquisition of information about major cartographic sources regarding the history and regional studies of Saxony, as well as map collections from various cooperating libraries. One objective of the map and vedute collection is the coordination and applied scientific support of scientific facilities (libraries, museums, archives) and the preparation of their own stocks, in order to increase the range of digital regional sources. Presently (November 2009) the efforts of the SLUB to process spatial-historical information, such as historical maps and vedute, in digital form have
been assessed very positively. The graphic quality and user-friendliness of the digital map provision are on a high level. Within the next few years, the Map Forum Saxony will become a working tool that makes use of the most advanced technologies and allows the visualisation of the spatio-temporal dimension of the original sources of urban development research in an unprecedented way. Through the cross-media provision of originals and, in particular, through the intensified inclusion of hand-drawn, unique map material, the Map Forum Saxony will reach a new quality in cartographic, and hence geographic, information which compares very favourably to the offerings of other map collections world-wide.
10.8 References

Bove, J 2005, Kartenforum Sachsen, SLUB-Kurier, Dresden, no. 2005/3, pp. 12-13.
Bove, J & Zimmermann, G 2008, Das Kartenforum Sachsen. Innovativer Zugriff auf raumbezogene grafische Informationen, BIS: Das Magazin der Bibliotheken in Sachsen, Dresden, no. 3, pp. 148-150.
Buchroithner, MF, Zimmermann, G & Koch, W-G 2007, The Dresden Digital Archive of Historical Maps. Proceedings of the 23rd International Cartographic Conference, Moscow 2007, 12 pages, CD-ROM.
Große, B & Zinndorf, St (without date), Möglichkeiten und Grenzen der Nutzung von Altkarten, mobiler Scan-Technik und GIS-Anwendungen in der Landschaftsforschung. Unpublished Report, 5 p., Univ. Rostock.
Haupt, W 1980, Führer durch die Kartensammlung der Landesbibliothek zu Dresden. Saxonian State Library, Dresden.
Hackner, N 2002, Einsatz historischer Kartenwerke zur Erfassung historischer Landnutzung. 9. Kartographiehistorisches Colloquium Rostock 1998. Vorträge, Berichte, Posterbeiträge. Kirschbaum-Verlag, Bonn, pp. 145-148.
Klemp, E 1996, Die Erfassung von Altkarten in der IKAR-Datenbank – gegenwärtiger Stand und künftige Entwicklungsmöglichkeiten. 7. Kartographiehistorisches Colloquium Duisburg 1994. Vorträge und Berichte (= Duisburger Forschungen, vol. 42), Duisburg, pp. 225-232 (cf. http://ikar.sbb.spk-berlin.de).
Koch, W-G 2002, Das Lehrfach „Geschichte der Kartographie“ an der TU Dresden. 9. Kartographiehistorisches Colloquium Rostock 1998. Vorträge, Berichte, Posterbeiträge. Kirschbaum-Verlag, Bonn, pp. 111-116.
Leppin, D, Rausch, R & Zinndorf, S 2000, Berührungsloses Scannen zur Nutzung historischer Karten für den Eigentumsnachweis an Liegenschaften. Kartogr. Nachrichten, vol. 50, no. 4, pp. 175-180.
Neudeck, S 2001, Zur Gestaltung topographischer Karten für die Bildschirmvisualisierung. Dissertation, Universität der Bundeswehr, München-Neubiberg (= Schriftenreihe d. Studiengangs Geodäsie u. Geoinformation d. Univ. d. Bundesw. München, vol. 74).
Neudeck, S 2005, Vorschläge zur Gestaltung thematischer Karten für die Bildschirmvisualisierung. Kartogr. Nachrichten, vol. 55, no. 1, pp. 31-35.
Seutter, M about 1755, Stadtplan von Dresden (City Map of Dresden), appr. 1:6,000, SLUB Dresden, Kartensammlung.
Stams, W 2001, Altkarte (Stichwort). In: Lexikon der Kartographie und Geomatik, Bd. 1, Spektrum Akademischer Verlag, Heidelberg/Berlin, p. 17.
Walz, U & Berger, A 2003, Georeferenzierung und Mosaikherstellung historischer Kartenwerke – Grundlage für digitale Zeitreihen zur Landschaftsanalyse. Photogrammetrie – Fernerkundung – Geoinformatik, vol. 2003, no. 3, pp. 213-219.
Witschas, S 2002, Erinnerung an die Zukunft – sächsische historische Kartenwerke zeigen den Landschaftswandel. Kartogr. Nachrichten, vol. 52, no. 3, pp. 111-117.
Zölitz-Möller, R, Hartleib, J, Röber, B & Sattler, H 2002, Das EU-Projekt „Digital Historical Maps“: Altkarten im Internet. Kartogr. Nachrichten, vol. 52, no. 1, pp. 13-19.
Cruse GmbH: http://www.crusescanner.com (last access 15 January 2010)
Deutsche Forschungsgemeinschaft (DFG): http://www.dfg.de (last access 15 January 2010)
Deutsche Fotothek: http://www.deutschefotothek.de (last access 15 January 2010)
Google Maps: http://maps.google.de (last access 15 January 2010)
Zoomify, Inc.: http://www.zoomify.com (last access 15 January 2010)
11 A WYSIWYG Interface for User-Friendly Access to Geospatial Data Collections
Helen Jenny1, Andreas Neumann2, Bernhard Jenny1, Lorenz Hurni1
1 Institute of Cartography, ETH Zurich, Switzerland
2 Office of Survey, City of Uster, Switzerland
Abstract Map collections and dataset providers increasingly offer online access to their geospatial data repositories. Users can visually browse and sometimes select and download datasets to their desktop. The experience of those using such applications may vary greatly, from novice to GIS expert. Two types of user interfaces for geospatial data collections are currently prevalent: the wizard-based and the search-based interface. Neither type serves expert and novice users equally well, the former imposing a rigid chronological order of user actions, the latter requiring the user to handle complex search mechanisms. The inadequacies of these interface types led to the development of a new interface for geospatial data collections, the WYSIWYG interface. The WYSIWYG interface encourages the user to explore the available datasets and system capabilities. The sequence of actions for map selection is flexible and can be influenced by the user. Instead of reference maps, scale-dependent representations of the downloadable datasets are displayed, giving an immediate impression of dataset characteristics. User-friendly layout and function design contribute to making the WYSIWYG interface a suitable front-end for geospatial data collections, satisfying the needs of expert users and novices alike.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_11, © Springer-Verlag Berlin Heidelberg 2011
222 H. Jenny et al.
11.1 Introduction: Common Types of Geospatial Data Collection Interfaces

When comparing online access to geospatial libraries, two major types of graphical user interfaces (GUIs) are often encountered: the wizard-based and the search-based interface. This section takes a closer look at the characteristics of these two types and, for each type, presents interface examples of established geospatial map collections. Observations based on the authors’ experience are then made on the suitability of the wizard-based and the search-based interface, which will clarify the need for a new type of GUI for geospatial data collections. The focus lies on web GUIs allowing the user to select from a number of geo-referenced geospatial datasets, including raster and vector data, which can be downloaded directly to the user’s desktop.

11.1.1 The Wizard-based User Interface

The wizard-based GUI is task-oriented: the user follows a step-by-step process. For selecting and downloading data from a geospatial library, a wizard-based user interface could include the following steps: first select a map product and format, then perform a location search, choose the correct location from a hit list, display the dataset or its outline on a reference map, select tiles or layers for download, choose a file format, and access the provided link to the downloadable file. The decisions the user needs to take can be imagined as a hierarchical tree. Beginning at the root, at each node the user selects from a number of options that lead down one branch of the tree. At the bottom of the tree, each leaf represents a selection for download. The tree metaphor shows very well the advantages and disadvantages of a geospatial library with a wizard-based GUI. The wizard provides a novice user with the necessary guidance and leads, step by step, to an output with correct dataset combinations. Empty search results or cartographically unfortunate dataset combinations are avoided.
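The decision tree described above can be modelled directly. The sketch below is our own illustration with invented step names and dataset options; each leaf stands for one downloadable selection:

```python
# Sketch: a wizard as a decision tree. Questions, options and leaf
# identifiers are invented for illustration.

wizard_tree = {
    "product": {
        "raster map": {
            "format": {
                "GeoTIFF": "download:raster/geotiff",
                "PNG": "download:raster/png",
            }
        },
        "vector data": {
            "format": {
                "Shapefile": "download:vector/shp",
                "GML": "download:vector/gml",
            }
        },
    }
}

def walk(tree, choices):
    """Follow one path of the tree; the user never sees sibling branches."""
    node = tree
    for choice in choices:
        (_question, options), = node.items()  # exactly one question per node
        node = options[choice]
    return node  # a leaf: the selection for download

selection = walk(wizard_tree, ["raster map", "GeoTIFF"])
```

The structure makes the drawback visible in code: `walk` touches exactly one branch per step, so everything outside the chosen path stays invisible to the user, which is the limitation discussed next.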
A disadvantage of this approach is that the user traverses only one path of the tree. The user remains unaware of the existence of the other branches and therefore of the system's capabilities and content. For a geospatial map collection with a wizard-based interface, this means that the user may see a text-based list of the available datasets for download but does not
A WYSIWYG Interface for User-Friendly Access 223
have a real understanding of the geospatial data's characteristics, possible appearance or applications. The user may also be unaware of certain functionalities that are only activated upon the selection of a specific dataset. According to Lauesen and Harning (2001, p. 68), the main problem of wizard-based user interface design “is that the user never gets an overview of
Fig. 11.1. Screenshots of Edina Ordnance Survey Collection Data Download Service: dataset selection (top left), location search (top right), tile selection (bottom left) and download parameter selection (bottom right) (EDINA n.d.).
the data available. (…) Furthermore, real-life tasks are usually much more varied and complex than the analyst assumes. Consequently, the system supports only the most straightforward tasks and not the variants.” While guiding the novice user, the wizard-like GUI does not promote understanding of the system and can be very limiting to the experienced user. An established example of a geospatial library with a wizard-based GUI is the British EDINA Digimap Collection (EDINA 2009). Its Ordnance Survey Collection Service provides “the ability to create and customize maps online, down to individual feature selection, specifying scale and area, and generating plots up to A0 in size” (Sutton, Medyckyj-Scott and Urwin 2007, p. 269). Also part of Digimap is the Ordnance Survey Collection Download Service, which offers a task-oriented step-by-step GUI to help novice users download datasets to their desktop (Fig. 1). The progress bar at the top of the interface, with the names of the different process steps to be taken in a predefined order, is a GUI element often present in wizard-based user interfaces.

11.1.2 The Search-based User Interface

Another type of geospatial data collection interface is the search-based interface. It usually starts with a location name search or a thematic keyword search. Visual selection of a geographic area of interest on a reference map is also often available. The search results are typically presented to the user as a text-based hit list. The user then selects one or more datasets from the hit list to be displayed in the map window. This display can take the form of a dataset extent frame on a reference map, the dataset itself (e.g. a satellite image), or a symbolization of point, line or polygon data. This interface type usually offers a multitude of text-based search options. The chronology of predefined steps is not as rigid as in the wizard-based GUI.
A combination of different datasets can be displayed in the map window, allowing expert users to check and compare data availability for a location more exactly. One reason to choose a search-based geospatial data collection interface lies in the nature of the available datasets: spatially very dispersed datasets are difficult to display adequately on a single map. A search mechanism helps to narrow down the number of eligible datasets, and the search results can in turn be displayed individually. While this type of interface offers more flexibility to the expert user, it can be confusing to the novice user. As with the wizard-based interface, it takes several steps before the user sees downloadable data in the display
Fig. 11.2. Screenshots of MIT GeoWeb (MIT GIS Services n.d.): search results for “New York” (top left), map window zoomed to first result on list (top right), list of available datasets for the selected area (bottom left), symbolization of two downloadable datasets (bottom right).
window. A wrong keyword may lead to an empty hit list, and reference maps may be mistaken for downloadable datasets. A novice user may not be familiar with the (abbreviated) dataset names in the hit list or may not have an accurate understanding of the dataset types. The user may also get the impression that it is not possible to gain access to the required datasets because he/she cannot think up the correct keyword for the text-based search. Often the symbolization of several selected datasets on the same reference map is not cartographically optimized, e.g. the same color is used for two different datasets or thematic layers. Established examples of geospatial libraries with search-based GUI are the Harvard Geospatial Library (Harvard University Library n.d.) and the
GeoWeb application of the MIT GIS Services (MIT GIS Services n.d.). MIT GeoWeb (Fig. 2) focuses on a variety of search options and allows the user to create arbitrary data layer combinations and retrieve additional attribute information. Based on a list of text-based search results, the user selects which data should be displayed on the background reference map. The downloadable output can be a raster or vector dataset in a number of formats or a geodatabase link for use with a GIS.
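The core loop of such a search-based interface, a keyword query over dataset metadata that returns a hit list, can be sketched in a few lines. The metadata records and keywords below are invented for illustration:

```python
# Sketch of a search-based interface's core step: keyword query over
# dataset metadata. Records, identifiers and keywords are invented.

datasets = [
    {"id": "ny_ortho_2004", "title": "New York orthoimagery 2004",
     "keywords": {"new york", "orthoimagery", "raster"}},
    {"id": "ny_roads", "title": "New York road network",
     "keywords": {"new york", "roads", "vector"}},
    {"id": "boston_parcels", "title": "Boston parcel polygons",
     "keywords": {"boston", "parcels", "vector"}},
]

def search(query: str):
    """Return the hit list for a keyword query. An empty list is exactly
    the failure mode that confuses novice users."""
    terms = set(query.lower().split())
    return [d["id"] for d in datasets if terms & d["keywords"]]

hits = search("roads")        # one match
no_hits = search("streets")   # wrong keyword: empty hit list
```

The second query illustrates the problem noted above: a synonym the indexer did not anticipate silently yields an empty hit list, and the novice user may conclude that no data exists.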
11.2 The WYSIWYG User Interface

The wizard-based and the search-based interfaces presented in the previous sections do not serve the experienced and the novice user equally well. In both interfaces the user needs to take several pre-selection steps before seeing downloadable data. The novice user may have difficulties selecting a dataset without receiving a visual impression of its appearance and characteristics, and may even confuse reference maps with downloadable datasets. In both interfaces, it is difficult to gain an overview of the available datasets and functionalities. The WYSIWYG interface is a new type of interface for geospatial libraries that meets the requirements of users with varying levels of experience with geospatial data. This is achieved by the following principles:

• Reference maps and text-based searches are used as little as possible. Instead, the user is always exposed to downloadable data in the map window; hence the name What-You-See-Is-What-You-Get (WYSIWYG) interface. The symbolization of datasets is cartographically correct and adapted to different display scales.

• The user is encouraged to explore all available datasets. The interface is flexible and does not impose a strict chronology of actions: it is up to the user to first navigate either to a different dataset type or to a different location. A variety of paths can be taken to arrive at the data of interest.

• GUI windows and sub-windows are clearly arranged according to specific design guidelines, and user-friendly measures are taken to optimize application use.

An example of a geospatial library with a WYSIWYG interface is the ETH Zurich Base Map Collection, which has been developed by the Institute of Cartography. In the following sub-sections, the above-named principles are further explained, and a possible implementation is demonstrated with details from the ETH Zurich Base Map Collection.
11.2.1 The WYSIWYG Map Window: Presenting Downloadable Data at Different Display Scales

After the user logs into the geospatial library application, a default downloadable dataset is immediately displayed in the map window of the WYSIWYG interface.

Fig. 11.4. Display of vector data in the WYSIWYG map window of the ETH Base Map Collection (left) and the corresponding sub-window to select layers for download (right). The hedges and trees layer, road network and administrative boundaries are selected for download. Layers marked with a crossed-out eye are not visible at the current display scale.

The user can either continue with this default dataset or switch to other downloadable datasets. What the user sees in the map window should resemble as closely as possible the content of the file that, as a final step, will be downloaded to the user’s desktop. In the map window of the search-based and the wizard-based interface types, a reference map (e.g. satellite image, street map, topographic map) is usually shown, and downloadable datasets are displayed only in a much later step, directly as symbolized vector data or an image clip, or indirectly as a dataset bounding box. The WYSIWYG map window, which immediately displays
Fig. 11.5. The 1:25 000 swisstopo raster map at different display scales in the WYSIWYG map window of the ETH Base Map Collection. For small scale display (left) a down-sampled image of the original (right) is used.
downloadable datasets and avoids using reference maps, offers a number of advantages:

• Some characteristics of the datasets are conveyed visually to the user without the need for further textual explanation; e.g. the user automatically differentiates between satellite image and topographic map, point and line information, river and highway.

• The user does not need to deal with the content of a reference map, nor does he/she risk confusing it with downloadable data. Unfortunate combinations of reference map and downloadable dataset (e.g. a downloadable relief map combined with a satellite image reference map) are also avoided.

• The user sees exactly where downloadable data is available, because it is always directly symbolized in the map window.

• Navigating to a location in the WYSIWYG map window, especially by zooming in several steps to a location, is at the same time an act of exploring the downloadable dataset.

When the user zooms in and out on the WYSIWYG map window, a screen display of the downloadable datasets corresponding as closely as possible to the content of the download file needs to be created. A major challenge is therefore to find an appropriate representation of the downloadable data at different display scales. The following sub-sections show how this challenge was met when designing the WYSIWYG GUI for the ETH Zurich Base Map Collection. Different solutions were chosen for displaying raster and vector datasets.

11.2.1.1 Displaying Vector Data at Varying Scales
The ETH Zurich Base Map Collection currently contains three topographic vector datasets (an overview map and two digital landscape models) covering the area of Switzerland, which are provided by the Swiss national mapping agency (swisstopo 2008). The datasets were generalized for the scales of 1:1 Million, 1:200 000 and 1:25 000. To mark the entire 1:25 000 vector dataset of Switzerland for download in the WYSIWYG map window, the user would need to view the dataset at full extent. The challenge was that the 1:25 000 dataset, being very detailed, could not be used for an overview map showing the whole country. At the same time, as stated in the previous sub-section, the screen display at full extent needed to look as similar as possible to the downloadable 1:25 000 vector dataset.
230 H. Jenny e t al.
For the topographic vector datasets in the ETH Zurich Base Map Collection, the problem was solved in the following way: three main display scale intervals were defined. At small scales the 1:1 Million dataset was used, for medium scales the 1:200 000 dataset was applied and at large scales the 1:25 000 dataset itself was shown in the map window. To give the user the impression of adaptively zooming within the same dataset, care was taken to symbolize themes in the same manner in all three datasets. For example, the same line symbols were used for rivers, administrative boundaries and highways. Within each display scale interval, geometry density again needed to be reduced towards the small-scale end of the interval. Sub-scale intervals were defined based on thematic attributes of the vector dataset. For example, at a 1:400 000 display scale, only highways, two-lane roads and major railways of the 1:200 000 transportation layer are shown. Narrower roads and shorter railroads from the 1:200 000 dataset are gradually added when the user zooms further in on the WYSIWYG map window until the display switches to the 1:25 000 dataset at the appropriate zoom level. Fig. 3 shows the content of the WYSIWYG map window at different scales when the user selects the 1:25 000 topographic vector dataset. A negative side-effect of defining attribute-based intervals for display can be that fragmented or isolated geometry elements appear in the WYSIWYG map window, e.g. when two highways are connected by a narrow road and the latter is not displayed at the current display scale. But since this problem appears rarely and only concerns the display and not the downloadable datasets themselves, it was judged to be acceptable. By using three different datasets for screen representation with harmonized symbolization, the user gets the impression of viewing the same dataset at different generalization levels. At all zoom scales the displayed map is similar in appearance to the downloadable dataset.
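The interval logic described above can be sketched as follows. This is a minimal illustration in Python (the production system was built with SVG/JavaScript and Java); the concrete switch-over denominators and the 1:250 000 sub-scale threshold are assumptions, since the chapter only states that three intervals and attribute-based sub-scale intervals exist.

```python
# Sketch of display-scale interval logic for the WYSIWYG map window.
# Interval boundaries (400_000, 50_000, 250_000) are illustrative assumptions.

def dataset_for_scale(denominator):
    """Pick the source dataset used to render the display at a given scale."""
    if denominator > 400_000:      # small scales: overview map
        return "1:1 Million"
    elif denominator > 50_000:     # medium scales
        return "1:200 000"
    else:                          # large scales: the downloadable dataset itself
        return "1:25 000"

def transport_layers_for_scale(denominator):
    """Attribute-based sub-scale filtering within the 1:200 000 interval.

    Mirrors the example in the text: at 1:400 000 only highways, two-lane
    roads and major railways of the 1:200 000 transportation layer appear;
    narrower roads are added gradually when zooming in."""
    layers = ["highways", "two-lane roads", "major railways"]
    if denominator <= 250_000:     # illustrative threshold for adding detail
        layers += ["narrow roads", "minor railways"]
    return layers
```

Because the three datasets share harmonized symbolization, switching the source dataset at an interval boundary is invisible to the user and reads as continuous zooming.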
To optimize the dataset symbolization for screen display, general web map design guidelines (Jenny, Jenny and Räber 2008) were followed wherever possible; e.g. anti-aliasing was applied to vector data and the size of line and point symbols was adapted for screen display. It needs to be noted that vector datasets can often only be downloaded without a symbolization file. If one interprets the term WYSIWYG narrowly, for vector data the user does not get what is displayed. In a broader interpretation, which is favored in this article, by looking at symbolized vector data in the WYSIWYG map window, the user still learns about many of the characteristics of the dataset that will be included in the download file. Due to this broad interpretation, in the context of this article the WYSIWYG principle is considered to also apply to vector datasets.
Another challenge concerning the display of vector datasets at different scales is encountered when the user needs to include a layer in the download even if it is not shown in the WYSIWYG map window. This is, for example, the case when the user wants to download a layer containing hedges and trees for the entire area of Switzerland. Hedges and trees are only shown at display scales of 1:10 000 and larger and certainly cannot be shown on a country-scale representation. For this purpose special GUI elements were created: a sub-window shows the names of the layers of the downloadable dataset with a check box (Fig. 4). If the check box is turned on, the layer will be included in the download even if it cannot be seen at the current display scale. A crossed-out eye next to the check box indicates that the user needs to zoom in further to see the layer. When the layer can be shown, a legend icon replaces the crossed-out eye. Layers not selected for download are also not displayed on the map.
11.2.1.2 Displaying Raster Data at Varying Scales
For the display of raster datasets in the ETH Base Map Collection, a different approach was chosen. Swisstopo produces topographic raster maps at different scales (between 1:1 Million and 1:25 000), but the maps do not always share the same symbolization. It was therefore not possible, in analogy to the vector data representation, to use a small-scale map for a full country view of a large-scale dataset selected for download. Instead, for each dataset a pyramid of down-sampled images (with the dataset in original resolution at the base) was created. Depending on the display scale, a lower- or higher-resolution image is shown. While this method is not equivalent to cartographic generalization, it does not falsify the appearance of the dataset (Fig. 5). It was also assumed that users would have fewer problems handling raster datasets since layer selection is not necessary. The majority of users are probably familiar with the appearance of digital topographic raster maps, given that they look identical to their paper equivalents.
11.2.2 Interface Flexibility: Allowing the User to Choose the Order of Actions
The WYSIWYG interface for geospatial libraries does not consist only of the WYSIWYG map window. It also offers a number of features that support flexible usage. A pre-defined sequence of actions to arrive at a display of downloadable data is not imposed, as is often the case in wizard-based and search-based interfaces. A flexible WYSIWYG GUI allows the user to follow a variety of paths to arrive at the same result. As a consequence the user gains a better overview of the available datasets and their characteristics and becomes acquainted with the system's capabilities. In the WYSIWYG interface of the ETH Base Map Collection such flexibility was implemented in the following ways:
• Geospatial dataset and location selection can be executed in arbitrary order. This facilitates browsing available datasets and learning about geospatial data.
• A number of complementary location search tools were implemented, supporting users of varying experience levels and interests.
• Because download parameter tuning is a demanding task, it can be approached in two ways: novice users are guided by default values, whereas experienced users may customize download parameters to better fit their needs.
After logging into the ETH Base Map Collection, the user can choose to display a different dataset in the map window or to navigate to a location. Selecting a new dataset is executed by manipulating two drop down menus (Fig. 6). The first drop down menu shows categories of dataset types, e.g. vector data, airborne images, and digital elevation models. The second drop down menu shows available datasets for the selected category. The user can quickly browse through all available datasets. Selecting a name will display the data immediately in the WYSIWYG map window.
Fig. 11.6. Drop down menus for dataset selection: for the selected dataset a description is provided (left); selection of dataset category (middle); dataset selection (right).
Browsing through available geospatial datasets with the WYSIWYG user interface is a learning experience. By looking at the display of different downloadable datasets of the same category, the novice user will discover common characteristics. If not familiar with dataset names and abbreviations, the user will soon be able to form an approximate mental classification of geospatial datasets. Since not all dataset characteristics can be intuitively deduced from their visual representation, a short description of the dataset and its typical use cases are provided (Fig. 6 left). Users who are already acquainted with the ETH Base Map Collection may prefer to navigate directly to a location. Several ways to display an area of interest are offered, serving different experience levels and needs of the users (Fig. 7). Users can directly navigate in the map window or on a small overview map using standard map navigation tools. The zoom function can be operated gradually by moving a slider, in predefined steps by clicking a plus or a minus button, or directly on the map by drawing a rectangle. Users who are not familiar with interactive tools or are looking for a particular location may prefer the text-based gazetteer search. It is a live search that
shows the corresponding matches for every character that the user enters. According to Lauesen (2005, p. 281), this makes the search more intuitive because the user receives immediate feedback. Clicking on a place name displays the selected location in the WYSIWYG map window. It is also possible to provide exact geographical coordinates to speed up the selection process. For users who want to download a certain map sheet, tile grids can be overlaid on the map display. Clicking the map sheet number will select the map sheet's coordinates. By pressing the shift key, the user can select several map sheets. When the user switches to a different geospatial dataset, the display stays at the selected location. This provides the user with the flexibility to determine the order of the selection steps in the way that optimally supports his/her interests.
Sometimes it is necessary to provide different options for expert and novice users, especially where specific background knowledge is required. In the ETH Base Map Collection interface this is the case for download parameter tuning. In the Download Folder tab (Fig. 9), where a user's dataset selections are listed, default values for download format, projection and vector data attribute selection are provided. The default values were defined to meet typical user needs. For example, if the user accepts the default values, vector datasets for download will be delivered in ESRI shape format, in the Swiss projection and without additional attributes. Experienced users have the option to select a different projection or file format, or to add attributes to each layer.
11.2.3 User-friendly Interface Implementation
The WYSIWYG user interface was designed to be as user-friendly as possible. User-friendliness also needs to be taken into account when designing the interface's layout.
When designing the WYSIWYG interface for the ETH Base Map Collection, selected steps of the systematic interface design approach using virtual windows, suggested by Lauesen (2003) and Lauesen and Harning (2001), were followed. Among other guidelines, Lauesen and Harning (2001) suggest:
• to use as few windows as possible,
• to reuse windows,
• to display the same data in one window only.
The interface of the ETH Base Map Collection is a tabbed document interface with only three main tabs: a tab for dataset browsing and selection, a
download folder to save selections and a help page. The tab for dataset browsing is clearly structured into three sections (Fig. 8): (1) the WYSIWYG map window; (2) a vertical section to the right of the map window letting the user choose the datasets and layers; (3) a horizontal section below the map window where the user can navigate to or search for a location. Window reuse was realized in the bottom tabs (navigation and name search) sharing one sub-window, as shown in the first two images of Fig. 7. To select part of a dataset for download, the user places a red rectangle on the WYSIWYG map window that is also used for dataset browsing.
Fig. 11.8. Dataset selection interface in the ETH Base Map Collection: map window (middle), content selection (right) and location selection (bottom) sections. The rectangle on the map window defines the extent of the download area.
The interface designer can help to avoid disappointment on the user's side by respecting the following general guidelines for user interface implementation:
• to give warning messages and block unsupported actions early,
• to allow state or result saving,
• to call on the user's patience and time conscientiously.
When working with a geospatial map collection, warning messages are typically needed when the user has insufficient download permissions or exceeds the allowed file size for download. In the ETH Base Map Collection interface care was taken to warn the user as early as possible. For example, to avoid overly large files, which would block the server process and could not be reasonably handled by the user, a maximum area for download was defined for every dataset. The size of the selection rectangle is displayed in red if the selection exceeds the allowed size. To make full use of the maximum allowed download area, the user can easily adjust the rectangle until its size is displayed in black. Attempts to add oversized selections to the download folder are blocked, and an alert message is shown. Many geospatial dataset collection interfaces also do not offer the option to interrupt the data selection process and to save the current map display including current settings. Often it is also necessary to keep the application open in the web browser and to wait until the file extraction process has finished. With the ETH Base Map Collection interface, the user can persistently save up to ten map definitions containing information on selected datasets, layers and area. File extraction can run simultaneously with dataset browsing and does not stop when the application is closed. A status field informs the user about file extraction progress (Fig. 9).
Fig. 11.9. Download Folder of the ETH Base Map Collection showing a user's maps for download. Download parameters can be selected by the user before starting file extraction. Default values are provided, but can be replaced by customized options. The status column informs the user about file extraction progress.
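The early-warning behaviour described above (the red rectangle and the blocking of oversized selections) can be sketched as follows. This is an illustrative Python fragment; the limit value, the unit (km²) and the function names are assumptions, as the chapter does not state the concrete limits.

```python
# Sketch of the early-warning logic for oversized download selections.
# Per-dataset limit value and unit (km²) are illustrative assumptions.

MAX_AREA_KM2 = {"1:25 000 vector": 2500}

def rectangle_color(dataset, area_km2):
    """The selection rectangle size is drawn red as soon as the limit is exceeded."""
    return "red" if area_km2 > MAX_AREA_KM2[dataset] else "black"

def add_to_download_folder(dataset, area_km2, folder):
    """Oversized selections are blocked before any server-side work starts."""
    if area_km2 > MAX_AREA_KM2[dataset]:
        return "alert: selection exceeds maximum download area"
    folder.append((dataset, area_km2))
    return "added"
```

Checking the limit while the user drags the rectangle, rather than after submission, is what makes the warning "as early as possible".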
A WYSIWYG Interface for User-Friendly Access 237
11.2.4 Technical Implementation
This section gives a brief overview of the three-tier architecture used for this client-server system. The interface of the ETH Base Map Collection was implemented in Scalable Vector Graphics (SVG), XML and JavaScript. Reusable widgets (e.g. tab groups, selection lists) were defined that are resized when the user changes the browser dimensions. When the user logs into the client application, information on available layers and attributes for each dataset and a list of projections and formats are requested from the server. This information dynamically initializes the GUI according to the available data. On the server side, the ETH Base Map Collection relies on a geodatabase (a relational geo-spatial database and ESRI ArcSDE), a map server (ESRI ArcIMS), the Apache web server and Apache Tomcat. Requests from the client are processed by a Java servlet, which decouples client and server. Depending on the parameters sent by the client, the servlet forwards the client's request to ArcIMS, ArcSDE or the database system. To communicate with ArcIMS the servlet translates requests into ArcXML; it uses Java Database Connectivity (JDBC) to send SQL statements to the database. The servlet returns a text string or an HTTP link to the client referring to the dynamically created result and, if necessary, creates a zip archive containing the selected data.
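The servlet's role as a decoupling layer can be illustrated schematically. The sketch below is a Python analogue written for brevity (the real component is a Java servlet); the request-type names and routing rules are assumptions, and the ArcXML/SQL payloads are placeholders only.

```python
# Schematic analogue of the Java servlet's routing role: the client never
# talks to ArcIMS, ArcSDE or the database directly. Request-type names and
# routing rules are illustrative assumptions.

def dispatch(request):
    """Forward a client request to the appropriate back-end component."""
    kind = request["type"]
    if kind == "map_image":
        # the real servlet would translate this request into ArcXML for ArcIMS
        return {"backend": "ArcIMS", "payload": "<ARCXML ...>"}
    elif kind == "extract":
        # data extraction for a selected area is handled via ArcSDE
        return {"backend": "ArcSDE", "payload": request["area"]}
    elif kind == "metadata":
        # the real servlet would send an SQL statement over JDBC
        return {"backend": "database", "payload": "SELECT ..."}
    raise ValueError("unsupported request type")
```

Because the client only ever sees the servlet's responses (a text string or an HTTP link), back-end components can be exchanged without changing the client.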
11.3 Conclusion
The WYSIWYG interface is a new type of interface to geospatial dataset collections. Compared to other interfaces, it offers the advantage of catering equally well to expert and novice users. The most important characteristic of this new type of interface is the WYSIWYG map window. It displays scale-adapted representations similar in appearance to the downloadable datasets, allowing for visual browsing. A strict order of actions is not imposed by the interface. Instead, the user can choose between many paths to arrive at a dataset selection for download, giving precedence to the order or tool that best suits his/her interests and experience level. A user-friendly window design and implementation add to making the WYSIWYG interface a well-suited solution for a user group with varying experience levels. The ETH Zurich Map Collection, which successfully uses a WYSIWYG interface implementation, will in the future be extended to also include historical maps of Switzerland. These maps are not always available for all parts of Switzerland and also include datasets from varying years. It will be intriguing to integrate access functions for such datasets into the WYSIWYG interface.
11.4 References
EDINA 2009, Login to Digimap Collections, viewed 7 January 2009, .
EDINA n.d., Digimap Collection Service Demonstration Slide Shows: Digimap – OS Collection, viewed 7 January 2009, .
Harvard University Library n.d., The Harvard Geospatial Library, viewed 7 January 2009, <http://peters.hul.harvard.edu:8080/HGL/jsp/HGL.jsp>.
Jenny, B, Jenny, H & Räber, S 2008, 'Map design for the internet', in International Perspectives on Maps and the Internet, ed. Peterson, MP, Springer, Berlin Heidelberg New York, pp. 31-48.
Lauesen, S & Harning, MB 2001, 'Virtual windows: linking user tasks, data models and interface design', IEEE Software, July/August, pp. 67-75.
Lauesen, S 2003, 'Task descriptions as functional requirements', IEEE Software, March/April, pp. 58-65.
Lauesen, S 2005, User interface design: a software engineering perspective, Pearson Education Limited, Harlow, England.
MIT GIS Services n.d., GeoWeb (public version), viewed 7 January 2009, .
Sutton, E, Medyckyj-Scott, D & Urwin, T 2007, 'The EDINA Digimap Service – 10 years on…', The Cartographic Journal, vol. 44, no. 3, Ordnance Survey Special Issue, pp. 268-275.
Section IV Pragmatic Considerations
12 Considerations on the Quality of Cartographic Reproduction for Long-Term Preservation ......... 241
Markus Jobst
13 Issues of Digitization in Archiving Processes ......... 257
Zdeněk Stachoň
14 Digitized Maps of the Habsburg Military Surveys – Overview of the Project of ARCANUM Ltd. (Hungary) ......... 273
Gábor Timár, Sándor Biszak, Balázs Székely, Gábor Molnár
15 The Archiving of Digital Geographical Information ......... 285
Peter Sandner
16 An Intellectual Property Rights Approach in the Development of Distributed Digital Map Libraries for Historical Research ......... 295
A. Fernández-Wyttenbach, E. Díaz-Díaz, M. Bernabé-Poveda
12 Considerations on the Quality of Cartographic Reproduction for Long-Term Preservation
Markus Jobst Research Group Cartography, Vienna University of Technology [email protected]
Abstract
When thinking about the sustainability of cartographic heritage, two main questions have to be considered. The first asks for the best, high-quality processes to save historical maps and documents and to keep them accessible in the far future. These aspects generally aim at reproducing and copying the original document in order to disseminate material for ongoing working processes. The second question focuses on solutions for saving, and keeping accessible, today's cartographic products and states, including multimedia and Internet applications. This contribution focuses on the first question of preservation in cartographic heritage and its sustainability, and presents some considerations on current and alternative reproduction methods for historical maps. Using the essence of the small project HYREP (hybrid reproduction) – in which currently available reproduction methods and techniques were identified, their applicability to fragile documents and maps was considered, alternative possibilities of hybrid reproduction were developed and a rough comparison of the reproduced quality of fine line art in maps was made – an exclusively digital approach to saving cartographic heritage is scrutinized and opened for discussion.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_12, © Springer-Verlag Berlin Heidelberg 2011
242 M. Jobst
12.1 Introduction
Digital reproduction methods for cartographic products offer new possibilities in economical working processes and dissemination techniques. With digital processes, the costs of reproduction are mostly reduced to working time and manpower; material costs have almost disappeared. In addition, dissemination is supported by the Internet and digital storage media. Never before has it been so easy to distribute copies of cultural objects and to supply these copies with additional information or interpretations. These are the main arguments for digital technologies in cultural heritage. But what about storage costs taking the place of material costs, the consequences of technological change, or the loss of digital data? One main reservation about digital technologies is their lack of durability, which results from the accumulation of various factors such as the readability of data formats, the accessibility of storage media and required applications, or changes of operating systems. This contribution focuses on "hybrid" (analogue and digital) reproduction of detailed maps in the context of its quality and sustainability. With the understanding that a reproduction of an original map may serve as a "working copy" and as a template for dissemination, the quality of this reproduction should be high and its long-term persistence should be ensured. Under these circumstances the original can be preserved and the copy will serve as the basis for further work.
12.2 Importance of Map Content as Cartographic Heritage One main use of cartographic products seems to be decision support. Information retrieved should give some input in order to make the right decision e.g. in navigation. Generally the use of maps may be seen much wider, when thinking of maps as documentation tool for a specific spatial based condition by political or physical manner. Then historical maps may help to access past states and political situations, which help to expand the individual and social knowledge. Physical situations name topographic conditions like wooded areas, courses of rivers or traffic situations (to mention some of a long list). Wooded areas in historical maps may help to identify virgin wood and original vegetation. Traditional/natural courses of rivers may also be explored with old maps. Nowadays both topographic themes gain importance due to
the results of human intervention in natural systems, which sometimes express themselves in floods and similar environmental hazards. Additionally, the influence of increasing traffic on the environment can be studied through the development of topographic situations in historic maps.
A political situation often influences map design and content. Depending on politics, map signatures change because of ideological beliefs, military secrets or politically influenced expression. For example, some regions in Germany possess various maps of the same area with multiple signature catalogues. The reason for the change of signature lies in the political system. Under communism some map elements, like industrial zones, were considered to be top secret. This secret status changed over the course of history. Thus the comparison of maps from different time states will picture the appearance and disappearance of these map elements, although the real-world object always existed. An additional example concerning military secrecy is the deliberate distortion of map sheets in order to hide real-world topography or to express the political dominance of specific regions (some examples may be found in Monmonier 2002). The results of politically motivated influence on map-making may be very misleading if the influencing element (e.g. the political system) is not identified or simply unknown. But if the influencing element is known, then the expression of a map, the signature used and the available metaphors may help to identify and interpret further mechanisms of this political system and by this means document historic values. To the same extent, map use may have an impact on the signature of maps. For example, the replica of the Roman map, the Tabula Peutingeriana, shows settlements according to hospitality and was designed as a paper roll in order to be used while traveling (ONB 2010).
From a historical point of view, maps can tell a lot about past and unknown circumstances, either by means of topography or through the influence of policy, culture and pragmatics on the content. Thus the exploration of the past can, to some extent, be supported by accessing this documented information in maps. This is the main reason for, and the importance of, saving cartographic heritage in high quality. Furthermore, these facts give arguments for expanding cartographic heritage to the digital domain for dissemination purposes.
12.3 Overall Considerations on Sustainability of Digital Technologies
The latest technical developments make the reproduction and digitalization of map sheets cheap and easy. This economic factor and easy dissemination seem to strengthen numerous initiatives for the digitalization of rare books and maps (e.g. Octavo 2010). In most cases the resulting products become available in the form of CD-ROM, DVD or via the Internet (Google 2010). Basically, the sustainability of these digital technologies has to be considered: is digitalization the right way to save printed heritage? There are several severe problems with long-term archiving of digital data concerning data formats, operating systems, used applications, media lifetime or hardware prerequisites. All of these factors affect each dataset or medium (Borghoff 2003).
The data format codes information and makes it readable for applications. This coding can be in binary or text-based (ASCII) form. Binary formats need an interpreter, a specific piece of software, that decodes and extracts the information in any use case (viewing, editing or using the information).
The application acts as the interpreter for data formats and enables access to information with the programmed functionality. Depending on the functionality of this application and the related development expense, various business models may allow freely available source code and its individual adaptation. These Open Source initiatives are a possible way to make today's applications accessible in the future (OS 2010).
The operating system forms the basis for executing a file or program that interprets the coded information. New developments in operating systems adapt to new hardware components. This may cause incompatibilities with existing interpreters; in many cases old interpreters are no longer usable on new operating systems.
Media lifetime spans various aspects of computer development.
On the one hand, media lifetime concerns the durability of the medium itself; on the other, it is related to the development of hardware devices (Borghoff 2003). The ongoing development of hardware devices results in specific hardware prerequisites for playing specific media. While in the late twentieth century the floppy disk was the most mobile storage device and magnetic tape devices were used to store large amounts of data, a few years later ZIP disks
and magneto-optic media shaped data storage. Nowadays flash drives with USB connectors, as the most mobile and sizable storage media, and the DVD, as the most commonly used medium for large amounts of data, shape data storage activities. This development of storage devices within the past ten years makes the rapid change of devices clear (Bergeron 2002). In addition, redundant storage management should be considered for sustainable data access because of drive failures and occurring damage.
These five almost independent factors give a coarse overview of the influences on the sustainability of digital data. Of course, traditional archival considerations of shelf life, temperature and humidity are still valid, because these "traditional" factors influence media/storage-layer lifetime as well. Resulting from the technical dependencies and the mentioned state-of-the-art methodologies to overcome accessibility problems, it has to be stated that digital data do not seem to be a place for safe custody of cartographic heritage (by traditional means: storing data and accessing it after some decades or centuries) (Dörr 1999). Digitalization would be no appropriate solution if the task of sustainable heritage management were defined solely as archiving. Too many factors affect the lifetime and accessibility of data. Additionally, digitalization methods are developing very fast: the growth in resolution and quality is almost as high as the growth in storage capacity. But when we think of transmitting knowledge, recollecting historic information and building up consciousness, most digital characteristics can support these aims. Especially the lossless reproduction of digital data, its easy dissemination and its extension with multimedia components are the most important advantages for cultural knowledge transmission.
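The chapter only mentions redundant storage management as a safeguard against drive failures; a common supporting practice, added here purely for illustration, is a checksum-based fixity check that detects when one of the redundant copies has silently degraded.

```python
# Checksum-based fixity check across redundant copies (illustrative only;
# not described in the chapter). SHA-256 is used as an example digest.
import hashlib

def checksum(data: bytes) -> str:
    """Compute a fixed-length fingerprint of a stored object."""
    return hashlib.sha256(data).hexdigest()

def verify_replicas(replicas):
    """Return True if all stored copies still carry identical content."""
    digests = {checksum(blob) for blob in replicas}
    return len(digests) == 1
```

Run periodically, such a check turns silent bit rot on one medium into a detectable event while intact copies still exist for repair.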
These overall considerations on the sustainability of digital technologies lead to two main tasks of digitalization and digital data: lossless reproduction of copies and support of effective dissemination. At the moment, a digital reproduction does not seem to be an archivable object in terms of long-term preservation. Thus the focus of archiving and preservation should be on the original object. An essential need for digital reproduction may be seen in the creation of digital working copies, which depends on digital or hybrid reproduction methods.
12.4 Digital Reproduction Techniques
As with analogue reproduction techniques, sensors, light sources and the source object are needed for digital reproduction. Whereas analogue reproduction uses sensitized film material and various filtering mechanisms in order to capture the colors of the artwork, digital techniques depend on the sensitivity of the sensor used. The technical design of scanners mostly depends on the characteristics of their sensors. Generally, CCD or CMOS sensors are sensitive to a wide bandwidth, reaching from 400 nm to 2400 nm, which has to be restricted to about 700 nm with specific elimination filters. Even so, the sensors remain more sensitive to the red component. This high sensitivity in the red and near infrared causes a loss of the blue component when monochromatic reproductions are made. The use of blue light sources is one simple way to counter this effect.
Colored reproductions require much more detailed filtering, which results in various arrangements of sensors. In contrast to film material, which uses three filter layers for the absorption of the color spectrum, most sensors have to be filtered three times, which results in a three-pass system, or one pixel is calculated by merging three sensor elements, which is called a one-pass system. It can be clearly seen that a one-pass system needs a three times higher density of sensor elements in order to reach the same resolution as the three-pass system. Various dispositions of the sensor elements result in different effects in the resulting picture (e.g. a diagonal orientation like the Super CCD of Fuji (Fuji 2010)). The creation of the X3 CMOS sensor by the company Foveon (Foveon 2010) in 2002 introduced the possibility of imitating film by stacking the filter layers one over the other. At the moment the CMOS sizes are still too small to be used for high-quality map reproduction.
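The arithmetic behind the one-pass versus three-pass comparison can be made explicit. The small sketch below simply restates the text's claim (three elements merged per pixel in a one-pass system) in code; the function name is chosen here for illustration.

```python
# One-pass vs. three-pass sensor density: a three-pass system reads each
# element three times (once per color filter), while a one-pass system
# merges three differently filtered elements (R, G, B) into each pixel.

def elements_needed(pixels, system):
    """Sensor elements required to deliver a given pixel resolution."""
    if system == "three-pass":
        return pixels          # each element is exposed once per filter pass
    if system == "one-pass":
        return 3 * pixels      # three filtered elements merged per pixel
    raise ValueError(system)
```

For the 8500-element sensor row mentioned below, a one-pass design would thus need 25 500 elements to match a three-pass row's resolution.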
Large format scanners (flatbed scanners, area or plot scanners) use three rows (trilinear scan sensor) or only one row of sensors. This row uses either a rotating prism or LED light sources (LIDE technique) for filtering. The one-row system is generally called a Contact Image Sensor (CIS). It uses no lens to focus the sampled data on the field of sensors and is thus almost maintenance-free (Krautzer 1999). The latest developments show effective resolutions of up to 2400 dpi.
Camera scanners can be described as digital "film packs", which use the main components of flatbed scanners (row sensor, spindle, stepping motor, A/D converter and cache). In most cases the sensors are built into cases that are used like a film holder. The principal arrangement of the sensors can
Considerations on the Quality of Cartographic Reproduction 247
be a row (for high-quality reproductions of art and still life) or an array (for photographic tasks where fast one-shot scans are used). Camera scanners were conceived as non-contact scanners that can use the whole sensor capacity for recording. A typical sensor row in camera scanners is about 7 centimeters long with 8500 sensor elements. The latest available systems (such as those of Cruse (Cruse 2010) or METIS (Metis 2010)) combine an artwork holder, light system and camera scanner with a fixed resolution of 14000 x 22000 pixels on a maximum object size of 40'' x 60'' (approx. 102 x 152 cm). A simple resolution calculation suggests that large format scanners would be appropriate for digitizing large format objects with fine details, such as copperplate prints. But when working with cartographic heritage, ancient manuscripts or vintage books, any contact with the object is critical and may damage it. Contact ranges from skin (bacterial and acid influence) to mechanical influence (human and machine). A classification into contact and contact-free reproduction techniques/scanners therefore seems useful. Large format scanners use rollers to transport the artwork over the row sensor. Their mechanical influence on the surface of the artwork makes this system unusable for unique historic objects. Drum scanners can be judged in the same way: the artwork, which has to be pliable, is fixed on a cylinder from which the sensor reads data by rotation. Camera scanners follow the concept of contact-free reproduction as it was used with analogue techniques. In this case the projection onto the film or sensor plane is subject to possible distortions of the camera lens, especially geometric and chromatic aberration. This influence has to be considered in further analysis of reproduction quality, especially when no specific reproduction lens is used.
12.5 Hybrid Reproduction Issues

Reproduction equipment was the subject of intense development in the last century, resulting in high-quality lenses almost free of aberrations and large camera bodies with various adjustment options for filtering and geometric correction. Considering the resolutions available in contact-free scanners raises the need for better alternatives. One of the most highly developed scanners peaks at 14000 x 22000 pixels for the sensor and about
248 M. Jobst
160 x 240 cm for the object holder. Using this maximum extent in a simple resolution equation (maximum pixel count divided by maximum object extent in inches) yields approximately 240 dpi. This value is by any measure too low for high-quality reproduction at scale 1:1. Reducing the object size to an average of 84 x 119 cm (A0) leads to a resolution of approx. 450 dpi. This value is sufficient for the digital reproduction of pictures such as paintings and photographs, which consist of smooth transitions, but it hardly suffices for hard-contoured lines as found in copperplate prints. A further limitation to 42 x 60 cm (A2) ends up at approx. 900 dpi. This value nearly meets the requirement for a digital 1:1 reproduction of line art, at which visible pixel structures disappear. The search for alternatives to existing digital reproduction systems leads to the design of hybrid methods, especially for line-art-based masters such as copperplate prints. These methods should combine traditional/analogue techniques with digital ones. The aim is higher resolution, sustainable accessibility of the working copy, and still high-quality reproduction (sufficient for scales of 1:1 or even larger).

12.5.1 Quality issues when reproducing line art

Useful quality descriptions focus on the intended aim of the reproduction process. The final size of the reproduction determines the required resolution. For pictures, a minimum of 300 dpi at the resulting size is needed to eliminate visible pixels of the digital image and to have enough information for the screening process in print, which uses several pixels for one printing point (Prieß 1995). For line art the minimum value is much higher: 1200 dpi at the resulting printing size are needed for bitmaps (black-and-white images) or lines in order to make raster "steps" invisible (Limburg 1997). At this resolution the pixels are so small that two adjacent pixels cannot be distinguished from one another.
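The simple resolution equation used above (pixel count divided by object extent in inches) can be checked directly. The sketch below reproduces the chapter's rounded values and compares them with the 300 dpi (pictures) and 1200 dpi (line art) thresholds:

```python
CM_PER_INCH = 2.54

def scan_dpi(pixels: int, extent_cm: float) -> float:
    """Resolution when a sensor of `pixels` covers an object of `extent_cm`."""
    return pixels / (extent_cm / CM_PER_INCH)

# Long side of the 14000 x 22000 px sensor over three object sizes:
for label, cm in [("full holder, 240 cm", 240),
                  ("A0 long side, 119 cm", 119),
                  ("A2 long side, 60 cm", 60)]:
    dpi = scan_dpi(22000, cm)
    verdict = "ok for pictures" if dpi >= 300 else "too low"
    if dpi >= 1200:
        verdict = "ok for line art"
    print(f"{label}: {dpi:.0f} dpi ({verdict})")
# ≈ 233 dpi, 470 dpi and 931 dpi — the rounded 240 / 450 / 900 dpi of the text;
# none of the three reaches the 1200 dpi demanded for line art at scale 1:1
```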
On the other hand, the resolutions of print and printing film have to be considered. In print, lines should not be thinner than 0.3 pt to assure a constant line width; smaller values result in broken or uneven lines, depending on the paper used. The resolution of printing film, which is the starting material for the production of printing plates (assuming the traditional reproduction process is in focus and not a computer-to-plate workflow, where digital images are sent directly to the printing plate), is about 1000 lines/mm at a hard gradation (contrast 1000:1). This value may vary with
gradation, the chemicals used and the film emulsion. Compared with the postulated value for digital line art, approx. 50 lines/mm (1200 dpi), printing film offers enormous resolution potential for further enlargements. These deliberations presuppose a chosen final size of reproduction. But what if this size cannot be determined because a prospective use cannot be specified? This is the case when digitization is done for sustainable archiving. Then either an assumed size for further use has to be specified, or a method providing potential for a wide variety of uses has to be chosen. An intended final size is often calculated from the available storage media and their capacity. Thus the available storage media seem to dimension the possible resolution, alongside hardware configuration (scanner) or operating system limits. Recalling that storage media roughly double their capacity every two years should prompt a review of any such decision, especially in digitization projects with a working time of several years or even decades.

12.5.2 The intermediate product – a long-term working copy

A digitization method without a superficial constraint on further usage may have to take a hybrid approach. This approach follows traditional reproduction in order to obtain an intermediate product in the form of printing film. The intermediate product forms the starting point for digitization, which can be adapted to current scanner developments as well as to the prospective use of the digital format (adaptation of quality). Additionally, the intermediate product serves as the working copy for content-based analysis (media-based analysis, such as examination of the material, will still require the original artwork); therefore redundant archiving (the original artwork plus the intermediate copy) is suggested. Such redundant archiving is a method frequently practiced by libraries to save the content of their inventories.
The favored film material is "large format" microfilm (microfiche), whose lifetime can be estimated at about 500 years under controlled conditions (Kodak 2010, Ilford 2010). Black-and-white (B/W) microfilm technology has been used in libraries for decades. The characteristic of B/W microfilm is its hard gradation, which does not reproduce greyscales. In contrast to B/W, color microfilm reproduces colored masters according to film-specific color curves (matching schemes). The characteristic requirements for microfilm also apply to color material: high resolution, color stability and archival stability. The specification of "high resolution" for microfilm differs from that of commercial film material: it means a high resolving capacity on the one hand and fine grain on the other. The fine grain is responsible for the sharpness of edges; at the same time, the speed of the film is as low as 1 ASA. Such microfilm generally delivers 300 line pairs per mm, which is equivalent to a digital resolution of 8000 dpi, or 44000 x 31500 px at 10.5 x 14.8 cm (Kenning 2004). Color and archival stability are aimed at 500 years. This stability can be achieved with the bleaching technology called "Cibachrome", where all color layers already exist on the film and are bleached after exposure. Even after four years in extreme test environments (temperature of 70°C and humidity of 50%), color distortions were not measurable (Grubler 2010). In addition to the simple reproduction of visible masters on microfilm, this material is also used to create permanent visual archives, in which various kinds of digital data are visibly coded as 2D barcodes and exposed onto film (Peviar 2010). Current film materials as well as scanning and exposure technologies support the concept of a long-term copy. The main cost factors are master preparation and metadata creation, not the scanning and exposure procedure itself. Furthermore, the expense of the film material should amortize rapidly if all activities required by a digital archive are taken into account. Based on these arguments and the preceding considerations, a test case was established to compare hybrid reproduction with an exclusively digital one.
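The microfilm figures quoted above (Kenning 2004) are consistent with converting one line pair to one sampled dot, i.e. dpi ≈ line-pairs/mm × 25.4, rounded to "8000 dpi" in the text. A sketch of the frame-size calculation under that assumption; the helper names are chosen here for illustration:

```python
MM_PER_INCH = 25.4

def microfilm_dpi(line_pairs_per_mm: float) -> float:
    # one sampled dot per line pair — the approximation implied by the text
    return line_pairs_per_mm * MM_PER_INCH

def frame_pixels(dpi: float, width_cm: float, height_cm: float):
    """Pixel dimensions of a frame of the given physical size."""
    to_px = lambda cm: round(dpi * cm / 2.54)
    return to_px(width_cm), to_px(height_cm)

dpi = microfilm_dpi(300)            # ≈ 7620 dpi, quoted as roughly 8000 dpi
print(frame_pixels(dpi, 14.8, 10.5))
# ≈ (44400, 31500) px — the 44000 x 31500 px at 10.5 x 14.8 cm of the text
```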
12.6 The Test Case and its Comparison

This section presents the results and highlights the advantages of the compared methods. The test material consisted of copperplate prints of the 18th century showing parts of Austria. On the one hand, these prints contain old riverbeds of Upper Austria, and thus content of importance to an interested planning consortium; on the other hand, they are formed by very fine line artwork in the rivers, woods and object symbols, which enables a direct comparison of the chosen techniques.
Fig. 12.1. Direct comparison of camera scanner (left side) and hybrid method (right side). This picture is reduced in size for publication purposes.
Two set-ups were prepared for the reproduction test. Set-up one used a state-of-the-art high-end scanning device. The initial idea was a direct comparison with a scan made in one pass, not the production and post-processing of a digital mosaic. A mosaic would of course deliver higher resolution, but it also requires enormous post-processing resources.
The second set-up aimed to produce an intermediate product in the form of printing film, which was then used for digitization. The scanning process deliberately used a low-cost flatbed scanner in order to assure easy interchangeability with future scanner developments. Misalignments and possible geometric distortions were not investigated in this attempt, but are considered in the critical discussion of the results.

12.6.1 Procedure of set-up one

The same copperplate prints that were used for the hybrid process were the source for a purely digital reproduction process. For this process a camera scanner with a resulting file size of 10500 x 15000 pixels was used. The framework for digitization was to record one map sheet in a single pass. The creation of mosaics with digital mounting in a post-processing step would certainly have produced higher resolution, but it is not adaptable to large volumes of data from an economic and archival point of view. Within this framework the maximum acquirable resolution was 400 dpi. One example of this process is shown on the left side of figure 12.1.

12.6.2 Peculiarities of set-up two

Set-up two has some peculiarities to consider owing to the film processing. The production of the film material, the intermediate copy, is characterized by exposure time, filtering, film material and the chemicals used. The exposure time influences the thickness of lines on the film. The correct exposure time reproduces even the finest lines in detail; at the same time there should be no "growing" of thick lines, a kind of blooming effect. Additional filtering enables the removal of colored parts so that only the line-based content is exposed onto the film. The film itself defines the gradation and thus the number of steps from black to white. This gradation, to some extent, also determines the resolution of the film.
High contrast results in higher resolution, which is based on the crystal structure of the emulsion (Hübner 1986). The chemicals for film development and fixation also influence the gradation. Depending on the reaction of the developer chemicals with the exposed emulsion on the film, contrast and resolving power may be higher or lower.
12.6.3 Procedure of set-up two

In order to have manageable analogue copies for the subsequent "low-end" scanning, the extent was reduced to 50% of the original map sheet. No negative consequence for resolution quality was expected, because all considerations indicated sufficient potential in the hard-gradation film. The scanning of the intermediate copy again offers comprehensive influence on the gradation. This process aims at employing all usable storage units (bits and bytes) with almost no loss of information between the original (intermediate copy) and the digital copy. The gradation of the scan therefore has to be adapted to the film gradation in such a way that the whole range of tonal values is used. It must be noted that only information present on the original can be digitized: failures of the analogue film exposure cannot be remedied in the digital process. For the digitization itself, only 2400 dpi were used. Rescaling to the original size, reversing the reproduction-camera reduction, then resulted in an effective 1200 dpi for the scanned image. An example of this method can be found on the right side of figure 12.1.

12.6.4 Comparison of the test case

The direct comparison of the state-of-the-art digital process with the hybrid method of scanning the film reveals visual quality differences (figure 12.1), especially when details are enlarged. The figure therefore shows sections of 10 x 10 cm, 5 x 5 cm and 2 x 2 cm. The result confirms the simple resolution calculation made in the previous section: at present, direct digital reproduction cannot provide enough quality for fine line art on large map sheets, especially in terms of digital long-term preservation. The genesis of figure 12.1 should be briefly explained at this point, because digital procedures have the potential to distort objective results and to produce outcomes that support a specific argumentation.
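The effective resolution of set-up two follows directly from the 50% photographic reduction: scanning the half-size film copy at 2400 dpi and rescaling to the original extent halves the resolution relative to the original. A minimal check:

```python
def effective_dpi(scan_dpi: float, reduction_factor: float) -> float:
    """Resolution relative to the ORIGINAL after scanning a reduced copy.

    reduction_factor: copy size / original size (0.5 for a 50% reduction).
    """
    return scan_dpi * reduction_factor

print(effective_dpi(2400, 0.5))  # 1200.0 dpi — the value reported in the text
```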
Both results, left and right, are "as is" pictures. This means the achieved result was not further processed by filtering or by changing brightness and contrast. The camera scan (left side of figure 12.1) might appear more brilliant with a contrast adjustment, but this would bias the direct comparison. The digitization of the intermediate copy (right
side of figure 12.1) seems to have a higher contrast. This is the result of the precise adaptation of the scanning process, not of a post-processing step. A post-processing step generally loses information stored in the pixels and should therefore be avoided. The starting point of figure 12.1 was the highest quality of the camera scanner (which is still the quality of this digital image). From this, sections were extracted, enlarged and set to the same extent in order to visualize the quality advantages. The same steps were applied to the hybrid result, for which the 10 cm and 5 cm extractions had to be reduced in size (pixel extent) in order to use the same area on the sheet.
12.7 Conclusion and Further Aspects

As a result of this investigation it can be stated that analogue (film-based) methods still have an important role in preserving cartographic heritage in a hybrid manner. Moreover, new perspectives open up for the preservation of digital cartography through initiatives on permanent visible archives that use microfilm for "barcode storage". It becomes important to distinguish between use-oriented and archiving-oriented processes. Use-oriented digitization processes aim at wide dissemination via various communication media. For this purpose the resolution of digitized maps is chosen according to the requirements of the transmitting media; in some cases it will be higher if further uses, such as photographic print-outs, are considered. Any result of an archiving-oriented process should remain accessible and adequately usable far into the future. "Adequately usable" means that the resolution requested by a future application cannot be determined today. This contribution therefore argues that a master should be reproduced in 1:1 quality, meaning no quality loss due to reproduction. The uneconomical production of intermediate products on film, which again consumes material, time and manpower, can only be justified by the "safe way" of preservation and the low cost of maintenance. The intermediate product serves as a long-term working copy and enables strict archiving of the original source. The digital way of reproduction seems very economical, but the complexity of the effort required for archival purposes is often overlooked (media storage, device availability etc.). A productive and pragmatic applicability for
archival (long-term) purposes still has to be worked out in order to enable rapid digitization, sustainable preservation and long-term dissemination of the digital cartographic heritage. Independently of the reproduction process, all participants in cartographic heritage have to be conscious of the importance of maps and the uniqueness of original material. Throughout the centuries, reproductions and facsimiles were made. Often these copies are today the only source for extracting historic topographic or political content; their status has transformed into that of a master. But some questions, e.g. concerning material, production techniques and semiotics, can only be answered with the source material.
12.8 Acknowledgements

The author thanks all participants for their support of this project. Special thanks go to the Federal Office of Metrology and Surveying, which made the direct comparison with a state-of-the-art scanner possible; to the Institute of Cartography at the University of Dresden, which provided film material, helping hands and access to the camera; and especially to the City of Vienna, whose funding of the project HYREP (H 1071 / 2004) made this analysis possible at all.
12.9 References

Bergeron B. P. (2002) Dark Ages 2: When the Digital Data Die; Prentice Hall PTR, New Jersey
Borghoff U.M., Rödig P., Scheffcyzk J., Schmitz L. (2003) Langzeitarchivierung – Methoden zur Erhaltung digitaler Dokumente; dpunkt Verlag, Heidelberg
Bormann W. et al. (1961) Kartenvervielfältigungsverfahren; Arbeitskurs Niederdollendorf 1960, herausgegeben von der Deutschen Gesellschaft für Kartographie, Bonn
Cruse (2010) visited April 2010, source:
Dörr M. (1999) Langzeitarchivierung digitaler Medien, Arbeitsbericht für den Zeitraum 1.1.1999 – 30.6.1999. Techn. Bericht 554922(3)/98 BIB 44 Musb 01-03. Bayerische Staatsbibliothek
Foveon (2010) visited April 2010, source:
Fuji (2010) Fujifilm, visited April 2010, source:
Google (2010) Google digital book collection, visited April 2010, source:
Grubler (2010) Grubler Imaging and Mikrosave, visited April 2010, source:
Helbig T., Bosse R. (1993) Druckqualität – Grundlagen der Qualitätsbewertung im Offsetdruck; Polygraph Verlag, Frankfurt a. Main
Henning A. P. (2000) Taschenbuch Multimedia; Fachbuchverlag Leipzig, München – Wien
Hübner G., Junge K.W. (1986) Fotografische Chemie – aus Theorie und Praxis; VEB Fotokinoverlag, Leipzig
Ilford (2010) visited April 2010, source:
Jüptner B. (1987) Die rechnergestützte kartographische Entzerrung mit der Reproduktionskamera KLIMSCH PRAKTIKA ULTRA KT80; Diplomarbeit am Institut für Kartographie und Reproduktionstechnik, TU Wien, Wien
Kenning Arlitsch, John Herbert (2004) Microfilm, Paper, and OCR: Issues in Newspaper Digitization; Microform & Imaging Review 33(2): 59–67
Kipphan H. (2000) Handbuch der Printmedien – Technologien und Produktionsverfahren; Springer, Berlin Heidelberg
Kodak (2010) visited April 2010, source:
Krautzer W. (1999) Digitale Fotopraxis, Leitfaden für Einsteiger und Profis; Public Voice, Wien
Leibbrand W. (1990) Moderne Techniken der Kartenherstellung; Ergebnisse des 18. Arbeitskurses Niederdollendorf 1990 des Arbeitskreises Praktische Kartographie, Kirschbaum GmbH, Bonn
Limburg M. (1997) Der digitale Gutenberg – Alles was Sie über digitales Drucken wissen sollten; Springer, Berlin Heidelberg
Loos H. (1989) Farbmessung – Grundlagen der Farbmetrik und ihre Anwendungsbereiche in der Druckindustrie; Verlag Beruf und Schule, Itzehoe
Metis (2010) visited April 2010, source: <www.metis-digital.com>
Monmonier M. (2002) How to Lie with Maps, second edition; The University of Chicago Press, Chicago and London
Müller G. M. (2003) Grundlagen der visuellen Kommunikation – Theorieansätze und Analysemethoden; UVK Verlagsgesellschaft mbH, Konstanz
Octavo (2010) Octavo digital book editions, visited April 2010, source:
ONB (2010) The Austrian National Library, visited April 2010, source:
OS (2010) The Open Source Initiative, visited April 2010, source:
Peviar (2010) Project on the permanent visible archive, visited April 2010, source:
Prieß P.W. (1995) Digitale Reprografie; Reprografie Verlags- und Beratungsgesellschaft mbH, Frankfurt a. Main
13 Issues of Digitization in Archiving Processes
Zdeněk Stachoň Masaryk University [email protected]
Abstract

Cartographic cultural heritage constitutes an important part of world cultural heritage; therefore, it deserves the simultaneous attention of cartographers, computer specialists and archiving specialists. In the past, archiving faced the problems of appropriate storage of original items and the making of precise copies. Recent developments in information technology allow the storage of digital representations. These are suitable especially for the dissemination of valuable materials to the wide public and are also a possible way of archiving. On the other hand, the use of digital technologies for long-term archiving is not sufficiently solved; therefore, there is a need for long-term archiving techniques in our rapidly changing digital society. This contribution describes possibilities for the digitization of old maps. The most widely used digitization technologies are mentioned and evaluated, and a quality evaluation of different graphic formats, resolutions and color schemes for the digitization of old maps is proposed. The presented work has profound implications for the future digitization of the map archive of the Institute of Geography, Masaryk University.
13.1 Introduction

There is no evidence of when the first map was created; the odds are that maps have been constructed for as long as written documents. Mapping activities
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_13, © Springer-Verlag Berlin Heidelberg 2011
258 Z. Stachoň
were driven by the necessity of memorizing important places of economic, military or other significance (Novák, Murdych, 1988). Maps have accompanied most human civilizations from their infancy (see figure 13.1) to developed societies, and there were (and are) only a few mapless societies in the world (Harley, Woodward, 1987). Assessing the importance of maps for the study of human society is very difficult. This is caused by the nature of maps, which are at once simple iconic representations of reality and carriers of complex information. This mixture means that maps cannot be completely translated into other media (Harley, Woodward, 1987). The reasons mentioned above emphasize the importance of preserving the world cartographic heritage as evidence of the development and craftsmanship of human society. This contribution aims to describe the particular steps of the digitization process for old maps. The individual steps are evaluated and potentially critical parts of the digitization process are identified. Each part includes an example of its application to the map archive of the Institute of Geography, Masaryk University.
Fig. 13.1. Sculpture of map abstraction on ivory found near Pavlov (the Czech Republic, 1962). Age approximately 24 000 years (Nakladatelství České geografické společnosti, 1994).
Issues of Digitization in Archiving Processes 259
Fig. 13.2. Map of Moravia from 1743.
13.2 Preservation strategies

Ngulube (2003) refers to three basic preservation strategies: photocopying, digitization and microfilming. All preservation strategies have different advantages and disadvantages. In the case of the digitization strategy, the main disadvantage mentioned is that digitization is endangered especially by technical obsolescence. The next part of the contribution is concerned only with the digitization strategy.

13.2.1 Digitization

The glossary of the Wisconsin State Cartographer's Office defines digitizing as "the process of converting information shown on 'paper' maps into digital form for computer processing either by scanning or by manually capturing point and line features using specialized computer hardware". The preservation of cartographic heritage involves especially scanning. Smith (1999) notes that digitization is not preservation and, due to the fast development of information technologies, is suitable only for dissemination. Ten years later, digitizing
seems to be one of the most progressive ways to preserve old cartographic products. The advantages are obvious: it is a relatively low-cost process and, at the same time, simple to manage. Digitization allows wide accessibility of the digitized entities. Nevertheless, there are some problematic steps in this process. Digitization can preserve valuable material that might otherwise be irreversibly lost, for example through fire, decay or theft. Printings on acid paper, for example, are seriously threatened by self-destruction. Stuart D. Lee (1999) describes the digitization chain in the following steps:
• Assessment and Selection of Source Material
• Digitization Assessment
• Benchmarking
• Full Digitization
• Quality Assessment
• Post-Editing
• Application of Metadata
• Delivery
Assessment and selection of the source material is an important step at the beginning of the digitization process; it ensures the effectiveness of the process. The digitization assessment as the next step has to be based on the nature of the digitized material, user requirements, the economic background, time consumption etc. The chosen method of digitization has to be tested before the digitization process starts. The outputs of the process need to be evaluated and any discovered errors repaired. A further step is the addition of metadata to each digitized item. Last but not least, the digitization process ends with the distribution of the digitized material to the intended group of users. The steps mentioned above are useful for the digitization of cartographic heritage; nevertheless, they were created for digitization in general.

13.2.2 Digitization process

As mentioned above, the digitization process consists of a few consecutive steps. All phases consist of different stages depending on the type of digitized material, the digitization technology, the storage media, etc. Before the digitization process starts it is necessary to determine which material will be digitized (because it is not always possible to digitize all available material), what type of technology will be used, what file format will be used for storing the data, what resolution and color scheme will be used, and what metadata will be collected.
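The decisions listed above can be captured as a simple record fixed before scanning starts. A minimal sketch; the class and field names are illustrative, not part of any cited workflow:

```python
from dataclasses import dataclass, field

@dataclass
class DigitizationPlan:
    """Decisions to fix before digitization starts (illustrative names)."""
    material: str                  # what will be digitized
    technology: str                # scanner or camera, and which device
    file_format: str               # e.g. "TIFF" for lossless archiving
    resolution_dpi: int
    color_scheme: str              # e.g. "24-bit RGB"
    metadata_fields: list[str] = field(default_factory=list)

# Values echoing the application example in the text:
plan = DigitizationPlan(
    material="Map of Moravia, 1743, single sheet 23 x 19 cm",
    technology="sheet-fed scanner",
    file_format="TIFF",
    resolution_dpi=300,
    color_scheme="24-bit RGB",
    metadata_fields=["title", "year", "condition", "importance"],
)
print(plan.file_format, plan.resolution_dpi)
```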
13.3 Chosen Issues of Digitization
13.3.1 Digitized material

Different kinds of material can be digitized. Three main types of cartographic product can be distinguished in most cartographic archives: most historical products are represented by single map sheets, atlases and globes. Each type of potentially digitized cartographic product needs a specific approach; the following therefore focuses on map sheets as the most common historical cartographic product in archives. Maps can be divided by different criteria. In the case of digitization, the most important are historical value, age and material condition. Historical value is a subjective criterion and should be evaluated by cartographers and historians together. Material condition can vary from excellent to very bad and has to be evaluated mainly by preservation specialists.

Application: The examples in this contribution were made on the Map of Moravia chosen from the map archive of the Institute of Geography, Masaryk University. The example map was produced in 1743; the size of the map sheet is 23 x 19 cm (see figure 13.2). For the purposes of digitization, four levels of importance were established (old maps of local, regional, national and international importance) and five levels of map condition (very good, good, abraded, bad, very bad). The chosen map was determined to be in good condition and classified as regionally important.
13.3.2 Digitization technology

Two main ways of digitization are described in the "Good Practice Handbook" (2004): digitization using a scanner and digitization using a camera.

13.3.2.1 Digitization using a scanner
There are many types of scanners, e.g. flatbed scanners, sheet-fed scanners, drum scanners, slide scanners, microfilm scanners etc. For digitizing old maps, especially flatbed, sheet-fed and drum scanners are usable. The most common equipment consists of A4 and A3 flatbed scanners. For archiving, these flatbed scanners have the disadvantage of their small scanning area. Larger flatbed scanners without this disadvantage are expensive and available only at a few specialized workplaces. One advantage of flatbed scanners is the minimal risk of damage to the scanned material. A more cost-effective alternative are large format sheet-fed (continuous) scanners, which have no size limitation in one dimension. These scanners, however, bring a significant threat of damage, which can be minimized by using a protective sleeve. Another disadvantage is that they are suitable only for map sheets; atlases and globes cannot be digitized with them. For map sheets, drum scanners are the most precise, but high in purchase price. A possibility of damage to the scanned material also exists, because the masters have to be fixed on the drum. These are significant limitations for the scanning of old maps.

13.3.2.2 Digitization using a digital camera
Digitization using a camera is primarily useful for bound books, folded or wrinkled manuscripts, and 3D objects. An advantage of this technology is the ease of digitizing non-flat objects, which in cartography means especially atlases and globes. Nevertheless, for the digitization of oversized materials such as old map sheets, scanners with a cradle are usually preferred (Good Practice Handbook, 2004).

Application: The Institute of Geography owns various A4 and A3 flatbed HP scanners and a sheet-fed CONTEX Chameleon G600 scanner capable of scanning materials up to 914 mm in width and of "unlimited" length. The sheet-fed scanner was chosen for the digitization because it provides a higher resolution and does not restrict the proportions of the map sheets.
Issues of Digitization in Archiving Processes 263

13.3.2.3 File format
Results from digitization need to be saved in appropriate graphic formats. Various graphic formats exist, which were generally developed for different purposes; therefore, not every format is suitable for map archiving. Examples of widely used graphic formats and their characteristics for archiving are explained in this chapter. The examples in Figures 13.3 - 13.7 were made from the Map of Moravia from 1743, scanned at 300 dpi resolution.

Joint Photographic Experts Group - JPEG
The JPEG format was created especially for saving photographs. It supports neither opacity nor animation, and due to its compression method it is not useful for vector graphics. JPEG uses lossy compression. The advantage of the JPEG format is its small file size compared to other formats. The user can influence the compression level of a saved image by setting the quality coefficient.
Fig. 13.3. Map saved in JPEG format with maximum quality coefficient – 100% (400% of original size).
Bit Mapped Picture - BMP
The bitmap format was developed by Microsoft in 1986. It is typically used for background colors or textures on the Internet. BMP uses the simple and fast lossless run-length encoding (RLE) compression, which nevertheless results in larger file sizes. A useful application field for RLE compression is grayscale images.
Fig. 13.4. Map saved in BMP format (400% of original size).
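The run-length encoding mentioned above can be illustrated with a minimal sketch; this is a simplified scheme of (run, value) pairs, not the exact BMP RLE8 byte layout:

```python
def rle_encode(data):
    """Encode bytes as (run_length, value) pairs; runs are capped at 255
    as in byte-oriented RLE schemes."""
    encoded, i = [], 0
    while i < len(data):
        value, run = data[i], 1
        while i + run < len(data) and data[i + run] == value and run < 255:
            run += 1
        encoded.append((run, value))
        i += run
    return encoded

def rle_decode(pairs):
    out = bytearray()
    for run, value in pairs:
        out.extend([value] * run)
    return bytes(out)

# A row of mostly uniform pixels (e.g. a map background) packs into a
# handful of pairs, while a noisy grayscale row would hardly shrink.
row = bytes([255] * 200 + [0] * 5 + [255] * 50)
assert rle_decode(rle_encode(row)) == row
print(len(row), "bytes ->", len(rle_encode(row)), "runs")
```

This also explains why RLE-based formats produce large files for scanned maps: real scans contain noise, so runs stay short and little is saved.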
264 Z. Stachoň
Tag Image File Format - TIFF
The Tag Image File Format has been developed over a long time and meanwhile exists in various versions. It is widely used for storing photographs. TIFF supports lossless data compression, which is very convenient for archiving, especially of old maps. One disadvantage is the larger file size of scanned images, e.g. compared to JPEG.
Fig. 13.5. Map saved in TIFF format (400% of original size).
PCX
The PCX format was created by the ZSoft company for Paintbrush. PCX uses lossless RLE compression, which implies higher demands on file size. PCX is not suitable for images with high color depth (Antoš, 2006).
Fig. 13.6. Map saved in PCX format (400% of original size).
JPEG 2000
The JPEG format was modified and extended because it was inconvenient for many purposes. This resulted in JPEG 2000, which can use lossy or lossless wavelet-based data compression. This method allows the simultaneous storage of several resolutions, which can be useful for Internet publication (Antoš, 2006). The main asset of this format is its low demand on storage capacity.
Fig. 13.7. Map saved in JPEG 2000 format (400% of original size).
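The multiresolution idea behind the wavelet compression of JPEG 2000 can be sketched with the much simpler 1-D Haar wavelet; JPEG 2000 itself uses the CDF 9/7 and LeGall 5/3 wavelets plus sophisticated coding, so this only illustrates the principle:

```python
def haar_step(samples):
    """One level of the 1-D Haar transform: pairwise averages give a
    half-resolution version, pairwise differences keep the detail."""
    avgs = [(a + b) / 2 for a, b in zip(samples[::2], samples[1::2])]
    diffs = [(a - b) / 2 for a, b in zip(samples[::2], samples[1::2])]
    return avgs, diffs

def haar_inverse(avgs, diffs):
    out = []
    for m, d in zip(avgs, diffs):
        out += [m + d, m - d]
    return out

row = [10, 12, 14, 200, 202, 204, 12, 10]   # one row of pixel values
avgs, diffs = haar_step(row)
assert haar_inverse(avgs, diffs) == row     # perfect reconstruction
# `avgs` alone is a half-resolution preview (the "simultaneous storage of
# more resolutions"); quantizing or discarding small values in `diffs`
# turns the scheme into lossy compression.
```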
Application: The issue of the graphic format has been sufficiently solved; recommendations to use TIFF for archiving and JPEG 2000 for web publication were given by Antoš (2006), the Good Practice Handbook (2004) and others.
Fig. 13.8. Map scanned at 300, 400, 600 and 1200 dpi resolution (top to bottom; 450% of original size).
Table 13.1 compares the file sizes of the map of Moravia scanned at 300 dpi and saved in different file formats; the scanner used for this task was described previously. For saving in JPEG 2000, the maximum rate of digital-information preservation (i.e. the lowest compression) was used. For the purposes of digitization within the map archive of the Institute of Geography, the TIFF format was chosen for storage, although it brings higher demands on storage capacity.

Tab. 13.1. Comparison of file sizes of different file formats

Format       File size (MB)
JPEG         5
BMP          86.5
TIFF         86.5
PCX          143.5
JPEG 2000    8.6
Fig. 13.9. Dependence of dpi and file size.
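The dependence of file size on dpi follows directly from the pixel count, as the following sketch shows; the sizes are computed for the map field only (uncompressed, 24-bit), so the values need not match the reported scanned file sizes, which include the whole scanned sheet:

```python
def raw_size_mb(width_cm, height_cm, dpi, bytes_per_pixel=3):
    """Uncompressed size of a scan: pixel count times bytes per pixel
    (3 bytes per pixel for 24-bit true color)."""
    inch = 2.54  # cm per inch
    width_px = round(width_cm / inch * dpi)
    height_px = round(height_cm / inch * dpi)
    return width_px * height_px * bytes_per_pixel / 1024 ** 2

# Doubling the resolution quadruples the pixel count, hence the raw size:
for dpi in (300, 600, 1200):
    print(dpi, "dpi ->", round(raw_size_mb(23, 19, dpi), 1), "MB")
```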
Fig. 13.10. Dependence of dpi and time consumption.

13.3.2.4 Resolution
The resolution of raster images is measured in dpi (dots per inch); more dots per inch mean a higher density. 100 dpi corresponds to about 4 dots per 1 mm. The achievable resolution depends on the chosen technology and its resolving capabilities. Nowadays, scanners resolve more than 9600 dpi; a resolution of 9600 dpi means more than 384 dots per 1 mm, so 1 dot represents less than 0.003 mm. Longley et al. (2001) noted that most GIS scanning is in the range of 400 - 1000 dpi. Documents scanned in Czech libraries usually feature about 100 - 600 dpi (Zmapování situace digitalizace v ČR, 2008). Brůna (2002) recommended 400 dpi for map archiving purposes because
Fig. 13.11. Example of the displayed area of a scanned map at different resolutions on a 17-inch monitor at original 100% size (Map of Moravia, 23 x 19 cm, from 1743).
this enables printing the map sheet images in a visual quality almost comparable to the original. Figure 13.8 compares the results obtained from scanning the map of Moravia at 300, 400, 600 and 1200 dpi resolution. The resolution significantly influences file size and scanning time: a doubling of resolution leads to a quadrupling of the file size (see Figure 13.9). The time-resolution dependency in scanning is more complicated; it is linear for lower resolutions but quadratic for higher resolutions (see Figure 13.10). The quality of resolution can be demonstrated by the image area that is displayed on a 17-inch monitor at 100% scale (see Figure 13.11). Further information on digital file formats can be found on the website “Sustainability of Digital Formats: Planning for Library of Congress Collections”. Application: The determination of resolution is a difficult task, influenced by the available technology, storage space, purpose of digitization etc. A standard resolution for Czech libraries was determined as
300 dpi in Zmapování situace digitalizace v ČR (2008). Because of the ongoing development of information technologies, as well as possible future uses of the scanned material, it seems forward-looking to choose a resolution closer to the upper limit of the scanners. At the Institute of Geography a scanning resolution of 1200 dpi was selected.

13.3.2.5 Color modes
Scanning software provides various color modes. The most common is the 24-bit color mode, called true color, in which about 16.7 million color tones can be represented, more than the human eye can distinguish. Other commonly available color modes are 8-bit indexed color and 8-bit classified color, which are suitable mainly for Internet presentation. These are supplemented by 8-bit grayscale (256 gray levels), mainly used for scanning black-and-white materials. Application: For the archiving of cartographic heritage at the Institute of Geography, the 24-bit true color mode is most suitable, as it provides authentic color preservation and the possibility of reproducing almost master-equivalent copies. Therefore the 24-bit true color mode was chosen for the digitization within the map archive at the Institute of Geography.

13.3.2.6 Metadata
Metadata (MD) provide additional information on each stored item. There is no general MD standard for the archiving of cartographic products; it is therefore necessary to create metadata profiles by adapting existing ones. Obviously, an adapted metadata profile should be derived from existing standards such as ISO, Dublin Core, METS etc. (e.g. Larsgaard, 2005). Advantages and disadvantages of common metadata standards are described by Řezník (2008). The Dublin Core metadata profile seems to be the most suitable for map archiving. Its mandatory core consists of 15 basic elements, most of which are relevant for old maps. The mandatory elements comprise Title (A name given to the resource), Creator (An entity primarily responsible for making the resource), Subject (The topic of the resource), Description (An account of the resource), Publisher (An entity responsible for making the resource available), Contributor (An entity responsible for making contributions to the resource), Date
Fig. 13.12. Example of 24-bit true color, 8-bit indexed color and 8-bit gray tone (top to bottom).
(A point or period of time associated with an event in the lifecycle of the resource), Type (The nature or genre of the resource), Format (The file format, physical medium, or dimensions of the resource), Identifier (An unambiguous reference to the resource within a given context), Source (A related resource from which the described resource is derived), Language (A language of the resource), Relation (A related resource), Coverage (The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant) and Rights (Information about rights held in and over the resource) (Dublin Core
Metadata Initiative, 2008). A detailed description of the mandatory elements can be found on the web page of the Dublin Core Metadata Initiative. Application: A suggestion for an adapted metadata profile was created at the Institute of Geography for digitization purposes. It is based on the standards of the Dublin Core Metadata Initiative and was extended for the specific requirements of scanning within the map archive. The optional elements added to Dublin Core are shown in Table 13.2. This specific metadata profile will be further developed to fit map archiving.

Tab. 13.2. Optional elements proposed for map archiving

Dimension    Dimension of the map sheet (cm x cm), dimension of the map field (cm x cm)
Scale        Scale, approximate scale, parallel scale, meridian scale
Metadata     Date of metadata creation, date of last metadata update, metadata creator, language of metadata
Resolution   Dpi of the scanned item
Note         Remark concerning the item
…            ...
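A metadata record combining the Dublin Core elements with the optional elements proposed above could be serialized, for instance, as a small XML document. This is only a sketch: the namespace URI for the map-specific elements and the scale value are illustrative placeholders, not part of any standard or of the Institute's actual profile.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace (standard) and a placeholder namespace
# for the optional map-archiving elements (invented for illustration).
DC = "http://purl.org/dc/elements/1.1/"
MAP = "http://example.org/map-archive/"

record = ET.Element("record")

def add(ns, tag, text):
    ET.SubElement(record, f"{{{ns}}}{tag}").text = text

add(DC, "title", "Map of Moravia")
add(DC, "date", "1743")
add(DC, "format", "image/tiff")
add(DC, "coverage", "Moravia")
add(MAP, "dimension", "23 x 19 cm")
add(MAP, "scale", "approximate scale 1:500000")  # illustrative value only
add(MAP, "resolution", "1200 dpi")

print(ET.tostring(record, encoding="unicode"))
```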
13.4 Conclusions
As previously mentioned, different issues have to be solved in the process of digitization for archiving purposes. First, the status of the original material to be digitized is important: chosen materials have to be in proper condition and should have significant historical value. Digitized materials have to be saved in a graphic format; the TIFF format can be recommended in spite of its higher storage-capacity demands. Standards prescribing a 300 dpi resolution and formats using lossy data compression seem insufficient for the demands of long-term map archiving. Further development of (geo)information technologies, such as increases in available disk space, memory and processor speed, can be expected. It is therefore advisable to recommend lossless data compression and higher scanning resolutions for archiving, i.e. 1200 dpi and higher, in spite of the higher storage demands and time consumption. The color mode for the archiving of cartographic materials has to be considered separately: the 24-bit true color mode seems appropriate, although
other color modes can be useful in different contexts. Once an object has been digitized, metadata have to be added to every item. Metadata profiles for old maps based on the Dublin Core Metadata Initiative have been proposed. Additionally, maintenance of the MD is necessary to avoid failures; however, this raises further issues. The recommendations mentioned above were applied for map digitization and archiving at the Institute of Geography, Masaryk University.
13.5 References
Antoš, F., 2006. Problematika skenování historických map a jejich následné prezentace na internetu. Diploma thesis, České vysoké učení technické, Prague. (In Czech)
Brůna, V., 2002. Identifikace historické sítě prvků ekologické stability krajiny na mapách vojenských mapování. Acta Universitatis Purkynae, Studia Geoinformatica II, Ústí nad Labem. (In Czech)
Dublin Core Metadata Initiative, 2008. Dublin Core Metadata Element Set, Version 1.1. Available at: http://dublincore.org/documents/dces/ [Accessed 5 January 2009]
Harley, J.B., Woodward, D., 1987. The History of Cartography, Volume One: Cartography in Prehistoric, Ancient, and Medieval Europe and the Mediterranean. The University of Chicago Press, Chicago and London.
Larsgaard, M.L., 2005. Metaloging of Digital Geospatial Data. The Cartographic Journal 42(3): 231-237.
Lee, S.D., 1999. Scoping the Future of the University of Oxford's Digital Library Collections, Final Report. Available at: http://www.bodley.ox.ac.uk/scoping/report.html [Accessed 11 January 2009]
Longley, P.A., Goodchild, M.F., Maguire, D.J., Rhind, D.W., 2001. Geographic Information Systems and Science. John Wiley and Sons, Chichester, England.
Minerva Working Group 6, 2004. Good Practices Handbook, version 1.3. Identification of good practices and competence centres. Available at: http://www.minervaeurope.org/listgoodpract.htm [Accessed 12 January 2009]
Nakladatelství České geografické společnosti, 1994. Sculpture of map abstraction on ivory. In: Geografický kalendář 1995, NČGS, Praha. (In Czech)
Ngulube, P., 2003. Preservation and Access to Public Records and Archives in South Africa. PhD thesis, University of Natal.
Novák, V., Murdych, Z., 1988. Kartografie a topografie. Státní pedagogické nakladatelství, Prague. (In Czech)
National Library of the Czech Republic, 2006. Digitalizace a digitální zpřístupnění dokumentů. National Library of the Czech Republic, Prague. (In Czech) Available at: http://www.nkp.cz/pages/page.php3?page=weba_digitalizace.htm [Accessed 5 January 2009]
Řezník, T., 2008. Metadata flow in crisis management. PhD thesis, Masaryk University, Brno. (In Czech)
Sdružení knihoven ČR, 2008. Zmapování situace digitalizace v ČR. (In Czech) Available at: http://www.sdruk.cz/it/Zmapovani_situace_digitalizace_v_CR.pdf [Accessed 8 January 2009]
Smith, A., 1999. Why Digitize. Council on Library and Information Resources, Washington, D.C.
The Library of Congress. Sustainability of Digital Formats: Planning for Library of Congress Collections. Available at: http://www.digitalpreservation.gov/formats/intro/intro.shtml [Accessed 15 September 2009]
Wisconsin State Cartographer's Office. Glossary. Available at: http://www.sco.wisc.edu/references/glossary.html [Accessed 15 January 2009]
14 Digitized Maps of the Habsburg Military Surveys – Overview of the Project of ARCANUM Ltd. (Hungary)
Gábor Timár1, Sándor Biszak1,2, Balázs Székely1,3, Gábor Molnár1,3
1 Dept. of Geophysics and Space Science, Eötvös University, Budapest, Hungary ([email protected])
2 Arcanum Database Ltd., Budapest, Hungary
3 Institute of Photogrammetry and Remote Sensing, Vienna University of Technology, Austria
Abstract This paper summarizes the scientific, technical and legal background of the rectification project for the Habsburg Military Survey sheets at the Hungarian firm Arcanum. Rectified versions of the whole First, Second and Third Surveys have been completed; however, because of legal issues, so far only the Hungarian parts of the First and Second Surveys (at 1:28,800 scale) have been published, together with the full Third Survey (at 1:75,000 scale). The rectification errors are quite high in the case of the First Survey; its accuracy suffices only for settlement-finding applications. The accuracy of the Second Survey is surprisingly good in most parts of the Empire, with a maximum error of ca. 200 meters, the same value that characterizes the Third Survey. This new, electronic cartographic version of the old map systems offers excellent possibilities for following the changes of the natural and built environment of Central Europe over the last two and a half centuries.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_14, © Springer-Verlag Berlin Heidelberg 2011
274 G. Timár et al.
14.1 Introduction
The Habsburg military surveys represent a unique information source on Central European geography in the late 18th and the 19th centuries. Their scale, detail and quality place them among the best European cartographic works of the period. Rectifying them to modern map projection systems offers a splendid tool for monitoring landscape changes, both natural and artificial, from the time of Maria Theresia to the reign of Francis Joseph. For a long time, the map sheets treasured in the archives were available only to a closed group of professionals of military cartography. From the beginning of the 1990s in Hungary, the existence and, more importantly, the advantageous characteristics of the maps became known to specialists of various branches such as archeology, hydrology, forestry and nature protection. A number of reproductions of sheets portraying the most important territories, mainly as black-and-white copies, started to be distributed, and the demand from specialists in river regulation and nature protection has increased in both number and frequency. The background of the making of these giant map works is discussed by far better authors (e.g. Hofstätter, 1989; Kretschmer et al., 2004). In this short paper we summarize the project of the Hungarian firm ARCANUM, started in 2006, to complete a dataset containing all of the 1:28,800 scale survey sheets of the First and Second Surveys of the Habsburg Empire, as well as the 1:75,000 general sheets of the Third Survey of the Austro-Hungarian Monarchy. We describe the story of the project, with special attention to the scientific and technical background as well as the legal issues and barriers of the work and its publication.
14.2 Overview of the ARCANUM project
14.2.1 Early cartographic products
Based on the Map Room Archives of the Ministry of Defense (MoD) Institute and Museum of Military History, Budapest, Hungary, some DVDs were issued containing the scanned sheets of the First and Second Military Surveys of historical Hungary (Jankó et al., 2005).
Digitized Maps of the Habsburg Military Surveys 275
The scanned sheets were not mosaicked or georeferenced; however, a settlement-seeker utility was built in, based on the sheet numbers and image coordinates of the settlement centers. As the georeferencing methods for the Second Military Survey became available (Timár and Molnár, 2003; Timár, 2004) and known to the company, a new level of products was designed and later introduced.

14.2.2 Georeferenced products based on the Budapest archives
The new, mosaicked and georeferenced version of the Second Military Survey of historical Hungary was published in the first half of 2006 (Timár et al., 2006), which made the earlier product mentioned above obsolete. The mosaicked appearance of the sheets is supported by the newly developed user software interface (GEOVIEW, see later). The earlier version of the settlement database (gazetteer) was transformed; the geodetic
Fig. 14.1. Wien in the First Military Survey shown by the Geoview software of Arcanum (25% magnification, in Bundesmeldenetz-34).
coordinates of the settlement centroids were applied instead of their image coordinates. The map sheets of the DVD cover not only the present territory of Hungary but also Slovakia, the Zakarpatska territory of Ukraine, Burgenland, the Vojvodina in Serbia, the border zone of Romania towards Hungary, and small parts of Poland, Croatia and Slovenia. The mosaicked content can be exported to the user's GIS software in the various projection systems used in these countries. Later in 2006, the mosaicked version of the First Military Survey of the same area, but without the historical Banat (the region around the modern town of Timisoara, Romania), was finished by ARCANUM. As the horizontal control and accuracy of these maps were considerably lower, the standard product does not offer a georeferenced output for the users. Later, however, users requested this function even for the inaccurate output, and a patch was published to upgrade the software. Using the two georeferenced sets of Transylvania (now in Romania), ARCANUM provided a synchronized DVD with the First and Second Surveys of the region (Timár et al., 2007). Synchronization means here that the user can roam one cartographic product in one software window while the other product follows it 'geo-linked' in another window. This DVD is fully supported by a gazetteer (Biszak and Timár, 2008). The map sheets of the Third Survey were also published in 2007 (Biszak et al., 2007a; 2007b), in two scales. The 1:25,000 scale survey sheets were published for historical Hungary (the whole territory mentioned above plus Croatia), based on the Budapest archives. The 1:75,000 scale general sheets of the whole Monarchy were published as well; as these had originally been a printed product in the 1880s, there was no legal problem in using them for a DVD product. Since 2008, Arcanum has sold all of the above-mentioned datasets on hard disks and Blu-ray disks with full synchronization.
14.2.3 Georeferenced products based on the material in the Kriegsarchiv, Vienna
The full set of the First and Second Military Surveys is stored in this Viennese institute, a department of the Austrian State Archive (Staatsarchiv). The institute replied positively to ARCANUM's approach to carry out a scanning project in Vienna. The work took place in the spring of 2006; several thousand sheets were scanned in the Kriegsarchiv building by Arcanum. Processing of these sheets took another three months using the methods described below. All datasets were scanned except those whose
copies were also stored in Budapest. The maps cover the full territory of the Habsburg Empire, including Belgium in the First and Northern Italy in the Second Survey. These products have been ready since the summer of 2006 but are not yet published because of legal issues (see that chapter below).
14.3 The Rectification Methods
14.3.1 The First Military Survey
The First Survey was carried out between 1763 and 1787. It covers the core part of the Empire (without the western provinces of Austria) as well as the Austrian-ruled part of the Low Countries, approximately the territory of present-day Belgium. As no projection definition was available for these cartographic products, we made separate mosaics for the different regions. Rectification of these mosaics
Fig. 14.2. Venezia in the Second Military Survey shown by the Geoview software of Arcanum (25% magnification, in the native coordinate system).
was carried out using reference points (in the terminology of the GIS software used: ground control points, GCPs). 30-50 GCPs were used for each region. The GCPs were defined by their image (mosaic) pixel coordinates and by their UTM coordinates in the respective zone. Quadratic polynomials were derived and applied to all pixels of the mosaics to transform them to the UTM plane. The accuracy of this type of rectification is far from precise; it is adequate only for overview purposes and for finding settlements in the mosaic by their modern coordinates. The average accuracy is, in the case of smaller provinces, around half a kilometer, while the maximum error is ca. 2 kilometers; the corresponding values for larger provinces are 1-4 kilometers. The errors could perhaps be decreased by defining quasi-projections for the different parts of the First Survey, but this method has not yet been tested in practice.

14.3.2 The Second Military Survey
The Second Military Survey was carried out between 1806 and 1869. It covers a large, contiguous area in Central Europe from the Po Plain in northern Italy to Galicia in western Ukraine. The survey had a real geodetic basis and a mapping protocol that can be more or less approximated by the Cassini projection (Veverka & Čechurová, 2003; Timár, 2004; Timár et al., 2006). Eight fundamental points were used throughout the Empire as projection centers (Timár, 2009a). The map sheets have no coordinate descriptions; the section numbering and the sheet sizes and sheet systems carry the georeference, so the four corner points of each sheet can be used as GCPs (Timár et al., 2006). This system cannot be applied to Tyrol and Salzburg; their rectification had to be done using the method of the First Survey rectification (Timár, 2009b). The accuracy of the survey is far better than that of the first one: it is better than 200 meters in most cases, and for the most populated and important parts of the Empire it is between 50 and 100 meters.
In the case of Tyrol and Salzburg, however, the maximum errors are 220 and 500 meters, respectively (Timár, 2009b).
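The polynomial rectification used for the First Survey (and for Tyrol and Salzburg) can be sketched as an ordinary least-squares fit of second-order polynomials to the GCPs; the pixel and UTM-like coordinates below are synthetic, invented purely for illustration:

```python
import numpy as np

def fit_quadratic(px, py, X, Y):
    """Fit 2nd-order polynomials mapping pixel coords (px, py) to map
    coords (X, Y) from ground control points (a sketch of what the GIS
    software does internally)."""
    A = np.column_stack([np.ones_like(px), px, py, px**2, px * py, py**2])
    cx = np.linalg.lstsq(A, X, rcond=None)[0]
    cy = np.linalg.lstsq(A, Y, rcond=None)[0]
    return cx, cy

def apply_poly(c, px, py):
    return c[0] + c[1]*px + c[2]*py + c[3]*px**2 + c[4]*px*py + c[5]*py**2

# Synthetic GCPs: invented pixel and UTM-like coordinates.
rng = np.random.default_rng(0)
px = rng.uniform(0, 5000, 40)
py = rng.uniform(0, 4000, 40)
X = 500000 + 2.5 * px + 0.01 * py            # made-up ground truth
Y = 5300000 - 2.5 * py + 1e-5 * px * py
cx, cy = fit_quadratic(px, py, X, Y)
# Residuals at the GCPs indicate the attainable accuracy; here they are
# tiny only because the synthetic truth is itself quadratic, whereas the
# real First Survey leaves errors of ca. 0.5-2 km.
resid = np.hypot(apply_poly(cx, px, py) - X, apply_poly(cy, px, py) - Y)
print(resid.max())
```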
14.3.3 The Third Military Survey
The Third Survey was carried out in the 1880s. It covers the territory of the late stage of the Monarchy: without Lombardy and Veneto, but with Bosnia-Herzegovina. The compilation of this survey was simultaneous with the geodetic adjustment of its own base-point system (Molnár & Timár, 2009). Each sheet has its own stereographic projection; their rectification can be made by the method of Čechurová & Veverka (2009) or Molnár & Timár (2009). The remaining (non-systematic) errors are in the range of 200 meters. One idea for decreasing these errors is to apply modern geoid models: deflection-of-the-vertical values derived from global or regional geoid models (Jekeli, 1999) may provide a better fit for these maps, which have a geodetic basis without a proper adjustment process. This method, however, is still in an experimental phase.
Fig. 14.3. Innsbruck in the Third Military Survey (50% magnification, in Bundesmeldenetz-28). Note that the small rotation is the consequence of using a unified system for the whole survey instead of the sheets' own stereographic projections.
14.4 The User Interface (GEOVIEW)
For the publication of the map sheets in georeferenced mosaic form, ARCANUM developed a user interface to explore the product and provide a data export link to the GIS environment of the users. As shown in the figures, the window of this software consists of three parts: the overview map of the region (lower right) and the intermediate zoom part (upper right), both with fixed zoom levels, and the zoomable and roamable main map area (left and center). In the case of synchronized products, this window is multiplied. The actual WGS84 coordinates of the cursor are shown at the bottom of the map area; to the right of them, projected coordinates are indicated (in a projection selected by the user). Users can roam and zoom the map mosaic and, if required, print or export the actual content of the map area. The export has several options: JPEG 2000, GeoTIFF and ECW (Enhanced Compression Wavelet) formats are supported, and a world file can also be produced at the user's request. The user can set the resolution, from the maximum zoom of 2.45 meters/pixel, as well as the map projection from a list containing the historical and modern map grids valid in the area.
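The world file offered at export is a plain six-line text format; a minimal sketch of writing one for a north-up export follows (the map coordinates are invented for illustration):

```python
def write_worldfile(path, pixel_size, upper_left_x, upper_left_y):
    """Write a six-line world file for a north-up raster: x pixel size,
    two rotation terms, negative y pixel size, then the map coordinates
    of the centre of the upper-left pixel."""
    lines = [pixel_size, 0.0, 0.0, -pixel_size, upper_left_x, upper_left_y]
    with open(path, "w") as f:
        f.write("\n".join(f"{v:.6f}" for v in lines) + "\n")

# e.g. an export at the maximum zoom of 2.45 m/pixel (coordinates invented):
write_worldfile("export.jgw", 2.45, 601234.5, 5340987.5)
```

GIS packages read such a sidecar file (e.g. `.jgw` next to a JPEG, `.tfw` next to a TIFF) to place the exported raster in map coordinates.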
14.5 Legal Issues
Apart from the printed version of the 1:75,000 scale sheets of the Third Survey, all of the mentioned maps were individually drawn in one or two copies. After World War I, the peace treaty with Austria required the map sheets to be ceded to the successor states according to their new territories. However, these countries agreed to store this material in Vienna, while they insisted on receiving the cadastral sheets. Austria and Hungary signed the Treaty of Baden in 1926; according to this agreement, Hungary accepted the right of Austria to store the formerly common archive material and was provided full access to the documents. A Hungarian delegation works even nowadays in the State Archive of Vienna and, separately, a military delegation in the Kriegsarchiv. The 'copyrights' of the old materials concern the Austrian and Hungarian sides according to the pre-WWI territorial separation. This is the legal basis of the publication of the maps of historical Hungary; it is controlled by the 2005 agreement between the Institute and Museum of Military History and ARCANUM.
The Austrian part of the ARCANUM product, however, is not covered by any contract with the Austrian State Archive. The negotiations have been paused for more than two years; ARCANUM completed the scanning project and handed over copies of the scanned images and one disk with the full products, but there is still no contract controlling the publication rights. Without this contract or agreement, the products containing the maps of the Austrian part of the former empire remain unpublished (however, some samples are shown in Figs. 14.1 and 14.2). According to the policy of ARCANUM and to the existing agreement in Budapest, customers buy the full publication rights with the DVD, without any restriction.
14.6 Conclusions
The rectification of the map sheet sets of the surveys enables us to make full mosaics of the provinces that were mapped in unified projection systems. It is now possible to provide synchronized versions of these datasets; settlement identification for the users is based on the common georeference of the surveys (Biszak & Timár, 2008). The datasets have led to interesting research results in various fields of the geosciences (e.g. Pišút, 2006; Boltižiar et al., 2008; Timár et al., 2008; Zámolyi et al., 2009). The legal issues of the project, however, show an example of how an uncertain background can block a publication that many users in Europe are waiting for. Nevertheless, the project can be considered a significant action in the field of archiving and preservation. Scanning and publishing the described historical cartographic products provides backup copies, albeit in electronic format, documenting the present physical status of the old paper maps. Moreover, adding a georeference to the map sheets gives them a 'new life', as they can be integrated into present and future cartographic projects and analyses.
14.7 Acknowledgements
The raw cartographic material of the project was provided by the Österreichisches Staatsarchiv, Kriegsarchiv, Vienna, Austria, for the Austrian part of the Empire. The sheets of the former Hungarian parts were offered by the
Map Archive of the Institute and Museum of Military History, Ministry of Defense, Budapest, Hungary. Dr. Christoph Tepperberg, Dr. Robert Rill and Dr. Annamária Jankó are especially thanked for their help. The authors are grateful to Dr. Róbert Hermann and Dr. Ferenc Lenkefi, members of the Hungarian military delegation to the Kriegsarchiv, Vienna, for the flawless cooperation.
14.8 References
Biszak, S., Timár, G. (2008): Georeferenced gazetteers based on historical Central European topographic maps. Geophysical Research Abstracts 10: 01498.
Biszak, S., Timár, G., Molnár, G., Jankó, A. (2007a): Digitized maps of the Habsburg Empire - The Third Military Survey, Österreichisch-Ungarische Monarchie, 1867-1887, 1:75000. DVD issue, Arcanum Database Ltd., Budapest. ISBN 978-963-73-7451-7
Biszak, S., Timár, G., Molnár, G., Jankó, A. (2007b): Digitized maps of the Habsburg Empire - The Third Military Survey, Ungarn, Siebenbürgen, Kroatien-Slawonien, 1867-1887, 1:25000. DVD issue, Arcanum Database Ltd., Budapest. ISBN 978-963-73-7454-8
Boltižiar, M., Brůna, V., Křováková, K. (2008): Potential of antique maps and aerial photographs for landscape changes assessment - An example of the High Tatra Mts. Ekologia [Bratislava] 27(1): 65-81.
Čechurová, M., Veverka, B. (2009): Cartometric analysis of the 1:75 000 sheets of the Third Military Survey of the territory of Czechoslovakia (1918-1956). Acta Geodaetica et Geophysica Hungarica 44(1): 121-130.
Hofstätter, E. (1989): Beiträge zur Geschichte der österreichischen Landesaufnahmen, I. Teil. Bundesamt für Eich- und Vermessungswesen, Wien, 196 p.
Jankó, A., Oross, A., Timár, G. (eds., 2005): A Magyar Királyság és a Temesi Bánság nagyfelbontású, színes térképei. DVD issue, Arcanum Database Ltd., Budapest. ISBN 963-7374-21-3 (In Hungarian)
Jekeli, C. (1999): An analysis of vertical deflections derived from high-degree spherical harmonic models. Journal of Geodesy 73: 10-22.
Kretschmer, I., Dörflinger, J., Wawrik, F. (2004): Österreichische Kartographie. Wiener Schriften zur Geographie und Kartographie - Band 15. Institut für Geographie und Regionalforschung der Universität Wien, Wien, 318 p.
Molnár, G., Timár, G. (2009): Mosaicking of the 1:75 000 sheets of the Third Military Survey of the Habsburg Empire. Acta Geodaetica et Geophysica Hungarica 44(1): 115-120.
Pišút, P.
(2006): Evolution of meandering Lower Morava River (West Slovakia) during the first half of 20th century. Geomorphologica Slovaca 6(1): 55-68.
Digitized Maps of the Habsburg Military Surveys 283 Timár, G. (2004): GIS integration of the second military survey sections – a solution valid on the territory of Slovakia and Hungary. Kartografické listy 12: 119-126. Timár, G. (2009a): The fundamental points of the Second Military Survey of the Habsburg Empire. Geophysical Research Abstracts 11: 02652. Timár, G. (2009b): System of the 1:28 800 scale sheets of the Second Military Survey in Tyrol and Salzburg. Geodaetica et Geophysica Hungarica 44(1): 95-104. Timár, G., Molnár, G. (2003): A második katonai felmérés térképeinek közelítő vetületi és alapfelületi leírása a térinformatikai alkalmazások számára. Geodézia és Kartográfia 55(5): 27-31. Timár, G., Molnár, G., Székely, B., Biszak, S., Varga, J., Jankó, A. (2006): Digitized maps of the Habsburg Empire – The map sheets of the Second Military Survey and their georeferenced version. Arcanum, Budapest, 59 p. ISBN 963 7374 33 7 Timár, G., Biszak, S:, Molnár, G., Székely, B., Imecs, Z., Jankó, A. (2007): Digitized maps of the Habsburg Empire – First and Second Military Survey, Grossfürstenthum Siebenbürgen. DVD issue, Arcanum Database Ltd., Budapest. ISBN 978-963-73746-0-9 Timár, G., Székely, B., Molnár, G., Ferencz, Cs., Kern, A., Galambos, Cs., Gercsák, G., Zentai, L. (2008): Combination of historical maps and satellite images of the Banat region – re-appearance of an old wetland area. Global and Planetary Change 62(1-2): 29-38. Veverka, B., Čechurová, M. (2003): Georeferencování map II. a III. vojenského mapování. Kartografické listy 11: 103-113. Zámolyi, A., Székely, B., Draganits, E., Timár, G. (2009): Historic maps and landscape evolution: a case study in the Little Hungarian Plain. Geophysical Research Abstracts 11: 11929.
15 The Archiving of Digital Geographical Information
Peter Sandner
Hessisches Hauptstaatsarchiv (Central State Archives of Hessen), Wiesbaden, Germany, [email protected]
Abstract National and state laws in Germany charge public archives with cataloguing, preserving, and making available to the public documents of historical or legal importance that are no longer needed by the public agencies that created them. This mandate also applies to data generated by geographical information systems (GIS). However, preserving this digital topographic information in perpetuity poses a major challenge for these public archives, which, in the absence of a single, clear method for achieving this goal, have adopted a number of different approaches. Each of these approaches has its distinct advantages and disadvantages, but there are nevertheless certain criteria that can be used to evaluate the alternatives. The data should be stored in a format that is simple and widely available. Archives must take into account the cost and effort entailed in the initial archiving of this geo data, yet they must also ensure that the intervals between future data migrations will be as long as possible. Archives must maintain their independence from other institutions or companies whose own continuity cannot be presumed. And the data must be archived in a way that is easily accessible to the user. The exchange of ideas and experience between archivists and GIS experts is necessary for the work of both groups and should continue.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_15, © Springer-Verlag Berlin Heidelberg 2011
15.1 The Statutory Framework for Archiving
15.1.1 Archival Laws

Federal and state laws in Germany charge public archives with archiving historically or legally important documents. There are 17 different archival laws in Germany: one for the Federal Republic and one for each of the 16 federal states (for example, the Hessian Archive Law [Hessisches Archivgesetz 1989]). These laws generally define 'archiving' to include taking possession of these materials, preserving them for the indefinite future, safeguarding them, and making them accessible to and usable by the interested public. Thus, the meaning of the German word 'Archivierung' is somewhat broader than that of 'archiving', 'storage' or 'preservation' as these terms are used in the IT field.

15.1.2 The Statutory Framework for Archiving Geo Data

The archival laws require that maps be treated in the same manner as traditional written records. For example, the Hessian Archive Law stipulates that every state-level administrative office or agency must turn over to the Hessian State Archive all documents that are no longer needed. These requirements also extend to digital documents produced by public administration, including the contents of databases (which are continuously changing) and data generated by geographical information systems. The rapidly expanding use of geographical information systems by the government agencies responsible for topographical surveying and for registering the ownership of land (Vermessungs- und Katasterverwaltung) poses a huge problem for the public archives.
15.2 Strategies in the Public Archives of the German States
15.2.1 Overview

Archiving digital geographical information is much more difficult than archiving other digital documents, e.g. electronic records. Five aims are essential:

• Digital contents of the geographical information system should be delivered completely (thematic data, geographical position) in order to ensure that no information is lost.
• Digital data should be saved in simply-structured formats in order to minimize the number of future data migrations. Such migrations are nevertheless inevitable, because it is highly unlikely that the software in use today will still be in use 50 or 100 years from now, much less further into the future.
• Digital geographical information should be saved in common data formats that can be used without proprietary software, in order to ensure access independently of the life-span of a particular company or product.
• The functionalities and features of the geographical information system should be retained in order to allow for diverse queries and retrievals.
• Historical geo data should be stored in a way that will be easily accessible by future users.

Given the current state of our knowledge, it seems almost impossible to achieve all of these aims at the same time. Are we able to transfer the entire contents of a geographical information system into the archives? What are the costs? And last but not least: will our strategy still work in 10 years? In 100 years? In 1,000 years? Different public archives in Germany have adopted different strategies for balancing these seemingly irreconcilable goals. The examples below are based on discussions held in 2008 at two workshops of the working group 'Electronic systems in law and administration' (AG ESys) and of the 'IT committee' (IT-Ausschuss), both of which were established by the 'Conference of archival officials of the federation and the federal
states of Germany' (Konferenz der Archivreferentinnen und Archivreferenten des Bundes und der Länder, ARK). The examples are drawn from the archives of the federal states of Baden-Württemberg, Bavaria, Brandenburg, Hesse, Lower Saxony, and Rhineland-Palatinate. They involve three different kinds of data.

The first type is topographical geo data from ATKIS, the Official Topographic and Cartographic Information System (Amtliches Topographisch-Kartographisches Informationssystem). This system contains not only information about objects that are shown on topographic maps, such as streets, rivers and forests, but also additional descriptive information about these objects, such as the name of a street, the width of a river, or the species of trees found in a forest.

The second type is land register geo data from ALKIS, the Authoritative Land Survey Register Information System (Amtliches Liegenschaftskataster-Informationssystem), the Automated Real Estate Map (Automatisierte Liegenschaftskarte), and the Bavarian Digital Cadastral District Map (Digitale Flurkarte). These systems all contain more detailed information about the specific parcels of real estate shown on the different maps.

The third type comes from UIS, the Environmental Information System (Umweltinformationssystem), which, as its name indicates, displays environmental data for a geographical region. The official English terms for the German systems are those used by IMAGI (2003).

15.2.2 Strategy 1: Archiving a Map as a Simple Digital Picture

The first strategy reduces the complexity of geo data by freezing the information at a specific point in time in the form of a static digital picture. The resulting digital map then becomes the object to be archived. The most frequently used format for such objects, at least in Germany, is the Tagged Image File Format (TIFF).
Both the Brandenburg Central State Archive (Brandenburgisches Landeshauptarchiv) in Potsdam, which archived the Automated Real Estate Map described above, and the Lower Saxon State Archive (Niedersächsisches Landesarchiv) in Hanover employ this approach. The digital maps produced in this manner contain plot boundaries, plot numbers, buildings, house numbers, types of use, street names, and topographical features.
This strategy has several advantages:

• TIFF is a common, simply-structured format that can be used without proprietary software.
• The simple structure of TIFF files will probably minimize the number of future data migrations.
• Such files are user-friendly because they do not require data manipulation by the user.

But this approach also has several disadvantages:

• The functionalities and features of the geographical information system used to produce the original files cannot be retained, which means that it is not possible to query or manipulate the underlying data.
• Any information that is not incorporated into the original data file is effectively lost.

15.2.3 Strategy 2: Archiving Maps as a Georeferenced Digital Picture

The second strategy also seeks to reduce the complexity of the underlying geo data. In this instance, however, information about the projection, the coordinate system, and other georeferencing data is incorporated into the digital picture. The archival object is again a digital map, but one that uses the GeoTIFF format to embed the georeferencing information in the image. The Digital Cadastral District Map archived by the Bavarian State Archives in Munich employs this approach. The GeoTIFF format also has both advantages and disadvantages.

Advantages:

• GeoTIFF files can be read by most image processing applications.
• GeoTIFF is a public-domain metadata standard for embedding georeferencing information in standard TIFF files.
• GeoTIFF files should be easy for future users to use: even if a specific software program cannot access the embedded geo data, it should still be able to open the image as a basic TIFF file.

Disadvantages:

• As with a basic TIFF file, the functionalities and features of the geographical information system used to produce GeoTIFF files cannot be retained, which means that it is not possible to query or manipulate the underlying data.
• Special software is necessary to read the embedded georeferencing information.
• The additional embedded data means that the effort and expense of any future data migrations will be greater than would be the case for simple TIFF files.

15.2.4 Strategy 3: Separating Information – Archiving both a Digital Picture and Associated Text-based Information

This strategy involves the separate storage of both the digital image and the associated text-based information. The preferred format for the digital map is again the Tagged Image File Format (TIFF). Since, as noted above, the non-graphic geo information (for example, data concerning the coordinate system, metadata, or thematic attribute data) cannot be integrated into the image without losing future functionality, the goal here is to store this information in a format that will permit future users to manipulate, and then integrate, the two kinds of data. A text-based format (coded information), such as the Extensible Markup Language (XML), must be used in such cases. The Hessian Central State Archive is planning to use this approach to archive the Official Topographic and Cartographic Information System (ATKIS). ATKIS consists of the following components:

• DLM – 'Digitales Landschaftsmodell' (Digital Landscape Model)
• DGM – 'Digitales Gelaendemodell' (Digital Terrain Model)
• DOP – 'Digitale Orthophotos' (Digital Orthophotos)
• DTK – 'Digitale Topographische Karte' (Digital Topographic Map)

Advantages:

• The data formats used by the archive are common, simply structured, and can be used without proprietary software.
• This simplicity will also minimize the number of future data migrations.
• The use of TIFF should be easy for future users.
• The text-based data can be read by commonly used word processors and web browsers.
• The functionalities and features of a geographical information system can be restored with a future GIS, thus enabling diverse queries and retrievals.
Disadvantages:

• The cost and effort involved in the initial data migration are greater than with other strategies.
• It will be difficult to use historical geo data with future geographical information systems, because the user will be responsible for translating the data into a format that can be read by the system.

15.2.5 Strategy 4: Archiving the Vector Data of the Geographical Information System (GIS)

The fourth strategy archives the geo data as vector data using a proprietary GIS format: the 'Shapefile' format developed by the company ESRI. This approach was used by the State Archives of Baden-Württemberg to archive the Environmental Information System.

Advantages:

• Since this approach retains the functionalities and features of the geographical information system, it is still possible to query, retrieve, and manipulate the information contained in the database.
• Although it is a proprietary format, Shapefile documents can be read by several open source programs. The details of the file format are well documented, and the documentation is freely accessible.
• Future data migration should be possible.

Disadvantages:

• Shapefile is a proprietary data format; the source code of ESRI software is not freely available.
• The data format cannot be interpreted without the use of specific GIS software programs.
• The effort and expense involved in any future data migrations will probably be greater than with more common data formats.

15.2.6 Strategy 5: Archiving Information Using the Original Data Format of the Administration

The fifth strategy involves archiving geo data in the diverse original formats of the government agencies that created the information. The State Archival Administration of the Rhineland-Palatinate used this approach in 2004 to archive the Automated Real Estate Map, and the Bavarian State Archives are considering using it to archive the Digital Cadastral District Map.
Advantages:

• No additional cost or effort for data migration is necessary at the time of the original storage of the data.
• The diverse functionalities and features of the geographical information system can be retained, and the information can be queried and retrieved for as long as the original software program continues to function.

Disadvantages:

• The data formats are proprietary, and the source code is not generally available.
• The data cannot be read without the use of a number of dedicated software programs.
• The software needed to read the data must be archived as well.
• Future data migrations are probably impossible.
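The advantage cited for Strategy 4, that the Shapefile format is openly documented, can be made concrete. ESRI's published technical description fixes the layout of the 100-byte header of a .shp main file: a big-endian file code and file length, then a little-endian version, shape type and bounding box. The sketch below (function names and sample values are invented for illustration) builds a synthetic polygon-file header and parses it back using only the Python standard library:

```python
import struct

# Shape type codes from the published ESRI Shapefile technical description.
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon"}

def parse_shp_header(data: bytes) -> dict:
    """Parse the 100-byte header of a shapefile main file (.shp)."""
    if len(data) < 100:
        raise ValueError("shapefile header must be 100 bytes")
    file_code, = struct.unpack(">i", data[0:4])        # big-endian magic number
    if file_code != 9994:
        raise ValueError("not a shapefile (file code != 9994)")
    file_len_words, = struct.unpack(">i", data[24:28])  # length in 16-bit words
    version, shape_type = struct.unpack("<ii", data[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", data[36:68])
    return {
        "file_length_bytes": file_len_words * 2,
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, str(shape_type)),
        "bbox": (xmin, ymin, xmax, ymax),
    }

def make_demo_header() -> bytes:
    """Build a synthetic polygon-file header covering (16.0, 48.1)-(16.6, 48.3)."""
    header = struct.pack(">i", 9994) + b"\x00" * 20 + struct.pack(">i", 50)
    header += struct.pack("<ii", 1000, 5)               # version 1000, type Polygon
    header += struct.pack("<4d", 16.0, 48.1, 16.6, 48.3)
    header += struct.pack("<4d", 0.0, 0.0, 0.0, 0.0)    # Z and M ranges unused here
    return header

info = parse_shp_header(make_demo_header())
```

An archive could run such a check as a cheap validation step at ingest, without depending on any GIS vendor's software; it is this kind of independent readability that the strategy's advantages refer to.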
15.3 Conclusion

There is no perfect strategy for archiving the geographical information generated by geographical information systems; all five strategies involve trade-offs. However, the basic criteria for evaluating every concrete alternative are: 1) the simplicity of the data format used, 2) the broad availability of the software needed to read the data, and 3) the length of the anticipated intervals between data migrations. Archives must also take into account the cost and effort involved in the initial archiving of historical geo data, and, whatever format is chosen, it must be easily accessible to potential users. In addition, archives must preserve their independence from other institutions or companies, because no one can predict how long these will exist. The overriding imperative to minimize the loss of information militates strongly against archiving geo data as a single map image. The standards-based exchange interface (NAS, Normbasierte Austauschschnittstelle) that will soon be employed by the German government agencies responsible for the surveying and registry of land gives hope for better archiving options in the future, because it will be able to store both vector data and metadata in a text-based XML format. The NAS is not the end, however, but simply another step in an ongoing process, and the exchange of ideas and experience between archivists and GIS experts is necessary for the work of both groups and should continue into the future.
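The appeal of a text-based, XML-encoded exchange format can be illustrated in miniature. The element names below are invented for this sketch and do not follow the real NAS/GML schemas; the point is only that vector geometry and descriptive attributes survive a round trip through plain XML that any future parser can read:

```python
import xml.etree.ElementTree as ET

# Illustrative only: element and attribute names are invented for this
# sketch and do not correspond to the actual NAS/GML schemas.
def encode_parcel(parcel_id, owner_ref, crs, ring):
    """Serialize a parcel's boundary ring and attributes as plain XML."""
    root = ET.Element("parcel", id=parcel_id, crs=crs)
    ET.SubElement(root, "ownerReference").text = owner_ref
    coords = " ".join(f"{x},{y}" for x, y in ring)
    ET.SubElement(root, "boundary").text = coords
    return ET.tostring(root, encoding="unicode")

def decode_boundary(xml_text):
    """Recover the coordinate ring from the XML text."""
    root = ET.fromstring(xml_text)
    pairs = root.find("boundary").text.split()
    return [tuple(float(v) for v in p.split(",")) for p in pairs]

# Hypothetical parcel in a Gauss-Krueger coordinate reference system.
doc = encode_parcel("flur-0815", "owner-123", "EPSG:31467",
                    [(3500010.0, 5400020.0), (3500030.0, 5400020.0),
                     (3500030.0, 5400040.0), (3500010.0, 5400020.0)])
ring = decode_boundary(doc)
```

Because both geometry and attributes are ordinary text, such a file can be inspected with any editor decades from now, which is precisely the property that makes the XML-based NAS attractive for archiving.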
15.4 References

Archivschule Marburg 2009, Archivgesetze und weitere Gesetze, Marburg, viewed 16 February 2009.
'Hessisches Archivgesetz' 1989 (amendments 2002 and 2007), Gesetz- und Verordnungsblatt fuer das Land Hessen, vol. 1989 part I, pp. 270-273, viewed 22 October 2009, Ueber uns > Rechtsgrundlagen > Downloads > Hessisches Archivgesetz, Stand 20. Juli 2007 (PDF, 51 KB).
Interministerial Committee for Geo Information (IMAGI) 2003, Geo Information in the modern state, The Federal Agency for Cartography and Geodesy (BKG), Frankfurt am Main, viewed 16 February 2009.
16 An Intellectual Property Rights Approach in the Development of Distributed Digital Map Libraries for Historical Research
A. Fernández-Wyttenbach 1, E. Díaz-Díaz 2, M. Bernabé-Poveda 1
1 Technical University of Madrid, Spain, [email protected]
2 Law firm 'Mas y Calvet', Spain
Abstract Initiatives in the field of Digital Map Libraries have greatly increased, uncovering old maps stored in the cartographic collections of libraries and archives and keeping the geographic component as a common link. A new type of work is emerging, based on the use of new analysis tools and involving processes carried out through distributed systems, via Virtual Map Rooms in the Spatial Data Infrastructure framework. Some specific Intellectual Property Rights (IPR) challenges must therefore be considered in specific deployment scenarios. Concerning the technology incorporated in these services, free-access components and open source technology have no specific IPR relevance. A library has an almost automatic right to publish online the contents that are its property; but when it comes to reusing metadata or images from an external entity, the library must consider the related IPR issues, which may depend on the specific services to be used. This paper introduces some prospective legal and licensing aspects, described as an added value for successful web publishing of cartographic heritage and for assuring its sustainability, following the latest advances in the geospatial field but emphasizing the specific IPR aspects of historical digitized images and metadata.
M. Jobst (ed.), Preservation in Digital Cartography, Lecture Notes in Geoinformation and Cartography, DOI 10.1007/978-3-642-12733-5_16, © Springer-Verlag Berlin Heidelberg 2011
16.1 Introduction

The digital management of geographic information is taking its first steps, and further advancement is expected in the development of a new society based on geographic knowledge. The Internet is a powerful tool that provides different online communication options for geographic approaches. Technical and legal experts, historians and scholars need remote access to all existing information, which should be compiled in a single place to facilitate access, comparison and the assessment of its legal implications. The relevant communities have established a number of well-defined international standards (Library of Congress 2009) and have assembled catalogues using similar harvesting techniques (Lagoze & Van de Sompel 2001). Now, in addition to several technical and legal organizations, some cultural institutions are making agreements and taking decisions in order to provide access to information in a new way.

The aim of Digital Map Libraries (DML) is to provide access to the cartographic heritage through a single geoportal with distributed access to thousands of maps stored in the cartographic collections of different libraries and archives, keeping the geographic component as a common link (Fernández-Wyttenbach, Ballari & Manso-Callejo 2006). Until recently, old maps were accessible mainly to scholars with specific interests, but geographic technologies are now helping society as a whole to easily re-discover old cartography.

This paper is about establishing shared notions, conventions and practices that express a first approach to the Intellectual Property landscape of DML. With defined boundaries, it is possible to share, exchange or even trade rights to cartographic heritage resources in a clearly defined and managed way, with the geographical component as a link.
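The harvesting techniques referred to above are typically based on OAI-PMH, which exchanges simple XML documents over HTTP. A minimal sketch of the consuming side, using Python's standard library and an invented sample record in place of a live repository (the identifier and title are made up; the namespace URIs are those defined by the OAI-PMH 2.0 and Dublin Core specifications):

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A trimmed-down ListRecords response, as a harvester might receive it
# over HTTP; the repository and its records are fictitious.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example-maps:0001</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Plan of the City, 1858</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def harvest_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    out = []
    for rec in root.iter("{http://www.openarchives.org/OAI/2.0/}record"):
        ident = rec.find("oai:header/oai:identifier", NS).text
        title = rec.find(".//dc:title", NS)
        out.append((ident, title.text if title is not None else None))
    return out

records = harvest_titles(SAMPLE)
```

In a real harvester the same parsing step would be applied to each page of a paged ListRecords response fetched from a repository's base URL; the simplicity of this exchange is what allows catalogues from many libraries to be merged into one geoportal.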
16.2 From Digital Map Libraries to Virtual Map Rooms Scientific advances are made through international, informative forums and meetings in which these new technologies, as applied to historical cartography, are presented and discussed. The Commission on Digital Technologies in Cartographic Heritage (ICA 2006) is certainly the most relevant, followed by other initiatives carried out at a national level (Montaner 2009, p.54).
There are also international organizations gathering together map library professionals who could support discussions and the exchange of knowledge and policies to follow in the acquisition, conservation, cataloguing and dissemination of public cartographic collections. Moreover, it is necessary to study in depth the legal criteria regarding web publishing.

There have been several cooperative projects, engaging various national and international institutions, aiming at the widespread distribution of historical cartographic collections on the Web through geographic localization tools. In this sense, some specific experiences have been carried out in the Spatial Data Infrastructure (SDI) field, trying to include geospatial services in a historical environment. Taking advantage of the INSPIRE case to promote European cooperative policies (European Parliament and the Council 2007), other European projects are already under way. The DIGMAP project, which has recently finished, stands for "Discovering our Past World with Digitized Maps". Its information sources consist of catalogues and digital libraries of the partners, public OAI-PMH data providers, gazetteers, and any other relevant source
of information available on the Internet. The tools developed for browsing (Fig. 16.1), cataloguing, geo-referencing and extracting iconographic information are expected to support both the promotion of cartographic resources and global collaborative policies for the reference, description, cataloguing, indexing and classification of ancient maps (Pedrosa et al. 2008).

Concerning the technology and techniques incorporated in the services, DIGMAP made use of explicitly free-access services and open source technology, with which new techniques and scenarios were developed, but which are not of any specific Intellectual Property Rights (IPR) relevance. Nevertheless, new prospective ideas that were out of the scope of the project appeared, related to the sustainability of its legal framework (DIGMAP Project 2008, p.9).

SDIs need not only metadata, data sets, spatial services and interoperable technologies, but also agreements to share information and to coordinate and monitor the processes. There are a number of technological and policy considerations to be taken into account, apart from the characteristics common to all SDIs in any thematic field. Thus, the cartographic heritage contained in a DML stands out as an exceptional case within the generic frame of an SDI (Fernández-Wyttenbach et al. 2007).

The term "virtual reality" suggests considering a screen as a window upon a virtual world, a challenge put forward for researchers in that field. At present, hardware and software technology is sufficiently developed to implement virtual map rooms with high-quality features; in most cases the representation of reality and the interaction with it would be enough. The Spanish CartoVIRTUAL initiative (Bernabé-Poveda 2009) has just kicked off with the aim of creating a specialized virtual map room, with open source search services and access to historical content.
This project intends to design a methodology and set up a prototype of a distributed historical map library that has the advantages of real map libraries, with advanced online measurement and geo-referencing tools within a virtual environment. As with the DML, distributed access to the cartographic holdings through different search tools is offered, but here a further step is taken by offering very intuitive tools tailored to each user profile. These tools for the analysis, study and interpretation of the information provide the ability to measure, crop, take notes and compare maps from varied sources in a multilayer layout. Thus the researcher's work usually carried out in a map room becomes virtualized. To that
end, the project will take advantage of Open Geospatial Consortium and SDI services, the direct crosswalks developed between geographic and bibliographic metadata profiles, and the dissemination and advancement of forums of interest. Once viability is ascertained, the project will move towards the creation of an institutional prototype of a national virtual map room: a geoportal providing access to the Spanish historical map libraries, museums and archives.

The fact is that these newly developed functionalities will involve the assumption of new responsibilities, i.e. the analysis to be carried out with the cartographic material, the constraints to be imposed on saving the results and the images at a certain resolution, as well as the legal provisions for the use of the location, visualization, download, transformation and access services. All this already entails consideration of Digital Rights Management (DRM) in the setting of the DML and the prospective virtual map rooms. Therefore, this portal will deal with different possibilities, licenses and measurement tools in keeping with every user profile (scholars, researchers, professionals or amateurs), while keeping the legal aspects as a must.
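The measurement and geo-referencing tools mentioned above rest on a small mathematical core: fitting a transformation between pixel coordinates of a scanned sheet and map coordinates, from ground control points. The sketch below fits an exact affine transformation from three invented control points; production tools use many more points with a least-squares adjustment, and often higher-order or rubber-sheeting models:

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Cramer's rule (adequate for three points)."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    result = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]       # replace one column with the right-hand side
        result.append(det(mc) / d)
    return result

def fit_affine(gcps):
    """gcps: list of ((col, row), (x, y)) pairs; returns a pixel->world function."""
    m = [[c, r, 1.0] for (c, r), _ in gcps]
    ax = solve3(m, [x for _, (x, _y) in gcps])   # x = a*col + b*row + c
    ay = solve3(m, [y for _, (_x, y) in gcps])   # y = d*col + e*row + f
    def to_world(col, row):
        return (ax[0] * col + ax[1] * row + ax[2],
                ay[0] * col + ay[1] * row + ay[2])
    return to_world

# Three hypothetical control points on a scanned sheet: pixel -> map coordinates.
gcps = [((0, 0), (16.0, 48.4)),
        ((4000, 0), (16.5, 48.4)),
        ((0, 3000), (16.0, 48.1))]
to_world = fit_affine(gcps)
```

With such a fitted transformation, an on-screen distance measurement in the virtual map room becomes a computation in map coordinates rather than in pixels, which is what makes a digitized sheet usable for research rather than only for viewing.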
16.3 A New Legal Landscape in Geographical Information

The highest-level international standards for geographic information are applied through lower-level binding resolutions. A significant example within the European reach is the directive of the European Parliament and of the Council (2007, p.8) establishing an Infrastructure for Spatial Information in the European Community (INSPIRE). While establishing that the directive should not affect the existence or ownership of public authorities' intellectual property rights, its article 13 stipulates that 'member states may limit public access to spatial datasets and services through the services … or to the e-commerce services … where such access would adversely affect … intellectual property rights'.

In short, several legal regulations contemplate IPR, but their scope is not always specifically defined. The deficiencies of the general rules, which have not been sufficiently developed or specifically harmonized, show up in geospatial information, especially within the scope of DML. In addition, the different implementing rules of INSPIRE seem to ascribe IPR regulation to national legislation; the resulting lack of harmonization might be generated by the member states themselves, which, being sovereign, may adopt only partly compatible regulations. Thus, in addition to the problems arising for legal and technical operators, it is difficult in this way to reach the desired interoperability, a cornerstone of the SDI.

The range of these subjects is wide and rich in subtleties. IPR and user guarantees are a subject of great interest. The creation of added value through the safeguarding of royalties before the general public and the users of DML, as well as the development of suitable technical tools for not exclusively technological uses and applications, leads to a novel way of understanding and employing geoinformation. Why not speak of SDI "external interoperability"? Beyond SDI internal interoperability at the global, regional or local level, a new concept of interoperability arises between the agents involved as users of data services and the wide framework of the network services regarding geoinformation. This "external interoperability" is compatible with current technical system interoperability; it is also the logical and natural continuation of the efforts made up to now.

International and national IPR regulations about geoinformation, and particularly about DML, reveal the importance and the interest of the balance achieved through legislation. However, despite the existing regulation and in spite of good intentions, IPR are not regulated in sufficient detail. Consequently, the lack of explicitness in their practical application raises doubts about who is responsible. Furthermore, the current regulations are ineffective deterrents built on erratic proposals, generating remarkable uncertainty as a result (Garnett 2006). The horizon of DRM allows for the consideration of the need for harmonization between the legal and technical domains.
It is necessary for the technical and legal operators and agents to reach a mutual and firm understanding enabling the comprehensive development of the very important subject of geoinformation. As is the case in other areas of knowledge, legal regulations and technological developments would together give a greater practical breadth and a new dimension to technological advances, in the service of citizens and in the interest of users (European Commission 2002). In the particular case of Spanish law (Law 10/2007, p.4), the scope of DRM has recently been extended to e-books and to any kind of book with print, audio or video material enclosed in the publication as attached information. This could be germane to most of the contents provided by DML.
The application of new functionalities would entail the introduction of new responsibilities or different obligations. The result would be the creation of a secure and stable work setting, the feasibility of certain projects that would otherwise be unfeasible without a legal substrate, and the trust generated between interoperable systems and between the technical and legal agents involved (Iannella 2001). There are a number of similarities between SDI and DML (Fernández-Wyttenbach et al. 2007, p.6); as far as the legal aspects are concerned, this may be used to find a new meeting point. However, there are other cultural considerations which go beyond the geographic domain (Crews 2008, p.25).
Fig. 16.2. ODRL Foundation Model (Iannella 2002, p.4)
302 A. Fernández-Wyttenbach et al.
16.4 Digital Rights Management

Digital Rights Management (DRM) involves the description, layering, analysis, valuation, trading and monitoring of the rights over an enterprise's tangible and intangible assets (Iannella 2001). It covers the digital management of rights, whether they apply to a physical medium (e.g. a map) or to a digital medium (e.g. a DML service). DRM consists broadly of the identification of intellectual property (i.e. the attribution of an identifier, such as an ISBN number, and the marking of a sign, such as a watermark) and the enforcement of usage restrictions, which works via encryption (i.e. ensuring that the digital content is only used for purposes agreed to by the rights holder) (European Commission 2002, p.3). The DRM system enhances trust by providing ex ante protections (i.e. protections established before the fact). Protection can be anything that restricts access to resources, such as high-resolution cartographic images or detailed metadata provided by libraries and archives. The user, through trusted software, knows that he can legally do whatever he is allowed to do, and the owner of the resource knows that abuse of the contract is at least difficult. The degree of difficulty should be proportional to the risk to the resource: valuable resources are generally protected more than those of lesser value (Vowles 2006, p.35).

Fig. 16.3. GeoDRM Reference Model Context (Vowles 2006, p.16)

DML is emerging as a formidable new challenge for content communities in this digital age. It is essential for DRM systems to provide interoperable services, changing the nature of distribution of digital media from a passive one-way flow to a much more interactive cycle (Iannella 2001). In the end, trusted services are needed to manage rights for both owners (e.g. the libraries) and users. In this regard, it is important to mention the huge interest shown by both technical and legal users. In addition to the Land Register or the Cadastre, DML are presently key tools in both legal and extralegal proceedings and in decision making in general. The cartographic heritage, when appropriately regulated, becomes an added-value service for citizens. It is not a merely ornamental reality of cultural interest; above all, it is a valuable tool of the Knowledge Society. So far, several models and languages for digital rights have been developed that are capable of expressing the range of licenses that DRM may be expected to support (Gunter, Weeks & Wright 2001). In addition, the definition of DRM architectures is fundamental in DML due to the need for interoperability and openness. Two critical architectures are considered by Iannella (2001) in designing and implementing DRM systems: the functional architecture, i.e. the modules or components of the DRM system, and the information architecture, i.e. the modelling of the entities within a DRM system as well as their relationships. These models have been complemented by a semantic approach to DRM developed by García-González (2005), based on web ontologies, whose basic pieces are a creation model, the copyrights and the actions that can be carried out on the content. This ontology facilitates the development of rights management systems.
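As an illustration of the two DRM building blocks described above, identification and enforcement, the following Python sketch packages a cartographic resource with an identifier and a keyed "watermark", and gates access behind a key. Everything here is invented for illustration: the XOR keystream stands in for real encryption and must not be taken as production cryptography.

```python
import hashlib
import hmac

def keystream(key: bytes, length: int) -> bytes:
    """Derive an illustrative pseudo-random keystream from the key."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def package(content: bytes, identifier: str, key: bytes) -> dict:
    """Identify, watermark and 'encrypt' a resource (toy example)."""
    watermark = hmac.new(key, content, hashlib.sha256).hexdigest()
    cipher = bytes(c ^ k for c, k in zip(content, keystream(key, len(content))))
    return {"id": identifier, "watermark": watermark, "payload": cipher}

def unpack(pkg: dict, key: bytes) -> bytes:
    """Recover the content and verify the watermark with the agreed key."""
    plain = bytes(c ^ k for c, k in
                  zip(pkg["payload"], keystream(key, len(pkg["payload"]))))
    expected = hmac.new(key, plain, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, pkg["watermark"]):
        raise PermissionError("wrong key or tampered content")
    return plain

pkg = package(b"<high-resolution map scan>", "ISBN 978-3-642-12732-8",
              b"library-key")
assert unpack(pkg, b"library-key") == b"<high-resolution map scan>"
```

The point of the sketch is only the division of labour: the identifier and watermark make the intellectual property attributable, while the key-gated payload makes contract abuse "at least difficult", in proportion to the value of the resource.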
From a political point of view, the European Commission published a first paper in 2002 as a contribution to the debate about DRM, providing a factual presentation of the state of the subject at the time and identifying some of the major policy issues surrounding the acceptability of DRM (European Commission 2002). The paper sought to provide policy guidance on the use of technology at a critical juncture, complementing the legal framework within which DRM is administered, established by EC Directive 2001/29 on copyright and related rights in the Information Society.
In this sense, the Commission has also been co-funding research projects with new business models in mind, in keeping with emerging technologies such as digital map libraries.
16.5 Rights Expression Languages

Rights expression languages (REL) are breaking through in the information community, supporting different aspects of the digital access environment: licensing, payments, web material, use control, access, etc. They are part of the technology of DRM. The expression of rights can generally be described in terms of the statement of legal copyright, the expression of contractual language, and the implementation of controls. Each type of REL has been developed to solve a particular set of problems. Thus, Creative Commons works specifically in the open-access environment of the Internet. METSRights is intended to accompany digital materials provided by academic institutions and libraries. MPEG-21/5 is a general language that is formally described and fully actionable within a trusted-systems environment, and it is designed primarily to support applications whose resources require the greatest amount of protection from unauthorized use (Coyle 2004, p.4). An interesting example is the Open Digital Rights Language (ODRL). It is a general-purpose language that allows some actionable control over resource use, and it was designed as an open standard for industry and community use. ODRL lists the many potential terms for permissions, constraints and obligations, as well as the rights-holder agreements (Figure 16.2). As such terms may vary across sectors, rights languages should be modelled to allow the terms to be managed via a Data Dictionary and expressed via the language. XML encoding enables the exchange of information supporting the same language semantics, and will set the stage for complete and automatic interoperability (Iannella 2002, p.5).
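To make the permission/constraint/obligation structure concrete, the following Python sketch assembles an ODRL-style agreement as XML. The element names are simplified from the ODRL vocabulary and the identifiers are hypothetical; a real offer would use the proper ODRL namespace and data-dictionary terms.

```python
import xml.etree.ElementTree as ET

# Build a simplified ODRL-style agreement: a display permission on a
# (hypothetical) map sheet, constrained to one institution, with an
# attribution requirement as the obligation.
agreement = ET.Element("agreement")

asset = ET.SubElement(agreement, "asset")
ET.SubElement(asset, "uid").text = "urn:dml:map:canary-islands-1780"  # invented id

permission = ET.SubElement(agreement, "permission")
display = ET.SubElement(permission, "display")
constraint = ET.SubElement(display, "constraint")
ET.SubElement(constraint, "spatial").text = "intranet-of-licensed-library"

requirement = ET.SubElement(permission, "requirement")
ET.SubElement(requirement, "attribution").text = "Map Library X"  # invented holder

xml_text = ET.tostring(agreement, encoding="unicode")
print(xml_text)
```

Because the terms live in XML rather than in application code, two DML systems that share the same language semantics can exchange and enforce such agreements automatically, which is exactly the interoperability argument made above.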
16.6 Geospatial Digital Rights Management Reference Model As shown above, different standards already exist for the licensing of digital content. However, they describe the licensing of digital media content and cannot be used for licensing of geographic information unless they are
extended. The ultimate goal of geographic standards is to make geographic information and services available and readily usable for the entire information services community (Vowles 2006, p.14). Thus, GeoDRM is a reference model for digital rights management functionality for geospatial resources, developed by the Open Geospatial Consortium (OGC) as an abstract specification. This abstract specification builds on and complements the existing OGC specifications, and defines at an abstract level a Rights Model to enable the DRM of standards-based geospatial resources (Figure 16.3). A resource in this context is a data file, a service for geographic information or a process. Our cartographic heritage is the object of cultural and geographical intellectual property rights management. Licensing in the GeoDRM domain supports the licensing of digital content based on different infrastructures, which is of special interest for DML. Even more important is the fact that licensing can also take place for geographic information that is dynamically created by OpenGIS Web Services for use in DML. Improvements to this abstract specification by the OGC are desirable in order to establish REL extensions for geospatial resources and to create a description metadata language that extends ISO 19115 and provides control information for GeoDRM-enabled systems (Vowles 2006, p.11). The efforts of ISO/TC 211 will soon result in the final approval of two specifications: ISO 19149, Geographic information – Rights expression language for geographic information, and ISO 19153, GeoDRM RM – The software architecture, presently at the Committee Draft and Working Draft stages respectively (ISO 19147). In this sense, the OGC has recently chartered a committee of the Board to specifically address the spatial law and policy issues which will influence development requirements of the Consortium's technology process (OGC Press 2009).
In the end, the GeoDRM Abstract Rights Model creates a simplified model of geospatial intellectual property so that it may be practically licensed, and so that the rights to that intellectual property may be managed and protected.
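Since the GeoDRM Reference Model is abstract, the following Python sketch is only one possible reading of it: a gatekeeper that consults granted licences before a geospatial resource (a data file, a map service layer, a process) is served. The licence structure, holder names and resource identifiers are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Licence:
    """A granted right: who may perform which actions on which resource."""
    holder: str
    resource: str            # e.g. a scanned map sheet or a WMS layer (invented ids)
    actions: frozenset       # e.g. {"view"} or {"view", "download"}

# A hypothetical licence store: one reader may only view one sheet.
LICENCES = [
    Licence("reader-42", "dml:sheet/1780-canarias", frozenset({"view"})),
]

def authorise(holder: str, resource: str, action: str) -> bool:
    """Return True if some granted licence covers this (holder, resource, action)."""
    return any(
        lic.holder == holder and lic.resource == resource and action in lic.actions
        for lic in LICENCES
    )

assert authorise("reader-42", "dml:sheet/1780-canarias", "view")
assert not authorise("reader-42", "dml:sheet/1780-canarias", "download")
```

The design point is that the decision depends only on the rights model, not on how the resource came into being, which is why the same check can cover both stored map scans and geographic information created dynamically by web services.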
16.7 Future Considerations

It is critical that geospatial DRM models start taking into account three issues important for the preservation of cartographic heritage information.

First of all, there is the temporal issue, already contemplated in Article 8 of the INSPIRE Directive (European Parliament and Council 2007). This issue requires a distinction between two aspects concerning the length of IPR: (i) the temporality of the information (i.e. dates of creation, update, last modification…) and (ii) the temporality of the IPR (i.e. the length and validity of the IPR as stipulated by the national laws of every country).

Moreover, the information medium is very relevant. It used to be a 'printed' presentation (connoting any non-automated instrument), but we are going to find, with increasing frequency, 'digital' historical documents (connoting any technical, electronic, computer and telematic media). It should be noted that the medium used, technical in this case, does not alter the legal consequences; the traditional medium is considered equal to the known innovating media, or those to be devised in the future. In other words, the IPR that the laws grant to authors or holders of cartography are identical for both non-automated and automated information. Thus, the information medium does not in principle affect the ownership of rights, although it may modify the way in which they are exercised.

Finally, it is important to take into account the age of the information. The legal consequences are varied, but regulatory gaps on geospatial information are common here. Therefore, the legal criteria regarding printed and digital documentation should be applied by analogy, together with the specific laws of cultural heritage.
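The distinction between the two temporalities can be sketched in a few lines of Python: a record carries dates describing the information itself, while the validity of the IPR is a separate computation governed by national law. The 70-year post-mortem term used below is only an example; actual terms vary from country to country.

```python
from datetime import date
from typing import Optional

def ipr_expired(author_death_year: int, term_years: int = 70,
                today: Optional[date] = None) -> bool:
    """True if the (example) post-mortem protection term has run out.

    `term_years` is an illustrative default; the applicable national
    law determines the real length and validity of the IPR.
    """
    today = today or date.today()
    return today.year > author_death_year + term_years

record = {
    "title": "Historic map sheet",          # invented example record
    "created": date(1885, 5, 1),            # temporality of the information
    "author_died": 1902,                    # basis for the temporality of the IPR
}

# A map whose author died in 1902 is past an example 70-year term by 2010:
assert ipr_expired(record["author_died"], term_years=70, today=date(2010, 1, 1))
```

Keeping the two temporalities as separate fields means a preservation system can migrate the dates of the information unchanged while re-evaluating the rights whenever the applicable legal term changes.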
16.8 Conclusions

Considering the initiatives carried out so far in Digital Map Libraries (DML), it may be said that it is necessary to fuse together many of the ideas developed separately (distributed access, tools for query and analysis, geographic navigation, virtual spaces…), and even more so if it is possible to improve them. The point is that these functionalities will introduce new responsibilities in keeping with Digital Rights Management (DRM).
The legal aspects of the cartographic heritage come into play in many areas of everyday activity; they represent a suitable means for sustainable interoperability and a fitting assurance of the persistent development of DRM and DML. Far from the apparent inflexibility of laws, a satisfactory regulation allows better management and protection of the cartographic heritage, the generation of added value in the safe application of new technologies, their correct use without risks, and the establishment of the appropriate responsibilities of authors and managers. At this point, it is important to stress that our cartographic heritage is not only the object of cultural intellectual property rights (IPR); it also raises IPR issues in the geospatial domain once geospatial web services enter the picture. From the moment they are created, such services become geospatial resources themselves, because they may be reused for geographical purposes.
16.9 Acknowledgements We would like to express our gratitude to the CartoVIRTUAL initiative (Ref. CSO2008-03248/GEOG) from the R&D Programme 2008-2011 of the Ministry of Science and Innovation of Spain.
16.10 References

Bernabé-Poveda, M 2009, CartoVIRTUAL Project, Ref. CSO2008-03248/GEOG, R&D Programme 2008-2011, Ministry of Science and Innovation, Madrid, viewed February 2009.
Coyle, K 2004, Rights Expression Languages: A Report for the Library of Congress, viewed February 2009.
Crews, K 2008, Copyright Limitations and Exceptions for Libraries and Archives, Standing Committee on Copyright and Related Rights, WIPO - World Intellectual Property Organization, Geneva, viewed February 2009.
DIGMAP Project 2008, 'Final Report', Programme eContentplus, European Commission.
DIGMAP Portal 2008, Programme eContentplus, European Commission, Brussels, viewed February 2009.
European Commission 2002, Digital Rights - Commission Staff Working Paper, Brussels, viewed February 2009.
European Parliament and the Council 2007, Directive 2007/2/EC of 14 March 2007, establishing an Infrastructure for Spatial Information in the European Community (INSPIRE), viewed February 2009.
Fernández-Wyttenbach, A, Álvarez, M, Bernabé-Poveda, M & Borbinha, J 2007, 'Digital Map Libraries Services in the Spatial Data Infrastructure (SDI) Framework: The DIGMAP Project', in Proceedings of the 23rd International Conference in Cartography, International Cartographic Association, viewed February 2009.
Fernández-Wyttenbach, A, Ballari, D & Manso-Callejo, M 2006, 'Digital Map Library of the Canary Islands', e-Perimetron: International Web Journal on Sciences and Technologies Affined to History of Cartography, vol. 1, no. 4, pp. 262-273, viewed February 2009.
García-González, R 2005, 'A Semantic Web Approach to Digital Rights Management', PhD thesis, Department of Information and Communication Technologies, Pompeu Fabra University, viewed February 2009.
Garnett, N 2006, Automated Rights Management Systems and Copyright Limitations and Exceptions, Standing Committee on Copyright and Related Rights, WIPO - World Intellectual Property Organization, Geneva, viewed February 2009.
Gunter, C, Weeks, S & Wright, A 2001, 'Models and Languages for Digital Rights', in Proceedings of the 34th Hawaii International Conference on System Sciences, viewed February 2009.
Head of State 2007, Law 10/2007, of June 22, on reading, books and libraries, Official State Gazette, BOE 150/2007, Madrid, viewed February 2009.
Iannella, R 2001, 'Digital Rights Management Architectures', D-Lib Magazine, vol. 7, no. 6, viewed February 2009.
Iannella, R 2001, 'Open Digital Rights Management', in Proceedings of the Workshop on Digital Rights Management for the Web, World Wide Web Consortium Workshop, viewed February 2009.
Iannella, R 2002, Open Digital Rights Language (ODRL), W3C Note, World Wide Web Consortium, viewed February 2009.
ICA International Cartographic Association 2006, Commission on Digital Technologies in Cartographic Heritage, Thessaloniki, viewed February 2009.
ISO International Organization for Standardization 1947, International harmonized stage codes, Geneva, viewed February 2009.
Lagoze, C & Van de Sompel, H 2001, 'The Open Archives Initiative: Building a Low-Barrier Interoperability Framework', in Proceedings of the 1st ACM/IEEE Joint Conference on Digital Libraries, viewed February 2009.
Library of Congress 2009, MARC Standards, Washington, viewed April 2009.
Montaner, C 2009, 'Disseminating Digital Cartographic Heritage: Standards and Infrastructures', e-Perimetron: International Web Journal on Sciences and Technologies Affined to History of Cartography and Maps, vol. 4, no. 1, pp. 53-54, viewed April 2009.
OGC Press 2009, The OGC Forms a Spatial Law and Policy Committee, Open Geospatial Consortium Press Room, Massachusetts, viewed April 2009.
Pedrosa, G, Luzio, J, Manguinhas, H, Martins, B & Borbinha, J 2008, 'DIGMAP: A Digital Library Reusing Metadata of Old Maps and Enriching It with Geographic Information', in Research and Advanced Technology for Digital Libraries: 12th European Conference, ECDL 2008, Lecture Notes in Computer Science 5173, Springer, pp. 434-435.
Vowles, G 2006, Geospatial Digital Rights Management Reference Model (GeoDRM RM), Open Geospatial Consortium Inc., viewed February 2009.