Fourth IFIP International Conference on Theoretical Computer Science - TCS 2006: IFIP 19th World Computer Congress, TC-1, Foundations of Computer Science, ... in Information and Communication Technology)


Author: Gonzalo Navarro | Leopoldo Bertossi | Yoshiharu Kohayakawa


FOURTH IFIP INTERNATIONAL CONFERENCE ON THEORETICAL COMPUTER SCIENCE- TCS 2006

IFIP - The International Federation for Information Processing

IFIP was founded in 1960 under the auspices of UNESCO, following the First World Computer Congress held in Paris the previous year. An umbrella organization for societies working in information processing, IFIP's aim is two-fold: to support information processing within its member countries and to encourage technology transfer to developing nations. As its mission statement clearly states, IFIP's mission is to be the leading, truly international, apolitical organization which encourages and assists in the development, exploitation and application of information technology for the benefit of all people. IFIP is a non-profitmaking organization, run almost solely by 2500 volunteers. It operates through a number of technical committees, which organize events and publications. IFIP's events range from an international congress to local seminars, but the most important are:

• The IFIP World Computer Congress, held every second year;
• Open conferences;
• Working conferences.

The flagship event is the IFIP World Computer Congress, at which both invited and contributed papers are presented. Contributed papers are rigorously refereed and the rejection rate is high. As with the Congress, participation in the open conferences is open to all and papers may be invited or submitted. Again, submitted papers are stringently refereed. The working conferences are structured differently. They are usually run by a working group and attendance is small and by invitation only. Their purpose is to create an atmosphere conducive to innovation and development. Refereeing is less rigorous and papers are subjected to extensive group discussion. Publications arising from IFIP events vary. The papers presented at the IFIP World Computer Congress and at open conferences are published as conference proceedings, while the results of the working conferences are often published as collections of selected and edited papers.
Any national society whose primary activity is in information may apply to become a full member of IFIP, although full membership is restricted to one society per country. Full members are entitled to vote at the annual General Assembly. National societies preferring a less committed involvement may apply for associate or corresponding membership. Associate members enjoy the same benefits as full members, but without voting rights. Corresponding members are not represented in IFIP bodies. Affiliated membership is open to non-national societies, and individual and honorary membership schemes are also offered.

FOURTH IFIP INTERNATIONAL CONFERENCE ON THEORETICAL COMPUTER SCIENCE- TCS 2006
IFIP 19th World Computer Congress, TC-1, Foundations of Computer Science, August 23-24, 2006, Santiago, Chile

Edited by Gonzalo Navarro Universidad de Chile, Chile

Leopoldo Bertossi Carleton University, Canada

Yoshiharu Kohayakawa Universidade de Sao Paulo, Brazil

Springer

Library of Congress Control Number: 2006927819

Fourth IFIP International Conference on Theoretical Computer Science- TCS 2006

Edited by G. Navarro, L. Bertossi, and Y. Kohayakawa

p. cm. (IFIP International Federation for Information Processing, a Springer Series in Computer Science)

ISSN: 1571-5736 / 1861-2288 (Internet)
ISBN-10: 0-387-34633-3
ISBN-13: 978-0-387-34633-5
eISBN-10: 0-387-34735-6
Printed on acid-free paper

Copyright © 2006 by International Federation for Information Processing. All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed in the United States of America. 9 8 7 6 5 4 3 2 1 springer.com

Preface

The papers contained in this volume were presented at the fourth edition of the IFIP International Conference on Theoretical Computer Science (IFIP TCS), held August 23-24, 2006 in Santiago, Chile. They were selected from 44 papers submitted from 17 countries in response to the call for papers. A total of 16 submissions were accepted as full papers, yielding an acceptance rate of about 36%. Papers solicited for IFIP TCS 2006 were meant to constitute original contributions in two general areas: Algorithms, Complexity and Models of Computation; and Logic, Semantics, Specification and Verification.

The conference also included six invited presentations: Marcelo Arenas (Pontificia Universidad Catolica de Chile, Chile), Jozef Gruska (Masaryk University, Czech Republic), Claudio Gutierrez (Universidad de Chile, Chile), Marcos Kiwi (Universidad de Chile, Chile), Nicola Santoro (Carleton University, Canada), and Mihalis Yannakakis (Columbia University, USA). The abstracts of those presentations are included in this volume. In addition, Jozef Gruska and Nicola Santoro accepted our invitation to write full papers related to their talks. Those two surveys are included in the present volume as well.

TCS is a biennial conference. The first edition was held in Sendai (Japan, 2000), followed by Montreal (Canada, 2002) and Toulouse (France, 2004). TCS is organized by IFIP TC1 (Technical Committee 1: Foundations of Computer Science). TCS 2006 was part of the 19th IFIP World Computer Congress (WCC 2006), constituting the TC1 Track of WCC 2006, and it was sponsored by TC1 and the Center for Web Research (CWR), at the Department of Computer Science of the University of Chile.

We thank the local WCC organizers and TC1 for their support in the organization of IFIP TCS. We also thank the members of the Program Committee and the additional reviewers for providing timely and detailed reviews. Finally, we thank TC1 for inviting us to chair this edition of TCS.

Santiago, Chile

Gonzalo Navarro, TC1 Track Chair & PC Cochair
Leopoldo Bertossi, PC Cochair
Yoshiharu Kohayakawa, PC Cochair

TCS 2006 Organization

Technical Committee 1 (TC1) Chair

Mike Hinchey

NASA, USA

WCC 2006 TC1 Track Chair

Gonzalo Navarro

Center for Web Research Department of Computer Science Universidad de Chile, Chile

Program Committee Chairs Gonzalo Navarro

Center for Web Research Department of Computer Science Universidad de Chile, Chile

Leopoldo Bertossi

School of Computer Science Carleton University, Canada

Yoshiharu Kohayakawa

Department of Computer Science Institute of Mathematics and Statistics Universidade de Sao Paulo, Brazil


Program Committee Members

Amihood Amir, Bar-Ilan University (Israel)
Marcelo Arenas, Pontificia Universidad Catolica de Chile (Chile)
Diego Calvanese, Free University of Bolzano/Bozen (Italy)
Marsha Chechik, University of Toronto (Canada)
Jan Chomicki, University at Buffalo (USA)
Josep Diaz, Universitat Politecnica de Catalunya (Spain)
Volker Diekert, Universität Stuttgart (Germany)
Thomas Eiter, Technische Universität Wien (Austria)
David Fernandez-Baca, Iowa State University (USA)
Esteban Feuerstein, Universidad de Buenos Aires (Argentina)
Gianluigi Greco, Universita della Calabria (Italy)
Jozef Gruska, Masaryk University in Brno (Czech Republic)
Claudio Gutierrez, Universidad de Chile (Chile)
Joos Heintz, Universidad de Buenos Aires (Argentina)
Douglas Howe, Carleton University (Canada)
Klaus Jansen, Universität Kiel (Germany)
Deepak Kapur, University of New Mexico (USA)
Michael Krivelevich, Tel Aviv University (Israel)
Ravi Kumar, Yahoo! Research (USA)
Leonid Libkin, University of Toronto (Canada)
Satyanarayana V. Lokam, Microsoft Research (USA)
Ernst Mayr, Technische Universität München (Germany)
Daniel Panario, Carleton University (Canada)
Rene Peralta, NIST (USA)
Jean-Eric Pin, LIAFA (CNRS, Universite Paris 7, France)
Bruce Reed, McGill University (Canada)
Marie-France Sagot, INRIA (France)
Nicola Santoro, Carleton University (Canada)
Philip Scott, Ottawa University (Canada)
Torsten Schaub, Universität Potsdam (Germany)
Angelika Steger, ETH Zurich (Switzerland)
Jayme Szwarcfiter, Universidade Federal do Rio de Janeiro (Brazil)
Wolfgang Thomas, RWTH Aachen (Germany)
Jorge Urrutia, Universidad Nacional Autonoma de Mexico (Mexico)
Alfredo Viola, Universidad de la Republica (Uruguay)


External Reviewers

Eugene Asarin, Inge Battenfeld, Marie-Pierre Beal, Flavia Bonomo, Liming Cai, Roberto Caldelli, Ivana Cerna, Luc Devroye, Florian Diedrich, Mitre Dourado, Olga Gerber, Mihaela Gheorghiu, Stefan Goller, Cristina Gomes Fernandes, Serge Grigorieff, Arie Gurfinkel, David Ilcinkas, Philippe Jorrand, Elham Kashefi, Werner Kuich, Markus Lohrey, Sylvain Lombardy, Anil Maheshwari, Arnaldo Mandel, Marc Moreno Maza, Robert W. McGrail, Michele Mosca, Thomas Noll, Pedro Ortega, Holger Petersen, Jose Miguel Piquer, Ivan Rapaport, Philipp Rohde, Mauro San Martin, Alan Schmitt, Stefan Schwoon, Peter Selinger, Ralf Thole, Imrich Vrto, Steven (Qiang) Wang

WCC 2006 Local Organization

Mauricio Solar

Universidad de Santiago, Chile


Contents

Part I Invited Talks

Locality of Queries and Transformations
Marcelo Arenas

From Informatics to Quantum Informatics
Jozef Gruska

RDF as a Data Model
Claudio Gutierrez

Adversarial Queueing Theory Revisited
Marcos Kiwi

Distributed Algorithms for Autonomous Mobile Robots
Nicola Santoro

Recursion and Probability
Mihalis Yannakakis

Part II Invited Papers

From Informatics to Quantum Informatics
Jozef Gruska

Distributed Algorithms for Autonomous Mobile Robots
Giuseppe Prencipe, Nicola Santoro

Part III Contributed Papers

The Unsplittable Stable Marriage Problem
Brian C. Dean, Michel X. Goemans, Nicole Immorlica

Variations on an Ordering Theme with Constraints
Walter Guttmann, Markus Maucher

BuST-Bundled Suffix Trees
Luca Bortolussi, Francesco Fabris, Alberto Policriti

An O(1) Solution to the Prefix Sum Problem on a Specialized Memory Architecture
Andrej Brodnik, Johan Karlsson, J. Ian Munro, Andreas Nilsson

An Algorithm to Reduce the Communication Traffic for Multi-Word Searches in a Distributed Hash Table
Yuichi Sei, Kazutaka Matsuzaki, Shinichi Honiden

Exploring an Unknown Graph to Locate a Black Hole Using Tokens
Stefan Dobrev, Paola Flocchini, Rastislav Královič, Nicola Santoro

Fast Cellular Automata with Restricted Inter-Cell Communication
Martin Kutrib, Andreas Malcher

Asynchronous Distributed Components: Concurrency and Determinacy
Denis Caromel, Ludovic Henrio

Decidable Properties for Regular Cellular Automata
Pietro Di Lena

Symbolic Determinisation of Extended Automata
Thierry Jeron, Herve Marchand, Vlad Rusu

Regular Hedge Model Checking
Julien d'Orso, Tayssir Touili

Completing Categorical Algebras
Stephen L. Bloom, Zoltán Ésik

Reusing Optimal TSP Solutions for Locally Modified Input Instances
Hans-Joachim Bockenhauer, Luca Forlizzi, Juraj Hromkovic, Joachim Kneis, Joachim Kupke, Guido Proietti, Peter Widmayer

Spectral Partitioning of Random Graphs with Given Expected Degrees
Amin Coja-Oghlan, Andreas Goerdt, Andre Lanka

A Connectivity Rating for Vertices in Networks
Marco Abraham, Rolf Kotter, Antje Krumnack, Egon Wanke

On PTAS for Planar Graph Problems
Xiuzhen Huang, Jianer Chen

Index

Part I

Invited Talks

Locality of Queries and Transformations (Invited Talk)

Marcelo Arenas *
Center for Web Research & Computer Science Department, Pontificia Universidad Catolica de Chile, Escuela de Ingeniería - DCC143, Casilla 306, Santiago 22, Chile. marenas@ing.puc.cl

Abstract Locality notions in logic say that the truth value of a formula can be determined locally, by looking at the isomorphism type of a small neighborhood of its free variables. Such notions have proved to be useful in many applications, especially in computer science. They all, however, refer to isomorphism of neighborhoods, which most local logics cannot test. A more relaxed notion of locality says that the truth value of a formula is determined by what the logic itself can say about that small neighborhood. Or, since most logics are characterized by games, the truth value of a formula is determined by the type, with respect to a game, of that small neighborhood. Such game-based notions of locality can often be applied when traditional isomorphism-based locality cannot. In the first part of this talk, we show some recent results on game-based notions of locality. We look at two progressively more complicated locality notions, and we show that the overall picture is much more complicated than in the case of isomorphism-based notions of locality. In the second part of this talk, we concentrate on the locality of transformations, rather than queries definable by formulas. In particular, we show how the game-based notions of locality can be used in data exchange settings to prove inexpressibility results.

* Partially supported by FONDECYT grant 1050701 and the Millennium Nucleus Center for Web Research, Grant P04-067-F, Mideplan, Chile.
Please use the following format when citing this chapter: Arenas, M., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), p. 3.

From Informatics to Quantum Informatics (Invited Talk)

Jozef Gruska*
Faculty of Informatics, Masaryk University, Brno, Czech Republic. gruska@fi.muni.cz

Abstract In recent years, the exploration of quantum information processing and communication science and technology has gained significant momentum, and it has become quite clear that the paradigms, concepts, models, tools, methods and outcomes of informatics play a very important role in it. They not only help to solve the problems that quantum information processing and communication encounters, but they also bring a new quality into these investigations, to such an extent that one can now acknowledge the emergence of quantum informatics as an important area of fundamental science, with contributions not only to quantum physics but also to (classical) informatics. The main goal of the talk is to demonstrate the emergence of quantum informatics as a fundamental, deep and broad science, together with its outcomes and, especially, its main new fascinating challenges from the point of view of informatics and physics. These include challenges in the search for new primitives and computation modes; for a new quality of efficiency and feasibility of computation and communication; for a new quality of quantum cryptographic protocols in a broad sense; and in the new and promising area of quantum formal systems for programming, semantics, reasoning and verification. The talk is aimed at informaticians who are pedestrians in the quantum world but would like to see what the new driving forces in informatics are, where they drive us, and how.

* Support of the grants GACR 201/04/1153 and MSM0021622419 is acknowledged.
Please use the following format when citing this chapter: Gruska, J., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), p. 5.

RDF as a Data Model (Invited Talk)

Claudio Gutierrez *
Center for Web Research, Computer Science Department, Universidad de Chile, Blanco Encalada 2120, 3er piso, Santiago, Chile. cgutierr@dcc.uchile.cl

Abstract The Resource Description Framework (RDF) is the W3C recommendation language for representing metadata about Web resources. It is the basic data layer of the Semantic Web. The original design was influenced by the Web, library, XML and Knowledge representation communities. The driving idea was a language to represent information in a minimally constraining and flexible way. It turns out that the impact of the proposal goes far beyond the initial goal, particularly as a model for representing information with a graph-like structure. In the first half of the talk we will review RDF as a database model, that is, from a data management perspective. We will compare it with two data models developed by the database community which have strong similarities with RDF, namely, the semistructured and the graph data models. We will focus the comparison on data structures and query languages. In the second half of the talk, we will discuss some of the challenges posed by RDF to the Computer Science Theory Community:

1. RDF as data model: Database or knowledge base?
2. Abstract model for RDF: What is a good foundation?
3. Concrete -real life- RDF data: What are the interesting fragments?
4. Theoretical novelties of the RDF data model: Are there any?
5. RDF Query Language: Can the database experience be of any help?
6. Infrastructure for large-scale evaluation of data management methodologies and tools for RDF: Waiting for something?
7. Storing, Indexing, Integrity Constraints, Visualization et al.: Theory is required.
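To make the "graph-like structure" concrete, here is a minimal Python sketch of RDF data viewed as a set of (subject, predicate, object) triples, together with a single-pattern query. It is only an illustration under stated assumptions: the `ex:` identifiers, the sample statements, and the helper names `match` and `as_graph` are hypothetical, and the matching shown is far simpler than a real RDF query language.

```python
from collections import defaultdict

# Hypothetical example data: each RDF statement is a (subject, predicate, object) triple.
TRIPLES = [
    ("ex:tcs2006", "ex:heldIn", "ex:Santiago"),
    ("ex:tcs2006", "ex:partOf", "ex:wcc2006"),
    ("ex:wcc2006", "ex:heldIn", "ex:Santiago"),
    ("ex:Santiago", "ex:locatedIn", "ex:Chile"),
]

def match(triples, pattern):
    """Return variable bindings for one triple pattern.

    Terms starting with '?' are variables; any other term must match exactly.
    """
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value everywhere.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results

def as_graph(triples):
    """View the triples as a labeled directed graph: node -> [(edge label, node)]."""
    graph = defaultdict(list)
    for s, p, o in triples:
        graph[s].append((p, o))
    return dict(graph)
```

For instance, `match(TRIPLES, ("?x", "ex:heldIn", "ex:Santiago"))` binds `?x` to each subject that has an `ex:heldIn` edge to `ex:Santiago`, while `as_graph` exposes the same data as adjacency lists, which is the view shared with graph data models.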

* The speaker acknowledges the support of Millennium Nucleus Center for Web Research, Grant P04-067-F, Mideplan, Chile.
Please use the following format when citing this chapter: Gutierrez, C., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), p. 7.

Adversarial Queueing Theory Revisited (Invited Talk)

Marcos Kiwi*
Depto. Ing. Matematica & Ctr. Modelamiento Matematico UMI 2807, Universidad de Chile. Blanco Encalada 2120, piso 5, Santiago, Chile. www.dim.uchile.cl/~mkiwi

Abstract We survey over a decade of work on a classical Queueing Theory problem: the long-term equilibrium of routing networks. However, we do so from the perspective of Adversarial Queueing Theory, where no probabilistic assumptions about traffic patterns are made. Instead, one considers a scenario where an adversary controls service requests and tries to congest the network. Under mild restrictions on the adversary, one can often still guarantee the network's stability. We illustrate other applications of an adversarial perspective to standard algorithmic problems. We conclude with a discussion of new potential domains of applicability of such an adversarial view of common computational tasks.
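As a toy illustration of how a restriction on the adversary can bound congestion, the Python sketch below simulates a single unit-capacity link. It is not the network model surveyed in the talk: the rate-and-burst restriction (at most `rate * t + burst` injections in any window of `t` steps), its token-bucket implementation, the greedy adversary, and the name `simulate_link` are all assumptions made for illustration.

```python
def simulate_link(rate, burst, steps):
    """Simulate one unit-capacity link under a greedy (rate, burst) adversary.

    The adversary is modeled (an assumption) as a token bucket of capacity
    `burst` refilled by `rate` tokens per step; it always injects as many
    packets as its whole tokens allow. The link forwards one packet per step.
    Returns the largest backlog observed at the end of a step.
    """
    tokens = float(burst)   # injection budget currently available
    queue = 0               # packets waiting at the link
    max_backlog = 0
    for _ in range(steps):
        tokens = min(burst, tokens + rate)
        injected = int(tokens)       # greedy adversary spends every whole token
        tokens -= injected
        queue += injected
        if queue > 0:
            queue -= 1               # the link forwards one packet per step
        max_backlog = max(max_backlog, queue)
    return max_backlog
```

With `rate` below the link capacity the backlog in this toy setting stays bounded by the burst size, whereas with `rate` above 1 it grows with the number of steps. On a single link this is unsurprising; the interest of the adversarial model lies in networks, where the interplay of topology and scheduling policy makes stability a nontrivial question.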

Background

In 1996 Borodin et al. [9] proposed a robust model of queueing theory in network traffic. The gist of their proposal is to replace stochastic assumptions about the packet traffic by restrictions on the packet arrival rate, which otherwise can be under the control of an adversary. Thus, they gave rise to what is currently termed Adversarial Queueing Theory (AQT). In it, the time-evolution of the routing network is viewed as a game between an adversary and a packet scheduling protocol. The AQT framework originally focused on the issue of stability of queueing policies and network topologies. Characterizations and efficient algorithms were developed for deciding stability of a collection of networks for specific families of scheduling policies. Generalizations of the AQT framework were proposed, end-to-end packet delay issues were addressed, time-dependent network topology variants were considered, and so on. We survey a decade of results in AQT. We point to other work where a similar adversarial approach has been successfully developed. We conclude with a discussion of other computational domains where a similar adversarial approach might be fruitfully applied.

* Gratefully acknowledges the support of CONICYT via FONDAP in Applied Mathematics and Anillo en Redes.
Please use the following format when citing this chapter: Kiwi, M., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 9-10.


References

1. W. Aiello, E. Kushilevitz, R. Ostrovsky, and A. Rosen. Adaptive packet routing for bursty adversarial traffic. In Proc. of the ACM Symposium on Theory of Computing, 359-368, 1998.
2. C. Alvarez, M. Blesa, J. Diaz, A. Fernandez, and M. Serna. Adversarial models for priority based networks. In Proc. of the International Symposium on Mathematical Foundations of Computer Science, 142-151, Springer-Verlag, 2003.
3. C. Alvarez, M. Blesa, and M. Serna. A characterization of universal stability in the adversarial queueing model. SIAM J. Comput., 34(1):41-66, 2004.
4. M. Andrews, B. Awerbuch, A. Fernandez, J. Kleinberg, T. Leighton, and Z. Liu. Universal stability results and performance bounds for greedy contention resolution protocols. J. of the ACM, 48(1):39-69, 2001.
5. M. Andrews, A. Fernandez, A. Goel, and L. Zhang. Source route and scheduling in packet networks. In Proc. of the IEEE Symposium on Foundations of Computer Science, 2001.
6. E. Anshelevich, D. Kempe, and J. Kleinberg. Stability of load balancing algorithms in dynamic adversarial systems. In Proc. of the ACM Symposium on Theory of Computing, 399-406, 2002.
7. B. Awerbuch, P. Berenbrink, A. Brinkmann, and C. Scheideler. Simple routing strategies for adversarial systems. In Proc. of the IEEE Symposium on Foundations of Computer Science, 158-167, 2001.
8. R. Bhattacharjee, A. Goel, and Z. Lotker. Instability of FIFO at arbitrarily low rates in the adversarial queueing model. SIAM J. Comput., 34(2):318-332, 2005.
9. A. Borodin, J. Kleinberg, P. Raghavan, M. Sudan, and D. Williamson. Adversarial queueing theory. J. of the ACM, 48(1):13-38, 2001.
10. A. Borodin, R. Ostrovsky, and Y. Rabani. Stability preserving transformations: Packet routing networks with edge capacity and speed. In Proc. of the ACM-SIAM Symposium on Discrete Algorithms, 601-610, 2000.
11. A. Charny and J.-Y. Le Boudec. Delay bounds in a network with aggregate scheduling. In Proc. of the International Workshop on Quality of Future Internet Services, 1-13. Springer-Verlag, 2000.
12. I. Chlamtac, A. Farago, H. Zhang, and A. Fumagalli. A deterministic approach to the end-to-end analysis of packet flows in connection-oriented networks. IEEE/ACM T. Network., 6(4):422-431, 1998.
13. D. Gamarnik. Stability of adversarial queues via fluid model. In Proc. of the IEEE Symposium on Foundations of Computer Science, 60-70, 1998.
14. M. Kiwi and A. Russell. The Chilean highway problem. Theor. Comput. Sci., 326(1-3):329-342, 2004.
15. M. Kiwi, M. Soto, and C. Thraves. Adversarial queueing theory with setups. Technical report, Center for Mathematical Modelling, U. Chile, 2006.
16. P.R. Kumar and T.I. Seidman. Dynamic instabilities and stabilization methods in distributed real-time scheduling of manufacturing systems. IEEE Trans. on Automat. Contr., 35(3):289-298, 1990.
17. J.-Y. Le Boudec and G. Hebuterne. Comments on "A deterministic approach to the end-to-end analysis of packet flows in connection oriented network". IEEE/ACM T. Network., 8(1):121-124, 2000.
18. Z. Lotker, B. Patt-Shamir, and A. Rosen. New stability results for adversarial queuing. In Proc. of the ACM Symposium on Parallel Algorithms and Architectures, 192-199, 2002.

Distributed Algorithms for Autonomous Mobile Robots (Invited Talk)

Nicola Santoro
School of Computer Science, Carleton University, santoro@scs.carleton.ca

Abstract The distributed coordination and control of a team of autonomous mobile robots is a problem widely studied in a variety of fields, such as engineering, artificial intelligence, artificial life, and robotics. In these areas, the problem is generally studied from an empirical point of view. Recently, a significant research effort has been, and continues to be, spent on understanding the fundamental algorithmic limitations on what a set of autonomous mobile robots can achieve. In particular, the focus is on identifying the minimal robot capabilities (sensorial, motorial, computational) that allow a problem to be solvable and a task to be performed. In this talk we describe the current investigations on the interplay between robot capabilities, computability, and algorithmic solutions of coordination problems by autonomous mobile robots.

Please use the following format when citing this chapter: Santoro, N., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), p. 11.

Recursion and Probability (Invited Talk)

Mihalis Yannakakis * Department of Computer Science, Columbia University, 455 Computer Science Building, 1214 Amsterdam Avenue, Mail Code 0401 New York, NY 10027. [email protected]

Abstract We discuss recent work on the algorithmic analysis of systems involving recursion and probability. Recursive Markov chains extend ordinary finite state Markov chains with the ability to invoke other Markov chains in a potentially recursive manner. They offer a natural abstract model of probabilistic programs with procedures, and generalize other classical well-studied stochastic models, e.g., Multi-type Branching Processes and Stochastic Context-free Grammars. Recursive Markov Decision Processes and Recursive Stochastic Games similarly extend ordinary finite Markov decision processes and stochastic games, and they are natural models for recursive systems involving both probabilistic and nonprobabilistic actions. In a series of recent papers with Kousha Etessami (U. of Edinburgh), we have introduced these models and studied central algorithmic problems regarding questions of termination, reachability, and analysis of the properties of their executions. In this talk we will present some of the basic theory and algorithms.
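A small worked example may help convey what a termination question looks like. The Python sketch below considers a hypothetical recursive procedure that, on each invocation, terminates immediately with probability `1 - p_call` and otherwise calls itself twice in sequence. Its termination probability is the least non-negative solution of x = (1 - p_call) + p_call * x^2, which is also the extinction probability of the corresponding branching process. The procedure, the function name, and the use of plain fixed-point iteration from 0 are illustrative assumptions, not the algorithms presented in the talk.

```python
def termination_probability(p_call, tolerance=1e-12, max_iter=1_000_000):
    """Least fixed point of x = (1 - p_call) + p_call * x**2, by iteration from 0.

    Models a hypothetical procedure that terminates immediately with
    probability 1 - p_call, and otherwise calls itself twice in sequence.
    Iterating a monotone map from 0 approaches the least fixed point from below.
    """
    x = 0.0
    for _ in range(max_iter):
        nxt = (1.0 - p_call) + p_call * x * x
        if abs(nxt - x) < tolerance:
            return nxt
        x = nxt
    return x
```

For `p_call = 2/3` the equation has the two solutions 1/2 and 1, and the iteration converges to the smaller one, so the procedure terminates with probability 1/2. For `p_call = 0.25` the only solution in [0, 1] is 1, so termination is almost sure. Plain iteration can converge very slowly near the critical value `p_call = 0.5`, which is one reason more refined numerical methods are of interest.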

* Research partially supported by NSF Grant CCF-4-30946.
Please use the following format when citing this chapter: Yannakakis, M., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), p. 13.

Part II

Invited Papers

From Informatics to Quantum Informatics

Jozef Gruska*
Faculty of Informatics, Masaryk University, Brno, Czech Republic
gruska@fi.muni.cz

Abstract. Quantum phenomena exhibit a variety of weird, counterintuitive, puzzling, mysterious and even entertaining effects. Quantum information processing tries to make effective use of these phenomena to design new quantum information processing and communication technology, and also to gain a better understanding of the quantum and information processing worlds. In recent years, the exploration of quantum information processing and communication science and technology has gained significant momentum, and it has become quite clear that the paradigms, concepts, models, tools, methods and outcomes of informatics play a very important role in it. They not only help to solve the problems that quantum information processing and communication encounter, but they also bring a new quality into these investigations, to such an extent that one can now acknowledge the emergence of quantum informatics as an important new area of fundamental science, with contributions not only to quantum physics but also to (classical) informatics itself. The main goal of this paper is to demonstrate the emergence of quantum informatics as a fundamental, deep and broad science, together with its outcomes and, especially, its main new fascinating challenges from the point of view of informatics and physics. These include challenges in the search for new primitives and computation modes; for a new quality of efficiency and feasibility of computation and communication; for a new quality of quantum cryptographic protocols in a broad sense; and in the new and promising area of quantum formal systems for programming, semantics, reasoning and verification. The paper is aimed at informaticians who are pedestrians in the mysterious quantum world but would like to see what the new driving forces in informatics are, where they drive us, why and how. In the paper, which is oriented towards a broad audience, the main mysteries, puzzles and specific features of the quantum world are dealt with, as well as the basic models, laws, limitations, results and the state of the art of quantum information processing and communication.

1 Introduction

In quantum computing we witness a merger of arguably the two most important areas of science of the 20th century: quantum physics and informatics. It would therefore be astonishing if such a merger did not shed new light on both of them and did not bring great new discoveries. This merger is surely bringing new aims, challenges and potentials for informatics, as well as new approaches to exploring the quantum world. Although it is hard to predict the particular impacts of quantum computing on computing in general, it is quite safe to expect that the merger will lead to important outcomes.

Since the very beginning of quantum mechanics, various of its mysterious and counterintuitive phenomena have been discovered, but the science community did not pay much attention to them, because they looked like innocent features that largely exist due to our still imperfect mathematical model/understanding of the quantum world, or like phenomena whose investigation could be postponed. The randomness of quantum measurement and the resulting collapse of the quantum state being measured, quantum entanglement, and the non-locality in correlations exhibited due to it^1 are perhaps the most puzzling ones. Quantum counterfactual effects, with their peculiar consequences^2, are even weirder phenomena. In the meantime, the situation has changed radically.

Quantum entanglement has been shown to be useful for performing actions that are not possible in the classical world, such as quantum teleportation (Bennett et al., 1993); for achieving in computation an efficiency that seems to be impossible in the classical world, as in Shor's polynomial time algorithms for factorization and discrete logarithms (Shor, 1994); for achieving a level of security not possible in the classical world (for example, for classical key generation (Ekert, 1991)); for increasing exponentially the efficiency of communication protocols (Raz, 1999); for introducing new important capacities and increasing old capacities of quantum channels (see Gruska (1999-2005) and Nielsen and Chuang (2000) for an overview); and so on. All that is still only a small part of the success story of quantum entanglement, which has been experimentally demonstrated over distances of up to 50 km using fiber (Marcikic et al., 2004) and up to 13 km through the noisy ground atmosphere (see Peng et al., 2004). It is, for example, believed, and expected by some, that quantum entanglement will also have large practical impacts, for example in increasing the quality of measurements (see Childs et al., 1999).

^1 As formally defined later, entanglement of quantum states is defined using the Hilbert space formalism for quantum phenomena. However, the existence of non-local correlations is an experimentally observed phenomenon and therefore independent of the choice of formalism. At the moment, the only observed non-local correlations are those exhibited by entangled states. This, however, does not exclude that some other non-local correlations will be discovered.

^2 The term counterfactual is usually used for things that might have happened, although they did not really happen. An important point is that while classical counterfactuals do not have physical consequences, quantum counterfactuals can have surprisingly big consequences, because the mere possibility that some quantum event might have happened can change the probabilities of obtaining various experimental outcomes. For example, it can be shown that a quantum computer can provide the result of a computation without performing the computation, provided it would provide the same result by really performing the computation (Mitchison and Jozsa, 1999).

* Support of the grants GACR 201/04/1153 and MSM0021622419 is acknowledged.
Please use the following format when citing this chapter: Gruska, J., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 17-46.

To summarize, quantum entanglement is now considered a new and very important resource for quantum information processing and communication, a resource that has, in addition, the following potentials (see also Gruska 1999-2005, 2003):

- To provide a new gold mine for science and technology;
- To give an edge to quantum versus classical information processing and communication;
- To help to understand better various important physical phenomena.

Surely, the most puzzling and powerful consequence of the existence of entangled quantum states is the non-locality that their measurements exhibit. Namely, if a set of particles is in an entangled state and one of the particles is measured, then this measurement immediately influences/determines the results of subsequent measurements of the other particles. There are therefore non-local correlations between the results of the measurements of particles in an entangled state.

Fig. 1. EPR-box (figure label: x = y implies a = b)

Quantum non-locality, exhibited by the measurement of the so-called EPR-state $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, can be modelled by the so-called EPR-box shown in Figure 1. There are two parties involved, A and B, far separated in space, that do not communicate with each other, and an imaginary box with two input-output ports, one for each of the parties. If party A puts an input a into its port, it immediately gets an output x, and if party B puts in an input b, it immediately gets an output y. The key property of the EPR-box is that if a = b, then x = y, no matter in which order the parties put their inputs in and how much time passes between their entries. The no-communication (no-signaling) condition means that the output of Alice (Bob) does not depend on the input of Bob (Alice). The non-locality exhibited by the EPR-box can be manifested by the measurement of entangled states, namely of the EPR-state. However, the non-locality exhibited by the so-called PR-box, shown in Figure 2, where inputs and outputs are always in the relation $x \cdot y = a \oplus b$, seems to be beyond


J. Gruska

the possibilities of the physical world. Indeed, if there were a physical system that allowed one to implement the PR-box, then any multiparty communication could be done by transmitting only a single bit (van Dam, 2005), which can indeed be seen as impossible. Interestingly enough, none of these non-localities allows instantaneous communication, and therefore they actually do not contradict the no-signaling condition of special relativity^4.^5 The task of understanding non-locality is one of the most important in current science. In this connection, the recent experiment (Scarani et al., 2000) is of importance, from which it follows that there are reasons to believe that either space-time is an illusion, or free will is an illusion, or, as their experiment confirms, there is a special "quantum information" that travels faster than light (but cannot be used directly to communicate classical information).

^4 The no-signaling condition actually says that a local choice of measurements may not lead to observable differences at the other end.

^5 The PR-box may seem to be an artificial construction, but it is not so, and it comes out very naturally when non-classical correlations and their limits are considered. Indeed, the basic scheme is that two parties separated in space, say A and B, that cannot communicate, have access to a physical state and can use it to generate correlations. This can be seen as follows: both parties perform one of two randomly chosen measurements, the outcomes of these measurements are given by random variables, and one asks how much these outcomes can be correlated. Both classical physics and quantum mechanics put certain limits on the strength of such correlations. The limits that any classical theory (i.e. local hidden variable theory) provides are known as Bell inequalities (Bell, 1964). There are many of them, and among them a special position has the so-called CHSH inequality

$$\sum_{a,b \in \{0,1\}} \mathrm{Prob}(x_a \oplus y_b = a \cdot b) \le 3,$$

where a and b denote the choices of the measurements of A and B, and $x_a$, $y_b$ are the outcomes of the measurements. Quantum mechanics allows violation of this inequality, but only up to the so-called Cirel'son's bound $2 + \sqrt{2}$. The PR-box captures the maximal mathematically possible violation of this bound. In spite of the fact that van Dam's result strongly indicates/proves the physical impossibility of PR-boxes, they keep being intensively studied. For example, it has also been shown (Short et al., 2005) that the availability of PR-boxes would allow an unconditionally secure oblivious transfer protocol, an important cryptographic primitive. Cerf et al. (2005) have also shown that a single PR-box could be used to simulate the EPR-box, and therefore a maximally entangled state (its measurements), though not an arbitrary two-qubit entangled state, and that the PR-box would be a strictly weaker resource than a bit of communication. The PR-box can also be used to show that the no-cloning theorem holds. PR-boxes have a variety of other surprising and also counterintuitive properties. They are surveyed nicely and referenced well by Scarani (2006). For example, two parties may need $2^n$ PR-boxes for some tasks that can be performed using n EPR states. In addition, for all natural measures of non-locality, non-maximally entangled states exhibit more non-locality than maximally entangled states.
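The three bounds in the CHSH discussion above can be checked by brute force. The following is a minimal sketch, assuming that a classical (local hidden variable) strategy can be reduced to a deterministic choice of outputs $x_0, x_1$ for A and $y_0, y_1$ for B; the function and variable names are illustrative and not from the text.

```python
from itertools import product
from math import sqrt


def chsh_score(strategy):
    """Sum over inputs (a, b) of Prob(x_a XOR y_b == a AND b).

    strategy = (x0, x1, y0, y1): A outputs x_a on input a, B outputs y_b on input b.
    For a deterministic strategy each probability is either 0 or 1.
    """
    x0, x1, y0, y1 = strategy
    xs, ys = (x0, x1), (y0, y1)
    return sum(1 for a, b in product((0, 1), repeat=2) if xs[a] ^ ys[b] == (a & b))


# Best deterministic strategy; a random mixture of strategies cannot beat the best one.
classical_max = max(chsh_score(s) for s in product((0, 1), repeat=4))

# Cirel'son's bound as quoted in the text.
quantum_bound = 2 + sqrt(2)

# A box satisfying the relation inside Prob(...) for all four input pairs scores 4.
pr_box_score = 4

print(classical_max, round(quantum_bound, 4), pr_box_score)
```

The enumeration covers all 16 deterministic strategies, so the classical maximum it reports is exhaustive for this small case rather than an estimate.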


Fig. 2. PR-box (figure label: x · y = (a + b) mod 2)

Quantum superposition, which stands for the fact that any quantum state is a weighted superposition of the states of a basis (with complex numbers as weights, namely probability amplitudes specifying the probabilities of the transfer from a given state to a particular state of the basis), is another very special quantum phenomenon. One of its implications is quantum parallelism, which allows, for example, an action to be performed on a single state of n quantum bits, in a single step, that corresponds, in some sense, to $2^n$ computation steps in the classical world. For example, one can get, in one step, into the amplitudes of a quantum n-qubit state all values of a function $f : \{0, \ldots, 2^n - 1\} \to \{0, \ldots, 2^n - 1\}$.^6 There is a certain catch in this result/fact, because there is no way to faithfully get all these values out of the resulting quantum state. However, in some important cases, as in Shor's algorithm for factorization of integers n, this does not really matter, because what one needs to compute is only a single value, a period of a properly chosen function $f(x) = a^x \bmod n$, and in such a case such massive quantum parallelism is indeed useful. A mysterious fact is why we do not observe superposition and entanglement between objects of the classical world if our world is actually fully quantum.^7

This strange situation was demonstrated long ago by the famous Schrödinger's cat Gedanken experiment, with a cat that is in a superposition of the states $|alive\rangle$ and $|dead\rangle$, though no one has ever seen a cat that would be both alive and dead. An important agenda of current experimental research is therefore to find some border lines, if they exist at all, between the world in which superposition exists and the one where no superposition can be detected.^8 There have been surprising results in such investigations recently. For example, entanglement has been demonstrated for a group of $10^{12}$ atoms (see Julsgaard et al., 2000) and quantum interference for large molecules (see Brezger et al., 2002). However, there is still a range of several orders of magnitude to explore in order to find where the border between the classical and the quantum world is.

Concerning quantum measurement, there are also several mysterious and counterintuitive things. The first one is the fact that the results of quantum measurement are random. Einstein's position was expressed by his famous words "God does not roll dice", but equally famous is Bohr's reply "The true God does not allow anybody to prescribe what he has to do".^9 A second puzzling fact about quantum measurement is that the theory does not say anything about how much a particular measurement really costs in terms of physical resources. Because of that, it is usually assumed, in efficiency calculations, that a measurement step requires unit time. However, this does not seem to be realistic, because sometimes a quantum measurement can be seen as Nature performing, in "unit time", a quite complicated computation, which is again against our common sense. Quantum measurement can therefore be seen as a special resource that, if properly used, can do miracles from the quantum information processing point of view.

^6 In more technical detail, it works as follows. If $f : \{0, 1, \ldots, 2^n - 1\} \to \{0, 1, \ldots, 2^n - 1\}$, then the mapping $f' : (x, 0) \Rightarrow (x, f(x))$ is one-to-one, and therefore there is a unitary transformation $U_f$ such that for any $x \in \{0, 1, \ldots, 2^n - 1\}$

$$U_f(|x\rangle|0\rangle) = |x\rangle|f(x)\rangle.$$

The state $|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |i\rangle|0\rangle$ can be obtained in a single step, using the Hadamard transform, from the all-zero basis state, and with a single application of the mapping $U_f$ on the state $|\psi\rangle$ we get

$$U_f|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{i=0}^{2^n-1} |i\rangle|f(i)\rangle.$$

Hence, in a single computation step, $2^n$ values of f are computed! We have therefore a really massive parallelism.

^7 Of interest in this context are two well known citations: "There is no quantum world. There is only an abstract quantum physical description. It is wrong to think that the task of physics is to find out how Nature is. Physics concerns what we can say about Nature", by N. Bohr, and "There is no classical world - there is only quantum world", by D. Greenberger (see Arndt et al., 2005), who actually said: "I believe there is no classical world. There is only quantum world. Classical physics is a collection of unrelated insights: Newton's laws, Hamilton's principle, etc. Only quantum theory brings out their connection. An analogy is the Hawaiian Islands, which look like a bunch of islands in the ocean. But if you could lower the water, you would see that they are the peaks of a chain of mountains. That is what quantum physics does to classical physics."

^8 In this context other views from Arndt et al. (2005) are also of interest: "The border between classical and quantum phenomena is just a question of money", by A. Zeilinger; "The classical-quantum boundary is simply a matter of information control", by M. Aspelmeyer; and "There is no border between classical and quantum phenomena - you just have to look closer", by R. Bertlman.

^9 Experiments performed recently actually imply not only that God does play dice, but actually that God plays with non-local dice, because measurement of an entangled state can produce shared randomness; see Gisin (2005).
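The quantum-parallelism construction with $U_f$ described above can be mimicked classically on a small example. The sketch below is only an illustration under stated assumptions: the state is stored as a dictionary from basis labels (x, y) to amplitudes, $U_f$ is extended to all basis states as $|x\rangle|y\rangle \to |x\rangle|y \oplus f(x)\rangle$ (a common convention the text does not spell out), and the example function $f(x) = 7^x \bmod 15$ with n = 4 is chosen here purely for illustration.

```python
from math import sqrt, isclose


def uniform_superposition(n):
    """State (1/sqrt(2^n)) * sum_x |x>|0>, stored as {(x, y): amplitude}."""
    size = 2 ** n
    return {(x, 0): 1 / sqrt(size) for x in range(size)}


def apply_u_f(state, f):
    """Apply U_f : |x>|y> -> |x>|y XOR f(x)>; on y = 0 this gives |x>|f(x)>."""
    return {(x, y ^ f(x)): amp for (x, y), amp in state.items()}


def f(x):
    """Illustrative function only: f(x) = 7^x mod 15."""
    return pow(7, x, 15)


n = 4
result = apply_u_f(uniform_superposition(n), f)

# One application of U_f leaves all 2^n pairs (x, f(x)) in the state,
# each with amplitude 1/sqrt(2^n), so the squared amplitudes sum to 1.
total_probability = sum(abs(amp) ** 2 for amp in result.values())
print(len(result), round(total_probability, 6))
```

A classical simulation like this one stores all $2^n$ amplitudes explicitly, which is exactly the exponential cost that the quantum state avoids; the catch mentioned in the text is that a measurement would return only one pair (x, f(x)).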


2 Basics of quantum information processing and communication

Quantum physics deals with the fundamental entities of physics, particles like (a) protons, electrons and neutrons (from which matter is built); (b) photons (which carry electromagnetic radiation); (c) various "elementary particles" which mediate other interactions of physics. We call all of them particles in spite of the fact that some of their properties are totally unlike the properties of what we call particles in our ordinary world. (Actually, it is not clear in which sense these "particles" can be said to have properties at all.)

It is also clear that quantum physics is an elegant and conceptually simple theory that describes with surprising precision a large spectrum of the phenomena of Nature. Predictions made on the basis of quantum physics have been experimentally verified to 14 orders of precision. No conflict between predictions of the theory and experiments is known. Without quantum physics we cannot explain the properties of superfluids, the functioning of lasers, the color of stars, ...

Quantum physics is of special interest for informatics for several reasons. One of them is the similarity, in a sense, and the close relation between these two areas of science. Indeed, the goal of physics can be seen as studying the elements, processes, laws and limitations of the physical world. The goal of informatics can then be seen as studying the elements, processes, laws and limitations of the information world. It is therefore of large importance to explore which of these two worlds, physical and information, is more basic, if any, and what the main relations are between the basic concepts, principles, laws and limitations of these two worlds. Quantum physics can also be seen as an excellent theory for predicting probabilities of quantum events.
Such predictions are to a large extent based on three simple principles:

P1 To each transfer from a quantum state $\phi$ to a state $\psi$, a complex number $\langle\psi|\phi\rangle$ is associated, which is called the probability amplitude of the transfer, and $|\langle\psi|\phi\rangle|^2$ is then the probability of such a transfer.

P2 If a transfer from a quantum state $\phi$ to a quantum state $\psi$ can be decomposed into two subsequent transfers $\psi \leftarrow \phi' \leftarrow \phi$, then the resulting amplitude of the transfer from $\phi$ to $\psi$ is the product of the amplitudes of the subsequent subtransfers: $\langle\psi|\phi\rangle = \langle\psi|\phi'\rangle\langle\phi'|\phi\rangle$.

P3 If the transfer from a state $\phi$ to a state $\psi$ has two independent alternatives, with amplitudes $\alpha$ and $\beta$, then the resulting amplitude is the sum $\alpha + \beta$ of the amplitudes of the two subtransfers, which can be zero if $\alpha = -\beta$. (This has surprising consequences. It may happen that there are two ways, each with positive probability $k = |\alpha|^2$, to get from a state $|\phi\rangle$ to a state $|\psi\rangle$, but if both options are possible, then such a transfer has zero probability.)

To the physical concept of a quantum system, the mathematical concept of a Hilbert space is usually associated, and to the physical concept of a (pure) state of a closed (that is, not interacting with the environment) quantum system, the mathematical concept of a vector/state of a Hilbert space corresponds.
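Principles P1 to P3 can be tried out on a tiny worked example. The sketch below assumes, purely for illustration, a two-state system whose single-step amplitudes are those of the Hadamard transform; the table `step` and the function names are hypothetical. Amplitudes are multiplied along a path (P2), added over alternative paths (P3), and squared in absolute value to get probabilities (P1).

```python
from math import sqrt, isclose

s = 1 / sqrt(2)

# Illustrative single-step amplitudes <to|from>, keyed as (to, from).
step = {(0, 0): s, (1, 0): s, (0, 1): s, (1, 1): -s}


def path_amplitude(start, middle, end):
    """P2: the amplitude of a two-step path is the product of the step amplitudes."""
    return step[(middle, start)] * step[(end, middle)]


def transfer_amplitude(start, end):
    """P3: add the amplitudes of the two alternative paths (via state 0 or via state 1)."""
    return sum(path_amplitude(start, middle, end) for middle in (0, 1))


def probability(amplitude):
    """P1: the probability is the squared absolute value of the amplitude."""
    return abs(amplitude) ** 2


# Each path from 0 to 1 alone has positive probability ...
p_via_0 = probability(path_amplitude(0, 0, 1))
p_via_1 = probability(path_amplitude(0, 1, 1))
# ... but with both alternatives available the amplitudes cancel.
p_both = probability(transfer_amplitude(0, 1))
print(round(p_via_0, 3), round(p_via_1, 3), round(p_both, 3))
```

In this example the two alternatives have amplitudes $\alpha = 1/2$ and $\beta = -1/2$, which is the cancelling case $\alpha = -\beta$ mentioned in P3.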


Hilbert space $H_n$ is an n-dimensional complex vector space on which the scalar product

$$\langle\psi|\phi\rangle = \sum_{i=1}^{n} \psi_i^* \phi_i$$

of any two vectors

$$|\phi\rangle = (\phi_1, \phi_2, \ldots, \phi_n)^T, \qquad |\psi\rangle = (\psi_1, \psi_2, \ldots, \psi_n)^T$$

is defined, as well as the norm of a vector, $\|\phi\| = \sqrt{|\langle\phi|\phi\rangle|}$, and the metric $\mathrm{dist}(\phi, \psi) = \|\phi - \psi\|$. This allows one to introduce on H a topology and such concepts as continuity. Two quantum states are called orthogonal if their scalar product is zero. This is a very important concept, because only orthogonal states are physically perfectly distinguishable.

Dirac introduced the following handy notation, the so-called bra-ket notation, to deal with amplitudes, quantum states and linear functionals $f : H \to \mathbf{C}$. If $\psi, \phi \in H$, then $\langle\psi|\phi\rangle$ is the scalar product of $\psi$ and $\phi$ (and an amplitude of going from $\phi$ to $\psi$); $|\phi\rangle$ is called a ket-vector, a column vector equivalent to $\phi$; $\langle\psi|$ is a bra-vector, a row vector, a linear functional on H such that

$$\langle\psi|(|\phi\rangle) = \langle\psi|\phi\rangle.$$

Evolution in a quantum system is described by the linear Schrödinger equation

$$i\hbar \frac{\partial \psi(t)}{\partial t} = H(t)\psi(t),$$

where $\hbar$ is the Planck constant, $\psi(t)$ is the state of the system at time t and H(t) is a quantum analogue of the Hamiltonian of the classical system. If H is constant, the Schrödinger equation has the solution $\psi(t) = e^{-\frac{i}{\hbar}Ht}\psi(0)$, and from that it follows that a discretized evolution (computation) of any quantum system is performed by a unitary operator, and a step of such an evolution can be seen as a multiplication of a unitary matrix^10 A with a vector $|\psi\rangle$, i.e. as $A|\psi\rangle$.

^10 A matrix A is unitary if $A \cdot A^\dagger = A^\dagger \cdot A = I$, where $A^\dagger$ is the matrix obtained from A by transposition and then by replacement of each element by its complex conjugate.
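The scalar product, the norm and the unitarity condition defined above translate directly into a few lines of code. This is a minimal sketch with vectors as Python lists and matrices as lists of rows; the helper names and the choice of the Hadamard matrix as the test case are assumptions made for illustration.

```python
from math import sqrt, isclose


def inner(psi, phi):
    """Scalar product <psi|phi> = sum_i conj(psi_i) * phi_i."""
    return sum(complex(p).conjugate() * q for p, q in zip(psi, phi))


def norm(phi):
    """Norm ||phi|| = sqrt(|<phi|phi>|)."""
    return sqrt(abs(inner(phi, phi)))


def dagger(a):
    """Transpose the matrix and conjugate every element."""
    rows, cols = len(a), len(a[0])
    return [[complex(a[i][j]).conjugate() for i in range(rows)] for j in range(cols)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def is_unitary(a, tol=1e-12):
    """Check A * A^dagger = A^dagger * A = I entry by entry."""
    n = len(a)
    products = (matmul(a, dagger(a)), matmul(dagger(a), a))
    return all(abs(m[i][j] - (1 if i == j else 0)) < tol
               for m in products for i in range(n) for j in range(n))


def apply(a, v):
    """One evolution step: multiply the matrix A with the vector |psi>."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


s = 1 / sqrt(2)
hadamard = [[s, s], [s, -s]]
ket0, ket1 = [1, 0], [0, 1]
evolved = apply(hadamard, ket0)
print(is_unitary(hadamard), round(norm(evolved), 6), inner(ket0, ket1) == 0)
```

The last line illustrates two statements of the text: a unitary step preserves the norm of the state, and the standard basis states are orthogonal.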


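A quantum bit, usually called a qubit, is then a quantum state in $H_2$, $|\phi\rangle = \alpha|0\rangle + \beta|1\rangle$, where $\alpha, \beta \in \mathbf{C}$ are such that $|\alpha|^2 + |\beta|^2 = 1$ ($\{|0\rangle, |1\rangle\}$ is the standard basis of $H_2$). Important operations on one qubit include the Hadamard transform, represented by the matrix^11

$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$$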

Now we can say that the essence of the difference between classical computers and quantum computers is in the way information is stored and processed. In classical computers, information is represented on the macroscopic level, by bits, which can take on one of two values, 0 or 1. In quantum computers, information is represented on the microscopic level, using qubits, which can take on any of uncountably many values $\alpha|0\rangle + \beta|1\rangle$, where $\alpha, \beta$ are arbitrary complex numbers such that $|\alpha|^2 + |\beta|^2 = 1$.

Very important is also the difference between the ways compound classical and compound quantum systems are created. In the classical case, any state of a composed system is composed of the states of the subsystems. This is not so in the quantum case. If a Hilbert space H (H') corresponds to a quantum system S (S'), and $\{\alpha_i\}_i$ ($\{\beta_j\}_j$) is a basis of H (H'), then the tensor product of H and H', denoted $H \otimes H'$, corresponds to the quantum system composed of S and S', and this Hilbert space has a (standard) basis consisting of all tensor products of the states $|\alpha_i\rangle$ and $|\beta_j\rangle$. For example, the Hilbert space $H_4$ can be seen as the tensor product of two one-qubit Hilbert spaces, $H_2 \otimes H_2$, and therefore one of its (standard) bases consists of the states

$$|0\rangle \otimes |0\rangle, \quad |0\rangle \otimes |1\rangle, \quad |1\rangle \otimes |0\rangle, \quad |1\rangle \otimes |1\rangle.$$

These states are usually denoted shortly as $|00\rangle, |01\rangle, |10\rangle, |11\rangle$. Another important orthogonal basis in $H_4$ consists of the following four so-called Bell states:

$$|\Phi^+\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle), \qquad |\Phi^-\rangle = \frac{1}{\sqrt{2}}(|00\rangle - |11\rangle),$$

$$|\Psi^+\rangle = \frac{1}{\sqrt{2}}(|01\rangle + |10\rangle), \qquad |\Psi^-\rangle = \frac{1}{\sqrt{2}}(|01\rangle - |10\rangle).$$

Similarly, the (standard) basis states of an n-qubit Hilbert space $H_{2^n}$ are the states $|i_1 i_2 \ldots i_n\rangle = |i_1\rangle \otimes \ldots \otimes |i_n\rangle$, where $i_k \in \{0, 1\}$ for all k.

^11 The Hadamard operation transforms the standard basis $\{|0\rangle, |1\rangle\}$ into the dual basis, consisting of the vectors $|0'\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ and $|1'\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$.
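The tensor-product basis and the Bell basis above can be written out numerically. The sketch below is an illustration only: states are stored as lists of real amplitudes, `kron` is a hand-written tensor product for vectors, and all names are chosen here. It builds the four standard basis states of $H_2 \otimes H_2$, forms the four Bell states, and checks that they are pairwise orthogonal and normalized.

```python
from math import sqrt, isclose


def kron(u, v):
    """Tensor product of two state vectors; |i> (x) |j> gets index 2*i + j for qubits."""
    return [a * b for a in u for b in v]


def combine(c1, v1, c2, v2):
    """Linear combination c1*v1 + c2*v2 of two vectors."""
    return [c1 * a + c2 * b for a, b in zip(v1, v2)]


def inner(u, v):
    """Scalar product for real amplitude vectors."""
    return sum(a * b for a, b in zip(u, v))


ket0, ket1 = [1.0, 0.0], [0.0, 1.0]
k00, k01 = kron(ket0, ket0), kron(ket0, ket1)
k10, k11 = kron(ket1, ket0), kron(ket1, ket1)

s = 1 / sqrt(2)
bell = {
    "Phi+": combine(s, k00, s, k11),
    "Phi-": combine(s, k00, -s, k11),
    "Psi+": combine(s, k01, s, k10),
    "Psi-": combine(s, k01, -s, k10),
}

names = list(bell)
is_orthonormal = all(
    isclose(inner(bell[p], bell[q]), 1.0 if p == q else 0.0, abs_tol=1e-12)
    for p in names for q in names
)
print(is_orthonormal)
```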


A general state $|\phi\rangle$ of an n-qubit register has therefore the form

$$|\phi\rangle = \sum_{x \in \{0,1\}^n} \alpha_x |x\rangle, \qquad \text{where} \qquad \sum_{x \in \{0,1\}^n} |\alpha_x|^2 = 1.$$

Operators on n-qubit registers are unitary matrices of degree $2^n$. If a state $|\phi\rangle$ of an n-qubit register is measured with respect to the standard basis $\{|x\rangle\}_{x \in \{0,1\}^n}$, then in the quantum world the state $|\phi\rangle$ collapses, with probability $|\alpha_x|^2$, into the state $|x\rangle$, and information about that, namely x, emerges into the classical world.

The key concept for so-called open quantum systems, that is, quantum systems interacting with the environment, is the concept of a mixed state, which is a probability distribution $\{(p_i, |\phi_i\rangle)\}_{i=1}^{k}$ on pure states $\{|\phi_i\rangle\}_i$. To each such mixed state the density operator $\rho = \sum_{i=1}^{k} p_i |\phi_i\rangle\langle\phi_i|$ is associated, and its matrix representation is called the density matrix. A very important fact is that the same density matrix may correspond to two different mixed states, and that two mixed states are physically indistinguishable if their density operators (matrices) are the same. In the modern quantum information processing literature, the concept of a state is often identified with that of the density operator.

Now we are in a position to define formally the important concept of entangled states. A pure state $|\phi\rangle$ of a tensor product of Hilbert spaces $H_1 \otimes \ldots \otimes H_n$ is called entangled if it cannot be decomposed in the form $|\phi\rangle = |\phi_1\rangle \otimes \cdots \otimes |\phi_n\rangle$, where $|\phi_i\rangle$ is a pure state of $H_i$. A mixed state $\rho$ of n qubits is called (fully) separable if it can be decomposed as

$$\rho = \sum_i p_i \, \rho_i^{(1)} \otimes \ldots \otimes \rho_i^{(n)},$$

where $\sum_i p_i = 1$ and $\rho_i^{(j)}$ is a density matrix of the j-th qubit, for any j. Otherwise, $\rho$ is called inseparable or entangled.

We can now formulate one important limitation of quantum information processing and summarize the differences between classical and quantum information. The limitation is that there is no universal way to copy/clone unknown quantum states, which is what the so-called no-cloning theorem says. On the level of qubits, the no-cloning theorem says that there is no unitary transformation U such that for any one-qubit state $|\phi\rangle$ it holds that $U(|\phi\rangle|0\rangle) = |\phi\rangle|\phi\rangle$.^12

^12 Proof. Let us assume that a unitary U with such a property exists and that for two different states, $|\alpha\rangle$ and $|\beta\rangle$, $U(|\alpha\rangle|0\rangle) = |\alpha\rangle|\alpha\rangle$ and $U(|\beta\rangle|0\rangle) = |\beta\rangle|\beta\rangle$. Let $|\gamma\rangle = \frac{1}{\sqrt{2}}(|\alpha\rangle + |\beta\rangle)$; then

$$U(|\gamma\rangle|0\rangle) = \frac{1}{\sqrt{2}}(|\alpha\rangle|\alpha\rangle + |\beta\rangle|\beta\rangle) \ne |\gamma\rangle|\gamma\rangle = \frac{1}{2}(|\alpha\rangle|\alpha\rangle + |\beta\rangle|\beta\rangle + |\alpha\rangle|\beta\rangle + |\beta\rangle|\alpha\rangle).$$
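The argument in the no-cloning proof can be made concrete with the choice $|\alpha\rangle = |0\rangle$, $|\beta\rangle = |1\rangle$. The sketch below assumes, as an illustration, that the candidate "copier" U is the CNOT operation, which does copy the two standard basis states; the code then compares U applied to $|\gamma\rangle|0\rangle$ with the desired $|\gamma\rangle|\gamma\rangle$. Function names are hypothetical.

```python
from math import sqrt, isclose


def kron(u, v):
    """Tensor product of two one-qubit states; amplitude of |xy> sits at index 2*x + y."""
    return [a * b for a in u for b in v]


def cnot(state):
    """CNOT with the first qubit as control: swaps the amplitudes of |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


def same_state(u, v):
    return all(isclose(a, b, abs_tol=1e-12) for a, b in zip(u, v))


ket0, ket1 = [1.0, 0.0], [0.0, 1.0]
s = 1 / sqrt(2)
gamma = [s, s]  # (|0> + |1>) / sqrt(2)

# Basis states are copied correctly ...
copies_zero = same_state(cnot(kron(ket0, ket0)), kron(ket0, ket0))
copies_one = same_state(cnot(kron(ket1, ket0)), kron(ket1, ket1))

# ... but by linearity the superposition is mapped to (|00> + |11>)/sqrt(2),
# which differs from |gamma>|gamma> = (|00> + |01> + |10> + |11>)/2.
produced = cnot(kron(gamma, ket0))
wanted = kron(gamma, gamma)
copies_gamma = same_state(produced, wanted)
print(copies_zero, copies_one, copies_gamma)
```

This only shows that one particular candidate fails; the proof in the footnote is what rules out every unitary U.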


We can now also say that important properties of classical information are: (a) transmission of information in time and space is very easy; (b) making an unlimited number of copies of information is very easy. On the other hand, important properties of quantum information are: (a) transmission of quantum information in time and space is very difficult; (b) there is no way to make faithful copies of unknown quantum information; (c) attempts to measure quantum information destroy it, in general.

3 Outcomes and challenges of quantum computation

The quantum polynomial time algorithms of Shor, from 1994, which could be used to break important classical cryptosystems, have so far been the main killer applications for quantum information processing. A natural quantum version of the Fourier transform has been the main tool^13, and the quantum Fourier transform has also been used later to design various other quantum algorithms that are more efficient than the most efficient classical algorithms for the same algorithmic problems. The main generalized result is that there are quantum polynomial time algorithms for the so-called Hidden Subgroup Problem for Abelian groups.^14 Perhaps the most important open problem in the design of quantum algorithms is to determine whether the Hidden Subgroup Problem is always solvable in polynomial time also for non-Abelian groups. Were this true, it would imply, for example, that there is a quantum polynomial time algorithm also for the graph isomorphism problem.

Of even larger impact on the design of efficient quantum algorithms has been the discovery of Grover (1996), who has shown that one can find in an unordered database of N elements a unique element satisfying a given condition P in $O(\sqrt{N})$ quantum steps. His idea was generalized and applied in numerous ways and also resulted in the so-called amplitude amplification technique. Recently, quantum random walks have gained momentum as a way to design quantum algorithms (see Aharonov et al., 2001). Of interest are also non-traditional modes of quantum computation, such as adiabatic computation (see Farhi et al., 2000).

Several ingenious techniques have also been developed to prove lower bounds: for example, the polynomial method (Beals et al., 1998), and the quantum adversary method (Ambainis, 2000) with its various variants. They have been used to show a variety of impressive lower bound results (see Gruska, 1999-2005, for an overview).
^13 Other quantum generalizations of transforms known from signal processing and applied mathematics have also turned out to be useful for the design of quantum algorithms.

^14 The Hidden Subgroup Problem is the following one: Given is an (efficiently computable) function $f : G \to R$, where G is a finite group and R a finite set, and a promise that there exists a subgroup $G_0 \le G$ such that f is constant on any left coset and distinct on different cosets of $G_0$. The task is to find a generating set for $G_0$ (in time polynomial in $\lg |G|$, both in the number of calls to the oracle for f and in the overall running time).
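Grover's search mentioned above is small enough to simulate classically for a toy database. The sketch below uses the usual textbook formulation of one Grover iteration (an oracle that flips the sign of the marked amplitude, followed by inversion about the mean) and about $\frac{\pi}{4}\sqrt{N}$ iterations; the text itself does not spell out these details, so they should be read as assumptions, and the function name is hypothetical.

```python
from math import sqrt, pi, floor


def grover_success_probability(n_items, marked, iterations=None):
    """Simulate Grover's search on real amplitudes; return Prob(measuring `marked`)."""
    if iterations is None:
        iterations = floor(pi / 4 * sqrt(n_items))
    # Start in the uniform superposition over all database indices.
    amp = [1 / sqrt(n_items)] * n_items
    for _ in range(iterations):
        # Oracle: flip the sign of the amplitude of the marked element.
        amp[marked] = -amp[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amp) / n_items
        amp = [2 * mean - a for a in amp]
    return amp[marked] ** 2


n_items = 16
quantum_probability = grover_success_probability(n_items, marked=5)
classical_single_guess = 1 / n_items
print(round(quantum_probability, 4), classical_single_guess)
```

For N = 16 this runs three iterations and lifts the probability of finding the marked element from 1/16 for a single random guess to about 0.96.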


There are several, and some quite surprising, models of quantum universal computation. The most basic one is that of circuits based on quantum unitary operations, which is defined in a similar way as in the classical case, except that the gates have to be quantum, representing quantum unitary operations. Given an algorithmic problem P, in order to solve it using a quantum circuit one first has to find a unitary operation $U_P$ that solves P and then to create a quantum circuit $C_{U_P}$, with quantum gates from some universal set of quantum gates, that implements $U_P$.

A variety of special problems concerning quantum computation comes from the fact that quantum unitary operations have to be reversible, that is, such that one can uniquely determine inputs from their outputs. This seems to be a very special and strong restriction, because of the most basic logical operations only NOT is reversible, and none of the basic arithmetical operations is. An important contribution to the understanding of the computational power of quantum phenomena was a surprising result of Bennett (1973), which says that if a function f is computable by a one-tape Turing machine in time t(n), then there is a 3-tape reversible Turing machine computing, with constant time overhead, the mapping $a \to (a, g(a), f(a))$, where g(a) is so-called garbage that can be removed using a special technique. For classical reversible computations of Boolean functions, the so-called Toffoli operation, or control-control-not operation, $\mathrm{CCNOT}(x, y, z) = (x, y, (x \wedge y) \oplus z)$, is universal.

Nature offers many ways (let us call them technologies) in which various quantum information processing primitives can be exhibited, realized and utilized. Since it appears to be very difficult to exploit the potential of Nature for QIP, it is of large importance to explore which quantum primitives form universal sets of primitives.
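The reversibility and universality claims about the Toffoli operation can be checked by enumerating its eight inputs. This is a small sketch with illustrative names: it verifies that CCNOT is a bijection on three bits, that it is its own inverse, and that AND and NOT are obtained by fixing some inputs to constants.

```python
from itertools import product


def ccnot(x, y, z):
    """Toffoli operation: CCNOT(x, y, z) = (x, y, (x AND y) XOR z)."""
    return (x, y, (x & y) ^ z)


inputs = list(product((0, 1), repeat=3))
outputs = [ccnot(*bits) for bits in inputs]

# Reversible: every 3-bit output occurs exactly once, so inputs can be recovered.
is_bijection = sorted(outputs) == inputs
# Applying the operation twice gives back the input.
is_self_inverse = all(ccnot(*ccnot(*bits)) == bits for bits in inputs)


def and_gate(x, y):
    """AND appears in the third output when the third input is fixed to 0."""
    return ccnot(x, y, 0)[2]


def not_gate(z):
    """NOT appears in the third output when both controls are fixed to 1."""
    return ccnot(1, 1, z)[2]


print(is_bijection, is_self_inverse, and_gate(1, 1), not_gate(1))
```

The extra outputs x and y that survive alongside the AND result play the role of the garbage g(a) in Bennett's mapping.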
Also from the point of view of understanding the laws and limitations of QIP, and also of quantum mechanics itself, the problems of finding rudimentary and universal QIP primitives are of large importance. Concerning universal sets of computation primitives, the very basic result says that a single two-qubit operation control-not, $\mathrm{CNOT}(|x\rangle, |y\rangle) = |x\rangle|x \oplus y\rangle$, and all one-qubit gates form a universal set of gates that can be used to design, for any unitary operation and any given precision $\varepsilon > 0$, a quantum circuit that approximates this operation with precision $\varepsilon$. (The catch is that it is very difficult to create the CNOT-gate, because such a gate has to be able to transform two separable states into an entangled state.^15) Universal is also the set of the following three operations: CNOT, Hadamard and $\sigma_z^{1/4} = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}$. For computational purposes with classical input and output, universal is also the set of only two simple gates: the Toffoli gate and the Hadamard gate (Shi, 2002). This actually means that in order to get universality for quantum computation one has to add the Hadamard gate to the Toffoli gate, which is universal for classical reversible computation. (The Hadamard gate can actually create a perfectly random bit.) It is also known that any n-qubit unitary operation can be implemented by a circuit consisting of $O(4^n)$ CNOT gates and one-qubit gates (see Vartianen et al., 2003).

One of the recent surprising results in QIPC is that circuits with gates performing only measurements are also universal, from the computational point of view, and that what is needed for that are measurement-gates from only a very small set of gates. Measurement gates can be specified by Hermitian operators, and measurements then correspond to the orthogonal basis created by the orthogonal set of eigenvectors of these Hermitian matrices. Actually, a set of only four different Hermitian operators (measurements) is universal (see Perdrix, 2004). Measurement-based computations are probabilistic, up to a Pauli matrix, but this is only a small handicap. Another surprising model of universal computation is that of the so-called one-way computers, in which computation starts with a special entangled, so-called cluster, state, but then only one-qubit measurements are performed (Raussendorf and Briegel, 2000). All these results indicate that the search for primitives in quantum computation is likely still to be full of surprises and options, which is actually not so strange, because Nature offers so many ways in which quantum information processing can be exhibited.

Fig. 3. Entanglement swapping (particles P1, P2 and particles P3, P4 start in EPR-states; a Bell measurement on P2 and P3 leaves P1 and P4 in an EPR-state)

^15 There are many ways to create entangled states, for example, using various special physical processes. Of importance for understanding the problems with the design of quantum processes is the fact that if CNOT is applied to two simple and separated one-qubit states, then CNOT may produce an entangled state. Indeed, $\mathrm{CNOT}(\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle), |0\rangle) = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. Another surprising way to create an entangled state of two separated particles is so-called entanglement swapping: if particles P1 and P2 are in the EPR-state and so are particles P3 and P4, then a Bell measurement of particles P2 and P3 makes particles P1 and P4, which have never interacted before, get into the maximally entangled EPR-state. In other words, the CNOT gate has to be able to entangle two particles that have never interacted before; see Figure 3.
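The statement that CNOT turns two separable one-qubit states into an entangled state can be verified on amplitudes. The sketch below is illustrative: it prepares $(H|0\rangle) \otimes |0\rangle$, applies CNOT with the first qubit as control, and tests separability with the standard criterion that a two-qubit pure state with amplitudes $a_{00}, a_{01}, a_{10}, a_{11}$ is a product state exactly when $a_{00}a_{11} - a_{01}a_{10} = 0$. That criterion and all function names are additions made here, not taken from the text.

```python
from math import sqrt, isclose


def hadamard(qubit):
    """Apply the Hadamard transform to a one-qubit state [a, b]."""
    a, b = qubit
    s = 1 / sqrt(2)
    return [s * (a + b), s * (a - b)]


def kron(u, v):
    """Tensor product; the amplitude of |xy> is stored at index 2*x + y."""
    return [a * b for a in u for b in v]


def cnot(state):
    """CNOT(|x>, |y>) = |x>|x XOR y>: swaps the amplitudes of |10> and |11>."""
    a00, a01, a10, a11 = state
    return [a00, a01, a11, a10]


def is_product_state(state, tol=1e-12):
    """A two-qubit pure state factorizes iff a00*a11 - a01*a10 vanishes."""
    a00, a01, a10, a11 = state
    return abs(a00 * a11 - a01 * a10) < tol


before = kron(hadamard([1.0, 0.0]), [1.0, 0.0])  # separable input state
after = cnot(before)                             # (|00> + |11>) / sqrt(2)
print(is_product_state(before), is_product_state(after))
```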


Two types of circuits are of special importance. The first are universal circuits, for a certain number k of qubits, which can perform any unitary operation on k qubits if some classical parameters are fixed appropriately. Such universal circuits, with 3 CNOT gates and 15 elementary rotation gates for the case of two qubits, and with 40 CNOT gates and 98 elementary rotation gates in the case of three qubits, were derived by Vatan and Williams (2003, 2004); see also Gruska (2005). Programmable circuits (sometimes called programmable processors) are another type of circuits; they are universal in some restricted sense and are of theoretical and also of large application interest. The basic idea is similar to that of classical universal circuits: certain inputs form the so-called operation register and are used to specify, through a quantum state, an operation U that is to be performed on the state $|\phi\rangle$ given on the remaining inputs, the data register. There are several reasons why such circuits are of importance. They may be universal for a set of operations, and the operation to be performed can be the result of some previous computation. The idea of programmable circuits is of limited use if it is required that the outcome $U(\phi)$ is determined uniquely and perfectly, because in such a case, in order for a programmable circuit to be able to perform n unitary operations, the dimensionality of the program space has to be n, so that the circuit is able to uniquely distinguish the program given. More interesting and practical seem to be the cases in which the outcomes should be correct only with some (sufficiently large) probability, or should only approximate the correct result, again with a given precision. Approximate programmable circuits also better reflect reality, because circuits with perfect outcomes are an idealization only.
For an overview of the subject and the latest results on approximate programmable circuits that can approximate a set of unitary operations, see Hillery et al. (2005). There are many interesting/important problems associated with such programmable circuits, for example, how to determine the input that makes the circuit/processor produce the best approximation of a given unitary.

Of interest and importance are also investigations of what kinds of circuits can be simulated in polynomial time on classical computers. An almost "classical" result of Gottesman and Knill (see Nielsen and Chuang, 2000) says that circuits composed of the CNOT-gate, the Hadamard-gate and the standard basis measurement, so-called Clifford circuits, can be simulated on classical computers in polynomial time. Recently, Markov and Shi (2005) have shown that a quantum circuit with n gates whose underlying graph has tree-width d can be simulated classically in $n^{O(1)} e^{O(d)}$ time, which is polynomial in n if $d = O(\lg n)$. This result has a variety of implications: for example, in classical polynomial time one can simulate any log-depth circuit whose gates apply to nearby qubits only. Another approach to the problem of simulation on classical computers was taken by Somma et al. (2006). They consider special Lie-algebraic models of computation and show that these models can be efficiently simulated on classical computers in time polynomial in the dimension of the algebra. Their results generalize those on fermionic linear optics computations.


Another very basic model of quantum computation is the quantum finite automaton. Actually, there are several versions of it. Three very basic problems to explore for each model of quantum automata are: (a) What is the class of languages accepted by the model? (b) Which accepting probabilities can be achieved with the model? (c) How does the size of automata of the model (the number of states) compare to the size of equivalent minimal deterministic finite automata? Compared with classical finite automata, quantum finite automata have a special strength, due to the power of quantum superposition (parallelism), but also a special weakness, due to the requirement that they have to be reversible. (It is important to notice that the negative impacts of reversibility can be, to a large extent, compensated for by a suitable distribution of suitable measurements.) For some models, quantum finite automata accept a smaller class of languages than the regular languages, and for some other models they accept exactly the class of regular languages. Of large importance is what kind of measurements are performed and which measurement policy is used. For example, a measurement may be performed after each computation step or only at the end of the computation; these are the two extreme options. It has also been shown that in some cases quantum finite automata can be exponentially more succinct than classical deterministic finite automata. However, in some cases the opposite situation occurs. The very basic models of quantum finite automata, so-called one-way (or real-time) quantum automata, are defined similarly to probabilistic automata, only probability amplitudes are used instead of probabilities, and there is one additional requirement, namely that the overall evolution has to be unitary. More peculiar are quantum two-way automata. In the most basic model, they are a natural generalization of the classical two-way probabilistic finite automata.
Quantum two-way automata can accept, with high probability, even some non-regular or non-context-free languages. In another model, quantum two-way automata work almost as classical ones; they only have an additional quantum memory, and at each step they either perform a usual classical move and a unitary operation on the state of their quantum memory, or perform a measurement on the quantum memory that then specifies, in a random way, the next move. Such automata have been shown to be much more powerful than classical probabilistic two-way finite automata (Ambainis and Watrous, 1999), even in the case that the quantum memory is restricted to one qubit (for an overview of concepts and results concerning quantum finite automata, see Gruska (2000)). The very basic model of quantum Turing machines, originally due to Deutsch (1985), is again a modification of that of a probabilistic Turing machine: probabilities are simply replaced by probability amplitudes. However, a non-trivial additional requirement is that the overall evolution of a quantum Turing machine has to be unitary. A state of such a quantum Turing machine can be seen as a weighted superposition of many configurations of a classical Turing machine. This model has been used to define the basic quantum complexity classes and to develop quantum structural complexity. Such a model has classical inputs and outputs; only its evolution is quantum. Two new, quite different models
of Turing machines are of interest and importance. Both of them have quantum inputs and outputs (as sequences of qubits). One model (Jorrand and Perdrix, 2004) works with one additional qubit as memory and with measurements as its only operations. Another model is that of quantum Turing machines with classical control and quantum operations (Jorrand and Perdrix, 2004a). The basic philosophy behind many such models is that measurement is the basic tool for making the quantum world perform the computations we need in the classical world. An important challenge concerning quantum computation is to develop a really good model of quantum cellular automata. There have been numerous attempts to do that, with a variety of interesting results, but one can say that the theory of quantum cellular automata is still not in good shape. At the same time, quantum cellular automata are of large importance for quantum physics, because interaction with neighbours is the very basic way Nature works. Those versions of quantum cellular automata that are satisfactory are modifications of the partitioned, or block-type, classical cellular automata; see Schumacher and Werner (2004) for recent results. Quantum (structural) complexity theory is also being developed, and it is an important part of quantum information processing science. One of the goals of quantum complexity theory is to challenge our basic intuition about how the physical world behaves. One can also say that quantum complexity theory is of great interest because one of its goals is to understand two of the great mysteries of the 20th century: what is the nature of quantum mechanics and what are the limits of computation. It would be astonishing if a merger of such important areas did not shed light on both of them and did not bring great new discoveries.
Taking the complexity theory perspective can lead us to ask better questions about quantum nature: nontrivial but answerable questions, which put old quantum mysteries in a new light even if they fall short of answering them (Aaronson, 2005). Quantum complexity theory has as its basic complexity classes QP (a quantum variant of the class P) and BQP (a quantum variant of the class BPP). There are also two quantum versions of the class NP, namely the classes NQP and QMA. There are also many variants of the classes of relativistic quantum computing. Unfortunately, the introduction of all these classes did not help to bring order to the zoo of more than 470 classical complexity classes. Just the opposite happened: the mess got larger. For an overview of recent results see Gruska (1999-2005). Among the recent surprising results in this area we mention that of Raz (2005), showing the enormous power of quantum advice. (Raz has shown that a quantum interactive proof system in which the verifier gets quantum advice can solve any problem whatsoever.) In connection with theoretical investigations concerning quantum information processing and communication, it is of large importance to find out whether we can really build powerful quantum computers and what is required for success. In this connection, one of the main goals of quantum informatics in general, and of quantum complexity theory in particular, is to help to resolve this puzzle. Behind this is actually the question whether our world is polynomial or exponential, as pointed out by Aaronson (2005). The fact that such a basic question is unresolved also makes it very important to study more elementary models, such as quantum circuits, quantum programmable circuits and quantum finite automata. The main new challenges of quantum complexity theory can be seen as follows (see also Gruska (2005)):
(a) To help to determine whether (and how) we can build powerful quantum computers.
(b) To help to determine whether we can effectively factorize large integers using a quantum computer.
(c) To use complexity theory paradigms to classify quantum states.
(d) To use complexity theory (computational and communication) to study quantum entanglement and nonlocality.
(e) To use complexity theory to determine the power of decoherence and to find ways to fight decoherence.
(f) To use complexity theory to formulate laws and limitations of physics.
(g) To study feasibility in physics on a more abstract level.
(h) To study various interpretations of quantum theory from a new and more abstract (complexity) point of view.
(i) To develop a firmer basis for quantum mechanics.
(j) To develop new tests of quantum mechanics.

4 Outcomes and challenges of quantum communication

Fig. 4. Quantum teleportation (diagram not reproduced; recoverable labels: EPR-state, Bell measurement)

Quantum teleportation was the first and is still the most amazing new feature of quantum communication. The basic idea is very simple: if two parties, say Alice and Bob, share two particles, say A and B, in the EPR-state, and Alice gets a new particle P in an unknown state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$, then when she performs the Bell measurement (that is, the measurement in the basis of Bell states) on her two particles, Bob's particle gets, with equal probability, into one of the four states $|\psi\rangle$, $\sigma_x|\psi\rangle$, $\sigma_z|\psi\rangle$ and $\sigma_x\sigma_z|\psi\rangle$, and Alice gets information (say, in the form of two bits) about which of these four cases took place. If Alice sends this information to Bob through a classical public channel, for example by email, Bob can then bring his particle B into the state $|\psi\rangle$ (still unknown to him) by performing on his particle, where needed, one of the operations $\sigma_x$, $\sigma_z$ or $\sigma_x\sigma_z$, because $\sigma_x^2 = \sigma_z^2 = I$. This way Alice can teleport, not knowing what, to not knowing where.
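The teleportation steps described above can be checked with a small state-vector calculation. The following is only a sketch under stated assumptions: the EPR-state is taken to be $(|00\rangle + |11\rangle)/\sqrt{2}$, the names `branches`, `teleport` and `fidelity` are hypothetical helpers introduced for illustration, and in the fourth case Bob applies $\sigma_z\sigma_x$, which agrees with the $\sigma_x\sigma_z$ of the text up to an unobservable global sign.

```python
import math
import random

S = 1 / math.sqrt(2)

# Pauli corrections as 2x2 matrices (lists of rows).
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
ZX = [[0, 1], [-1, 0]]  # Z times X, the exact inverse of X times Z

# The four Bell states on (P, A), stored as real amplitudes bell[p][a],
# each paired with the operation that maps Bob's particle back to |psi>.
BELL = [
    ("Phi+", [[S, 0], [0, S]], I2),
    ("Phi-", [[S, 0], [0, -S]], Z),
    ("Psi+", [[0, S], [S, 0]], X),
    ("Psi-", [[0, S], [-S, 0]], ZX),
]


def apply(m, v):
    """Multiply a 2x2 matrix by a 2-component state vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def branches(psi):
    """Return (outcome, probability, Bob's corrected state) per Bell outcome."""
    # Joint amplitudes of P, A, B: psi on P, assumed EPR-state on A and B.
    amp = [[[psi[p] * (S if a == b else 0) for b in range(2)]
            for a in range(2)] for p in range(2)]
    result = []
    for name, bell, fix in BELL:
        # Project particles P and A onto this Bell state.
        bob = [sum(bell[p][a] * amp[p][a][b]
                   for p in range(2) for a in range(2))
               for b in range(2)]
        prob = sum(abs(x) ** 2 for x in bob)
        bob = [x / math.sqrt(prob) for x in bob]
        result.append((name, prob, apply(fix, bob)))
    return result


def teleport(psi, rng=random):
    """Sample one Bell outcome; return (outcome, Bob's state after correction)."""
    outcomes = branches(psi)
    weights = [o[1] for o in outcomes]
    name, _, state = rng.choices(outcomes, weights=weights)[0]
    return name, state


def fidelity(u, v):
    """Squared overlap of two normalized single-qubit states."""
    return abs(sum(complex(a).conjugate() * b for a, b in zip(u, v))) ** 2
```

For any normalized input, each of the four outcomes should occur with probability 1/4, and Bob's corrected state should coincide with Alice's original state; Alice's two classical bits correspond to the outcome name.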


Quantum teleportation therefore allows one to send one qubit by sending two classical bits (if shared entanglement is available). In some sense, an inverse process is so-called dense coding, which allows one to send two bits in one qubit (if shared entanglement is again available). This is also surprising, because the so-called Holevo theorem says that in one qubit we can faithfully store only one bit. Quantum teleportation allows perfectly secure transmission of quantum information (encoded via qubits), provided the communicating parties share enough EPR-states. Shared entanglement can also be used to exhibit so-called pseudo-telepathy, see Brassard et al. (2003), for example in various games in which the participants look as if they use telepathy to reach agreements, while actually the correlations between their actions are achieved by proper measurements of proper shared entangled states. It has been shown that shared entanglement can be used to improve exponentially protocols for a variety of communication tasks; see, for example, Buhrman et al. (1998) and Raz (1999). However, for some other communication tasks, such as computation of the inner product, it cannot. Results of communication complexity have also been used to show that some phenomena are likely impossible in the physical world. For example, they were used to show, see van Dam (2005) and Brassard et al. (2005), why the correlations achievable by quantum processes are not maximal among those that would preserve the non-signaling condition of special relativity. They were also used to explore the question of how well processes of quantum mechanics can approximate PR-boxes (see page 19), which would exhibit the strongest correlations preserving the non-signaling condition.
They have shown, on one side, that the availability of prior shared entanglement allows one to approximate PR-boxes with success probability $\cos^2(\pi/8) \approx 0.854$ and, on the other side, that were it possible to do so with probability greater than 0.908, then any Boolean function could be computed using only one bit of communication, which is considered impossible. An interesting challenge in this context is to close the gap between 0.854 and 0.908. Large progress in understanding various aspects of quantum communication has been made in recent years. We mention here only some results concerning quantum entanglement and capacities of quantum channels.

4.1 Outcomes and challenges of quantum entanglement

In this area very large progress has been made in recent years. In spite of that, there are big challenges in almost all of its subareas. A basic problem is how to generate entangled states and how far apart entangled particles can be. A large variety of physical processes that result in entangled states have been explored. The importance of entangling unitary operators, those that can transform a product state into an entangled state, has also been demonstrated. For example, any such two-qubit entangling operation and all one-qubit operations form a universal set of unitary operations. Entanglement swapping
is perhaps the most counterintuitive way to generate entangled states. To decide whether a given mixed state is entangled is another important problem, and many methods to do that have been developed. The problem of how many pure, maximally entangled states one can get from a given set of mixed states is also quite well understood, and many methods to do that have been explored. The same is true for the entanglement concentration problem: to get some maximally entangled pure states from a set of less entangled pure states. The discovery of bound entangled states, those mixed entangled states from which one cannot get pure entanglement, was a big surprise, and so were the discoveries of various properties of such states and of various ways in which bound entangled states can be useful. The study of entanglement monotones, invariants and measures is another important area of research with many interesting and important results. The fact that entanglement can be used as a catalyst that can help, without being destroyed, to transform one quantum state into another using local quantum operations and classical communication (LOCC) has been another surprising discovery. Understanding the laws and limitations of entanglement sharing, as well as the quantitative and qualitative classification of multipartite states, is another big challenge. On a more applied level, a big challenge is still to understand how important entanglement is for quantum computation. Another big challenge is to get a proper understanding of how frequent entanglement is and how robust such a concept can be (for example, that in some vicinity of some entangled states all states are entangled). For a review of results in all these areas see Gruska (2003). Concerning quantum channels, perhaps the main issue is to study various types of channels and various capacities. Entanglement plays a very important role in this.
An important task was to find nice formulas to express different capacities and to find relations between different capacities, see Nielsen and Chuang (2000) and Gruska (1999-2005).
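The success probability $\cos^2(\pi/8) \approx 0.854$ quoted in this section for approximating PR-boxes with shared entanglement can be reproduced numerically. The sketch below assumes the usual formulation of a PR-box, in which inputs x, y and outputs a, b should satisfy a XOR b = x AND y (the text does not spell this out at this point), and it assumes one particular choice of measurement angles on the EPR-state; the names `basis`, `joint_prob` and `success_probability` are illustrative only.

```python
import math

S = 1 / math.sqrt(2)
# Amplitudes of the EPR-state (|00> + |11>)/sqrt(2), indexed [alice][bob].
EPR = [[S, 0.0], [0.0, S]]

# Assumed measurement angles for inputs 0 and 1 of each party.
ALICE = {0: 0.0, 1: math.pi / 4}
BOB = {0: math.pi / 8, 1: -math.pi / 8}


def basis(theta):
    """Real measurement basis rotated by theta: vectors for outcomes 0 and 1."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]


def joint_prob(theta_a, theta_b, out_a, out_b):
    """Probability that Alice and Bob observe (out_a, out_b) on the EPR-state."""
    u = basis(theta_a)[out_a]
    v = basis(theta_b)[out_b]
    amp = sum(u[i] * v[j] * EPR[i][j] for i in range(2) for j in range(2))
    return amp ** 2


def success_probability():
    """Average probability, over uniform inputs, that a XOR b == x AND y."""
    total = 0.0
    for x in (0, 1):
        for y in (0, 1):
            for a in (0, 1):
                for b in (0, 1):
                    if (a ^ b) == (x & y):
                        total += joint_prob(ALICE[x], BOB[y], a, b) / 4
    return total
```

With these angles every input pair is answered correctly with the same probability $\cos^2(\pi/8)$, so `success_probability()` agrees with the 0.854 figure; the calculation does not address the 0.908 threshold.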

Note on entanglement measures: an important measure is the so-called entanglement of formation $E_f$ (how many maximally entangled states are needed to create a given state), and the additivity problem for this measure, that is, whether always $E_f(\rho_1 \otimes \rho_2) = E_f(\rho_1) + E_f(\rho_2)$, is a very important open problem.

5 Outcomes and challenges of quantum cryptography

The so-called BB84 quantum protocol, due to Bennett and Brassard (1984), for generation of classical shared and perfectly secret keys, together with numerous proofs, using a variety of techniques and under more and more realistic conditions (concerning perfection of the photon sources, quality of channels and perfection of the receivers), that the BB84 protocol is unconditionally secure, have been the first highlights of quantum cryptography. The first experiment, due to Bennett and Brassard (1989), demonstrated the feasibility of such a protocol over a distance of 32 cm. This has been increased, step by step, to 120-150 km, which used to be seen as the limit set by photon losses and detector losses. Zhang et al. (2005) claim to increase the maximal distance to 260 km by exploiting entanglement swapping
and quantum relays. A big challenge for classical key generation is still to make quantum generation of classical keys more robust, more reliable and much better performing. The DARPA network, launched in 2003 in Boston and connecting Boston and Harvard universities on one side and BBN Technologies on the other side, is one of the most complex attempts to create a network for quantum key distribution. Such networks for metropolitan areas are currently seen as feasible. The so-called unconditional security of quantum protocols for generating classical keys actually says that undetectable eavesdropping is impossible, in a very reasonable probabilistic sense. Behind this result are the impossibility of quantum cloning and the destructive impact of quantum measurement. Another highlight of quantum cryptography has been the proof that unconditionally secure bit commitment is impossible, due to the fact that the existence of quantum entanglement is impossible to detect locally, and therefore quantum entanglement can always be used for cheating. There are again many proofs of this result and many consequences for such protocols as oblivious transfer, coin tossing and multipartite computation. There are many other tasks of broadly understood cryptography for which quantum protocols have been developed and/or are under development: quantum authentication, digital signatures, public key cryptography, secret sharing, data hiding, anonymity, voting and so on. An open problem, recently resolved by Watrous (2005), was to find a proper approach to quantum zero-knowledge proofs. One of the most particular aspects of security in quantum cryptography is that in the quantum case the variety of possible attacks is much larger, and the attacks can be more complex, than in the classical case. All that makes security considerations in the quantum case much more complex. Of surprising elegance, simplicity and power is the quantum version of the classical ONE-TIME PAD cryptosystem.
In the classical case, to encode an n-bit plaintext p using a shared n-bit random key k, one performs the bit-wise $\oplus$ operation to get the cryptotext $c = p \oplus k$. Decryption is then done using the same procedure: $p = c \oplus k$. Another way to see the classical ONE-TIME PAD cryptosystem is that n bits (of the shared key) are sufficient (and necessary) to hide perfectly n bits (of the plaintext) so that one can get them all back (by decryption). The quantum ONE-TIME PAD uses two n-bit keys k and k' to encode a plaintext of n qubits $|p_1\rangle, \ldots, |p_n\rangle$. Encryption of the i-th qubit is done by multiplication with Pauli matrices, $|c_i\rangle = \sigma_x^{k_i}\sigma_z^{k'_i}|p_i\rangle$, and its decryption can be obtained analogously as $\sigma_z^{k'_i}\sigma_x^{k_i}|c_i\rangle$. This way a qubit $|p_i\rangle$ is encrypted and sent as the mixed state $\{(\tfrac{1}{4}, |p_i\rangle), (\tfrac{1}{4}, \sigma_x|p_i\rangle), (\tfrac{1}{4}, \sigma_z|p_i\rangle), (\tfrac{1}{4}, \sigma_x\sigma_z|p_i\rangle)\}$, which is indistinguishable from a random bit, and therefore this quantum ONE-TIME PAD is perfectly secure. What is amazing here is that, in spite of the fact that one qubit can hide infinitely many bits in its amplitudes, only two classical bits are sufficient (and necessary) to hide such a qubit as a whole so that one can get the qubit back perfectly; see Mosca et al. (2000).
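Both one-time pads described above can be illustrated in a few lines. This is a minimal sketch, assuming a qubit is represented by its pair of amplitudes and that encryption applies $\sigma_z^{k'}$ first and $\sigma_x^{k}$ second; the function names are hypothetical. The last function averages the four equally likely ciphertexts to show that an observer without the key sees the same density matrix, $I/2$, for every plaintext qubit.

```python
def xor_pad(bits, key):
    """Classical one-time pad: the same XOR both encrypts and decrypts."""
    return [b ^ k for b, k in zip(bits, key)]


def pauli_encrypt(qubit, kx, kz):
    """Apply sigma_z^kz and then sigma_x^kx to the amplitudes (a, b)."""
    a, b = qubit
    if kz:
        b = -b          # sigma_z flips the sign of the |1> amplitude
    if kx:
        a, b = b, a     # sigma_x swaps the two amplitudes
    return (a, b)


def pauli_decrypt(cipher, kx, kz):
    """Undo pauli_encrypt: apply sigma_x^kx and then sigma_z^kz."""
    a, b = cipher
    if kx:
        a, b = b, a
    if kz:
        b = -b
    return (a, b)


def density(qubit):
    """Density matrix |q><q| of a pure single-qubit state."""
    a, b = complex(qubit[0]), complex(qubit[1])
    return [[a * a.conjugate(), a * b.conjugate()],
            [b * a.conjugate(), b * b.conjugate()]]


def averaged_ciphertext(qubit):
    """Mixed state seen without the key: uniform average over both key bits."""
    rho = [[0j, 0j], [0j, 0j]]
    for kx in (0, 1):
        for kz in (0, 1):
            d = density(pauli_encrypt(qubit, kx, kz))
            for i in range(2):
                for j in range(2):
                    rho[i][j] += d[i][j] / 4
    return rho
```

For instance, `averaged_ciphertext((0.6, 0.8j))` and `averaged_ciphertext((1, 0))` both evaluate to the matrix with 1/2 on the diagonal and 0 elsewhere, which is the sense in which two key bits hide one qubit.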

6 Outcomes and challenges of quantum formal systems

In classical informatics, the development of high-level formal systems, based on the concepts and tools of logic and formal semantics, to precisely specify and reason about computation, cooperation and communication processes in general, and about algorithms, protocols and concurrent systems in particular, has turned out to be of large importance for the design and analysis of provably correct software for computation and communication systems. At the same time, this line of research in classical informatics has brought theoretically surprisingly deep and practically very important and useful insights and outcomes concerning the laws and limitations of very large information processing, cooperation and communication systems. The classical complexity theory research community, with its emphasis on lower bounds, clearly underestimated the importance of this area of research for a long time. However, step by step, this area of research, based on logic, formal semantics and abstraction, actually started to dominate broadly understood theoretical computer science, and there are good reasons to believe that this can be so, and even more so, also in the area of classical/quantum computing. Moreover, there is also a good chance that this area of research can bring new viewpoints and tools for dealing with quantum mechanics in general, and with quantum information processing and communication in particular, and can shed new light on these areas. There are two main reasons why quantum (quantum/classical) programming theory is much needed and has a chance to be insightful and useful. First, any formal description of algorithms, protocols and processes that make use of quantum phenomena has to take into account both quantum and classical computation, cooperation and communication components and assemble them in such a way that they coexist, communicate and cooperate.
(For example, preparation of quantum states is an (always inevitable) example of classical/quantum interaction, and quantum measurement (with control actions that depend on its random outcome) is an (always inevitable) example of a classical/quantum/classical interaction.) One can also say that classical/quantum interaction and cooperation is inherent in classical/quantum information processing and communication. Fortunately, the concepts and tools developed in classical programming theory have been so abstract and powerful that they are now quite easy to adjust to cover the classical-quantum case in a homogeneous way. Secondly, the concepts and tools developed in classical programming theory are so abstract and powerful that they allow one to generalize naturally the current (von Neumann) quantum mechanical framework, which was created to deal just with a "minimal view of quantum mechanics". This more general framework, which
allows one to regard the current view of quantum mechanics as one possible model, has its advantages. Some of the main challenges in this area can be seen as follows:
- To develop quantum and classical/quantum versions of the formal systems for description, analysis and verification of algorithms, protocols and computation/communication systems that have turned out to be so important for classical information processing (see, for example, Lalire and Jorrand (2004), who developed a process algebra approach to concurrent and distributed quantum computation).
- To develop abstract (for example, category theory based) approaches to quantum/classical information processing and communication and also to quantum mechanics itself (see, for example, the approach of Abramsky and Coecke (2004), who recast the standard axiomatic presentation of quantum mechanics at a more abstract level in terms of category theory; this new and more abstract approach creates new possibilities to reason about quantum mechanics).
- To develop a new understanding of fundamental quantum phenomena using ideas and concepts coming from logic- and semantics-based formal systems (see, for example, Coecke (2005), who developed a so-called picture calculus for quantum mechanics as a natural extension of Dirac's notation).

7 Outcomes and challenges in beating decoherence

Decoherence - a destructive impact of the environment on any quantum information process - used to be seen, and is still seen by many, as the main, and even unbeatable, obstacle to our goal of having reliable and powerful quantum information processing. One of the reasons for that was a conviction that, from the physics point of view, sufficiently powerful quantum error correction is impossible for a variety of reasons. Some of them were the beliefs that in the quantum case the number of potential quantum errors is infinite (Nature actually does not make errors; it can only behave differently than we wish or expect), that any attempt to detect errors by measurement would destroy, in an irreversible way, the erroneous state, and, finally, that quantum error correcting codes would need to fight successfully, and in polynomial time, exponentially fast growing decoherence, which again looked impossible. However, it has fortunately turned out that, under very reasonable assumptions, it is sufficient to consider two types of errors: a bit error, which is performed by the Pauli $\sigma_x$ operator ($\sigma_x(\alpha|0\rangle + \beta|1\rangle) = \alpha|1\rangle + \beta|0\rangle$), and a sign error, performed by the Pauli $\sigma_z$ operator ($\sigma_z(\alpha|0\rangle + \beta|1\rangle) = \alpha|0\rangle - \beta|1\rangle$). It was then shown, especially by Shor (1995, 1996), that not only do sufficiently powerful error correcting codes and processes exist, but also that quantum information processing can be realized in a fault-tolerant way. All that has been achieved by clever generalizations of ideas known from classical linear codes. The second major breakthrough came with the discovery of threshold theorems, which say that if elementary gates and channels have a certain reliability, then, using so-called concatenated codes, arbitrarily long (in time and space) reliable quantum information processing and communication is possible. Each such threshold theorem establishes some bounds, and any improvement of the upper and lower bounds on such thresholds is currently an important task and challenge that could help us to see realistically what needs to be achieved and where we currently stand concerning the development of elements for QIPC. Concerning fighting decoherence, the main current challenges are: (a) to develop error models for specific QIPC technologies and quantum error correcting codes for them; (b) to develop error detecting and error preventing codes; (c) to generalize the concept of errors (see, for example, the concept of nice error bases in Klappenecker and Rötteler (2000)); (d) to explore various ideas of so-called error-free subspaces.
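How a bit error can be detected and undone without learning the encoded amplitudes can be illustrated with the three-qubit repetition code, a direct quantum analogue of the classical repetition code. The sketch below is not the construction of Shor (1995, 1996); it is a simplified illustration that handles a single $\sigma_x$ error only (sign errors can be treated analogously in the Hadamard basis), and all function names are hypothetical.

```python
def encode(alpha, beta):
    """Encode alpha|0> + beta|1> as alpha|000> + beta|111> (8 amplitudes)."""
    state = [0j] * 8
    state[0b000] = complex(alpha)
    state[0b111] = complex(beta)
    return state


def flip(state, qubit):
    """Apply a bit error (Pauli sigma_x) to one qubit; qubit 0 is the leftmost bit."""
    mask = 1 << (2 - qubit)
    return [state[i ^ mask] for i in range(8)]


def syndrome(state):
    """Parities of neighbouring qubits.

    After at most one bit error, every basis state with nonzero amplitude
    has the same two parities, so reading them reveals nothing about alpha
    and beta. A real device would measure them using ancilla qubits; here
    they are simply read off the simulated amplitudes.
    """
    for index, amp in enumerate(state):
        if abs(amp) > 1e-12:
            b0, b1, b2 = (index >> 2) & 1, (index >> 1) & 1, index & 1
            return (b0 ^ b1, b1 ^ b2)
    raise ValueError("state has no nonzero amplitude")


# Which qubit to flip back for each syndrome (None means no error detected).
CORRECTION = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}


def correct(state):
    """Undo a single bit error indicated by the syndrome."""
    qubit = CORRECTION[syndrome(state)]
    return state if qubit is None else flip(state, qubit)
```

For any amplitudes, `correct(flip(encode(alpha, beta), q))` returns the encoded state again for q = 0, 1, 2, while two simultaneous bit errors are miscorrected, which is the usual limitation of this small code.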

8 Outcomes and challenges in beating quantum limitations and barriers We will discuss here only three limitations: the one established by no-cloning theorem and its variations (no-deletion theorem and so on), and so called Turing barrier and BQP-barrier. Buzek and Hillery (1996) were first to show that one can determine a reachable upper bound on the best way how to do cloning on qubits in an approximate way. Their results have been generalized in various ways to cover Hilbert space of larger dimension and other mathematically well defined operation that cannot be realized perfectly physically. Finally, let us discuss Turing barrier or better Church-Turing barrier. Turing thesis, or Church-Turing thesis, can be formulated as follows: Every function that can be computed by what we would naturally regard as an algorithm is a computable function, and vice versa. So called Turing principle, formulated by Deutsch, reads as follows: Every finitely realizable physical system can be perfectly simulated by a universal computing machine operating by finite means. It is important to realize that Church-Turing thesis can be seen as one of the guiding principles for mathematic, physics and informatics and that since its very beginning Church-Turing thesis is under permanent attack from both mathematical and physical sciences. In mathematics and computing, all these attack used to be based on uncritical use of infinity, continua and density. It is also important to realize that recognition of physical aspects of the Church-Turing thesis has had important impacts also for physics. Turing barrier puts important restriction when searching for new physical theories It is interesting and important to ask and answer the question what is the sense of trying to beat such a barrier that seems to be unbeatable. To that one can say the following: (a) It is interesting and intellectually usually very

40

J. Gruska

rewarding to overcome limitations that seem to be unquestionable; One has to realize that limits of mathematics ought to be determined not solely by mathematics itself but also by physical principles; Attempts to show that there is a way to overcome Turing barriers are an important way to improve our understanding of physical world and nature in general and to find in it new important resources and/or theories.^•^ Two other questions are of interest for us now. Is there a chance to overcome this barrier and can we use quantum phenomena to do that? An extended version of Church-Turing thesis, that captures an important new phenomenon in computing - the existence of global computing network that continuously interact with environment, keep changing/evolving, works practically without an end and have inputs that can be seen as non-uniform, van Leeuwen and Wiedermann (2001) have shown that any (non-uniform interactive) network computation can be described in terms of interactive Turing machines with advices^^ that are equivalent to so called site machines and also equivalent to internet machines (GRID-networks) (that is a model inspired by computer networks and distributed computing). All these models accept all recursively enumerable sets and their complements. The Extended Church-Turing Thesis (or VW-thesis of van Leeuven and Wiedermann) does not aim to attack the Church-Turing thesis; VW-thesis merely tries to identify a new proper extension of Church-Turing thesis (to cover computations that share the following features: non-uniformity of programs, interaction of machines and infinity of operations). VW-thesis tries to see the concept of computation in a broader sense, based on different assumptions and suited to answer different questions. Since it is possible, in a sense, to get beyond, in the classical world, it is natural to see as a challenge to do so even more in quantum world. 
The attempts, as those of Kieu (2001), who has tried to show a quantum way to solve Hilbert's 10th problem, can hardly be seen as successful, as analysed by Hodges (2006). On the other hand, there seem to be more successful attempts to do so using some other physical principles. For example, Etesi and Nemeti (2002) showed that certain relativistic space-time theories license the idea of observing the infinity of certain discrete processes in finite time. That led to the observation that certain relativistic computers could carry certain undecidable queries in finite time. On this basis Wiedermann and van Leeuwen (2005) designed a In this context one can see as especially valid the following thought When you try to reach for stars you may not quite get one, but you won't come with a handful of mud either, by Leo Burnett. The idea of advices has the following motivation: Many systems in Nature prefer to sit in highly entangled multipartite states. Is it possible to make use of that to get an extra computational power (see Nielsen and Chuang, 2000)? Technically, we get to the following problem: Are quantum advices more powerful than classical? In other words, is (BQP/qpoly = BQP/poly?) Concerning the power of advices, the following result of (Raz, 2005) is of interest. A quantum interactive proof system at which the verifier gets quantum advices can solve any problem whatsoever.

Prom Informatics to Quantum Informatics

41

relativistic Turing machine that models the above relativistic computer and that recognizes exactly A2 set of Arithmetical Hierarchy. BQP-barrier says that effectively computable are problems that are in B Q P . The question whether we can beat this barrier seems to be more intriguing and it does not have (yet) such a statue of unbeatability as other barriers. Actually, previous versions of this barrier, that included complexity classes P and B P P , seem to be beaten, though we are not sure, yet.^^ There are still many mysteries concerning the class B Q P . Not only we do not know whether N P C B Q P , but we even do not know whether N P C B Q P would imply P = N P . Related to that is the NP-barrier that says that not all NP-complete problems can be solved in polynomial time using the resources of the physical world. There have been many attempts to beat NP-barrier and they are to large extend well summarized and analyzed by Aaronson (2005a). He discuss such ideas as quantum adiabatic computing, variations on quantum mechanics (nonlinearity, hidden variable theories), analog computing, but also more esoteric ones as relativity computing^^, time travel computing, quantum field, string and gravity theories, and even anthropic computing^^ Main conclusions are: (a) searches for overcoming N P barriers are important, they can bring a better understanding of the physical worlds; (b) none of the well specified attempts is successful - they usually forget to count all resources needed and/or all physics known. In connection with NP-barrier, of interest and importance is the question, see Aaronson (2005) whether we should not take "NP-hardness assumption" saying that l>i'P-complete problems are intractable in the physical world as a new principle of physics (as, for example. Second Law of Thermodynamic is). This principle starts to be used. Perhaps main problem with it is that why N P , why not B Q P or # P or P S P A C E . 
On a more philosophical level, all the above considerations lead to two basic questions: Is the universe computable? Is it efficiently computable? It is nowadays clear that the assumption of the founders of Hilbert space quantum mechanics, that any state and observable are in principle implementable, is wrong: it would allow uncomputable functions to be computed. Less clear is what to consider as feasible.

^ In this connection it is perhaps worth observing that, on the one hand, likely nobody believes that the classes P and BPP are identical and, on the other hand, Impagliazzo and Wigderson (1997) gave quite convincing evidence that they are.

^ The idea behind relativity computing can be informally described as follows: one sets a computer to work on an intractable problem, then boards a spaceship and accelerates it to nearly the speed of light. After the return to Earth, the answer will be waiting (though all of the traveller's friends will be long dead).

^ Anthropic computing refers to models of computing in which the probability of one's own existence might depend on a computer's output.


9 Impact of quantum informatics

Let us try to summarize briefly three impacts of quantum informatics: on quantum physics, on quantum information processing and communication, and on (classical) informatics itself.

Impacts on (quantum) physics: Quantum informatics brings to quantum physics a new way of thinking, new value systems, new, more general and more precise ways of formulating the laws and limitations of quantum physics, new ways of getting around, in a reasonable way, its otherwise strict laws, and also a variety of new technical concepts, methods, tools and results. It brings new paradigms, concepts, models, measures and so on. It helps to increase the quality of reasoning and findings in quantum physics. Quantum complexity theory helps to establish principles (see Aaronson (2003, 2004, 2005, 2005a)) that make it possible to see the impossibility of some physical phenomena and to restrict the search space for new physical theories in general and for variations of quantum mechanics in particular.

Impacts on quantum information processing and communication technology: Quantum informatics helps to discover and analyse the power of quantum information processing primitives and their optimal use (see, for example, Gruska (2005)); to see the merits, potential and limitations of candidate technologies even without doing experiments; and to discover ways to manage and fool quantum decoherence.

Impacts on informatics itself: Just as the development of probability theory brought a variety of powerful methods for solving problems of "classical" (in this context, non-probabilistic) mathematics, and powerful tools for practically all areas of science and technology, the development of quantum informatics can be expected to bring (and already brings) a variety of paradigms, methods and tools that can be used to deal with problems of classical informatics and also of many areas of science and technology, especially those dealing with the microworld.
Some of the first examples of how quantum tools can be used to solve non-quantum problems have been demonstrated by de Wolf (2005). Moreover, taking into consideration that computation, communication, security and feasibility are, in a way, also physical concepts, quantum informatics also allows informatics to meet its main goals in a more proper way.

10 Conclusion

The development of quantum information processing science and technology has come to the point where further significant progress requires a new view of the overall aims, scope, methods and primitives of the underlying sciences and technologies that need to be developed. Pursuing many more of the paradigms, viewpoints, methods and tools of quantum informatics is one of the ways to go.


References

1. S. Aaronson. Multilinear formulas and skepticism of quantum computing, quant-ph/0311039, 2003.
2. S. Aaronson. Is quantum mechanics an island in Theoryspace? quant-ph/0401062, 2004.
3. S. Aaronson. Are quantum states exponentially long vectors? quant-ph/0507242, 2005.
4. S. Aaronson. NP-complete problems and physical reality, quant-ph/0502072, 2005a.
5. S. Abramsky and B. Coecke. A categorical semantics of quantum protocols, quant-ph/0402130, 2004.
6. D. Aharonov, A. Ambainis, J. Kempe, and U. Vazirani. Quantum walks on graphs. In Proceedings of 33rd STOC, pages 50-59, 2000.
7. A. Ambainis. Quantum lower bounds by quantum arguments, quant-ph/0002066, 2000.
8. A. Ambainis. Quantum walks and their algorithmic applications, quant-ph/0403120, 2004.
9. A. Ambainis and J. Watrous. Two-way finite automata with quantum and classical states, quant-ph/9911009, 1999.
10. Ch. H. Bennett. Logical reversibility of computation. IBM Journal of Research and Development, 17:525-532, 1973.
11. Ch. H. Bennett and G. Brassard. The dawn of a new era for quantum cryptography. The experimental prototype is working! SIGACT News, 20(4):78-82, 1989.
12. Ch. H. Bennett and G. Brassard. Quantum cryptography: public key distribution and coin tossing. In Proceedings of IEEE Conference on Computers, Systems and Signal Processing, Bangalore (India), pages 175-179, 1984.
13. Ch. H. Bennett, G. Brassard, C. Crepeau, R. Jozsa, A. Peres, and W. K. Wootters. Teleporting an unknown quantum state via dual classical and Einstein-Podolsky-Rosen channels. Physical Review Letters, 70:1895-1899, 1993.
14. G. Brassard, A. Broadbent, and A. Tapp. Quantum telepathy, quant-ph/0306042, 2003.
15. B. Brezger, L. Hackermüller, S. Uttenthaler, J. Petschinka, M. Arndt, and A. Zeilinger. Matter-wave interferometer for large molecules, quant-ph/0202158, 2002.
16. H. Buhrman, R. Cleve, and A. Wigderson. Quantum versus classical communication complexity. In Proceedings of 30th STOC, pages 63-68, 1998.
17. V. Buzek and M. Hillery. Quantum copying: beyond the no-cloning theorem. Physical Review A, 54:1844-1852, 1996.
18. N. Cerf, N. Gisin, S. Massar, and S. Popescu. Quantum entanglement can be simulated without communication. Physical Review Letters, 94:220403, 2005.
19. A. M. Childs, J. Preskill, and J. Renes. Quantum information and precision measurement, quant-ph/9904021, 1999.
20. B. S. Cirel'son. Quantum generalizations of Bell's inequality. Letters in Mathematical Physics, 4(2):93-100, 1980.
21. B. Coecke. Kindergarten quantum mechanics, quant-ph/0510032, 2005.
22. R. de Wolf. Lower bounds on metric rigidity via a quantum measurement, quant-ph/0505188, 2005.


23. D. Deutsch. Quantum theory, the Church-Turing principle and the universal quantum computer. Proceedings of Royal Society of London A, 400:97-117, 1985.
24. A. K. Ekert. Quantum cryptography based on Bell's theorem. Physical Review Letters, 67(6):661-663, 1991.
25. C. Elliott, A. Colvin, D. Pearson, O. Pikalo, J. Schlafer, and H. Yeh. Current status of the DARPA quantum network, quant-ph/0503058, 2003.
26. A-N. Zhang et al. Quantum-relay-assisted key distribution over high photon loss channels, quant-ph/0508062, 2005.
27. Ch-Z. Peng et al. Experimental free-space distribution of entangled photon pairs over a noisy ground atmosphere of 13 km, quant-ph/0412218, 2004.
28. M. Arndt et al. Quantum physics from A to Z, quant-ph/0505187, 2005.
29. E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser. Quantum computation by adiabatic evolution, quant-ph/0001106, 2000.
30. G. Brassard, H. Buhrman, N. Linden, A. A. Methot, A. Tapp, and F. Unger. A limit on nonlocality in any world in which communication complexity is not trivial, quant-ph/0508042, 2005.
31. N. Gisin. Can relativity be considered complete? From Newton nonlocality to quantum nonlocality and beyond, quant-ph/0512168, 2005.
32. L. K. Grover. Quantum mechanics helps in searching for a needle in a haystack. Physical Review Letters, 78:325-328, 1997a.
33. J. Gruska. Quantum computing. McGraw-Hill, 1999-2005. See also additions and updatings of the book on http://www.mcgraw-hill.co.uk/gruska.
34. J. Gruska. Descriptional complexity issues in quantum computing. Automata, Languages and Combinatorics, 5(3):198-218, 2000.
35. J. Gruska. Quantum entanglement as a new quantum information processing resource. New Generation Computing, 21:279-295, 2003.
36. J. Gruska. General Theory of information transfer and combinatorics, chapter Universal sets of quantum information processing primitives and optimal use of such primitives, pages 356-377. Springer-Verlag, 2005.
37. J. Gruska. Quantum complexity theory goals and challenges. International Journal of Quantum Information, 3(1):31-39, 2005.
38. M. Hillery, V. Buzek, and M. Ziman. Approximate programmable quantum processors, quant-ph/0510161, 2005.
39. A. Hodges. Can quantum computing solve classically unsolvable problems? quant-ph/0512248, 2005.
40. R. Impagliazzo and A. Wigderson. P = BPP unless E has subexponential circuits: derandomization. In Proceedings of 29th STOC, pages 220-229, 1997.
41. B. Julsgaard, A. Kozhekin, and E. S. Polzik. Experimental long-lived entanglement of two macroscopic objects, quant-ph/0106057, 2001.
42. T. D. Kieu. Quantum algorithm for Hilbert's tenth problem, quant-ph/0110136, 2001.
43. A. Klappenecker and M. Rötteler. Beyond stabilizer codes I: nice error bases, quant-ph/0010082, 2000.
44. M. Lalire and Ph. Jorrand. A process algebraic approach to concurrent and distributed quantum computation: operational semantics, quant-ph/0407005, 2004.
45. D. A. Lidar, I. L. Chuang, and K. B. Whaley. Decoherence-free subspaces for quantum computing. Physical Review Letters, 81:2594-2598, 1999.
46. I. Marcikic, H. de Riedmatten, W. Tittel, H. Zbinden, M. Legre, and N. Gisin. Distribution of time-bin entangled qubits over 50 km of optical fiber, quant-ph/0404124, 2004.


47. G. Mitchison and R. Jozsa. Counterfactual computations, quant-ph/9907007, 1999.
48. M. Mosca, A. Tapp, and R. de Wolf. Private quantum channels and the cost of randomizing quantum information, quant-ph/0003101, 2000.
49. M. A. Nielsen and I. L. Chuang. Quantum information processing. Cambridge University Press, 2000.
50. M. A. Nielsen and I. L. Chuang. Programmable quantum gate arrays, quant-ph/9703032, 1997.
51. S. Perdrix. State transfer instead of teleportation in measurement-based quantum computation, quant-ph/0402204, 2004.
52. S. Perdrix and Ph. Jorrand. Classically controlled quantum computing, quant-ph/0407008, 2004a.
53. S. Perdrix and Ph. Jorrand. Measurement-based quantum Turing machines and questions of universalities, quant-ph/0402156, 2004.
54. S. Popescu and D. Rohrlich. Causality and non-locality as axioms for quantum mechanics, quant-ph/9709026, 1997.
55. R. Raussendorf and H. J. Briegel. Quantum computing by measurements only. Physical Review Letters, 86, 2004.
56. R. Raz. Exponential separation of quantum and classical communication complexity. In Proceedings of 31st ACM STOC, pages 358-367, 1999.
57. V. Scarani. Feats, features and failures of the PR-box, quant-ph/0603017, 2006.
58. V. Scarani, W. Tittel, H. Zbinden, and N. Gisin. The speed of quantum information and the preference frame: analysis of experimental data, quant-ph/0007008, 2000.
59. B. Schumacher and R. F. Werner. Reversible quantum cellular automata, quant-ph/0405184, 2004.
60. Y. Shi. Both Toffoli and controlled-NOT need little help to do universal computation, quant-ph/0205115, 2002.
61. P. W. Shor. Algorithms for quantum computation: discrete log and factoring. In Proceedings of 36th IEEE FOCS, pages 124-134, 1994.
62. P. W. Shor. Scheme for reducing decoherence in quantum computer memory. Physical Review A, 52:2493-2496, 1995.
63. P. W. Shor. Fault-tolerant quantum computation. In Proceedings of 37th IEEE FOCS, pages 56-65, 1996.
64. T. Short, N. Gisin, and S. Popescu. The physics of no-bit commitment: generalized quantum non-locality versus oblivious transfer, quant-ph/0504134, 2005.
65. R. Somma, H. Barnum, and G. Ortiz. Efficient solvability of hamiltonians and limits on the power of some quantum computational models, quant-ph/0601030, 2006.
66. W. van Dam. Implausible consequences of superstrong nonlocality, quant-ph/0501159, 2005.
67. J. van Leeuwen and J. Wiedermann. Mathematics unlimited, 2001 and beyond, chapter The Turing machine paradigm in contemporary computing, pages 1139-1156. Springer-Verlag, 2001.
68. J. J. Vartiainen, M. Mottonen, and M. M. Salomaa. Efficient decomposition of quantum gates, quant-ph/0312218, 2003.
69. F. Vatan and C. Williams. Optimal realization of a general two-qubit quantum gate, quant-ph/0308006, 2003.
70. F. Vatan and C. Williams. Realization of a general three-qubit quantum gate, quant-ph/0401178, 2004.


71. J. Watrous. Quantum zero-knowledge proofs, quant-ph/0511020, 2005.
72. J. Wiedermann and J. van Leeuwen. Relativistic computers and non-uniform complexity theory. In Proceedings of CMU'02, LNCS 2509, pages 287-299, 2002.

Distributed Algorithms for Autonomous Mobile Robots

Giuseppe Prencipe^1 and Nicola Santoro^2

^1 Dipartimento di Informatica, Università di Pisa, [email protected]
^2 School of Computer Science, Carleton University, santoro@scs.carleton.ca

Abstract. The distributed coordination and control of a team of autonomous mobile robots is a problem widely studied in a variety of fields, such as engineering, artificial intelligence, artificial life, and robotics. Generally, in these areas, the problem is studied mostly from an empirical point of view. Recently, a significant research effort has been, and continues to be, spent on understanding the fundamental algorithmic limitations on what a set of autonomous mobile robots can achieve. In particular, the focus is to identify the minimal robot capabilities (sensorial, motorial, computational) that allow a problem to be solvable and a task to be performed. In this paper we describe the current investigations on the interplay between robots' capabilities, computability, and algorithmic solutions of coordination problems by autonomous mobile robots.

Please use the following format when citing this chapter: Prencipe, G., Santoro, N., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 47-62.

1 Introduction

In this paper we describe the current investigations on the algorithmic limitations of what autonomous mobile robots can do with respect to basic coordination problems. The current trend in robotic research, from both engineering and behavioral viewpoints, has been to move away from the design and deployment of a few, rather complex, usually expensive, application-specific robots. Instead, interest has shifted towards the design and use of a large number of "generic" robots that are very simple, with very limited capabilities and, thus, relatively inexpensive, but capable, together, of performing rather complex tasks. The advantages of such an approach are clear and many, including: reduced costs (due to simpler engineering and construction, faster computation, shorter development and deployment time, etc.); ease of system expandability (just add a few more robots), which in turn allows for incremental and on-demand deployment (use only as few robots as you need and when you need them); simple and affordable fault-tolerance capabilities (replace just the faulty robots); and re-usability of the robots in different applications (reprogram the system to perform a different task). Moreover, tasks that could not be performed at all by a single agent become manageable when many simple units are used instead [19, 34].

One of the first studies conducted in this direction in the AI community is that of Mataric [30]. The main idea in Mataric's work is that "interactions between individual agents need not to be complex to produce complex global consequences". Other investigations in the AI community include the study of [4] on stigmergy communication and on the use of a set of simple robots that operate completely autonomously and independently to collect pucks spread over a square arena into a single cluster; the ALLIANCE architecture and the studies on selfish behavior of cooperative robots in animal societies by Parker [34]; the formation and navigation problems in multi-robot teams in the context of primitive animal behavior in pattern formation by Balch and Arkin [3]; and the experiments in cooperative cleaning behavior of Jung et al. [28]. Alternative approaches to the study of multi-robot systems can be found in the CEBOT system of Fukuda, Kawaguchi et al. [25, 29], in the planner-based architecture of Noreils [32], in the information requirements theory of Donald et al. [19] (see [6] for a survey), in the Swarm Intelligence of Beni and Hackwood [5], in the Self-Assembly Machine ("fructum") of Murata et al. [31], etc. The common feature of all these approaches is that they do not deal with formal correctness and are analyzed only empirically. In all these investigations, algorithmic aspects were somehow implicitly an issue, but clearly not a major concern, let alone the focus, of the study.

An investigation with an algorithmic flavor has been undertaken within the AI community by Durfee [20], who argues in favor of limiting the knowledge that an intelligent robot must possess in order to be able to coordinate its behavior with others.

Recently, the problem has been tackled from a different perspective: a computational point of view. In other words, the focus is to understand the relationship between the capabilities of the robots and the solvability of the tasks they are given.
In these studies, the impact of the knowledge of the environment is analyzed: can the robots form an arbitrary geometric pattern if they have a compass? Can they gather at a point? What information must each robot have about its fellows in order for them to collectively achieve their goal? The goal is to find the minimum power to give to the robots so that they can solve a given task and, hence, to formally analyze the strengths and weaknesses of distributed coordination and control.

In this paper we describe the current investigations on the interplay between robots' capabilities, computability, and algorithmic solutions of coordination problems by autonomous mobile robots. In Section 2 we describe the model used in these investigations. In Section 3 we review some results related to the analysis of the problem of pattern formation by autonomous mobile robots. Finally, in Section 4 we draw some conclusions and present suggestions for further study.


2 Modeling Autonomous Mobile Robots

In the general model, the computational universe is a two-dimensional plane populated by a set of n autonomous mobile robots, denoted by r_1, ..., r_n, that are modeled as devices with computational capabilities which are able to move freely on the plane.

2.1 The robots and their behavior

A robot is a computational unit capable of sensing the positions of other robots in its surroundings, performing local computations on the sensed data, and moving towards the computed destination. The local computation is done according to a deterministic algorithm that takes as input the sensed data (i.e., the robots' positions) and returns a destination point towards which the executing robot moves. All the robots execute the same algorithm. The local view of each robot includes a unit of length, an origin, and a Cartesian coordinate system defined by the directions of two coordinate axes, identified as the x and y axes, together with their orientations, identified as the positive and negative sides of the axes.

Each robot repeatedly cycles through four states: (i) initially it is inactive - Wait; (ii) it observes the positions of the other robots in its area of visibility - Look; (iii) it computes its next destination point by executing the algorithm (the same for all robots) - Compute; and (iv) it moves towards the point it just computed - Move. After the Move it goes back to the Wait state. The sequence Wait - Look - Compute - Move forms a computation cycle (or, briefly, cycle) of a robot. The operations performed by each robot r in each state are now described in more detail.

1. Wait. The robot is idle. A robot cannot stay idle indefinitely. Initially, all robots are in Wait.
2. Look. The robot observes the world by activating its sensors, which return a snapshot of the positions of all other robots within the visibility range with respect to its local coordinate system. Each robot is viewed as a point, hence its position in the plane is given by its coordinates, and the result of the snapshot (hence, of the observation) is just a set of coordinates in its local coordinate system: this set forms the view of the world of r.
3. Compute. The robot performs a local computation according to a deterministic algorithm A (we also say that the robot executes A). The algorithm is the same for all robots, and the result of the Compute state is a destination point.
4. Move. If the destination point is the current location of r, r performs a null movement (i.e., it does not move); otherwise it moves towards the computed destination, but it can stop at any time during its movement^.

^ For example, because of limits to the robot's motorial capabilities.
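The four states described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' formalization: the names `look`, `move`, `cycle` and `centroid_rule` are hypothetical, positions are kept in one global coordinate system for simplicity (the model gives each robot its own local system), and the `fraction` parameter is an illustrative stand-in for the unpredictable point at which a robot may stop during a Move.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float]
# Hypothetical type for the shared deterministic algorithm A:
# (own position, snapshot of the other robots) -> destination point.
Algorithm = Callable[[Point, List[Point]], Point]


def look(index: int, positions: List[Point],
         visibility: float = math.inf) -> List[Point]:
    """Look: snapshot of the other robots within the visibility range."""
    me = positions[index]
    return [p for i, p in enumerate(positions)
            if i != index and math.dist(me, p) <= visibility]


def move(start: Point, destination: Point, fraction: float = 1.0) -> Point:
    """Move: travel towards the destination, possibly stopping early.

    `fraction` in [0, 1] is an assumption of this sketch; it models the
    fact that a robot can stop at any time during its movement.
    """
    return (start[0] + fraction * (destination[0] - start[0]),
            start[1] + fraction * (destination[1] - start[1]))


def cycle(index: int, positions: List[Point], algorithm: Algorithm,
          fraction: float = 1.0, visibility: float = math.inf) -> Point:
    """One Look - Compute - Move cycle of robot `index` (Wait is implicit)."""
    snapshot = look(index, positions, visibility)          # Look
    destination = algorithm(positions[index], snapshot)    # Compute
    return move(positions[index], destination, fraction)   # Move


def centroid_rule(me: Point, others: List[Point]) -> Point:
    """Example algorithm (not from the text): head for the centroid."""
    pts = [me] + others
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))
```

Because `algorithm` receives only the current snapshot, any rule plugged in this way is oblivious in the sense used later in the paper; a non-oblivious robot would need extra state carried between calls to `cycle`.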


The robots are completely autonomous: no central control is needed. Furthermore they are anonymous, meaning that they are a priori indistinguishable by their appearance, and they do not (need to) have any kind of identifiers that can be used during the computation. Moreover, there are no explicit direct means of communication: any communication occurs in a totally implicit manner. Specifically, it happens by means of observing the robots' positions in the plane and taking a deterministic decision accordingly. In other words, the only means for a robot to send information to some other robot is to move and let the others observe (reminiscent of bees in a bee dance).

2.2 Levels of Synchronization

The model, in its general setting, makes no assumptions about the level of synchronization of the robots. The assumptions on the level of synchronization, however, have a deep impact on computability; in fact, there are problems that are unsolvable in the general setting but can be solved in a synchronous setting (e.g., see [36]). The situation is analogous to the one occurring in the distributed computing field, and the settings are described in this section.

General Setting: Asynchronous Robots

In the general setting, no assumptions are made on the cycle time of each robot, or on the time each robot takes to execute each state of a given cycle. It is only assumed that each cycle is completed in finite time, and that the distance traveled in a cycle is finite. Moreover, the robots do not need to have a common notion of time, and each robot can execute its actions at unpredictable time instants. More precisely, there are only two limiting assumptions. The first one refers to space; namely, to the distance traveled by a robot during a computational cycle.

Assumption A1 (Finite Distance). The distance traveled by a robot r in a move is not infinite. Furthermore, there exists an arbitrarily small constant δ_r > 0 such that, if the destination point is closer than δ_r, r will reach it; otherwise, r will move towards it by at least δ_r.

As no other assumptions on space exist, the distance traveled by a robot in a cycle is unpredictable. The second limiting assumption is on the length of a cycle.

Assumption A2 (Finite Cycle). The amount of time required by a robot r to complete a computational cycle is not infinite. Furthermore, there exists a constant ε_r > 0 such that the cycle will require at least ε_r time.

As no other assumption on time exists, the resulting system is fully asynchronous and the duration of each activity (or inactivity) is unpredictable. There are two important consequences:


1. Since the time that passes after a robot starts observing the positions of all others and before it starts moving is arbitrary, but finite, the actual move of a robot may be based on a situation that was observed arbitrarily far in the past, and that may therefore be totally different from the current situation.
2. Since movements can take a finite but unpredictable amount of time, and different robots might be in different states of their cycles at a given time instant, it is possible that a robot is seen while it is moving by other robots that are observing^.

^ Note that this does not mean that the observing robot can distinguish a moving robot from a non-moving one.

These consequences make it difficult to design an algorithm to control and coordinate the robots. For example, when a robot starts a Move, it is possible that the movement it performs is not "coherent" with the current configuration (i.e., the configuration it observed at the time of the Look and the configuration at the time of the Move can differ), since other robots may have moved during the Compute.

Restricted Setting: Semi-synchronous Robots

A computational setting that has been extensively investigated is one in which the cycles of all the robots are synchronized and their actions are atomic. In particular, there is a global clock tick reaching all robots simultaneously, and a robot's cycle is an instantaneous event that starts at a clock tick and ends by the next. The only unpredictability (hence the name semi-synchronous) is given by the fact that at each clock tick every robot is either active or inactive, and only active robots perform their cycle. The unpredictability is restricted by the fact that at least one robot is active at every time instant, and every robot becomes active at infinitely many unpredictable time instants. A very special case is when every robot is active at every clock tick; in this case the robots are fully synchronized. In this setting, at any given time, all active robots are executing the same cycle state; thus no robot will look while another is moving. In other words, a robot observes other robots only when they are stationary. This implies that the computation is always performed based on accurate information about the current configuration. Furthermore, since no robot can be seen while it is moving, the movement can be considered instantaneous. An additional consequence of atomicity and synchronization is that, for them to hold, the maximum distance that a robot can move in one cycle is bounded.

2.3 Capabilities

Different settings arise from the different assumptions that are made on the robots' capabilities, and on the amount of information that they share and use during the accomplishment of the assigned task. In particular,


- Visibility. The robots may be able to sense the complete plane or just a portion of it. We will refer to the first case as the Unlimited Visibility case. In contrast, if each robot can sense only up to a distance V > 0 from it, we are in the Limited Visibility case. In the following, we will also say that the robots have unlimited/limited visibility. In addition, a robot cannot in general detect whether there is more than one fellow robot on any of the observed points, including the position where the observing robot is. We say it cannot detect multiplicity.
- Agreement on Coordinate System. The robots do not necessarily share the same x-y coordinate system, and do not necessarily agree on the location of the origin (which we can assume, without loss of generality, to be placed in the current position of the robot), or on the unit distance. In general, there is no agreement among the robots on the chirality of the local coordinate systems (i.e., in general they do not share the same concept of where North, East, South, and West are). We will refer to this scenario as no agreement on the local coordinate systems. In the most favorable scenario, the robots agree on the direction and orientation of both axes. In this case, we will talk of total agreement on the local coordinate systems. Note that knowledge of the directions and orientations of both axes does not imply knowledge of the origin or the unit of length. An intermediate scenario is when the robots agree only on the direction and orientation of one axis; we will talk of partial agreement.
- Memory. The robots can access local memory to store different amounts of information regarding the positions in the plane of their fellows. In particular, if the robots can only store the robots' positions retrieved in the current observation, we have oblivious robots. In contrast, if the robots can store all the positions retrieved since the beginning of the computation, we have unbounded memory robots. We will also refer to the algorithm the robots execute as oblivious or non-oblivious, depending on the assumption made.

Note that the conditions under which the robots operate are by definition common knowledge among the robots. Let us stress that the only means for the robots to coordinate is the observation of the others' positions and of their change through time. For oblivious robots, even tracking such changes is impossible, since there is no memory of previous positions.
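To make the agreement levels on coordinate systems more concrete, the following Python sketch expresses a global position in one robot's private frame. It is an illustration under assumptions, not part of the model's definition: the function names `to_local` and `local_view` and the parameters `angle`, `unit` and `mirrored` are hypothetical. With no agreement, every robot may have its own values for all three parameters. With total agreement, all robots share `angle` and `mirrored`, while origin and unit may still differ. One possible reading of partial agreement is a shared direction and orientation for one axis with `mirrored` left free.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def to_local(p: Point, origin: Point, angle: float = 0.0,
             unit: float = 1.0, mirrored: bool = False) -> Point:
    """Express global point `p` in a robot's private coordinate system.

    origin   -- the robot's own position (its local origin)
    angle    -- counter-clockwise rotation of its x axis, in radians
    unit     -- its unit of length, measured in global units
    mirrored -- True if its chirality is opposite to the global one
    """
    dx, dy = p[0] - origin[0], p[1] - origin[1]
    # Rotate by -angle so that the local x axis becomes the reference axis.
    x = dx * math.cos(angle) + dy * math.sin(angle)
    y = -dx * math.sin(angle) + dy * math.cos(angle)
    if mirrored:
        y = -y  # flip the orientation of the y axis
    return (x / unit, y / unit)


def local_view(index: int, positions: List[Point], **frame) -> List[Point]:
    """Snapshot of the other robots as seen by robot `index` in its own frame."""
    origin = positions[index]
    return [to_local(p, origin, **frame)
            for i, p in enumerate(positions) if i != index]
```

Whatever frame a robot uses, ratios of distances between observed robots are unchanged, which is why quantities that are invariant under translation, rotation, scaling and reflection are the natural building blocks for algorithms in this model.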

3 Problems and Limitations

In the following, we survey the computational results obtained so far. They are mostly about geometric problems, like forming a certain pattern, following a trail, or deploying the robots in order to guarantee optimal coverage of a certain terrain. Observe that several classical problems in distributed computing (e.g., leader election) can be reformulated as geometric problems in our model (e.g., forming an asymmetric pattern).

3.1 Pattern formation

The PATTERN FORMATION problem is one of the most important coordination problems and has been extensively investigated in the literature (e.g., see [8, 38, 39, 41]). The problem is practically important because, if the robots can form a given pattern, they can agree on their respective roles in a subsequent, coordinated action. The geometric pattern to be formed is a set of points (given by their Cartesian coordinates) in the plane, and it is initially known by all the robots in the system. The robots are said to form the pattern if, at the end of the computation, the positions of the robots coincide, in everybody's local view, with the points of the pattern. The formed pattern may be translated, rotated, scaled, and flipped into its mirror position with respect to the initial pattern. Initially the robots are in arbitrary positions, with the only requirements that no two robots are in the same position and that, of course, the number of points prescribed in the pattern and the number of robots are the same. The basic research questions are which patterns can be formed, and how they can be formed. Many proposed procedures do not terminate and never form the desired pattern: the robots just converge towards it; such procedures are said to converge.

Arbitrary Pattern

In this section, we review the results on the formation of an arbitrary pattern. The problem has been investigated by Flocchini et al. [21, 23] and Oasa et al. [33] in the general setting, and by Suzuki and Yamashita [39] in the semi-synchronous setting; both investigations consider robots with unlimited visibility.

In the general setting with unlimited visibility:

- With total agreement, oblivious robots can form any arbitrary given pattern [21].
- With partial agreement, oblivious robots can form any arbitrary given pattern if n is odd. If n is even, oblivious robots can form only symmetric patterns that have at least one axis of symmetry not passing through any vertex of the pattern [23].
- With no agreement at all, oblivious robots cannot form an arbitrary given pattern [21].

In the semi-synchronous setting with unlimited visibility, let m be the size of the largest subset of robots having an equivalent initial view.

- Robots with unbounded memory can form [39]
  1. any pattern, if m = 1;
  2. only patterns whose vertices can be partitioned into n/m regular m-gons all having the same center, if m ≥ 2.


G. Prencipe and N. Santoro

Circle Formation

In the CIRCLE FORMATION problem, the robots want to place themselves on the plane to form a non-degenerate circle (i.e., with finite radius greater than zero). First observe that, if the diameter of the circle is not fixed a priori, the problem can be solved in a rather straightforward way by oblivious robots even in the general setting: each robot computes the smallest circle enclosing all the robots' positions and moves onto the circumference of that circle. The problem becomes more difficult when the diameter is prescribed. This problem was first studied by Sugihara and Suzuki [38], who presented a heuristic distributed protocol, executed independently by all the robots, that lets the robots form an approximation of a circle of given diameter D. Experiments have shown that the robots sometimes end up in a configuration similar to a Reuleaux triangle rather than a circle (see Figure 1). Subsequently, the protocol was improved by Tanaka [40], who proposed a new solution that produces a better approximation of the circle.
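The unprescribed-diameter strategy mentioned at the start of this subsection can be illustrated with a minimal Python sketch. It is not taken from the cited works: the function names are hypothetical, the smallest enclosing circle is found by brute force over pairs and triples of positions, and the choice of moving each robot radially outward (with an arbitrary direction for a robot sitting exactly at the center) is an assumption made only for illustration.

```python
import itertools
import math

EPS = 1e-9

def circle_from_two(p, q):
    # Circle having segment pq as a diameter.
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2, math.dist(p, q) / 2)

def circle_from_three(p, q, r):
    # Circumcircle of three points; None if they are (nearly) collinear.
    (ax, ay), (bx, by), (cx, cy) = p, q, r
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < EPS:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy, math.dist((ux, uy), p))

def contains_all(circle, points):
    cx, cy, radius = circle
    return all(math.dist((cx, cy), p) <= radius + EPS for p in points)

def smallest_enclosing_circle(points):
    # Brute force: the optimum is determined by two or three of the points.
    candidates = [circle_from_two(p, q)
                  for p, q in itertools.combinations(points, 2)]
    for triple in itertools.combinations(points, 3):
        circle = circle_from_three(*triple)
        if circle is not None:
            candidates.append(circle)
    valid = [c for c in candidates if contains_all(c, points)]
    return min(valid, key=lambda c: c[2])

def destination(robot, points):
    # Assumption: the robot moves radially outward onto the circumference.
    cx, cy, radius = smallest_enclosing_circle(points)
    dx, dy = robot[0] - cx, robot[1] - cy
    norm = math.hypot(dx, dy)
    if norm < EPS:
        return (cx + radius, cy)  # arbitrary direction for a robot at the center
    return (cx + radius * dx / norm, cy + radius * dy / norm)
```

Because the smallest enclosing circle does not depend on the coordinate system used to observe the positions, every robot computes the same circle from its own local view, which is what makes this strategy workable without agreement.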

Fig. 1. Reuleaux's triangle. It is obtained by drawing arcs arc(a,b), arc(b,c), and arc(c,a), with radii equal to D, from the vertices c, a, and b, respectively, of an equilateral triangle △(a,b,c) with sides equal to D.

A variant of this problem is the UNIFORM CIRCLE FORMATION problem: the n robots on the plane must be arranged at regular intervals on the boundary of a circle. Notice that this is the same as the problem of forming a regular n-gon. This problem has been studied in the semi-synchronous setting by Defago and Konagaya [16]; simulation results of these studies have been presented in [37]. The solution in [16] is, however, computationally expensive: it involves the use of Voronoi diagrams, which are needed to avoid the very specific possibility in which at least two robots share the same position at some time and also have total agreement. Based on this observation, a new algorithm that avoids these expensive calculations is presented in [7].

Distributed Algorithms for Autonomous Mobile Robots


- In the semi-synchronous setting with unlimited visibility: oblivious robots can converge towards an n-gon [16, 37, 7].

Line Formation

Let us now consider another simple pattern for the robots: a line. That is, the robots are required to place themselves on a line whose position is not prescribed in advance; this is the LINE FORMATION problem. Note that, if n = 2, a line is always formed. Despite the simplicity of its formulation, this problem has some subtleties that make its solution less than easy. In fact, the solvability of this problem heavily depends on the amount of agreement the robots have on their local coordinate systems. Clearly, if the robots can rely on total agreement, then the problem is easily solved: after lexicographically ordering the robots' positions (e.g., left-right, top-down), the first and the last robot in the ordering define the line to be formed. Then, all robots move sequentially (in order to avoid collisions) to this line (see Figure 2.a). If the robots have partial agreement, for instance on the direction and orientation of y, the robots cannot rely on a unique total ordering of the robots' positions. In this case the robots can place themselves on the axis that is median between the two vertical axes tangent to the observed configuration (see Figure 2.b). The robots on the tangent axes are the last to move.
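The total-agreement case described above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the ordering is Python's default tuple ordering (x first, then y, both ascending), and sending each robot to its orthogonal projection on the line is a choice made here, since the text only requires that robots reach the line one at a time.

```python
def line_formation_targets(points):
    """Return (ordered robots, target points on the line) under total agreement."""
    # Lexicographic order on (x, y); all robots compute the same order
    # because they share the same coordinate system.
    ordered = sorted(points)
    a, b = ordered[0], ordered[-1]          # first and last robot define the line
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy           # > 0 because positions are distinct
    targets = []
    for p in ordered:
        # Orthogonal projection of p onto the line through a and b (assumption).
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
        targets.append((a[0] + t * dx, a[1] + t * dy))
    return ordered, targets
```

The sketch computes destinations only. Scheduling the moves sequentially, and resolving the case where two robots project onto the same point, are left out.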


Fig. 2. Line formation with (a) total and (b) partial agreement.

In a recent study [15], the LINE FORMATION problem has been tackled by studying an apparently unrelated problem: spreading. In this problem, the robots, which are initially placed arbitrarily on the plane, are required to spread evenly within the perimeter of a given region. In their work, the authors focus on the one-dimensional case: here the robots have to form a line and place themselves uniformly on it. A very interesting aspect of the study is that [15] addresses the issue of local algorithms: each robot decides where to move based on the positions of its close neighbors. In particular, in the case of the line, the protocol, called Spread, is quite simple: each robot r observes its left and right neighbor. If r does not see any robot, it simply does not move; otherwise, it moves to the median point between its two neighbors. The authors prove its convergence in the semi-synchronous setting.
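A minimal simulation of the one-dimensional Spread rule is sketched below. It is not the authors' code: the function names are hypothetical, all robots are activated in every round (a simplification of the semi-synchronous setting), and the two extremal robots, which lack a neighbor on one side, are assumed to stay put.

```python
def spread_round(positions):
    """One round in which every interior robot moves to the midpoint of its neighbors."""
    xs = sorted(positions)
    new_xs = xs[:]
    for i in range(1, len(xs) - 1):
        # The destination depends only on the two closest neighbors (local rule).
        new_xs[i] = (xs[i - 1] + xs[i + 1]) / 2
    return new_xs

def spread(positions, rounds):
    """Apply the Spread rule for a fixed number of rounds."""
    xs = sorted(positions)
    for _ in range(rounds):
        xs = spread_round(xs)
    return xs
```

For instance, starting from positions 0, 1, 2, 10 and iterating, the gaps between consecutive robots approach a common value while the two end robots remain where they started.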

Semi-synchronous: Multiplicity Detection [39]; Infinite Time [2, 13, 14]
Asynchronous: Multiplicity Detection [10]; Compass [22]; Unbounded Memory [9]; Infinite Time [12]

Table 1. Summary of additional assumptions made by the existing solutions for the GATHERING problem.

- In the semi-synchronous setting, the robots executing Spread converge to a line configuration with equal distances. Furthermore, if each robot knows the exact number of robots at each of its sides, it is possible to achieve the spreading in one dimension in a finite number of cycles.
- In the fully-synchronous model, n robots can spread in one dimension in n - 2 cycles.

3.2 Gathering

In the GATHERING problem, the robots, initially placed in arbitrary positions, are required to gather in a single point. This problem is also called point formation, homing, or rendezvous. In spite of its apparent simplicity, it has recently been tackled by several studies: in fact, several factors render this problem difficult to solve, as shown by the following result.
- In both the asynchronous and the semi-synchronous setting, there exists no deterministic oblivious algorithm that solves the GATHERING problem in a finite number of cycles, hence in finite time, for a set of n > 2 robots [35].

Some additional capabilities are thus needed to solve this problem (in Table 1 we report the existing results related to the GATHERING problem). Let us first consider the case of unlimited visibility.
- In the semi-synchronous setting, n > 3 oblivious robots with multiplicity detection can gather in finite time [39].

This result has been recently improved; in fact, the same result can be achieved even in the general setting, extending the previous work of [11]:
- In the general setting, n > 3 oblivious robots with multiplicity detection can gather in finite time [10].


The multiplicity detection assumption is crucial to prove the correctness of these algorithms. In fact, the main idea is first to create a unique point p on the plane with two robots on it, and then to move all other robots to this point, taking care that no other point acquires multiplicity greater than one while the robots move towards p. In contrast, multiplicity detection is not used in the solution described in [9]; however, it is assumed that the robots can rely on an unlimited amount of memory: the robots are said to be non-oblivious. In other words, the robots have the capability to store the results of all computations since the beginning, and can freely access these data and use them for future computations.
- In the general setting, n > 3 robots with unbounded memory can gather in finite time [9].

Another study [13] has been devoted to the behavior of a particularly simple solution to the problem: the robots use the center of gravity as gathering destination. The authors prove that this simple algorithm is a convergence solution to the problem in the semi-synchronous setting. In [12] the same algorithm has been proven to be a convergence solution to the problem in the asynchronous setting.

Let us then consider the case of limited visibility. With limited visibility, an obvious necessary condition to solve the problem is that at the beginning of the computation the visibility graph (having the robots as nodes and an edge (r_i, r_j) if r_i and r_j are within viewing distance) is connected [2, 22]. In [2] the proposed protocol works in the semi-synchronous setting; however, it is a convergence solution to the problem: the robots do not gather in finite time. In fact, the authors design a protocol that guarantees only that the robots converge towards the gathering point. In contrast, in [22], the authors present an algorithm that lets the robots gather in a finite number of cycles. However, the robots can rely on the presence of a common coordinate system: that is, they share a compass.
- In the semi-synchronous setting there exists an oblivious procedure that lets robots converge towards (but not necessarily reach) a point for any n [2].
- In the general setting oblivious robots with agreement on the coordinate system (e.g., with a compass) can gather in finite time [22, 24].

The GATHERING problem has also been investigated in the context of robot failures. In this context, the goal is for the non-faulty robots to gather regardless of the actions taken by the faulty ones. Two types of robot faults were investigated by Agmon and Peleg [1]: the crash failure, in which the robot stops any activity and will no longer execute any computational cycle; and the byzantine failure, in which the robot acts arbitrarily and possibly maliciously.
- In the semi-synchronous setting, gathering with at most one crash failure is possible [1].
- In the semi-synchronous setting, gathering with at most one byzantine failure is impossible [1].


- In the fully synchronous setting, gathering with at most ^ ^ byzantine failure is possible [1].

Finally, [14] analyzes the case of systems where the robots have inaccuracies in sensing the positions of other robots, in computing the next destination point, and in moving towards the computed destination. The authors provide a set of limitations on the amount of inaccuracy that still allows convergence, and they present an algorithm for convergence under bounded measurement, movement and calculation errors.

3.3 Following, flocking and capture

In these problems there are two kinds of robots in the environment: the leader L, and the followers. The leader acts independently from the others, and we can assume that it is driven by a human pilot. The followers are required to follow the leader wherever it goes (following), while keeping a formation they are given in input (flocking). In this context, a formation is simply a pattern described as a set of points in the plane, and all the robots have the same formation in input (see Figure 3). In [26], an algorithm solving this problem has been tested by using computer simulation; the algorithm assumes no agreement. All the experiments demonstrated that the algorithm is well behaved, and in all cases the followers were able to assume the desired formation and to maintain it while following the leader along its route. Moreover, the obliviousness of the algorithm contributes to this result, since the followers do not base their computation on the leader's past positions.

Finally, if the leader is considered an "enemy" or "intruder", and the pattern surrounds it, the problem is known as capture. Also in this case, a procedure that assumes no agreement and solves the problem has been presented in [27]. The proposed algorithm exhibits remarkable robustness, and numeric simulations indicate that the intruder is efficiently captured in a relatively short time and kept surrounded after that, as desired. Furthermore, the solution is self-stabilizing [17, 18]. In particular, any external intervention (e.g., if one or more of the cops are stopped, slowed down, knocked out, or simply faulty) does not prevent the completion of the task.
- In the general setting there is a procedure for the flocking problem [26].
- In the general setting there is a procedure for the intruder problem [27].

Fig. 3. Trace of the vehicles while forming and keeping a wedge-shaped formation.

4 Conclusion and Discussion

In this paper, we surveyed a number of recent results on the interplay between robots' capabilities and solvability of problems. The goal of these studies is to gain a better understanding of the power of distributed control from an algorithmic point of view. The area offers many open problems. The operating capabilities of our robots are quite limited. It would be interesting to look at models where the robots have more complex capabilities, e.g.: the robots have some kind of direct communication capabilities; the robots are distinct and externally identifiable; etc. Little is known about the solvability of other problems like spreading and exploration (used to build maps of unknown terrains), about the physical aspects of the models (giving physical dimension to the robots, bumping, energy saving issues, etc.), and about the relationships between geometric problems and classical distributed computations. In the area of reliability and fault-tolerance, slightly faulty snapshots, a limited range of visibility, obstacles that limit the visibility and that moving robots must avoid or push aside, as well as robots that appear and disappear from the scene, are all topics that have not yet been studied. We believe that investigations in these areas will provide useful insights on the ability of weak robots to solve complex tasks.

Acknowledgements

The authors would like to thank Paola Flocchini and Peter Widmayer for their help and suggestions in the preparation of this survey. This research is supported in part by the Natural Sciences and Engineering Research Council of Canada.

References

1. N. Agmon and D. Peleg. Fault-tolerant Gathering Algorithms for Autonomous Mobile Robots. In Proc. of the 15th ACM-SIAM Symposium on Discrete Algorithms, pages 1070-1078, 2004.
2. H. Ando, Y. Oasa, I. Suzuki, and M. Yamashita. A Distributed Memoryless Point Convergence Algorithm for Mobile Robots with Limited Visibility. IEEE Trans. on Robotics and Automation, 15(5):818-828, 1999.
3. T. Balch and R. C. Arkin. Behavior-based Formation Control for Multi-robot Teams. IEEE Trans. on Robotics and Automation, 14(6), December 1998.
4. R. Beckers, O. E. Holland, and J. L. Deneubourg. From Local Actions To Global Tasks: Stigmergy And Collective Robotics. In Art. Life IV, 4th Int. Worksh. on the Synth. and Simul. of Living Sys., 1994.
5. G. Beni and S. Hackwood. Coherent Swarm Motion Under Distributed Control. In Proc. DARS'92, pages 39-52, 1992.
6. Y. U. Cao, A. S. Fukunaga, A. B. Kahng, and F. Meng. Cooperative Mobile Robotics: Antecedents and Directions. In Int. Conf. on Intel. Robots and Sys., pages 226-234, 1995.
7. I. Chatzigiannakis, M. Markou, and S. Nikoletseas. Distributed Circle Formation for Anonymous Oblivious Robots. In Experimental and Efficient Algorithms: Third International Workshop (WEA 2004), volume LNCS 3059, pages 159-174, 2004.
8. Q. Chen and J. Y. S. Luh. Coordination and Control of a Group of Small Mobile Robots. In Proc. IEEE Int. Conf. on Rob. and Aut., pages 2315-2320, 1994.
9. M. Cieliebak. Gathering Non-Oblivious Mobile Robots. In Proc. 6th Latin American Symposium on Theoretical Informatics, pages 577-588, 2004.
10. M. Cieliebak, P. Flocchini, G. Prencipe, and N. Santoro. Solving the Robots Gathering Problem. In Proc. 30th International Colloquium on Automata, Languages and Programming, pages 1181-1196, 2003.
11. M. Cieliebak and G. Prencipe. Gathering Autonomous Mobile Robots. In Proc. 9th Int. Colloquium on Structural Information and Communication Complexity, June 2002.
12. R. Cohen and D. Peleg. Convergence Properties of the Gravitational Algorithm in Asynchronous Robot Systems. In Proc. of the 12th European Symposium on Algorithms, pages 228-239, 2004.
13. R. Cohen and D. Peleg. Robot Convergence via Center-of-Gravity Algorithms. In Proc. of the 11th Int. Colloquium on Structural Information and Communication Complexity, pages 79-88, 2004.
14. R. Cohen and D. Peleg. Convergence of Autonomous Mobile Robots with Inaccurate Sensors and Movements. In Proc. 23rd Annual Symposium on Theoretical Aspects of Computer Science (STACS '06), pages 549-560, 2006.
15. R. Cohen and D. Peleg. Local Algorithms for Autonomous Robots Systems. In Proc. of the 13th Colloquium on Structural Information and Communication Complexity, 2006. To appear.
16. X. Defago and A. Konagaya. Circle Formation for Oblivious Anonymous Mobile Robots with No Common Sense of Orientation. In Workshop on Principles of Mobile Computing, pages 97-104, 2002.
17. E. W. Dijkstra. Self-stabilizing Systems in Spite of Distributed Control. Comm. of the ACM, 17(11):643-644, 1974.
18. S. Dolev. Self-stabilization. The MIT Press, 2000.
19. B. R. Donald, J. Jennings, and D. Rus. Information Invariants for Distributed Manipulation. The Int. Journal of Robotics Research, 16(5), October 1997.
20. E. H. Durfee. Blissful Ignorance: Knowing Just Enough to Coordinate Well. In ICMAS, pages 406-413, 1995.
21. P. Flocchini, G. Prencipe, N. Santoro, and P. Widmayer. Hard Tasks for Weak Robots: The Role of Common Knowledge in Pattern Formation by Autonomous Mobile Robots. In Proc. 10th International Symposium on Algorithm and Computation, pages 93-102, 1999.
22. P. Flocchini, G. Prencipe, N. Santoro, and P. Widmayer. Gathering of Asynchronous Mobile Robots with Limited Visibility. In Proceedings 18th International Symposium on Theoretical Aspects of Computer Science, volume LNCS 2010, pages 247-258, 2001.
23. P. Flocchini, G. Prencipe, N. Santoro, and P. Widmayer. Pattern Formation by Autonomous Robots Without Chirality. In Proc. 8th Int. Colloquium on Structural Information and Communication Complexity, pages 147-162, June 2001.
24. P. Flocchini, G. Prencipe, N. Santoro, and P. Widmayer. Gathering of Asynchronous Robots with Limited Visibility. Theoretical Computer Science, 337:147-168, 2005.
25. T. Fukuda, Y. Kawauchi, H. Asama, and M. Buss. Structure Decision Method for Self Organizing Robots Based on Cell Structures-CEBOT. In Proc. IEEE Int. Conf. on Robotics and Autom., volume 2, pages 695-700, 1989.
26. V. Gervasi and G. Prencipe. Coordination Without Communication: The Case of The Flocking Problem. Discrete Applied Mathematics, 143:203-223, 2003.
27. V. Gervasi and G. Prencipe. Robotic cops: The intruder problem. In Proc. IEEE Conference on Systems, Man and Cybernetics, pages 2284-2289, 2003.
28. D. Jung, G. Cheng, and A. Zelinsky. Experiments in Realising Cooperation between Autonomous Mobile Robots. In ISER, 1997.
29. Y. Kawauchi, M. Inaba, and T. Fukuda. A Principle of Decision Making of Cellular Robotic System (CEBOT). In Proc. IEEE Conf. on Robotics and Autom., pages 833-838, 1993.
30. M. J. Mataric. Interaction and Intelligent Behavior. PhD thesis, MIT, May 1994.
31. S. Murata, H. Kurokawa, and S. Kokaji. Self-assembling Machine. In Proc. IEEE Conf. on Robotics and Autom., pages 441-448, 1994.
32. F. R. Noreils. Toward a Robot Architecture Integrating Cooperation between Mobile Robots: Application to Indoor Environment. The Int. J. of Robot. Res., pages 79-98, 1993.
33. Y. Oasa, I. Suzuki, and M. Yamashita. A Robust Distributed Convergence Algorithm for Autonomous Mobile Robots. In IEEE Int. Conf. on Systems, Man and Cybernetics, pages 287-292, October 1997.
34. L. E. Parker. On the Design of Behavior-Based Multi-Robot Teams. Journal of Advanced Robotics, 10(6), 1996.
35. G. Prencipe. On The Feasibility of Gathering by Autonomous Mobile Robots. In Proc. 12th Int. Colloquium on Structural Information and Communication Complexity, pages 246-261, 2005.
36. G. Prencipe. The Effect of Synchronicity on the Behavior of Autonomous Mobile Robots. Theory of Computing Systems, 38:539-558, 2005.
37. S. Samia, X. Defago, and T. Katayama. Convergence Of a Uniform Circle Formation Algorithm for Distributed Autonomous Mobile Robots. In Journées Scientifiques Francophones (JSF), Tokyo, Japan, 2004.
38. K. Sugihara and I. Suzuki. Distributed Algorithms for Formation of Geometric Patterns with Many Mobile Robots. Journal of Robotic Systems, 13:127-139, 1996.
39. I. Suzuki and M. Yamashita. Distributed Anonymous Mobile Robots: Formation of Geometric Patterns. SIAM J. Computing, 28(4):1347-1363, 1999.
40. O. Tanaka. Forming a Circle by Distributed Anonymous Mobile Robots. Technical report, Department of Electrical Engineering, Hiroshima University, Hiroshima, Japan, 1992.
41. P. K. C. Wang. Navigation Strategies for Multiple Autonomous Mobile Robots Moving in Formation. Journal of Robotic Systems, 8(2):177-195, 1991.

Part III

Contributed Papers

The Unsplittable Stable Marriage Problem

Brian C. Dean (1), Michel X. Goemans (2), and Nicole Immorlica (3)

(1) Department of Computer Science, Clemson University. bcdean@cs.clemson.edu
(2) Department of Mathematics, M.I.T. goemans@math.mit.edu
(3) Microsoft Research. nickle@microsoft.com

Abstract. The Gale-Shapley "propose/reject" algorithm is a well-known procedure for solving the classical stable marriage problem. In this paper we study this algorithm in the context of the many-to-many stable marriage problem, also known as the stable allocation or ordinal transportation problem. We present an integral variant of the Gale-Shapley algorithm that provides a direct analog, in the context of "ordinal" assignment problems, of a well-known bicriteria approximation algorithm of Shmoys and Tardos for scheduling on unrelated parallel machines with costs. If we are assigning, say, jobs to machines, our algorithm finds an unsplit (non-preemptive) stable assignment where every job is assigned at least as well as it could be in any fractional stable assignment, and where each machine is congested by at most the processing time of the largest job.

1 Introduction

In the United States, a medical school graduate is required to complete a residency program at a hospital before entering the workforce as a doctor. Since the 1950s, the medical field has turned to a centralized mechanism, called the National Residency Matching Program (NRMP), to aid this marketplace [10]. In this program, final-year medical students and hospitals each submit preferences over possible matches, and an algorithm determines which matches will take place. In order for the system to be successful, it is essential that the computed matches be stable. That is, there should be no (student, hospital) pair that both prefer each other to their assigned partners; such a pair would have an incentive to withdraw from the centralized matching system and to make its own plans on the side. Computing a stable matching is a classic problem in economics and computer science, and can be solved in polynomial time by the deferred acceptance algorithm of Gale and Shapley [3].¹

¹ For a discussion of this problem and related questions, see the books by Gusfield and Irving [4] and Roth and Sotomayor [14], or the lecture notes by Knuth [8].

Please use the following format when citing this chapter: Dean, B.C., Goemans, M.X., Immorlica, N., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 65-75.

For many years the NRMP proved to be quite successful. However, in the late 1990s it was observed that many matches were being formed outside the NRMP [12]. The problem stemmed from the fact that many medical students were getting married to one another during medical school, and so had complicated preferences that were ignored by the NRMP. In particular, married students had strong preferences for hospitals in similar geographical locations. The NRMP was redesigned to accommodate such preferences [13]; currently, the NRMP permits married students to submit a joint preference list over pairs of hospitals and guarantees that, if they are matched, they will be matched to a pair in their list. Unfortunately, in a matching market with couples like the NRMP, a stable matching might not exist [10], and determining whether one exists is computationally difficult, in fact NP-hard [9].

Motivated by the issue of couples in the NRMP, we study a marketplace in which agents on one side of the market have non-uniform demands and agents on the other side have non-uniform quotas, or capacities. Demanding agents have a preference list over capacitated agents and prefer to be satisfied by a lexicographically maximal set of these agents. This problem, introduced originally by Baiou and Balinski [1], is known as the stable allocation or ordinal transportation problem, and is a many-to-many generalization of the classical stable marriage problem. It surfaces naturally in scheduling or load balancing settings where only "ordinal" information (ranked preference lists) is known. When demands are all 1 or 2 and capacities are integral, as in the student/hospital setting, this restricted preference domain becomes a special case of weakly responsive preferences studied by Klaus and Klijn [6]. In such cases, Klaus and Klijn [6] proved that a stable matching always exists. Instances of this problem with generalized demands/capacities include the assignment of teaching assistants (TAs) to courses in academic departments: TAs rank courses, course instructors rank TAs, each course requires a certain number of TA hours, and different TAs are responsible for working different numbers of hours. Another example is the assignment of load to servers in a network: clients prefer servers geographically nearby and servers prefer clients with higher service types.
Baiou and Balinski [1] study these generalized settings and prove that even in this case a stable allocation always exists. For many settings, a stable allocation in which the demand of a single agent is satisfied fractionally is undesirable. Although a couple may prefer hospital a to b, and thus a pair of placements (a, b) to a pair of placements (b, b), such an arrangement imposes strain on the matching. As often happens in labor markets with two-body problems, the couple may negotiate with hospital a to create an extra position, beyond the quota, for the extra member of the couple. In some sense, a fractional stable assignment is not stable. Thus, we seek a stable matching in which all the demand of a single agent is satisfied integrally. Clearly, such a matching may not exist, and so we relax our feasibility constraints and allow capacitated agents to be over-capacitated by at most the maximum demand. With a correspondingly appropriate modification of the definition of stability, we prove that a stable matching always exists, and give a modification of the Gale-Shapley algorithm to find it. Applied to the NRMP setting, our results compute a student-optimal (or hospital-optimal) stable matching where the number of students assigned to each hospital exceeds its quota by at most one position.


A close relative of the stable allocation problem is the well-studied transportation problem, where there are linear costs associated with every possible pairing and our objective is to compute a fractional assignment of minimum cost rather than a stable assignment. The stable allocation problem is also known as the ordinal transportation problem since it differs only in that we express the desirability of an assignment in an "ordinal" fashion using ranked preference lists. Unsplittable variants of the transportation problem have been previously considered in the literature, and a celebrated result of Shmoys and Tardos [15] states that from a fractional assignment (where all agents are fully assigned), we can construct an unsplit assignment of no greater cost where each agent is over-capacitated (or congested) by at most the maximum demand. Our results can be viewed as a direct analog of this result for the ordinal case.

2 The Model

Consider assigning a set $[n] := \{1, 2, \ldots, n\}$ of items to a set $[m]$ of bins. To be somewhat more concrete, let us employ scheduling terminology and assume we are assigning "jobs" to "machines". Job $i$ requires $p_i$ units of processing time, machine $j$ has a capacity of $c_j$ units, and at most $u_{ij}$ units of job $i$ can be assigned to machine $j$. If $u_{ij} = p_i$ for all $(i,j)$, we follow the terminology of Baiou and Balinski [1] and say our problem is unconstrained. All problem data is assumed to be integral.

2.1 Fractional Assignment

We first define a fractional setting where a job may be processed on multiple machines. A fractional assignment $x$ is feasible if it satisfies

$$
\begin{aligned}
\sum_{j \in [m]} x_{ij} &\le p_i && \forall i \in [n], \\
\sum_{i \in [n]} x_{ij} &\le c_j && \forall j \in [m], \\
0 \le x_{ij} &\le u_{ij} && \forall (i,j) \in [n] \times [m].
\end{aligned}
\tag{1}
$$
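As a concrete reading of the feasibility conditions (1), the following minimal Python sketch checks them for a given assignment. The function name, the list-of-lists layout of x and u, and the two-job, two-machine data in the usage note are assumptions chosen for illustration; the job constraint is taken to be "assigned load at most p_i".

```python
def is_feasible(x, p, c, u):
    """Check the three families of constraints in (1).

    x[i][j]: amount of job i placed on machine j
    p[i]:    processing time of job i
    c[j]:    capacity of machine j
    u[i][j]: upper bound on x[i][j]
    """
    n, m = len(p), len(c)
    # Each job is assigned at most its processing time.
    if any(sum(x[i]) > p[i] for i in range(n)):
        return False
    # Each machine receives at most its capacity.
    if any(sum(x[i][j] for i in range(n)) > c[j] for j in range(m)):
        return False
    # Each entry respects its individual bounds.
    return all(0 <= x[i][j] <= u[i][j] for i in range(n) for j in range(m))
```

For example, with p = [1, 2], c = [2, 2] and the unconstrained bounds u[i][j] = p[i], the assignment [[1, 0], [1, 1]] passes the check, whereas [[1, 0], [2, 0]] fails because the first machine would receive 3 units.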

In the traditional transportation problem (a many-to-many generalization of the bipartite assignment problem), we designate a weight $w_{ij}$ for assigning one unit of job $i$ to machine $j$, then maximize $\sum_{i,j} w_{ij} x_{ij}$ over (1) using linear programming or network flow techniques (another popular objective is to minimize $\sum_{i,j} w_{ij} x_{ij}$ while insisting that all jobs must be fully assigned). In the stable allocation problem, however, we indicate the desirability of an assignment in an "ordinal" fashion by having each job (machine) submit a ranked preference list over all machines (jobs). Thus, each job $i \in [n]$ has a strict, transitive, and complete preference relation $\pi(i)$ over the set $[m] \cup \{\emptyset\}$, where $\emptyset$ indicates a preference for remaining unmatched. If $\pi(i) = (j_1, \ldots, j_{k-1}, \emptyset = j_k, j_{k+1}, \ldots, j_{m+1})$, then $i$ prefers $j_a$ to $j_b$ for any $a < b$, and would rather remain unmatched than be assigned to $j_a$ for any $a > k$. If job $i$ prefers machine $j$ to machine $j'$, we write $j >_i j'$. Job $i$ prefers a fractional assignment $x$ to another fractional assignment $x'$ if $x$ is lexicographically larger according to $\pi(i)$; that is, if $x_{ij} > x'_{ij}$ for the earliest machine $j$ in $\pi(i)$ such that $x_{ij} \neq x'_{ij}$. In this case, we write $x >_i x'$.

Similarly, each machine $j \in [m]$ has a strict, transitive, and complete preference relation $\pi(j)$ over the set $[n] \cup \{\emptyset\}$, where $\emptyset$ indicates a preference for being under-utilized. If $\pi(j) = (i_1, \ldots, i_{k-1}, \emptyset = i_k, i_{k+1}, \ldots, i_{n+1})$, then $j$ prefers to accept load from job $i_a$ rather than $i_b$ for any $a < b$, and would rather be under-utilized than accept load from $i_a$ for any $a > k$. We write $i >_j i'$ if machine $j$ prefers job $i$ to job $i'$, and we write $x >_j x'$ if machine $j$ prefers assignment $x$ to assignment $x'$; again, this means that $x_{ij} > x'_{ij}$ for the first job $i$ in $\pi(j)$ where $x_{ij} \neq x'_{ij}$.

A blocking pair is a familiar feature that is forbidden in any stable assignment: it is a pair $(i,j)$ where $x_{ij} < u_{ij}$ and both $i$ and $j$ prefer each other to at least some of their current assignments. In this case, job $i$ and machine $j$ would be "unhappy" with the current assignment and would prefer to increase $x_{ij}$. That is:

Definition 1. Job $i$ and machine $j$ form a blocking pair if there is some job $i'$ and machine $j'$ such that $x_{ij} < u_{ij}$, $x_{ij'} > 0$, $x_{i'j} > 0$, and we have $i >_j i'$ and $j >_i j'$.

A job $i$ is saturated if all its load is assigned. Similarly, a machine is saturated if all its capacity is utilized.

Definition 2. A job $i$ is saturated if $\sum_j x_{ij} \ge p_i$. A machine $j$ is saturated if $\sum_i x_{ij} \ge c_j$.
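Definitions 1 and 2 can be turned into a small checker, sketched below in Python. The function names and data layout are hypothetical. As simplifying assumptions, every machine is acceptable to every job and vice versa (so the unmatched option never needs to be ranked), preferences are given as rank tables where a smaller rank means more preferred, and saturation is read as "assigned load reaches the demand or capacity".

```python
def blocking_pairs(x, u, job_rank, mach_rank):
    """List all (i, j) that block x in the sense of Definition 1.

    job_rank[i][j]:  rank of machine j in job i's list (0 = favourite)
    mach_rank[j][i]: rank of job i in machine j's list (0 = favourite)
    """
    n, m = len(x), len(x[0])
    pairs = []
    for i in range(n):
        for j in range(m):
            if x[i][j] >= u[i][j]:
                continue  # x[i][j] cannot be increased
            # Job i currently uses some machine j2 it likes less than j.
            job_wants = any(x[i][j2] > 0 and job_rank[i][j] < job_rank[i][j2]
                            for j2 in range(m))
            # Machine j currently serves some job i2 it likes less than i.
            mach_wants = any(x[i2][j] > 0 and mach_rank[j][i] < mach_rank[j][i2]
                             for i2 in range(n))
            if job_wants and mach_wants:
                pairs.append((i, j))
    return pairs

def saturated_jobs(x, p):
    return [i for i in range(len(p)) if sum(x[i]) >= p[i]]

def saturated_machines(x, c):
    return [j for j in range(len(c)) if sum(row[j] for row in x) >= c[j]]
```

With two jobs of sizes 1 and 2, two machines of capacity 2, and both sides ranking index 0 first, the assignment [[0, 1], [2, 0]] is blocked by the pair (0, 0): job 0 would rather be on machine 0, and machine 0 would rather serve job 0 than job 1.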

Finally, ajob i is said to be popular in an assignment if there is some machine j to which i is not assigned, but where j prefers i to at least some of the jobs currently assigned to it. We define a popular machine similarly. Definition 3. In an assignment x, we say job i is popular if there exists a machine j with j >i 0 and Xij < Uij such that i >j i' for some job i' with Xi'j > 0. Likewise, we say machine j is popular if there exists a job i with i >j 0 and Xij < Uij such that j >i j ' for some machine j ' with Xij' > 0. If job i is popular due to machine j and i is not saturated, then our assignment is not stable since both i and j would be more satisfied if Xij were increased. Definition 4. An assignment x is stable if (i) it admits no blocking pairs, and (ii) all popular jobs and machines are saturated. A feasible stable assignment x is said to be job-optimal if every job prefers X to any other feasible stable assignment x', i.e. V z e [n], x>ix' (a machineoptimal assignment is defined analogously). In a job-optimal assignment, each

The Unsplittable Stable Marriage Problem

job simultaneously receives at least as much of an allocation of its first-choice machine as it could in any feasible stable assignment, and it also receives at least as much of an allocation of its second-choice machine as it could in any feasible stable assignment with the same first-choice allocation, and so on. It is always possible to find a job-optimal feasible stable assignment for any problem instance using a strongly polynomial algorithm of Baiou and Balinski [1].

2.2 Unsplit Assignment

We now consider the "unsplittable" unconstrained stable allocation problem where each job must be entirely assigned to a single machine. Thus the feasible assignments $x$ are precisely the integral solutions to (1) where either $x_{ij} = 0$ or $x_{ij} = p_i$ for all $(i, j)$. As the following simple example shows, an integral stable assignment may not exist.

Example 1. Suppose there are two jobs $i_1$ and $i_2$ with demands 1 and 2 respectively, and two machines $j_1$ and $j_2$, both with capacity 2. Let $\pi(i_1) = \pi(i_2) = (j_1, j_2)$ and $\pi(j_1) = \pi(j_2) = (i_1, i_2)$. Then the only stable assignment is $x_{i_1 j_1} = 1$, $x_{i_2 j_1} = 1$, and $x_{i_2 j_2} = 1$, but this is not an unsplit assignment.

We therefore consider a relaxation that is directly analogous to a result of Shmoys and Tardos [15] for the bipartite assignment problem with costs. Assuming existence of a feasible fractional assignment of cost $C$ with all jobs fully assigned, Shmoys and Tardos show how to round this solution in polynomial time to obtain an unsplit solution of cost no more than $C$ where each machine is congested (filled beyond its capacity) by at most $p_{\max} = \max_i p_i$. Similar results have been achieved in the literature on unsplittable flows (see [7, 2, 16] for more background), where the goal is generally to take a fractional solution to a network flow problem and round it to an unsplit flow (where the flow for each commodity follows a single path) without significantly raising the cost of the flow, and without causing excessive congestion on edges.
Definition 5. An assignment $x$ is minimally congested if for every machine $j$, removal of the least-preferred job (to $j$) currently assigned to $j$ results in $j$ being utilized at or below its capacity.

Note that in a minimally congested assignment, each machine is overcapacitated by at most $p_{\max}$. We show how a modified version of the GS algorithm can find, in polynomial time, a stable unsplit assignment that is job-optimal among all minimally congested stable unsplit assignments. Suppose $x$ is a job-optimal feasible stable fractional assignment. We prove that in a job-optimal unsplit assignment, each job is assigned to at least the best of its fractional assignments in $x$ (our analog of the condition that cost does not increase). Our unsplit assignment is stable in that (i) it admits no blocking pairs and (ii) all popular machines are saturated. Note that one must take some care

B. Dean, M. Goemans, and N. Immorlica

here with the definition of condition (ii). We define machine $j$ to be saturated with respect to its original capacity $c_j$, and not the inflated capacity $c_j + p_{\max}$ according to which our unsplit solution is feasible; i.e., machine $j$ is saturated if $\sum_i x_{ij} \ge c_j$. Otherwise, it might be impossible to satisfy (ii) by ensuring popular machines are saturated, for example if $c_j$ is odd but all $p_i$'s are even. This definition makes intuitive sense because a machine beyond its capacity will not want any new jobs assigned to it.

3 The Gale-Shapley Algorithm

Gale and Shapley [3] devised a simple, intuitive algorithm, now quite well known, for solving the classical "one-to-one" stable marriage problem. The algorithm is usually described in terms of men being assigned to women, although we continue to use job/machine terminology since it is less awkward once we advance to many-to-many matchings. The Gale-Shapley (GS) algorithm has each job $i$ issue "proposals" to machines in the order of $i$'s preference list. Each machine $j$ tentatively accepts the best proposal received so far. If machine $j$ is tentatively matched with job $i$ and receives a more favorable proposal, it tentatively accepts the new proposal and rejects $i$, which then continues to propose to machines further down on its preference list. Remarkably, it can be shown that regardless of the order in which jobs propose, the GS algorithm always terminates with a job-optimal and machine-pessimal stable matching: each job receives the most preferred partner it could receive in any stable matching, and each machine receives the least preferred partner it could receive in any stable matching. By symmetry, the reverse is true if the machines do the proposing.

Baiou and Balinski [1] mention that the GS algorithm can be generalized to solve the many-to-many stable allocation problem, although its running time in this case is only pseudo-polynomial. The generalized GS algorithm issues "aggregate" proposals: in each iteration a job $i$ that is not fully assigned issues a proposal to the next machine $j$ in its preference list and proposes all of its unassigned processing time (up to $u_{ij}$). Machine $j$ accepts only as much as allowed by its capacity, current allocation, and preference list, possibly rejecting (fractionally) some of the jobs already assigned to it if they are less preferred than job $i$. Whenever a job is "split" due to a fractional acceptance or rejection, it remains split into two "virtual jobs" for the remainder of the algorithm, each of which carries out an independent sequence of proposals. Just as with the classical unit stable matching problem, one can show that the order of proposals and rejections does not matter: we always obtain a job-optimal feasible stable assignment. A similarly defined algorithm with machine proposals always finds the machine-optimal assignment.

Theorem 1. For any order of proposals, the job-proposing GS algorithm computes the job-optimal fractional stable assignment.
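The classical one-to-one procedure described at the start of this section can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that the paper does not make: there are equally many jobs and machines, every preference list is complete, and no agent prefers to stay unmatched. The name `gale_shapley` and the list-based encoding are choices made for this sketch.

```python
def gale_shapley(job_prefs, machine_prefs):
    """Job-proposing Gale-Shapley for the one-to-one case.

    job_prefs[i]     : machines in decreasing order of job i's preference
    machine_prefs[j] : jobs in decreasing order of machine j's preference
    Returns match, where match[i] is the machine assigned to job i.
    """
    n = len(job_prefs)
    # rank[j][i] = position of job i in machine j's list (lower = better)
    rank = [{job: r for r, job in enumerate(prefs)} for prefs in machine_prefs]
    next_choice = [0] * n      # next list position each job will propose to
    holder = [None] * n        # job tentatively accepted by each machine
    free = list(range(n))      # jobs that currently hold no machine
    while free:
        i = free.pop()
        j = job_prefs[i][next_choice[i]]
        next_choice[i] += 1
        current = holder[j]
        if current is None:
            holder[j] = i                  # first proposal is accepted
        elif rank[j][i] < rank[j][current]:
            holder[j] = i                  # better proposal: swap
            free.append(current)
        else:
            free.append(i)                 # proposal rejected
    match = [None] * n
    for j, i in enumerate(holder):
        match[i] = j
    return match
```

For example, `gale_shapley([[0, 1, 2], [0, 2, 1], [1, 0, 2]], [[1, 0, 2], [0, 2, 1], [0, 1, 2]])` returns `[1, 0, 2]`. The free jobs are kept in a plain list because, as noted above, the order in which jobs propose does not affect the result.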

This theorem follows immediately from the fact that we can interpret the extended GS algorithm for the many-to-many stable allocation problem as nothing more than the standard "one-to-one" GS algorithm applied to an expanded instance where each job $i$ is replaced with $p_i$ unit-sized jobs (each with the same preference list) and each machine $j$ is replaced by $c_j$ unit-sized machines (each with the same preference list). The many-to-many algorithm is sped up by issuing proposals in batches, but it inherits from the one-to-one algorithm the property that the final solution must be job-optimal irrespective of the order of proposals. As an interesting remark, if problem data is irrational, then not only does this reduction to the one-to-one case fail, but it is also not known whether the GS algorithm terminates after a finite number of iterations. We comment on this issue further in the conclusion section.

4 Computing Unsplittable Stable Allocations

In this section we discuss our "ordinal" analog for the stable allocation problem of the result of Shmoys and Tardos for the minimum-cost bipartite assignment problem. Since the constraints $x_{ij} \le u_{ij}$ do not make sense for an unsplittable stable allocation problem, we henceforth assume we are dealing with an unconstrained stable allocation problem.

Let us modify the GS algorithm as follows. Jobs issue proposals in sequence according to their preference lists, and in each iteration an arbitrary unassigned job $i$ issues a proposal to the next machine $j$ on its preference list. In this case, however, all proposals and rejections are "integral" in that an entire job is either accepted or rejected. Machine $j$ accepts $i$'s proposal, but then proceeds to reject in sequence the least favored jobs assigned to it (possibly including $i$) until $j$ is over-congested by at most the processing time of a single job; that is, until rejecting the next job would leave the machine utilized strictly below $c_j$ units of load. Note that such an algorithm results in an assignment where each machine is congested by at most the maximum processing time of a job. If each machine stores its accepted jobs in a heap based on preference list ranking, this integral variant of the GS algorithm runs in $O(mn \log n)$ time.

We now prove some desirable properties of the algorithm. First we show that the assignment output by our algorithm is stable and job-optimal. The proof of the following theorem is similar to the traditional proof for the correctness and optimality of the one-to-one GS algorithm.

Theorem 2. The integral job-proposing GS algorithm computes the job-optimal stable unsplit assignment among all minimally congested unsplit stable assignments.

Proof. Let $x^*$ be the solution output by the GS algorithm.
Clearly, $x^*$ is an unsplit assignment that congests each machine by at most $p_{\max}$. Let $x^*(i)$ be the machine to which job $i$ is assigned in $x^*$ and $x^*(j)$ be the set of jobs assigned to machine $j$ in $x^*$ (i.e. $x^*(j) = \{i : x^*_{ij} > 0\}$). We also extend the preference notation such that for a set $S$, $S >_j i$ means $i' >_j i$ for all $i' \in S$ with $i' \neq i$.

We first show that $x^*$ is stable. Suppose not. First note that once a machine is saturated, it never again becomes unsaturated. Thus, every popular machine $j$ must be saturated, since if $j$ is popular due to $i$, then $i$ must have proposed to $j$ at some point and been rejected. This means that the instability in $x^*$ must be caused by a blocking pair. Let $(i, j)$ be a blocking pair. There are two cases. If $i$ never proposed to $j$, then, since jobs propose in decreasing order of their preference list, $x^*(i) >_i j$, which contradicts the assumption that $(i, j)$ is a blocking pair. On the other hand, if $i$ proposed to $j$ and was rejected, then $x^*(j) >_j i$ since machines only ever improve the set of jobs assigned to them.

We now show that $x^*$ is job-optimal. Suppose not, and let $i$ be the first job rejected by one of its stable machines (i.e. a machine assigned to $i$ in some minimally congested stable unsplit assignment), and let $j$ be the first stable machine to reject $i$. Call the minimally congested unsplit stable assignment in which $i$ and $j$ are matched $x$. When $j$ rejected $i$, in the current tentative assignment $x'$, $x'(j) >_j i$ and $\sum_{i' \in x'(j)} p_{i'} \ge c_j$. We also know that there must be some $i' \in x'(j) \setminus x(j)$; if this were not the case and $x'(j) \subseteq x(j)$, then $x$ could not have been minimally congested (removal of job $i$ and all other jobs $j$ prefers less than $i$ would still leave machine $j$ saturated). Since $i'$ has not yet been rejected by a stable machine, and since jobs propose in decreasing order of their preference list, $j >_{i'} x(i')$. But then $(i', j)$ form a blocking pair in $x$, and so $j$ could not have been a stable machine for $i$.

We now observe that this solution computed by the integral variant of the GS algorithm assigns each job to at least the best of its fractional assignments in the job-optimal fractional assignment.
Thus, the jobs weakly prefer the solution output by the integral variant to the solution output by the fractional variant, i.e. the solution is both integral and lexicographically larger. Our proof uses the fact that the order of proposals does not affect the outcome of the GS algorithm. Thus, we can run the fractional variant of the GS algorithm using the order of proposals induced by the integral variant. During this process, we observe that jobs are assigned to the same machines in both variants. However, the fractional variant may have additional proposals to make after the integral variant completes. As jobs always propose to machines in decreasing order of their preference list, and as the fractional (integral) variant computes the job-optimal fractional (unsplit) stable solution, this coupling of the two algorithms shows that the unsplit solution must be preferred to the fractional solution by each job. Let $x(i)$ be the set of machines to which $i$ is partially assigned in assignment $x$, i.e. $x(i) = \{j : x_{ij} > 0\}$.

Theorem 3. Consider any feasible fractional stable assignment $x_{\mathrm{frac}}$ and the job-optimal minimally congested unsplit stable assignment $x_{\mathrm{int}}$. Then for all jobs $i$, $x_{\mathrm{int}}(i) >_i x_{\mathrm{frac}}(i)$.

Proof. The proof follows from Theorem 1 and the fact that jobs propose in decreasing order of their preference list (and so as the algorithm runs the jobs' situations worsen). More formally, consider the sequence of proposals defined by the integral GS algorithm. Call this sequence $(i_1, i_2, \ldots, i_l)$ (note this list includes repetitions and $l$ may be greater than $n$). Run the fractional GS algorithm with the same order of proposals. We prove by induction that after the proposal of $i_k$, the current assignment $x$ in the integral variant and $x'$ in the fractional variant satisfy $x(j) = x'(j)$ for all $j$, and a machine is saturated in $x$ if and only if it is saturated in $x'$. This is clearly true after the proposal of $i_1$. Assume this is the case after the proposal of $i_{k-1}$ and let $j$ be the machine to which $i_k$ proposes. By the inductive assumption, $j$ must be the same machine in both the integral and fractional variants of the algorithm. If $j$ rejects $i_k$ in the integral variant, then it must be that $x(j) >_j i_k$ and $\sum_{i \in x(j)} p_i \ge c_j$. Thus, in the fractional variant, $\sum_{i \in x'(j)} x'_{ij} = c_j$ and $x'(j) >_j i_k$, so all of $i_k$'s load is rejected. A similar argument holds if $j$ rejects $i_k$ in the fractional variant, and so the inductive hypothesis holds. Therefore, after the $l$th proposal in the integral variant, the final solution $x_{\mathrm{int}}$ of the integral variant is at least as preferable as the current solution $x'$ of the fractional variant for each job. Furthermore, as jobs propose in decreasing order of their preference list, the final solution $x_{\mathrm{frac}}$ of the fractional variant cannot be preferred to the current solution $x'$ by any job. This completes the proof.

We remark that all the theorems in this paper hold if we instead seek the machine-optimal solution. We merely need to run the Gale-Shapley algorithm with machine proposals: a machine proposes to the next job on its preference list if it is currently under-utilized (its load is currently less than its capacity).
A job (fractionally) accepts a proposal if it is (fractionally) unassigned or if it prefers the proposing machine to (some of) its current machine(s), in which case it rejects (some of) its current machine(s).
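The integral job-proposing variant analysed in this section can be sketched as follows. This is a minimal Python illustration, not the authors' implementation. It assumes that every machine ranks every job that may propose to it (the outside option $\emptyset$ is not modelled) and that a job whose list is exhausted simply stays unassigned. The name `integral_gale_shapley` and the data layout are hypothetical.

```python
import heapq
from collections import deque


def integral_gale_shapley(sizes, capacities, job_prefs, machine_prefs):
    """Integral (unsplit) job-proposing GS variant.

    sizes[i]         : processing time p_i of job i
    capacities[j]    : capacity c_j of machine j
    job_prefs[i]     : machines in decreasing order of job i's preference
    machine_prefs[j] : jobs in decreasing order of machine j's preference
    Returns assigned, where assigned[i] is job i's machine or None.
    """
    rank = [{job: r for r, job in enumerate(prefs)} for prefs in machine_prefs]
    next_choice = [0] * len(sizes)
    assigned = [None] * len(sizes)
    load = [0] * len(capacities)
    # One heap per machine; the least preferred accepted job is on top.
    heaps = [[] for _ in capacities]
    pending = deque(range(len(sizes)))
    while pending:
        i = pending.popleft()
        if next_choice[i] == len(job_prefs[i]):
            continue  # preference list exhausted: job stays unassigned
        j = job_prefs[i][next_choice[i]]
        next_choice[i] += 1
        # Machine j accepts the whole job ...
        heapq.heappush(heaps[j], (-rank[j][i], i))
        load[j] += sizes[i]
        assigned[i] = j
        # ... then rejects least favored jobs while it stays at or above c_j.
        while heaps[j] and load[j] - sizes[heaps[j][0][1]] >= capacities[j]:
            _, worst = heapq.heappop(heaps[j])
            load[j] -= sizes[worst]
            assigned[worst] = None
            pending.append(worst)
    return assigned
```

On the instance of Example 1 (sizes 1 and 2, both capacities 2, identical preferences) the sketch places both jobs on the first machine, which is then congested by 1, within the bound $p_{\max} = 2$.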

5 Conclusion

In this paper, we studied a natural integral variant of the stable allocation problem in which every job was unsplittably assigned and every machine was not excessively congested. Our results have implications for many economic settings where agents of varying size must be matched to each other. Our work leaves open a number of interesting questions:

Rural hospitals: It is well known that in one-to-one matching, the set of singles remains the same in every stable matching. Roth [11] extended this theorem and showed that in one-to-many matching, an agent not fully utilized in a stable

matching always receives the exact same assignment in every matching.^ It seems likely that similar statements might hold in a many-to-many matching as well. It would be interesting to learn whether the same machines are congested in every stable unsplit matching, and if so whether these machines are congested by the same amount in every stable unsplit matching, and/or that the uncongested machines have the same assignment in every stable unsplit matching.

Incentives: Centralized matching algorithms like the one proposed in this paper are often used in economic settings where agents are self-interested and might alter their submitted preference list in order to improve their match. It is known that no stable mechanism can be incentive-compatible for both jobs and machines. In a job-optimal mechanism, for example, machines have an incentive to lie. However, Immorlica and Mahdian [5] showed that, in a one-to-many matching, if preference lists of jobs are short and preferences are drawn randomly according to a particular class of distributions, then each agent has a unique stable partner with high probability, and thus has no incentive to lie. It would be interesting to prove a similar statement in the many-to-many setting studied here.

^ This is known as the rural hospital theorem as it explains why rural hospitals, typically unpopular among students in the NRMP, always receive the same assignment in every stable matching.

References

1. M. Baiou and M. Balinski. Erratum: The stable allocation (or ordinal transportation) problem. Mathematics of Operations Research, 27(4):662-680, 2002.
2. Y. Dinitz, N. Garg, and M.X. Goemans. On the single-source unsplittable flow problem. Combinatorica, 19:17-41, 1999.
3. D. Gale and L.S. Shapley. College admissions and the stability of marriage. American Mathematical Monthly, 69(1):9-14, 1962.
4. D. Gusfield and R. Irving. The Stable Marriage Problem: Structure and Algorithms. MIT Press, 1989.
5. N. Immorlica and M. Mahdian. Marriage, honesty, and stability. In Proceedings of 16th ACM Symposium on Discrete Algorithms, pages 53-62, 2005.
6. B. Klaus and F. Klijn. Stable matchings and preferences of couples. Journal of Economic Theory, 121:75-106, 2005.
7. J.M. Kleinberg. Approximation algorithms for disjoint paths problems. PhD thesis, M.I.T., 1996.
8. D.E. Knuth. Stable marriage and its relation to other combinatorial problems. In CRM Proceedings and Lecture Notes, vol. 10, American Mathematical Society, Providence, RI. (English translation of Marriages Stables, Les Presses de L'Universite de Montreal, 1976), 1997.
9. E. Ronn. NP-complete stable matching problems. Journal of Algorithms, 11:285-304, 1990.
10. A.E. Roth. The evolution of the labor market for medical interns and residents: a case study in game theory. Journal of Political Economy, 92:991-1016, 1984.
11. A.E. Roth. On the allocation of residents to rural hospitals: a general property of two-sided matching markets. Econometrica, 54:425-427, 1986.
12. A.E. Roth. The national residency matching program as a labor market. Journal of the American Medical Association, 275(13):1054-1056, 1996.
13. A.E. Roth and E. Peranson. The redesign of the matching market for American physicians: Some engineering aspects of economic design. American Economic Review, 89:748-780, 1999.
14. A.E. Roth and M. Sotomayor. Two-Sided Matching: A Study in Game-Theoretic Modeling and Analysis. Cambridge University Press, 1990.
15. D.B. Shmoys and E. Tardos. Scheduling unrelated machines with costs. In Proceedings of the 4th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 448-454, 1993.
16. M. Skutella. Approximating the single source unsplittable min-cost flow problem. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS), pages 136-145, 2000.

Variations on an Ordering Theme with Constraints

Walter Guttmann and Markus Maucher
Fakultät für Informatik, Universität Ulm, 89069 Ulm, Germany
[email protected] • markus.maucher@uni-ulm.de

Abstract. We investigate the problem of finding a total order of a finite set that satisfies various local ordering constraints. Depending on the admitted constraints, we provide an efficient algorithm or prove NP-completeness. We discuss several generalisations and systematically classify the problems.

Key words: total ordering, NP-completeness, computational complexity, betweenness, cyclic ordering, topological sorting

1 Introduction

An instance of the betweenness problem is given by a finite set $A$ and a collection $C$ of triples from $A$, with the task to decide if there is a total order $<$ of $A$ such that for each $(a, b, c) \in C$, either a
being tractable or NP-complete for further generalisations of ordering problems with constraints. A second step towards this goal is to explain the structure underlying the ordering problems [7].

2 A Generalisation of Topological Sorting

In this section, we substantiate our interest in generalising the ordering problems mentioned in the introduction by discussing an instance that can be solved by a generalised version of topological sorting. Since the instance arises in a practical application, we first give a short overview of the context and then proceed to the mathematical abstraction and the solution.

We consider the part of an object-oriented model of a system specified by the UML class diagram shown in Fig. 1. The classes L and M are related to each other, and the association class K details this relationship. Note that the association from L to M is directed, which means that objects of the class M cannot access those of K and L [8].

Fig. 1. UML class diagram with association class (classes K, L, and M; diagram not reproduced)

From time to time, software that implements this model needs to make the instances that have accumulated in memory persistent in a database. The representations in memory using pointers and in a relational database clash, however, resulting in object-relational mapping problems [9]. For our special problem, the following approach is appropriate.

There should be one database table for each of the classes K, L, and M, into which objects of the respective classes save themselves, with unique identifiers being generated upon storage. To hold the instances of the associations, the so-called links, another table is devised that keeps the identifiers of related objects. For efficiency reasons, one of the three objects that participate in a link should make the entry into the association table. Since all three identifiers are needed for this, the only object of each link capable of doing this is the one stored last. Moreover, because of the restricted visibility in the model, this must not be the object of class M, for it cannot access the other identifiers.

To summarise, for each triple $(a, b, c)$ of objects from classes $(K, L, M)$ that constitute such a link, $a$ or $b$ must be stored after $c$. This is the reason for the requirement $c < a$ or $c < b$ given before for the total order. In practice, a UML class diagram may also have directed associations without a detailing association class. Such a pair $(d, e)$ of objects would have the

requirement $d < e$, modelling that $d$ must be stored before $e$. We therefore state the decision problem of this more general version.

Problem 1.
INSTANCE: Finite set $A$, collection $B$ of pairs from $A$, collection $C$ of triples from $A$.
QUESTION: Is there a bijection $f : A \to \{1, 2, \ldots, |A|\}$ such that $f(a_1) < f(a_2)$ for each $(a_1, a_2) \in B$, and $f(a_3) < f(a_1)$ or $f(a_3) < f(a_2)$ for each $(a_1, a_2, a_3) \in C$?

Note that the bijection $f$ induces a total order, and vice versa. We prove that problem 1 is efficiently decidable by algorithm T shown in Fig. 2, an extension of topological sorting [6, Sect. 2.2.3]. The algorithm maintains working sets $E \subseteq A$, $F \subseteq B$, and $G \subseteq C$.

    input:  finite set $A$, collection of pairs $B$ and triples $C$ from $A$
    output: total order of $A$ such that the first element of each pair in $B$
            precedes the second, and the third element of each triple in $C$
            precedes the first or the second
    method:
        $(E, F, G) \leftarrow (A, B, C)$
        Order $\leftarrow$ empty sequence
        while $E \neq \emptyset$ do
            find $e \in E$ such that $\forall x, y \in E : (e, y) \notin F \wedge (x, y, e) \notin G$
            if such an $e$ exists then
                $G \leftarrow \{(x, y, z) \in G \mid x \neq e \wedge y \neq e\}$
                $F \leftarrow \{(x, y) \in F \mid y \neq e\}$
                $E \leftarrow E \setminus \{e\}$
                prepend $e$ to Order
            else
                output "there is no order"
                halt
            end
        end
        output Order

Fig. 2. Algorithm T

Theorem 1. Algorithm T shown in Fig. 2 solves problem 1.

Proof. Assume algorithm T proposes an order. That order is a permutation of $A$ since during every iteration one element $e$ is removed from $E$ and prepended to the order. To see that the constraints specified by $B$ are satisfied, note that each $(a_1, a_2) \in B$ remains in $F$ until the iteration where $a_2 = e$, thus $a_2$ is prepended to the order. While $(a_1, a_2) \in F$, however, the chosen element $e$ cannot be $a_1$, hence $a_1$ precedes $a_2$ in the order.

Similarly, to see that the constraints specified by $C$ are satisfied, note that each $(a_1, a_2, a_3) \in C$ remains in $G$ until the iteration where $a_1 = e \vee a_2 = e$, thus $a_1$ or $a_2$ is prepended to the order. While $(a_1, a_2, a_3) \in G$, however, the chosen element $e$ cannot be $a_3$, hence $a_3$ precedes $a_1$ or $a_2$ in the order.

Assume algorithm T fails to find an order. In this case, there is a non-empty subset $E \subseteq A$ such that no $e \in E$ satisfies the required property. Thus, for each $e \in E$ either $(e, y) \in F$ for some $y \in E$ or $(x, y, e) \in G$ for some $x, y \in E$. Since $F \subseteq B$ and $G \subseteq C$, each $e \in E$ must precede some $e' \in E$ in a total order. There is no such order of finite sets. □

The time complexity of algorithm T is polynomial in the size of the input. With suitable data structures, it can even be implemented to run in linear time. To this end, we take two measures to ensure that the body of the loop is executed in constant time.

- The working sets $F$ and $G$ are no longer maintained as a whole, but replaced by links from the elements of $A$ to the constraints they occur in. This allows for constant time updates to the working sets.
- As soon as all constraints are satisfied for an element, it is added to a list of candidates. A suitable $e$ can thus be found in constant time in the next iteration of the loop.

Taking $C = \emptyset$ and requiring $B$ to be a (strict) partial order over $A$ demonstrates that algorithm T is indeed a generalisation of topological sorting. Since we do not assume that the elements of a triple in $C$ are distinct, one may even entirely dispose of $B$ by adding a triple $(a_2, a_2, a_1)$ to $C$ for each pair $(a_1, a_2) \in B$. While this procedure works for the problem at hand, it might fail for other types of problems discussed in Sect. 4.
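The two measures described above can be sketched in Python as follows. This is an illustrative reading of algorithm T, not the authors' code: the function name `algorithm_t`, the counter `blockers`, and the index tables are assumptions of this sketch, and the elements of `elements` are assumed to be distinct and hashable. Elements are appended and the list is reversed at the end, which has the same effect as prepending.

```python
from collections import defaultdict


def algorithm_t(elements, pairs, triples):
    """Return a list ordering `elements` so that a1 precedes a2 for every pair
    (a1, a2), and a3 precedes a1 or a2 for every triple (a1, a2, a3);
    return None if no such order exists."""
    # blockers[a] = number of live constraints that forbid choosing a now,
    # i.e. pairs (a, y) still in F plus triples (x, y, a) still in G.
    blockers = {a: 0 for a in elements}
    pairs_ending_at = defaultdict(list)   # a2 -> [a1, ...] for pairs (a1, a2)
    triples_through = defaultdict(list)   # a1 or a2 -> indices of triples
    for a1, a2 in pairs:
        blockers[a1] += 1
        pairs_ending_at[a2].append(a1)
    for t, (a1, a2, a3) in enumerate(triples):
        blockers[a3] += 1
        for a in {a1, a2}:                # a set, since a1 may equal a2
            triples_through[a].append(t)
    triple_alive = [True] * len(triples)
    candidates = [a for a in elements if blockers[a] == 0]
    order = []
    while candidates:
        e = candidates.pop()
        order.append(e)
        # Constraints removed from F and G because of e release other elements.
        released = list(pairs_ending_at[e])
        for t in triples_through[e]:
            if triple_alive[t]:
                triple_alive[t] = False
                released.append(triples[t][2])
        for a in released:
            blockers[a] -= 1
            if blockers[a] == 0:
                candidates.append(a)
    if len(order) < len(blockers):
        return None                       # some elements can never be chosen
    order.reverse()                       # elements were chosen last-to-first
    return order
```

Each pair and each triple is touched a constant number of times, so the running time of this sketch is linear in the size of the input, up to the cost of dictionary operations.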

3 Constraints over Three Elements

We explore different kinds of generalising the betweenness, cyclic ordering, and topological sorting problems introduced before. By $S_3$ we denote the symmetric group on three elements. It contains all permutations of three elements, which we denote as detailed in $S_3 = \{(123), (132), (213), (231), (312), (321)\}$. The first generalisation keeps the assumption that a collection $C$ of triples is given but abstracts from the constraint $P \subseteq S_3$ specifying the relative order of the elements of each triple. We therefore have a family of problems, one for each $P$.

Problem Family 2.
INSTANCE: Finite set $A$, collection $C$ of triples $(a_1, a_2, a_3)$ of distinct elements from $A$.
QUESTION: Is there a bijection $f : A \to \{1, 2, \ldots, |A|\}$ such that for each $(a_1, a_2, a_3) \in C$ there is a $p \in P$ with $f(a_{p(1)}) < f(a_{p(2)}) < f(a_{p(3)})$?
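For very small instances, a member of this problem family can be decided by trying every total order. The Python sketch below does exactly that and is meant only to make the role of $P$ concrete; it runs in exponential time and is not a method from the paper. Permutations are written 0-based, so the tuple `(0, 1, 2)` stands for $(123)$; the names `find_order`, `BETWEENNESS`, and `CYCLIC` are assumptions of this sketch, with betweenness taken as $P = \{(123), (321)\}$ and cyclic ordering as $P = \{(123), (231), (312)\}$.

```python
from itertools import permutations

BETWEENNESS = {(0, 1, 2), (2, 1, 0)}        # P = {(123), (321)}
CYCLIC = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}  # P = {(123), (231), (312)}


def find_order(elements, triples, allowed):
    """Return a total order (as a tuple) in which every triple is arranged
    according to some permutation in `allowed`, or None if there is none."""
    for order in permutations(elements):
        pos = {a: k for k, a in enumerate(order)}
        if all(
            any(pos[t[p[0]]] < pos[t[p[1]]] < pos[t[p[2]]] for p in allowed)
            for t in triples
        ):
            return order
    return None
```

For instance, the betweenness triples `("a", "b", "c")` and `("b", "c", "d")` admit the order a, b, c, d, whereas `("a", "b", "c")` together with `("b", "a", "c")` admit none, since the two triples name different middle elements.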

Choose $P = \{(123), (321)\}$ for betweenness, $P = \{(123), (231), (312)\}$ for cyclic ordering, and $P = S_3 \setminus \{(123), (213)\}$ to get the problem discussed in Sect. 2. The added distinctness condition $a_1 \neq a_2 \neq a_3 \neq a_1$ is easy to check. The total number of problems in this family is $2^{|S_3|} = 2^6 = 64$. Already from the small sample just presented it follows that some of these problems are tractable while others are NP-complete. Thus the task arises to classify the remaining problems. All of them are in NP, since a non-deterministic algorithm can guess the order and check in polynomial time that the constraints specified by $C$ are satisfied with respect to the chosen $P$. This remark applies to all problems discussed in this paper.

To reduce the number of problems that must be investigated, the following symmetry consideration applies. Regard, for example, the problems $P_1 = \{(123), (213)\}$ and $P_2 = \{(231), (321)\}$ that differ just by a consistent renaming of the elements of their permutations. Such a renaming is achieved by composing a permutation from the left, in our case $P_2 = (231) \circ P_1$. Intuitively, this can be compensated by permuting the positions in each triple so that the modified constraints access the original elements. In our example, a triple $(a_1, a_2, a_3)$ from an instance of $P_1$ would be rearranged to $(a_3, a_1, a_2)$ for $P_2$. Precisely, symmetry is exploited by permuting each triple and applying the inverse permutation to all constraints. It follows that two problems $P_1$ and $P_2$ such that $P_2 = \pi \circ P_1$ for some $\pi \in S_3$ have the same time complexity.

Another kind of symmetry enables a further reduction of the number of problems. Intuitively, reversing each constraint can be compensated by transposing the resulting total order. Precisely, a partial order can be extended to a total order if and only if its transpose can be extended: just take the transpose of the total order.
It follows that two problems $P_1$ and $P_2$ that differ just by reversing their permutations, that is $P_2 = P_1 \circ (321)$, have the same time complexity. For example, $S_3 \setminus \{(123), (213)\}$ and $S_3 \setminus \{(321), (312)\}$ are two such problems. After applying both kinds of symmetry considerations to the 64 problems, we are left with those shown in Fig. 3. We prove that the classification provided there is correct.

    tractable                                 NP-complete
    $\emptyset$                               $\{(123), (231)\}$
    $\{(123)\}$                               $\{(123), (321)\}$
    $\{(123), (132)\}$                        $\{(123), (132), (231)\}$
    $\{(123), (213), (231)\}$                 $\{(123), (231), (312)\}$
    $S_3 \setminus \{(123), (213)\}$          $S_3 \setminus \{(123), (231)\}$
    $S_3$                                     $S_3 \setminus \{(123), (321)\}$
                                              $S_3 \setminus \{(123)\}$

Fig. 3. Tractability of all problems $P \subseteq S_3$ up to symmetry
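The count of thirteen classes in Fig. 3 can be reproduced by enumerating all 64 subsets of $S_3$ and grouping them under the two symmetries. The following Python sketch is one possible way to do so. It uses 0-based tuples, so `(0, 1, 2)` stands for $(123)$ and `(2, 1, 0)` for $(321)$; the helper names `compose`, `canonical`, and `count_classes` are assumptions of this illustration.

```python
from itertools import combinations, permutations

S3 = list(permutations(range(3)))   # all six permutations of three positions
REVERSAL = (2, 1, 0)                # plays the role of (321)


def compose(p, q):
    """Composition (p o q)(k) = p(q(k))."""
    return tuple(p[q[k]] for k in range(3))


def canonical(problem):
    """Smallest representative of `problem` under left composition with any
    permutation and optional right composition with the reversal."""
    images = []
    for pi in S3:
        renamed = frozenset(compose(pi, p) for p in problem)
        images.append(renamed)
        images.append(frozenset(compose(p, REVERSAL) for p in renamed))
    return min(images, key=sorted)


def count_classes():
    """Number of problems P that remain after applying both symmetries."""
    problems = [frozenset(c) for r in range(7) for c in combinations(S3, r)]
    return len({canonical(p) for p in problems})
```

Calling `count_classes()` yields 13, which matches the six tractable and seven NP-complete entries of Fig. 3. The sketch only counts the classes; it says nothing about which of them are tractable.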

Theorem 2. The problems of the family 2 are tractable or NP-complete as shown in Fig. 3.

Proof. The problems $\emptyset$ and $S_3$ are solved trivially. The problems $\{(123)\}$, $\{(123), (132)\}$, and $\{(123), (213), (231)\}$ are solved by topological sorting. The problem $S_3 \setminus \{(123), (213)\}$ is solved by algorithm T from Sect. 2. We have already mentioned that betweenness $\{(123), (321)\}$ and cyclic ordering $\{(123), (231), (312)\}$ are NP-complete [2, 4].

To see that the problem $S_3 \setminus \{(123), (321)\}$, which might be called non-betweenness, is also NP-complete, we perform a reduction from betweenness. Let $A'$ and $C'$ characterise an instance of betweenness. Construct the instance of non-betweenness where $A = A'$ and $C$ consists of two triples $(a_2, a_1, a_3)$, $(a_1, a_3, a_2)$ for each $(a_1, a_2, a_3) \in C'$. Intuitively, if neither the first nor the third element of a triple must be arranged between the other two, the second element is forced into that position. This instance has a solution if and only if the corresponding instance of betweenness has one.

A similar argument replaces each triple $(a_1, a_2, a_3) \in C'$ by two triples $(a_1, a_2, a_3)$, $(a_3, a_2, a_1)$ to reduce betweenness to the $\{(123), (132), (231)\}$ problem. Replace each $(a_1, a_2, a_3) \in C'$ by two triples $(a_2, a_1, a_3)$, $(a_2, a_3, a_1)$ to reduce betweenness to the problem $S_3 \setminus \{(123), (231)\}$.

By the same argument, the problem $S_3 \setminus \{(123)\}$ is most difficult. Intuitively, every other problem can be simulated by prohibiting all unwanted triples one by one. For example, a reduction from cyclic ordering replaces each $(a_1, a_2, a_3)$ by three triples $(a_1, a_3, a_2)$, $(a_2, a_1, a_3)$, $(a_3, a_2, a_1)$. NP-completeness of the remaining problem $\{(123), (231)\}$ is proved by Corollary 1 in Sect. 5. □

Additional structure illuminating the interrelation of the problems listed in Fig. 3 is provided by a reduction method [7].
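The reduction from betweenness to non-betweenness given in the proof can be checked exhaustively on tiny instances. The Python sketch below compares both problems by brute force; it is a sanity check under the 0-based tuple encoding used here, not part of the proof, and the names `solvable`, `to_non_betweenness`, and `reduction_agrees` are hypothetical.

```python
from itertools import permutations

S3 = set(permutations(range(3)))
BETWEENNESS = {(0, 1, 2), (2, 1, 0)}   # P = {(123), (321)}
NON_BETWEENNESS = S3 - BETWEENNESS     # S_3 without (123) and (321)


def solvable(elements, triples, allowed):
    """Brute force: does some total order arrange every triple according to
    a permutation in `allowed`?"""
    for order in permutations(elements):
        pos = {a: k for k, a in enumerate(order)}
        if all(
            any(pos[t[p[0]]] < pos[t[p[1]]] < pos[t[p[2]]] for p in allowed)
            for t in triples
        ):
            return True
    return False


def to_non_betweenness(triples):
    """Replace each (a1, a2, a3) by (a2, a1, a3) and (a1, a3, a2)."""
    result = []
    for a1, a2, a3 in triples:
        result += [(a2, a1, a3), (a1, a3, a2)]
    return result


def reduction_agrees(elements, triples):
    """True if the betweenness instance and its image have the same answer."""
    return solvable(elements, triples, BETWEENNESS) == solvable(
        elements, to_non_betweenness(triples), NON_BETWEENNESS
    )
```

Running `reduction_agrees` over all pairs of triples drawn from a four-element set gives no disagreement, as expected: the two replacement triples forbid $a_1$ and $a_3$ from the middle position, which leaves exactly the two betweenness orders.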

4 Constraints over Additional Pairs

The second generalisation we take a look at has already been touched on in Sect. 2, where the collection of triples was joined by a collection of pairs. For that special instance, the additional constraint pairs have no impact on the complexity of algorithm T since they could also be replaced by triples. In general, however, this is not the case. For example, whatever additional betweenness triples are devised to replace a pair $(a_1, a_2)$ that requires $a_1$ to precede $a_2$, they are also satisfied by transposing the resulting total order. There is no way to express absolute direction in the betweenness problem. We therefore have another family of 64 problems, again indexed by $P \subseteq S_3$.

Problem Family 3.
INSTANCE: Finite set $A$, collection $B$ of pairs from $A$, collection $C$ of triples $(a_1, a_2, a_3)$ of distinct elements from $A$.


QUESTION: Is there a bijection $f : A \to \{1, 2, \ldots, |A|\}$ such that $f(a_1) < f(a_2)$ for each $(a_1, a_2) \in B$, and for each $(a_1, a_2, a_3) \in C$ there is a $p \in P$ with $f(a_{p(1)}) < f(a_{p(2)}) < f(a_{p(3)})$?
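For very small instances, the QUESTION of Problem Family 3 can be decided by enumerating all bijections. The sketch below is a hypothetical helper (the name `find_order` is ours), useful only for experimenting with the definitions, not as an algorithm for the tractable cases.

```python
from itertools import permutations


def find_order(A, B, C, P):
    """Return a bijection f: A -> {1, ..., |A|} as a dict, or None.

    B is a collection of pairs, C a collection of triples, and P a set of
    permutations of (1, 2, 3) given as tuples.
    """
    for order in permutations(A):
        f = {a: i + 1 for i, a in enumerate(order)}
        # every pair (a1, a2) in B requires f(a1) < f(a2)
        if not all(f[x] < f[y] for x, y in B):
            continue
        # every triple needs some p in P with f(a_p(1)) < f(a_p(2)) < f(a_p(3))
        if all(
            any(f[t[p[0] - 1]] < f[t[p[1] - 1]] < f[t[p[2] - 1]] for p in P)
            for t in C
        ):
            return f
    return None


if __name__ == "__main__":
    P = {(1, 2, 3), (2, 3, 1)}
    print(find_order(["a", "b", "c"], [("c", "a")], [("a", "b", "c")], P))
    print(find_order(["a", "b", "c"], [("c", "a"), ("a", "b")], [("a", "b", "c")], P))
```

With $P$ = {(123), (231)} and the single pair (c, a), the only admissible order is b, c, a. Adding the pair (a, b) makes the instance unsolvable.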

Note that the symmetry considerations presented in Sect. 3 apply as well to this family. The permutation of the positions in the triples is independent of the additional pairs. Symmetry by reversing each constraint can be extended to this more general case by transposing the relation $B$ to accommodate the reversed order. With the results of Sect. 3 in place, the complexity of each problem in the new family can easily be derived. It turns out that the classification remains unchanged.

Theorem 3. The problems of the family 3 are tractable or NP-complete as shown in Fig. 3.

Proof. Taking $B = \emptyset$ demonstrates that the new problems are indeed generalisations. All NP-complete problems of Sect. 3 thus remain NP-complete. On the other hand, the pairs in $B$ feature as additional input for topological sorting. Thus, all tractable problems are solved using the more general algorithm T that already accepts an additional collection of constraining pairs. □

5 Constraints over Disjoint Triples

The third variation we are investigating takes advantage of the expressivity gained by the pairs introduced in Sect. 4. It is rather a specialisation of those problems where we assume that any two triples in the collection $C$ are disjoint when viewed as sets. This family of problems is also indexed by $P \subseteq S_3$.

Problem Family 4. INSTANCE: Finite set $A$, collection $B$ of pairs from $A$, collection $C$ of pairwise disjoint triples $(a_1, a_2, a_3)$ of distinct elements from $A$. QUESTION: Is there a bijection $f : A \to \{1, 2, \ldots, |A|\}$ such that $f(a_1) < f(a_2)$ for each $(a_1, a_2) \in B$, and for each $(a_1, a_2, a_3) \in C$ there is a $p \in P$ with $f(a_{p(1)}) < f(a_{p(2)}) < f(a_{p(3)})$?

Note that the symmetry considerations presented in Sect. 3 do not affect disjointness, and therefore apply also to this family, the new problems being restrictions of those in Sect. 4. For the latter reason, algorithm T can still be applied to solve the tractable problems. The question remains whether some of the NP-complete problems become easier. In the remaining part of this section, we answer the question in the negative.

Let us start with the problem $P$ = {(123), (231)}, which we call the intermezzo problem. The requirement for the triples $(a_1, a_2, a_3) \in C$ therefore reads $f(a_1) < f(a_2) < f(a_3)$ or $f(a_2) < f(a_3) < f(a_1)$. We prove its NP-completeness by reduction from 3SAT using the component design technique described in [1, Sect. 3.2.3].


Lemma 1. The intermezzo problem is NP-complete. Proof. Let an instance of 3SAT be characterised by the set of variables U = {ui,..., Un} and the set of clauses C = {(ci,i V ci,2 V 01,3),..., {Cm,i V Cm,2 V Cm,3)}, where Cij = Uk or Cjj = Uk for some k. Let Uk = Uk, and let a (B b denote the number c G {1,2,3} such that a + b = c (mod 3). Construct the instance of intermezzo where A ^ {uk,uuk,i I l < f c < n A l < i < 3 } U {c\^j | l < i < m A l < j < 3 A l < Z < 3 } B = {(uk,\,Uk,3), {uk^i,Uk,3) I 1 < fc < n}U {(Ci,i,2, clj), ( c | j , Cij,i) | l < i < m A l < j < 3 } U {(4^.ei,c^)|l
{(4^.,c^,cf,^.) | l < i < m A l < j < 3 } The notation c^j,; is an abbreviation of Uk,i where Uk = Ci^. We describe the given construction that is illustrated in Fig. 4 in more detail. For each literal Uk we construct three elements Uk,i that are grouped in the triple (•us;4,Ufc,2, Wfc^s) as shown in Fig. 4(a). The same construction is applied for each literal Uk- For each variable Uk we thus have two such triples, and we connect them by two edges {uk,i,Uk,3) and (ufc,i,Wfc,3) as shown in Fig. 4(b). The subgraph for each variable therefore consists of 6 nodes, 2 edges, and 2 triples. For each occurrence of a literal Cij in a clause Cj we construct three elements c\,j that are grouped in the triple {cjj,cfj,c^j) as shown in Fig. 4(c). For each clause c, we thus have three such triples, and we connect them pairwise by edges (cK-01, c? •) as shown in Fig. 4(d). The subgraph for each clause therefore consists of 9 nodes, 3 edges, and 3 triples. The connection between the subgraphs for the variables and those for the clauses is obtained by constructing two edges {cij^2, c] j) and (c? •, Cj,j,i) for each occurrence of a literal Cij in a clause. Note that Cijj = Uk,i for positive literals Ci,j = Uk, and Cijj = Uk,i for negative literals Cj,j = Uk- Figure 4(e) shows this construction for the occurrences of the positive literal Cj^i = Uk and the negative literal Ch,i = Wfc in two different clauses Ci and c/j. Further connections are suggested by arrows attached to one node only. The whole graph consists of \A\ = 6n + 9m nodes, \B\ = 2n + 9m edges, and \C\ = 2n + 3m triples. We prove that this instance of intermezzo is solvable if and only if the corresponding instance of 3SAT is satisfiable. Let / be an ordering function as required by the specification of intermezzo. Define the truth assignment t{uk) = f{uk,3) < f{uk,3)- Assume that t does not satisfy C and let (ci,i VCj,2 VCj^s) be a clause such that -^t{cij) for all 1 < j < 3. 1. 
By definition of t we have /{cij^s) < /(cj^-^a).

Fig. 4. Graph showing the construction in the reduction from 3SAT to intermezzo: (a) triple for each literal $u_k$; (b) two triples for each variable $u_k$; (c) triple for each occurrence of a literal $c_{i,j}$ in a clause; (d) three triples for each clause $c_i$; (e) construction for two clauses and one variable.

86 2. 3. 4. 5.

W. Guttmann and M. Maucher Since Since Since Since

(cj,j,i,Ci,j,3) e -B we have /(cij,i) < ficij^s). (cij,i,Cij,2,Ci,j,3) e C we have /(cij,i) < f{cij,2){clpCi^j,i),{ci,jfi,clj) £ B we have /(c?^) < / ( c l j ) . ( c ^ , c^^-, ^ ^ ) G C we have /(c?^) < / ( c ^ ) .

6. Since ( c j j e i ' ^ j ) ^ -B we have / ( ^ j e i ) < fi^lj)7. Therefore we have /{cjj) = /(^jg^g) < /(^j-^a) < K4,j®i) contradiction.

< fi4,j)^

a

Let i be a truth assignment that satisfies C". For 1 < fc < n let i^ = Wfc if t{uk) and tk = Ufc if -'i(wfc). For 1 N such that g{clj) =

£,(ifc'i) 5'(tfc!2) gltkfl) 5(ci\) gl4-)= gl4-)= 5(c|^) gl4-)= ^(ci^.) gldj) 9{tk,3) gltk,i) g{ik,3)

3i + j

= D + k = 2£) + fc = SD + k = 4£) + 3i + j 5£) + 3i + j 6D + 3z + j = 71? + 3i + j iD + 'ii+j = 9£> + 3i + i = lOD + 3i + j = 11-D + fc = 12D + fc = 13-D + fc

for 1 < z < m A 1 < J < 3 A - • ^ ( Q J )

forl
where D is large enough to keep the definitions separate, for instance chosen as D = 2n + Am + 4. The mapping g satisfies the constraints specified by B since 1. g{tk,i) < 13D < g{ik,3) and 5(^,1) < 2D < IID < g{tk,3). 2. 5(ci,j,2) < 4D < gicjj) and gicfj) < 6D < UD < g{cij,i) if t{cij), and 5(cij,2) < 3 D < 413 < 5(4^.) andfir(c2^.)< D < g{cij,i) if -^(cij). 3. 5(cl,,,) < 5D < 6D < g{cli^Q2) and 5(4/,ffi2) < 8D < 5(1 —> N defined by /(e) = |{a e ^ | g{a) < g{e)}\ is one-to-one, and satisfies the constraints specified by B and C. D


We obtain the missing fact from the proof of Theorem 2 in Sect. 3 as a consequence of Lemma 1.

Corollary 1. The problem {(123), (231)} from the family 2 is NP-complete.

Proof. Let $A'$, $B'$, and $C'$ characterise an instance of intermezzo. Construct the instance of the problem {(123), (231)} from the family 2 where $A = A' \cup \{n\}$ for some $n \notin A'$, and $C = C' \cup \{(n, a_1, a_2) \mid (a_1, a_2) \in B'\}$. This instance has a solution if and only if the corresponding instance of intermezzo has one. □

Continuing the main objective of this section, we reduce intermezzo to the problem {(123), (321)} from the current family. Note that the existing NP-completeness proof for betweenness does not apply here because it uses non-disjoint triples [2].

Lemma 2. The problem {(123), (321)} from the family 4 is NP-complete.

Proof. Let $A'$, $B'$, and $C'$ characterise an instance of intermezzo. Construct the instance of betweenness where $A$ extends $A'$ by three new elements $a'_1$, $a'_2$, $a'_3$ for each $(a_1, a_2, a_3) \in C'$. Note that there are $3|C'|$ distinct new elements since the triples in $C'$ are pairwise disjoint. Moreover, $C$ consists of two triples $(a_1, a'_3, a_3)$, $(a'_1, a'_2, a_2)$ for each $(a_1, a_2, a_3) \in C'$. Finally, for each $(a_1, a_2, a_3) \in C'$, $B$ extends $B'$ by inserting three new pairs $(a'_1, a_1)$, $(a'_3, a'_2)$, $(a_2, a_3)$ and, for each pair $(a, a_1)$, one new pair $(a, a'_1)$. Intuitively, an element $a_1$ is split into two elements $a_1$ and $a'_1$ such that $a'_1$ immediately precedes $a_1$.

Assume there is a total order $\prec'$ of the instance of intermezzo. The order $\prec$ modifies $\prec'$ by replacing, for each $(a_1, a_2, a_3) \in C'$, the occurrence of $a_1$ in $\prec'$ with
- either $a'_1 \prec a_1 \prec a'_3 \prec a'_2$ if $a_1 \prec' a_2 \prec' a_3$,
- or $a'_3 \prec a'_2 \prec a'_1 \prec a_1$ if $a_2 \prec' a_3 \prec' a_1$,
such that these four elements succeed without a gap. By definition of intermezzo exactly one of the two cases applies for each triple, thus $\prec$ is a total order of $A$. The order $\prec$ satisfies each triple $(a_1, a'_3, a_3)$ since $a_1 \prec a'_3 \prec a_3$ in the first case and $a_3 \prec a'_3 \prec a_1$ in the second case. The order $\prec$ satisfies each triple $(a'_1, a'_2, a_2)$ since $a'_1 \prec a'_2 \prec a_2$ in the first case and $a_2 \prec a'_2 \prec a'_1$ in the second case. The order $\prec$, being an extension of $\prec'$, satisfies $B'$. In both cases $a'_1 \prec a_1$, $a'_3 \prec a'_2$, and $a_2 \prec a_3$ for each triple $(a_1, a_2, a_3) \in C'$, and, since $a'_1$ and $a_1$ succeed without a gap, $a \prec a'_1$ whenever $a \prec' a_1$. Hence, $\prec$ is a total order of the constructed instance.

Assume there is a total order $\prec$ of the constructed instance. The order $\prec'$ is the restriction of $\prec$ to $A'$. For each triple $(a_1, a_2, a_3) \in C'$, $a_2 \prec' a_3$ since $(a_2, a_3) \in B$. Assume that $a_2 \prec' a_1 \prec' a_3$ for some such triple.
1. By definition of $\prec'$ we also have $a_2 \prec a_1 \prec a_3$.
2. Since $(a_1, a'_3, a_3) \in C$ we have $a_1 \prec a'_3 \prec a_3$.
3. Since $(a'_1, a_1) \in B$ and $(a'_3, a'_2) \in B$ we have $a'_1 \prec a_1 \prec a'_3 \prec a'_2$.


4. Since $(a'_1, a'_2, a_2) \in C$ we have $a'_1 \prec a_1 \prec a'_3 \prec a'_2 \prec a_2$.
5. Therefore $a_1 \prec a_2$ and $a_2 \prec a_1$, a contradiction.

Thus, $a_1 \prec' a_2 \prec' a_3$ or $a_2 \prec' a_3 \prec' a_1$, so $\prec'$ satisfies all triples in $C'$. Finally, $(a_1, a_2) \in B' \Rightarrow (a_1, a_2) \in B \Rightarrow a_1 \prec a_2 \Rightarrow a_1 \prec' a_2$, so $\prec'$ satisfies $B'$. Hence, $\prec'$ is a total order of the instance of intermezzo. □

Finally, intermezzo is reduced to each of the remaining five problems, completing the proof that the classification still remains unchanged. Note again that the existing NP-completeness proof for cyclic ordering does not apply here because it also uses non-disjoint triples [4].

Theorem 4. The problems of the family 4 are tractable or NP-complete as shown in Fig. 3.

Proof. As stated before, algorithm T solves the tractable problems, being restrictions of those from family 3. Lemmas 1 and 2 have already proved two further problems NP-complete. Recall that each of the remaining five problems shown in Fig. 3 represents a class of problems up to symmetry as described in Sect. 3. To simplify the proof we choose five specific problems, one from each class, appropriate for the following argument. They are, precisely, {(123), (231), (321)}, {(123), (231), (312)}, $S_3 \setminus \{(213), (132)\}$, $S_3 \setminus \{(213), (312)\}$, and $S_3 \setminus \{(213)\}$. We show that each of these problems is NP-complete by reduction from intermezzo.

Let $A'$, $B'$, and $C'$ characterise an instance of intermezzo. Construct the instance for any of the five problems where $A = A'$, $C = C'$, and $B = B' \cup \{(a_2, a_3) \mid (a_1, a_2, a_3) \in C'\}$. If the instance of intermezzo has a solution, it also solves the constructed instance since each of the five problems is a superset of {(123), (231)}. If the constructed instance has a solution, it also solves the instance of intermezzo since, by the extension of the pairs, $a_2$ must precede $a_3$ for each triple $(a_1, a_2, a_3) \in C'$ and none of the five problems contains (213). Hence, the constructed instance has a solution if and only if the corresponding instance of intermezzo has one. □
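The construction in the proof of Theorem 4 is simple enough to test by exhaustive search. The Python sketch below is an illustration under our own naming (`solvable`, `reduce_from_intermezzo`, `FIVE_PROBLEMS`); it adds the pair $(a_2, a_3)$ for every triple and checks that solvability is preserved for each of the five chosen problems.

```python
from itertools import permutations

S3 = set(permutations((1, 2, 3)))
INTERMEZZO = {(1, 2, 3), (2, 3, 1)}
FIVE_PROBLEMS = [
    {(1, 2, 3), (2, 3, 1), (3, 2, 1)},
    {(1, 2, 3), (2, 3, 1), (3, 1, 2)},
    S3 - {(2, 1, 3), (1, 3, 2)},
    S3 - {(2, 1, 3), (3, 1, 2)},
    S3 - {(2, 1, 3)},
]


def solvable(A, B, C, P):
    """Brute force over all total orders of A (tiny instances only)."""
    for order in permutations(A):
        f = {a: i for i, a in enumerate(order)}
        if all(f[x] < f[y] for x, y in B) and all(
            any(f[t[p[0] - 1]] < f[t[p[1] - 1]] < f[t[p[2] - 1]] for p in P)
            for t in C
        ):
            return True
    return False


def reduce_from_intermezzo(A, B, C):
    """A and C are kept; B gains the pair (a2, a3) for each triple."""
    return A, list(B) + [(a2, a3) for (_, a2, a3) in C], C


if __name__ == "__main__":
    instances = [
        (list("abcdef"), [("c", "a"), ("a", "d")],
         [("a", "b", "c"), ("d", "e", "f")]),
        (list("abc"), [("b", "a"), ("a", "c")], [("a", "b", "c")]),
    ]
    for A, B, C in instances:
        expected = solvable(A, B, C, INTERMEZZO)
        A2, B2, C2 = reduce_from_intermezzo(A, B, C)
        print(expected, [solvable(A2, B2, C2, P) for P in FIVE_PROBLEMS])
```

Both sample instances use pairwise disjoint triples, as required in family 4. The first is solvable and the second is not, and the constructed instances agree for all five problems.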

6 Conclusion

Let us summarise the contributions of this paper. In Sect. 2 we have presented an efficient algorithm, a generalisation of topological sorting, that solves an object-relational mapping problem modelled as an ordering problem with local constraints. Several generalisations of this and other known tractable and NP-complete ordering problems have been explored starting with Sect. 3, where the constraints are specified by the relative order of three-element subsets [4, 2]. Additionally, the relative order of two-element subsets is admitted for specification in Sect. 4. Finally, requiring some effort, the three-element subsets are required to be pairwise disjoint in Sect. 5.


We have established which of the considered problems are efficiently solvable and which are NP-complete, proving that the classifications coincide for all three problem families that are related as shown in Fig. 5. That picture is completed by the variant that requires disjoint triples but does not permit pairs—this variant can be solved trivially.

Fig. 5. Variants of problems with triples. The diagram relates four variants: Section 3 (no pairs, overlapping triples), Section 4 (pairs, overlapping triples), Section 5 (pairs, disjoint triples), and the variant with no pairs and disjoint triples.

The problems discussed in this paper also arise in the context of qualitative spatial reasoning [5]. The algebraic treatment pursued in that area originates in qualitative temporal reasoning, notably with Allen's interval algebra [10]. All subclasses of Allen's interval algebra have been classified as being either NP-complete or tractable [11, 12].

Note that a simple translation from Allen's interval algebra to our formalism fails for two reasons. First, the relative positions of intervals use not only $<$ but also the $\le$, $=$, and $\neq$ relations. Second, there may be different disjunctions in effect between different pairs of intervals. This could be simulated with the exclusion problem, but that is already NP-complete. Conversely, a simple translation from our formalism to Allen's interval algebra also fails for two reasons. First, the start and end points of intervals are correlated, whereas no such restrictions apply for our ordering problems with constraints. Second, there is only one clause for each pair of intervals, but our set $C$ models arbitrary conjunctions.

Let us finally mention two further generalisations of the problems presented in this paper. First, the number of elements involved in specifying constraints may be increased beyond three. To this end, a reduction technique can be defined that reveals an interesting structure underlying the ordering problems. It again turns out that large classes of the generalised ordering problems are either tractable or NP-complete, and we intend to address this dichotomy. Second, the strict order may be replaced by a weak order. Both topics are currently under investigation [7].


References

1. M.R. Garey and D.S. Johnson. Computers and Intractability. W.H. Freeman and Company, 1979.
2. J. Opatrny. Total ordering problem. SIAM Journal on Computing, 8(1):111-114, February 1979.
3. B. Chor and M. Sudan. A geometric approach to betweenness. SIAM Journal on Discrete Mathematics, 11(4):511-523, November 1998.
4. Z. Galil and N. Megiddo. Cyclic ordering is NP-complete. Theoretical Computer Science, 5(2):179-182, October 1977.
5. A. Isli and A.G. Cohn. A new approach to cyclic ordering of 2D orientations using ternary relation algebras. Artificial Intelligence, 122(1-2):137-187, September 2000.
6. D.E. Knuth. Fundamental Algorithms, volume 1 of The Art of Computer Programming. Addison-Wesley, third edition, 1997.
7. W. Guttmann and M. Maucher. Constrained ordering. Technical Report UIB-2005-03, Universität Ulm, December 2005.
8. Object Management Group, http://www.omg.org/. UML 2.0 Superstructure Specification, August 2005.
9. M. Fowler. Patterns of Enterprise Application Architecture. Addison-Wesley, 2002.
10. J.F. Allen. Maintaining knowledge about temporal intervals. Communications of the ACM, 26(11):832-843, November 1983.
11. B. Nebel and H.-J. Bürckert. Reasoning about temporal relations: A maximal tractable subclass of Allen's interval algebra. Journal of the ACM, 42(1):43-66, January 1995.
12. A. Krokhin, P. Jeavons, and P. Jonsson. Reasoning about temporal relations: The tractable subalgebras of Allen's interval algebra. Journal of the ACM, 50(5):591-640, September 2003.

BuST-Bundled Suffix Trees

Luca Bortolussi¹, Francesco Fabris², and Alberto Policriti¹

¹ Department of Mathematics and Informatics, University of Udine. bortolussi|policriti AT dimi.uniud.it
² Department of Mathematics and Informatics, University of Trieste. frnzfbrs AT dsm.uniuv.trieste.it

Abstract. We introduce a data structure, the Bundled Suffix Tree (BuST), that is a generalization of a Suffix Tree (ST). To build a BuST we use an alphabet $\Sigma$ together with a non-transitive relation $\approx$ among its letters. Following the path of a substring $\beta$ within a BuST, constructed over a text $\alpha$ of length $n$, not only the positions of the exact occurrences of $\beta$ in $\alpha$ are found (as in a ST), but also the positions of all the substrings $\beta_1, \beta_2, \beta_3, \ldots$ that are related with $\beta$ via the relation $\approx$ among the characters of $\Sigma$, for example strings at a certain "distance" from $\beta$. A BuST contains $O(n^{1+\delta})$ additional nodes ($\delta < 1$) in probability, and is constructed in $O(n^{1+\delta})$ steps. In the worst case it contains $O(n^2)$ nodes.

1 Introduction

A Suffix Tree is a data structure computable in linear time and associated with a finite text $\alpha = \alpha[1], \alpha[2], \ldots, \alpha[n] = \alpha[1 \ldots n]$, where $\alpha[i] \in \Sigma$ and $\Sigma = \{a_1, a_2, \ldots, a_K\}$ is the alphabet (that is, $|\Sigma| = K$). In the following we suppose the existence of an ordering among alphabet letters and we assume to append a character $\# \notin \Sigma$ at the end of our text, as is customary when working with STs. A ST allows one to check in $O(m)$ time whether an assigned string $\beta$, $|\beta| = m$, is a substring of $\alpha$; moreover, at the same time it gives the exact positions $j_1, j_2, \ldots, j_r$ of all the $r$ occurrences of $\beta$ in $\alpha$ in $O(r)$ additional time. Therefore, a ST solves the Exact String Matching Problem (ESM) in linear time with respect to the length n of the searched string. A ST also solves in linear time the Longest Repeated Exact Substring Problem (LRES) of an assigned text $\alpha$. A complete and detailed treatment of these results can be found in [6].

Even if very efficient in solving the ESM and the LRES problem, the ST data structure suffers from an important drawback when one has to solve an Approximate String Matching Problem (ASM), or to solve the harder Longest Repeated Approximate Substring Problem (LRAS). In these cases, one needs to search for strings $\beta_1, \beta_2, \beta_3, \ldots$, substrings of $\alpha$, such that $d(\beta, \beta_i) \le D$, where $d(\cdot, \cdot)$ is a suitable distance (most frequently Hamming or Levenshtein distance) and $D$ is constant or proportional to the length of $\beta$. This happens because the structure of a ST is not adequate to handle distance in a natural way. This

Please use the following format when citing this chapter: Bortolussi, L., Fabris, F., Policriti, A., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakwa, Y., (Boston: Springer), pp. 91-102.


forces one to take into account errors by using unnatural and complicated strategies, that inevitably lead to cumbersome algorithms. In general, many different indexing structures other than ST are used to tackle approximate matching problems [9, 8, 5], but all these approaches use an exact index for the text together with some searching strategy to find all (approximate) occurrences of the pattern $\beta$ in the text $\alpha$. Among those structures, STs play a prominent role, not only for approximate matching, but also in pattern discovery algorithms, like in [7], and for statistical analysis of approximate occurrences [3], where it is important to have knowledge about the inner structure of the processed text.

In this work we present a generalization of a Suffix Tree, the Bundled Suffix Tree (BuST), which contains information about an approximate relation between strings as a structural property of the tree. This allows us to perform some kind of approximate string matching with a BuST in the same manner in which we perform exact string matching with a ST. In particular, BuSTs are better suited for LRES and all the problems that require some form of exploitation of the inner (approximate) structure of a string. The matching criterion we use can be very general; in fact we only require to be given a (not necessarily transitive) relation among letters of the alphabet $\Sigma$. For example, the notion of Hamming distance induces a very natural non-transitive relation on $\Sigma$ when each letter $a \in \Sigma$ is in fact a t-tuple over a sub-alphabet $\Sigma_1$ (for example $\Sigma_1 = \{A, C, G, T\}$): the relation between two $\Sigma$-characters $a_i, a_j \in \Sigma$ holds if and only if $d_H(a_i, a_j) \le D$, where $d_H(\cdot, \cdot)$ is the Hamming distance and $D$ is a constant. Other notions of distance can be used as well.

Bundled Suffix Trees encode in a compact way the relational structure existing between the substrings of the processed text $\alpha$. In fact, the relation among the letters of the alphabet can be easily extended to strings (two strings are in relation if so are all their constituting characters), and then we can consider all the relations holding between the substrings of $\alpha$. This information is added to the Suffix Tree by marking some positions in the tree (that can be both in the middle of the edges or over its nodes) with labels corresponding to suffixes, in such a way that the existence of a label $j$ after a certain point implies that the string labeling the path from the root to that point is in relation with a prefix of suffix $j$. In other words, while constructing a BuST, we are resurrecting some nodes of the underlying suffix trie, and attaching to them additional information in terms of labels. The nodes are added only in the lowest position satisfying the property stated above, to avoid the insertion of redundant information (see Def. 2). A detailed analysis of the dimension of BuST shows that, though the worst case size is $O(|\alpha|^2)$, the average size is subquadratic (but superlinear), see Section 3.

Observe that the information we add to a ST is internal to the processed string $\alpha$, in the sense that we do not add any information about the relation of substrings of $\alpha$ with external strings. For this reason, BuST can be useful for all those applications exploiting this internal information (as LRAS) and not necessarily, for example, to search for the approximate occurrences of an external pattern in the text $\alpha$. A suitable application for BuST is presented in


this paper and concerns the calculation of the approximate frequency of appearance of a given subword (with the relative calculation of associated measures of surprise), cf. Section 5. An advantage is that the above mentioned information can be extracted from the BuST in the same way this extraction is done with Suffix Trees in the exact case.

The notion of relation between letters of an alphabet is a general concept, susceptible of encoding different properties connected with the specific application domain, e.g. Hamming-like distances or scoring schemes. Moreover, the particular relation used is completely orthogonal with respect to the definition, the construction and the analysis of the data structure. In this presentation we will deal with a restricted type of relation, constructed over an alphabet of macrocharacters, by means of a threshold criterion relative to a selected distance (mainly Hamming distance). The macroletters can have fixed or variable length; this is not a problem as long as they form a prefix-free code. On the other hand, the introduction of macrocharacters brings some rigidity in the type of approximate information that can be encapsulated. For instance, the Hamming-like relation introduced above puts in correspondence two strings if their distance is less than a threshold proportional to their length, and if the errors are distributed among the tuples. Moreover, only strings of length proportional to the macroletters' length can be compared. This rigidity, however, is the price to pay to "localize" the approximate information we are looking for: with the Hamming-like relation, we "localize" a global distance between two strings by splitting it evenly between their tuples.

The paper is organized as follows. In Section 2 we give the definition of the structure and a naive algorithm for its construction. In Section 3 we analyze the dimension of the data structure in the worst and in the average case. In Section 4 we give some hints to an optimal construction algorithm, while Section 5 contains an application for computing approximate surprise indexes. Finally, in Section 6 we draw some conclusions. The interested reader can find complete proofs, details on the optimal construction and further information in [4].

2 Naive construction of a BuST

A ST is not suitable to handle approximate search in a natural way essentially because of its rigidity in matching characters: they either match and the (unique) path proceeds, or the characters are different and a branching point is necessary. Conversely, in a BuST we accept the idea that a path is good not only when characters match, but also when they are in relation. Let $\Sigma = \{a_1, \ldots, a_K\}$ be an alphabet, and $\approx$ be a symmetric and reflexive binary relation on $\Sigma$, encapsulating some form of approximate information.

Definition 1. Given a string $\beta = \beta[1 \ldots m]$, we say that $\gamma = \gamma[1 \ldots m]$ is a variant of $\beta$ if and only if $\beta[i] \approx \gamma[i]$ for all $i = 1, \ldots, m$, and we write $\beta \approx \gamma$. We denote the set of variants of $\beta$ by $\approx(\beta) = \{\gamma \mid \beta \approx \gamma\}$.
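Definition 1 lifts the letter relation to strings position by position. A minimal Python sketch follows; the helper names (`make_relation`, `is_variant`, `variants`) are ours, and the three-letter relation with $a \approx b$ and $b \approx c$ but not $a \approx c$ is used purely as an illustrative assumption.

```python
from itertools import product


def make_relation(pairs):
    """Build a symmetric, reflexive letter relation from a list of pairs."""
    related_pairs = {frozenset(p) for p in pairs}
    return lambda x, y: x == y or frozenset((x, y)) in related_pairs


def is_variant(beta, gamma, related):
    """gamma is a variant of beta iff lengths agree and letters are related."""
    return len(beta) == len(gamma) and all(
        related(b, g) for b, g in zip(beta, gamma)
    )


def variants(beta, alphabet, related):
    """Enumerate the set of all variants of beta over the given alphabet."""
    choices = [[c for c in alphabet if related(b, c)] for b in beta]
    return {"".join(letters) for letters in product(*choices)}


if __name__ == "__main__":
    related = make_relation([("a", "b"), ("b", "c")])
    print(is_variant("ab", "bc", related))   # a ~ b and b ~ c
    print(is_variant("a", "c", related))     # the relation is not transitive
    print(sorted(variants("ab", "abc", related)))
```

Since `a` is related to two letters and `b` to three, the string `ab` has six variants under this relation.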


The case in which $\approx$ is an equivalence relation trivializes the approach. Hence, we assume that, in general, $\approx$ is not transitive. Other non equivalence relations could be considered as well. Given a ST for $\alpha$, the key idea for constructing the associated BuST is that of marking in the ST (all) the paths corresponding to (prefixes of) $\approx$-variants of each substring $\alpha[j \ldots n]$, for $1 \le j \le n$. This is achieved by inserting nodes over these positions and labeling such nodes with the index of the starting position of the suffix of which they are $\approx$-variants (see Figure 1). Intuitively, we are bundling several paths over the skeleton of the ST. In order to distinguish these newly inserted nodes, we refer to them as red nodes, while we call black the nodes of the original ST. Notice that, according to the previous characterization, a node can be both black and red. In addition, red nodes can have a set of labels associated to them. Moreover, red nodes that end up in between a ST edge are not branching and are simply splitting the edge, i.e. they are nodes of the underlying Suffix Trie.

To (naively) construct the BuST of a text $\alpha$, we can enter each suffix $\alpha[j \ldots n]$ in the associated ST and find all possible paths that correspond to a (maximum length) prefix of one of its $\approx$-variants. This is done by successively comparing and ($\approx$-)matching characters of $\alpha$ and $\alpha[j \ldots n]$. When the first letter of $\alpha[j \ldots n]$, say $\alpha[p]$, not in relation with the processed letter of the current path in the ST is found, a red node with label $j$ is inserted (if not already present) in the position just before $\alpha[p]$. If a red node is already present at that position, label $j$ is added to its label set. Turning back to the comparison phase, two different situations can occur. Either we are in the middle of an edge or on a branching node. In the former case we simply compare the current text character with the current suffix character $\alpha[i]$. If the character is in relation with $\alpha[i]$ we continue, otherwise we insert the red node. In the latter case we have to consider the first letter of any branching path from the current node. Following the alphabet ordering and always keeping operative as many paths as are the letters in relation with $\alpha[i]$, new matching paths can be generated. If no letters are found that are in relation with $\alpha[i]$, then the new red node is superimposed over the existing black branching one. The BuST for the text $\alpha = bcabbabc$ is depicted in Figure 1, in which $\Sigma = \{a, b, c\}$ and $\approx$ is defined by $a \approx b$, $b \approx c$, and $a \not\approx c$. Below we give a formal definition of Bundled Suffix Tree.

Definition 2. A Bundled Suffix Tree (BuST) $B$ for a text $\alpha[1 \ldots n]$ is a Suffix Tree $S$ for $\alpha$ (the black skeleton) plus a set of internal (red) nodes with associated (multiple) labels, such that:
(Main) the path label from the root to a red node labeled $j$ is a $\approx$-variant of a prefix of $\alpha[j \ldots n]$.
(Uniqueness) in every path from the root to a black leaf labeled $j$, there can be at most one red node with label $h \neq j$.
(Maximality) if $\alpha[h \ldots h+i]$ is the string labeling the path from the root to a red node labeled $j$, then $\alpha[h \ldots h+i] \approx \alpha[j \ldots j+i]$ but $\alpha[h+i+1] \not\approx \alpha[j+i+1]$.
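The naive construction can be imitated on an uncompacted suffix trie, where every position of the ST is an explicit node. The following Python sketch rests on that simplifying assumption (it is quadratic in space and is not the construction of the paper); the names `build_suffix_trie` and `red_labels` are ours. It returns, for each path label that would carry a red node, the set of suffix indices (1-based) attached to it.

```python
def build_suffix_trie(text):
    """Uncompacted suffix trie as nested dicts; `text` should end with '#'."""
    root = {}
    for j in range(len(text)):
        node = root
        for ch in text[j:]:
            node = node.setdefault(ch, {})
    return root


def red_labels(text, related):
    """Map path labels to the suffix indices whose red node sits there.

    For each suffix, all trie paths are followed as long as the next path
    letter is related to the next suffix letter. A path that cannot be
    extended receives the suffix index, unless it is the suffix itself
    (that path ends in the black leaf).
    """
    root = build_suffix_trie(text)
    marks = {}
    for j in range(len(text)):
        suffix = text[j:]
        stack = [(root, "")]
        while stack:
            node, path = stack.pop()
            depth = len(path)
            extensions = [
                (child, path + ch)
                for ch, child in node.items()
                if depth < len(suffix) and related(ch, suffix[depth])
            ]
            if extensions:
                stack.extend(extensions)
            elif path != suffix:
                marks.setdefault(path, set()).add(j + 1)
    return marks


if __name__ == "__main__":
    pairs = {frozenset("ab"), frozenset("bc")}
    related = lambda x, y: x == y or frozenset((x, y)) in pairs
    for path, labels in sorted(red_labels("bcabbabc#", related).items()):
        print(repr(path), sorted(labels))
```

For the tiny text `ab#` with $a \approx b$, the sketch marks the path `b` with label 1 and the path `a` with label 2, in line with the Maximality property: each marked path is a variant of a prefix of the suffix and cannot be extended further.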


Fig. 1. Bundled Suffix Tree for the sequence $\alpha = bcabbabc\#$, with $a \approx b \approx c$.

The Main property accounts for the most important function of a BuST, that is to encode all $\approx$-variants of a substring of $\alpha$. The Uniqueness property states that once a red node labeled $h$ is inserted, the subtree rooted at this node cannot contain other red nodes with the same label. Maximality and uniqueness together assure that we insert at most one red node at the deepest possible position.

Remark 1. If $\beta$ is the path label of the ST for $\alpha$, the starting positions of substrings $\gamma$ of $\alpha$ that are variants of $\beta$ are found by reading all the labels rooted at the end of $\beta$.

Remark 2. The BuST is a data structure which is, in some sense, in the middle between a Suffix Tree and a Suffix Trie. We recall that a Suffix Trie is similar in shape to a ST, but every edge contains as many nodes as the length of its label. While constructing a BuST, we insert nodes splitting edges, hence the set of nodes of a BuST contains that of a ST and resembles that of the corresponding Suffix Trie. The analogy stops here, as red nodes may have multiple labels and are added using relation $\approx$ as matching primitive.

In order to simplify the following computations, we assume that the relation $\approx$ enjoys the hypercube-like property over $\Sigma$: for each $a \in \Sigma$, there is a constant number $V$ of $b \in \Sigma$ such that $a \approx b$. When elements of $\Sigma$ are tuples built over a sub-alphabet $\Sigma_1$, we will put $a \approx b$ if and only if $d(a, b) \le D$, where $d(a, b)$ is a suitable distance between tuples and $D$ is a constant. In such cases we will also assume that the constant $D$ is proportional to the length of the tuples over $\Sigma_1$ constituting elements of $\Sigma$. If we work with the Hamming distance, then the macro-characters $b$ such that $a \approx b$ are all the elements of the Hamming sphere of radius $D$ centered in $a$. In such a case the constant $V$ is the volume of this Hamming sphere.


3 Structural properties of a BuST

In order to study the structure of a BuST, we have to compute, for each assigned suffix $\alpha[j \ldots n]$, the number $R(j)$ of red nodes inserted; then the total number of red nodes inserted¹ is $R = \sum_{j=1}^{n} R(j)$. We will perform first the average case analysis, leaving the worst case one at the end. Note that $R(j)$ corresponds to the number of substrings in $\alpha$ that are (maximum length) prefixes of $\approx$-variants of $\alpha[j \ldots n]$. Remember, also, that for any red node with label $j$, the label of the path starting from the root and leading to it is a $\approx$-variant of a prefix of the suffix $\alpha[j \ldots n]$. In order to find the paths with this property, we reason on the execution of the naive construction presented in the previous section. While processing suffix $j$, we have to follow $\alpha[j \ldots n]$ on the black skeleton as long as the two letters we are comparing are in relation. When we find the first letter in $\alpha[j \ldots n]$ that is not in relation with the current letter of the ST path (or with any letter that immediately follows a black branching node), we insert a red node with label $j$ (or we add label $j$ to a preexisting red node). In particular, at every branching node of the ST we have to visit only the edges starting with a character in relation with the corresponding one in the suffix.

Suppose the ST has height $h$; then it is contained in a complete $K$-ary tree of height $h$, $K = |\Sigma|$. In the hypothesis made at the end of the previous section, we know that only $V$ out of $K$ characters are in relation with one letter, hence at every internal node only $V$ out of $K$ edges will survive during the construction. In this way, we can bound the number of survived paths at depth $h$, and thus $R(j)$, by $V^h$ (at each level, the number of active paths is multiplied by a factor $V$). A more reasonable bound of $R(j)$ can be obtained by replacing $h$ with the average depth $d$. Therefore, the value of (an upper bound on) $R(j)$ is strictly connected with the average structure of the ST.
In particular, we are interested in the average behaviour of the height and of the average depth of a path from the root to a leaf. These quantities have been analyzed in [10, 11], under the hypothesis that the text is generated by a stationary and memoryless source $S$. If $X = \{X_1, \ldots, X_n, \ldots\}$ is the sequence of random variables generated by the source, we denote by $H_n$ the height of the Suffix Tree built from $\{X_1, \ldots, X_n\}$ and by $Z_n$ the average depth. From [10] we have that the average value of the height, $h_n = E[H_n]$, asymptotically converges (in probability) to $\log(n)/\log(1/p^+)$, while the asymptotic behaviour of $z_n = E[Z_n]$ approaches $\log(n)/H(S)$. Here $p^+$ is the maximum value of the probability distribution on $\Sigma$ that defines $S$, while $H(S)$ is the Shannon entropy of $S$. The results stated above allow us to compute probabilistic upper bounds (denoted by $\lesssim$) for the quantity $R(j)$:

$$R(j) \lesssim V^{h_n} \approx V^{\log n/\log(1/p^+)} = n^{\delta}$$

¹ More precisely, we are counting the size of the sets of labels inserted, not the number of red nodes.

BuST-Bundled Suffix Trees


Fig. 2. A worst case BuST for the sequence $\alpha = aaccbbbb\#$.

In the bound above, $\delta = \log V/\log(1/p^+)$. A better estimate of $R(j)$ can be obtained by replacing $h_n$ with $z_n$, obtaining $R(j) \lesssim n^{\delta'}$, with $\delta' = \log V/H(S)$. Therefore, the total number of red nodes inserted, denoted by $R$, is bounded on average by:

$$R = \sum_{j=1}^{n} R(j) \lesssim \sum_{j=1}^{n} n^{\delta} = n^{1+\delta}.$$

The value of $\delta$ depends on the probability distribution of the source and on the relation between the letters of the alphabet. For instance, for a Hamming-like relation with macrocharacters of length 4 and error rate 25% built over the DNA alphabet, with the maximum probability of a DNA letter varying from 0.25 to 0.5, the value of $\delta$ stays between 0.46 and 0.92, hence the size of the structure is bounded by a subquadratic function. Observe that the bound we give is coarse: $\delta$ can be greater than one, while the size of the data structure cannot be more than quadratic in the length of the processed text. In fact, the number $R(j)$ of red nodes inserted while processing suffix $j$ can be at most one per path of the Suffix Tree or, equivalently, at most one for each suffix of the text, hence $R(j) \le n$. Therefore $R \le n^2$. This theoretical bound can be reached for particular texts, as shown in the following example.

Example 1. Consider a sequence of the form $\alpha = a^n c^n b^{2n}$, over the alphabet $\Sigma = \{a, b, c, d\}$, with $a \approx b \approx c \approx d \approx a$ as relation (it is hypercube-like). The lengths of the runs of $a$, $c$, $b$ in $\alpha$ are in proportion $1 : 1 : 2$. Note from Figure 2 that, if the length of the text is $4n$, then the rectangular area delimited by the dashed line contains $n(n-1)$ red nodes (the ones with label from $2n+1$ to $4n-1$, repeated $n$ times).
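As a numerical check of the range quoted for $\delta$ in the DNA example of this section, the following minimal Python sketch evaluates $\delta = \log V/\log(1/p^+)$. It rests on assumptions that are not spelled out in the text: "error rate 25%" on macrocharacters of length 4 is read as Hamming distance at most 1, so that $V$ is the size of a Hamming ball, and the probability of the most likely macrocharacter is taken to be the fourth power of the most likely base probability (memoryless source). The function names are hypothetical.

```python
import math


def hamming_ball_size(length, max_dist, base_alphabet_size):
    """Number of macrocharacters within Hamming distance max_dist of a given one."""
    return sum(
        math.comb(length, d) * (base_alphabet_size - 1) ** d
        for d in range(max_dist + 1)
    )


def delta(p_max_base, length=4, max_dist=1, base_alphabet_size=4):
    """delta = log V / log(1 / p_plus) for macrocharacters over a base alphabet.

    Assumption: p_plus, the largest macrocharacter probability, equals
    p_max_base ** length (independent base letters).
    """
    v = hamming_ball_size(length, max_dist, base_alphabet_size)
    p_plus = p_max_base ** length
    return math.log(v) / math.log(1.0 / p_plus)


if __name__ == "__main__":
    for p in (0.25, 0.5):
        print(f"max base probability {p}: delta = {delta(p):.4f}")
```

Under these assumptions $V = 13$, and the two extreme values of $\delta$ agree with the interval stated in the text up to rounding.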


4 Optimal Construction

We briefly outline here an algorithm for constructing BuSTs which is optimal, in the sense that its complexity is of the same order of magnitude as its output (i.e., essentially $O(R)$, the number of red nodes inserted). First of all, let us put forward some useful notation. Given two strings $\beta$ and $\gamma$, we write $\gamma \sqsubseteq \beta$ if $\gamma$ is a prefix of $\beta$; $\gamma \subset \beta$ means that $\gamma$ is a substring of $\beta$, while $\gamma \sqsubseteq_{\approx} \beta$ means that $\gamma$ is in $\approx$-relation with a prefix of $\beta$. Negations of these expressions are indicated by $\gamma \not\sqsubseteq \beta$, $\gamma \not\subset \beta$ and $\gamma \not\sqsubseteq_{\approx} \beta$, respectively.

Consider now a red node $r_i$ with label $i$,² such that its path label $\mathcal{L}(r_i)$ equals some string $x\gamma$, $x \in \Sigma$, $\gamma \in \Sigma^*$. Hence $x\gamma \sqsubseteq_{\approx} \alpha[i \ldots n]$ but, $\forall y \in \Sigma$ such that $x\gamma y \subset \alpha$, $x\gamma y \not\sqsubseteq_{\approx} \alpha[i \ldots n]$. All this information implies that $\gamma \sqsubseteq_{\approx} \alpha[i+1 \ldots n]$, but we cannot conclude that there must be an $(i+1)$-red node $r_{i+1}$ after $\gamma$. In fact, there can be edges departing from $\gamma$ with label $\gamma z$ such that $\gamma z \sqsubseteq_{\approx} \alpha[i+1 \ldots n]$, which derive from paths labeled with $y\gamma z$, $y \neq x$ (and maybe $y \not\approx \alpha[i]$). In other words, if we cross a suffix link (SL from now on) from a node $r_i$ and we find ourselves in a black node $p$, we may need to visit the whole subtree rooted at $p$ to complete the insertion of $(i+1)$-red nodes. Indeed, the situation is even worse. There can be paths in the ST, where $(i+1)$-red nodes must be inserted, which can never be reached, neither by directly traversing an SL from an $(i)$-red node, nor by visiting subtrees at the end of an SL. These paths correspond to positions that can be reached only from SLs that depart from nodes with path label $z\gamma$ and $z \not\approx \alpha[i]$. Therefore, the frontier of a BuST is much more complex than that of an ST, and it cannot be controlled easily using SLs. In some sense, the (main) problem is that, while inserting $(i+1)$-red nodes, we need access to zones of the ST that are forbidden to suffix $i$, because their path label begins with a character which is not in relation with $\alpha[i]$.

Now, suppose we have an $(i+1)$-red node $r_{i+1}$ with path label $\mathcal{L}(r_{i+1}) = \gamma$. It follows that $\gamma \sqsubseteq_{\approx} \alpha[i+1 \ldots n]$ and, $\forall y \in \Sigma$ such that $\gamma y \subset \alpha$, $\gamma y \not\sqsubseteq_{\approx} \alpha[i+1 \ldots n]$. Thus we can consider all the positions in the tree identified by the labels $x\gamma$, where $x\gamma \subset \alpha$ and $x \approx \alpha[i]$, claiming that in all these points we find an $(i)$-red node. In fact, $\forall y \in \Sigma$ such that $x\gamma y \subset \alpha$, it holds that $x\gamma y \not\sqsubseteq_{\approx} \alpha[i \ldots n]$, otherwise we would have $\gamma y \sqsubseteq_{\approx} \alpha[i+1 \ldots n]$, which is a contradiction. Therefore, if we have a way to reach from a position $\gamma$ all the positions $x\gamma$ in the tree, we may be able to insert all $(i)$-red nodes from $(i+1)$-red nodes without matching any character of $\alpha[i \ldots n]$ along any path. The operation of going from $\gamma$ to $x\gamma$ is, in some sense, like crossing an SL in the inverse direction. If we are willing to pay a price in terms of space used, we can define a collection of pointers, called inverse suffix links (ISL), that do this job. Specifically, for each node $p$ of the ST, with path label $\beta$, and for each letter $x \in \Sigma$, there is an inverse suffix link $ISL(p, x)$ that points to the position of the tree with path label $x\beta$, if any. Note that this position may well be in the middle of an edge.

² From now on we refer to such a node as an $(i)$-red node.
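The claim behind inverse suffix links can be checked by brute force on a small text. The sketch below is not the ISL-BUST algorithm: it represents every position of the Suffix Tree simply by its path label (a substring of the text), so `inverse_suffix_link` is a plain substring test, and it verifies that following an inverse link from a maximal match for suffix $i+1$, with a letter in relation with $\alpha[i]$, always lands on a maximal match for suffix $i$. All function names are hypothetical, indices are 0-based, and the relation used in the demonstration is the hypercube-like one of Example 1 made reflexive.

```python
def make_relation(pairs):
    """Reflexive, symmetric letter relation built from unordered pairs."""
    rel = set()
    for x, y in pairs:
        rel.add((x, y))
        rel.add((y, x))
    return lambda x, y: x == y or (x, y) in rel


def matches(gamma, text, i, related):
    """True if gamma is letter-wise in relation with a prefix of text[i:]."""
    if len(gamma) > len(text) - i:
        return False
    return all(related(g, t) for g, t in zip(gamma, text[i:]))


def is_maximal(gamma, text, i, related, alphabet):
    """gamma is a substring matching text[i:], and no one-letter extension
    of gamma that occurs in the text still matches."""
    if gamma not in text or not matches(gamma, text, i, related):
        return False
    return not any(
        (gamma + y) in text and matches(gamma + y, text, i, related)
        for y in alphabet
    )


def inverse_suffix_link(text, beta, x):
    """Position with path label x + beta, or None if it is not in the tree."""
    candidate = x + beta
    return candidate if candidate in text else None


def check_isl_property(text, related):
    """Brute-force check of the claim on every suffix of a (short) text."""
    alphabet = sorted(set(text))
    subs = {text[a:b] for a in range(len(text)) for b in range(a, len(text) + 1)}
    for i in range(len(text) - 1):
        for gamma in subs:
            if not is_maximal(gamma, text, i + 1, related, alphabet):
                continue
            for x in alphabet:
                target = inverse_suffix_link(text, gamma, x)
                if target is not None and related(x, text[i]):
                    if not is_maximal(target, text, i, related, alphabet):
                        return False
    return True


if __name__ == "__main__":
    rel = make_relation([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
    print(check_isl_property("aaccbbbb#", rel))
```

The check is quadratic or worse in the text length and is only meant to illustrate the argument; the construction described in the text reaches the same positions through precomputed pointers instead of substring searches.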


Equipped with ISL, and with some extra care to take correctly into account the maximality property of BuST, we can define an algorithm that builds the BuST for $\alpha$ starting from its ST and processing the text backwards, from the last suffix to the first. Red nodes for suffix $i$ are generated from red nodes for suffix $i+1$, essentially by visiting their parent nodes (in the ST) and checking the positions of the tree pointed to by the ISLs departing from there (only for the letters in relation with $\alpha[i]$). Each red node $r_{i+1}$ can be processed in constant time (actually in $O(K^2)$, with $K = |\Sigma|$), thus giving rise to an algorithm (called ISL-BUST) with complexity $O(R)$. The interested reader can find all the details in [4], where the following theorem is proved.

Theorem 1. ISL-BUST constructs a BuST for a text $\alpha$ in time $O(R)$.

5 An Application of BuST: detecting approximate unusual words

In this section we present an application of BuSTs, related to the detection of unusually overrepresented words in a text $\alpha$. Specifically, we admit as occurrences $\beta'$ of a word $\beta$ also strings that are "close" to $\beta$, where the concept of closeness means that $\beta'$ is a variant of $\beta$. Before entering into the details related to the use of BuSTs, we give a brief overview of a method presented by Apostolico et al. in [2, 1]. The problem tackled is the identification, in a reliable and computationally efficient way, of a subset of strings of a text $\alpha$ that have a particularly high (or low) score of "surprise". Particular care is given to finding a suitable data structure that can represent this set of strings in a space-efficient way, i.e. in a size linear w.r.t. the length of the processed text. The class of measures of surprise considered is the so-called z-score, defined for a substring $\beta$ of $\alpha$ as $\delta(\beta) = (f(\beta) - E(\beta))/N(\beta)$. Here $f(\beta)$ is the observed frequency of $\beta$, $E(\beta)$ is a function that can be interpreted as a kind of expected frequency for $\beta$ and $N(\beta)$ is a normalization factor. Intuitively, we are computing the (normalized) difference between the observed frequency of $\beta$ and its expected value. If this score is high, it means that $\beta$ appears more often than expected, while if it is very low (and negative), then $\beta$ is underrepresented in $\alpha$. Conditions on $E$ and $N$ are given to guarantee that, whenever $f(\beta) = f(\beta\gamma)$, then $\delta(\beta) \le \delta(\beta\gamma)$. In other words, while looking for overrepresented words, we do not have to examine all the $O(n^2)$ substrings of a text $\alpha$, but we can focus on the longest strings sharing the same occurrences, as they are those having the highest z-score. It is easy to see that those strings correspond exactly to the labels of the (inner) nodes of the suffix tree for $\alpha$, so we must compute the z-score only for these strings.
Their frequency can be computed easily by a traversal of the tree in overall linear time. On the other hand, the computation of E and N can be far from trivial, and its complexity is deeply related to the choice of the probabilistic model adopted for the source. E


is usually taken as the expectation of the frequency of a word, while $N$ is usually chosen as the variance or as its first order approximation $\sqrt{E(\beta)\,\mathrm{Prob}(\beta)}$. If the probabilistic model of the source is stationary and memoryless, computation of $E$ can be carried out in constant time after a linear preprocessing. We stress that Suffix Trees not only give rise to an efficient algorithm for computing overrepresented words, but they also allow a compact representation of them. In addition, they allow one to answer efficiently a query of the type: "is a substring $\beta$ of $\alpha$ overrepresented?". The answer to such a question is, in general, not binary. It can be the case, in fact, that $\beta$ terminates in the middle of an edge of the Suffix Tree, so there exists a superstring $\beta\gamma$ of $\beta$ with the same set of occurrences as $\beta$, but with a higher z-score. Therefore an answer to the above query can be this superstring, which is maximal w.r.t. the $\delta$ measure.

In order to improve the above approach, however, we can consider the case in which we are willing to admit as occurrences of a string $\beta$ also strings which differ from it, but are "close enough". In [3], an approach is presented to look for overrepresented strings of length $m$ with at most $k$ errors, in the sense that for each substring $\beta$ of $\alpha$, we count the number of substrings of $\alpha$ of length $m$ with distance at most $k$ from $\beta$, we calculate the expected frequency of approximate occurrence of $\beta$, and finally we compute the z-score w.r.t. such parameters. The overall algorithm has complexity $O(kn^2)$, where $n = |\alpha|$.

The approach we present here is a straightforward adaptation of the algorithm for the exact case, cast in the realm of BuSTs. In this setting we have at our disposal a simple and powerful tool for defining a concept of "closeness" between two strings, i.e. the relation on the alphabet of macrocharacters. For instance, we can use a Hamming-like relation (cf. Introduction) on macrocharacters of length $m$, putting two of them in relation if their Hamming distance is $D$ or less. In this case, we put in relation strings whose length is a multiple of $m$, which can differ in a fraction $D/m$ of their positions, with errors evenly distributed. Thus, we can search for surprising strings of variable length, by counting all the substrings at distance proportional to their length (with some rigidity induced by the usage of macroletters).

We can proceed as follows: given a text $\alpha$, if we use macrocharacters of size $m$, we construct the $m$ strings in this new alphabet, obtained by segmenting $\alpha$ starting from different positions (i.e. from position 1 to position $m$). Then we build the generalized BuST for those $m$ strings, and we visit it to mark each internal node, both black and red, with the number of black leaves and red nodes present in the subtree rooted at it. This operation can be performed in time $O(R)$. At this point, for each substring $\beta$ of $\alpha$ (of size multiple of $m$), we can read in the BuST its approximate frequency $f_{\approx}(\beta)$, i.e. the number of substrings of $\alpha$ that are variants of $\beta$. With an abuse of notation, we denote from now on by $\alpha$ and $\beta$ also the corresponding strings in macrocharacters. We compute the z-score under the hypothesis that $\alpha$ is generated by a Bernoulli process, and we indicate the probability of generating a macroletter $a$ with $p_a$. However, here we have to compute, for each substring $\beta$ of $\alpha$, the probability of finding a substring $\beta' \approx \beta$ in $\alpha$. For a macroletter $a$ we have


that $\hat{p}_a = \sum_{b \approx a} p_b$ is the probability of finding a macrocharacter in relation with $a$, while for a string $a_1 \ldots a_k$ the probability of finding an approximate occurrence under the source model is $\mathrm{Prob}(a_1 \ldots a_k) = \prod_{i=1}^{k} \hat{p}_{a_i}$. Thus the expected number of occurrences in $\alpha$, $|\alpha| = n$, of a string in relation with $\beta$, $|\beta| = m$, is $(n - m + 1)\,\mathrm{Prob}(\beta)$. Therefore we can adopt the same trick used for computing expectations in the exact case: we precompute a vector $A[i] = \mathrm{Prob}(\alpha[1 \ldots i])$ in linear time using the recursive relation $A[i] = A[i-1]\,\hat{p}_{\alpha[i]}$ (with $A[0] = 1$); then the probability for $\alpha[i \ldots j]$, and hence its expectation, can be computed in constant time by $\mathrm{Prob}(\alpha[i \ldots j]) = A[j]/A[i-1]$. As normalization factor, we can use the expectation itself, or the first order approximation of the variance. A direct computation of the variance itself seems much more complicated, as here we can no longer use the method of [2] (in essence, we should replace the concept of autocorrelation with the weaker notion of $\approx$-autocorrelation, i.e. we should look for $\approx$-periods of words; we leave this investigation for future work).

With those choices for the source model and for the normalization factor, we are able to compute the z-score $\delta$ for each string labeling an inner node (both black and red) of the BuST in constant time. Note that if the path in the tree labeled by $\beta$ ends in the middle of an edge, its frequency is the same as that of the string $\beta\gamma$ labeling the path from the root to the first node (black or red) below the end of $\beta$, and therefore $\delta(\beta\gamma) \ge \delta(\beta)$. So we are guaranteed that, in order to find the maximal surprising strings, we need to compute the index only for the nodes of the BuST. In addition the algorithm runs in a time proportional to the size of the BuST itself, which is subquadratic on average. Note also that the number of maximal surprising strings (modulo the approximations introduced by the relation) is of the same size as the BuST, so we are computing the z-score in optimal size and time.
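The prefix-product trick for expectations can be sketched in a few lines of Python. This is only an illustration under the Bernoulli model described above; the function names are hypothetical, positions are 1-based as in the text, and the vector is given an extra entry $A[0] = 1$ so that the probability of $\alpha[i \ldots j]$ is $A[j]/A[i-1]$. For long texts the products underflow in floating point, so a practical version would probably store logarithms instead.

```python
def approx_letter_probs(probs, related):
    """p_hat[a] = sum of p[b] over all letters b in relation with a."""
    return {a: sum(p for b, p in probs.items() if related(a, b)) for a in probs}


def prefix_products(text, p_hat):
    """A[i] = Prob(text[1..i]) with A[0] = 1 (1-based positions, as in the text)."""
    A = [1.0]
    for ch in text:
        A.append(A[-1] * p_hat[ch])
    return A


def expected_approx_occurrences(A, i, j):
    """Expected number of approximate occurrences of text[i..j] (1-based, inclusive)."""
    n = len(A) - 1          # length of the text
    m = j - i + 1           # length of the substring
    return (n - m + 1) * A[j] / A[i - 1]


if __name__ == "__main__":
    # Hypothetical example: uniform letters and the cyclic relation a~b~c~d~a.
    cycle = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}
    related = lambda x, y: x == y or (x, y) in cycle or (y, x) in cycle
    p_hat = approx_letter_probs({c: 0.25 for c in "abcd"}, related)
    A = prefix_products("aacc", p_hat)
    print(expected_approx_occurrences(A, 2, 3))
```

After the linear-time preprocessing each query costs one division and one multiplication, which is what makes the per-node z-score computation constant time.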

6 Conclusions

We presented BuST, a new index structure for strings, which is an extension of Suffix Trees where the alphabet is enriched with a non-transitive relation, encapsulating some form of approximate information. This is the case, for instance, of a relation induced by the Hamming distance for an alphabet composed of macrocharacters on a base one. We showed that the average size of the tree is subquadratic, despite a quadratic worst case dimension, and we provided a construction algorithm linear in the size of the structure. In the final section, we discussed how BuST can be used for computing in an efficient way a class of measures of statistical approximate overrepresentation of substrings of a text $\alpha$. We also have an implementation of the (naive) construction of the data structure in C, which we used to perform some tests on the size of BuST, showing that the bound given in Section 3 is rather pessimistic (cf. again [4]). BuSTs allow one to extract approximate information from a string $\alpha$ in a simple way, essentially in the same way exact information can be extracted from STs.


In addition, they are defined in an orthogonal way w.r.t. the relation and the alphabet used, hence they can be adapted to different contexts with minor effort. Their main drawback is that the usage of a relation on the alphabet permits encoding only a localized version of approximate information, like global Hamming distance distributed evenly along strings. Future directions include the exploration of other application domains, like using the information contained in BuST to build heuristics for the difficult consensus substring problem (cf. [7]).

References

1. A. Apostolico, M. E. Bock, and S. Lonardi. Monotony of surprise and large-scale quest for unusual words. Journal of Computational Biology, 7(3-4):283-313, 2003.
2. A. Apostolico, M. E. Bock, S. Lonardi, and X. Xu. Efficient detection of unusual words. Journal of Computational Biology, 7(1-2):71-94, 2000.
3. A. Apostolico and C. Pizzi. Monotone scoring of patterns with mismatches. In Proceedings of WABI 2004, 2004.
4. L. Bortolussi, F. Fabris, and A. Policriti. Bundled suffix trees. Technical report, Dept. of Maths and Informatics, University of Udine, 2006. http://www.dimi.uniud.it/bortolus/techrep.htm.
5. R. Cole, L. Gottlieb, and M. Lewenstein. Dictionary matching and indexing with errors and don't cares. In Proceedings of STOC 2004, pages 91-100, 2004.
6. D. Gusfield. Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, London, 1997.
7. L. Marsan and M. F. Sagot. Extracting structured motifs using a suffix tree: algorithms and application to promoter consensus identification. In Proceedings of RECOMB 2000, pages 210-219, 2000.
8. G. Navarro. A guided tour to approximate string matching. ACM Computing Surveys, 33(1):31-88, 2001.
9. G. Navarro, R. Baeza-Yates, E. Sutinen, and J. Tarhio. Indexing methods for approximate string matching. IEEE Data Engineering Bulletin, 24(4):19-27, 2001.
10. W. Szpankowski. A generalized suffix tree and its (un)expected asymptotic behaviors. SIAM J. Computing, 22:1176-1198, 1993.
11. W. Szpankowski, P. Jacquet, and B. McVey. Compact suffix trees resemble patricia tries: Limiting distribution of depth. Journal of the Iranian Statistical Society, 3:139-148, 2004.

An O(1) Solution to the Prefix Sum Problem on a Specialized Memory Architecture

Andrej Brodnik¹,², Johan Karlsson¹, J. Ian Munro³, and Andreas Nilsson¹

¹ Luleå University of Technology, Dept. of Computer Science and Electrical Engineering, S-971 87 Luleå, Sweden. {johan.karlsson,andreas.nilsson}@csee.ltu.se
² University of Primorska, Faculty of Education, Cankarjeva 5, 6000 Koper, Slovenia. andrej.brodnik@pef.upr.si
³ Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1. imunro@uwaterloo.ca

Abstract. In this paper we study the Prefix Sum problem introduced by Fredman. We show that it is possible to perform both update and retrieval in $O(1)$ time simultaneously under a memory model in which individual bits may be shared by several words. We also show that two variants (generalizations) of the problem can be solved optimally in $O(\lg N)$ time under the comparison based model of computation.

1 Introduction

Models of computation play a fundamental role in theoretical Computer Science, and indeed, in the subject as a whole. Even in modeling a standard computer, the random access machine (RAM) model has been subject to refinements which more realistically model cost or, as in this paper, suggest feasible extensions to the model that permit more efficient computation, at least for some problems. Work taking into account a memory hierarchy, either when memory and page sizes are known (cf. [2]) or not (cf. [11]), is an example of the former. Taking into account parallelism, as in the PRAM model (cf. [17,26]), is an obvious example of the latter. More subtle examples include the recent result that the operations of an arbitrary finite Abelian group can be carried out in constant time (we assume a word of memory is adequate to hold the size of the group) provided one can reverse the bits of a word in constant time [8]. This argues for a more robust set of operations.

Please use the following format when citing this chapter: Brodnik, A., Karlsson, J., Munro, J.I., Nilsson, A., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 103-114.

Here we deal with the way a single level memory is


organized and demonstrate that the power of a machine can be increased if we permit individual bits to occur in several words simultaneously. This Random Access Machine with Byte Overlap (RAMBO) was first suggested by Fredman and Saks [10] and subsequently used by Brodnik et al. [6] and Brodnik and Iacono [7]. Indeed it is shown in the latter two papers that a priority queue of word sized objects can be maintained in constant time under a particular form of the RAMBO model, whereas Beame and Fich [3] and Brodnik and Iacono [7] have both shown lower bounds on the problem under various forms of the RAM model.

Here we discuss solutions to variants of the Prefix Sum problem (i.e. finding the sum of the first $j$ elements in an array and also updating these values), which was introduced by Fredman [9]. Various lower bounds have been proven for the problem. We, however, focus on the problem under a nonstandard, though very feasible, model to achieve a constant time solution. Fredman and Saks actually suggested the RAMBO model in connection with the Prefix Sum problem. They claim, with no hint of how it may be done, that Prefix Sum mod 2 can be solved in constant time under the model. We show how this can be done not only for Prefix Sum mod 2 but for Prefix Sum modulo an arbitrary universe size $M \le 2^{\lfloor b/n \rfloor}$, where $b$ is the word size, $n = \lceil \lg N \rceil$ and $N$ is the size of the array.

The RAMBO model, besides the usual RAM operations (cf. [27]), also has a part of memory where a bit may occur in several registers or in several positions in one register. The way the bits occur in this part of the memory has to be specified as part of the model. One example of such a memory variant is a square of bits with $b$ rows and $b$ columns. A $b$-bit word can be fetched either as a row or a column. In such a memory each bit can be accessed either by the row word or the column word. The form of RAMBO used by Brodnik et al. [6] to solve the priority queue problem in $O(1)$ worst case time makes use of words corresponding to the leaves of a balanced binary tree. Each node of the tree contains a flag bit and each such word contains the flags along the root to leaf path, so, for example, the flag at the root is in all of these words. The specific architecture was called Yggdrasil after the giant ash tree linking the worlds in Norse mythology. That variant has been implemented in hardware [18] and the actual rerouting of the bits on a word fetch is not difficult. In this paper we modify the Yggdrasil variant slightly and solve the Prefix Sum problem. This gives further evidence of the value of such an architecture, at least for a special purpose processor.

Now let us formally define the Prefix Sum problem:

Definition 1 The Prefix Sum problem is to maintain an array, $A$, of size $N$, and to support the following operations:
Update($j$, $\Delta$): $A(j) := A(j) + \Delta$
Retrieve($j$): return $\sum_{i=0}^{j} A(i)$
where $0 \le j < N$.


Fredman showed that, under the comparison based model of computation, an $O(\lg N)$ solution exists for the Prefix Sum problem [9]. The problem can be generalized in several ways and we start by adding another parameter, $k$, to the Retrieve operation. This parameter gives the starting point of the array interval to sum over. Hence, Retrieve($k$, $j$) returns $\sum_{i=k}^{j} A(i)$, where $0 \le k \le j < N$. This variant is usually referred to as the Partial Sum or Range Sum problem. The Partial Sum problem can be solved using a solution to the Prefix Sum problem (Retrieve($k$, $j$) = Retrieve($j$) - Retrieve($k-1$)). In fact, the two problems are often used interchangeably.

Furthermore, there is no obvious reason to allow only addition in the Update and Retrieve operations. We can allow any binary function, $\oplus$, to be used. In fact we can allow the Update operation to use one function, $\oplus_u$, and the Retrieve operation to use another function, $\oplus_r$. We will refer to this variant of the problem as the General Prefix Sum problem.

Moreover, one can allow array positions to be inserted at or deleted from arbitrary places. Hence, we can have sparse arrays, e.g. an array where only $A(5)$ and $A(500)$ are present. Positions which have not yet been added or have been deleted have the value 0. We refer to this variant as the Dynamic Prefix Sum problem. Brodnik and Nilsson [21, pp 65-80] describe a data structure they call a BinSeT tree which can be modified slightly to support all operations of the Dynamic Prefix Sum problem in $O(\lg N)$ time.

The Searchable Partial Sum problem extends the set of operations with a select($j$) operation which finds the smallest $i$ such that $\sum_{k=0}^{i} A(k) \ge j$ [23]. Hon et al. consider the Dynamic version of the Searchable Partial Sum problem [16]. Another generalization is to use multidimensional arrays and this variant has been studied by the database community [4,12,13,15,24,25].
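The reduction from Partial Sum to Prefix Sum mentioned above can be made concrete with a deliberately naive reference implementation. The sketch below is a baseline for testing faster structures, not one of the solutions discussed in the paper: update takes constant time and retrieve takes linear time. The class and method names are hypothetical, and treating Retrieve(-1) as the empty sum 0 (needed when $k = 0$) is an assumption of this sketch.

```python
class NaivePrefixSum:
    """Reference implementation: O(1) update, O(N) retrieve."""

    def __init__(self, size, modulus=None):
        self.a = [0] * size
        self.modulus = modulus   # optional universe size M

    def _reduce(self, x):
        return x if self.modulus is None else x % self.modulus

    def update(self, j, delta):
        """A(j) := A(j) + delta."""
        self.a[j] = self._reduce(self.a[j] + delta)

    def retrieve(self, j):
        """Sum of A(0..j)."""
        return self._reduce(sum(self.a[: j + 1]))

    def retrieve_range(self, k, j):
        """Sum of A(k..j), computed as Retrieve(j) - Retrieve(k-1)."""
        lower = self.retrieve(k - 1) if k > 0 else 0   # empty sum for k = 0
        return self._reduce(self.retrieve(j) - lower)


if __name__ == "__main__":
    ps = NaivePrefixSum(8)
    for j, d in [(0, 5), (3, 7), (5, 2), (7, 11)]:
        ps.update(j, d)
    print(ps.retrieve(5), ps.retrieve_range(3, 7))
```

Any structure supporting the two Prefix Sum operations can be substituted for the list here without changing `retrieve_range`.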
Several lower bounds have been presented for the Prefix Sum problem: Fredman showed an $\Omega(\lg N)$ algebraic complexity lower bound and an $\Omega(\lg N/\lg\lg N)$ information-theoretic lower bound [9]. Yao [29] has shown that $\Omega(\lg N/\lg\lg N)$ is an inherent lower bound under the semi-group model of computation and this was improved by Hampapuram and Fredman to $\Omega(\lg N)$ [14].

We sidestep these lower bounds by considering the RAMBO model of computation [5,10]. As with all RAM based models we need to restrict the size of a word which can be stored and operated on. We denote the word size by $b$ and assume that $b$ is an integer power of 2, which is true for most computers today. A bounded word size also implies a bounded universe of elements that we store in the array. We use $M$ to denote the universe size. Hence all operations $\oplus$ have to be computed modulo $M$ and we require that each of the operands and the result are stored in one word. We will use $n$ and $m$ to denote $\lceil \lg N \rceil$ and $\lceil \lg M \rceil$ respectively. Hence, $N \le 2^n$ and $M \le 2^m$. Both $n$ and $m$ are less than or equal to $b$ ($n, m \le b$). In one of the solutions we actually require that $nm \le b$.

In Sect. 2 we show an $O(1)$ solution to the Prefix Sum problem under the RAMBO model using a modified Yggdrasil variant. In Sect. 3 we discuss a


$O(\lg N)$ solution to the General and Dynamic Prefix Sum problems and finally conclude the paper with some open questions in Sect. 4.

2 An O(1) Solution to the Prefix Sum Problem

In our $O(1)$ solution to the Prefix Sum problem we use a complete binary tree on top of the array (Fig. 1). We label the nodes in standard heap order, i.e., the root is node $v_1$ and the left and right children of a node $v_i$ are $v_{2i}$ and $v_{2i+1}$ respectively. In each node we store $m$ bits representing the sum of the leaves in the left subtree. Since we build a complete binary tree on top of the array we assume that $N = 2^n$ (if this is not true we still build the complete tree and in the worst case waste space proportional to $N/2 - 1$). We do not store the original array $A$ since its values are stored implicitly in the tree. The only value not stored in the tree (if $N = 2^n$ only) is $A(N-1)$ and we store this value explicitly (vnl). Formally we define:

Definition 2 An N-m-tree is a complete binary tree with $N$ leaves in which the internal nodes ($v_i$) store an $m$-bit value. In addition, an $m$-bit value is stored separately (vnl).

To update $A(j)$ (Algorithm 1) in this structure we have to update all the nodes on the path from leaf $j$ to the root in which $j$ belongs to the left subtree. To Retrieve($j$) (Algorithm 2) we need to sum the values of all the nodes on the path from leaf $j+1$ to the root in which $j+1$ belongs to the right subtree. Note that the path corresponding to array position $j$ starts at node $v_{N/2 + \lfloor j/2 \rfloor}$.


Fig. 1. Complete binary tree on top of $A$. Nodes store the sum of the values in the leaves covered by the left subtree.

The method described above implies $O(\lg N)$ update and retrieval time in the RAM model. To achieve constant time update and retrieval we use a variant of the RAMBO model similar to the Yggdrasil variant. In the Yggdrasil variant, registers overlap as paths from leaf to root in a complete binary tree with one bit stored in each internal node [6]. We generalize the Yggdrasil variant and let it store $m$ bits in each node and call this variant m-Yggdrasil. In any


update(j, Δ)
  if (j == N-1)
    vnl = vnl + Δ;
  else
    i = N + j;
    while (i > 1)
      next = i div 2;
      if (i mod 2 == 0)
        v_next = (v_next + Δ) mod M;
      i = next;

Alg 1: Updating of an N-m-tree in O(lg N) time.

retrieve(j)
  if (j == N-1)
    sum = vnl;
    i = N + j;
  else
    sum = 0;
    i = N + j + 1;
  while (i > 1)
    next = i div 2;
    if (i mod 2 == 1)
      sum = (sum + v_next) mod M;
    i = next;
  return sum;

Alg 2: Retrieve in an N-m-tree in O(lg N) time.
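Algorithms 1 and 2 translate almost line by line into runnable code. The following Python sketch is one possible transcription for the ordinary RAM setting (logarithmic time, no overlapping registers); the class name `NMTree` is hypothetical, $N$ is assumed to be a power of two, and, unlike the pseudocode, vnl is also reduced modulo $M$ on update so that every stored value stays within $m$ bits.

```python
class NMTree:
    """N-m-tree in heap order: v[i] holds the sum (mod M) of the leaves
    in the left subtree of internal node i; vnl holds A(N-1)."""

    def __init__(self, n_leaves, modulus):
        if n_leaves < 2 or n_leaves & (n_leaves - 1):
            raise ValueError("n_leaves must be a power of two, at least 2")
        self.N = n_leaves
        self.M = modulus
        self.v = [0] * n_leaves      # v[1..N-1] are the internal nodes
        self.vnl = 0

    def update(self, j, delta):
        """A(j) := A(j) + delta (mod M), as in Algorithm 1."""
        if j == self.N - 1:
            self.vnl = (self.vnl + delta) % self.M
            return
        i = self.N + j               # heap index of leaf j
        while i > 1:
            parent = i // 2
            if i % 2 == 0:           # we came up from a left child
                self.v[parent] = (self.v[parent] + delta) % self.M
            i = parent

    def retrieve(self, j):
        """Sum of A(0..j) modulo M, as in Algorithm 2."""
        if j == self.N - 1:
            total = self.vnl
            i = self.N + j
        else:
            total = 0
            i = self.N + j + 1       # walk up from leaf j + 1
        while i > 1:
            parent = i // 2
            if i % 2 == 1:           # we came up from a right child
                total = (total + self.v[parent]) % self.M
            i = parent
        return total


if __name__ == "__main__":
    tree = NMTree(8, 1000)
    for j, d in [(0, 5), (3, 7), (7, 11), (4, 2)]:
        tree.update(j, d)
    print([tree.retrieve(j) for j in range(8)])
```

Each operation touches one node per level, which gives the $O(\lg N)$ bound; the RAMBO variant described next obtains the same path as a single register.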

m-Yggdrasil, register reg[i] corresponds to the path from node $v_{N/2+i}$ to the root of the tree. Each register consists of $nm \le b$ bits. In total the m-Yggdrasil registers need $(N-1) \cdot m$ bits. Now, we use the registers from m-Yggdrasil to store the nodes of our tree. The path corresponding to array position $j$ is stored in reg[j/2] and hence all nodes along the path can be accessed at once. We let the levels of the tree be counted from the internal nodes above the leaves, starting at 0 and ending with $n-1$ at the root. If the $i$th bit of $j$ is 1 then $j$ is in the right subtree of the node on level $i$ of the path, and in the left otherwise. Hence $j$ can be used to determine which nodes along the path should be updated (nodes corresponding to bits of $j$ that are 0) and which nodes should be used when retrieving a sum (nodes corresponding to bits of $j$ that are 1). When updating the m-Yggdrasil registers (Algorithm 3), for all bits of $j$, if the $i$th bit of $j$ is 0 we add $\Delta$ to the value of the $i$th node along the path from $j$ to the root. To do this we shift $\Delta$ to the corresponding position ($\Delta \ll (im)$) and add it to reg[j/2]. Instead of checking whether the $i$th bit of $j$ is 0 we can


mask the shifted $\Delta$ with a value based on NOT j. The value consists of, if the $i$th bit of NOT j is 1, $m$ 1s shifted to the correct position, and $m$ 0s otherwise.

update(j, Δ)
  if (j == N-1)
    vnl = vnl + Δ;
  else
    for (i = 0; i < n; i++)
      if (((j >> i) AND 1) == 0)
        reg[j/2] = reg[j/2] + (Δ << (i*m));

Alg 3: Updating of an N-m-tree stored in m-Yggdrasil memory (O(lg N) time).

Actually, as long as the binary operation only affects the $m$ bits that should be updated we can use word-size parallelism (cf. [5]) and perform the update of all nodes in parallel. In Sect. 2.1 we show that addition modulo $M$ can be implemented affecting only $m$ bits. We use two functions (dist(i) and mask(i)) to simplify the description of the update and retrieve methods. The function dist(i), $0 \le i < 2^m$, computes $nm$-bit values. The values are $n$ copies of the $m$ bits in $i$. For example, given $m = 3$, $n = 4$, dist(010) is 010010010010. The function mask(i), $0 \le i < 2^n$, also computes $nm$-bit values. These values are computed as follows: bit $j$ ($0 \le j < n$) of $i$ is copied to bits $jm \ldots (j+1)m - 1$. For example, given $m = 3$, $n = 4$, mask(1001) is 111000000111. Both these functions can be implemented by using word-size parallelism [5]. We can update the tree in constant time using the procedure in Algorithm 4. First we make $n$ copies of $\Delta$ and then mask out the copies we need. Then finally we add the value in reg[j/2] and the masked distributed $\Delta$ and store the result in reg[j/2]. For the case when $j = N - 1$ we simply add vnl and $\Delta$ and store the result in vnl. This gives us the following lemma:

Lemma 1 The update operation of the Prefix Sum problem can be supported in $O(1)$ time when part of the N-m-tree is stored in an m-Yggdrasil memory.

update(j, Δ)
  if (j == N-1)
    vnl = vnl + Δ;
  else
    reg[j/2] = reg[j/2] + (dist(Δ) AND mask(NOT j));

Alg 4: Updating of an N-m-tree stored in m-Yggdrasil memory using word-size parallelism ($O(1)$ time).
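The two helper functions and both update variants can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the constant-time implementation: `dist` and `mask` are written as plain loops (the paper obtains them in constant time through word-size parallelism [5]), the register is modelled as an ordinary integer, and plain addition is used, so the sketch assumes no field overflows (Sect. 2.1 treats addition modulo $M$). The function names `update_loop` and `update_parallel` are hypothetical.

```python
def dist(i, m, n):
    """Return n copies of the m-bit value i, packed into one nm-bit integer."""
    out = 0
    for k in range(n):
        out |= i << (k * m)
    return out


def mask(i, m, n):
    """Copy bit j of i to bits j*m .. (j+1)*m - 1 of the result."""
    ones = (1 << m) - 1
    out = 0
    for j in range(n):
        if (i >> j) & 1:
            out |= ones << (j * m)
    return out


def update_loop(reg, j, delta, m, n):
    """Algorithm 3 style: add delta to every field whose bit in j is 0."""
    for i in range(n):
        if ((j >> i) & 1) == 0:
            reg += delta << (i * m)
    return reg


def update_parallel(reg, j, delta, m, n):
    """Algorithm 4 style: one masked addition instead of a loop over fields."""
    not_j = ~j & ((1 << n) - 1)  # NOT j restricted to its n low bits
    return reg + (dist(delta, m, n) & mask(not_j, m, n))
```

Both variants add the same value to the register; the second only replaces the per-bit test by a mask.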


To support the retrieve method in constant time we use a table SUM[i], ($0 \le i < 2^{nm}$), with $m$-bit values that are the sum modulo $M$ of the $n$ $m$-bit values in $i$. To retrieve the sum (Algorithm 5) we read the register reg corresponding to $j$ and mask out the parts we need. Then we use the table SUM to calculate the sum. Finally, we add vnl to the sum if $j = N-1$.

retrieve(j)
  if (j == N-1)
    v = reg[j/2] AND mask(j);
  else
    v = reg[(j+1)/2] AND mask(j+1);
  sum = SUM[v];
  if (j == N-1)
    sum = vnl + sum;
  return sum;

Alg 5: Retrieve in an N-m-tree stored in m-Yggdrasil memory using word-size parallelism ($O(1)$ time).

The space needed by the table SUM is $2^{nm} \cdot m = N^{\lg M} \cdot m = M^{\lg N} \cdot m$ bits, which is rather large. In order to reduce the space requirement we can reduce, by half, the number of bits used as index into the table. This gives us a space requirement of $\sqrt{M^{\lg N}} \cdot m$ bits. We do this by shifting the top $n/2$ $m$-bit values from reg down and computing the sum modulo $M$ of these values and the bottom $n/2$ values. Then this new $(n/2)m$-bit value is used as index into SUM instead. We can actually repeat this process until we get the $m$ bits we desire, and hence we do not need the table SUM (Algorithm 6). However, this does increase the time complexity to $O(\lg n) = O(\lg\lg N)$. This gives us a trade-off between space and time. By allowing $O(\iota)$ steps for the retrieve method we need $M^{\lg N/2^{\iota}} \cdot m$ bits for the table.

Lemma 2 The retrieve operation of the Prefix Sum problem can be supported in $O(\iota + 1)$ time using $O(M^{\lg N/2^{\iota}} \cdot m + m)$ bits of memory in addition to the N-m-tree. Part of the N-m-tree is stored in m-Yggdrasil memory.

By adjusting $\iota$ we can achieve the following result:

Corollary 1 The retrieve operation of the Prefix Sum problem can be supported in:
- $O(1)$ time using $O(\sqrt{M^{\lg N}} \cdot m)$ bits of memory in addition to the N-m-tree, with $\iota = 1$.
- $O(\lg\lg N)$ time using $O(m)$ bits of memory in addition to the N-m-tree, with $\iota = \lceil \lg\lg N \rceil$.


retrieve(j)
  if (j == N-1)
    v = reg[j/2] AND mask(j);
  else
    v = reg[(j+1)/2] AND mask(j+1);
  ι = ⌈lg n⌉;
  do
    ι = ι - 1;
    vnew = (v >> ((2^ι)m)) + (v AND ((1 << ((2^ι)m)) - 1));
    v = vnew;
  while (ι > 0)
  sum = v;
  if (j == N-1)
    sum = vnl + sum;
  return sum;

Alg 6: Retrieve in an N-m-tree stored in m-Yggdrasil memory using no additional memory ($O(\lg\lg N)$ time).
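The halving step of Algorithm 6 can be illustrated with a short Python sketch. It is only an illustration under assumptions: the register is an ordinary integer, $n$ is taken to be a power of two, and the field-wise addition modulo $M$ is done by a plain loop in the hypothetical helper `add_fields`, whereas the paper performs it on all fields at once (Sect. 2.1).

```python
def add_fields(x, y, count, m, M):
    """Add `count` packed m-bit fields of x and y, each field modulo M."""
    field = (1 << m) - 1
    out = 0
    for k in range(count):
        a = (x >> (k * m)) & field
        b = (y >> (k * m)) & field
        out |= ((a + b) % M) << (k * m)
    return out


def fold_sum(v, n, m, M):
    """Sum the n m-bit fields of v modulo M by repeated halving.

    Assumes n is a power of two; each round adds the top half of the
    fields onto the bottom half, so lg n rounds are needed.
    """
    width = n
    while width > 1:
        half = width // 2
        high = v >> (half * m)               # top half shifted down
        low = v & ((1 << (half * m)) - 1)    # bottom half
        v = add_fields(high, low, half, m, M)
        width = half
    return v
```

For example, the fields 3, 5, 6, 1 (with $m = 3$, $M = 7$) fold first to 2, 6 and then to 1, which is $15 \bmod 7$.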

2.1 Addition modulo M

Let us consider the two $m$-bit operands $a$ and $b$ which are split into two pieces each ($a_{lo}$, $a_{hi}$, $b_{lo}$ and $b_{hi}$). The two pieces $a_{lo}$ and $a_{hi}$ contain the $m/2$ least and most significant bits of $a$ respectively (similarly for $b_{lo}$ and $b_{hi}$). Note that $a_{lo}$ and the other pieces are stored in $m$ bits but only the $m/2$ least significant bits are used. We can now add the two operands

$$c'_{lo} = a_{lo} + b_{lo} \qquad (1)$$

$$c'_{hi} = a_{hi} + b_{hi} \qquad (2)$$

However, both $c'_{lo}$ and $c'_{hi}$ might need $m/2 + 1$ bits for their results. The $(m/2 + 1)$st bit of $c'_{lo}$ should be added to $c'_{hi}$, so we split $c'_{lo}$ into two pieces ($c'_{lo,lo}$ and $c'_{lo,hi}$) and add the most significant bits to $c'_{hi}$,

$$c_{hi} = c'_{hi} + c'_{lo,hi} \qquad (3)$$

$$c_{lo} = c'_{lo,lo} \qquad (4)$$

The result of $a + b$ is now stored in $c_{lo}$ and $c_{hi}$ and we have not used more than $m$ bits in any word. However, in total $m + 1$ bits might be needed for the value.

To compute $c \bmod M$ we can check whether or not $c - M \ge 0$; if so $c \bmod M = c - M$ and otherwise $c \bmod M = c$. However, we do not want to produce a negative value since that would affect all the bits in the word. Instead we add an additional $2^m$ to the value and compare to $2^m$, i.e. $c + 2^m - M \ge 2^m$. Since $2^m - M \ge 0$ this will never produce a negative value. Note that $c + 2^m - M \le M - 1 + M - 1 + 2^m - M = M + 2^m - 2 \le 2^{m+1} - 2$, which only needs $m + 1$ bits to be represented. Hence, if we calculate this value using the strategy above we will not use more than $m$ bits of any word.

Furthermore, a straightforward less-than comparison cannot be performed using word-size parallelism since all bits of the words are considered. Instead we view the comparison as a check of whether the $(m + 1)$st bit is set or not. If it is set the value is larger than or equal to $2^m$ (cf. [19,22]). We can actually create a bit mask which consists of $m$ 1s if the $(m + 1)$st bit is set and $m$ 0s otherwise:

$$d = ((c + 2^m - M) \text{ AND } 2^m) - (((c + 2^m - M) \text{ AND } 2^m) \gg m) \qquad (5)$$

This bit mask $d$ can then be used to calculate $res = c \bmod M$. Since $res$ is equal to $c - M$ if the $(m + 1)$st bit of $c + 2^m - M$ is set and to $c$ otherwise, we get

$$res = ((c - M) \text{ AND } d) \text{ OR } (c \text{ AND NOT } d) \qquad (6)$$

When computing $c - M$ we must make sure that we do not produce a negative value. This is done by using a similar strategy as for addition above, but we also set any of the bits in $c_{hi,hi}$ to 1 during the computation. If $c - M$ is greater than 0 this will not affect the result, and otherwise the result will not be used.

We have a procedure which can be used to compute $(a + b) \bmod M$ without using more than $m$ bits in any word. Hence, word-size parallelism can be used and we get our main result from this section:

Theorem 1 Using the N-m-tree together with the m-Yggdrasil memory we can support the operations of the Prefix Sum problem in $O(\iota + 1)$ time using $(N-1)m$ bits of m-Yggdrasil memory and $O(M^{\lg N/2^{\iota}} \cdot m + m)$ bits of ordinary memory.
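Equations (5) and (6) can be checked on a single field with a few lines of Python. This is a sketch under assumptions rather than the word-parallel procedure: it works on one pair of operands held in ordinary integers, so the split into low and high pieces is not needed, and NOT $d$ is taken within the $m$-bit field only (an assumption made here so that the $(m+1)$st bit of $c$ does not leak into the result). The function name `add_mod` is hypothetical.

```python
def add_mod(a, b, M, m):
    """Branch-free (a + b) mod M for 0 <= a, b < M <= 2**m."""
    field = (1 << m) - 1
    c = a + b
    t = c + (1 << m) - M      # never negative, fits in m + 1 bits
    top = t & (1 << m)        # the (m+1)st bit: set exactly when c >= M
    d = top - (top >> m)      # m ones if the bit is set, else 0 (Eq. 5)
    not_d = d ^ field         # NOT d restricted to the m-bit field
    # Eq. 6: select c - M when d is all ones, c otherwise.
    return ((c - M) & d) | (c & not_d)
```

Python integers are unbounded, so the negative intermediate `c - M` is harmless here because it is masked by `d = 0`; in a fixed-width word the paper avoids producing it in the first place.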

3 An $O(\lg N)$ Solution to the General and Dynamic Prefix Sum Problem

We can actually partially solve the General Prefix Sum problem using the N-m-tree data structure and the m-Yggdrasil variant of RAMBO. All binary operations such that all elements in the universe have a unique inverse element (i.e. binary operations which form a group with the set of elements in the universe) and which only affect the $m$ bits involved in the operation can be supported. This includes for example addition and subtraction but not the maximum function.

To solve the General and Dynamic Prefix Sum problem for semi-group operations we modify the Binary Segment Tree (BinSeT) data structure suggested by Brodnik and Nilsson. It was designed to handle in-advance resource reservation [21, pp 65-80] and, if it is slightly modified, it can solve both the General and Dynamic Prefix Sum problems efficiently. The original BinSeT stores, in each internal node, $\mu$, the maximum value over the interval, and $\delta$, the change of the value over the interval. Further, it also stores $\tau$, the time of the leftmost event in the right subtree. Instead of storing times as interval dividers we store array indices. To solve the Dynamic Prefix Sum problem with addition as the operation we only need to store $\delta$. When solving the General Prefix Sum problem one needs to store information depending on the two binary operations $\oplus_u$ and $\oplus_r$.

When adding a new array position or deleting an array position the tree is rebalanced (cf. [1,20]) and hence the height is always $O(\lg N)$. When updating a value in an array position we start at the root and search for the proper leaf using the interval dividers. During the backtracking of the recursion we update the information stored in each affected node. At retrieval we process the information of the proper nodes when traversing the tree. Since the height of the tree is $O(\lg N)$ all the operations can be performed in $O(\lg N)$ time. This matches the lower bound by Hampapuram and Fredman [14].

BinSeT consists of $O(N)$ nodes when we use it to solve the General Prefix Sum problem. Each node contains $O(1)$ $m$-bit values and hence the total space requirement is $O(Nm)$ bits.
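The $O(\lg N)$ behaviour for addition can be illustrated with a much simpler structure than BinSeT. The sketch below is not BinSeT: it is a static, array-based binary tree without insertion, deletion, rebalancing or interval dividers, and it stores in each node only the sum over its interval (playing the role of $\delta$). The class name `SumTree` and its methods are hypothetical.

```python
class SumTree:
    """Static binary tree over n array positions; each node holds an interval sum."""

    def __init__(self, n):
        self.size = 1
        while self.size < n:
            self.size *= 2
        # Node 1 is the root; leaves occupy size .. 2*size - 1.
        self.delta = [0] * (2 * self.size)

    def update(self, j, d):
        """Add d to array position j, touching one node per level."""
        node = self.size + j
        while node >= 1:
            self.delta[node] += d
            node //= 2

    def retrieve(self, j):
        """Return the sum of array positions 0..j."""
        lo, hi = self.size, self.size + j + 1   # half-open leaf range
        total = 0
        while lo < hi:
            if lo & 1:
                total += self.delta[lo]
                lo += 1
            if hi & 1:
                hi -= 1
                total += self.delta[hi]
            lo //= 2
            hi //= 2
        return total
```

Both operations visit a constant number of nodes per level, so they take time proportional to the height of the tree.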

4 Conclusion

The Dynamic and General Prefix Sum problems can both be solved optimally in $O(\lg N)$ time using $O(Nm)$ space under the comparison based model with semi-group operations. The Prefix Sum problem can be solved in $O(1)$ time under the RAMBO model when we allow $O(\sqrt{M^{\lceil \lg N \rceil}} \cdot m)$ bits of ordinary memory and $O(Nm)$ bits of m-Yggdrasil memory to be used. This is a huge amount of ordinary memory, and if we restrict the space requirement to be sub-exponential in both $N$ and $M$ ($O(m)$ bits of ordinary memory and $O(Nm)$ bits of m-Yggdrasil memory) we need to use $O(\lg\lg N)$ time. We know of no better lower bound under RAMBO than the trivial $\Omega(1)$ when only allowing $O((N^{O(1)} + M^{O(1)})m)$ space. Further, it is currently unknown if one can achieve an $O(1)$ solution to the Dynamic and General Prefix Sum problems using the RAMBO model. Another open question is whether or not it is possible to achieve an $o(\lg N)$ solution to the multidimensional variant.

Acknowledgment

We thank the anonymous reviewers for helpful comments and additional references.

References

1. G. M. Adelson-Velskii and E. M. Landis. An algorithm for the organization of information. In Soviet Math. Doklady 3, pages 1259-1263, 1962.


2. Alok Aggarwal and Ashok K. Chandra. Virtual memory algorithms (preliminary version). In Proceedings of the 20th Annual ACM Symposium on Theory of Computing, pages 173-185. ACM Press, May 2-4 1988.
3. P. Beame and F. E. Fich. Optimal bounds for the predecessor problem and related problems. Journal of Computer and System Sciences, 65(1):38-72, 2002.
4. Fredrik Bengtsson and Jingsen Chen. Space-efficient range-sum queries in OLAP. In Yahiko Kambayashi, Mukesh Mohania, and Wolfram Wöß, editors, Data Warehousing and Knowledge Discovery: 6th International Conference DaWaK, volume 3181 of Lecture Notes in Computer Science, pages 87-96. Springer, September 2004.
5. Andrej Brodnik. Searching in Constant Time and Minimum Space (Minimae Res Magni Momenti Sunt). PhD thesis, University of Waterloo, Waterloo, Ontario, Canada, 1995. (Also published as technical report CS-95-41.)
6. Andrej Brodnik, Svante Carlsson, Michael L. Fredman, Johan Karlsson, and J. Ian Munro. Worst case constant time priority queue. Journal of Systems and Software, 78(3):249-256, December 2005.
7. Andrej Brodnik and John Iacono. Dynamic predecessor queries. Unpublished manuscript, 2006.
8. Arash Farzan and J. Ian Munro. Succinct representation of finite abelian groups. In Proceedings of the 2006 International Symposium on Symbolic and Algebraic Computation, Lecture Notes in Computer Science. Springer, 2006. To appear.
9. Michael L. Fredman. The complexity of maintaining an array and computing its partial sums. Journal of the ACM, 29(1):250-260, January 1982.
10. Michael L. Fredman and Michael E. Saks. The cell probe complexity of dynamic data structures. In Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pages 345-354. ACM Press, May 14-17 1989.
11. Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. Cache-oblivious algorithms. In IEEE, editor, 40th Annual Symposium on Foundations of Computer Science (FOCS), pages 285-297. IEEE Computer Society, October 17-19 1999.
12. Steven P. Geffner, Divyakant Agrawal, Amr El Abbadi, and T. Smith. Relative prefix sums: An efficient approach for querying dynamic OLAP data cubes. In Proceedings of the 15th International Conference on Data Engineering, pages 328-335, 1999.
13. Steven P. Geffner, Mirek Riedewald, Divyakant Agrawal, and Amr El Abbadi. Data cubes in dynamic environments. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, pages 31-40, 1999.
14. Haripriyan Hampapuram and Michael L. Fredman. Optimal biweighted binary trees and the complexity of maintaining partial sums. SIAM Journal on Computing, 28(1):1-9, 1998.
15. C. Ho, R. Agrawal, N. Megiddo, and R. Srikant. Range queries in OLAP data cubes. In Proceedings ACM SIGMOD International Conference on Management of Data, pages 73-88, 1997.
16. Wing-Kai Hon, Kunihiko Sadakane, and Wing-Kin Sung. Succinct data structure for searchable partial sums. In Toshihide Ibaraki, Naoki Katoh, and Hirotaka Ono, editors, Algorithms and Computation - ISAAC 2003, 14th International Symposium, volume 2906 of Lecture Notes in Computer Science, pages 505-516. Springer, December 2003.
17. Richard M. Karp and Vijaya Ramachandran. Parallel algorithms for shared-memory machines. In van Leeuwen [28], chapter 17, pages 869-941.


18. Roni Leben, Marijan Miletic, Marjan Spegel, Andrej Trost, Andrej Brodnik, and Johan Karlsson. Design of high performance memory module on PC100. In Proceedings Electrotechnical and Computer Science Conference, pages 75-78, Slovenia, 1999.
19. Kjell Lemström, Gonzalo Navarro, and Yoan Pinzon. Practical algorithms for transposition-invariant string-matching. Journal of Discrete Algorithms, 3(2-4):267-292, 2005.
20. Anany Levitin. Introduction to The Design & Analysis of Algorithms. Pearson Education Inc., Addison-Wesley, 2003.
21. Andreas Nilsson. Data Structures for Bandwidth Reservation and Quality of Service on the Internet. Lic. thesis, Department of Computer Science and Electrical Engineering, Luleå University of Technology, Luleå, Sweden, April 2004.
22. W. Paul and J. Simon. Decision trees and random access machines. In Proc. Int'l. Symp. on Logic and Algorithmic, pages 331-340, Zurich, 1980.
23. Rajeev Raman, Venkatesh Raman, and S. Srinivasa Rao. Succinct dynamic data structure. In Algorithms and Data Structures, 7th International Workshop, volume 2125 of Lecture Notes in Computer Science, pages 426-437. Springer, 8-10 August 2001.
24. Mirek Riedewald, Divyakant Agrawal, and Amr El Abbadi. Flexible data cubes for online aggregation. In Database Theory - ICDT 2001, 8th International Conference, London, UK, January 4-6, 2001, Proceedings, volume 1973 of Lecture Notes in Computer Science, pages 159-173, 2001.
25. Mirek Riedewald, Divyakant Agrawal, Amr El Abbadi, and Renato Pajarola. Space-efficient data cubes for dynamic environments. In Proceedings of the International Conference on Data Warehousing and Knowledge Discovery (DaWaK), pages 24-33, 2000.
26. L. G. Valiant. General purpose parallel architectures. In van Leeuwen [28], chapter 18, pages 943-971.
27. Peter van Emde Boas. Machine models and simulations. In van Leeuwen [28], chapter 1, pages 3-66.
28. Jan van Leeuwen, editor. Handbook of Theoretical Computer Science, volume A: Algorithms and Complexity. Elsevier/MIT Press, Amsterdam, 1990.
29. Andrew C. Yao. On the complexity of maintaining partial sums. SIAM Journal on Computing, 14(2):277-288, May 1985.

An Algorithm to Reduce the Communication Traffic for Multi-Word Searches in a Distributed Hash Table

Yuichi Sei (1), Kazutaka Matsuzaki (2), and Shinichi Honiden (3)

(1) The University of Tokyo, Information Science and Technology, Computer Science Department, Tokyo, Japan, sei@nii.ac.jp
(2) The University of Tokyo, Information Science and Technology, Computer Science Department, Tokyo, Japan, matsuzaki@nii.ac.jp
(3) National Institute of Informatics, Tokyo, Japan, honiden@nii.ac.jp

Abstract. In distributed hash tables, much communication traffic comes from multi-word searches. The aim of this work is to reduce the amount of traffic by using a bloom filter, which is a space-efficient probabilistic data structure used to test whether or not an element is a member of a set. However, bloom filters have a limited role if several sets have different numbers of elements. In the proposed method, extra data storage is generated when contents' keys are registered in a distributed hash table system. Accordingly, we propose a "divided bloom filter" to solve the problem of a normal bloom filter. Using the divided bloom filter, we aim to reduce both the amount of communication traffic and the amount of data storage.

Please use the following format when citing this chapter: Sei, Y., Matsuzaki, K., Honiden, S., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 115-129.

1 Introduction

Peer-to-peer systems are distributed networks that can share contents or services without the need for a central server. The first peer-to-peer systems, such as Napster [5] and Gnutella [1], lacked scalability. Distributed hash table (DHT) systems such as Chord [19], CAN [15], and Pastry [17] aim to overcome this challenge. The DHT provides storage and retrieval by using a hash function. When a node participates in the DHT system, it is given a range of hash values for which it is responsible. Then the node finds the hash value of the key¹ of the content it has. It then sends [h(key), the content ID, its address] to any node participating in the DHT. The message is forwarded from node to node until it gets to the node responsible for h(key). Once this has been done, the contents can be found by any user; the user needs only to again hash a key to h(key) and ask any node to find the data corresponding with h(key).

¹ We call the hash value of x "h(x)".

In full-text searching, each node stores the posting list for the word(s) it is responsible for. A query involving multiple words requires that the postings for one or more of the words be sent over the network. For simplicity, this discussion will assume a two-word query. Sending the smaller of the two postings to the node holding the larger posting list is cheaper; the latter node then performs the intersection and ranking and returns the few highest-ranking document identifiers.

According to [13], analysis of 81,000 queries made to a search engine for mit.edu [4] shows that the average query would move 300,000 bytes of postings across the network. Of the queries analyzed, 40% involved just one word, 35% two, and 25% three or more. Google indexes more than 3 billion Web documents [2], and mit.edu has 1.7 million Web pages; scaling to the size of the Web (3 billion pages) suggests that the average query might require 530 MB. If the Internet bandwidth of users is 1 Gbps, and users want to get a reply to their query within 0.5 seconds, for example, the amount of traffic must be less than 0.5 Gb (12.1% of 530 MB).

The normal process of searching for multi-word text in a DHT system is shown schematically in Figure 1 and Table 1-SA. We call this method the simple algorithm (SA). The example in the figure and table represents the case of searching for two words, "W1" and "W2". Usually, the transmission from a node to a destination node needs other intermediary nodes; however, in this paper, we omit the intermediary nodes. In the case of SA, a huge amount of traffic occurs when the node responsible for h(W1) transmits content IDs to the node responsible for h(W2).

To reduce this traffic, in the related works we will introduce in Section 2, two main types of measures are taken: using a device for (1) registering contents' keys, or (2) transmitting content IDs. We suggest using a divided bloom filter (DBF), as well as using both devices (1) and (2). First, as regards measure (1), we reduce the amount of traffic in searching for multi-word contents by using a bloom filter ([8], [9]) when a node registers its contents' keys. In addition, as regards measure (2), we reduce the amount of traffic by transmitting the DBF of content IDs in place of the content IDs themselves.

2 Related Work

The bloom filter is used in this paper and in related works with the aim of reducing the amount of traffic in searching for multi-word text in DHT systems. We describe this filter below.

2.1 Bloom filter

A bloom filter is a space-efficient probabilistic data structure used to test whether or not an element is a member of a set. A basic description of a bloom filter and its problem are given in this subsection.

[Figure 1: diagram of the message flow between the user, the node responsible for h(W1), and the node responsible for h(W2). 1. The user wants contents which contain "W1" and "W2". 2. Calculation of h(W1), h(W2). 3. h(W1), h(W2), and the user address are sent to the node responsible for h(W1). 4. All content IDs which include "W1", h(W2), and the user address are sent to the node responsible for h(W2). 5. Extraction of the intersection of the received IDs and the saved IDs. 6. All content IDs which include both "W1" and "W2" are returned.]

Fig. 1. The process of the simple algorithm: normal searching for multi-word text (here, a user wants contents which contain the two words "W1" and "W2") on a DHT

Basic description of Bloom Filter

Imagine there are set $A$ and set $B$. To get $A \cap B$ in a simple manner, all the elements of set $A$ are transmitted to the side of set $B$, and the elements existing in both set $A$ and set $B$ are extracted. At this time, the size of the traffic is the sum of the sizes of the elements in set $A$. In the method using the bloom filter, set $A$ itself is not transmitted; the bloom filter created from set $A$ is transmitted. The size of the bloom filter is less than the whole size of set $A$, so the amount of traffic is reduced. The side of set $B$ that received the bloom filter can create $S_B$ satisfying $S_B \supseteq A \cap B$ and $S_B \subseteq B$.

If membership in $A \cap B$ is tested against $S_B$, some false positives (an element that is not a member of $A \cap B$ being returned) occur, but false negatives (an element that is a member of $A \cap B$ not being returned) cannot occur. The false positive rate declines exponentially as the size of the bloom filter is increased. Set $S_B$ created by the side of set $B$ is transmitted to the side of set $A$, and $A \cap B$ is obtained.

The execution procedure for the bloom filter is as follows. The idea is to allocate a vector $v$ of $m$ bits, initially all set to 0, and then choose $k$ independent hash functions, $h_1, h_2, \ldots, h_k$, each with range $1, \ldots, m$. For each element $a \in A$, the bits at positions $h_1(a), h_2(a), \ldots, h_k(a)$ in $v$ are set to 1. (A particular bit might be set to 1 multiple times.) Given a query for $b$, we check the bits at positions $h_1(b), h_2(b), \ldots, h_k(b)$. If any one of them is 0, certainly $b$ is not in set $A$. Otherwise, we conjecture that $b$ is in the set, although there is a certain probability that this is incorrect. This is called a "false positive". Parameters $k$ and $m$ should be chosen such that the probability of a false positive (and hence a false hit) is acceptable.
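The procedure above, and the two-step exchange between the sides of set $A$ and set $B$, can be sketched in Python. This is a minimal sketch under assumptions: the $k$ hash functions are derived from SHA-256 with a salt (the paper does not specify them), bit positions run from 0 to $m-1$ rather than from 1 to $m$, and the names `BloomFilter` and `intersect_via_filter` are hypothetical.

```python
import hashlib


class BloomFilter:
    """Bit vector of m bits with k salted hash functions (an assumed construction)."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = 0  # all m bits initially 0

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        # False means "certainly absent"; True may be a false positive.
        return all((self.bits >> p) & 1 for p in self._positions(item))


def intersect_via_filter(set_a, set_b, m, k):
    """Side A sends a filter, side B returns candidates S_B, side A finishes."""
    bf = BloomFilter(m, k)
    for a in set_a:
        bf.add(a)
    s_b = {b for b in set_b if b in bf}    # superset of A ∩ B, subset of B
    result = {x for x in s_b if x in set_a}  # false positives removed on side A
    return s_b, result
```

Because the filter has no false negatives, the final result equals $A \cap B$ exactly; only the size of $S_B$ depends on the false positive rate.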

| Step | Simple algorithm (SA) | Transmission fixed-size bloom filter algorithm (TfBFA) | Saving fixed-size bloom filter algorithm (SfBFA) | Saving and transmission divided bloom filter algorithm (STDBFA) |
|---|---|---|---|---|
| The contents data N(Wi) contains | Tuple of [h(Wi), content ID, node address] | Same as SA | Tuple of [h(Wi), content ID, node address, fBF] | Tuple of [h(Wi), content ID, node address, DBF] |
| Execution of UN | Calculation of the DHT hash values h(W1) and h(W2) | Same as SA | Same as SA | Same as SA |
| Transmission from UN to N(W1) | h(W1), h(W2), UN address | Same as SA | Same as SA | Same as SA |
| Execution of N(W1) | | Creation of a fBF from the saved IDs | Extraction of IDs that have possibilities of containing W2 by using the saved fBFs | Extraction of IDs that have possibilities of containing W2 by using the saved DBFs, and creation of a DBF from the extracted IDs |
| Transmission from N(W1) to N(W2) | h(W2), saved IDs, UN address | h(W2), the fBF | The rest is same as SA | h(W2), the DBF |
| Execution of N(W2) | Extraction of the intersection of the received IDs and the saved IDs | Extraction of IDs that have possibilities of being the constituent element of the fBF N(W2) received, from the IDs registered with h(W2) | | Extraction of IDs that have possibilities of being the constituent element of the DBF N(W2) received, from the IDs registered with h(W2) |
| Transmission from N(W2) to UN | Extracted IDs [Finished] | X | | X |
| Transmission from N(W2) to N(W1) | | Extracted IDs | | The rest is same as TfBFA |
| Execution of N(W1) | | Extraction of the intersection of the received IDs and the saved IDs | | |
| Transmission from N(W1) to UN | | Extracted IDs [Finished] | | |

UN: user node. N(Wi): a node responsible for h(Wi).

Table 1. The sequence of searching for multi-word text (here, a user wants contents which contain the two words "W1" and "W2")

The false positive rate (FPR) is a function of $k$, $m$, and $n$, expressed as follows [9].

$$FPR = \left(1 - (1 - 1/m)^{kn}\right)^{k} \qquad (1)$$

$$\approx \left(1 - e^{-kn/m}\right)^{k} \qquad (2)$$

When $k = \ln 2 \times m/n$, Equation (2) has a minimum value. At that time, FPR is $(1/2)^k$. If the target FPR is set to $FPR_{target}$, $k = \lfloor \log_{1/2} FPR_{target} \rfloor$. Thus,

$$m = \left\lfloor \lfloor \log_{1/2} FPR_{target} \rfloor \times n / \ln 2 \right\rfloor \qquad (3)$$
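Equations (1) to (3) can be evaluated numerically with a short Python sketch. It is an illustration under assumptions: the roundings in $k$ and $m$ are taken to be floors, $\log_{1/2}$ is computed as $-\log_2$, and the function names are hypothetical.

```python
import math


def optimal_parameters(n, fpr_target):
    """Return (k, m) for n elements and a target false positive rate."""
    k = math.floor(-math.log2(fpr_target))   # log base 1/2 of the target
    m = math.floor(k * n / math.log(2))      # Equation (3)
    return k, m


def fpr_exact(k, m, n):
    """Equation (1)."""
    return (1 - (1 - 1 / m) ** (k * n)) ** k


def fpr_approx(k, m, n):
    """Equation (2)."""
    return (1 - math.exp(-k * n / m)) ** k
```

For $n = 100$ and $FPR_{target} = 1/8$ this gives $k = 3$ and $m = 432$, and both formulas then evaluate to a rate close to the target.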

The salient feature of bloom filters is that there is a clear tradeoff between $m$ and the FPR.

Problem with the Bloom Filter

If $n$ (the number of elements of a set) and $FPR_{target}$ are given, the filter bit size $m$ can be minimized by setting parameter $k$ to its optimum value. This $m$ value should be shared at the system level. This is because if $m$ is different for different filters, the hash functions differ for checking whether or not a given element is a constituent element of the filter. It is thus necessary to re-calculate the hash value of each element per query. We call the bloom filters for which the sizes are the same "fixed-size bloom filters (fBFs)", and we call the bloom filters for which the sizes are different "variable-size bloom filters (vBFs)". We should use fixed-size BFs in order to avoid calculating many hash values. However, if the numbers of elements of the sets are different, it is a problem that the filter bit size of fBFs is bigger than that of vBFs on average [18]. This is because the FPR increases exponentially as the number of elements of the set increases under the condition that the filter bit size does not change. In summary, if we use fixed-size BFs, the FPR is higher on average than for variable-size BFs of the same size. If we use variable-size BFs, calculating hash values takes much time. This comparison is further described in 3.2.

2.2 Reducing the amount of traffic in searching for multi-word in DHT

Several studies have been done to reduce the communication traffic in searching for multi-word text in DHT. Two main developments have come from this research. The first development is a device for registering content keys; the second is a device for transmitting content IDs.

In the first approach, in [11], the set of keywords included in the content was also regarded as a DHT key. The authors created combinations with three words or less, and registered the combinations as well as each word in the DHT. However, the number of combinations increases exponentially as the number of words increases. In [10], the target for search is a Resource Description Framework [7] (RDF). A system that saves "RDF triples" dispersed in DHT was developed. In this system, the RDF triple itself as well as each element of the RDF triple is registered. Because each RDF triple has only three elements, this method prevented much extra data storage. However, the method cannot be applied to full-text searching because the contents have many elements and the extra amount of data storage becomes massive². In [20], a summary of content is registered as DHT keys. Because doing so reduces the number of keys, the amount of traffic in searching for multi-word text was reduced. However, the amount of information was also reduced by summing up content, so this approach cannot be applied to the full-text searching we are addressing.

As a second approach, described in [21], [16], and [13], the fixed-size bloom filter is used for transmitting content IDs in searching for multi-word text. By doing so, the amount of traffic could be reduced without generating any AndSearchData. From here on, we call this method a "transmission fixed-size bloom filter algorithm" (TfBFA). The specific process using TfBFA is shown in Table 1-TfBFA. In this process, node N(W2), having received the fixed-size bloom filter from node N(W1), transmits the content IDs it extracted to node N(W1) so as to cut off content IDs accidentally included owing to false-positive results. The advantage of the method using a bloom filter for transmitting content IDs is that there is no AndSearchData; however, a disadvantage of the method is that the reduction rate of the communication traffic is smaller than that in the first approach.

3 Proposed Technique

The related works used a bloom filter for transmitting content IDs, but we also use it for registering the keywords of content. In this section, the problem of the bloom filter and its solution are also described.

3.1 Saving fixed-size bloom filter algorithm (SfBFA)

We developed a device for registering contents' keys. When a node registers its content, it creates a fixed-size bloom filter from all words of the content. Then it registers the filter as well as the hash value of the word to be registered, the content ID, and its address. The specific process for registering contents is as follows.

1. The node calculates the hash values of all words of the content except for word "W1" to be registered.
2. The node creates a fixed-size bloom filter from all hash values it calculated in step (1).
3. The node registers the tuple of h(W1), the content ID, the node address, and the fixed-size bloom filter it created in (2) in the node assigned h(W1).

We call this method a "saving fixed-size bloom filter algorithm" (SfBFA). The process for searching for two words is shown in Table 1-SfBFA.

² We call the extra data storage for reducing the amount of traffic "AndSearchData".
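The three registration steps, and the extraction that N(W1) performs during a two-word search, can be sketched in Python. This is an illustrative sketch under assumptions: the filter size `M_BITS`, the number of hash functions `K_HASHES`, the SHA-based hash constructions, and all function names are hypothetical, and the filter is built directly from the words rather than from precomputed hash values.

```python
import hashlib

M_BITS = 256     # filter bit size shared system-wide (hypothetical value)
K_HASHES = 4     # number of filter hash functions (hypothetical value)


def positions(word):
    for i in range(K_HASHES):
        digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % M_BITS


def make_filter(words):
    bits = 0
    for w in words:
        for p in positions(w):
            bits |= 1 << p
    return bits


def may_contain(bits, word):
    return all((bits >> p) & 1 for p in positions(word))


def dht_hash(word):
    """Stand-in for the DHT hash function h (an assumption)."""
    return int.from_bytes(hashlib.sha1(word.encode()).digest(), "big")


def registration_tuples(content_id, words, node_address):
    """One tuple [h(W), content ID, node address, fBF] per word W of the content."""
    unique = set(words)
    tuples = []
    for w in unique:
        fbf = make_filter(unique - {w})   # steps 1 and 2: all words except W
        tuples.append((dht_hash(w), content_id, node_address, fbf))  # step 3
    return tuples


def candidates_at_n_w1(saved, w1, w2):
    """IDs saved under h(w1) whose filter may contain w2 (false positives possible)."""
    key = dht_hash(w1)
    return [cid for (h, cid, addr, fbf) in saved
            if h == key and may_contain(fbf, w2)]
```

In a real DHT each tuple would be stored at the node responsible for its hash value; here all tuples are kept in one list for brevity.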


Problem of SfBFA

As described in subsection 2.1, the optimum filter bit size depends on the number of elements in the set. In this study, the number of elements in the set is the number of words in the content. Because the numbers of words in the contents are different, setting the optimum filter bit size becomes a problem. The $k$ hash functions used in creating the filter should be shared on the DHT system level, so the size of the filter should also be shared on the system level. The filter bit size can be set to be big enough, but the amount of AndSearchData and traffic will be increased. On the contrary, if the filter bit size is set too small, the amount of traffic will also be increased because of the rise in FPR. We do not use variable-size bloom filters because doing so would mean taking too much time to calculate hash values.

3.2 Divided bloom filter (DBF)

We propose divided bloom filters to overcome the problem of bloom filters. Each filter bit size can be kept the same by dividing the set into several sets that have the same number of elements and by creating filters from each set. We call filters created by dividing the original set "divided bloom filters" (DBFs). According to Equation 3, $m$ is proportional to $n$. For this reason, if the FPR of the bloom filter from the original set is $a$, the FPR of each filter of the DBFs is also $a$. However, the following problem occurs. When an element $b$ is checked as to whether or not it is a constituent element of the DBF, if it is checked against every divided filter and the number of divided filters is $GN$,

$$FPR = 1 - (1 - a)^{GN} \qquad (4)$$

If $a$ is sufficiently small, powers of $a$ of two or more can be ignored, so

$$FPR \approx GN \times a \qquad (5)$$

According to this equation, FPR increases as the number of divisions increases. The solution is to identify the only filter that can include element $b$. By this, FPR is equal to $a$ in total. The only filter that can include element $b$ can be identified by using a DHT hash function without creating extra data storage. When the node divides the set of words in the content, the node calculates the DHT hash value of each word of the content and divides the words into groups according to the DHT hash value. In doing so, the system determines the following parameters in advance.

- $MN$: average number of words each group can include
- Filter bit size and hash functions used to create filters

The specific process to divide the words of content $C$ is as follows. The value that the DHT hash function can return is $1, 2, \ldots, DN - 1$.

Fig. 2. Average FPR of 1,000 sets (number of elements is from 1 to 1,000). Horizontal axis: FPR_target; series: fixed-size BF, variable-size BF, and DBF.

1. The node calculates the number of groups GN = ⌊WN/MN + 0.5⌋ depending on WN, i.e., the number of words of content C.
2. The node gives each group G_i (i = 1, ..., GN) the assigned range of values R(G_i) = [(DN/GN) × (i - 1), (DN/GN) × i).
3. The node extracts a word of content C, considers it as w, and calculates the DHT hash value h(w).
4. If R(G_j) includes h(w), then w is grouped in G_j.
5. The node repeats steps (3) to (4) for all words of content C.

In this method, it is not guaranteed that each group has the same number of words. However, if the hash function is collision-free, it is assumed that each group has almost the same number of words. Whether a word b is a member of the words of the content C is determined as follows.

1. The node that received the DBF calculates the assigned range of values R(G_i) of each group G_i according to the number of filters it received.
2. The node calculates the DHT hash value h(b) of word b.
3. The node determines the R(G_j) that includes h(b). At this point, b can be a member of group G_j only.
4. The node judges whether word b can be a constituent element of the filter created from group G_j.

Comparison of fBF, vBF, and DBF

Let us compare the following features of fixed-size BFs, variable-size BFs, and DBFs:

1. the average FPR when creating filters from several sets that have different numbers of elements, and
2. the time complexity of checking whether an element is a member of the filter.
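The grouping and membership-check procedures of Section 3.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the filter size M_BITS, the number of hash functions K, the way the k bit positions are derived (salted SHA-1), and the zero-based hash range are all hypothetical choices made here.

```python
import hashlib
import math

DN = 2 ** 160   # size of the SHA-1 value space, assumed zero-based here
M_BITS = 1024   # hypothetical system-wide filter size in bits
K = 7           # hypothetical system-wide number of hash functions


def dht_hash(word: str) -> int:
    """DHT hash value h(w), here SHA-1 interpreted as an integer."""
    return int.from_bytes(hashlib.sha1(word.encode("utf-8")).digest(), "big")


def group_count(wn: int, mn: int) -> int:
    """GN = floor(WN/MN + 0.5), with at least one group."""
    return max(1, math.floor(wn / mn + 0.5))


def group_index(word: str, gn: int) -> int:
    """Zero-based j such that h(w) lies in [(DN/GN) * j, (DN/GN) * (j + 1))."""
    return dht_hash(word) * gn // DN


def bit_positions(word: str) -> list[int]:
    """K bit positions for a word; the salting scheme is an assumption."""
    return [
        int.from_bytes(hashlib.sha1(f"{i}:{word}".encode("utf-8")).digest(), "big") % M_BITS
        for i in range(K)
    ]


def build_dbf(words: set[str], mn: int) -> list[int]:
    """Create one fixed-size filter per group; each filter is an int used as a bit set."""
    gn = group_count(len(words), mn)
    filters = [0] * gn
    for w in words:
        j = group_index(w, gn)
        for pos in bit_positions(w):
            filters[j] |= 1 << pos
    return filters


def may_contain(filters: list[int], word: str) -> bool:
    """Check only the single filter whose range includes h(word)."""
    gn = len(filters)  # the receiver infers GN from the number of filters
    f = filters[group_index(word, gn)]
    return all((f >> pos) & 1 for pos in bit_positions(word))
```

Because the receiver recomputes the group index from the number of filters alone, no extra index data has to be stored or sent with the DBF.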

Fig. 3. Required time for checking whether an element is a member of each filter (number of filters is 1,000,000). Horizontal axis: FPR_target; series: fixed-size BF, variable-size BF, and DBF.

1: Figure 2 shows the average FPR of 1,000 sets for each filter method (fixed-size BF, variable-size BF, and DBF). The number of elements in the sets ranges from 1 to 1,000. The filter size was determined by FPR_target¹. We changed FPR_target from 1/2 to 1/2^19. MN for the DBFs was set to 100. We found that, as FPR_target becomes small, the actual FPR of fixed-size BFs becomes much larger than FPR_target and that of the DBFs becomes slightly larger than FPR_target.

2: Figure 3 shows the simulation result for the time required to check whether an element is a member of a set. For each method (fixed-size BF, variable-size BF, and DBF) we created 1,000,000 filters in which the number of elements is 100, and we set FPR_target = 0.1, 0.01, 0.001, and 0.0001. We created an element b randomly and measured the time required to determine whether b was a member of each filter. For fixed-size BFs and DBFs, according to Figure 3, the required times do not vary with FPR_target. For variable-size BFs, we had to recalculate the k hash values for each filter; hence, the required time was very long. For DBFs, the required time was much less than that of variable-size BFs and close to that of fixed-size BFs.

3.3 Saving divided bloom filter algorithm (SDBFA)

We call the method where the node registers a DBF as well as its content ID, its address, and the hash value of the key a "saving divided bloom filter algorithm" (SDBFA). If SDBFA is used, the approximate minimum length of the filter satisfying the target FPR can be obtained even if different contents have different numbers of words.

¹ That is, we set the filter size to the size of variable-size BFs whose FPR is FPR_target.
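To make the filter sizing for SDBFA concrete, the sketch below uses the standard Bloom filter sizing rule m = -n ln(p) / (ln 2)^2. We assume, without being able to confirm it from this section, that this rule corresponds to Equation 3 of the paper; the function names are hypothetical.

```python
import math


def filter_bits(n: int, fpr_target: float) -> int:
    """Smallest filter size m (bits) for n elements under m = -n ln(p) / (ln 2)^2."""
    return math.ceil(-n * math.log(fpr_target) / (math.log(2) ** 2))


def hash_count(m: int, n: int) -> int:
    """Number of hash functions k = (m / n) ln 2, rounded to the nearest integer."""
    return max(1, round(m / n * math.log(2)))


def dbf_total_bits(wn: int, mn: int, fpr_target: float) -> int:
    """Storage for one content: GN filters, each sized for MN words."""
    gn = max(1, math.floor(wn / mn + 0.5))
    return gn * filter_bits(mn, fpr_target)
```

Under this rule the per-filter size depends only on MN and FPR_target, so it can be fixed system-wide, while the total storage per content grows with the number of groups GN.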


3.4 Saving and transmission divided bloom filter algorithm (STDBFA)

An SDBFA can also adopt the method of using DBFs for transmitting content IDs. Doing so keeps the amount of AndSearchData the same while decreasing the FPR. We call the algorithm that combines SDBFA with the method using DBFs for transmitting content IDs a "saving and transmission divided bloom filter algorithm" (STDBFA). The process of searching for multi-word text is shown in Table 1-STDBFA.

4 Experiment and Evaluation

We performed experiments with the "simple algorithm" (SA), "transmission fixed-size bloom filter algorithm" (TfBFA), "saving fixed-size bloom filter algorithm" (SfBFA), "saving divided bloom filter algorithm" (SDBFA), and "saving and transmission divided bloom filter algorithm" (STDBFA). SA does not take any action to reduce the amount of traffic, TfBFA was used in our previous work, and SfBFA, SDBFA, and STDBFA are the new methods proposed in this paper. We measured the average amount of traffic for these five algorithms. In addition, we compared the amount of AndSearchData needed for each algorithm.

4.1 Experimental setup

As we described in Section 1, the aim of the experiment is to limit the amount of traffic in searching for multi-word text (i.e., to 12.1% of SA; 64 MB of data from 530 MB of data). We prepared 10,000 published papers as contents for the experiment. When we extracted the words of a content, we used the vocabulary database WordNet [6] and extracted the nouns, verbs, and adjectives included in the content. The virtual user selected two words and searched for contents containing both words. The general hash function SHA-1 [12] was used as the DHT hash function. Because SHA-1 returns a 160-bit value, the content ID has 160 bits.

The amount of traffic generated by TfBFA and STDBFA, which use a fixed-size bloom filter or a DBF in transmitting content IDs, is calculated as follows. In Table 1, in the case of TfBFA and STDBFA, the total amount of traffic is the sum of the amount of traffic node N(W1) transmits to node N(W2) and the amount of traffic node N(W2) transmits to node N(W1). On the other hand, in the case of SA, SfBFA, and SDBFA, the amount of traffic is only the amount that node N(W1) transmits to node N(W2). In this experiment, the average amount of traffic over 1,000 trials with the simple algorithm was 2.97 KB.


Fig. 4. Average amount of traffic using TfBFA compared with that using SA

4.2 Experimental results

The searches were repeated 1,000 times. From here on, we call the FPR_target of filters created from content IDs "FPR_ids" and the FPR_target of filters created from words included in the contents "FPR_words". In the experiment on TfBFA, we set FPR_ids to 0.4, 0.2, 0.1, 0.01, and 0.001. Figure 4 shows the result. The amount of traffic relative to that of SA was 0.26, 0.17, 0.16, 0.19, and 0.24, respectively. If FPR_ids is small, the filter that node N(W1) transmits to node N(W2) in Table 1-TfBFA becomes bigger. Conversely, if FPR_ids is large, the number of content IDs that node N(W2) transmits to node N(W1) becomes larger.

For SfBFA, we set FPR_words to 0.4, 0.2, 0.1, 0.01, and 0.001 (the extreme right points of Figure 5-Left and Figure 5-Right). Figure 5-Left shows the amount of traffic involved in searching for multi-word text, and Figure 5-Right shows the amount of AndSearchData in registering one content to the nodes. For SDBFA, FPR_words was set to the same values as in the experiments with SfBFA, and MN was set to 10, 20, 50, and 100 (Figure 5, except for the extreme right points). In Figure 5-Left, SDBFA (which uses DBFs) can be seen to have reduced the amount of traffic more than SfBFA (which uses a normal bloom filter). As shown in Figure 5-Right, the amount of AndSearchData with the method using a normal bloom filter is not so different from that with the method using DBFs. When FPR_words = 0.1, the goal of 12.1% of the traffic of SA was achieved by using DBFs. Figure 5-Right shows that the average amount of AndSearchData per content was the same as that with SfBFA. The amount of AndSearchData with STDBFA is the same as that with SDBFA.

For STDBFA, FPR_words was set to 0.1 and MN to 10 for registering contents' keys, and FPR_ids was set to 0.1 and MN to 2, 5, 10, 20, and 50 for the transmission of content IDs (Figure 6). In Figure 6, the condition MN = 20 can be seen to have reduced the amount of traffic the most.
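The trade-off observed for TfBFA, where traffic first falls and then rises again as FPR_ids shrinks, can be illustrated with a simple cost model. The sketch below is only an assumption-laden illustration: it sizes the filter with the standard rule m = -n ln(p) / (ln 2)^2, counts 160 bits per returned content ID, and uses hypothetical set sizes; it does not reproduce the measured values.

```python
import math

ID_BITS = 160  # SHA-1 content ID length, as in the experiment


def filter_bits(n: int, fpr: float) -> int:
    """Standard Bloom filter size in bits for n elements and a target FPR."""
    return math.ceil(-n * math.log(fpr) / (math.log(2) ** 2))


def tfbfa_traffic_bits(n1: int, n2: int, common: int, fpr_ids: float) -> float:
    """Expected two-way traffic in bits under the simplified model.

    n1, n2: number of content IDs held by N(W1) and N(W2) (hypothetical inputs)
    common: number of content IDs that contain both words
    """
    forward = filter_bits(n1, fpr_ids)            # N(W1) -> N(W2): filter of content IDs
    false_positives = fpr_ids * (n2 - common)     # IDs that pass the filter by mistake
    backward = (common + false_positives) * ID_BITS  # N(W2) -> N(W1): candidate IDs
    return forward + backward


if __name__ == "__main__":
    for p in (0.4, 0.1, 0.001, 1e-6):
        print(p, tfbfa_traffic_bits(1000, 1000, 50, p))
```

In this model a very small FPR_ids inflates the forward filter, while a large FPR_ids inflates the list of returned IDs, so the total is minimized somewhere in between.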

Fig. 5. Left: Average amounts of traffic using SfBFA and SDBFA; Right: Amounts of AndSearchData using SfBFA, SDBFA, and STDBFA. Horizontal axis of both panels: MN (10, 20, 50, 100 for SDBFA; NotDivided for SfBFA); series: FPR_words = 0.4, 0.2, 0.1, 0.01, and 0.001.

Fig. 6. Amount of traffic using STDBFA

We also examined the effect of changing the number of contents from 1,000 to 10,000 (Figure 7). For TfBFA, the amount of traffic changed significantly. For SDBFA and STDBFA, however, the change in the amount of traffic was consistently small. Furthermore, the amount of AndSearchData was less than that of SfBFA. Table 2 is a compilation of the results for all algorithms. The values are averages over the change in the number of contents from 1,000 to 10,000. In the case of searching for multi-word text, TfBFA, which was used in conventional research, needed 23.7% of the traffic of SA. However, SfBFA (which registers fixed-size bloom filters created from all words of the content) reduces the amount of traffic more than TfBFA. In addition, compared to SfBFA, SDBFA and STDBFA (which use the proposed DBF rather than a normal bloom filter) reduce both the amount of traffic and the amount of data storage.

Fig. 7. Amount of traffic with change in the number of contents. Horizontal axis: number of contents [x1000]; series: TfBFA, SfBFA (FPR_words = 0.001), SDBFA (FPR_words = 0.01), and STDBFA (FPR_words = 0.01).

Table 2. Comparison of the results for all algorithms

Algorithm       Amount of traffic compared with SA   Amount of data storage per content [KB]
TfBFA           0.237                                -
SfBFA           0.144                                1095
Desired value   0.121                                -
SDBFA           0.072                                730
STDBFA          0.059                                730

4.3 Discussion

In this work, we set the target as text documents, but we believe that the proposed techniques (SDBFA and STDBFA) can apply to multimedia contents like movies or music. In that case, the keys for the DHT are the texts inserted in multimedia contents by languages that describe metadata (like MPEG-7 [3]). If metadata could be embedded into multimedia contents automatically, contents would have much metadata. If a DHT system for these multimedia contents were constructed, the amount of traffic generated in searching for multi-word text would grow larger. However, we believe that our proposed method would also be able to reduce the amount of traffic in such a system.

Some DHT algorithms taking mobility and wireless environments into account have been developed (e.g., M-CAN [14] and Warp [22]). Compared to traditional P2P, the characteristics of mobile P2P (MP2P) include unreliable connections, limited bandwidth, and the constraints of mobile devices. Hence, we believe that our proposed method can apply even better to these DHTs.

Note that in the experiments in this work, the virtual user queried random words. However, experiments should also be performed with a user model created from real DHT systems or from the histories of real search engines.


Furthermore, we only evaluated two-word multiple searching. Three-word multiple searching should be conducted as follows. Let the three words be "W1", "W2", and "W3", and the node responsible for h(W1) be N(W1). With SDBFA and STDBFA, node N(W1) extracts only the content IDs that can include W3 as well as W2; therefore, in these cases, we predict that the amount of traffic would be decreased compared to that of two-word multiple searching.

5 Conclusion

We aimed to reduce the amount of traffic for multi-word searches in DHTs. First, as a device for registering contents' keys, we used a bloom filter created from all words of the content. This method requires some extra data storage in exchange for reducing the amount of traffic. We proposed a divided bloom filter (DBF) to overcome the limitations of the bloom filter when several sets have different numbers of elements. We used the DBF to reduce the amount of extra data storage as well as the amount of traffic. Second, as a device for transmitting the content IDs, a method by which the node transmits not the content IDs themselves but DBFs of them was effective in reducing the amount of traffic. With the saving divided bloom filter algorithm (SDBFA) and the saving and transmission divided bloom filter algorithm (STDBFA) proposed in this paper, we obtained favorable results for both the amount of traffic in searching for multi-word text and the amount of data storage.

References

1. Gnutella, http://gnutella.wego.com/.
2. Google, http://google.com/.
3. ISO/IEC TR 15938-8:2002: Information technology, multimedia content description interface, part 8: Extraction and use of MPEG-7 descriptions. ISO/IEC JTC 1/SC 29, 2002.
4. Massachusetts Institute of Technology, http://mit.edu/.
5. Napster, http://www.napster.com/.
6. WordNet, http://wordnet.princeton.edu/.
7. World-Wide Web Consortium: Resource Description Framework, http://www.w3.org/rdf.
8. Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors. Commun. ACM, 13(7):422-426, 1970.
9. A. Broder and M. Mitzenmacher. Network applications of bloom filters: A survey. In Proceedings of 40th Annual Allerton Conference on Communication, Control, and Computing, pages 636-646, 2002.
10. Min Cai and Martin Frank. RDFPeers: a scalable distributed RDF repository based on a structured peer-to-peer network. In WWW '04: Proceedings of the 13th international conference on World Wide Web, pages 650-657, New York, NY, USA, 2004. ACM Press.
11. Austin T. Clements, Dan R. K. Ports, and David R. Karger. Arpeggio: Metadata searching and content sharing with Chord.
12. D. Eastlake 3rd and P. Jones. US Secure Hash Algorithm 1 (SHA1). RFC 3174, September 2001.
13. J. Li, B. Loo, J. Hellerstein, F. Kaashoek, D. Karger, and R. Morris. On the feasibility of peer-to-peer web indexing and search, 2003.
14. Gang Peng, Shanping Li, Hairong Jin, and Tianchi Ma. M-CAN: a lookup protocol for mobile peer-to-peer environment. In ISPAN, pages 544-550, 2004.
15. Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, and Scott Shenker. A scalable content-addressable network. In Proceedings of the ACM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, pages 161-172, August 2001.
16. Patrick Reynolds and Amin Vahdat. Efficient peer-to-peer keyword searching. In Middleware, pages 21-40, 2003.
17. Antony I. T. Rowstron and Peter Druschel. Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility. In Symposium on Operating Systems Principles, pages 188-201, 2001.
18. Michael A. Shepherd, William J. Phillips, and C.-K. Chu. A fixed-size bloom filter for searching textual documents. Comput. J., 32(3):212-219, 1989.
19. Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable Peer-To-Peer lookup service for internet applications. In Proceedings of the 2001 ACM SIGCOMM Conference, pages 149-160, 2001.
20. Chunqiang Tang, Zhichen Xu, and Sandhya Dwarkadas. Peer-to-peer information retrieval using self-organizing semantic overlay networks. In SIGCOMM, pages 175-186, 2003.
21. Jiangong Zhang and Torsten Suel. Efficient query evaluation on large textual collections in a peer-to-peer environment. In Peer-to-Peer Computing, pages 225-233, 2005.
22. Ben Y. Zhao, Ling Huang, Anthony D. Joseph, and John Kubiatowicz. Rapid mobility via type indirection. In IPTPS, pages 64-74, 2004.

Exploring an Unknown Graph to Locate a Black Hole Using Tokens

Stefan Dobrev¹, Paola Flocchini¹, Rastislav Kralovic¹,²,*, and Nicola Santoro³

¹ SITE, University of Ottawa, {sdobrev,flocchin}@site.uottawa.ca
² Dept. of Computer Science, Comenius University, kralovic@dcs.fmph.uniba.sk
³ School of Computer Science, Carleton University, santoro@scs.carleton.ca

Abstract. Consider a team of (one or more) mobile agents operating in a graph G. Unaware of the graph topology and starting from the same node, the team must explore the graph. This problem, known as graph exploration, was initially formulated by Shannon in 1951, and has been extensively studied since under a variety of conditions. The existing investigations have all assumed that the network is safe for the agents, and the solutions presented in the literature succeed in their task only under this assumption. Recently, the exploration problem has been examined also when the network is unsafe. The danger examined is the presence in the network of a black hole, a node that disposes of any incoming agent without leaving any observable trace of this destruction. The goal is for at least one agent to survive and to have all the surviving agents to construct a map of the network, indicating the edges leading to the black hole. This variant of the problem is also known as black hole search. This problem has been investigated assuming powerful inter-agent communication mechanisms: whiteboards at all nodes. Indeed, in this model, the black hole search problem can be solved with a minimal team size and performing a polynomial number of moves. In this paper, we consider a less powerful token model. We constructively prove that the black hole search problem can be solved also in this model; furthermore, this can be done using a minimal team size and performing a polynomial number of moves. Our algorithm works even if the agents are asynchronous and if both the agents and the nodes are anonymous.

* Partially supported by grant VEGA 1/3106/06.

Please use the following format when citing this chapter: Dobrev, S., Flocchini, P., Kralovic, R., Santoro, N., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 131-150.

1 Introduction

1.1 The Problem

The problem of exploring an unknown graph using a team of one or more mobile agents (or robots) is a classical fundamental problem that has been extensively studied since its initial formulation in 1951 by Shannon [19]. It requires the agents, starting from the same node, to visit within finite time all the sites of a graph whose topology is unknown to them. Different instances of the problem exist depending on whether or not the agents are required to eventually stop the exploration; and, if so, whether or not they must construct an accurate map of the network. Further differences exist depending on a variety of factors, including the (a)synchrony of the agents, the presence of distinct agent identifiers, the amount of memory, the coordination and communication tools available to the agents, etc. (e.g., see [1, 2, 3, 4, 6, 7, 13, 14, 15, 18]). Notice that, except for trees, the exploration with stop of anonymous graphs is possible only if the agents are allowed to mark the nodes in some way; various methods of marking nodes have been used by different authors, ranging from the weak model of tokens to the most powerful model of whiteboards.

The solutions proposed in the literature succeed in their task only assuming that the network is safe for the agents. This assumption unfortunately does not always hold in real systems and networks; for example, a node could contain a local program (virus) that harms the visiting agents, or the network could contain failed nodes that might damage incoming agents. In fact, protecting an agent from "host attacks" (i.e., harmful network sites) has become a pressing security concern (e.g., see [17, 20]).

Recently the exploration problem has been examined also when the network is unsafe [5, 8, 9, 10, 11, 16]. The danger considered is the presence in the network of a black hole (BH), a node that disposes of any incoming agent without leaving any observable trace of this destruction. Note that such a dangerous presence is not uncommon; in fact, any undetectable crash failure of a site in an asynchronous network transforms that site into a black hole. In spite of this severe danger, the goal is for the team of agents to be able to explore the network and, within finite time, discover the location of the BH.
More precisely, at least one agent must survive, and any surviving agent must have constructed a map of the network indicating the edges leading to the BH. This version of the exploration problem is called black hole search (BHS). It is known that, for its solution, the number of nodes of the network must be known to the agents [9]; furthermore, if the graph is unknown, at least Δ + 1 agents are needed, where Δ is the maximum node degree in the graph [10]. In the case of asynchronous agents in an unknown network, termination with an exact complete map in finite time is actually impossible; in fact, regardless of the protocol, a surviving agent upon termination can be wrong on Δ - deg(BH) links, where deg(x) denotes the degree of node x [10]. Hence, in the case of asynchronous agents, BHS requires termination by the surviving agents within finite time and creation of a map with just that level of accuracy.

The problem of asynchronous agents exploring a dangerous graph has been investigated assuming powerful inter-agent communication mechanisms: whiteboards at all nodes. In the whiteboard model, each node has available a local storage area (the whiteboard) accessible in fair mutual exclusion to all incoming agents; upon gaining access, the agent can write messages on the whiteboard and can read all previously written messages. This mechanism can be used by the agents to communicate and mark nodes and/or edges, and has been employed e.g. in [6, 8, 9, 10, 11, 13, 14]. In the whiteboard model, the black hole search problem can be solved with a minimal team size and performing a polynomial number of moves (e.g., [8, 9, 10, 11]).

The problem of exploring a dangerous graph has never been investigated in the less powerful token model, which is instead commonly employed in the exploration of safe graphs. In the classical token model, each agent has available a token that can be carried, placed in the center of a node, or removed from it. All tokens are identical (i.e., indistinguishable) and no other form of marking or communication is available. In our variation (enhanced token model) we allow tokens to be placed also on a node in correspondence to a port. Notice that the classical token model can be implemented with 1-bit whiteboards, while our variation is not as weak; in fact, it could be implemented by having a log d-whiteboard on a node with degree d.

The principal question targeted by our research was the impact of the communication model on the solvability and complexity of the BHS problem: to what extent can the whiteboard model be weakened while still allowing polynomial solvability of BHS? With this goal in mind, we examine the problem of performing black hole search in the enhanced token model. Several immediate computational and complexity questions naturally arise. In particular, are the weaker communication and marking capabilities provided by enhanced tokens sufficient to solve the problem? If so, how can the problem be solved, and at what cost? In this paper we provide definite answers to these questions.

1.2 Our Results

In this paper we present an algorithm that works in the token model and solves the BHS problem with the minimal number of agents and with a polynomial number of moves. Our algorithm works even if the agents are asynchronous, and if both the agents and the nodes are anonymous.
More precisely, we consider an unknown, arbitrary, anonymous network and a team of exploring agents starting their identical algorithm from the same node (home-base). The agents are anonymous and move from node to neighboring node asynchronously (i.e., it takes a finite but unpredictable time to traverse a link). Each agent has available an indistinguishable token (or pebble) that can be placed on, or removed from, a node; on a node, the token can be placed either in the center or on an incident link. In our algorithm there are never two tokens placed on the same location (node center or port), nor does an agent ever carry more than one token. Using only this tool for marking nodes and communicating information, we show that with Δ + 1 agents the exploration can be successfully completed. In fact, we present an algorithm that allows at least one agent to survive and, within finite time, the surviving agents will know the location of the black hole with the allowed level of accuracy. The number of moves performed by the agents when executing the proposed protocol is shown to be polynomial. The proposed algorithm is rather complex.


This work is the first that addresses the problem of exploration of a dangerous unknown graph using tokens. Our results indicate that, perhaps contrary to expectation, our variation of the token model is computationally as powerful as the whiteboard one with regards to black hole search.

topology             communication   # of agents   # of moves
arbitrary, unknown   whiteboard      Δ+1           Θ(N²)
arbitrary, known     whiteboard      Δ+1           Θ(N log N)
arbitrary, unknown   tokens          Δ+1           O(Δ² M² N⁷)

Fig. 1. Existing and new results for the BHS problem.

1.3 Related Work

The research on safe exploration of unknown graphs was started in 1951 by Shannon [19]. Most of the work since has concentrated on exploration by a single agent (e.g., [2, 7, 18]). Safe explorations by multiple agents were initially studied for a team of finite automata; more recently the investigations have focused on collaborative exploration by Turing machines. An exploration algorithm for directed graphs that employs two agents was given in [3], whereas algorithms for exploration by more agents were given by Frederickson et al. for arbitrary graphs [15], by Averbakh and Berman for weighted trees [1], and more recently by Fraigniaud et al. for trees [13]. To explore arbitrary anonymous graphs, various methods of marking nodes have been used by different authors. Bender et al. [2] proposed the method of dropping a token on a node to mark it and showed that any strongly connected directed graph can be explored using just one token if the size of the graph is known, and using Θ(log log N) tokens otherwise. Dudek et al. [12] used a set of distinct markers to explore unlabeled undirected graphs. Yet another approach, used by Bender and Slonim [3], was to employ two cooperating agents, one of which would stand on a node, thus marking it, while the other explores new edges. In Fraigniaud et al. [13, 14], marking is achieved by accessing whiteboards located at nodes, and their strategy explores directed graphs and trees.

The explorations of unsafe graphs are quite recent and have focused mostly on asynchronous environments. The BHS problem has been studied when the network is an anonymous ring, characterizing the limits and determining optimal solutions [9]. When the network is an arbitrary graph the problem has been investigated in [10], and several tight bounds have been established, depending on the level of topological knowledge available to the agents. For example, when the network is arbitrary, the topology unknown, and no form of consistent edge labeling is present, Δ + 1 agents are necessary and Θ(N²) moves are required in the worst case. Improved bounds on the number of moves have later been obtained in the case the agents have a complete map of the network (but not the location of the BH) [11]. In the case of specific graphs, including many important interconnection networks, the number of moves can be reduced to linear [8]. In all these investigations, the nodes of the network have available a whiteboard, i.e., a local storage area that the agents can use to communicate information. Access to the whiteboard is gained in mutual exclusion and the capacity of the whiteboard is always assumed to be at least Ω(log N) bits. In synchronous environments, the investigations have produced optimal solutions for trees [5]; approximation results have been obtained for arbitrary graphs in [5, 16].

2 The Model

The network G = (V, E) is a simple undirected graph with node-connectivity two or higher; let N = |V| and M = |E| be the number of nodes and of edges of G, respectively, let d(x) denote the degree of x, and let Δ denote the maximum degree in G. If (x, y) ∈ E then x and y are said to be neighbors. The nodes of G are anonymous (i.e., without unique names). At each node x there is a distinct label (called port number) associated to each of its incident links (or ports). Without loss of generality, we assume that the labels at x ∈ V are the consecutive integers #1, #2, ..., #d(x).

Operating in G is a team of Δ + 1 anonymous agents. The agents know the number of nodes of the network, can move from a node to a neighboring node in G, and have computing capabilities and a limited amount of memory (O(M log N) bits suffice for our algorithm). We also assume that the agents know Δ, which bounds the degree of the BH. Each agent has a token that can be placed on a node and removed from it; tokens are identical and their placement can be used to mark nodes and ports/links. More precisely, a node can be marked by a token in different modalities: in the center, or in correspondence of one of the incident ports. The agents obey the same set of behavioral rules (the "algorithm") and, initially, they are all located at the same node h, called the home-base. The agents can be seen as automata, where one computational step of an agent A in a node v is defined as follows. Based on the state (local memory) of A and on the presence of tokens at v and its incident links (examined atomically):

- change the state (local memory of A),
- remove (or place) at most one token from v or an incident link, and
- start waiting (for a token to disappear) or leave v via one of the incident links.

The computational steps are atomic and mutually exclusive, i.e., no more than one agent computes in the same node at the same time. The links satisfy the FIFO property, i.e., the agents entering a link e = (u, v) at u will arrive at v and execute their computational steps in the same order they entered e. The agents are asynchronous in the sense that waiting (for a token to disappear) and traversing a link can take an unpredictable (but finite) amount of time.
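The token-placement rules of the enhanced token model can be sketched as a small data structure. This is only an illustration under our own assumptions: the class and method names, and the encoding of the node center as location 0 with ports numbered 1 to d, are hypothetical and not part of the paper.

```python
from dataclasses import dataclass, field

CENTER = 0  # assumed encoding: 0 is the node center, 1..degree are the ports


@dataclass
class Node:
    degree: int
    tokens: set[int] = field(default_factory=set)  # locations currently holding a token

    def place(self, location: int) -> None:
        """Place a token; at most one token may occupy a location."""
        if not 0 <= location <= self.degree:
            raise ValueError("no such location on this node")
        if location in self.tokens:
            raise ValueError("location already holds a token")
        self.tokens.add(location)

    def remove(self, location: int) -> None:
        """Remove the token from a location."""
        if location not in self.tokens:
            raise ValueError("no token at this location")
        self.tokens.discard(location)


@dataclass
class Agent:
    has_token: bool = True  # each agent starts with exactly one token

    def drop(self, node: Node, location: int) -> None:
        if not self.has_token:
            raise ValueError("agent carries no token")
        node.place(location)
        self.has_token = False

    def pick_up(self, node: Node, location: int) -> None:
        if self.has_token:
            raise ValueError("agent never carries more than one token")
        node.remove(location)
        self.has_token = True
```

For example, an agent calling `drop(node, 2)` marks port #2 of that node, and a second agent attempting the same placement is rejected, mirroring the rule that two tokens never share a location.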


The network contains a black hole (BH) that destroys any incoming agent without leaving any trace of that destruction. The goal of a black hole search algorithm P is to identify the location of the BH; that is, within finite time, at least one agent must terminate, and all the surviving agents must construct a map of the entire graph where the home-base, the current position of the agent, and the location of the black hole are indicated. Note that termination with an exact map in finite time is actually impossible. In fact, since an agent is destroyed upon arriving at the BH, no surviving agent can discover the port numbers of the black hole. Hence, the map will have to miss such information. More importantly, the agents are asynchronous and do not know the actual degree d(BH) of the black hole (just that it is at most Δ). Hence, if an agent has a local map that contains N - 1 vertices and at most Δ unexplored edges, it cannot distinguish between the case when all unexplored ports lead to the black hole and the case when some of them are connected to each other; this ambiguity cannot be resolved in finite time nor without agents being destroyed. In other words, if we require termination within finite time, an agent might incorrectly label some links as incident to the BH; however, the agents need to be wrong only on at most Δ - d(BH) links. Hence, we require from a solution algorithm P termination by the surviving agents within finite time and creation of a map with just that level of accuracy. The complexity measures of a solution protocol are the number of agents used, called the size of the team, and the total number of moves performed by the agents during the execution, called the cost.
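The accuracy requirement can be illustrated with two small helper functions. They are a sketch under our own naming assumptions and simply restate the condition and the error bound given in the text.

```python
def may_terminate(known_vertices: int, unexplored_edges: int, n: int, delta: int) -> bool:
    """An agent may stop once its map holds N - 1 vertices and at most Δ unexplored edges."""
    return known_vertices == n - 1 and unexplored_edges <= delta


def max_mislabeled_links(delta: int, bh_degree: int) -> int:
    """Upper bound Δ - d(BH) on links wrongly marked as incident to the BH."""
    return delta - bh_degree
```

For instance, with N = 10 and Δ = 4, an agent that knows 9 vertices and sees 3 unexplored edges may terminate; if the black hole actually has degree 2, at most 2 of the links it reports may be mislabeled.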

3 The Solution

3.1 Overview

In our algorithm, each agent constructs its own local map (quasi-)independently from the other agents until it enters the BH or explores at least N - 1 vertices and M - Δ edges. In the beginning, the local map of each agent contains only the home-base. During the computation, the communication ports in the graph are classified by each agent as follows:

- unexplored port/edge - not in the local map; the port is not marked by a token
- dangerous port - not in the local map; the port is marked by a token
- safe edge - in the local map; connecting two already explored vertices
- quasi-safe edge - in the local map; connecting two already explored vertices, but could be wrong

Throughout the execution, whenever an agent leaves via a port that might lead to the BH, it leaves its token there, marking the port as dangerous. The algorithm requires that no agent enters a dangerous port, ensuring in this way that at most Δ agents enter the black hole. We will thus say that a dangerous port blocks the (other) agents. Initially, all ports incident to the home-base are unexplored. The local map of an agent is constructed by adding edges sequentially according to Algorithm 1.

Algorithm 1

    loop
        traverse the local map and look for an unexplored port p
        if unexplored port p found then
            EXPLORE(p)
            continue the main loop
        else
            if local map contains N - 1 vertices and there are at most Δ outgoing edges then
                TERMINATE
            else
                SUSPEND
            end if
        end if
    end loop

The search for an unexplored port is straightforward: any traversal of the explored part using only the edges identified as safe in the local map will do. In the execution of EXPLORE(p), the agent explores the edge incident to port p, determines whether it leads to a new node or to an already discovered one, and updates the local map. (The port p might lead to the BH as well, in which case the agent disappears there and does not continue the algorithm.) Due to the complex interaction of anonymity with asynchrony, in some cases the agent might be unsure whether an edge leads to a new node or to an already visited one. However, the agent is able to recognize this uncertainty, and will add this edge to the local map as quasi-safe instead of safe.

Eventually, no unexplored port is found. If N - 1 nodes have been visited, the remaining node is the BH and the algorithm can terminate. Otherwise, the access to the unexplored part of the graph is blocked by dangerous ports. Since G is two-connected, at least one of those ports does not lead to the BH, and the token will eventually be removed from it, making it unexplored. In order to avoid live-lock, the agent that failed to find an unexplored port suspends itself using procedure SUSPEND until such progress has been made. The basic idea of SUSPEND is to go to the home-base, set a flag there (by using a token) indicating that an agent is waiting for wake-up, verify that no progress has been made before the flag was set, and then wait to be woken up. Complementarily, whenever an agent removes its token from an edge, it goes to the home-base and wakes up the agents waiting there (using procedure WAKE-UP). There are several technical issues to be dealt with (discussed in the detailed description): several agents might be executing SUSPEND and WAKE-UP simultaneously, the flag can only be implemented using tokens, and these procedures interfere with the rest of the algorithm.

3.2 Detailed Description

In this section we give the full description of the algorithm. The following three rules clarify some terms used in the description:

R1 "cautious step" in a vertex v over a link l = put a token on link l, traverse the link, return to v, take the token, perform WAKE-UP, return to v and traverse l
R2 "put token in the home-base" = wait for all known safe links incident to home-base to become unmarked, then put the token
R3 "put token on a link" (in vertex v) = wait for v to become empty, then put the token

The nodes at the other ends of the links #1 and #2 from the home-base are called storerooms (SR) and they play a special role in the algorithm (as we will see, they are employed to allow communication among the agents when they are temporarily suspended, looking for a new port to explore). Each agent starts the algorithm by exploring (using cautious steps) SR1 and SR2 from the home-base (in this order). Since the graph is simple, the two links lead to distinct nodes, so at least one of them is safe; if both these links are dangerous, the agent simply waits until one of the blocking tokens disappears. Eventually, each agent will know about one or two safe storerooms. The primary storeroom for an agent is defined as the storeroom known to be safe with the lower numbered link leading to it. Note that if the BH is located in one of the storerooms, all surviving agents will choose the other SR as their primary SR.
However, if none of the SRs contains the BH, there might be agents with different primary SRs (some might find SR1 safe and choose it, some might find it temporarily dangerous and select SR2).

As this might lead to problems, the algorithm tries to remedy this situation by "updating" the primary storeroom of agents that had originally selected SR2 and later discover that SR1 has become safe in the meantime. The update rule is called R4 and will be described later.
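The choice of the primary storeroom and the trigger for rule R4 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `primary_storeroom` and `needs_r4_update` are hypothetical, and an agent's knowledge is modelled as a plain set of storeroom link numbers it currently believes to be safe.

```python
from typing import Optional, Set


def primary_storeroom(known_safe: Set[int]) -> Optional[int]:
    """Return the link number (1 or 2) of the primary storeroom.

    The primary storeroom is the one known to be safe whose link
    from the home-base has the lower number; None if none is known yet.
    """
    candidates = known_safe & {1, 2}
    return min(candidates) if candidates else None


def needs_r4_update(old_known: Set[int], new_known: Set[int]) -> bool:
    """True exactly when the agent first learns that both storerooms are safe."""
    both = {1, 2}
    return both <= new_known and not both <= old_known


# An agent that found SR1 temporarily dangerous starts with SR2 as primary.
print(primary_storeroom({2}))          # 2
# Once it learns that SR1 is safe as well, R4 applies and SR1 becomes primary.
print(needs_r4_update({2}, {1, 2}))    # True
print(primary_storeroom({1, 2}))       # 1
```

The sketch ignores how the agent learns that a storeroom is safe (cautious steps and waiting on tokens); it only captures the selection rule itself.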

Explore

The execution by agent A of procedure EXPLORE(p) enables A to traverse an unexplored edge e = (u, v) (starting at port p in u) and add it (possibly with the vertex v) to the local map. Agent A starts by executing a cautious step over the edge e and, if it survives, proceeds to determine whether or not e leads to a new (not in the local map) vertex.


Notice that recognizing whether v is already in the local map would be an easy task if either the agents were able to recognize their own tokens, or they were able to recognize the home-base. Indeed, if agents were able to recognize their tokens, then A could simply put its token at v and scan the explored subgraph: if it finds its token, v is already explored; otherwise it is a new node. If the agents were able to recognize the home-base, then A could determine whether v is a new node as follows. For each node w in the local map, A guesses that v = w and verifies whether that is really true: let α be a sequence of port labels specifying a safe path (determined by looking in the local map) from w to the home-base. Starting from v, A follows the port labels specified by α (a cautious walk needs to be used, as v might be different from w, and α from v might lead to the BH). If A finishes in the home-base, then v = w; otherwise A makes another guess. If all guesses fail, v is a new node.

However, in our model the agents can recognize neither their tokens nor the home-base. Still, the basic structure is to guess, for every already explored node w, whether v = w and to verify the guess, although the verification is much more involved. Let β_w (we will use β when w is clear from the context) be a sequence of port labels starting with the label of the port from u to v and then following a path (using only edges marked as safe in the agent's map) from w through the primary SR and ending in u. Clearly, if v = w then β specifies a simple cycle in the graph (and therefore |β| < N, even if actually v ≠ w). Agent A verifies whether v = w by following the labels specified by a cyclic repetition of β (we will call it β*) for up to N^2 edges or until A finds a difference between what it sees in the current node and what it should see (according to its map) if v = w. The number of steps is chosen large enough so that following β* creates a cycle even if v ≠ w (as we will see later, using only β is not enough).

This means (as will be proven later) that if no discrepancy has been found for N^2 steps, u and v indeed lie on a cycle C passing through the correct SR, with the labels specified by β*. Unfortunately, it is still possible that, although no discrepancy is found, v ≠ w: this could happen if |C| is a multiple of |β| larger than |β| itself. In this case the agent verifies whether v = w or not in the procedure VERIFY, which will be described later.

The N^2 steps along β* must be done in a cautious manner, not entering dangerous ports, since it may be the case that v ≠ w and β* leads to the BH. The cautious walk is complicated by the fact that a port to be taken (let its label be λ) from a node w' might be dangerous. If this happens, the agent cannot afford to wait in w' until the token is removed, because this edge might indeed lead to the BH. Instead, it wants to ensure that, if v = w, then the token will be removed, allowing A to continue its cautious walk through λ. To do so, A goes backwards for |β| steps, reaching a safe node through safe links; this node might indeed be w' (this happens if the guess v = w is correct), or it could be a different node w''. Agent A waits here until there is no token on the port labelled λ. Although not sure about the identity of the node, the agent knows that λ must lead to a safe node (A is now revisiting nodes it has visited earlier); thus the token will eventually be removed from there. After ensuring the removal of the token, agent A returns to w'. It can happen that the port λ is still dangerous. However, if v = w then this must be a newly placed token. Since (as we will see later) during the whole execution of the algorithm a token is placed on a given port less than 2ΔMN^3 times, if after 2ΔMN^3 cleaning tries A is still blocked, then v ≠ w. Algorithm 2 describes the procedure EXPLORE in full detail.

Algorithm 2 Exploring an edge with label l1 by EXPLORE

     1: do a cautious step over link l1, let l2 := label of the link upon which you arrived
     2: for all w in local map do
     3:     compute the sequence β
     4:     for N^2 steps do
     5:         while next port γ in β* is dangerous and this loop has been executed less than 2ΔMN^3 times do
     6:             go back |β| steps
     7:             wait until there is no token on the edge along which you arrived
     8:             go forwards |β| steps
     9:         end while
    10:         if port γ is still dangerous then
    11:             backtrack your steps to v and continue the outermost for cycle for the next w
    12:         end if
    13:         do a cautious step
    14:         if what you see in the vertex you arrived to is not compatible with the local map assuming v = w then
    15:             backtrack your steps to v and continue the outermost for cycle for the next w
    16:         end if
    17:     end for
    18:     if VERIFY then    // after traversing N^2 edges there was no discrepancy, so I am in a cycle. Is it a short one?
    19:         add edge to w to the local map as quasi-safe
    20:         exit from EXPLORE
    21:     end if
    22: end for
    23: add to the local map the new vertex and edge; the added edge is marked as safe
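Why a walk along β* without discrepancies does not yet prove v = w can be seen on a small symmetric example. The following Python sketch is an illustration only and simplifies the model: the graph is given as a hypothetical dictionary mapping each node to its outgoing ports, β lists outgoing port labels only (arrival ports are ignored), and tokens and the black hole are left out. With N nodes and |β| = k there are only N·k distinct (node, position) pairs, so a repeat must occur within N·k < N^2 steps.

```python
from typing import Dict, List, Tuple

# node -> {outgoing port label: neighbour}; a simplified port-labelled graph
Graph = Dict[int, Dict[int, int]]


def walk_cyclic(graph: Graph, start: int, beta: List[int],
                max_steps: int) -> Tuple[int, int]:
    """Follow beta repeated cyclically from `start`.

    Returns (first_step, cycle_length), where first_step is the step at which
    the repeated (node, position-in-beta) state was first seen.
    """
    seen: Dict[Tuple[int, int], int] = {}
    node = start
    for step in range(max_steps + 1):
        state = (node, step % len(beta))
        if state in seen:
            return seen[state], step - seen[state]
        seen[state] = step
        node = graph[node][beta[step % len(beta)]]
    raise RuntimeError("no repeated state within max_steps")


# A ring of 4 nodes: port 0 goes clockwise, port 1 counter-clockwise.
ring: Graph = {i: {0: (i + 1) % 4, 1: (i - 1) % 4} for i in range(4)}

beta = [0, 0]  # a guessed sequence of length 2
first, length = walk_cyclic(ring, 0, beta, max_steps=len(ring) * len(beta))
print(first, length)  # 0 4
```

Here the walk closes a cycle of length 4, a multiple of |β| = 2, yet after a single pass over β the agent stands at node 2 rather than at its starting node. Every node of the ring looks the same, so no discrepancy is ever observed; this is the situation that VERIFY has to resolve.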

Notice that, during the actual exploration, tokens are placed on links only. Thus, a token found on a link is a clear sign of danger. As we will soon see, both in the verification process (described below) and in the suspension process (described later) tokens are instead placed in (and removed from) the home-base and the storerooms. In other words, the home-base and the storerooms are employed to accomplish different tasks, and this requires much care to avoid ambiguity and interference between different activities.


Verification

The test of a candidate vertex w in the procedure EXPLORE may end, after traversing the sequence β* for N^2 steps, in a situation where the agent knows that either β or a multiple of it forms a safe cycle connecting u and v. The procedure VERIFY is used to verify whether the cycle consists of just one repetition of β (in which case v = w).

Algorithm 3 VERIFY - let p be the SR, if the hypothesis about w is true

     1: PosCount = NegCount = 0
     2: loop
     3:     go to home-base, wait until it becomes empty, and go to the primary SR
     4:     if the SR is empty then
     5:         put token and exit loop
     6:     else
     7:         wait until the SR becomes empty
     8:     end if
     9: end loop
    10: while PosCount < 2ΔMN^3 + ΔMN and NegCount < 2ΔMN^3 + ΔMN do
    11:     if known, go to the other SR and wait until it becomes empty
    12:     go to the home-base, wait until it becomes empty
    13:     go to p
    14:     if there is a token then
    15:         PosCount = PosCount + 1
    16:     else
    17:         NegCount = NegCount + 1
    18:     end if
    19:     go to the primary SR and if empty update the knowledge of storerooms using rule R4 and restart algorithm
    20: end while
    21: take token
    22: if PosCount ≥ 2ΔMN^3 + ΔMN then
    23:     return TRUE
    24: else
    25:     return FALSE
    26: end if

The idea of VERIFY is to use a token in the primary SR for breaking symmetry on the β*-cycle. An agent A performing a VERIFY first makes sure that it is not interfering with any other agent by waiting until both the home-base and the SRs it knows to be safe are empty. It then puts its token in the primary SR and walks along the β*-cycle for |β| steps to a vertex w', where it checks whether there is a token. (Cautious steps are not needed here, as the cycle identified by β* has already been traversed and is known to be safe.) The idea is that if v = w then w' is the SR and contains the token, while if v ≠ w then w' should be empty, as it is not the correct SR.


Notice that a straightforward check on whether there is a token in w' can fail for two reasons. (1) It may happen that w' is not a SR but, say, the home-base. As mentioned above, the home-base is also used by the procedures SUSPEND and WAKE-UP, which are employed when an agent has not found a suitable port to explore and is waiting for one to become available. If some other agent has started to perform a SUSPEND (which requires putting a token in the home-base) while A traveled to w', A is deceived, since it finds a token in w' that is not the token it left in the SR. (2) It may happen that w' is indeed a SR but some other agent took the token from the SR in the meantime (when finishing SUSPEND); so A is again deceived, because it does not find its own token. Luckily, as will be shown later, each of these two cases occurs less than K times, where K = 2ΔMN^3 + ΔMN. Hence, if A saw a token in w' at least K times, then w' must be the SR; conversely, if A saw no token in w' at least K times, then w' is not the SR.

One last complication comes from the fact that, at the beginning of each iteration of the while cycle, A has to make sure that the home-base and the SRs are empty. The problem is that agents cannot always agree on one primary SR. In fact, (if the BH is not in a SR) there are three types of agents: some think that only SR1 is safe, others think that only SR2 is safe, while the third group knows that both SRs are safe. However, if an agent does not know that both SRs are safe, it cannot make sure that both of them are empty. In this case it may happen that the result of VERIFY is wrong. This is the reason why, when A decides that v = w, it marks the edge (u, v) as quasi-safe and never uses it for traversals. Note that if EXPLORE declares v to be a new vertex it never errs, so the spanning tree defined by the safe edges is always available for traversal.
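The counting argument behind the two counters of VERIFY can be sketched as follows. This is a minimal illustration, not the procedure itself: the function name `verify_vote` is hypothetical, the agent's repeated visits to w' are abstracted into a list of booleans, and the threshold K is passed in as a plain parameter `k`. If fewer than K observations are deceptive, the counter matching the truth is the first to reach K.

```python
from typing import Iterable


def verify_vote(observations: Iterable[bool], k: int) -> bool:
    """Count 'token seen' / 'no token seen' observations until one count reaches k.

    Returns True if the positive count reached k first, mirroring the
    PosCount / NegCount loop of VERIFY.
    """
    pos = neg = 0
    for saw_token in observations:
        if saw_token:
            pos += 1
        else:
            neg += 1
        if pos >= k or neg >= k:
            break
    return pos >= k


K = 5  # stands in for the much larger threshold used by the algorithm

# w' really is the SR, but the agent is deceived K - 1 times (token taken away).
print(verify_vote([False] * (K - 1) + [True] * K, K))   # True

# w' is not the SR, but a foreign token is seen there K - 1 times.
print(verify_vote([True] * (K - 1) + [False] * K, K))   # False
```

The order of the deceptive observations does not matter for the outcome: as long as there are fewer than K of them, the wrong counter can never reach the threshold.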
As we prove later, the only way for an agent A to find an empty SR on line 19 is if A does not know about (safe) SR1. This means that after seeing an empty SR, A can update its knowledge about the storerooms and reset the algorithm according to rule R4.

R4 = When an agent first realizes that both SRs are safe, it performs the following actions:
- If you have no token and your old primary SR is SR2, execute GRAB-TOKEN starting from SR2, else execute GRAB-TOKEN from the home-base.
- Update the knowledge about SRs.
- If you came to the home-base to perform WAKE-UP but have not done so, do it now.
- Restart the whole algorithm.

Grab-Token

The procedure GRAB-TOKEN is used by an agent A to pick up a token that it has previously put at the home-base or a SR. It might happen that some other agent B has meanwhile picked up A's token instead of its own. However, in such a case B's token must be somewhere around (in the home-base or in a SR) and A will take it (or the token of yet another agent).

Algorithm 4 GRAB-TOKEN - starts in home-base

    if there is a token in home-base, get it and exit GRAB-TOKEN
    go to primary SR, if there is a token there, get it and exit GRAB-TOKEN
    go to the home-base and if there is a token there, get it and exit GRAB-TOKEN
    go to the other SR and get token
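The visiting order of GRAB-TOKEN can be written down as a short Python sketch. It is an illustration under strong simplifying assumptions: the place names `HOME`, `SR1`, `SR2` and the function `grab_token` are hypothetical, the token configuration is a static set of places that currently hold a token, and no other agent moves tokens while the search runs. In the asynchronous setting tokens can move between visits, which is why the home-base is checked twice.

```python
from typing import Set

HOME, SR1, SR2 = "home-base", "SR1", "SR2"


def grab_token(tokens: Set[str], primary: str) -> str:
    """Visit home-base, primary SR, home-base, other SR and take the first token found.

    `tokens` is the set of places currently holding a token; it is updated
    in place. Returns the place where the token was picked up.
    """
    other = SR2 if primary == SR1 else SR1
    for place in (HOME, primary, HOME, other):
        if place in tokens:
            tokens.remove(place)
            return place
    # Lemma 1 states that this cannot happen in the actual protocol.
    raise LookupError("no token found in home-base or storerooms")


places = {SR2}
print(grab_token(places, primary=SR1))  # SR2
print(places)                           # set()
```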

Suspend & Wake-Up

Recall that an agent A performs SUSPEND when further exploration progress is blocked by dangerous links, but A knows that eventually at least one of those links will become unblocked. The basic idea is to put the token in the home-base to signal "I want to be woken up", check whether progress has been made before the token was put down (to prevent deadlock, as an agent performing WAKE-UP after removing its token from a dangerous edge might have arrived at the home-base before the token was put there) and, if not, wait until the token disappears. An agent performing WAKE-UP simply moves a token from the home-base (if there is any) to its primary SR.

The problems arise because several agents might be executing SUSPEND, WAKE-UP and VERIFY simultaneously, and because the agents do not necessarily agree on the correct SR. Dealing with that constitutes the most technical part of the algorithm. The basic idea is to wait until any ongoing activity (detected by a non-empty home-base or SR) appears to have finished and then restart SUSPEND. Still, there are many possible ways in which the agents can steal each other's tokens and/or misinterpret what is going on. The reasons behind the design of SUSPEND and WAKE-UP will become fully apparent only when reading the formal proofs in the next section.

The idea of WAKE-UP is to wake up an agent suspended at the home-base by moving its token to a SR. In order to make GRAB-TOKEN work, the waking-up agent first places its own token in the SR and then removes the token from the home-base. If the home-base is empty or the SR is full, WAKE-UP does nothing, because either there is nobody suspended, or the suspended agent has already been woken up and just has to pick up its token. When an agent suspended at the home-base sees that its token has disappeared, it will search around and find its token (using GRAB-TOKEN).
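The net effect of WAKE-UP on the token configuration can be sketched in Python. As before, this is a simplified, sequential illustration: `HOME`, `SR1`, `SR2` and `wake_up` are hypothetical names, tokens are modelled as a set of occupied places, and the interleaving with other agents (the hard part of the algorithm) is not modelled.

```python
from typing import Set

HOME, SR1, SR2 = "home-base", "SR1", "SR2"


def wake_up(tokens: Set[str], correct_sr: str) -> bool:
    """Move the 'please wake me' flag from the home-base to the storeroom.

    Returns True if a suspended agent was woken up, False if WAKE-UP aborted.
    """
    if HOME not in tokens:
        return False          # home-base empty: nobody is suspended
    if correct_sr in tokens:
        return False          # SR full: the agent has already been woken up
    tokens.add(correct_sr)    # first put the waker's own token in the SR ...
    tokens.remove(HOME)       # ... then take the token found in the home-base
    return True


places = {HOME}
print(wake_up(places, SR1))  # True
print(places)                # {'SR1'}
```

Placing the token in the SR before removing the one in the home-base means that at no point both places are empty, which is what lets the suspended agent's later search for a token succeed.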

Algorithm 5 SUSPEND

    go to home-base, wait until it is empty and put a token there
    scan all known SRs and return to home-base
    if SRs were empty then
        traverse the local map
    else
        if there is a token in home-base then
            get token
            go to the SR that contained a token, wait until it becomes empty and restart SUSPEND
        else    // my token has been moved
            GRAB-TOKEN
            restart SUSPEND
        end if
    end if

    upon return from traversal
    if traversal revealed progress then
        GRAB-TOKEN
    else
        wait until home-base becomes empty
        GRAB-TOKEN
    end if

Algorithm 6 WAKE-UP

    go to home-base and if empty, abort
    go to "correct" SR
    if SR full then
        abort
    else
        put token
        go to home-base
        GRAB-TOKEN
    end if

4 Correctness and Complexity

Let us call an agent informed if its knowledge about which storerooms are safe is correct. If the BH is located in one of the storerooms, all agents (that have finished initialization) are informed; otherwise an informed agent knows that both storerooms are safe. However, the notion of an informed agent is for the purpose of the proof only. The agents themselves may not know whether they are informed or not.

The overall structure of the correctness proof, which is quite complicated, is the following: we first prove that during the whole algorithm at most Δ agents enter the BH, and all agents that are alive make progress by eventually exploring a new edge. Second, we prove that all informed agents maintain a correct local map, i.e. the local map of an informed agent is at any time isomorphic to some subgraph of the network (including port labels). The above arguments are formally carried out through a sequence of Claims and Lemmas, which will lead to the main Theorem:


Theorem 1 (Main Theorem). At least one agent successfully terminates with a correct map.

Due to the lack of space, we present only the key lemmas, omit some proofs, and only informally sketch some of the reasoning. Let us start with some basic observations. Since a token is put in a vertex only in SUSPEND, WAKE-UP or VERIFY, we get:

Claim 1. A token is in a vertex v only if v is the home-base or a SR.

The most technical part of the algorithm is the implementation of the communication between agents by means of tokens. We are specifically interested in agents who have put their token in the home-base or in a SR and are now without a token; we will call them empty-handed to distinguish them from agents who do not have a token because they are performing a cautious step. From the definition of cautious step, from Claim 1, and by construction we get:

Claim 2. There are as many empty-handed agents as tokens in the home-base and storerooms.

An agent performing procedure GRAB-TOKEN visits the home-base and possibly some SRs a constant number of times in search of a token. For the correctness of the algorithm it is important to prove that a token is always found.

Lemma 1. An agent always gets a token in procedure GRAB-TOKEN.

Proof. Consider, for the sake of contradiction, an agent A executing GRAB-TOKEN that has not found a token. Let t0 be the time when A sees that its primary SR x is empty and starts to travel back to the home-base. Let t1 > t0 be the time when A arrives at the home-base, finds it empty again, and starts to travel to SR y. By Claim 2, at time t0 there must be at least one token T in the home-base or SR y. However, since A does not find T, T must have disappeared after t0, before A gets there. The only way for T to disappear is if it is taken by some empty-handed agent B. However, since B is empty-handed, there must be another token T' in some vertex (home-base or SR) at the time when B grabs T. The idea is to argue about T and T' and show that A would find one of them. In particular, we first prove that at some point in time after t0 both the home-base and SR y are full, and then prove that from this fact it follows that A finds a token.

Let us focus on the time t' when B put T' and thus became empty-handed. We distinguish three cases. First, consider t' > t1. B could not have removed T from the home-base before time t1; therefore at time t1 (and at t' as well, as it is B that removes it) T must be in SR y. Since A started traveling from the home-base to SR y at time t1 < t' and due to the FIFO property, B cannot get to SR y before A and so A finds T in SR y - contradiction.


Next, let t' < t0. This means that at time t0 both A and B are empty-handed and, moreover, SR x is empty. Hence, due to Claim 2, at time t0 both the home-base and SR y are full. Third, let t0 < t' < t1. There are two possibilities: (1) B (at time t') put T' in SR x. Since t0 < t', due to the FIFO property B cannot take T from the home-base before A does - contradiction. (2) B (at time t') put T' somewhere else (home-base or SR y). In such a case, at time t' both the home-base and SR y are full, containing T and T': by assumption, B is the agent that takes T, therefore T did not move between t0 and t'.

Hence, it must be the case that the home-base and SR y are both full at some time t between t0 and t1. Since we suppose that A does not find a token, it must be that both tokens in the home-base and SR y disappear at some time after t. However, at time t0, SR x is empty, so at that time at most one agent other than A is empty-handed. Any agent that becomes empty-handed by putting a token in SR x after t0 cannot, due to FIFO, prevent A from grabbing a token. This means that only one of the tokens in the home-base and SR y can disappear after time t and before A arrives there, i.e. A will find a token - contradiction.

Lemma 2. A token is removed from a given link less than 2ΔMN^3 times.

Proof. A token is put on and removed from a link only during a cautious step. Cautious steps are performed only on line 13 in EXPLORE, which is executed less than N^3 times per call (at most N^2 iterations of the inner loop, for at most N - 1 candidate vertices). EXPLORE is called by each of the Δ agents at most M times. Finally, each agent might reset the algorithm once, applying rule R4, which accounts for the factor 2.

We now aim at proving that at most Δ agents disappear in the BH. In order to do so we need to show first that an agent can enter the BH only during a cautious step, i.e. that edges marked safe in the local map of an agent correspond to safe edges in the network. To do so, we use the following technical lemmas, whose proofs are omitted due to the lack of space.

Lemma 3. Consider a situation when both SRs are full and agents A and B are the only empty-handed agents. Then, before A or B grabs a token from the home-base or some SR, no agent other than A or B grabs a token from a SR.

Lemma 4. No token placed in SR1 will be stolen. Moreover, let A be an agent knowing that SR1 is safe that puts a token in the home-base. Then A's token will not be kicked out to SR2.

Lemma 5. A token put in a SR x by an informed agent A executing VERIFY can be removed from x only by A.

We can now prove the following:

Lemma 6. When an agent A adds a vertex v to its local map as a new vertex, then the local map indeed did not contain v.

Exploring an Unknown Graph to Locate a Black Hole Using Tokens

147

Proof. A vertex v is added as new only if the test w = v in EXPLORE failed for every candidate w. We show that if the test fails then indeed w ≠ v. The test for a given w can fail:
- By having the port γ still dangerous after executing the loop on lines 5..9 for 2ΔMN^3 times. However, if w = v, then between each iteration of that loop the port γ is cleared, which contradicts Lemma 2. Hence, w ≠ v.
- By noticing (in line 14) a difference between what the map says should be seen if w = v and what is really visible. Clearly, in such a case v ≠ w.
- By having VERIFY return FALSE. VERIFY returns FALSE if the agent A has not found the token in the vertex p (which is equal to its correct SR x if v = w) at least 2ΔMN^3 times. Note that A always leaves SR x with its token there. We distinguish two cases:
  (i) If x = SR1, the lemma follows from the second part of Lemma 4: no other agent steals A's token from SR1, so if v = w then A always sees a token in p and, consequently, VERIFY never returns FALSE.
  (ii) Let x = SR2. Which agent could remove A's token from SR2? From Lemmas 4 and 5 we know that A's token was not removed from SR2 by an agent B executing VERIFY from SR1, because in that case B's token remains in SR1. It cannot be the case that A's token was removed by an agent B executing VERIFY from SR2, because that agent would have first placed its token in SR2. Therefore, A's token was removed by an agent B executing a GRAB-TOKEN as a part of SUSPEND or WAKE-UP. However, for each removal of a token from a SR by an agent B executing SUSPEND there must have been a wake-up of some other agent that kicked out a token from the home-base to a SR (otherwise B would have picked up its token in the home-base). The only exceptions are the cases when an agent becomes informed and first takes a token from its old primary SR, which can happen at most Δ times.
The lemma follows from the fact that there are less than 2ΔMN^3 wake-ups.

Using the previous lemma, we can argue that an agent disappears in the BH only during a cautious step:

Lemma 7. If A enters the BH, the link e upon which it arrived is marked by its token.

Since no agent enters a link marked by a token and the degree of the BH is at most Δ, we get:

Theorem 2. At most Δ agents die.

The next lemmas are needed to show that no deadlock can occur, i.e. every agent is always able to continue its algorithm after some finite time. First, we prove that no deadlock occurs when an agent is waiting for the disappearance of a token:

Lemma 8. A token from the home-base eventually disappears.


Proof. The only way a token can be put in the home-base is in SUSPEND. Assume, for the sake of contradiction, that an agent A puts a token in the home-base at time t0 and that token never disappears. That means A went to its primary SR x and found it empty at time t1, then returned and went to rescan. We claim that if the token from the home-base does not disappear, then no token appears in SR x after time t1. An agent B executing VERIFY cannot place a token in SR x after t1: if B checked the home-base (line 3 of VERIFY) before t0, then A would have found its token in SR x; if it checked it after t0, it would wait in the home-base until it becomes empty. The only other possibility is that B is executing WAKE-UP and placed its token in SR x after t1. In such a case B would find A's token in the home-base (when executing GRAB-TOKEN) and take it. Contradiction.

Because A did not take its token after returning from rescan, it has seen no progress and did not terminate. This means (by Theorem 2 and from the two-connectivity of G) that it cannot be the case that all blocked links lead to the BH. Therefore one of them will eventually be freed and some agent B will execute WAKE-UP. If x = 1 (i.e. A's primary SR is SR1), B will execute WAKE-UP using SR1 (either because SR1 was its primary SR, or because of rule R4 - the link to SR1 is free due to rules R2 and R3) and, since SR1 is empty after time t1, it will indeed remove A's token from the home-base. Contradiction. If x = 2, there are two cases. If B's primary SR is SR2, the same argument as above applies. Otherwise SR1 does not contain the BH, and the link leading to it will eventually become free and, due to rule R3, remain so. That means A will eventually notice that SR1 is safe and apply rule R4, executing GRAB-TOKEN starting from SR2. Since SR2 remains empty after t1, A will pick up its token from the home-base. Contradiction.
In a similar fashion, we can show the following lemma, which is, due to space constraints, presented without proof:

Lemma 9. A token from a SR eventually disappears.

From the construction and Lemmas 8 and 9 we get:

Theorem 3. An agent never deadlocks.

The next two lemmas are crucial for bounding the number of moves. Due to space restrictions we present them without proofs.

Lemma 10. An agent spends O(ΔMN^4) moves in one call to VERIFY.

Lemma 11. An agent spends O(ΔMN^7) steps executing one iteration of the outer loop of Algorithm 1.

The last property we need for the proof of Theorem 1 is:


Lemma 12. Each informed agent has a correct map.

Proof. It follows from Lemma 6 that if an agent A adds a new vertex v to its map, then indeed v has not been in A's local map before. So it remains to be proven that if an informed agent A adds an edge (u, w) between two visited vertices to its map, then there is an edge (u, w) in the graph. Adding an edge (u, w) requires that the hypothesis v = w is tested in EXPLORE and VERIFY returns TRUE. We first prove that after successfully finishing N^2 iterations of the loop on line 4 in EXPLORE, the sequence β* defines a (not necessarily simple) cycle connecting v and u, whose length is a multiple of |β|. Let β = (β_1, β_2, ..., β_k), where each β_i specifies two port numbers: a consistent traversal must arrive via port p_1 and leave via port p_2. Since k < N, by traversing β* for N^2 steps it must happen that the agent visits a particular vertex q twice with the same position in the sequence β, say β_j. Clearly, from then on the agent walks in a cycle. Let q be the first such vertex. However, since β_j also specifies the arriving port number, the agent has both times arrived at q using the same port, i.e. it already started in the cycle.

To conclude, we prove that if VERIFY returns TRUE for some informed agent, it must be that the cycle formed by β* has length |β| and hence v = w. If VERIFY returns TRUE, it means that A saw a token in p at least 2ΔMN^3 times and between every two successive visits of p there was a time when the home-base was free and, if there are two storerooms, also a time when SR2 was free. If p was not SR1, it must be that either p is the home-base or p is SR2, and each of the 2ΔMN^3 times some agent put its token at p (which was removed before the next visit of A in p). We conclude the proof by showing that a token is put in p less than 2ΔMN^3 + ΔMN times.
There are two possible situations when an agent B could put its token to p: either B performs a V E R I F Y in S R 2 (there are at most AMN such cases: B must be a non-informed agent and it puts its token once per each call of VERIFY before getting informed), or B performs a S U S P E N D - W A K E - U P pair. However, in the latter case there must be a cautious step that triggers this W A K E - U P which, according to Lemma 2, accounts for another 2AMN^ possibilities. By Lemmas 1-12, the main theorem (Theorem 1) follows. Let us now consider the number of moves. By Lemmas 10,11 plus the fact that each of the A agents performs at most M iterations of the loop in Algorithm 1, we have T h e o r e m 4. The B H can he located using 0{A'^M'^N'^)

moves.

References

1. I. Averbakh and O. Berman. A heuristic with worst-case analysis for minimax routing of two traveling salesmen on a tree. Discr. Appl. Math., 68:17-32, 1996.
2. M. Bender, A. Fernandez, D. Ron, A. Sahai, and S. Vadhan. The power of a pebble: Exploring and mapping directed graphs. In Proc. 30th ACM Symp. on Theory of Computing (STOC'98), 269-287, 1998.
3. M. Bender and D. K. Slonim. The power of team exploration: two robots can learn unlabeled directed graphs. In Proc. 35th Symp. on Foundations of Computer Science (FOCS'94), 75-85, 1994.
4. M. Blum and D. Kozen. On the power of the compass (or, why mazes are easier to search than graphs). In 19th Symposium on Foundations of Computer Science (FOCS'78), 132-142, 1978.
5. J. Czyzowicz, D. Kowalski, E. Markou, and A. Pelc. Searching for a black hole in tree networks. In Proc. 8th International Conference on Principles of Distributed Systems (OPODIS 2004), 35-45, 2004.
6. S. Das, P. Flocchini, A. Nayak, and N. Santoro. Exploration and labelling of an unknown graph by multiple agents. In Proc. 12th Int. Coll. on Structural Information and Communication Complexity (SIROCCO'05), 99-114, 2005.
7. X. Deng and C. H. Papadimitriou. Exploring an unknown graph. Journal of Graph Theory, 32(3):265-297, 1999.
8. S. Dobrev, P. Flocchini, R. Kralovic, G. Prencipe, P. Ruzicka, and N. Santoro. Optimal search for a black hole in common interconnection networks. Networks, 47(2):61-71, 2006.
9. S. Dobrev, P. Flocchini, G. Prencipe, and N. Santoro. Mobile search for a black hole in an anonymous ring. Algorithmica. To appear.
10. S. Dobrev, P. Flocchini, G. Prencipe, and N. Santoro. Searching for a black hole in arbitrary networks: optimal mobile agents protocols. Distributed Computing. To appear.
11. S. Dobrev, P. Flocchini, and N. Santoro. Improved bounds for optimal black hole search in a network with a map. In Proc. of 10th Int. Coll. on Structural Information and Communication Complexity (SIROCCO'04), 111-122, 2004.
12. G. Dudek, M. Jenkin, E. Milios, and D. Wilkes. Robotic exploration as graph construction. Transactions on Robotics and Automation, 7(6):859-865, 1991.
13. P. Fraigniaud, L. Gasieniec, D. Kowalski, and A. Pelc. Collective tree exploration. In 6th Latin American Theoretical Informatics Symp. (LATIN'04), 141-151, 2004.
14. P. Fraigniaud and D. Ilcinkas. Digraph exploration with little memory. In 21st Symp. on Theoretical Aspects of Computer Science (STACS'04), 246-257, 2004.
15. G. N. Frederickson, M. S. Hecht, and C. E. Kim. Approximation algorithms for some routing problems. SIAM J. on Computing, 7:178-193, 1978.
16. R. Klasing, E. Markou, T. Radzik, and F. Sarracco. Hardness and approximation results for black hole search in arbitrary graphs. In Proc. 12th Coll. on Structural Information and Communication Complexity (SIROCCO'05), 200-215, 2005.
17. R. Oppliger. Security issues related to mobile code and agent-based systems. Computer Communications, 22(12):1165-1170, 1999.
18. P. Panaite and A. Pelc. Exploring unknown undirected graphs. J. Algorithms, 33:281-295, 1999.
19. C. E. Shannon. Presentation of a maze-solving machine. In 8th Conf. of the Josiah Macy Jr. Found. (Cybernetics), 173-180, 1951.
20. J. Vitek and G. Castagna. Mobile computations and hostile hosts. In D. Tsichritzis, editor, Mobile Objects, 241-261, 1999.

Fast Cellular Automata with Restricted Inter-Cell Communication: Computational Capacity

Martin Kutrib¹ and Andreas Malcher²

¹ Institut für Informatik, Universität Giessen, Arndtstr. 2, D-35392 Giessen, Germany, kutrib@informatik.uni-giessen.de

² Institut für Informatik, Johann Wolfgang Goethe Universität, D-60054 Frankfurt am Main, Germany, a.malcher@em.uni-frankfurt.de

Abstract. A d-dimensional cellular automaton with sequential input mode is a d-dimensional grid of interconnected interacting finite automata. The distinguished automaton at the origin, the communication cell, is connected to the outside world and fetches the input sequentially. In the literature this model is often referred to as an iterative array. We investigate d-dimensional iterative arrays and one-dimensional cellular automata operating in real and linear time, whose inter-cell communication is restricted to some constant number of bits independent of the number of states. It is known that even one-dimensional one-bit iterative arrays accept rather complicated languages such as $\{a^p \mid p \text{ prime}\}$ or $\{a^{2^n} \mid n \in \mathbb{N}\}$ [16]. We show that there is an infinite strict double dimension-bit hierarchy. The computational capacity of the one-dimensional devices in question is compared with the power of communication-restricted two-way cellular automata. It turns out that the relations are quite different from the relations in the unrestricted case. In passing, we obtain an infinite strict bit hierarchy for real-time two-way cellular automata and, moreover, a very dense time hierarchy for every k-bit cellular automaton, i.e., just one more time step leads to a proper superfamily of accepted languages.

Key words: Cellular automata; Iterative arrays; Restricted communication; Formal languages; Computational capacity; Parallel computing

Please use the following format when citing this chapter: Kutrib, M., Malcher, A., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 151-164.

1 Introduction

Devices of homogeneous, interconnected, parallel acting automata have been investigated extensively from a computational capacity point of view. The specification of such a system includes the type and specification of the single automata (sometimes called cells), their interconnection scheme (which can imply a dimension to the system), a local and/or global transition function, and the input and output modes. Multidimensional devices with nearest neighbor connections whose cells are finite automata are commonly called cellular automata. If the input mode is sequential to a distinguished communication cell, they are called iterative arrays (IA).

In connection with formal language recognition, IAs were introduced in [5], where it was shown that the language family accepted by real-time IAs forms a Boolean algebra not closed under concatenation and reversal. In [4] it is shown that for every context-free grammar a two-dimensional linear-time IA parser exists. In [6] a real-time acceptor for prime numbers has been constructed. A characterization of various types of IAs in terms of restricted Turing machines and several results, especially speed-up theorems, are given in [7, 8]. Several more results concerning formal languages can be found, e.g., in [12, 13].

In order to investigate the computational capacity of a device, there is a particular interest in infinite hierarchies of language families defined by bounding some resources. In [9] a dense IA time hierarchy beyond linear time has been proved. The gap between real time and linear time has been closed in [2]. Further hierarchies depending on the amount of nondeterminism and the number of alternating transitions performed by the communication cell are shown in [1, 3]. Descriptional complexity issues are studied in [10].

All these results concern iterative arrays where the states of the neighboring cells are communicated in one time step. That is, the number of bits exchanged is determined by the number of states. A natural and interesting restriction of IAs is to bound the number of bits by some constant that is independent of the number of states. Iterative arrays with restricted inter-cell communication have been investigated in [15, 16], where algorithmic design techniques for sequence generation are shown. In particular, several important infinite, non-regular sequences such as exponential or polynomial, Fibonacci and prime sequences can be generated in real time. Connectivity recognition problems are dealt with in [14], whereas in [17] the computational capacity of one-way cellular automata with restricted inter-cell communication is considered.

Here we investigate d-dimensional iterative arrays and one-dimensional cellular automata operating in real and linear time. The inter-cell communication of the array is restricted to some constant number of bits, in order to determine the power and nature of the communication bandwidth in massively parallel devices.

The paper is organized as follows. In Section 2 we define the basic notions and the main model in question, i.e., d-dimensional iterative arrays with restricted inter-cell communication. Section 3 is devoted to dimension and bit hierarchies. We show that there is an infinite strict double hierarchy. That is, for every dimension, real-time (k+1)-bit restricted iterative arrays are strictly more powerful than real-time k-bit restricted iterative arrays, and for every k-bit restriction, real-time (d+1)-dimensional k-bit restricted iterative arrays are strictly more powerful than real-time d-dimensional k-bit restricted iterative arrays. In Section 4 we consider one-dimensional devices. The computational capacity of the devices in question is compared with the power of communication-restricted two-way cellular automata. It turns out that the relations are quite different from the relations in the unrestricted case. In passing, we obtain an infinite strict bit hierarchy for real-time two-way cellular automata and, moreover, a very dense time hierarchy for every k-bit cellular automaton, i.e., just one more time step leads to a proper superfamily of accepted languages.

2 Definitions and Preliminaries

We denote the rational numbers by $\mathbb{Q}$, the integers by $\mathbb{Z}$, the non-negative integers by $\mathbb{N}$, and the positive integers $\{1, 2, \dots\}$ by $\mathbb{N}_+$. The empty word is denoted by $\lambda$, the reversal of a word $w$ by $w^R$, and for the length of $w$ we write $|w|$. The set of words over some alphabet $A$ whose lengths are at most $l \in \mathbb{N}$ is denoted by $A^{\le l}$. Set inclusion and strict set inclusion are denoted by $\subseteq$ and $\subset$, respectively.

A d-dimensional iterative array is a d-dimensional array (i.e., $\mathbb{N}^d$) of finite automata, sometimes called cells, where each of them is connected to its nearest neighbors in every dimension. For convenience we identify the cells by their coordinates. Initially they are in the so-called quiescent state. The input is supplied sequentially to the distinguished communication cell at the origin. For this reason, we have different local transition functions. The state transition of all cells but the communication cell depends on the current state of the cell itself and the current states of its neighbors. The state transition of the communication cell additionally depends on the current input symbol (or, if the whole input has been consumed, on a special end-of-input symbol). In an iterative array with k-bit restricted inter-cell communication, during every time step each cell may communicate only k bits of information to its neighbors. These bits depend on the current state and are determined by so-called bit-functions. The finite automata work synchronously at discrete time steps.

[Figure: a grid of cells, all in the quiescent state $s_0$ and connected to their nearest neighbors; the input $a_1 a_2 a_3 \cdots a_n \#$ is fed to the communication cell at the origin.]

Fig. 1. A two-dimensional iterative array.
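The informal description of the model can be made concrete with a small simulation. The following is only a sketch for the one-dimensional case, under stated assumptions: the function names (`step`, `run`) and the toy automaton are hypothetical and not taken from the paper. The toy automaton uses k = 2 bits per link, a "valid" flag and a value, and merely shifts the input symbols to the right; the quiescent state sends zeroes and stays quiescent when it receives zeroes.

```python
# Sketch of a one-dimensional iterative array with k-bit restricted communication.
# All concrete names and the toy automaton are illustrative assumptions.

QUIESCENT = "q"       # quiescent state s0
END_OF_INPUT = "#"    # end-of-input symbol


def b1(state):
    """Bits a cell offers to its right neighbor: (valid, value), so k = 2."""
    return (0, 0) if state == QUIESCENT else (1, state[1])


def b2(state):
    """Bits a cell offers to its left neighbor; the toy automaton sends zeroes."""
    return (0, 0)


def delta(state, from_left, from_right):
    """Non-communication cell: copy the bit arriving from the left, if any."""
    valid, value = from_left
    return ("v", value) if valid else QUIESCENT


def delta0(state, symbol, from_right):
    """Communication cell: latch the current input symbol."""
    return QUIESCENT if symbol == END_OF_INPUT else ("v", int(symbol))


def step(cells, symbol):
    """One synchronous step; new states depend only on the old configuration."""
    old = list(cells) + [QUIESCENT]   # one more cell may leave the quiescent state
    new = []
    for i, state in enumerate(old):
        right = old[i + 1] if i + 1 < len(old) else QUIESCENT
        if i == 0:
            new.append(delta0(state, symbol, b2(right)))
        else:
            new.append(delta(state, b1(old[i - 1]), b2(right)))
    return new


def run(word, steps):
    """Return the configurations reached after 1, ..., steps time steps."""
    cells, trace = [QUIESCENT], []
    for t in range(steps):
        symbol = word[t] if t < len(word) else END_OF_INPUT
        cells = step(cells, symbol)
        trace.append(cells)
    return trace
```

For instance, `run("1011", 4)[-1]` lists the cell states after four steps, with the most recently read symbol in the communication cell. Acceptance could be added by checking whether the state of cell 0 lies in a set of accepting states at some step.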

Definition 1. A d-dimensional iterative array with k-bit restricted inter-cell communication ($\mathrm{IA}_k^d$) is a system $(S, A, F, s_0, d, k, b_1, \dots, b_{2d}, \delta, \delta_0)$, where

(1) $S$ is the finite, nonempty set of cell states,
(2) $A$ is the finite, nonempty set of input symbols,
(3) $F \subseteq S$ is the set of accepting states,
(4) $s_0 \in S$ is the quiescent state,
(5) $d \in \mathbb{N}_+$ is the dimension,
(6) $k \in \mathbb{N}_+$ is the number of bits which can be communicated to neighbor cells,
(7) $b_i : S \to \{0,1\}^k$, for $1 \le i \le 2d$, are the bit-functions,
(8) $\delta : S \times (\{0,1\}^k)^{2d} \to S$ is the local transition function for non-communication cells satisfying $\delta(s_0, (0,\dots,0), \dots, (0,\dots,0)) = s_0$,
(9) $\delta_0 : S \times (A \cup \{\#\}) \times (\{0,1\}^k)^{d} \to S$ is the local transition function for the communication cell.

Let $M$ be an $\mathrm{IA}_k^d$. A configuration of $M$ at some time $t \ge 0$ is a description of its global state, which is a pair $(w_t, c_t)$, where $w_t \in A^*$ is the remaining input sequence and $c_t : \mathbb{N}^d \to S$ is a mapping that maps the single cells to their current states. For the sake of simpler notation in connection with cells at a face of $\mathbb{N}^d$, we extend the mappings $c_t$ to arguments from $\mathbb{Z}^d$, and assume that all cells in $\mathbb{Z}^d \setminus \mathbb{N}^d$ are permanently in the quiescent state sending zeroes. The configuration $(w_0, c_0)$ at time 0 is defined by the input word $w_0$ and the mapping $c_0(i_1, \dots, i_d) = s_0$, $(i_1, \dots, i_d) \in \mathbb{N}^d$, while subsequent configurations are chosen according to the global transition function $\Delta$. Let $(w_t, c_t)$, $t \ge 0$, be a configuration; then its successor configuration $(w_{t+1}, c_{t+1}) = \Delta((w_t, c_t))$ is as follows:

$$c_{t+1}(i_1, \dots, i_d) = \delta\bigl(c_t(i_1, \dots, i_d),\ b_1(c_t(i_1 - 1, i_2, \dots, i_d)),\ b_2(c_t(i_1 + 1, i_2, \dots, i_d)),\ b_3(c_t(i_1, i_2 - 1, \dots, i_d)),\ b_4(c_t(i_1, i_2 + 1, \dots, i_d)),\ \dots,\ b_{2d-1}(c_t(i_1, i_2, \dots, i_d - 1)),\ b_{2d}(c_t(i_1, i_2, \dots, i_d + 1))\bigr)$$

for all $(i_1, \dots, i_d) \in \mathbb{N}^d \setminus \{(0, \dots, 0)\}$, and

$$c_{t+1}(0, \dots, 0) = \delta_0\bigl(c_t(0, \dots, 0),\ a,\ b_2(c_t(1, 0, \dots, 0)),\ b_4(c_t(0, 1, \dots, 0)),\ \dots,\ b_{2d}(c_t(0, 0, \dots, 1))\bigr)$$

where $a = \#$, $w_{t+1} = \lambda$ if $w_t = \lambda$, and $a = a_1$, $w_{t+1} = a_2 \cdots a_n$ if $w_t = a_1 \cdots a_n$. Thus, the global transition function $\Delta$ is induced by $\delta$ and $\delta_0$.

A word $w$ is accepted by an $\mathrm{IA}_k^d$ if at some time $i$ during its course of computation on input $w$ the communication cell becomes accepting.

Definition 2. Let $M = (S, A, F, s_0, d, k, b_1, \dots, b_{2d}, \delta, \delta_0)$ be an $\mathrm{IA}_k^d$.

(1) A word $w \in A^*$ is accepted by $M$ if there exists a time step $i \in \mathbb{N}$ such that $c_i(0, \dots, 0) \in F$.
(2) $L(M) = \{w \in A^* \mid w \text{ is accepted by } M\}$ is the language accepted by $M$.
(3) Let $t : \mathbb{N} \to \mathbb{N}$, $t(n) \ge n + 1$, be a mapping. If all $w \in L(M)$ are accepted with at most $t(|w|)$ time steps, then $L$ is said to be of time complexity $t$.

The family of all languages which can be accepted by an $\mathrm{IA}_k^d$ with time complexity $t$ is denoted by $\mathscr{L}_t(\mathrm{IA}_k^d)$. If $t$ equals the function $n + 1$, acceptance is said to be in real time and we write $\mathscr{L}_{rt}(\mathrm{IA}_k^d)$. The linear-time languages $\mathscr{L}_{lt}(\mathrm{IA}_k^d)$ are defined according to $\mathscr{L}_{lt}(\mathrm{IA}_k^d) = \bigcup_{r \in \mathbb{Q},\, r \ge 1} \mathscr{L}_{r \cdot n}(\mathrm{IA}_k^d)$.

Definition 3. Let $L \subseteq A^*$ be a language over an alphabet $A$ and $l \in \mathbb{N}_+$ be a constant.

(1) Two words $w \in A^*$ and $w' \in A^*$ are l-right-equivalent with respect to $L$ if for all $y \in A^{\le l}$: $wy \in L \iff w'y \in L$.
(2) $N_r(l, L)$ denotes the number of l-right-equivalence classes with respect to $L$.
(3) Two words $w \in A^{\le l}$ and $w' \in A^{\le l}$ are l-left-equivalent with respect to $L$ if for all $y \in A^*$: $wy \in L \iff w'y \in L$.
(4) $N_\ell(l, L)$ denotes the number of l-left-equivalence classes with respect to $L$.

Lemma 4. Let $k, d \in \mathbb{N}_+$ be constants.

(1) If $L \in \mathscr{L}_{rt}(\mathrm{IA}_k^d)$, then there exists a constant $p \in \mathbb{N}$ such that $N_r(l, L) \le p^{(l+1)^d}$, and
(2) if $L \in \mathscr{L}_{t}(\mathrm{IA}_k^d)$, then there exists a constant $p \in \mathbb{N}$ such that $N_\ell(l, L) \le p \cdot 2^{k \cdot d \cdot l}$

for all $l \in \mathbb{N}_+$ and all time complexities $t : \mathbb{N} \to \mathbb{N}$.

Proof. Let $M = (S, A, F, s_0, d, k, b_1, \dots, b_{2d}, \delta, \delta_0)$ be a real-time $\mathrm{IA}_k^d$ that accepts $L$. In order to determine an upper bound for the number of l-right-equivalence classes, we consider the possible configurations of $M$ after reading all but $|y| \le l$ input symbols. The remaining computation depends on the last $|y|$ input symbols, the current state of the communication cell, and the states of the cells which can send information that is received by the communication cell during the last $|y| + 1$ time steps. These are at most $(|y| + 1)^d$ cells. So, in total there are at most $|S|^{1 + (|y|+1)^d} \le |S|^{2(l+1)^d}$ different possibilities. Setting $p = |S|^2$, we obtain $N_r(l, L) \le p^{(l+1)^d}$.

Now let $M$ be an $\mathrm{IA}_k^d$ that accepts $L$ with time complexity $t$. In order to determine an upper bound for the number of l-left-equivalence classes, we consider the possible configurations of $M$ after reading prefixes $w$ whose lengths are at most $l$. A computed configuration depends on the information which is sent to the array by the communication cell, and on the current state of the communication cell. So, there are at most $(2^{k \cdot d})^{|w|} \cdot |S| \le |S| \cdot 2^{k \cdot d \cdot l}$ different configurations. Setting $p = |S|$, we obtain $N_\ell(l, L) \le p \cdot 2^{k \cdot d \cdot l}$. □
3 Dimension and Bit Hierarchies

The hierarchies are proved by specific witness languages which are defined dependent on the given resources.

3.1 Dimensions

Here we fix the time complexity to real time, the number of communication bits to any constant $k \in \mathbb{N}_+$, and consider the dimension. For any dimension $d \ge 2$ we define a language $L_{dim}(d)$ as follows. We start with a series of regular sets:

$$X_1 = \$\{a, b\}^+, \qquad X_{i+1} = \$X_i^+, \text{ for } i \ge 1.$$

Due to the separator symbol \$, every word $u \in X_{i+1}$ can uniquely be decomposed into its subwords from $X_i$. So, we can define the projection on the jth subword as usual: Let $u = \$u_1 \cdots u_m$, where $u_j \in X_i$, for $1 \le j \le m$. Then $u[j]$ is defined to be $u_j$ if $1 \le j \le m$; otherwise $u[j]$ is undefined. Now define the language

$$M(d) = \{\, u\,¢\,e^{x_d}\$ \cdots \$e^{x_1}\$e^{2x}\$v \mid u \in X_d \text{ and } x_j \in \mathbb{N}_+, \text{ and } x = x_1 + \cdots + x_d \text{ and } v = u[x_d][x_{d-1}] \cdots [x_1] \text{ is defined} \,\}.$$
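The nested decomposition and the projection $u[x_d]\cdots[x_1]$ can be illustrated by a small membership test for $M(d)$. This is a sketch under assumptions, not part of the paper: the helper names `decompose`, `project` and `in_M` are hypothetical, words are plain strings, and the marker between $u$ and the address part is written as the character `¢`. The decomposition uses the fact that a word of $X_i$ begins with a run of exactly $i$ separators, while all separator runs inside it are shorter.

```python
import re


def decompose(u, level):
    """Nested decomposition of a word u in X_level; leaves are strings over {a, b}."""
    if level == 1:
        if not re.fullmatch(r"\$[ab]+", u):
            raise ValueError("not a word of X_1")
        return u[1:]
    if not u.startswith("$"):
        raise ValueError("missing leading separator")
    rest = u[1:]
    # Subwords from X_{level-1} begin at the maximal runs of exactly level-1 separators.
    starts = [m.start() for m in re.finditer(r"\$+", rest)
              if len(m.group()) == level - 1]
    if not starts or starts[0] != 0:
        raise ValueError("not a word of X_%d" % level)
    ends = starts[1:] + [len(rest)]
    return [decompose(rest[s:e], level - 1) for s, e in zip(starts, ends)]


def project(tree, indices):
    """Return u[x_d][x_{d-1}]...[x_1] for 1-based indices, or None if undefined."""
    node = tree
    for x in indices:
        if not 1 <= x <= len(node):
            return None
        node = node[x - 1]
    return node


def in_M(word, d):
    """Membership test for M(d); the marker symbol is written as '¢'."""
    if word.count("¢") != 1:
        return False
    u, suffix = word.split("¢")
    blocks = suffix.split("$")
    if len(blocks) != d + 2:
        return False
    *e_blocks, v = blocks
    if v not in ("a", "b") or any(not b or set(b) != {"e"} for b in e_blocks):
        return False
    xs = [len(b) for b in e_blocks[:d]]      # x_d, ..., x_1 in reading order
    if len(e_blocks[d]) != 2 * sum(xs):      # padding block e^(2x)
        return False
    try:
        tree = decompose(u, d)
    except ValueError:
        return False
    return project(tree, xs) == v
```

For example, with $d = 2$ and $u = \$\$ab\$b$, the call `in_M("$$ab$b¢ee$e$eeeeee$b", 2)` addresses $u[2][1]$, the first letter of the second subword.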

Finally, the language $L_{dim}(d)$ is given as homomorphic image of $M(d)$. More precisely, $L_{dim}(d) = h(M(d))$, where $h : \{a, b, e, \$, ¢\}^* \to \{a, b\}^*$ is defined by: $h(a) = ba$, $h(b) = bb$, $h(e) = b$, $h(\$) = ab$, h(

$Y_{i+1} = \$Y_i^m$, for $i \ge 1$.

It follows $Y_i \subset X_i$, for all $i \in \mathbb{N}_+$, and $|Y_i| = 2^{m^i}$. If we choose two different words $u$ and $u'$ from $Y_{d+1}$, then there is one position at which $u$ has a symbol $a$ and $u'$ has a symbol $b$ or vice versa. We can address this position by $u[x_{d+1}][x_d] \cdots [x_1]$. Therefore,

$$h(u)\,h(¢e^{x_{d+1}}\$ \cdots \$e^{x_1}\$e^{2x}\$a) \in L_{dim}(d+1) \iff h(u')\,h(¢e^{x_{d+1}}\$ \cdots \$e^{x_1}\$e^{2x}\$a) \notin L_{dim}(d+1).$$

There are $2^{m^{d+1}}$ different words in $Y_{d+1}$, and for the length of the suffix we obtain $|h(¢e^{x_{d+1}}\$ \cdots \$e^{x_1}\$e^{2x}\$a)| \le 3m(d+1) + 2(d+3) + 2$, since $x_j \le m$. This implies a lower bound on the number of induced equivalence classes as follows:

$$N_r(3m(d+1) + 2(d+3) + 2,\ L_{dim}(d+1)) \ge 2^{m^{d+1}}.$$

In contrast to the assertion, we now assume $L_{dim}(d+1) \in \mathscr{L}_{rt}(\mathrm{IA}_k^d)$. Then by Lemma 4 there exists a constant $p \in \mathbb{N}_+$ such that $N_r(l, L_{dim}(d+1)) \le p^{(l+1)^d}$, for all $l \in \mathbb{N}_+$. So, for $l = 3m(d+1) + 2(d+3) + 2$ we have at most

$$p^{(3m(d+1) + 2(d+3) + 2 + 1)^d} \le p^{(6md + 2d + 9)^d} \le p^{(17md)^d} \le 2^{\lceil \log(p) \rceil (17d)^d m^d}$$

classes. We choose $m$ such that $m > \lceil \log(p) \rceil (17d)^d$, and obtain strictly less than $2^{m^{d+1}}$ classes. From the contradiction we obtain $L_{dim}(d+1) \notin \mathscr{L}_{rt}(\mathrm{IA}_k^d)$.

Now we turn to the construction of a real-time $\mathrm{IA}_1^{d+1}$ which accepts $L_{dim}(d+1)$. First we observe that the structure of accepted words is regular. Therefore, the communication cell can check it and, moreover, can decode the checked input over $\{a, b\}$ uniquely to a word from $M(d+1)$. For convenience, we explain the acceptance also in terms of these words. Basically, the idea is to store the prefix $u$ in such a way that the symbol $u[x_{d+1}] \cdots [x_1]$ is stored in cell $(x_{d+1} - 1, x_d - 1, \dots, x_1 - 1)$. While subsequently reading the suffix $¢e^{x_{d+1}}\$ \cdots \$e^{x_1}\$e^{2x}\$v$, symbol $u[x_{d+1}] \cdots [x_1]$ is addressed and sent to the communication cell, where it is compared with $v$. Accordingly, we call the first phase the storage phase and the second phase the retrieval phase.

We name cells dependent on their coordinates. A cell is said to be of level $j$ if its last $j$ coordinates are 0, i.e., $(i_1, \dots, i_{d+1-j}, 0, \dots, 0)$. Note that a level $j$ cell is also of level $j' < j$, and the communication cell is the sole level $d + 1$ cell. A cell with maximal level $j$ activates its neighbors $(i_1, \dots, i_{d+1-j}, 0, \dots, 0, 1)$, $(i_1, \dots, i_{d+1-j}, 0, \dots, 1, 0)$, ..., $(i_1, \dots, i_{d+1-j}, 1, \dots, 0, 0)$, and $(i_1, \dots, i_{d+1-j} + 1, 0, \dots, 0)$, i.e., sends a non-zero signal for the first time. Therefore, each cell is uniquely activated by one of its neighbors and, moreover, can determine its maximal level by this neighbor. A cell with maximal level $j < d$ may activate at most $j + 1$ neighbors.

Activation takes place during the storage phase, in which cells mark a path to the current storage position by state components. When the communication cell reads $h(a)$ (resp. $h(b)$), it sends the two bits 10 (resp. 11) along the path until the position is reached. Now the corresponding cell $(i_1, \dots, i_{d+1})$ stores symbol $a$ (resp. $b$), activates its neighbor $(i_1, \dots, i_{d+1} + 1)$ to be the next storage position by sending the bits 01, and extends the current path to the newly activated neighbor. Whenever the communication cell reads $h(\$)$, it sends the bits 01 along the path. In this situation the cells on the path count the number of at most $d$ consecutive 01 signals, and possibly reroute the path as follows. A cell lets pass $p - 1$ signals, where $p$ is the number of already activated neighbors. If there is another signal, it activates the next neighbor according to the ordering given above, and reroutes the path to it. Clearly, there cannot be more signals than the number of activated neighbors minus one, since the next predecessor cell of higher level does not let pass so many of them. When the communication cell reads h(
A cell remembers whether it is on the path or not, and whether it is the end of the path. Initially, only the communication cell is on the path. If a cell is on the path but not at the end, it simply routes the signals along the path. The end of the path, say $(i_1, \dots, i_j, 0, \dots, 0)$, sends the signal 1 to its neighbor $(i_1, \dots, i_j + 1, 0, \dots, 0)$, which in turn deletes it and becomes the new end of the path. The end-of-path cell $(i_1, \dots, i_j, 0, \dots, 0)$ deletes a 00 signal and sends the next 1 signals to its neighbor $(i_1, \dots, i_j, 1, 0, \dots, 0)$. So, on input $e^{x_{d+1}}\$ \cdots \$e^{x_1}\$$ a path to cell $(x_{d+1}, x_d, \dots, x_1)$ is established. The $(d+1)$st signal 00 causes cell $(x_{d+1}, x_d, \dots, x_1)$ to send the information which it has stored during the storage phase back to the communication cell. The $(d+1)$st 00 signal takes $x_{d+1} + x_d + \cdots + x_1$ time steps to reach the end of the path. Subsequently, the same number of time steps is necessary to send the information back to the communication cell. Altogether, these are $2x$ time steps. Therefore, the information can be compared with input symbol $v$ by the communication cell.

It remains to be mentioned that, in fact, symbol $v$ has to be compared with the information stored in cell $(x_{d+1} - 1, x_d - 1, \dots, x_1 - 1)$ instead of $(x_{d+1}, x_d, \dots, x_1)$. But the construction can be modified appropriately in a straightforward manner. □

Corollary 6. Let $d, k \in \mathbb{N}_+$ be constants, then $\mathscr{L}_{rt}(\mathrm{IA}_k^d) \subset \mathscr{L}_{rt}(\mathrm{IA}_k^{d+1})$.

Proof. By Theorem 5, language $L_{dim}(d+1)$ is not accepted by any real-time $\mathrm{IA}_k^d$, but is accepted by some real-time $\mathrm{IA}_1^{d+1}$ and, thus, by some $\mathrm{IA}_k^{d+1}$. □

The construction of Theorem 5 can be modified to show that the language $L_{dim}(d+1)$ belongs to $\mathscr{L}_{lt}(\mathrm{IA}_1^d)$, i.e., one can trade one dimension for a slowdown from real time to linear time.

Theorem 7. Let $k, d \in \mathbb{N}_+$ be constants, then $\mathscr{L}_{rt}(\mathrm{IA}_k^d) \subset \mathscr{L}_{lt}(\mathrm{IA}_k^d)$.
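The counting step in the proof of Theorem 5 compares two exponents of 2: the upper bound $\lceil \log(p) \rceil (l+1)^d$ implied by Lemma 4 for $l = 3m(d+1) + 2(d+3) + 2$, and the exponent $m^{d+1}$ coming from the number of words in $Y_{d+1}$. The following sketch checks this comparison for concrete values; the function names are hypothetical, and $\log$ is assumed to be the binary logarithm.

```python
def suffix_length_bound(m, d):
    """The length bound l = 3m(d+1) + 2(d+3) + 2 used for the suffix."""
    return 3 * m * (d + 1) + 2 * (d + 3) + 2


def contradiction_holds(p, d):
    """Pick the smallest m > ceil(log2 p) * (17d)^d and compare the exponents of 2."""
    log_p = (p - 1).bit_length()            # equals ceil(log2(p)) for p >= 1
    m = log_p * (17 * d) ** d + 1
    l = suffix_length_bound(m, d)
    upper_exponent = log_p * (l + 1) ** d   # log2 of an upper bound on p^((l+1)^d)
    lower_exponent = m ** (d + 1)           # log2 of the number of words in Y_{d+1}
    return m, upper_exponent < lower_exponent
```

For any constant `p` and dimension `d`, the second component of the result should be `True`, which mirrors the contradiction between the two class counts.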

3.2 Bits

Here we fix the time complexity to real time, the dimension to any constant $d \in \mathbb{N}_+$, and consider the number of communication bits. For any number of communication bits $k \in \mathbb{N}_+$ we define an alphabet $A_{d,k} = \{a_0, \dots, a_{2^{dk}-2}\}$ and a language $L_{bit}(d, k)$:

$$L_{bit}(d, k) = \{\, u_1 \cdots u_m \$ e^{2m+4} \$ e^{x} \$ e^{2x} \$ v \mid m, x \in \mathbb{N}_+ \text{ and } x \le m \text{ and } u_i \in A_{d,k},\ 1 \le i \le m, \text{ and } v = u_x \,\}.$$

Theorem 8. Let $k, d \in \mathbb{N}_+$ be constants. The language $L_{bit}(d, k+1)$ belongs to the difference $\mathscr{L}_{rt}(\mathrm{IA}_{k+1}^d) \setminus \mathscr{L}_{rt}(\mathrm{IA}_k^d)$.

Proof. Contrarily, assume $L_{bit}(d, k+1) \in \mathscr{L}_{rt}(\mathrm{IA}_k^d)$. Then by Lemma 4 there exists a constant $p \in \mathbb{N}_+$ such that $N_\ell(l, L_{bit}(d, k+1)) \le p \cdot 2^{d \cdot k \cdot l}$ for all $l \in \mathbb{N}_+$.

On the other hand, consider two different prefixes $w = u_1 \cdots u_l\$$ and $w' = u'_1 \cdots u'_l\$$. Since they are different, there is an $x$ such that $u_x \ne u'_x$. Therefore, $w e^{2l+4} \$ e^{x} \$ e^{2x} \$ u_x \in L_{bit}(d, k+1) \iff w' e^{2l+4} \$ e^{x} \$ e^{2x} \$ u_x \notin L_{bit}(d, k+1)$. For all $d, k \in \mathbb{N}_+$, there are

$$(2^{d \cdot (k+1)} - 1)^l \ge (2 \cdot 2^{d \cdot k} - 1)^l \ge (2^{d \cdot k} + 1)^l = \Bigl(\bigl(1 + \tfrac{1}{2^{d \cdot k}}\bigr)\, 2^{d \cdot k}\Bigr)^l$$

different words of this form. Since $\frac{1}{2^{d \cdot k}} > 0$, we may choose $l$ in such a way that $\bigl(1 + \frac{1}{2^{d \cdot k}}\bigr)^l > p$. This implies the following lower bound on the number of induced equivalence classes:

$$N_\ell(l, L_{bit}(d, k+1)) > p \cdot 2^{d \cdot k \cdot l}.$$

From the contradiction we obtain $L_{bit}(d, k+1) \notin \mathscr{L}_{rt}(\mathrm{IA}_k^d)$.

It remains to be shown that $L_{bit}(d, k+1) \in \mathscr{L}_{rt}(\mathrm{IA}_{k+1}^d)$. As in the proof of Theorem 5, a corresponding iterative array stores the symbols $u_i$ in a storage phase, and in a retrieval phase symbol $u_x$ is addressed and sent back to the communication cell, which compares it with $v$. The input symbols are binary encoded by $(k+1) \cdot d$ bits, respectively, such that the code of $a_i$ is $i + 1$.

First, we present the construction for $d = 1$, which is generalized subsequently. During the storage phase, a symbol $u_i$ is read and its code is sent to the array. It is stored in cell $i$ at time step $2i$. At time $2i + 1$ cell $i + 1$ is activated by cell $i$. So, cell $i + 1$ can store symbol $u_{i+1}$ at time $2(i + 1)$. The following behavior stops the storage phase. It is constructed with an eye towards generalizations to higher dimensions. When the communication cell reads the symbol \$, it sends a 0 to the array. When this 0 is to be stored in cell $m + 1$ at time $2(m + 1)$, the cell recognizes the end of the storage phase, waits for three time steps, and sends a signal from right to left that informs all cells passed through about the end of the phase. The signal arrives at the communication cell at time step $2(m + 1) + 3 + (m + 1) = 3m + 6$, i.e., when the input prefix $u_1 \cdots u_m \$ e^{2m+4} \$$ has been read.

Now the retrieval phase starts. To this end, the communication cell sends signals 1 to the array as long as it reads the next input part $e^x$. When it reads the following \$, it sends a 2. Each cell which receives a 1 for the first time deletes the 1 from the stream. The unique cell that receives the 2 immediately after receiving a 1 for the first time identifies itself to be the addressed cell $x$. It sends its stored symbol $u_x$ to the left. The symbol arrives at the communication cell at time $3x$ after the beginning of the retrieval phase, i.e., before the $v$ appears in the input. We turn to higher dimensions.
Roughly, the idea is to split the encodings of the input symbols $u_i$ into $d$ blocks of length $k$ bits, respectively. These blocks are distributed to the $d$ neighbors of the communication cell. This would lead to a straightforward generalization. But the problem arises that we cannot stop the storage phase, since signal 0 (and any other signal) may appear as a block in encodings. So, we have to provide more sophisticated mechanisms. The communication cell still sends the $d$ blocks to its neighbors. But the blocks, e.g., of symbol $u_i$ are stored in the cells $(i, 0, \dots, 0)$, $(i, 1, 0, \dots, 0)$, $(i, 0, 1, 0, \dots, 0)$, ..., $(i, 0, \dots, 0, 1)$. For example, the block sent to neighbor $(0, \dots, 0, 1, 0, \dots, 0)$ of the communication cell is rerouted to the cells $(i, 0, \dots, 0, 1, 0, \dots, 0)$ by this neighbor. The communication cell itself sends blocks to cells $(i, 0, \dots, 0)$ with one time step delay. Therefore, all blocks of symbol $u_i$ reach their destinations at time $2i + 1$. In order to stop the storage phase, the symbol $u_i$ is reconstructed in cell $(i, 0, \dots, 0)$ at time $2i + 2$. To this end, all cells storing blocks of $u_i$ send their blocks to their common neighbor $(i, 0, \dots, 0)$. If cell $(m + 1, 0, \dots, 0)$ reconstructs the signal 0, it sends stop signals back to its neighbors at time $2(m + 1) + 3$. In turn, these neighbors send stop signals back to the communication cell. So, the storage phase ends at time $2(m + 1) + 3 + (m + 1) = 3m + 6$, i.e., when the input prefix $u_1 \cdots u_m \$ e^{2m+4} \$$ has been read. The retrieval phase is a straightforward generalization of the one-dimensional case. □

Corollary 9. Let $k, d \in \mathbb{N}_+$ be constants, then $\mathscr{L}_{rt}(\mathrm{IA}_k^d) \subset \mathscr{L}_{rt}(\mathrm{IA}_{k+1}^d)$.

$$\begin{array}{ccccccc}
\vdots & & \vdots & & \vdots & & \\
\cup & & \cup & & \cup & & \\
\mathscr{L}_{rt}(\mathrm{IA}_1^d) & \subset & \mathscr{L}_{rt}(\mathrm{IA}_2^d) & \subset \cdots \subset & \mathscr{L}_{rt}(\mathrm{IA}_k^d) & \subset & \cdots \\
\cup & & \cup & & \cup & & \\
\vdots & & \vdots & & \vdots & & \\
\cup & & \cup & & \cup & & \\
\mathscr{L}_{rt}(\mathrm{IA}_1^1) & \subset & \mathscr{L}_{rt}(\mathrm{IA}_2^1) & \subset \cdots \subset & \mathscr{L}_{rt}(\mathrm{IA}_k^1) & \subset & \cdots
\end{array}$$

Fig. 2. Double hierarchy of fast IAs with restricted inter-cell communication.
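The counting argument in the proof of Theorem 8 can be checked with exact integer arithmetic. The sketch below is an illustration under assumptions (the function names are hypothetical): it searches for the smallest $l$ with $(2^{dk} + 1)^l > p \cdot 2^{dkl}$, which is the integer form of $(1 + 2^{-dk})^l > p$, and then compares the number of prefixes $u_1 \cdots u_l\$$ over $A_{d,k+1}$ with the upper bound $p \cdot 2^{dkl}$ of Lemma 4.

```python
def smallest_separating_length(p, d, k):
    """Smallest l with (2^(dk) + 1)^l > p * 2^(dkl), i.e. (1 + 2^(-dk))^l > p."""
    base = 2 ** (d * k)
    l = 1
    while (base + 1) ** l <= p * base ** l:
        l += 1
    return l


def prefixes_exceed_bound(p, d, k, l):
    """Compare the number of prefixes of length l over A_{d,k+1} with p * 2^(dkl)."""
    alphabet_size = 2 ** (d * (k + 1)) - 1   # |A_{d,k+1}|
    return alphabet_size ** l > p * 2 ** (d * k * l)
```

For a given constant `p`, `prefixes_exceed_bound(p, d, k, smallest_separating_length(p, d, k))` should evaluate to `True`, since $2^{d(k+1)} - 1 \ge 2^{dk} + 1$.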

4 Relations with Restricted Cellular Automata In this section we consider one-dimensional devices in order to compare their computational capacity with communication restricted cellular automata. A two-way cellular automaton with k-bit restricted inter-cell communication (CAk) is similar to an iterative array. The main difference is that the cell at the origin does not fetch the input but the input is supplied in parallel to the cells. I.e., an input ai • • • a„ is fed to the cells 1 , . . . , n such that initially cell i is in state Oj. Cells 0 and n + 1 are initially in a permanent so-called boundary state #. So,

Fast Cellular Automata with Restricted Inter-Cell Communication

161

cell 1 is the communication cell that indicates acceptance or rejection, and the array is bounded to the n cells which are initially active. Real time is defined to be n time steps. A one-way cellular automaton ($OCA_k$) is a cellular automaton in which each cell receives information from its immediate neighbor to the right only. So, the flow of information is restricted from right to left. The relations between these devices and iterative arrays in general are depicted in the left part of Figure 3. Now we turn to explore the relations for restricted devices.

Theorem 10. For all $k \in \mathbb{N}_+$, there is a regular language which is not accepted by any real-time k-bit CA.

Proof. Let $L_k = \{xvx \mid v \in \{a\}^* \text{ and } x \in \{a_0, \ldots, a_{2^{2k}}\}\}$ be the regular witness language. Assume, to the contrary, that $L_k$ is accepted by some real-time $CA_k$ with state set $S$, bit functions $b_1, b_2 : S \to \{0,1\}^k$ giving the bits communicated to the left and to the right, and local transition function $\delta : \{0,1\}^k \times S \times \{0,1\}^k \to S$.

First we partition the input states $\{a_0, \ldots, a_{2^{2k}}\}$ according to $b_1$, i.e., two states $s_1$ and $s_2$ are in the same class if and only if $b_1(s_1) = b_1(s_2)$. Since there are $2^{2k} + 1$ input states and the range of $b_1$ has $2^k$ elements, there is at least one class $S_1$ with at least $2^k + 1$ states. Next, $S_1$ is partitioned according to $b_1(\delta(b_2(a), s, b_1(\#)))$. Since this value again ranges over $2^k$ elements, there is at least one subclass of $S_1$ that has at least two states, say $a_i$ and $a_j$.

For an accepting computation on input $a_i a^n a_i$, for some $n \in \mathbb{N}_+$, we consider the relevant states of the cells $n-1$, $n$, $n+1$ at time steps 0, 1, 2. In particular, $c_0(n-1) = a$, $c_0(n) = a_i$, $c_0(n+1) = \#$, $c_1(n-1) = a'$, $c_1(n) = a_i'$, $c_2(n-1) = a''$. Due to the real-time restriction, states $c_1(n+1)$, $c_2(n)$, and $c_2(n+1)$ cannot affect the overall computation result. Since $a_i$ and $a_j$ are in the same class $S_1$, for input $a_i a^n a_j$ we obtain $c_0(n-1) = a$, $c_0(n) = a_j$, $c_0(n+1) = \#$, $c_1(n-1) = a'$, $c_1(n) = a_j'$. Since $a_i$ and $a_j$ are in the same subclass we obtain $c_2(n-1) = a''$. Therefore, input $a_i a^n a_j$, which does not belong to $L_k$, would be accepted. □

It is not hard to see that language $L_k$ is accepted by a real-time $CA_{k+1}$ as well as by a $CA_k$ in time $n + 1$. So, we obtain a strict bit hierarchy for two-way real-time cellular automata.

Theorem 11. Let $k \in \mathbb{N}_+$ be a constant, then $\mathscr{L}_{rt}(CA_k) \subset \mathscr{L}_{rt}(CA_{k+1})$.
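The two pigeonhole steps in the proof of Theorem 10 can be illustrated with a short Python sketch. The functions `b1` and `step_bits` are hypothetical stand-ins for $b_1$ and for $s \mapsto b_1(\delta(b_2(a), s, b_1(\#)))$; any functions with at most $2^k$ distinct values would do, and input states are represented by their indices.

```python
from collections import defaultdict


def largest_class(items, key):
    """Partition items by key(item) and return one largest class."""
    classes = defaultdict(list)
    for item in items:
        classes[key(item)].append(item)
    return max(classes.values(), key=len)


def colliding_pair(k, b1, step_bits):
    """Return two input states that agree on b1 and on step_bits.

    Assumes b1 and step_bits each take at most 2**k distinct values,
    as k-bit communication does; both are illustrative placeholders.
    """
    states = range(2 ** (2 * k) + 1)        # indices of a_0 .. a_{2^{2k}}
    s1 = largest_class(states, b1)          # at least 2**k + 1 states
    sub = largest_class(s1, step_bits)      # at least 2 states
    return sub[0], sub[1]
```

With such a pair, the two inputs that differ only in their last symbol cannot be told apart in real time, which is the contradiction used in the proof.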

Moreover, by modification of the witness language, i.e., by increasing the underlying alphabet, we obtain a very dense strict time hierarchy. That is, if we allow just one more time step, we obtain a strictly more powerful device.

Theorem 12. Let $k \in \mathbb{N}_+$, $r \in \mathbb{N}$ be constants, then $\mathscr{L}_{rt+r}(CA_k) \subset \mathscr{L}_{rt+r+1}(CA_k)$.

Since, trivially, any regular language is accepted by some real-time $IA_1$, the next theorem completes the incomparability results.

Theorem 13. Let $k \in \mathbb{N}_+$ be a constant. There is a language belonging to the difference $\mathscr{L}_{rt}(OCA_1) \setminus \mathscr{L}_{rt}(IA_k)$.


Proof. First we sketch the construction of a one-bit real-time OCA that accepts the witness language $L_k = \{u_1 \cdots u_m e^x v \mid m \in \mathbb{N}_+ \text{ and } u_i \in \{a_0, \ldots, a_{2^k-1}\},\ 1 \le i \le m, \text{ and } v \in \{e, a_0, \ldots, a_{2^k-1}\}^* \text{ and } x$ is greater than or equal to the number represented by the $2^k$-ary interpretation of $u_1 \cdots u_m\}$.

Initially, all non-boundary states send bit 1 to the left. This identifies the rightmost cell uniquely. Next, all cells with input e send a 1 and all cells in a state $u_i$ send a 0. This identifies cells in state $u_i$ with an e-neighbor to the right, and vice versa. Now all cells e with right neighbor $u_i$ or in boundary state send a 0-signal to the left. All other cells e send bits 1 to the left until they receive a 0-signal from the right. The cells in states $u_i$ form a $2^k$-ary counter. The cells in state $u_m$ with e-neighbor start to decrease the counter by one in every time step until they receive a 0-signal. A counter cell accepts when it generates the first carryover to the left.

In order to show that $L_{k+1}$ is not accepted by any $IA_k$ we adapt the proof of Theorem 8: the number $N(m, L_{k+1})$ of equivalence classes induced by $L_{k+1}$ exceeds the number of equivalence classes that an $IA_k$ can distinguish. □

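[Figure 3: two inclusion diagrams. Left, unrestricted families: $\mathscr{L}_{lt}(CA) = \mathscr{L}_{lt}(IA)$, $\mathscr{L}_{rt}(CA)$, $\mathscr{L}_{rt}(OCA)$, $\mathscr{L}_{rt}(IA)$, REG. Right, restricted families: $\mathscr{L}_{lt}(CA_k)$, $\mathscr{L}_{rt}(CA_k)$, $\mathscr{L}_{lt}(IA_k)$, $\mathscr{L}_{rt}(OCA_k)$, $\mathscr{L}_{rt}(IA_k)$, REG.]

Fig. 3. Relations between unrestricted and restricted language families, respectively. Solid lines are strict inclusions, dotted lines are inclusions. Families which are not connected by any path are incomparable.

Finally, we show the proper inclusions between language families that are related by inclusions for structural reasons.

Theorem 14. Let $k \in \mathbb{N}_+$ be a constant, then $\mathscr{L}_{rt}(OCA_k) \subset \mathscr{L}_{rt}(CA_k)$.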

Proof. It is well known that all unary languages belonging to $\mathscr{L}_{rt}(OCA)$ are regular [11]. Therefore, it suffices to show that the non-regular language $L = \{a^{2^x + 2x} \mid x \in \mathbb{N}_+\}$ belongs to $\mathscr{L}_{rt}(CA_1)$. A corresponding $CA_1$ works as follows. It sets up a binary counter whose least significant bit is stored in the leftmost cell. We observe that the counter is extended by one digit (cell) to the right at time steps $2^x + x$, for $x \in \mathbb{N}$. In particular, at time steps $2^x - 1$ all counter cells store bit 1. Subsequently, it takes $x + 1$ time steps until the carryovers reach the new cell that extends the counter. In addition, at time step 1 the rightmost cell sends a signal 1 to the left. The input is to be accepted if and only if this signal appears in a cell exactly at a time step at which this cell becomes the new most significant bit of the counter, i.e., at time steps $2^x + x$. In this case the signal 1 is passed through the counter in order to cause the leftmost cell to accept. Since the previous counter length was $x$, the total input length is $2^x + x + x$. □

For the sake of completeness, the following theorem is presented without proof.

Theorem 15. Let $k \in \mathbb{N}_+$ be a constant, then $\mathscr{L}_{rt}(IA_k) \subset \mathscr{L}_{rt}(CA_k)$.
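As a small check of the arithmetic in the proof of Theorem 14, the following Python sketch tests whether a unary input of length n belongs to the witness language, i.e. whether $n = 2^x + 2x$ for some $x \ge 1$. The function name is an assumption made for this illustration; the sketch does not simulate the cellular automaton itself.

```python
def in_witness_language(n):
    """Return True if n = 2**x + 2*x for some integer x >= 1."""
    x = 1
    while 2 ** x + 2 * x <= n:
        if 2 ** x + 2 * x == n:
            return True
        x += 1
    return False
```

The accepted lengths grow exponentially, so their gaps are not eventually periodic, which is why the language is not regular.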

References

1. Buchholz T, Klein A, Kutrib M (1999) Iterative arrays with a wee bit alternation. In: Fundamentals of Computation Theory 1999, LNCS 1684, pp 173-184
2. Buchholz T, Klein A, Kutrib M (2000) Iterative arrays with small time bounds. In: Mathematical Foundations of Computer Science 1998, LNCS 1893, pp 243-252
3. Buchholz T, Klein A, Kutrib M (1999) Iterative arrays with limited nondeterministic communication cell. In: Words, Languages and Combinatorics III, pp 73-87
4. Chang JH, Ibarra OH, Palis MA (1987) Parallel parsing on a one-way array of finite-state machines. IEEE Trans Comput C-36:64-75
5. Cole SN (1969) Real-time computation by n-dimensional iterative arrays of finite-state machines. IEEE Trans Comput C-18:349-365
6. Fischer PC (1965) Generation of primes by a one-dimensional real-time iterative array. J ACM 12:388-394
7. Ibarra OH, Palis MA (1985) Some results concerning linear iterative (systolic) arrays. J Parallel Distributed Comput 2:182-218
8. Ibarra OH, Palis MA (1988) Two-dimensional iterative arrays: Characterizations and applications. Theoret Comput Sci 57:47-86
9. Iwamoto C, Hatsuyama T, Morita K, Imai K (1999) On time-constructible functions in one-dimensional cellular automata. In: Fundamentals of Computation Theory 1999, LNCS 1684, pp 317-326
10. Malcher A (2004) On the descriptional complexity of iterative arrays. IEICE Transactions on Information and Systems E87-D:721-725
11. Seidel SR (1979) Language recognition and the synchronization of cellular automata. Technical Report 79-02, Department of Computer Science, University of Iowa, Iowa City
12. Smith III AR (1972) Real-time language recognition by one-dimensional cellular automata. J Comput System Sci 6:233-253
13. Terrier V (1995) On real time one-way cellular array. Theoret Comput Sci 141:331-335
14. Umeo H (2001) Linear-time recognition of connectivity of binary images on 1-bit inter-cell communication cellular automaton. Parallel Comput 27:587-599
15. Umeo H, Kamikawa N (2002) A design of real-time non-regular sequence generation algorithms and their implementations on cellular automata with 1-bit inter-cell communications. Fund Inform 52:257-275
16. Umeo H, Kamikawa N (2003) Real-time generation of primes by a 1-bit-communication cellular automaton. Fund Inform 58:421-435
17. Worsch T (2000) Linear time language recognition on cellular automata with restricted communication. In: Latin 2000: Theoretical Informatics, LNCS 1776, pp 417-426

Asynchonous Distributed Components: Concurrency and Determinacy

Denis Caromel and Ludovic Henrio

CNRS - I3S - Univ. Nice Sophia Antipolis - INRIA Sophia Antipolis
Inria Sophia-Antipolis, 2004 route des Lucioles - B.P. 93, F-06902 Sophia-Antipolis Cedex
{caromel, henrio}@sophia.inria.fr

Abstract. Based on the impς-calculus, ASP (Asynchronous Sequential Processes) defines distributed applications behaving deterministically. This article extends ASP by building hierarchical and asynchronous distributed components. Components are hierarchical (a composite can be built from other components) and distributed (a composite can span several machines). This article also shows how the asynchronous component model can be used to statically assert component determinism.

Please use the following format when citing this chapter: Caromel, D., Henrio, L., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 165-183.

1 Introduction

The advent of components in programming technology raises the question of their formal ground, intrinsic semantics, and above all their compositional semantics. It represents a real challenge as practical component models are usually quite complex, featuring distribution over local or wide area networks. However, few formal models for components have been proposed so far [4, 20, 3, 14].

Since the first ideas about software components, usually dated in 1968 [1], the design of a reusable piece of software has technically evolved. From the first off-the-shelf modules, a component has become a complex piece of parameterized code with attributes to be set. Its behavior can be adapted with various non-functional aspects (life-cycle, persistence, etc.). Finally, such a piece of code is to be deployed in a hosting infrastructure; sometimes it can also be retrieved for replacement with a new version.

In recent years, one crucial new aspect of components has been introduced: not only the interfaces being offered are specified, but also the needed interfaces. A first key aspect of our work is to take this feature into account: the proposed model allows one to specify that a software component provides well-defined interfaces, and requires well-defined services or interfaces. A second and important contribution is to take into account components that are distributed over several machines. A given component can span several hosts in the network as a unique entity. This work goes further than a distributed-component infrastructure that merely allows two components to talk over the network. Finally, the proposed components are hierarchical (allowing a compositional specification and verification of the behavior of large scale systems), communicate with remote method invocations (versus raw messages), and are as decoupled as possible (asynchronous, to scale over large area networks).

When building some kind of component calculus, one has the option to start from scratch, or on the contrary to rely as much as possible on the syntax and semantics of a programming calculus. This paper clearly takes the latter approach, relying as much as possible on a long history of research on concurrent and distributed calculi. It is in accordance with the practical situation where a component infrastructure is usually added on top of a programming language. The main contributions of this paper are:
- a formalization of a component model featuring distribution, asynchrony, and hierarchical composition, with two translations defining the semantics;
- usage of components as a convenient abstraction for statically ensuring determinism, which, to our knowledge, is a totally novel approach.

This article is first a direct formalization of the component model implemented in ProActive [5, 11]. More generally, our distributed component model is minimally characterized by asynchronous components, hierarchy, no shared memory, and a single-threaded lowest level of components; thus, it can be adapted to turn any object model into distributed decoupled components communicating by structured method calls.

Taking advantage of ASP and its properties [10], summarized in Section 2, this article provides a formal syntax for the description of distributed components in Section 3. Then, Section 4 shows an example of a deterministic component. Two translational semantics are given in Section 5. Finally, components provide a suitable abstraction for statically identifying deterministic programs, as shown in Section 6.

2 Background

2.1 Some Related Works

ASP is based on the untyped imperative object calculus of Abadi and Cardelli [2], with a local semantics inspired from [15]. Futures [16, 13] are used to represent awaited results of remote calls; determinism is strongly related to process networks [17] and linear channels [18]. A comparison of ASP with other calculi can be found in [10, 9].

Components over Actors are presented in [4]. Compared to our work, Actor components are not hierarchical and do not benefit from the notion of futures. Moreover, the communication and evaluation model of Actors cannot guarantee the causal ordering and determinism properties featured by ASP. Wright [3] focuses on the definition of connections and interactions, and on the specification of behavior. Since connectors have their own activity, it is impossible to adapt our determinism properties to Wright.


Stefani et al. [6, 20] introduced the kell calculus, which is able to model components and especially the control of sub-components. We rather demonstrate how to build distributed components that behave deterministically and for which the deterministic behavior is statically decidable. Moreover, the properties shown here rely on properties of communications and semantics of the calculus that are not ensured directly by the kell calculus, and its adaptation would be more complicated than the new calculus presented here. However, since those two approaches are rather orthogonal, one could expect to benefit from both by adapting a kell calculus-like control of components with (an adaptation of) ASP as the underlying calculus.

Bruneton, Coupaye and Stefani also proposed a hierarchical component model, Fractal [12], together with its reference implementation Julia [7]. Our work can also be considered as a foundation for distributed Fractal components, focusing on the hierarchical aspect rather than on the component control.

2.2 ASP Calculus: Syntax and Informal Semantics

The ASP calculus [10] is an extension of the impς-calculus [2, 15] with two primitives (Serve and Active) to deal with distributed objects. The ASP calculus is implemented as a Java library (ProActive [11]). ASP strongly links the concepts of thread and of object. It is minimally characterized by:
- Sequential activities: each object is manipulated by a single thread,
- Communications are asynchronous method calls, and
- Futures as first class objects representing awaited results.

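$$\begin{array}{rcll}
a, b \in L & ::= & x & \text{variable,}\\
 & \mid & [l_i = b_i;\ m_j = \varsigma(x_j, y_j)\,a_j]^{i \in 1..n}_{j \in 1..m} & \text{object definition,}\\
 & \mid & a.l_i & \text{field access,}\\
 & \mid & a.l_i := b & \text{field update,}\\
 & \mid & a.m_j(b) & \text{method call,}\\
 & \mid & clone(a) & \text{superficial copy,}\\
 & \mid & Active(a, m_j) & \text{activates } a;\ m_j \text{ defines the service policy,}\\
 & \mid & Serve(M) & \text{serves a request among the set } M \text{ of method labels, } M = \{m_1, \ldots, m_k\}
\end{array}$$

Fig. 1. ASP Syntax ($l_i$ are field names, $m_j$ are method names)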

ASP is formalized as follows. An activity (denoted by $\alpha, \beta, \gamma, \ldots$) is composed of a thread manipulating a set of objects put in a store. The primitive $Active(a, m)$ creates a new activity containing the object $a$, which is said to be active; $m$ is a method called upon the activity creation. Every request (method call) sent to an activity is actually sent to this master object. An activity also contains the pending requests (requests that have been received and should be served later) and the computed results of the served requests (future values). $AO(\alpha)$ represents a reference to the remote active object of activity $\alpha$. A parallel configuration (denoted by $P, Q, \ldots$) is a parallel composition of activities:

$$P, Q ::= \alpha[a_\alpha; \sigma_\alpha; \iota_\alpha; F_\alpha; R_\alpha; f_\alpha] \parallel \beta[\cdots] \parallel \cdots$$

where $a_\alpha$ is the term currently evaluated in $\alpha$, $\sigma_\alpha$ is the store (association between locations $\iota_i$ and objects), $\iota_\alpha$ is the location of the active object inside $\sigma_\alpha$, $F_\alpha$ is the list of calculated futures, $R_\alpha$ is the request queue, and $f_\alpha$ is the future corresponding to $a_\alpha$.

Futures are generalized references that can be manipulated as local ones. They can be transmitted to other activities, and future identifiers are unique for the whole configuration. However, upon a strict operation (field or method access, field update, clone) on a future, the local execution is stopped until the value of the future is updated.

Calling a method on an active object atomically adds a new entry in a request queue, associates a future to the response, and deep copies the argument of the request into the store of the destination activity. Deep copy prevents distant references to passive objects; synchronous request delivery ensures causal order between requests.

The primitive $Serve(M)$ can appear at any point in the source code. Its execution stops the activity until a request on one of the methods of the set $M$ is found in the request queue. The oldest such request is then removed from the request queue and executed (served). Once the response to a request is computed, the corresponding value (future value) becomes available and every activity can get it. The futures associated with the currently served requests are called the current futures. Returning the value associated to a future (also called "updating a future") consists in replacing a reference to a future by a deep copy of the future value. We proved that the value of a future can be returned at any time without any consequence on the execution.

An operational semantics for ASP has been detailed in [10] and is denoted by $\rightarrow$. It is based on a classical local reduction ($\rightarrow_S$) on ς-calculus terms [2]. This reduction specifies a single reduction point inside each activity, which ensures local sequentiality. $\mathcal{R}[a]$ denotes a reduction context, where the reduction point is inside $a$; thus $a_\alpha = \mathcal{R}[\iota.m_j(\iota')]$ means the next reduction of activity $\alpha$ will consist in performing a method call on the object referenced (locally) by $\iota$; if moreover $\sigma_\alpha(\iota) = AO(\beta)$ then this is a remote method call to activity $\beta$. $\xrightarrow{*}$ denotes the reflexive transitive closure of $\rightarrow$.

2.3 ASP Properties: Deterministic Objects Networks

This section presents the properties of the ASP calculus; mainly, it recalls the definition of deterministic object networks (DON), which identifies a set of ASP terms that behave deterministically. Though DON terms are based on an intuitive notion, "non-determinism only originates from conflicting requests", ASP is the first calculus to feature such a property for concurrent imperative objects.
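The request queue and the $Serve(M)$ primitive of Section 2.2 can be sketched in Python as follows. This is a simplified, single-threaded illustration, not the ASP semantics or the ProActive API: the class and method names are assumptions, and where ASP blocks the activity until a matching request arrives, this sketch simply returns False.

```python
import copy
from collections import deque


class Future:
    """Placeholder for the awaited result of a request."""
    def __init__(self):
        self.resolved = False
        self.value = None


class Activity:
    """One active object with its pending-request queue."""
    def __init__(self, active_object):
        self.obj = active_object
        self.requests = deque()              # oldest request first

    def send(self, method, arg):
        """Enqueue a request with a deep copy of its argument; return a future."""
        future = Future()
        self.requests.append((method, copy.deepcopy(arg), future))
        return future

    def serve(self, labels):
        """Serve the oldest pending request whose method is in `labels`."""
        for index, (method, arg, future) in enumerate(self.requests):
            if method in labels:
                del self.requests[index]
                future.value = getattr(self.obj, method)(arg)
                future.resolved = True
                return True
        return False                         # ASP would block here instead
```

Because the activity itself chooses which labels to serve and always takes the oldest matching request, the order of service depends only on the order of arrival of requests that share a label set.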


In the following, $\alpha_P$ denotes the activity $\alpha$ of configuration $P$. Without any restriction, and to allow comparison based on activity identifiers, we suppose that the freshly allocated activity names are chosen deterministically: the first activity created by $\alpha$ will have the same identifier for all executions.

Potential Services Let $\mathcal{M}_{\alpha_P}$ be an approximation of the set of $M$ that can appear in the $Serve(M)$ instructions that the activity $\alpha$ may perform in the future. In other words, if an activity may perform a service on a set of method labels, then this set must belong to $\mathcal{M}_{\alpha_P}$:

$$\exists Q,\ P \xrightarrow{*} Q \ \wedge\ a_{\alpha_Q} = \mathcal{R}[Serve(M)] \ \Rightarrow\ M \in \mathcal{M}_{\alpha_P}$$

This set can be specified by the programmer or statically inferred.

Interfering Requests Two requests on methods $m_1$ and $m_2$ are said to be interfering in $\alpha$ in a program $P$ if they both belong to the same potential service, that is to say if they can appear in the same $Serve(M)$ primitive: requests on $m_1$ and $m_2$ are interfering if $\{m_1, m_2\} \subseteq M \in \mathcal{M}_{\alpha_P}$.

Equivalence Modulo Replies $\equiv_F$, defined in [9], is an equivalence relation on parallel configurations modulo the renaming of locations and futures and permutations of requests that cannot interfere. Moreover, a reference to a future already calculated (but not locally updated) is equivalent to a local reference to the part of the store which is the deep copy of the future value.

Deterministic Object Networks If two interfering requests cannot be sent to the same destination ($\beta$ below) at the same moment, then the program behaves deterministically. Of course, two such requests would originate from two different activities ($\alpha_Q$). "There is at most one" is denoted by $\exists^1$.

Definition 1 (DON) A configuration $P$ is a Deterministic Object Network ($DON(P)$) if it cannot be reduced to a configuration where two interfering requests can be sent concurrently to the same destination activity:

$$P \xrightarrow{*} Q \ \Rightarrow\ \forall \beta \in Q,\ \forall M \in \mathcal{M}_{\beta_Q},\ \exists^1 \alpha_Q \in Q,\ \exists m \in M,\ \exists \iota, \iota',\ a_{\alpha_Q} = \mathcal{R}[\iota.m(\iota')] \ \wedge\ \sigma_{\alpha_Q}(\iota) = AO(\beta)$$

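Theorem 1 (DON determinism).

$$DON(P) \ \wedge\ P \xrightarrow{*} Q_1 \ \wedge\ P \xrightarrow{*} Q_2 \ \Rightarrow\ \exists R_1, R_2,\ Q_1 \xrightarrow{*} R_1 \ \wedge\ Q_2 \xrightarrow{*} R_2 \ \wedge\ R_1 \equiv_F R_2$$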


$DON(P)$ ensures that, for all orders of request sending, we always serve the requests in the same order. Thus, provided no two requests can be sent at the same moment on the same potential service of a given destination, the considered program behaves deterministically. Section 6 will show how components can ensure this statically.
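The notion of interfering requests used by the DON property can be illustrated with a small Python sketch. Potential services are represented as a list of sets of method labels; the function names and the representation of pending requests as (sender, method) pairs are assumptions made for this illustration, not part of ASP.

```python
def interfering(m1, m2, potential_services):
    """True if m1 and m2 can appear in the same Serve(M) of the destination."""
    return any(m1 in M and m2 in M for M in potential_services)


def conflicts(pending, potential_services):
    """List pairs of requests from different senders that interfere.

    `pending` holds (sender, method) pairs that could be sent to the same
    destination at the same moment; an empty result is what DON requires.
    """
    found = []
    for i, (sender_a, method_a) in enumerate(pending):
        for sender_b, method_b in pending[i + 1:]:
            if sender_a != sender_b and interfering(
                    method_a, method_b, potential_services):
                found.append(((sender_a, method_a), (sender_b, method_b)))
    return found
```

For instance, an activity that only ever performs Serve on single-label sets, one label per sender, yields no conflicts, whereas one that performs Serve on a set containing labels used by two different senders does.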

3 Distributed Components

This section demonstrates how to build hierarchical and distributed components upon ASP. The asynchronous components presented below interact with method calls in an object-oriented way. The component specification presented in this section can be viewed as an abstraction of a classical ADL (e.g. the Fractal ADL [12]).

Definition 2 (Primitive Component - Figure 2) A primitive component is characterized by a component name Name, together with names for a set of Server Interfaces (SI) and a set of Client Interfaces (CI). We denote by $Exported(PC)$ the set $\{SI_i\}^{i \in 1..k}$ and by $Imported(PC)$ the set $\{CI_j\}^{j \in 1..l}$.

$$PC ::= Name < \{SI_i\}^{i \in 1..k}, \{CI_j\}^{j \in 1..l} >$$

Primitive Component Activity: To give functionalities to a PC, we attach to it an ASP term, say $a$, corresponding to an object to be activated and its dependencies (passive objects); the service method of $a$, $srv$ (the method to be triggered on activation of $a$); a mapping from SIs to subsets of the served methods; and a mapping from CIs to names of fields of the object $a$ (these fields will store references to components). $\mathcal{M}$ ranges over the set of method labels, and $\mathcal{L}$ over the set of field labels of $a$.

$$PC_{Act} ::= Name_{Act} < a, srv, \varphi_S, \varphi_C >$$

where $\varphi_S : Exported(PC) \to \mathcal{P}(\mathcal{M})$ and $\varphi_C : Imported(PC) \to \mathcal{L}$ are total functions.

This definition requires that a content $PC_{Act}$ is attached to each primitive component PC; this content consists of a single activity. Composite components can be built by interconnecting other components, either primitive or composite, and exporting some SIs and CIs. We suppose that for all components, every interface has a different name (but names could also be disambiguated by using qualified names).

Definition 3 (Composite Component) A composite component is a set of components exporting some server interfaces ($\varepsilon_S$), some client interfaces ($\varepsilon_C$), and connecting some client and server interfaces (defining a partial binding $\psi$); only interfaces of the direct sub-components can be used:

[Figure 2: a primitive component PC with its server interfaces (requests sent to PC on methods of $SI_i$) and its client interfaces (requests sent by PC on $CI_j$).]

Fig. 2. A primitive component PC

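$$CC ::= Name \ll C_1, \ldots, C_m;\ \varepsilon_S;\ \psi;\ \varepsilon_C \gg$$

where a component $C_i$ is either a primitive or a composite one, $C ::= PC \mid CC$, and each client interface CI inside CC can only be connected once, leading to the following definition:

$$\begin{array}{ll}
\varepsilon_S : Exported(CC) \to \bigcup_{C_i \in C_1 \ldots C_m} Exported(C_i) & \text{is a total function}\\
\psi : \bigcup_{C_i \in C_1 \ldots C_m} Imported(C_i) \to \bigcup_{C_i \in C_1 \ldots C_m} Exported(C_i) & \text{is a partial function}\\
\varepsilon_C : \bigcup_{C_i \in C_1 \ldots C_m} Imported(C_i) \to Imported(CC) & \text{is a partial surjective function}
\end{array}$$

such that $dom(\psi) \cap dom(\varepsilon_C) = \emptyset$. We define $Exported(CC) = dom(\varepsilon_S)$ and $Imported(CC) = codom(\varepsilon_C)$.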

Defining $\varepsilon_S$ as a function allows a given internal server interface to be exported as several external ones, but requires each incoming request to be communicated to a single destination (each imported interface is bound to a single server interface of an internal component). Similarly, a client interface is exported only once, so that communications have a single determinate destination: $\varepsilon_C$ is a function (each client interface of an internal component is plugged at most once to an exported interface). $\psi$ is a function so that internal communications are determinate too (each client interface of an internal component is plugged at most once to another internal server interface). Finally, also to ensure unicity of communication destination, $\varepsilon_C$ and $\psi$ have disjoint domains, so that an internal client interface cannot be both bound internally and exported.

Correct Connections Figure 3 sums up the possible bindings that are allowed according to Definition 3. The component shown in the figure is a valid CC but not a DCC (DCC will be defined in Section 6.2, Definition 8).

Incorrect Connections Figure 4 shows the impossible bindings that correspond to the restrictions of Definition 3. The condition of Definition 3 that prevents the composition from being correct is written above each sub-figure.
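The restrictions of Definition 3 can be checked mechanically. The Python sketch below takes each of the three relations as a list of (source, target) pairs, so that a relation that is not a function can be represented and detected; the function name, the pair representation, and the error strings are assumptions made for this illustration.

```python
def check_bindings(eps_s, psi, eps_c):
    """Return the list of violated restrictions (empty if none).

    eps_s: (external SI, internal SI) pairs
    psi:   (internal CI, internal SI) pairs
    eps_c: (internal CI, external CI) pairs
    """
    errors = []
    for name, pairs in (("eps_S", eps_s), ("psi", psi), ("eps_C", eps_c)):
        sources = [source for source, _ in pairs]
        if len(sources) != len(set(sources)):
            errors.append(name + " is not a function")
    bound = {source for source, _ in psi}
    exported = {source for source, _ in eps_c}
    if bound & exported:
        errors.append("dom(psi) and dom(eps_C) are not disjoint")
    return errors
```

Each reported error corresponds to one of the incorrect bindings of Figure 4; totality of $\varepsilon_S$ and surjectivity of $\varepsilon_C$ hold by construction here, since the exported and imported interface sets are read off the pairs themselves.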

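[Figure 3: a composite component with its allowed bindings.]

Fig. 3. A composite component

[Figure 4: four sub-figures of incorrect bindings, labeled with the violated condition: "$\varepsilon_C$ is a function", "$\psi$ is a function", "$dom(\varepsilon_C) \cap dom(\psi) = \emptyset$", "$\varepsilon_S$ is a function".]

Fig. 4. Incorrect bindings between components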

To conclude this section, we present two useful definitions: closed components, which have no interface and form independent systems; and complete components, for which all interfaces are either bound internally or exported, so that every request sent on a client interface has a destination and every server interface can at some point receive requests.

Definition 4 (Closed Component) A component $C$ is closed if it neither imports nor exports any interface: $Imported(C) = \emptyset \wedge Exported(C) = \emptyset$.

Definition 5 (Complete Component) A primitive component is complete. A composite component $Name \ll C_1, \ldots, C_m;\ \varepsilon_S;\ \psi;\ \varepsilon_C \gg$ is complete if it consists of complete components and all its internal interfaces are plugged or exported:

$$C_1, \ldots, C_m \text{ are complete components} \ \wedge\ dom(\psi) \cup dom(\varepsilon_C) = \bigcup_{C_i \in C_1 \ldots C_m} Imported(C_i) \ \wedge\ codom(\psi) \cup codom(\varepsilon_S) = \bigcup_{C_i \in C_1 \ldots C_m} Exported(C_i)$$

Non-complete components contain unplugged interfaces: some of the CIs of the sub-components must not be used (request without destination) or some of the SIs never receive any request (potential deadlock). As such, it is reasonable to forbid them.

4 Example: A Fibonacci Component

Consider the Process Network that computes the Fibonacci numbers in [19]. Let us write an equivalent composite component as shown in Figure 5.

[Figure 5: the composite component FIB, with server interface ComputeFib(k) and client interface CI: send(fib(1)) ... send(fib(k)).]

$$\begin{array}{l}
FIB \ll Cont < \{SI_1, SI'\}, \{CI_c\} >,\\
\quad CC \ll Cons1 < \{SI'_4\}, \{CI_1, CI_2\} >,\ Cons2 < \{SI_2\}, \{CI_3, CI''\} >;\\
\qquad \{SI_4 \mapsto SI'_4\};\ \{CI_2 \mapsto SI_2\};\ \{CI_1 \mapsto CI'_1,\ CI_3 \mapsto CI'_3,\ CI'' \mapsto CI'\} \gg,\\
\quad Add < \{SI_a, SI_b\}, \{CI_a\} >;\\
\quad \{SI \mapsto SI'\};\ \{CI_c \mapsto SI_4,\ CI'_1 \mapsto SI_a,\ CI'_3 \mapsto SI_b,\ CI_a \mapsto SI_1\};\ \{CI' \mapsto CI\} \gg\\[4pt]
Add_{Act} = < [n1 = 0,\ n2 = 0,\ out = [\,];\\
\quad srv = \varsigma(s, \_)\,Repeat(Serve(set1);\ Serve(set2);\ s.out.send(s.n1 + s.n2)),\\
\quad set1 = \varsigma(s, n)\,s.n1 := n,\ set2 = \varsigma(s, n)\,s.n2 := n\,],\\
\quad srv,\ \{SI_a \mapsto \{set1\},\ SI_b \mapsto \{set2\}\},\ \{CI_a \mapsto out\} >\\[4pt]
Cons1_{Act} = < [out = [\,],\ nxt = [\,];\\
\quad srv = \varsigma(s)\,(out.set1(1);\ nxt.send(1);\ Repeat(Serve(send))),\\
\quad send = \varsigma(s, n)\,(out.set1(n);\ nxt.send(n))\,],\\
\quad srv,\ \{SI'_4 \mapsto \{send\}\},\ \{CI_1 \mapsto out,\ CI_2 \mapsto nxt\} >
\end{array}$$

Fig. 5. A composite component for computing Fibonacci numbers

Both Cons1 and Cons2 forward their input to their two client interfaces (upon initialization they respectively send 1 and 0 to their client interfaces); they are merged in a composite component. Add simply sends on its output interface the sum of what the component receives on its two server interfaces. A controller Cont exports a server interface ($ComputeFib(k)$) taking an integer $k$ and forwarding $k - 1$ times its input on the other interface $SI_1$ to $CI_c$.

Primitive components for Add and Cons1 are specified by $Add_{Act}$ and $Cons1_{Act}$; the others can be specified similarly ($Repeat$ performs an infinite loop, ";" expresses sequential composition; both can be expressed directly in ASP). Cons2 can be specified by renaming inputs and outputs of Cons1. Finally, the FIB composite component is built by interconnecting those components as shown and expressed in the figure. For example, requests sent by Cons1 on $CI_1$ are first exported on interface $CI'_1$ of CC and then sent, according to the bindings of FIB, to the interface $SI_a$ of Add. Cons2 sends send requests to the exported client interface, thus FIB produces $Fib(1) \ldots Fib(k)$.
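The data flow between Cons1, Cons2 and Add described above can be simulated with FIFO queues. The Python sketch below is a simplification under stated assumptions: it is single-threaded, it omits the controller Cont (Add feeds Cons1 directly, and the loop just stops after `count` outputs), and the channel names are hypothetical.

```python
from collections import deque


def fib_network(count):
    """Simulate the Cons1/Cons2/Add network and return its first outputs."""
    to_cons1, to_cons2 = deque(), deque()    # requests pending at Cons1, Cons2
    add_set1, add_set2 = deque(), deque()    # Add's two server interfaces
    output = []

    # Initialization: Cons1 sends 1 and Cons2 sends 0 on their client interfaces.
    add_set1.append(1)
    to_cons2.append(1)
    add_set2.append(0)
    output.append(0)

    while len(output) < count:
        if to_cons1:                         # Cons1 forwards to Add and Cons2
            n = to_cons1.popleft()
            add_set1.append(n)
            to_cons2.append(n)
        if to_cons2:                         # Cons2 forwards to Add and outside
            n = to_cons2.popleft()
            add_set2.append(n)
            output.append(n)
        if add_set1 and add_set2:            # Add serves set1, then set2
            to_cons1.append(add_set1.popleft() + add_set2.popleft())
    return output[:count]
```

Each component serves requests on a given interface in arrival order and each interface has a single sender, so any fair scheduling of the three components would produce the same output sequence; the fixed order in the loop is only one such schedule.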


5 Translational Semantics

This section gives two possible translational semantics for the component model, with ASP as the target calculus. The first one only instantiates primitive components and directly binds them, but is not compositional. The second one instantiates an additional activity for each primitive and each composite component; it is defined recursively on the component structure. Both semantics first rely on a deterministic deployment phase; then, components can be started and communicate by asynchronous method calls. Both translations rely on the fact that the names of the interfaces are pairwise distinct, and thus a single component corresponds to each interface.

5.1 A Static Deployment

In the case of a closed component CC = Name

$$C \sqsubset Name \ll C_1, \ldots, C_m;\ \varepsilon_S;\ \psi;\ \varepsilon_C \gg \ \Leftrightarrow\ \exists i \in 1..m,\ C = C_i \vee C \sqsubset C_i$$

The union of two disjoint partial functions is denoted $\oplus$:

$$(f \oplus g)(x) = \begin{cases} f(x) & \text{if } x \in dom(f),\\ g(x) & \text{if } x \in dom(g),\\ \text{undefined} & \text{otherwise.} \end{cases}$$
For each SI of a composite component, ^ returns the primitive component interface which is (recursively) exported to it {Id\^ is the identity function on A, C Q C -^ C = C W C d C). And symmetrically, /x recursively follows imported interfaces. ^PC =

Id\ExpoTied{PC)

£,cc •• [j

Exported{CC') -^

CC'CCC

[j

ExpoHed{PC)

PC\ZCC

CName
Note that, if $C$ is complete then $\xi_C$ is total.

$$\mu_{PC} = Id|_{Imported(PC)} \qquad \mu_{CC} : \bigcup_{PC \sqsubset CC} Imported(PC) \to \bigcup_{CC' \sqsubseteq CC} Imported(CC')$$

A«Name
$\Psi_C$ defines all the bindings defined inside $C$:

$$\Psi_{PC} : \emptyset \to \emptyset \qquad \Psi_{CC} : \bigcup_{C \sqsubset CC} Imported(C) \to \bigcup_{C \sqsubset CC} Exported(C)$$

$$\Psi_{Name \ll C_1, \ldots, C_m;\ \varepsilon_S;\ \psi;\ \varepsilon_C \gg} = \psi \oplus \Psi_{C_1} \oplus \cdots \oplus \Psi_{C_m}$$
In the general case, $\mu_C$ and $\Psi_C$ are partial functions. In the case of a complete component $C$, for any client interface CI of $C$ or of a component inside $C$, either $\mu_C(CI)$ or $\Psi_C(CI)$ is defined.

$$\Phi_C : \bigcup_{PC \sqsubseteq C} Imported(PC) \to \bigcup_{PC \sqsubseteq C} Exported(PC) \qquad \Phi_C = \xi_C \circ \Psi_C \circ \mu_C$$

For a complete closed component $C$, $\Phi_C$ is a total surjective function. We define below the deployment of the composite component CC: this static deployment creates as many activities as there are primitive components and binds their interfaces accordingly. Let $PC_n$ range over primitive components defined inside CC: $PC_n = Name_n < \{SI_{ni}\}^{i \in 1..k_n}, \{CI_{nj}\}^{j \in 1..l_n} >$ such that $PC_n \sqsubset CC$; and let $PC_{n\,Act} = Name_{n\,Act} < a_n, srv_n, \varphi_{S_n}, \varphi_{C_n} >$ range over their activities. We denote by $N_S(SI_p)$ the index of the primitive component defining the interface $SI_p$: $N_S(SI_{ni}) = n$. The term defined in Figure 6 deploys the composite component CC defined above (the mutually recursive definition of activities, let rec ... and ..., can be built from core ASP terms). This deployment phase does not rely on any request and thus is entirely deterministic.

$$\begin{array}{l}
\textbf{let rec } c_1 = Active(a_1.\varphi_{C_1}(CI_{11}) := c_{N_S(\Phi_{CC}(CI_{11}))}.\ \cdots\ .\varphi_{C_1}(CI_{1 l_1}) := c_{N_S(\Phi_{CC}(CI_{1 l_1}))},\ srv_1)\\
\textbf{and } c_2 = Active(a_2.\varphi_{C_2}(CI_{21}) := c_{N_S(\Phi_{CC}(CI_{21}))}.\ \cdots\ .\varphi_{C_2}(CI_{2 l_2}) := c_{N_S(\Phi_{CC}(CI_{2 l_2}))},\ srv_2)\\
\textbf{and } \ldots\\
\textbf{and } c_n = Active(a_n.\varphi_{C_n}(CI_{n1}) := c_{N_S(\Phi_{CC}(CI_{n1}))}.\ \cdots\ .\varphi_{C_n}(CI_{n l_n}) := c_{N_S(\Phi_{CC}(CI_{n l_n}))},\ srv_n)
\end{array}$$

Fig. 6. Deployment of a composite component

This is sufficient to give a semantics to the components with all useful connections bound. However, here components are not runtime entities, and this translation is neither modular nor gives any way of manipulating the components dynamically (e.g. component reconfiguration is far from trivial). An active object representation of each composite component will make them accessible and reconfigurable at runtime.
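The composition $\Phi_C = \xi_C \circ \Psi_C \circ \mu_C$ used by the static deployment amounts to flattening the hierarchy: follow a client interface outward through exports, cross one binding, then follow the server interface inward to a primitive component. The Python sketch below illustrates this with plain dictionaries; the function names and the dictionary encoding (all $\varepsilon_C$, $\psi$ and $\varepsilon_S$ maps of the hierarchy merged into three dictionaries, which is possible because interface names are pairwise distinct) are assumptions made for this illustration.

```python
def follow(interface, mapping):
    """Follow `mapping` from `interface` until no further entry applies."""
    while interface in mapping:
        interface = mapping[interface]
    return interface


def flatten(client_interfaces, client_exports, bindings, server_exports):
    """Map each primitive client interface to its primitive server interface.

    client_exports: inner CI -> outer CI     (all eps_C maps merged)
    bindings:       CI -> SI                 (all psi maps merged)
    server_exports: outer SI -> inner SI     (all eps_S maps merged)
    Interfaces that end up unbound are omitted from the result.
    """
    result = {}
    for ci in client_interfaces:
        outer = follow(ci, client_exports)                        # mu
        if outer in bindings:                                     # Psi
            result[ci] = follow(bindings[outer], server_exports)  # xi
    return result
```

In the result, every entry connects two primitive components directly, which is the information Figure 6 needs in order to initialize the client-interface fields of each activity.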


5.2 A Compositional Translation

The compositional translational semantics adds one active object for each composite and for each primitive component. This translation does not suppose that any component is closed but requires that method names can be manipulated. During the running phase, requests have to be dispatched between components: when a PC receives a send request from its contained active object, it serializes this request and forwards a Call request to the destination to which the CI is plugged. Then this method call may go through several CCs (first through CIs and then SIs). Finally, the Call request is received by a PC which de-serializes the request and calls a function on the contained active object.

Primitive Components Each PC is translated into a functional active object and a component active object. The functional active object is built from the object specified in $PC_{Act}$, but every $\varphi_C(CI_j)$ field of the active object now references a passive object $CIobj_j$, and requests are sent through this object, which acts as a proxy. The $CIobj_j$ serializes each request (builds an object containing the method name) before forwarding it to the CI interface of the embedding PC ($m_j$ methods range over the methods of the interface CI).

$$CIobj_j = [PC = [\,];\ \forall m_j,\ m_j = \varsigma(s, x)\,s.PC.send(CI_j, m_j, x)]$$

$CIobj_j$ allows the component to systematically communicate using the encapsulating active object defined below. The active object for the primitive component contains $CI_j$ fields which store the destination component and interface to which they are plugged. Every request arriving at the $CI_j$ interface has to be forwarded to the destination identified and stored inside the $CI_j$ fields. Figure 7 shows the object that is instantiated for each primitive component; note that $ao$ is initialized with the object containing the activity of the component, in which the $\varphi_C(CI_j)$ fields are replaced with $CIobj_j$ objects.

\[
\begin{array}{l}
[\![\, Name\ \langle \{SI_i\}^{i \in 1..k}, \{CI_j\}^{j \in 1..l} \rangle \text{ attached to the activity } Name_{Act}\ \langle a, srv, \varphi_S, \varphi_C \rangle \,]\!] = \\
\quad \textbf{let } pc = Active([\ \forall j \in 1..l,\ CI_j = [CDest = [],\ IDest = []],\ started = false,\ act = []; \\
\qquad \forall j \in 1..l,\ setCI_j = \varsigma(s, CDest', IDest')(s.CI_j.CDest := CDest').IDest := IDest', \\
\qquad setact = \varsigma(s, a)\, s.act := a, \\
\qquad start = \varsigma(s)\, s.started := true, \\
\qquad Call = \varsigma(s, SI_i, m_j, x)\, s.act.m_j(x), \\
\qquad send = \varsigma(s, CI_j, m_k, x)\, s.CI_j.CDest.Call(s.CI_j.IDest, m_k, x), \\
\qquad srv = \varsigma(s)\, Repeat(\textbf{if } started \textbf{ then } Serve(Call, send) \textbf{ else } Serve(setCI_1..setCI_l, setact, start))\ ],\ srv) \textbf{ in} \\
\quad \textbf{let } ao = Active((a.\varphi_C(CI_1) := (CIobj_1.PC := pc) \ldots).\varphi_C(CI_l) := (CIobj_l.PC := pc),\ srv) \textbf{ in} \\
\quad pc.setact(ao);\ pc
\end{array}
\]

Fig. 7. Primitive Component Deployment
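The forwarding scheme of the primitive component can be illustrated with a minimal Python sketch. It is deliberately synchronous: it ignores activities, request queues, and futures, and only shows how a proxy in the role of $CIobj_j$ turns a method call into a send on the enclosing component, which then issues a Call on the destination. The class names `Membrane` and `ClientProxy`, the example objects, and the interface names `"out"`, `"in"` and `"run_si"` are hypothetical and chosen for illustration only.

```python
class Membrane:
    """Stand-in for the component object pc of Fig. 7 (synchronous, no futures)."""

    def __init__(self, client_interfaces):
        # CI_j -> (CDest, IDest); unset until set_ci is called.
        self.bindings = {ci: None for ci in client_interfaces}
        self.act = None
        self.started = False

    def set_ci(self, ci, cdest, idest):
        self.bindings[ci] = (cdest, idest)

    def set_act(self, act):
        self.act = act

    def start(self):
        self.started = True

    def call(self, si, method, arg):
        # "De-serialize": look the method up by name on the functional object.
        return getattr(self.act, method)(arg)

    def send(self, ci, method, arg):
        # Forward to the component and interface to which CI is plugged.
        cdest, idest = self.bindings[ci]
        return cdest.call(idest, method, arg)


class ClientProxy:
    """Stand-in for CIobj_j: any method call becomes send(CI_j, name, arg)."""

    def __init__(self, pc, ci):
        self._pc = pc
        self._ci = ci

    def __getattr__(self, method):
        # Only reached for names that are not real attributes of the proxy.
        return lambda arg: self._pc.send(self._ci, method, arg)


class Doubler:  # hypothetical server-side functional object
    def double(self, x):
        return 2 * x


class Client:  # hypothetical client-side functional object
    def __init__(self, out):
        self.out = out

    def run(self, x):
        return self.out.double(x)


server_pc = Membrane([])
server_pc.set_act(Doubler())
server_pc.start()

client_pc = Membrane(["out"])
client_pc.set_act(Client(ClientProxy(client_pc, "out")))
client_pc.set_ci("out", server_pc, "in")
client_pc.start()

result = client_pc.call("run_si", "run", 21)
```

In this sketch the functional object never holds a direct reference to the server: every outgoing call passes through its own component object, which is the property the compositional translation relies on.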

Asynchronous Distributed Components: Concurrency and Determinacy


Composite Components

Each CC contains the same CI fields as PCs, together with SI fields storing destinations to which received method calls must be forwarded. For each composite component $Name \ll C_1, \ldots, C_m; \varepsilon_S; \psi; \varepsilon_C \gg$, we define $N'_S(SI_i)$ as the unique number such that $C_{N'_S(SI_i)}$ defines the server interface $SI_i$; and similarly $N_C(CI_j)$ such that $C_{N_C(CI_j)}$ is the sub-component containing the client interface $CI_j$. Figure 8 describes the instantiation of a composite component: it creates an activity for this component, binds the client interfaces according to $\varepsilon_C$ and $\psi$, and the server interfaces according to $\varepsilon_S$.

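\[
\begin{array}{l}
[\![\, Name \ll C_1, \ldots, C_m; \varepsilon_S; \psi; \varepsilon_C \gg \,]\!] = \\
\quad \textbf{let } c_1 = [\![ C_1 ]\!] \textbf{ in} \ldots \textbf{let } c_m = [\![ C_m ]\!] \textbf{ in} \\
\quad \textbf{let } Name = Active([\ \forall SI_i \in dom(\varepsilon_S),\ SI_i = [CDest = c_{N'_S(\varepsilon_S(SI_i))},\ IDest = \varepsilon_S(SI_i)], \\
\qquad \forall CI_j \in codom(\varepsilon_C),\ CI_j = [CDest = [],\ IDest = []],\ started = false; \\
\qquad \forall CI_j \in codom(\varepsilon_C),\ setCI_j = \varsigma(s, Cdst', Idst')((s.CI_j).CDest := Cdst').IDest := Idst', \\
\qquad \forall SI_i \in dom(\varepsilon_S),\ setSI_i = \varsigma(s, Cdst', Idst')((s.SI_i).CDest := Cdst').IDest := Idst', \\
\qquad Call = \varsigma(s, CI\_SI, m_j, x)\, s.CI\_SI.CDest.Call(s.CI\_SI.IDest, m_j, x), \\
\qquad start = \varsigma(s)\, s.started := true, \\
\qquad srv = \varsigma(s)\, Repeat(\textbf{if } started \textbf{ then } Serve(Call) \textbf{ else } Serve(\forall CI_j \in dom(\varepsilon_C)\ setCI_j,\ \forall SI_i \in dom(\varepsilon_S)\ setSI_i,\ start))\ ],\ srv) \textbf{ in} \\
\quad \forall CI_j \in dom(\psi),\ c_{N_C(CI_j)}.setCI_j(c_{N'_S(\psi(CI_j))}, \psi(CI_j)); \\
\quad \forall CI_j \in dom(\varepsilon_C),\ c_{N_C(CI_j)}.setCI_j(Name, \varepsilon_C(CI_j)); \\
\quad c_1.start(); \ldots; c_m.start();\ Name
\end{array}
\]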

Fig. 8. Composite Component Deployment

Once deployed, the main component has to be started: |Name
The deployment phase relies on setact and setCI requests, but the order of these requests is always the same: first the setact requests are sent during the primitive component creation, and then the setCI requests are sent by the unique embedding composite component. Thus the deployment phase is deterministic. This translation reveals the importance of the first class nature of futures. Indeed, every request transits through several primitive and composite components; if futures could not be transmitted between activities, then every component activity would be blocked as soon as a request transits through it, leading almost systematically to a deadlock. Of course, the first class nature of futures is also a major advantage from a functional point of view for both translations.

5.3 Perspective: Reconfiguration and Component Controllers

In the last translation extra activities are added (a kind of component membrane), and requests must transit through them. But this additional cost is counterbalanced by a promising expressiveness: it makes it possible to envision the dynamic manipulation of components and requests at execution. Indeed, the semantics only forwards Call and send requests, but it could be extended in order to add non-functional behaviors to components (e.g., fault-tolerance, security), intercept requests, and perform treatments on transiting messages; or reconfigure them. Reconfiguration consists in providing primitives that dynamically change $\varepsilon_S$, $\varepsilon_C$, or $\psi$ for a given composite; with the last encoding, this can be realized by suitable calls to the $setCI_j$ and $setSI_i$ methods at the level at which the reconfiguration occurs. Although very interesting, defining safe and coherent reconfiguration of a whole distributed component system is a challenging perspective that is beyond the scope of this paper.

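6 Deterministic Assembly of Objects and Components

6.1 Static DON

Suppose one has, for each runtime object, a static approximation of the activity it belongs to, denoted $\hat\alpha$, $\hat\beta$; $o \in \hat\alpha$ means that $o$ is an object stored in $\hat\alpha$. Let $Part(\hat\alpha)$ be true if the abstract activity $\hat\alpha$ may dynamically be partitioned into several different activities:

\[ Part(\hat\alpha) \Leftrightarrow \exists o, o',\ o \in \alpha,\ o' \in \gamma,\ \alpha \neq \gamma \wedge \hat\alpha = \hat\gamma \]

In other words, $\forall \hat\alpha, \neg Part(\hat\alpha)$ iff some abstract activities can be merged to form a single activity at runtime, but no abstract activity is split dynamically. Then, an object which can be either active or passive should be considered statically as active. To summarize:

\[ \forall \hat\alpha,\ \neg Part(\hat\alpha) \Rightarrow (\alpha \neq \gamma \Rightarrow \hat\alpha \neq \hat\gamma) \]

Moreover, let $\mathcal{G}(P)$ be an approximated call graph: if a request on the method $foo$ can be sent from $o$ to $o'$, and $o \in \hat\alpha$ and $o' \in \hat\beta$, then $(\hat\alpha, \hat\beta, foo) \in \mathcal{G}(P)$, which means:

\[ P \xrightarrow{*} Q \wedge a_{\alpha_Q} = \mathcal{R}[\iota.foo(\iota')] \wedge \sigma_{\alpha_Q}(\iota) = AO(\beta) \Rightarrow (\hat\alpha, \hat\beta, foo) \in \mathcal{G}(P) \]

Finally, let us characterize $\mathcal{M}_{\hat\beta}$ by:

\[ \forall \beta,\ M \in \mathcal{M}_{\beta_P} \Rightarrow M \in \mathcal{M}_{\hat\beta} \]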

Then, the following property is an approximation of DON terms:

Definition 6 (Static DON) Suppose the approximation of the set of activities is such that two activities cannot be merged: $\forall \hat\alpha, \neg Part(\hat\alpha)$. A program $P$ is a Static Deterministic Object Network $SDON(P)$ if for all methods that can be sent at any time from two different activities toward a given destination, those methods cannot interfere:

\[
SDON(P) \Leftrightarrow \left[ \left. \begin{array}{l} (\hat\alpha, \hat\beta, m_1) \in \mathcal{G}(P) \\ (\hat\alpha', \hat\beta, m_2) \in \mathcal{G}(P) \\ \hat\alpha \neq \hat\alpha' \end{array} \right\} \Rightarrow \forall M \in \mathcal{M}_{\hat\beta},\ \{m_1, m_2\} \nsubseteq M \right]
\]
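As an illustration of Definition 6, the SDON condition can be checked directly on a finite approximated call graph. The following Python sketch assumes the call graph is given as a set of (source, destination, method) triples over abstract activities, and the potential services of each destination as a list of sets of method names; the function name `is_sdon` and the data layout are assumptions made for this example, not part of the calculus.

```python
from itertools import combinations


def is_sdon(call_graph, potential_services):
    """Check the SDON condition on an approximated call graph.

    call_graph: set of (source, destination, method) triples.
    potential_services: dict mapping a destination to a list of sets of
    method names (the sets M that the destination may serve together).
    """
    for (a1, b1, m1), (a2, b2, m2) in combinations(call_graph, 2):
        # Only requests from two different sources to the same destination matter.
        if b1 != b2 or a1 == a2:
            continue
        for served in potential_services.get(b1, []):
            if {m1, m2} <= served:
                return False  # the two requests may interfere
    return True


graph = {("alpha", "gamma", "put"), ("beta", "gamma", "get")}
separate = {"gamma": [{"put"}, {"get"}]}  # put and get are never served together
shared = {"gamma": [{"put", "get"}]}      # put and get can be served together
```

With the `separate` services the two senders cannot interfere, so the check succeeds; with `shared` it fails, because `put` and `get` arrive from different activities and belong to the same potential service.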

Theorem 2 (SDON determinism). SDON terms behave deterministically.

Proof: It is sufficient to prove that $SDON(P) \Rightarrow DON(P)$, or that $\neg DON(P) \Rightarrow \neg SDON(P)$. Suppose $P$ is not a DON; then it may send in the future two concurrent requests, and thus there is an activity $\beta$ of a configuration $Q$ such that $P \xrightarrow{*} Q$ and:

\[
\exists M \in \mathcal{M}_{\beta_P},\ \exists \alpha \neq \alpha',\ \exists m_1, m_2 \in M,\ \left\{ \begin{array}{l} a_{\alpha_Q} = \mathcal{R}[\iota_1.m_1(\iota'_1)] \wedge \sigma_{\alpha_Q}(\iota_1) = AO(\beta) \\ a_{\alpha'_Q} = \mathcal{R}[\iota_2.m_2(\iota'_2)] \wedge \sigma_{\alpha'_Q}(\iota_2) = AO(\beta) \end{array} \right.
\]

Then, as $\mathcal{G}(P)$ is an approximated call graph:

\[ \exists M \in \mathcal{M}_{\beta_P},\ \exists \alpha \neq \alpha',\ \exists m_1, m_2 \in M,\ (\hat\alpha, \hat\beta, m_1) \in \mathcal{G}(P) \wedge (\hat\alpha', \hat\beta, m_2) \in \mathcal{G}(P) \]

and, as $\forall \hat\alpha, \neg Part(\hat\alpha)$, and by definition of $\mathcal{M}_{\hat\beta}$:

\[ \exists \hat\alpha \neq \hat\alpha' \wedge (\hat\alpha, \hat\beta, m_1) \in \mathcal{G}(P) \wedge (\hat\alpha', \hat\beta, m_2) \in \mathcal{G}(P) \wedge m_1, m_2 \in M \wedge M \in \mathcal{M}_{\hat\beta} \]

Finally, $P$ is not a SDON. □

Of course, not every DON is a SDON, but SDON can be considered as the best approximation of DON that does not require control flow analysis.

6.2 Deterministic Components

We define a deterministic assemblage of components based on the fact that PCs provide an abstraction for activities, and thus the SDON definition can be entirely expressed in terms of specifications of PCs and connections of interfaces. Indeed, suppose that only methods of the same SI can interfere; then a component system is deterministic if each SI can be accessed by a single activity (that is, by a single component). Then, ensuring that only one CI is finally plugged to each SI is sufficient to ensure confluence. As each PC can be considered as an abstraction of an activity, for each PC we denote by $\mathcal{M}_{PC}$ the potential services of the activity defined by $PC_{Act}$.

Definition 7 (Deterministic Primitive Component (DPC)) A primitive component $PC = Name\ \langle \{SI_i\}^{i \in 1..k}, \{CI_j\}^{j \in 1..l} \rangle$ is a DPC if its activity $Name_{Act}\ \langle a, srv, \varphi_S, \varphi_C \rangle$ associates its server interfaces to disjoint subsets of the served methods of the embedded active object, and is such that two interfering requests necessarily belong to the same SI:

\[ \forall M \in \mathcal{M}_{PC},\ \forall m_1, m_2 \in M,\ (m_1 \in \varphi_S(SI_i) \wedge m_2 \in \varphi_S(SI_j)) \Rightarrow i = j \]


Definition 8 (Deterministic Composite Component (DCC)) A DCC is a composite component built by connecting deterministic components.

\[ DC ::= DCC \mid DPC \qquad DCC ::= Name \ll DC_1, \ldots, DC_m; \varepsilon_S; \psi; \varepsilon_C \gg \]

where each SI is only used once, either bound or exported:

\[ \psi, \varepsilon_C \text{ and } \varepsilon_S \text{ are injective} \ \wedge\ codom(\psi) \cap codom(\varepsilon_S) = \emptyset \]
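The side conditions of Definition 8 are purely structural and can be tested on finite binding tables. The sketch below assumes that $\psi$, $\varepsilon_C$ and $\varepsilon_S$ are each represented as a Python dictionary from interface names to interface names; the function names and the example interface names such as `"A.out"` are hypothetical.

```python
def is_injective(mapping):
    """A finite mapping is injective when no value is hit twice."""
    values = list(mapping.values())
    return len(values) == len(set(values))


def dcc_bindings_ok(psi, eps_c, eps_s):
    """Check the conditions of Definition 8 on finite binding tables.

    psi: bindings between sub-components, eps_c: exported client
    interfaces, eps_s: exported server interfaces (dictionaries).
    """
    injective = is_injective(psi) and is_injective(eps_c) and is_injective(eps_s)
    disjoint = not (set(psi.values()) & set(eps_s.values()))
    return injective and disjoint


good = dcc_bindings_ok(
    psi={"A.out": "B.in"},
    eps_c={"B.out": "out"},
    eps_s={"in": "A.in"},
)
# Two client interfaces plugged to the same server interface: psi is not injective.
shared_server = dcc_bindings_ok(
    psi={"A.out": "C.in", "B.out": "C.in"},
    eps_c={},
    eps_s={},
)
# A server interface that is both bound internally and exported.
bound_and_exported = dcc_bindings_ok(
    psi={"A.out": "B.in"},
    eps_c={},
    eps_s={"in": "B.in"},
)
```

Each failing case corresponds to a server interface that could receive requests from two different senders.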

Non-Deterministic Connections Figure 9 illustrates the non-deterministic bindings between components, corresponding to restrictions expressed in Definition 8. The condition of Definition 8 that prevents the composition from being determinate is written above each sub-figure.

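[Figure 9: four sub-figures, each labeled with the condition of Definition 8 that it violates: $\varepsilon_C$ is injective; $\psi$ is injective; $codom(\varepsilon_S) \cap codom(\psi) = \emptyset$; $\varepsilon_S$ is injective.]

Fig. 9. Non-deterministic bindings between components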

A DCC assemblage verifies the SDON property because each DPC statically identifies an activity, and the absence of sharing of SIs ensures that two activities cannot send concurrent requests on the same SI. Finally, the definition for DPC ensures that two requests on different SIs are not interfering.

Theorem 3 (DCC determinism). DCC components behave deterministically.

This theorem relies on the fact that composite components only forward requests if necessary, that is to say a request sent by a PC will be directly or indirectly transmitted to the PC that is finally plugged to the concerned interface, according to the $\Psi_{CC}$ function defined in Section 5.1. In other words, neither the content nor the order of requests on a given binding is modified by the composite components involved in the communication. Let us formally prove Theorem 3 in the case of the first translational semantics. In the case of the compositional semantics, more intermediate activities are created but each of them still verifies the SDON property.

Proof: Let $\Psi = \Psi_{CC}$ for CC a DCC. A DCC is only composed of injective $\psi$, $\varepsilon_C$ and $\varepsilon_S$ functions, the codomains of $\psi$ and $\varepsilon_S$ are disjoint, and the domain of $\psi$ and the domain of $\varepsilon_C$ are disjoint; thus the $\Psi$ function is injective. In this translation, there is a bijection between the set of deployed activities and the set of PCs (statically defined), thus we can consider PCs as the abstract domain for activities. This abstraction does not merge activities: $\forall PC, \neg Part(PC)$. We denote by $comp(SI)$ the PC such that $SI \in Exported(PC)$ and similarly by $comp(CI)$ the PC such that $CI \in Imported(PC)$. An approximation of $\mathcal{G}(P)$ becomes:

\[ \{ (PC, PC', m) \mid CI \in dom(\Psi) \wedge \ldots \wedge m \in \varphi_S(\Psi(CI)) \} \]

And thus the SDON property is verified (with $PC_{Act} = \langle a, s, \varphi_S, \varphi_C \rangle$):

\[
\left. \begin{array}{l} (PC_1, PC', m_1) \in \mathcal{G}(P) \\ (PC_2, PC', m_2) \in \mathcal{G}(P) \\ PC_1 \neq PC_2 \end{array} \right\} \Rightarrow \forall k \in \{1, 2\},\ m_k \in \varphi_S(SI_k) \wedge PC' = comp(SI_k) \wedge SI_1 \neq SI_2 \Rightarrow \forall M \in \mathcal{M}_{PC'},\ \{m_1, m_2\} \nsubseteq M
\]

Indeed, $m_1, m_2 \in M$ and $M \in \mathcal{M}_{PC'}$ would imply $SI_1 = SI_2$ because $PC'$ is a DPC. Finally, a DCC behaves deterministically when deployed with the first translational semantics. □

DCC assemblage makes it possible to statically ensure the deterministic behavior of components, based only on the following requirements:
- Potential services can be statically determined, or are statically specified (every served set has been declared as a potential service).
- SI interfaces are respected: they only receive requests on the methods they define; this could be checked by typing techniques [2] on ASP source terms.
- Requests follow bindings and are not modified while following these bindings.
- There is a bijection between primitive components and functional activities.

The first two requirements correspond to static analysis or specification, whereas the last two must be guaranteed by the component semantics, which is the case for both translational semantics of Section 5. We have shown in [9] that every Process Network can be translated into a (deterministic) ASP term, which can then be fitted into a deterministic assemblage of components. Such a bijection between process networks and DCCs will finally provide a large number of DCCs.

7 Conclusion

This article defines a hierarchical component calculus that provides a very convenient abstraction of activities and method calls. This abstraction allows static verification of determinism properties. Our component model is aimed at distribution, featuring asynchronous remote method invocations, and futures as generalized references passing through components. Primitive components are defined as a set of Server Interfaces (SI) and Client Interfaces (CI), together with an ASP term for the primitive component content. Intuitively, each SI corresponds to a set of methods, each CI to a field. Composite components are recursively made of primitives and other composites, with a partial binding between SIs and CIs, and some SIs and CIs exported.


Primitive deterministic components are defined by imposing that each set of interfering requests belongs to the same server interface. A deterministic composite (DCC) avoids potential interferences by imposing at most a single binding towards a server interface. For DCC, both translational semantics lead to configurations that respect the SDON properties, hence their deterministic nature. This result mainly relies on the fact that primitive components provide an abstraction of activities, and interfaces provide an abstraction of potential services.

One might have noticed the absence of any notion of location or machine, in contrast to calculi such as Ambients [8]. Because of the ASP calculus properties, an activity, and further a component, can be placed "anywhere" without any semantic consequence. A given hierarchical component can be entirely mapped on a single machine, within the same address space, or fully distributed over the network, each inner component being located alone on its own machine. Abstracting activities by components is also convenient for distribution: it allows each primitive to be mapped to a single location and composites to span several machines.

Two translational semantics for the component model are proposed. The second translation opens an even more interesting perspective: deterministic component reconfiguration. As components and bindings are realized by ASP active objects, one can imagine applying the general deterministic property (DON) to the reconfiguration phase in order to design coherent reconfigurations.

References

1. ICSE '79: Proceedings of the 4th International Conference on Software Engineering, 1979. Chairmen: F. L. Bauer, Leon G. Stucki, and M. M. Lehman.
2. Martin Abadi and Luca Cardelli. A Theory of Objects. Springer-Verlag, New York, 1996.
3. Robert Allen and David Garlan. A formal basis for architectural connection. ACM Transactions on Software Engineering and Methodology, July 1997.
4. Mark Astley and Gul A. Agha. Customization and composition of distributed objects: Middleware abstractions for policy management. In Proceedings of the ACM SIGSOFT 6th International Symposium on Foundations of Software Engineering (FSE), 1998.
5. Françoise Baude, Denis Caromel, and Matthieu Morel. From distributed objects to hierarchical grid components. In International Symposium on Distributed Objects and Applications (DOA), Catania, Sicily, Italy, 3-7 November, LNCS. Springer Verlag, Berlin, Heidelberg, 2003.
6. Philippe Bidinger and Jean-Bernard Stefani. The kell calculus: operational semantics and type system. In Proceedings 6th IFIP International Conference on Formal Methods for Open Object-based Distributed Systems (FMOODS 03), Paris, France, 2003.
7. Eric Bruneton, Thierry Coupaye, Matthieu Leclerc, Vivien Quema, and Jean-Bernard Stefani. An open component model and its support in Java. In Ivica Crnkovic, Judith A. Stafford, Heinz W. Schmidt, and Kurt C. Wallnau, editors, CBSE, volume 3054 of Lecture Notes in Computer Science. Springer, 2004.
8. Luca Cardelli and Andrew D. Gordon. Mobile ambients. Theoretical Computer Science, 240(1):177-213, 2000. An extended abstract appeared in Proceedings of FoSSaCS '98, pages 140-155.
9. Denis Caromel and Ludovic Henrio. A Theory of Distributed Objects. Springer-Verlag New York, Inc., 2005. To appear.
10. Denis Caromel, Ludovic Henrio, and Bernard Paul Serpette. Asynchronous and deterministic objects. In Proceedings of the 31st ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages, pages 123-134. ACM Press, 2004.
11. Denis Caromel, Wilfried Klauser, and Julien Vayssiere. Towards seamless computing and metacomputing in Java. Concurrency: Practice and Experience, 10(11-13):1043-1061, 1998. ProActive available at http://www.inria.fr/oasis/proactive.
12. Bruneton E., Coupaye T., and Stefani J.B. Recursive and dynamic software composition with sharing. In Proceedings of the 7th ECOOP International Workshop on Component-Oriented Programming (WCOP'02), 2002.
13. Cormac Flanagan and Matthias Felleisen. The semantics of future and an application. Journal of Functional Programming, 9(1):1-31, 1999.
14. Dimitra Giannakopoulou, Jeff Kramer, and Shing Chi Cheung. Behaviour analysis of distributed systems using the tracta approach. Automated Software Engg., 6(1), 1999.
15. Andrew D. Gordon, Paul D. Hankin, and Søren B. Lassen. Compilation and equivalence of imperative objects. FSTTCS: Foundations of Software Technology and Theoretical Computer Science, 17:74-87, 1997.
16. Robert H. Halstead, Jr. Multilisp: A language for concurrent symbolic computation. ACM Transactions on Programming Languages and Systems (TOPLAS), 7(4):501-538, 1985.
17. Gilles Kahn. The semantics of a simple language for parallel programming. In J. L. Rosenfeld, editor, Information Processing '74: Proceedings of the IFIP Congress, pages 471-475. North-Holland, New York, 1974.
18. Uwe Nestmann and Martin Steffen. Typing confluence. In Stefania Gnesi and Diego Latella, editors, Proceedings of FMICS'97, pages 77-101. Consiglio Nazionale Ricerche di Pisa, 1997. Also available as report ERCIM-10/97-R052, European Research Consortium for Informatics and Mathematics, 1997.
19. Thomas Parks and David Roberts. Distributed Process Networks in Java. In Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS2003), Nice, France, April 2003.
20. Alan Schmitt and Jean-Bernard Stefani. The kell calculus: A family of higher-order distributed process calculi. Lecture Notes in Computer Science, 3267, Feb 2005.

Decidable Properties for Regular Cellular Automata

Pietro Di Lena

Department of Computer Science, University of Bologna, Mura Anteo Zamboni 7, 40127 Bologna, Italy, [email protected]

Abstract. We investigate decidable properties for regular cellular automata. In particular, we show that regularity itself is an undecidable property and that nilpotency, equicontinuity and positive expansiveness become decidable if we restrict to regular cellular automata.

1 Introduction

Cellular Automata (CA) are often used as a simple model for complex systems. They were introduced by von Neumann in the forties as a model of self-reproductive biological systems [16]. The mathematical theory of CA was developed later by Hedlund in the context of symbolic dynamics [7]. To a cellular automaton one associates the shift spaces generated by the evolution of the automaton on suitable partitions of the configuration space. Adopting Kůrka's terminology, we call this kind of shift space a column subshift (see [12], chapter 5). A general approach to the study of a cellular automaton is to study the complexity of its column subshifts (see [5, 13, 10]).

Regularity has been introduced by Kůrka for general dynamical systems [14]. A CA is regular if every column subshift is sofic, i.e. if the language of every column subshift is regular. Kůrka classified CA according to the complexity of column subshift languages [13]. In Kůrka's classification the main distinction is whether the cellular automaton is regular or not. He compared the language classification with two other well-known CA classifications, the equicontinuity and attractor classifications.

In this paper we study the decidability of topological properties for CA. In particular, we show that regularity is not a decidable property (Theorem 7), which implies that membership in Kůrka's language classes is undecidable. In contrast, we show that some topological properties which are in general undecidable become decidable if we restrict to the class of regular CA. For instance, we show that for regular CA nilpotency, equicontinuity and positive expansiveness are decidable properties (Theorem 6). Moreover, we provide an answer to a question raised in [3], showing that the topological entropy is computable for one-sided regular CA (Theorem 5).

The paper is organized as follows. Section 2 is devoted to the introduction of the notation and general definitions, while Section 3 contains our results.
Please use the following format when citing this chapter: Di Lena, P., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 185-196.


2 Notations and Definitions

2.1 Shift Spaces and Representations of Sofic Shifts

Let $A = \{a_1, \ldots, a_n\}$ be a finite alphabet, $n > 1$. For any $k > 0$, $w_1 w_2 \cdots w_k \in A^k$ is a finite sequence of elements of $A$. The sets $A^{\mathbb{Z}}$ and $A^{\mathbb{N}}$ are respectively the set of doubly infinite sequences $(x_i)_{i \in \mathbb{Z}}$ and of mono-infinite sequences $(x_i)_{i \in \mathbb{N}}$ where $x_i \in A$. Let $x \in A^{\mathbb{Z}}$; for any integer interval $[i, j]$, $x_{[i,j]} \in A^{j-i+1}$ is the finite subword $x_i x_{i+1} \ldots x_j$ of $x$. Define the metric $d$ on $A^{\mathbb{Z}}$ by

\[ d(x, y) = \sum_{i \in \mathbb{Z}} \frac{d_i(x_i, y_i)}{2^{|i|}} \]

where $d_i(x_i, y_i) = 1$ if $x_i \neq y_i$ and $d_i(x_i, y_i) = 0$ otherwise. The set $A^{\mathbb{Z}}$ endowed with the metric $d$ is a compact metric space. A dynamical system is a pair $(X, F)$ where $F : X \to X$ is a continuous function and $X$ is a compact metrizable space.

The shift map $\sigma : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$, defined by $\sigma(x)_i = x_{i+1}$, is a homeomorphism of the compact metric space $A^{\mathbb{Z}}$. The dynamical system $(A^{\mathbb{Z}}, \sigma)$ is called the full n-shift or simply the full shift. A shift space or subshift $(X, \sigma)$ is a closed shift-invariant subset of $A^{\mathbb{Z}}$ endowed with $\sigma$. The shift dynamical system $(X, \sigma)$ is called one-sided if $X \subseteq A^{\mathbb{N}}$. In general, we denote the subshift $(X, \sigma)$ simply by $X$.

Let $\mathcal{B}_k(X) = \{x \in A^k \mid \exists y \in X, \exists i \in \mathbb{Z}, y_{[i,i+k-1]} = x\}$ denote the set of allowed k-blocks of the subshift $X$, $k > 0$. The language associated to a subshift $X$ is denoted by $\mathcal{L}(X) = \bigcup_{k \in \mathbb{N}} \mathcal{B}_k(X)$. Any subshift is completely determined by its language (see [15]). The language of a subshift $X$ is:
1. factorial: if $xyz \in \mathcal{L}(X)$ then $y \in \mathcal{L}(X)$.
2. extendable: $\forall x \in \mathcal{L}(X)$, $\exists y \in \mathcal{L}(X)$ such that $xy \in \mathcal{L}(X)$.

The language $\mathcal{L}(X)$ of a subshift $X$ is bounded periodic if there exist integers $m > 0$, $n > 0$ such that $\forall x \in \mathcal{L}(X)$ and $\forall i > m$, $x_i = x_{i+n}$.

A factor map $F : (X, \sigma) \to (Y, \sigma)$ is a continuous and $\sigma$-commuting function, i.e. $F \circ \sigma = \sigma \circ F$. If $F$ is onto (or surjective), $X$ is called an extension of $Y$ and $Y$ is called a factor of $X$. If $F$ is bijective, it is a topological conjugacy and $X$, $Y$ are said to be topologically conjugated shift spaces.

A subshift is sofic if it can be represented by means of a labeled graph. We review the representation of a sofic shift as the vertex shift of a labeled graph. A labeled graph $\mathcal{G} = (V, E, \zeta)$ consists of a set of vertices $V$, a set of edges $E$ and a labeling function $\zeta : V \to A$ which assigns to each vertex $v \in V$ a symbol from a finite alphabet $A$. Each edge $e \in E$ identifies an initial vertex $i(e) \in V$ and a terminal vertex $t(e) \in V$. We denote the existence of an edge between vertices $v, v' \in V$ by $v \to v'$. Every sofic shift can be represented as the set of (mono or doubly) infinite sequences generated by the labels of vertices of a labeled graph. That is, the labeled graph $\mathcal{G} = (V, E, \zeta)$, with $\zeta : V \to A$, represents the (two-sided) sofic shift

\[ S_{\mathcal{G}} = \{x \in A^{\mathbb{Z}} \mid \exists (v_i)_{i \in \mathbb{Z}} \in V^{\mathbb{Z}},\ v_i \to v_{i+1},\ \zeta(v_i) = x_i,\ i \in \mathbb{Z}\}. \]
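To make the vertex-shift presentation concrete, the following Python sketch enumerates the allowed k-blocks of a sofic shift from a vertex-labeled graph. It assumes the graph is essential (every vertex has at least one incoming and one outgoing edge), so that every finite path extends to a doubly infinite one; the function name `allowed_blocks` and the example graph, which presents the well-known golden mean shift (binary sequences without two consecutive 1s), are choices made for this illustration.

```python
def allowed_blocks(edges, label, k):
    """Label words of length k read along paths of a vertex-labeled graph.

    edges: iterable of (u, w) pairs meaning u -> w.
    label: dict mapping each vertex to its symbol (a one-character string).
    Assumes an essential graph, so finite paths extend to infinite ones.
    """
    paths = [(v,) for v in label]
    for _ in range(k - 1):
        paths = [p + (w,) for p in paths for (u, w) in edges if u == p[-1]]
    return {"".join(label[v] for v in p) for p in paths}


# Golden mean shift: vertex "p" is labeled 0, vertex "q" is labeled 1,
# and there is no edge q -> q, so the word 11 never appears.
edges = [("p", "p"), ("p", "q"), ("q", "p")]
label = {"p": "0", "q": "1"}

counts = [len(allowed_blocks(edges, label, k)) for k in range(1, 5)]
three_blocks = allowed_blocks(edges, label, 3)
```

For this graph the block counts follow the Fibonacci numbers, which is the usual way to see that the language is factorial and extendable while still excluding some words.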


The topological entropy $h(X) = \lim_{n \to \infty} \log|\mathcal{B}_n(X)|/n$ of a shift space $X$ is a measure of the complexity of $X$. While the topological entropy is not computable for general subshifts, it is for sofic shifts (see [15]).

The language of a sofic shift is called regular in the context of formal language theory (see [9] for an introduction). The class of regular languages is the class of languages which can be recognized by a deterministic finite state automaton (DFA). Formally, a DFA is a 5-tuple $(Q, A, \delta, q_0, F)$ where $Q$ is a finite set of states, $F \subseteq Q$ is the set of accepting states, $q_0 \in Q$ is the initial state, $A$ is a finite alphabet and $\delta : Q \times A \to Q$ is a partial transition function (i.e. it may be defined only on a subset of $Q \times A$). The language represented by a DFA is the set of words generated by following a path starting from the initial state and ending in an accepting state. For every regular language there exists a unique smallest DFA, where smallest refers to the number of states. In general, most of the questions concerning regular languages are algorithmically decidable. In particular, it is decidable whether two distinct DFA represent the same language. From a DFA representing the language of a sofic shift $S$ it is possible to derive a labeled graph presentation of $S$ in the following way:
1. the set of vertices $V$ consists of the pairs $(q, a) \in Q \times A$ s.t. $\delta(q, a) \in Q$.
2. there exists an edge $(q, a) \to (q', a')$, with $(q, a), (q', a') \in V$, if $\delta(q, a) = q'$.
3. $\forall v = (q, a) \in V$, $\zeta(v) = a$.

2.2 Cellular Automata

A cellular automaton is a dynamical system $(A^{\mathbb{Z}}, F)$ where $A$ is a finite alphabet and $F$ is a $\sigma$-commuting, continuous function. $(A^{\mathbb{Z}}, F)$ is generally identified by a block mapping $f : A^{2r+1} \to A$ such that $F(x)_i = f(x_{[i-r,i+r]})$, $i \in \mathbb{Z}$. According to the Curtis-Hedlund-Lyndon Theorem [7], the whole class of continuous and $\sigma$-commuting functions between shift spaces arises in this way. We refer to $f$ and $r$ respectively as the local rule and the radius of the CA.
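The relation between the local rule $f$ and the global map $F$ can be illustrated on finite words. The sketch below is a minimal Python example, assuming configurations are truncated to finite tuples: applying the block map to a word of length $n$ yields the $n - 2r$ symbols whose full neighbourhood is known. The function name `apply_block_map` and the choice of elementary rule 90 (XOR of the two neighbours) as the example rule are illustrative assumptions.

```python
def apply_block_map(f, r, word):
    """Image of a finite word under a radius-r block map.

    Only positions with a complete neighbourhood are computed, so the
    result has len(word) - 2*r symbols.
    """
    return tuple(f(word[i - r:i + r + 1]) for i in range(r, len(word) - r))


def rule90(window):
    # Elementary rule 90: XOR of the left and right neighbours (radius 1).
    return window[0] ^ window[2]


word = (0, 0, 1, 0, 0)
image = apply_block_map(rule90, 1, word)
```

Because the same local rule is applied at every position, shifting the input word shifts the output in the same way, which is the finite counterpart of $F \circ \sigma = \sigma \circ F$.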
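A CA is one-sided if the local rule is of the form $f : A^{r+1} \to A$ where $\forall x \in A^{\mathbb{Z}}, i \in \mathbb{Z}$, $F(x)_i = f(x_{[i,i+r]})$. A one-sided CA is usually denoted by $(A^{\mathbb{N}}, F)$. We recall the definition of some topological properties of CA. Let $d$ denote the metric on $A^{\mathbb{Z}}$ defined in Section 2.1.

Definition 1. Let $(A^{\mathbb{Z}}, F)$ be a CA.
1. $(A^{\mathbb{Z}}, F)$ is nilpotent if $\exists N > 0$, $\exists x \in A^{\mathbb{Z}}$, $\sigma(x) = x$, s.t. $\forall n > N$, $F^n(A^{\mathbb{Z}}) = x$.
2. $(A^{\mathbb{Z}}, F)$ is equicontinuous at $x \in A^{\mathbb{Z}}$ if $\forall \epsilon > 0$, $\exists \delta > 0$ s.t. $\forall y \in A^{\mathbb{Z}}$, $d(x, y) < \delta$ implies $\forall n > 0$, $d(F^n(x), F^n(y)) < \epsilon$.
3. $(A^{\mathbb{Z}}, F)$ is equicontinuous if $\forall x \in A^{\mathbb{Z}}$, $(A^{\mathbb{Z}}, F)$ is equicontinuous at $x$.
4. $(A^{\mathbb{Z}}, F)$ is almost equicontinuous if $\exists x \in A^{\mathbb{Z}}$ s.t. $(A^{\mathbb{Z}}, F)$ is equicontinuous at $x$.
5. $(A^{\mathbb{Z}}, F)$ is sensitive if $\exists \epsilon > 0$ s.t. $\forall x \in A^{\mathbb{Z}}, \forall \delta > 0, \exists y \in A^{\mathbb{Z}}$, $d(x, y) < \delta$, $\exists n > 0$ s.t. $d(F^n(x), F^n(y)) > \epsilon$.
6. $(A^{\mathbb{Z}}, F)$ is positively expansive if $\exists \epsilon > 0$ s.t. $\forall x, y \in A^{\mathbb{Z}}$, $x \neq y$, $\exists n > 0$ s.t. $d(F^n(x), F^n(y)) > \epsilon$.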

Kari showed that nilpotency is an undecidable property [11]. In [4], Durand et al. showed that equicontinuity, almost equicontinuity and sensitivity are undecidable properties. It is currently unknown whether positive expansiveness is a decidable property.

Definition 2. (Column subshift) Let $(A^{\mathbb{Z}}, F)$ be a CA. For $k > 0$ let

\[ \Sigma_k = \{x \in (A^k)^{\mathbb{N}} \mid \exists y \in A^{\mathbb{Z}} : F^i(y)_{[0,k)} = x_i,\ i \in \mathbb{N}\} \]

denote the column subshift of width $k$ associated to $(A^{\mathbb{Z}}, F)$.
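Finite blocks of a column subshift can be enumerated by brute force, since the first $t$ rows of a width-$k$ column depend only on an initial window of $k + 2r(t-1)$ cells. The Python sketch below follows that observation; the function name `column_blocks` and the three example rules (rule 90, the constant-zero rule, and the identity rule) are assumptions made for illustration, and the enumeration is exponential in the window width, so it is only meant for very small cases.

```python
from itertools import product


def column_blocks(f, r, alphabet, k, t):
    """Blocks of length t of the width-k column subshift of a radius-r CA.

    Each block is a tuple of t rows; each row is a tuple of k symbols.
    Enumerates every initial window of width k + 2*r*(t-1).
    """
    blocks = set()
    width = k + 2 * r * (t - 1)
    for word in product(alphabet, repeat=width):
        rows = []
        for step in range(t):
            # After `step` applications the window lost r cells per side per step.
            margin = r * (t - 1 - step)
            rows.append(word[margin:margin + k])
            word = tuple(f(word[i - r:i + r + 1]) for i in range(r, len(word) - r))
        blocks.add(tuple(rows))
    return blocks


rule90 = lambda w: w[0] ^ w[2]   # XOR of the two neighbours
zero = lambda w: 0               # every cell becomes 0 after one step
identity = lambda w: w[1]        # every cell keeps its value

blocks90 = column_blocks(rule90, 1, (0, 1), 1, 2)
blocks_zero = column_blocks(zero, 1, (0, 1), 1, 2)
blocks_id = column_blocks(identity, 1, (0, 1), 1, 2)
```

For the constant-zero rule every column word is a symbol followed by zeros, and for the identity rule every column word is constant, which gives two blocks of length 2 in both cases, while rule 90 admits all four.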

Gilman noticed that the language of a column subshift is always context-sensitive [6]. Kůrka classified cellular automata according to the complexity of column subshift languages [13].

Definition 3. (Bounded periodic CA) $(A^{\mathbb{Z}}, F)$ is bounded periodic if $\forall t > 0$, $\mathcal{L}(\Sigma_t)$ is a bounded periodic language.

Definition 4. (Regular CA) $(A^{\mathbb{Z}}, F)$ is regular if $\forall t > 0$, $\mathcal{L}(\Sigma_t)$ is a regular language (or, equivalently, if $\Sigma_t$ is a sofic shift).

Definition 5. (Kůrka's Language classification) Every cellular automaton falls in exactly one of the following classes.
L1. Bounded periodic.
L2. Regular, not bounded periodic.
L3. Not regular.

Class L1 coincides with the class of equicontinuous CA [13]. Thus membership in L1 is undecidable, while it was unknown whether this is also the case for L2 and L3.

The topological entropy $H(F) = \lim_{k \to \infty} h(\Sigma_k)$ of $(A^{\mathbb{Z}}, F)$ is a measure of the complexity of the dynamics of $(A^{\mathbb{Z}}, F)$. The problem of computing, or even approximating, the topological entropy of CA has been shown to be in general not algorithmically solvable [8]. The topological entropy of one-sided CA has a simpler characterization than the general case (see [2]).

Theorem 1. Let $(A^{\mathbb{N}}, F)$ be a CA with radius $r$. Then $H(F) = h(\Sigma_r)$.


3 Results

In this section we investigate decidable properties of regular CA. Most of our effort will be devoted to showing that if $S \subseteq (A^{2r+1})^{\mathbb{N}}$ is a sofic shift and $(A^{\mathbb{Z}}, F)$ is a CA with radius $r$, it is possible to decide whether $S = \Sigma_{2r+1}$ (Theorem 3). This strong result has many consequences. The most relevant one is that for regular CA it is possible to compute column subshifts of every given width (Theorem 4). The (dynamical) complexity of a CA is strictly related to the complexity of column subshift languages. We show that, thanks to this computability property, it is possible to decide if a regular CA is nilpotent, equicontinuous or positively expansive (Theorem 6). Moreover, it also turns out that it is possible to compute the topological entropy for one-sided regular CA (Theorem 5). The negative consequence of these computability/decidability results is that regularity itself is an undecidable property (Theorem 7).

In order to show our fundamental decidability result (Theorem 3) we need to define the concept of cellular automaton extension of a sofic shift and to show some basic properties.

Definition 6. Let $(A^{\mathbb{Z}}, F)$ be a CA with radius $r$. Let $\mathcal{G} = (V, E, \zeta)$ be a labeled graph with $\zeta : V \to A^{2r+1}$. For $t > 0$, let the $(F, t)$-extension of $\mathcal{G}$ be the labeled graph $\mathcal{G}_{(F,t)} = (V_t, E_t, \zeta_t)$, with $\zeta_t : V_t \to A^{2r+t}$, defined in the following way (see Figure 1):
• vertex set: $V_t = \{(v_1, \ldots, v_t) \in V^t \mid \exists a \in A^{2r+t},\ \zeta(v_i) = a_{[i,2r+i]},\ 1 \le i \le t\}$
• edge set: $E_t = \{(e_1, \ldots, e_t) \in E^t \mid \exists v, v' \in V_t,\ i(e_j) = v_j,\ t(e_j) = v'_j,\ f(\zeta(v_j)) = \zeta(v'_j)_{r+1}\}$
• labeling function: $\forall v = (v_1, \ldots, v_t) \in V_t$, $\zeta_t(v) = a$ where $a_{[i,2r+i]} = \zeta(v_i)$, $1 \le i \le t$.

Definition 7. Let $(A^{\mathbb{Z}}, F)$ be a CA. Let $t > 0$, $k \ge 1$ and let $a, b \in \mathcal{B}_t(\Sigma_k)$ be such that $a = a_1 \ldots a_k$, $b = b_1 \ldots b_k$ where $a_i, b_i \in A^t$ and $a_{i+1} = b_i$, $1 \le i < k$. Then we say that $a, b$ are compatible blocks and we denote by $a \odot b = a_1 \ldots a_k b_k$ their overlapping concatenation. Moreover, let $x, y \in \Sigma_k$ be such that $x = x_1 \ldots x_k$, $y = y_1 \ldots y_k$ where $x_i, y_i \in A^{\mathbb{N}}$ and $x_{i+1} = y_i$, $1 \le i < k$. We say that $x, y$ are compatible sequences and, abusing the notation, we denote by $x \odot y = x_1 \ldots x_k y_k$ their overlapping concatenation.

The following two lemmas will be used extensively.
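The overlapping concatenation of Definition 7 can be sketched in Python on finite blocks. The sketch assumes a block of width $k$ is stored as a tuple of $k$ columns, each column being a tuple of $t$ symbols; the function names `compatible` and `overlap_concat` are hypothetical.

```python
def compatible(a, b):
    """a and b are compatible when columns 2..k of a equal columns 1..k-1 of b."""
    return len(a) == len(b) and all(a[i + 1] == b[i] for i in range(len(a) - 1))


def overlap_concat(a, b):
    """Overlapping concatenation: the columns of a followed by the last column of b."""
    if not compatible(a, b):
        raise ValueError("blocks are not compatible")
    return a + (b[-1],)


# Two blocks of width k = 2 and height t = 2, stored column by column.
a = ((0, 1), (1, 1))
b = ((1, 1), (0, 0))
joined = overlap_concat(a, b)
```

The result has width $k + 1$: the shared columns are kept once and only the last column of the second block is appended.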

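[Figure 1: a vertex $v'$ with label $\zeta(v') = a'_1 \ldots a'_{2r+t}$, where $f(a_i \ldots a_{2r+i}) = a'_{r+i}$, $\forall i \in [1, t]$.]

Fig. 1. A legal edge $v \to v'$ of an $(F, t)$-extended graph $\mathcal{G}_{(F,t)}$

Lemma 1. Let $(A^{\mathbb{Z}}, F)$ be a CA with radius $r$. Let $t > 0$ and let $a, b \in \mathcal{B}_t(\Sigma_{2r+1})$ be compatible blocks. Then $a \odot b \in \mathcal{B}_t(\Sigma_{2r+2})$.

Proof. Let $a = a_1 \ldots a_t$ where $a_1, \ldots, a_t \in A^{2r+1}$ and let $x \in A^{\mathbb{Z}}$ be such that $F^i(x)_{[0,2r]} = a_{i+1}$, $0 \le i < t$. Moreover, let $b = b_1 \ldots b_t$ where $b_1, \ldots, b_t \in A^{2r+1}$ and let $y \in A^{\mathbb{Z}}$ be such that $F^i(y)_{[1,2r+1]} = b_{i+1}$, $0 \le i < t$. […] Then it is easy to check that $F^i(z)_{[0,2r+1]} = c_{i+1}$, $0 \le i < t$, which implies that $a \odot b \in \mathcal{B}_t(\Sigma_{2r+2})$. □

Lemma 2. Let $(A^{\mathbb{Z}}, F)$ be a CA with radius $r$. Let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift and let $\mathcal{G}$ be a labeled graph presentation of $S$. Let $x, y \in S_{\mathcal{G}_{(F,1)}}$ be compatible sequences. Then $x \odot y \in S_{\mathcal{G}_{(F,2)}}$.

Proof. Since, by hypothesis, $x = (x_i)_{i \in \mathbb{N}}, y = (y_i)_{i \in \mathbb{N}} \in S_{\mathcal{G}_{(F,1)}}$, there exist two paths $u_1 \to u_2 \to \cdots$ and $v_1 \to v_2 \to \cdots$ in $\mathcal{G}$ such that $\zeta(u_i) = x_i$ and $\zeta(v_i) = y_i$, $i \in \mathbb{N}$. Then $(u_1, v_1) \to (u_2, v_2) \to \cdots$ is a legal path in $\mathcal{G}_{(F,2)}$, which implies that $x \odot y \in S_{\mathcal{G}_{(F,2)}}$. □

The following proposition shows that the sofic shift presented by the $(F, t)$-extension $\mathcal{G}_{(F,t)}$ of a labeled graph $\mathcal{G}$ does not depend on $\mathcal{G}$ but only on the sofic shift presented by $\mathcal{G}$.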


Proposition 1. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $\mathcal{G},\mathcal{G}'$ be two distinct labeled graph presentations of the same sofic shift $S = S_{\mathcal{G}} = S_{\mathcal{G}'} \subseteq (A^{2r+1})^{\mathbb{N}}$. Then, for any $t > 0$, $S_{\mathcal{G}_{(F,t)}} = S_{\mathcal{G}'_{(F,t)}}$.

Proof. We show that $S_{\mathcal{G}_{(F,t)}} \subseteq S_{\mathcal{G}'_{(F,t)}}$. The proof of the converse inclusion is obtained by exchanging $\mathcal{G}$ with $\mathcal{G}'$. First of all, note that, by definition of $(F,1)$-extension, $S_{\mathcal{G}_{(F,1)}} = S_{\mathcal{G}'_{(F,1)}}$. Let $x \in S_{\mathcal{G}_{(F,t)}}$ and let $x_1,\ldots,x_t \in S$ be such that $x = x_1 \odot \cdots \odot x_t$. Then $x_1,\ldots,x_t \in S_{\mathcal{G}'}$ and, by Lemma 2, it follows that $x \in S_{\mathcal{G}'_{(F,t)}}$. □

Thanks to Proposition 1 we can refer directly to the extension of a sofic shift $S$ rather than to the extension of a labeled graph presentation of $S$.

Definition 8. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$. Let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift and let $\mathcal{G}$ be a labeled graph presentation of $S$. For $t > 0$, we denote by $S_{(F,t)} = S_{\mathcal{G}_{(F,t)}}$ the $(F,t)$-extension of the sofic shift $S$.

We now show some useful properties of the $(F,t)$-extensions of sofic shifts.

Lemma 3. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$. Let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift. Then $\forall t > 0$,

a. if $\Sigma_{2r+1} \subseteq S$ then $\Sigma_{2r+t} \subseteq S_{(F,t)}$,
b. if $\Sigma_{2r+1} = S$ then $\Sigma_{2r+t} = S_{(F,t)}$,
c. if $\Sigma_{2r+1} \supsetneq S$ then $\Sigma_{2r+t} \supsetneq S_{(F,t)}$.

Proof. a. Let $x \in \Sigma_{2r+t}$ be such that $x = x_1 \odot \cdots \odot x_t$ where $x_i \in \Sigma_{2r+1}$, $1 \le i \le t$. Since $\Sigma_{2r+1} \subseteq S$, each $x_i$ belongs to $S$ and, by Lemma 2, $x \in S_{(F,t)}$.

b. By point a, it suffices to show that $S_{(F,t)} \subseteq \Sigma_{2r+t}$. Let $k > 0$ and let $a \in B_k(S_{(F,t)})$. Let $a_1,\ldots,a_t \in B_k(S)$ be such that $a_1 \odot \cdots \odot a_t = a$. By hypothesis, $a_1,\ldots,a_t \in B_k(\Sigma_{2r+1})$; then, by Lemma 1, it follows that $a_1 \odot \cdots \odot a_t \in B_k(\Sigma_{2r+t})$.

c. Since $\Sigma_{2r+1} \supsetneq S$, applying the same reasoning as in point b, it is possible to conclude that $\Sigma_{2r+t} \supseteq S_{(F,t)}$. We just have to show that the inclusion is strict. Since $\Sigma_{2r+1} \supsetneq S$, there exists a block $b_1 \in \mathcal{L}(\Sigma_{2r+1})$ such that $b_1 \notin \mathcal{L}(S)$. Then let $b \in \mathcal{L}(\Sigma_{2r+t})$ be such that $b = b_1 \odot b_2 \odot \cdots \odot b_t$ for some $b_2,\ldots,b_t \in \mathcal{L}(\Sigma_{2r+1})$. Trivially, $b \notin \mathcal{L}(S_{(F,t)})$. □

The following theorem easily follows from Lemma 3 and provides a strong characterization of regular CA. It is a two-sided extension of a theorem proved by Blanchard and Maass for one-sided CA [1].

Theorem 2. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$. Then $(A^{\mathbb{Z}},F)$ is regular if and only if $\Sigma_{2r+1}$ is a sofic shift.

Proof. The necessary implication is trivial. Then, suppose $\Sigma_{2r+1}$ is a sofic shift. For every $d < 2r+1$, $\Sigma_d$ is a factor of $\Sigma_{2r+1}$, hence it is a sofic shift. For every $d > 2r+1$, by Lemma 3 point b, $\Sigma_d$ can be represented by a labeled graph, hence it is a sofic shift. □

In general, if $\Sigma_d$ is a sofic shift for $d < 2r+1$ it is not possible to conclude that the CA is regular (see [10]).


Definition 9. Let $A$ be a finite alphabet. Let $t \ge 1$ and let $[i,j] \subseteq [1,t]$ be an integer interval. Let

$\Phi_{[i,j]} : (A^t)^{\mathbb{N}} \to (A^{j-i+1})^{\mathbb{N}}$

denote the projection map induced by the one-block factor map

$\varphi_{[i,j]} : A^t \to A^{j-i+1}$

defined by $\varphi_{[i,j]}(a_1 \ldots a_t) = a_i a_{i+1} \ldots a_j$, $\forall a_1 a_2 \ldots a_t \in A^t$.

Remark 1. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $\mathcal{G}_{(F,t)}$ be the $(F,t)$-extension of $\mathcal{G}$. Then for every $i \in [1,t]$, $\Phi_{[i,2r+i]}(S_{\mathcal{G}_{(F,t)}}) \subseteq S_{\mathcal{G}}$.

Definition 10. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift. $S$ is F-extendible if

$S = \Phi_{[i,2r+i]}(S_{(F,t)})$, $\forall t > 0$, $\forall i \in [1,t]$.

Note that being F-extendible is a necessary condition for a sofic shift to be equal to $\Sigma_{2r+1}$.

Proposition 2. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift. Then $S$ is F-extendible iff $S = \Phi_{[1,2r+1]}(S_{(F,2)}) = \Phi_{[2,2r+2]}(S_{(F,2)})$.

Proof. The necessary implication is trivial. Then, let $S = \Phi_{[1,2r+1]}(S_{(F,2)}) = \Phi_{[2,2r+2]}(S_{(F,2)})$. Note that this implies $S = S_{(F,1)}$. Let $t > 2$; we have to show that $S = \Phi_{[i,2r+i]}(S_{(F,t)})$ for $1 \le i \le t$. Let $z \in S$ and let $k \in [1,t]$. To complete the proof it is sufficient to show that $z \in \Phi_{[k,2r+k]}(S_{(F,t)})$. Since $S = \Phi_{[1,2r+1]}(S_{(F,2)}) = \Phi_{[2,2r+2]}(S_{(F,2)})$, there exist $x_1,\ldots,x_{t-1} \in S_{(F,2)}$ such that $\Phi_{[2,2r+2]}(x_i) = \Phi_{[1,2r+1]}(x_{i+1})$, $1 \le i < t-1$, and $\Phi_{[2,2r+2]}(x_{k-1}) = \Phi_{[1,2r+1]}(x_k) = z$. Then $x_1,\ldots,x_{t-1}$ are compatible and, by Lemma 2, it follows that $x_1 \odot \cdots \odot x_{t-1} \in S_{(F,t)}$ and $\Phi_{[k,2r+k]}(x_1 \odot \cdots \odot x_{t-1}) = z$. □

Proposition 3. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift. Suppose $S$ is F-extendible; then $S \subseteq \Sigma_{2r+1}$.

Proof. We prove by induction on $k > 0$ that $B_k(S) \subseteq B_k(\Sigma_{2r+1})$.

1. (Base Case) By definition, $B_1(S) \subseteq B_1(\Sigma_{2r+1}) = A^{2r+1}$.

2. (Inductive Case) Suppose $B_k(S) \subseteq B_k(\Sigma_{2r+1})$ for $k > 0$. We have to show that $B_{k+1}(S) \subseteq B_{k+1}(\Sigma_{2r+1})$. Since the radius of the CA is $r$, the set of blocks $B_{k+1}(\Sigma_{2r+1})$ is completely determined by the set of blocks $B_k(\Sigma_{4r+1})$, just as the set of blocks $B_{k+1}(\Phi_{[r+1,3r+1]}(S_{(F,2r+1)}))$ is completely determined by the set of blocks $B_k(S_{(F,2r+1)})$. Thus, by showing that $B_k(S_{(F,2r+1)}) \subseteq B_k(\Sigma_{4r+1})$ we can reach the conclusion $B_{k+1}(S) \subseteq B_{k+1}(\Sigma_{2r+1})$. Let $x \in B_k(S_{(F,2r+1)})$. Since $S$ is F-extendible, there exist $x_1,\ldots,x_{2r+1} \in B_k(S)$ such that $x = x_1 \odot \cdots \odot x_{2r+1}$. By inductive hypothesis, $x_1,\ldots,x_{2r+1} \in B_k(\Sigma_{2r+1})$; then, by Lemma 1, $x \in B_k(\Sigma_{4r+1})$. □


Proposition 4. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift. Then it is decidable if $S$ is F-extendible.

Proof. Given a labeled graph representation of $S$, it is possible to compute $S_{(F,2)}$ and it is possible to compute labeled graph representations for $\Phi_{[1,2r+1]}(S_{(F,2)})$ and $\Phi_{[2,2r+2]}(S_{(F,2)})$. Given labeled graph representations of $S$, $S' = \Phi_{[1,2r+1]}(S_{(F,2)})$ and $S'' = \Phi_{[2,2r+2]}(S_{(F,2)})$, it is easy to build three finite state automata whose recognized languages are respectively $\mathcal{L}(S)$, $\mathcal{L}(S')$ and $\mathcal{L}(S'')$. Then the proof follows from Proposition 2 and from the decidability of the equivalence between finite state automata. □

Proposition 5. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $S \subseteq \Sigma_{2r+1}$ be a sofic shift. Then it is decidable if $\Sigma_{2r+1} = S$.

Proof. We provide a proof for the following claim, which trivially is algorithmically checkable. Let $M = (Q,A^{2r+1},q_0,F,\delta)$ be the smallest DFA recognizing the language $\mathcal{L}(S)$. Let $N = (|Q| \cdot |A|^{2r+1})^{2r+1}$. Then $\Sigma_{2r+1} = S$ if and only if $B_N(\Sigma_{4r+1}) = B_N(S_{(F,2r+1)})$.

By Lemma 3, the necessary condition is trivially true. Obviously, if $\Sigma_{4r+1} = S_{(F,2r+1)}$ then $\Sigma_{2r+1} = S$. Thus, we show by induction on $k > 0$ that $B_k(\Sigma_{4r+1}) = B_k(S_{(F,2r+1)})$.

a. (Base Case) By hypothesis, $B_N(\Sigma_{4r+1}) = B_N(S_{(F,2r+1)})$. Moreover, since the language of a subshift is factorial, $B_k(\Sigma_{4r+1}) = B_k(S_{(F,2r+1)})$, $\forall k < N$.

b. (Inductive Case) Suppose $B_K(\Sigma_{4r+1}) = B_K(S_{(F,2r+1)})$, $K \ge N$. We have to show that $B_{K+1}(\Sigma_{4r+1}) = B_{K+1}(S_{(F,2r+1)})$.

Let $\mathcal{G} = (V,E,\zeta)$ be the labeled graph presentation of $S$ derived from the smallest DFA $M$ according to the procedure described at the end of section 2.1. Note that the number of vertices of $\mathcal{G}$ is less than or equal to $|Q| \cdot |A|^{2r+1}$. Moreover, let $\mathcal{G}_{(F,2r+1)}$ be the $(F,2r+1)$-extension of $\mathcal{G}$. Note that the number of vertices of $\mathcal{G}_{(F,2r+1)}$ is less than or equal to $N$. Let $a \in B_{K+1}(\Sigma_{4r+1})$ and let $a^1,\ldots,a^{2r+1} \in B_{K+1}(\Sigma_{2r+1})$ be such that $a = a^1 \odot \cdots \odot a^{2r+1}$. Since, by inductive hypothesis, $B_K(\Sigma_{4r+1}) = B_K(S_{(F,2r+1)})$, it follows that $B_{K+1}(\Sigma_{2r+1}) = B_{K+1}(S)$ and, trivially, that $a^1,\ldots,a^{2r+1} \in B_{K+1}(S)$. Then there exist unique legal paths

$u^i_1 \to u^i_2 \to \cdots \to u^i_{K+1}$

in $\mathcal{G}$, where $u^i_1 = (q_0,a^i_1)$ and $\zeta(u^i_k) = a^i_k$, $i \in [1,2r+1]$, $1 \le k \le K+1$. We show that there exists $x \in S_{(F,2r+1)}$ such that $x_{[0,K]} = a$. Let $y \in S_{(F,2r+1)}$ be such that $y_{[0,K-1]} = a_{[0,K-1]}$. One such $y$ exists since, by inductive hypothesis, $B_K(\Sigma_{4r+1}) = B_K(S_{(F,2r+1)})$. Then there exists a unique path $v_0 \to v_1 \to \cdots$ in $\mathcal{G}_{(F,2r+1)}$ such that $\zeta_{2r+1}(v_i) = y_i$, $i \in \mathbb{N}$, and such that $v_0 = ((q_0,c_{[1,2r+1]}),\ldots,(q_0,c_{[2r+1,4r+1]}))$ where $c = y_0 \in A^{4r+1}$. Since $K \ge N$ there exist $0 \le i < j \le K$ such that $v_i = v_j$. Then,


let us consider $\bar{a}^k = a^k_1 a^k_2 \ldots a^k_i a^k_{j+1} a^k_{j+2} \ldots a^k_{K+1}$, $1 \le k \le 2r+1$. Obviously, $\bar{a}^k \in \mathcal{L}(S) \cap \mathcal{L}(\Sigma_{2r+1})$, $1 \le k \le 2r+1$. Moreover, the $\bar{a}^k$ are compatible; then, by Lemma 1, $\bar{a} = \bar{a}^1 \odot \cdots \odot \bar{a}^{2r+1} \in \mathcal{L}(\Sigma_{4r+1})$ and, by inductive hypothesis, $\bar{a} \in \mathcal{L}(S_{(F,2r+1)})$.

Let $l = |\bar{a}|$. Let $z \in S_{(F,2r+1)}$ be such that $z_{[0,l)} = \bar{a}$. Then there exists a unique path $v'_0 \to v'_1 \to \cdots$ in $\mathcal{G}_{(F,2r+1)}$ such that $\zeta_{2r+1}(v'_i) = z_i$, $i \in \mathbb{N}$, and such that $v'_0 = v_0$. Moreover, since $v'_0 = v_0$ and $z_{[0,l)} = \bar{a}$, it follows that $v'_k = v_k$ for $0 \le k \le i$ and $v'_{i+k} = v_{j+k}$, $1 \le k \le c$, where $c = K - j$. Then it is easy to see that $v_0 \to \cdots \to v_K \to v'_{i+c+1} \to v'_{i+c+2} \to \cdots$ is a legal path in $\mathcal{G}_{(F,2r+1)}$ and that the labelings of the vertices in the path generate a sequence $x \in S_{(F,2r+1)}$ such that $x_{[0,K]} = a$. □

Now we are ready to state our main result and then to show its most immediate consequences.

Theorem 3. Let $(A^{\mathbb{Z}},F)$ be a CA with radius $r$ and let $S \subseteq (A^{2r+1})^{\mathbb{N}}$ be a sofic shift. Then it is decidable if $S = \Sigma_{2r+1}$.

Proof. $S = \Sigma_{2r+1}$ if and only if $S$ is F-extendible and $S \supseteq \Sigma_{2r+1}$. Then the proof follows from Proposition 4 and Proposition 5. □

We now explore some important consequences of Theorem 3 related to regular CA.

Theorem 4. Let $(A^{\mathbb{Z}},F)$ be a regular CA. Then $\forall t > 0$, $\Sigma_t$ is computable.

Proof. Let $r$ be the radius of the CA. By Theorem 3, given a sofic shift $S \subseteq (A^{2r+1})^{\mathbb{N}}$, it is possible to decide if $S = \Sigma_{2r+1}$. We can enumerate all labeled graphs representing all sofic shifts contained in $(A^{2r+1})^{\mathbb{N}}$. Then there exists an algorithm that iteratively generates graphs in the enumeration and checks if the shift represented is $\Sigma_{2r+1}$. Since $(A^{\mathbb{Z}},F)$ is regular, $\Sigma_{2r+1}$ will eventually be generated and recognized. This proves that, if $(A^{\mathbb{Z}},F)$ is regular, $\Sigma_{2r+1}$ is computable. In general, if $t < 2r+1$, we can compute $\Sigma_t$ by simply taking the projection $\Phi_{[1,t]}(\Sigma_{2r+1})$; otherwise, if $t > 2r+1$, by Lemma 3 point b, we can compute $\Sigma_t$ by computing the $(F,t-2r)$-extension of $\Sigma_{2r+1}$. □

The following theorem gives an answer to a question raised in [3].

Theorem 5. The topological entropy of one-sided regular CA is computable.

Proof. Since the entropy of sofic shifts is computable, the conclusion follows from Theorem 1 and Theorem 4. □

The following theorem shows that, if we restrict to the class of regular CA, it is possible to provide answers to questions which are undecidable in the general case.


Theorem 6. Let $(A^{\mathbb{Z}},F)$ be a regular CA. Then the following topological properties are decidable.

1. Nilpotency
2. Equicontinuity
3. Positive expansiveness

Proof. By Theorem 4, given $(A^{\mathbb{Z}},F)$, it is possible to compute $\Sigma_{2r+1}$.

1. It is easy to see that $(A^{\mathbb{Z}},F)$ is nilpotent if and only if there exist $a \in A^{2r+1}$ and $N > 0$ such that $\forall n > N$, $\forall x \in \Sigma_{2r+1}$, $\sigma^n(x) = a$. Given a labeled graph representation of $\Sigma_{2r+1}$, this last condition is trivially algorithmically checkable.

2. It is easy to see that $(A^{\mathbb{Z}},F)$ is equicontinuous if and only if $\mathcal{L}(\Sigma_{2r+1})$ is a bounded periodic language and that, given a labeled graph representation of $\Sigma_{2r+1}$, it is algorithmically checkable if $\mathcal{L}(\Sigma_{2r+1})$ is bounded periodic.

3. Every positively expansive CA is conjugated to $(\Sigma_{2r+1},\sigma)$ where $\Sigma_{2r+1}$ is a shift of finite type and, in particular, it is an $n$-full shift (see [12]). Since, for positively expansive CA, $n = |F^{-1}(x)|$ for every $x \in A^{\mathbb{Z}}$, $n$ is a computable number. The proof follows from the decidability of the conjugacy problem for one-sided shifts of finite type (see [15]). □

To conclude, we show that, as a negative consequence of the decidability of the properties in Theorem 6, regularity is an undecidable property, which implies that membership in Kurka's language classes is undecidable.

Theorem 7. It is undecidable whether a CA is regular.

Proof. Assume it is decidable if a CA is regular. Then, since nilpotent CA are regular, by Theorem 6 it is possible to decide if a CA is nilpotent, which contradicts the undecidability of nilpotency for one-dimensional CA [11]. □

4 Conclusions and open problems

We investigated decidable properties for regular cellular automata. We showed that regularity itself is not a decidable property (Theorem 7) and that, conversely, for regular cellular automata nilpotency, equicontinuity and positive expansiveness are decidable properties (Theorem 6). Moreover, we answered a question raised in [3], showing that the topological entropy is computable for one-sided regular CA (Theorem 5). It is unknown whether almost equicontinuity and sensitivity are decidable properties for regular CA (since being almost equicontinuous or sensitive is a dichotomy for CA, these two properties are either both decidable or both undecidable).

References

1. F. Blanchard, A. Maass. Dynamical Behaviour of Coven's Aperiodic Cellular Automata. Theor. Comput. Sci. 163, 291-302 (1996).
2. F. Blanchard, A. Maass. Dynamical properties of expansive one-sided cellular automata. Israel J. Math. 99, 149-174 (1997).
3. P. Di Lena. On Computing the Topological Entropy of one-sided Cellular Automata. International Journal of Unconventional Computing (1995). To appear.
4. B. Durand, E. Formenti, G. Varouchas. On undecidability of equicontinuity classification for cellular automata. Discrete models for complex systems, DMCS '03 (Lyon), 117-127 (2003).
5. R.H. Gilman. Classes of linear automata. Ergodic Theory Dynam. Systems 7, no. 1, 105-118 (1987).
6. R.H. Gilman. Notes on Cellular Automata. Preprint (1988).
7. G.A. Hedlund. Endomorphisms and automorphisms of the shift dynamical system. Math. Systems Theory 3, 320-375 (1969).
8. L.P. Hurd, J. Kari, K. Culik. The topological entropy of cellular automata is uncomputable. Ergodic Theory Dynam. Systems 12, no. 2, 255-265 (1992).
9. J. Hopcroft, J.D. Ullman. Introduction to automata theory, languages, and computation. Addison-Wesley Series in Computer Science. Addison-Wesley Publishing Co., Reading, Mass. (1979).
10. Z.S. Jiang, H.M. Xie. Evolution complexity of the elementary cellular automaton rule 18. Complex Systems 13, no. 3, 271-295 (2001).
11. J. Kari. The nilpotency problem of one-dimensional cellular automata. SIAM J. Comput. 21, no. 3, 571-586 (1992).
12. P. Kurka. Topological and symbolic dynamics. Cours Spécialisés [Specialized Courses], 11. Société Mathématique de France, Paris (2003).
13. P. Kurka. Languages, equicontinuity and attractors in cellular automata. Ergodic Theory Dynam. Systems 17, no. 2, 417-433 (1997).
14. P. Kurka. Zero-dimensional dynamical systems, formal languages, and universality. Theory Comput. Syst. 32, no. 4, 423-433 (1999).
15. D. Lind, B. Marcus. An introduction to symbolic dynamics and coding. Cambridge University Press, Cambridge (1995).
16. J. von Neumann. Theory of self-reproducing automata. Univ. of Illinois Press, Urbana (1966).

Symbolic Determinisation of Extended Automata

Thierry Jéron, Hervé Marchand, and Vlad Rusu

Irisa/Inria Rennes, Campus de Beaulieu, 35042 Rennes France. {Thierry.Jeron | Herve.Marchand | Vlad.Rusu}@irisa.fr

Abstract. We define a symbolic determinisation procedure for a class of infinite-state systems, which consists of automata extended with symbolic variables that may be infinite-state. The subclass of extended automata for which the procedure terminates is characterised as bounded lookahead extended automata. It corresponds to automata for which, in any location, the observation of a bounded-length trace is enough to infer the first transition actually taken. We discuss applications of the algorithm to the verification, testing, and diagnosis of infinite-state systems.

Key words: symbolic automata, determinisation

1 Introduction

Most existing models of computation are nondeterministic, but they include restricted, deterministic versions as subclasses. A natural question is how the expressiveness of the general, nondeterministic class compares with that of the corresponding restricted, deterministic subclass. For example, it is well known that nondeterministic and deterministic finite automata on finite words are equivalent, but for finite automata on infinite words, the equivalence depends on the acceptance condition (e.g., Muller versus Büchi acceptance); and for pushdown and timed automata, the nondeterministic version is strictly more expressive than the deterministic one [1, 2].

Besides this theoretical interest, the distinction between nondeterministic and deterministic models has practical consequences. For example, verification consists in checking whether an implementation of a system satisfies a specification; both views of the system are modeled by automata of some kind. This problem can be seen as a language inclusion problem, which in turn can be encoded into a language emptiness problem (i.e., checking the emptiness of the language recognised by a product between the implementation and the complement of the specification). The complement of the specification is an automaton that accepts exactly the words that are rejected by the specification, and is easily computed if the specification is deterministic (by complementing the specification's acceptance condition). Otherwise, if the specification is nondeterministic, it has to be determinised, i.e., turned into an equivalent deterministic machine.

Please use the following format when citing this chapter: Jeron, T., Marchand, H., Rusu, V., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakwa, Y., (Boston: Springer), pp. 197-212.


Hence, determinisation is an important operation in formal verification. It is also important in other fields such as conformance testing and fault diagnosis, where deterministic testers (resp. diagnosers) have to be derived from specifications that are, in general, nondeterministic due to, e.g., partial observation.

In this paper we define a determinisation operation for a class of infinite-state systems, which consists of extended automata operating on symbolic variables and communicating with the environment via synchronising actions. Variants of this model are often encountered in the literature and can be used, e.g., for the formal specification of reactive systems. The determinisation procedure consists in iterating a sequence of local determinisation steps, which postpone operations on the variables until it becomes clear which exact operations should have been performed. The subclass of extended automata on which the procedure terminates is characterised as bounded-lookahead automata, for which the observation of a bounded-length trace is enough to infer the first transition actually taken. The result is nontrivial because the order in which local determinisation steps are iterated has a strong influence on termination. The main difficulty was to find an order for which the bounded lookahead decreases at each iteration, thus ensuring termination of the procedure.

The rest of the paper is organised as follows. We first introduce extended automata and the determinisation operation by means of examples. Then, in Section 2 we formally define the syntax and semantics of extended automata, and in Section 3 the determinisation operation is formally defined. The operation may not terminate in general, and in Section 4 the subclass for which the procedure does terminate is precisely characterised via necessary and sufficient conditions. However, these conditions are undecidable; hence, we also provide sufficient, decidable conditions for termination. In Section 5 we discuss applications of our procedure to the verification, testing, and diagnosis of reactive systems, and conclude in Section 6. The technical report [3] contains proofs of all the results.

Example 1 (extended automata, determinisation). Figure 1 (left) depicts an extended automaton $S$. In location $l_0$, the action $a$ occurs. If $x \ge 0$ then the control goes to location $l_1$ and the variable $x$ is decreased by 1, and if $x \le 0$ then the control goes to location $l_2$ and $x$ is increased by 2. Clearly, if $x = 0$ then the next control location and the next value of $x$ are not uniquely defined: the system is nondeterministic. The right-hand side of Figure 1 depicts the automaton $det(S)$ obtained after determinising $S$. Intuitively, the locations $l_1$ and $l_2$, which could be nondeterministically chosen as the next control location after an action $a$, are merged into one new location denoted by $\langle l_0,\langle l_1,l_2\rangle\rangle$. A new transition labeled by $a$ goes from $l_0$ to $\langle l_0,\langle l_1,l_2\rangle\rangle$. This transition is taken if $a$ occurs, and if $x$ satisfies the disjunction $x \ge 0 \vee x \le 0$ (which actually simplifies to true). This condition is the disjunction of the guards of the two transitions involved in the nondeterministic choice in $S$. Note, however, that those transitions perform different assignments to variables: $x := x - 1$ for one, and $x := x + 2$ for the other. Hence, the new transition


Fig. 1. Left: extended automaton $S$. Right: extended automaton $det(S)$.

from $l_0$ to $\langle l_0,\langle l_1,l_2\rangle\rangle$ of $det(S)$ does not "know" which assignment to perform. To solve this problem, the idea is to postpone assignments until it becomes clear which one of the transitions of the nondeterministic choice was actually taken, and then to "catch up" with the assignments in order to preserve the semantics. Hence, if $b$ occurs after $a$, then the transition from $l_0$ to $l_1$ was taken (hence, $x := x - 1$ should have been performed), but if $c$ occurs after $a$, the transition from $l_0$ to $l_2$ was taken (hence $x := x + 2$ should have been performed). Note how the assignments are simulated in $det(S)$: the transition labeled by $b$ (resp. by $c$) has $x - 1$ (resp. $x + 2$) substituted for $x$ in its guard and assignments. To match the behaviour of $S$, in which the transition labeled by $b$ (resp. $c$) is fireable only after a transition labeled $a$ has been fired with $x \ge 0$ (resp. $x \le 0$) holding, the guard of the transition labeled by $b$ (resp. $c$) in $det(S)$ is strengthened by $x \ge 0$ (resp. $x \le 0$).

2 Extended automata

Extended automata consist of a finite control structure and a finite set of typed variables $V$. Each variable $x \in V$ takes values in some domain $dom_x$. A valuation $\nu$ of the variables $V$ is a function that associates to each variable $x \in V$ a value $\nu(x) \in dom_x$. The set of valuations of the variables $V$ is denoted by $\mathcal{V}$. In the sequel, a predicate $P$ over variables $V$ is often identified with its set of "solutions", i.e., the set of valuations $\mathcal{V}' \subseteq \mathcal{V}$ of the variables $V$ for which $P$ is true.

Definition 1 (extended automaton). An extended automaton (sometimes referred to simply as an automaton) is a tuple $S = \langle V,\Theta,L,l^0,\Sigma,T\rangle$:


- $V$ is a finite set of typed variables,
- $\Theta$ is the initial condition, a predicate on $V$, assumed to have a unique solution,
- $L$ is a nonempty, finite set of locations and $l^0 \in L$ is the initial location,
- $\Sigma$ is a nonempty, finite alphabet of actions,
- $T$ is a set of transitions. Each transition $t \in T$ is associated with a tuple $\langle o_t,G_t,a_t,A_t,d_t\rangle$, where
  - $o_t \in L$ is called the origin of the transition,
  - $G_t$ is a Boolean expression over variables $V$, called the guard,
  - $a_t \in \Sigma$ is called the action of the transition,
  - $A_t$ is the assignment of the transition: a set of expressions of the form $(x := A^x)_{x \in V}$ where, for each $x \in V$, the right-hand side $A^x$ of the assignment $x := A^x$ is an expression on $V$,
  - $d_t \in L$ is called the destination of the transition.

We sometimes write $t : \langle o,G,a,A,d\rangle$ to emphasise the tuple associated to $t$. By slight abuse of notation, we shall denote by $\circ$ an operation of syntactical substitution: a guard $G$ (or an assignment $A$) is composed with another assignment $A'$ by replacing in $G$ (resp. in the right-hand side of $A$) all the variables by their corresponding right-hand sides from $A'$. Examples of such substitutions in guards and assignments have been given in Example 1 above. The semantics of extended automata is described by labelled transition systems.

Definition 2 (Labelled Transition System (LTS)). A Labelled Transition System is a tuple $S = \langle Q,Q^0,\Lambda,\to\rangle$ where $Q$ is a set of states, $Q^0 \subseteq Q$ is the set of initial states, $\Lambda$ is a set of labels, and $\to \subseteq Q \times \Lambda \times Q$ is the transition relation.

The LTS semantics of an extended automaton enumerates the valuations $\nu$ of its variables $V$. For an expression $E$ involving (a subset of) $V$, and for $\nu \in \mathcal{V}$, we denote by $E(\nu)$ the value obtained by substituting in $E$ each variable $x$ by its value $\nu(x)$.

Definition 3 (Semantics of extended automata). The semantics of an extended automaton $S = \langle V,\Theta,L,l^0,\Sigma,T\rangle$ is an LTS $[\![S]\!] = \langle Q,\{q^0\},\Lambda,\to\rangle$, where

- the set of states is $Q = L \times \mathcal{V}$,
- the set of initial states is the singleton $\{q^0\} = \{\langle l^0,\nu^0\rangle\}$ where $\nu^0$ is the unique valuation satisfying $\Theta$,
- the set of labels is $\Lambda = T$,
- $\to$ is the smallest relation in $Q \times \Lambda \times Q$ defined by the following rule:

$$\frac{\langle l,\nu\rangle, \langle l',\nu'\rangle \in Q \qquad t : \langle l,G,a,A,l'\rangle \in T \qquad G(\nu) = true \qquad \nu' = A(\nu)}{\langle l,\nu\rangle \xrightarrow{t} \langle l',\nu'\rangle}$$


The rule says that the transition $t : \langle l,G,a,A,l'\rangle$ is fireable in a state $\langle l,\nu\rangle$ if the guard $G$ evaluates to true when the variables evaluate according to $\nu$; then the transition takes the system to the state $\langle l',\nu'\rangle$ where the assignment $A$ of the transition maps the valuation $\nu$ to $\nu'$. We extend this notion to sequences of transitions $\sigma = t_1 \cdot t_2 \cdots t_n \in T^*$, saying that $\sigma$ is fireable in a state $q \in Q$ if there exist states $q_1 = q, q_2, \ldots, q_{n+1} \in Q$ such that $\forall i = 1 \ldots n$, $q_i \xrightarrow{t_i} q_{i+1}$. We then write $q \xrightarrow{\sigma}$ to say that $\sigma$ is fireable in $q$. The transition sequence $\sigma$ is initially fireable if it is fireable in the initial state $q^0$. A state $q$ is reachable if there exists an initially fireable transition sequence $\sigma$ leading to it, i.e., $\exists \sigma \in T^*, q^0 \xrightarrow{\sigma} q$. We denote by $Reach(S)$ the set of reachable states. For a sequence $\sigma = t_1 \cdots t_n \in T^n$ ($n \ge 1$), we let $first(\sigma) = t_1$.

Definition 4 (trace). The trace of a transition sequence $\sigma = t_1 \cdot t_2 \cdots t_n$ is the projection $trace(\sigma) = a_{t_1} \cdot a_{t_2} \cdots a_{t_n}$ of $\sigma$ on the set $\Sigma$ of actions. The set of traces of an extended automaton $S$ is the set of traces of initially fireable transition sequences and is denoted by $Traces(S)$.
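The firing rule and the notion of trace can be made concrete with a short Python sketch. It is an illustration only: guards and assignments are modelled as Python functions on valuations (dictionaries), which is an assumption of this sketch and not the symbolic representation used in the paper, and the names `Transition`, `fire`, `run` and `trace` are hypothetical. The two sample transitions follow the $a$-labelled transitions of Example 1, read here with guards $x \ge 0$ and $x \le 0$.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Valuation = Dict[str, int]

@dataclass(frozen=True)
class Transition:
    origin: str
    guard: Callable[[Valuation], bool]
    action: str
    assign: Callable[[Valuation], Valuation]
    dest: str

def fire(state, t):
    """Apply the firing rule: return the successor state, or None if t is not fireable."""
    loc, val = state
    if loc != t.origin or not t.guard(val):
        return None
    return (t.dest, t.assign(val))

def run(state, sigma):
    """Fire the transitions of sigma in order; None if the sequence is not fireable."""
    for t in sigma:
        state = fire(state, t)
        if state is None:
            return None
    return state

def trace(sigma):
    """Projection of a transition sequence on its actions."""
    return [t.action for t in sigma]

# The two a-labelled transitions of Example 1.
t1 = Transition("l0", lambda v: v["x"] >= 0, "a", lambda v: {"x": v["x"] - 1}, "l1")
t2 = Transition("l0", lambda v: v["x"] <= 0, "a", lambda v: {"x": v["x"] + 2}, "l2")
```

From the state with $x = 0$ both `t1` and `t2` are fireable and they have the same trace, which is exactly the nondeterminism discussed in Example 1.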

3 Local Determinisation

Intuitively, an extended automaton is deterministic if in each location, the guards of the transitions labeled by the same action are mutually exclusive. Determinising an extended automaton $S$ means computing a deterministic extended automaton $det(S)$ with the same traces as $S$.

Definition 5 (deterministic extended automaton). An extended automaton $\langle V,\Theta,L,l^0,\Sigma,T\rangle$ is deterministic in a location $l \in L$ if for all actions $a \in \Sigma$ and each pair $t_1 : \langle l,G_1,a,A_1,l_1\rangle$ and $t_2 : \langle l,G_2,a,A_2,l_2\rangle$ of transitions with origin $l$ and labeled by $a$, the conjunction of the guards $G_1 \wedge G_2$ is unsatisfiable. The automaton is deterministic if it is deterministic in all locations $l \in L$.

It is assumed that the guards are written in a theory where satisfiability is decidable, such as, e.g., combinations of quantifier-free Presburger arithmetic formulas, arrays, and lists. Such formulas are expressive enough to encode the most common data structures, and their satisfiability is decidable using, e.g., the classical Nelson-Oppen combination of decision procedures [4]. Note that determinism does not take reachability of states into account. However, since extended automata have a unique initial state, the definition of determinism is equivalent to the fact that the semantics of a deterministic extended automaton is a deterministic LTS in the usual sense.

Example 1 shows that determinising two transitions consists in merging the two transitions into a new one, and propagating guards and assignments onto transitions following them (cf. Figure 1). Formally, $follow(t) = \{t' \in T \mid o_{t'} = d_t\}$. We also denote by $Id_V$ the identity assignments over variables $V$, i.e., $x := x$ for each $x \in V$.
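Definition 5 asks for an unsatisfiability check on $G_1 \wedge G_2$, which in the paper is delegated to a decision procedure. The sketch below replaces that procedure with a bounded search over a finite list of sample valuations, so it can only find witnesses of nondeterminism and can never prove determinism. The tuple layout `(origin, guard, action, assign, dest)` and the function names are assumptions made for illustration.

```python
from itertools import combinations

def follow(transitions, t):
    """Transitions whose origin is the destination of t."""
    return [u for u in transitions if u[0] == t[4]]

def nondeterministic_pairs(transitions, location, samples):
    """Pairs of same-action transitions leaving `location` whose guards
    are both true on at least one of the sample valuations."""
    here = [t for t in transitions if t[0] == location]
    pairs = []
    for t1, t2 in combinations(here, 2):
        if t1[2] == t2[2] and any(t1[1](v) and t2[1](v) for v in samples):
            pairs.append((t1, t2))
    return pairs

# The a-labelled transitions of Example 1, plus a hypothetical b-transition.
identity = lambda v: dict(v)
ta1 = ("l0", lambda v: v["x"] >= 0, "a", lambda v: {"x": v["x"] - 1}, "l1")
ta2 = ("l0", lambda v: v["x"] <= 0, "a", lambda v: {"x": v["x"] + 2}, "l2")
tb = ("l1", lambda v: v["x"] > 0, "b", identity, "l3")
T = [ta1, ta2, tb]
samples = [{"x": n} for n in range(-5, 6)]
pairs = nondeterministic_pairs(T, "l0", samples)
```

Here the sample $x = 0$ satisfies both guards, so the pair of $a$-transitions is reported; a real implementation would call a solver for the guard theory instead of enumerating samples.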

Fig. 2. Determinising 2 transitions: (left) before, (right) after.

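Definition 6 (determinising two transitions). Let $S$ be an extended automaton, and let $t_1,t_2 \in T$ be two transitions with same origin $o = o_{t_1} = o_{t_2}$ and same action $a = a_{t_1} = a_{t_2}$. The automaton $det_2(S,t_1,t_2)$ is defined as follows. If $G_{t_1} \wedge G_{t_2}$ is unsatisfiable then $det_2(S,t_1,t_2) = S$; otherwise,

- $V_{det_2(S,t_1,t_2)} = V_S$,
- $\Theta_{det_2(S,t_1,t_2)} = \Theta_S$,
- $L_{det_2(S,t_1,t_2)} = L_S \cup \{\langle o,\langle d_{t_1},d_{t_2}\rangle\rangle\}$, where $\langle o,\langle d_{t_1},d_{t_2}\rangle\rangle$ is a new location,
- $l^0_{det_2(S,t_1,t_2)} = l^0_S$,
- $\Sigma_{det_2(S,t_1,t_2)} = \Sigma_S$,
- $T_{det_2(S,t_1,t_2)} = T_S \setminus \{t_1,t_2\} \cup \{t_{1,2}\} \cup T_1 \cup T_2$, where
  - $t_{1,2} = \langle o, G_{t_1} \vee G_{t_2}, a, Id_V, \langle o,\langle d_{t_1},d_{t_2}\rangle\rangle\rangle$,
  - for $i = 1,2$, $T_i = \bigcup_{t' \in follow(t_i)} \{mod_i(t')\}$, with the transitions $mod_i(t') : \langle \langle o,\langle d_{t_1},d_{t_2}\rangle\rangle, G_{t_i} \wedge G_{t'} \circ A_{t_i}, a_{t'}, A_{t'} \circ A_{t_i}, d_{t'}\rangle$.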

The operation is illustrated in Figure 2. The transitions $t_1$ and $t_2$ in $S$ are replaced in $det_2(S,t_1,t_2)$ by the set of transitions $\{t_{1,2}\} \cup T_1 \cup T_2$. The transition $t_{1,2}$ leads from the common origin $o$ of $t_1,t_2$ to the new location $\langle o,\langle d_{t_1},d_{t_2}\rangle\rangle$; its guard is the disjunction of those of $t_1,t_2$; hence, $t_{1,2}$ can be fired whenever $t_1$ or $t_2$ can. However, $t_{1,2}$ does not perform any of the assignments of $t_1,t_2$ because it does not "know" which ones to perform. The assignments are postponed onto copies of the transitions $t' \in follow(t_i)$ ($i = 1,2$), modified in order to "catch up" with the effect of transition $t_i$:

- the guard $G_{mod_i(t')}$ equals $G_{t_i} \wedge G_{t'} \circ A_{t_i}$. Intuitively, this amounts to firing the transition $mod_i(t')$ in $det_2(S,t_1,t_2)$ under exactly the same conditions as the transition $t'$ in $S$: the conjunct $G_{t_i}$ "recalls" that $t_i$ should have been fired before $t'$, and by composing $G_{t'}$ with $A_{t_i}$, the effect of $t_i$ on the variables is simulated before the guard of $t'$ is evaluated.
- $A_{mod_i(t')}$ performs the assignments of $A_{t'}$ composed with the assignments $A_{t_i}$. In this way, the cumulated effect on the variables of firing in sequence $t_i$ then $t'$ in $S$ is simulated.
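The transition part of $det_2$ can be sketched as follows. The sketch makes several assumptions that are not in the paper: transitions are tuples `(origin, guard, action, assign, dest)` whose guards and assignments are Python functions on valuations, so the substitution $\circ$ becomes function composition; the caller is assumed to have already established that $G_{t_1} \wedge G_{t_2}$ is satisfiable; and the guards of the $b$ and $c$ transitions in the example, as well as the locations `l3` and `l4`, are hypothetical.

```python
def det2(transitions, t1, t2):
    """Transition set of det2(S, t1, t2), assuming the guards of t1 and t2 overlap."""
    o, g1, a, _A1, d1 = t1
    _, g2, _, _A2, d2 = t2
    merged = (o, (d1, d2))            # the new location <o, <d_t1, d_t2>>
    identity = lambda v: dict(v)      # Id_V: no assignment is performed yet

    # t_{1,2}: disjunction of the guards, identity assignment.
    t12 = (o, lambda v: g1(v) or g2(v), a, identity, merged)

    def mod(ti, tp):
        # Copy of tp that "catches up" with the postponed effect of ti.
        _, gi, _, Ai, _ = ti
        _, gp, ap, Ap, dp = tp
        return (merged,
                lambda v: gi(v) and gp(Ai(v)),   # G_ti and (G_t' o A_ti)
                ap,
                lambda v: Ap(Ai(v)),             # A_t' o A_ti
                dp)

    rest = [t for t in transitions if t is not t1 and t is not t2]
    new = [t12]
    for ti in (t1, t2):
        new += [mod(ti, tp) for tp in transitions if tp[0] == ti[4]]
    return rest + new

# Example 1, with hypothetical guards on the b and c transitions.
keep = lambda v: dict(v)
ta1 = ("l0", lambda v: v["x"] >= 0, "a", lambda v: {"x": v["x"] - 1}, "l1")
ta2 = ("l0", lambda v: v["x"] <= 0, "a", lambda v: {"x": v["x"] + 2}, "l2")
tb = ("l1", lambda v: v["x"] > 0, "b", keep, "l3")
tc = ("l2", lambda v: v["x"] < 0, "c", keep, "l4")
det = det2([ta1, ta2, tb, tc], ta1, ta2)
t12, mod_b, mod_c = det[2], det[3], det[4]
```

Because guards are opaque functions here, the result cannot be simplified or checked for satisfiability; a symbolic implementation would build the substituted formulas explicitly instead.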


Definition 7 (Local determinisation in location). The local determinisation in location $l$ of an extended automaton $S = \langle V,\Theta,L,l^0,\Sigma,T\rangle$, where $l \in L$, is defined as follows. Let $T_l \subseteq T$ be the set of all transitions with origin $l$, then:

- $det(S,l) = S$ if for every pair of distinct transitions $t_1,t_2 \in T_l$ such that $a_{t_1} = a_{t_2}$, the formula $G_{t_1} \wedge G_{t_2}$ is unsatisfiable;
- otherwise, choose two distinct transitions $t_1,t_2 \in T_l$ such that $a_{t_1} = a_{t_2}$ and $G_{t_1} \wedge G_{t_2}$ is satisfiable, and let $det(S,l) = det(det_2(S,t_1,t_2),l)$.

The operation terminates, as the set of pairs of nondeterministic transitions decreases.

4 Bounded-Lookahead Extended Automata

We now know how to eliminate nondeterminism from a location $l \in L_S$. To eliminate nondeterminism globally from $S$, one should then iterate $det(S,l)$ for all $l \in L_S$. However, local determinisation creates new locations, which may themselves be nondeterministic and have to be determinised, which may give rise to yet another set of nondeterministic locations, etc. This raises the question of whether the global determinisation process ever terminates. In this section we define a global determinisation procedure that we show to terminate exactly for the class of bounded lookahead extended automata. Intuitively, an automaton is deterministic with lookahead $n$ if any nondeterministic choice can be resolved by looking $n$ actions ahead.

Definition 8 (bounded lookahead). An automaton $S = (V, \Theta, L, l^0, \Sigma, T)$ has lookahead $n \in \mathbb{N}$ in a state $q \in Q_{[\![S]\!]}$ if
$\forall \sigma_1, \sigma_2 \in T^{n+1}.\ q \xrightarrow{\sigma_1} \ \wedge\ q \xrightarrow{\sigma_2} \ \wedge\ trace(\sigma_1) = trace(\sigma_2) \Rightarrow first(\sigma_1) = first(\sigma_2)$.
The automaton has lookahead $n$ in a set $Q' \subseteq Q_{[\![S]\!]}$ of states if it has lookahead $n$ in every $q \in Q'$. Finally, $S$ has bounded lookahead if, for some $n \in \mathbb{N}$, $S$ has lookahead $n$ in the whole set $Q_{[\![S]\!]}$.

We shall find it convenient to define the lookahead of a location of an automaton.

Definition 9 ((smallest) lookahead in location). An automaton $S$ has lookahead $n$ in location $l \in L$ if $S$ has lookahead $n$ in the set $\{(l,v) \mid v \in \mathcal{V}\}$. $S$ has smallest lookahead $n \in \mathbb{N}$ in a given location $l$ if it has lookahead $n$ in $l$, and does not have lookahead $n-1$ in $l$. We denote by $look(l,S) \in \mathbb{N}$ the smallest lookahead of location $l$ in $S$ if it exists; otherwise, $look(l,S) = \infty$.

For example, the automaton depicted in the left-hand side of Figure 3 has $look = 1$ in $l_0$, because, when $e$ occurs, the left-hand side $a$-labeled transition must have been fired, but when $b$ occurs, the right-hand side $a$-labeled transition has been fired.
On the other hand, the automaton depicted in the left-hand side of Figure 4 does not have $look = 1$ in $l_0$, because the occurrence of $b$ does not reveal which of the $a$-labeled transitions was fired. However, the following action (either $c$ or $d$) reveals all the past trace; hence, $look = 2$ in $l_0$ for the given automaton.
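The lookahead condition of Definitions 8 and 9 can be illustrated on a finite automaton without variables or guards, where states coincide with locations. The following is only a minimal sketch under that assumption; the tuple encoding of transitions, the function names, and the example automaton `FIG4_LIKE` (built in the spirit of the description of Figure 4, not copied from it) are all hypothetical.

```python
from itertools import product

# Hypothetical encoding: a transition is (name, origin, action, destination).
# Guards and variables are ignored, so a state is just a location.

def paths(transitions, state, length):
    """All sequences of `length` consecutive transitions fireable from `state`."""
    if length == 0:
        return [[]]
    result = []
    for t in transitions:
        if t[1] == state:
            for rest in paths(transitions, t[3], length - 1):
                result.append([t] + rest)
    return result

def has_lookahead(transitions, state, n):
    """Definition 8: sequences of n+1 transitions with equal traces
    must start with the same transition."""
    seqs = paths(transitions, state, n + 1)
    for s1, s2 in product(seqs, repeat=2):
        same_trace = [t[2] for t in s1] == [t[2] for t in s2]
        if same_trace and s1[0][0] != s2[0][0]:
            return False
    return True

def smallest_lookahead(transitions, state, bound):
    """Smallest n <= bound such that `state` has lookahead n, or None."""
    for n in range(bound + 1):
        if has_lookahead(transitions, state, n):
            return n
    return None

# Assumed example: two a-transitions from l0, each followed by b,
# and only the third action (c or d) tells the branches apart.
FIG4_LIKE = [
    ("t1", "l0", "a", "l1"), ("t2", "l0", "a", "l2"),
    ("t3", "l1", "b", "l3"), ("t4", "l2", "b", "l4"),
    ("t5", "l3", "c", "l5"), ("t6", "l4", "d", "l6"),
]
```

On this assumed automaton, `smallest_lookahead(FIG4_LIKE, "l0", 5)` evaluates to 2, which matches the informal argument above. The enumeration is exponential in `n` and is meant for intuition only; for extended automata with guards the condition is not decidable in general.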


Fig. 3. Inherited nondeterminism may not decrease global lookahead.

Fig. 4. Created nondeterminism has decreased global lookahead.

Definition 10 (global lookahead). $look(S) = \max_{l \in L_S} \{look(l,S)\}$.

Clearly, a location $l$ is deterministic in an automaton $S$ iff $look(l,S) = 0$; and the automaton $S$ itself is deterministic iff $look(S) = 0$. The following proposition says that the lookahead of an automaton does not increase by local determinisation.

Proposition 1 (Global lookahead does not increase). $look(det(S,l)) \le look(S)$.

The following examples show that $look(S)$ may or may not decrease with local determinisation. Consider the automaton on the left-hand side of Figure 3, which has global lookahead 1. Determinising in $l_0$ leaves the automaton in the right-hand side, which still has the same global lookahead. The determinisation in $l_0$ in Figure 4, however, decreases the global lookahead of the automaton from 2 to 1.


The difference between these situations is the following: in Figure 3, the determinisation step has merged the nondeterministic location $l_2$ into the new location $(l_0, \{l_1, l_2\})$; hence, the resulting automaton has inherited (in a sense that will be made precise below) the nondeterminism that $l_2$ had. Because of that nondeterminism, the global lookahead has not decreased. On the other hand, the determinisation step in Figure 4 does not have this problem: both $l_1, l_2$ are deterministic, and, even though the new location $(l_0, \{l_1, l_2\})$ is nondeterministic, the nondeterminism is created by the fact that $l_1, l_2$ bring one $b$-labeled transition each.

Definition 11 (created/inherited nondeterminism). Let $S$ be an extended automaton and $t_1, t_2$ be two transitions of $S$ involved in a nondeterminism in $o_{t_1} = o_{t_2} = o$. Let $(o, \{d_{t_1}, d_{t_2}\})$ be the new location resulting from the determinisation $det_2(S,t_1,t_2)$, and assume that $(o, \{d_{t_1}, d_{t_2}\})$ is nondeterministic in $det_2(S,t_1,t_2)$. We say that this nondeterminism is created if both $d_{t_1}, d_{t_2}$ are deterministic in $S$; otherwise, the nondeterminism is inherited.

Now, consider a global determinisation procedure that performs local determinisation steps in a breadth-first order: the first iteration determinises the nondeterministic locations of the original automaton, and each subsequent iteration determinises the new nondeterministic locations generated during the iteration that preceded it. Figure 3 also illustrates the first iteration of such a breadth-first procedure on the automaton in the left-hand side. The resulting automaton is depicted on the right-hand side. Both automata have the same global lookahead, namely 1. Hence, the lookahead cannot be used as a decreasing measure to ensure the termination of the procedure. Even worse, applying local determinisations in a depth-first order (i.e., determinising new nondeterministic locations as soon as they are created) may not terminate, even when the automaton has bounded lookahead.
An example is shown in Figure 5: the automaton in the left-hand side has global lookahead 1, and, by determinising in $l_0$, one obtains the automaton depicted in the right-hand side of the figure, which contains a sub-automaton isomorphic to the automaton in the left-hand side, with global lookahead still 1. After determinising in the newly created location, the sub-automaton is still there, and it remains present all through the process of depth-first determinisation, which, in this case, clearly does not terminate. Hence, applying local determinisation steps in depth-first or in breadth-first order does not lead, in general, to a terminating global determinisation procedure. However, Proposition 2 below shows that if an iteration of a breadth-first procedure only gives rise to created nondeterminism, the global lookahead does decrease.

Fig. 5. Depth-first determinisation may not terminate.

Proposition 2 (Global lookahead decreases if all new nondeterminism is created). Let $S'$ be an automaton obtained by determinising all nondeterministic locations $\{l_1, \ldots, l_k\}$ of an automaton $S$ in an arbitrary order (i.e., $S_0 = S$, $\forall i \le k-1$, $S_{i+1} = det(S_i, l_i)$, and $S' = S_k$). If none of these local determinisation steps gave rise to inherited nondeterminism, then $look(S') < look(S)$.

To ensure that all new nondeterminism is created, one must determinise locations whose direct successors are deterministic. But now we are faced with another difficulty: if the automaton has cycles in which every location is nondeterministic, it is impossible to choose a location on the cycle to start determinising with. This will lead us to "breaking" such cycles by determinising one location on each of them.

Definition 12. A location $l'$ is a direct successor of a location $l$ in $S$ if there exists $t \in T_S$ such that $o_t = l$ and $d_t = l'$. A cycle is a sequence $c = t_1 \cdot t_2 \cdots t_n \in T^*$ such that $\forall i = 1, \ldots, n-1$, $d_{t_i} = o_{t_{i+1}}$, and $d_{t_n} = o_{t_1}$. The cycle is elementary if moreover $\forall i, j = 1, \ldots, n-1$, $i < j \Rightarrow d_{t_i} \neq d_{t_j}$ holds. We say $l \in c$ if $\exists i \in \{1, \ldots, n\}.\ l = d_{t_i}$, denote by $C(S)$ the set of cycles of $S$, and by $C(S,l) = \{c \in C(S) \mid l \in c\}$.

Definition 13 (nondeterministic cycle). A cycle $c$ is nondeterministic if $\forall l \in c$, $l$ is nondeterministic. We denote by $\mathcal{N}(S)$ the set of nondeterministic cycles of $S$.

Lemma 1. For $S$ an automaton and all locations $l \in L_S$, $C(det(S,l), l) \cap \mathcal{N}(det(S,l)) = \emptyset$, and $\forall c' \in C(S).\ c' \notin C(S,l) \wedge c' \notin \mathcal{N}(S) \Rightarrow c' \in C(det(S,l)) \setminus \mathcal{N}(det(S,l))$.

Proof. For the first statement, note that $l$ is deterministic in $det(S,l)$; hence, by definition, a cycle $c \in C(det(S,l), l)$ cannot be nondeterministic in $det(S,l)$, i.e., it cannot be in $\mathcal{N}(det(S,l))$. For the second statement, the left-hand side of the implication means that the cycle $c' \in C(S)$ does not visit $l$, but visits some other location $l'$ which is deterministic in $S$. Determinisation in $l$ leaves $c'$ unchanged; thus, $c' \in C(det(S,l))$, and $l'$ is still deterministic in $det(S,l)$; hence, $c' \notin \mathcal{N}(det(S,l))$. $\Box$


Lemma 1 says that cycles visiting $l$ in $det(S,l)$ are not nondeterministic, and that cycles $c'$ that do not visit $l$ and that are not nondeterministic in $S$ are still not nondeterministic cycles of $det(S,l)$. The consequences are that determinising one location per elementary nondeterministic cycle generates an automaton without any nondeterministic cycles, and that determinisation does not add new nondeterministic cycles. We now introduce our global determinisation procedure (Fig. 6), which starts by "breaking" all elementary nondeterministic cycles, by determinising one location on each.

Procedure $det(S)$
  while $C := \{c \in \mathcal{N}(S) \mid c \text{ elementary}\} \neq \emptyset$ do
    choose $c \in C$; choose $l \in c$; $S := det(S,l)$
  endwhile
  $n := 0$; $S_n := S$
  while $S_n$ is nondeterministic do
    $S'_n := S_n$
    while $L' := \{l \in L_{S_n} \mid S'_n \text{ is nondeterministic in } l\} \neq \emptyset$ do
      $L'' := \{l' \in L' \mid S'_n \text{ is deterministic in all direct successors of } l'\}$
      choose $l \in L''$
      $S'_n := det(S'_n, l)$
    endwhile
    $S_{n+1} := S'_n$; $n := n+1$
  endwhile
  return $S_n$.

Fig. 6. Global determinisation procedure $det(S)$
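The two graph-level tests that the procedure relies on (is there a cycle made only of nondeterministic locations, and which nondeterministic locations have only deterministic direct successors) can be sketched on a plain successor graph. This is only an illustrative sketch: the dictionary encoding of the direct-successor relation and both function names are assumptions, and the code does not perform determinisation itself, which would also change the graph.

```python
def has_nondeterministic_cycle(successors, nondet):
    """True iff some cycle visits only locations in `nondet`.

    successors: dict mapping a location to the set of its direct successors.
    nondet: set of nondeterministic locations.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {l: WHITE for l in nondet}

    def visit(l):
        # Depth-first search restricted to nondeterministic locations.
        colour[l] = GREY
        for succ in successors.get(l, ()):
            if succ not in nondet:
                continue
            if colour[succ] == GREY:
                return True  # back edge: a cycle inside `nondet`
            if colour[succ] == WHITE and visit(succ):
                return True
        colour[l] = BLACK
        return False

    return any(colour[l] == WHITE and visit(l) for l in nondet)


def safe_candidates(successors, nondet):
    """The set L'' of Fig. 6: nondeterministic locations whose direct
    successors are all deterministic."""
    return {l for l in nondet
            if all(s not in nondet for s in successors.get(l, ()))}


# Assumed example: l0 and l1 form a nondeterministic cycle, l2 does not.
SUCC = {"l0": {"l1"}, "l1": {"l0"}, "l2": {"l3"}}
```

With `nondet = {"l0", "l1", "l2"}` the cycle test succeeds and only `l2` is a safe candidate. Once `l0` is treated as deterministic (as if the cycle had been broken there), no nondeterministic cycle remains and every remaining nondeterministic location becomes a candidate in turn, which is the situation the termination proof exploits.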

Theorem 1 (termination, sufficient condition). $det(S)$ terminates if $look(S) < \infty$.

Proof. By Lemma 1 and Proposition 1, the elimination of nondeterministic cycles (first while loop in Figure 6) terminates and does not increase $look(S)$. Consider the sets $L'' \subseteq L'$ computed at each new iteration of the inner (third) while loop. Note that $L' \neq \emptyset$ and $L'' = \emptyset$ implies that there exists a nondeterministic cycle in $S_n$. Indeed, assume $l_1 \in L'$; then $L'' = \emptyset$ implies $l_1 \notin L''$, which implies that $l_1$ has a direct successor $l_2 \in L_{S_n}$ where $S'_n$ is also nondeterministic, which implies again $l_2 \in L'$. The process continues, and we eventually build a nondeterministic cycle in $S_n$, which is impossible since all nondeterministic cycles were eliminated. Inside the inner while loop, $L' \neq \emptyset$, and by the above reasoning, $L'' \neq \emptyset$. Hence, the choose $l$ operation (from $L''$) inside the loop is always possible, and then determinising in location $l$ decreases the cardinality of $L'$ by one. Since $L'$ is finite ($L' \subseteq L_{S_n}$) and its cardinality decreases, eventually $L' = \emptyset$ and the inner while loop terminates. $L' = \emptyset$ also means that at the end of the inner while loop, $S'_n$ is deterministic in all locations of $L_{S_n}$; hence, nondeterministic locations in $S'_n$ are new.

For termination of the outer while loop, we prove $look(S_{n+1}) < look(S_n)$. We know that after the inner loop, the nondeterministic locations in $S'_n$ are new (in $L_{S'_n} \setminus L_{S_n}$) and cannot have inherited nondeterminism, because they were generated by determinising locations in $L''$, whose direct successors are, by construction, deterministic. Finally, by Proposition 2, $look(S'_n) < look(S_n)$, and $S_{n+1}$ becomes $S'_n$ after $n$ is incremented, and the proof is done. $\Box$

The fact that bounded lookahead is necessary for termination is based on:

Proposition 3 ($look$ decreases by at most 1). $look(det_2(S,t_1,t_2)) \ge look(S) - 1$.

Then, a finite sequence of $det_2()$ operations cannot decrease lookahead from $\infty$ to $0$:

Theorem 2 (necessary condition). If $det(S)$ terminates then $look(S) < \infty$.

This concludes the study of the procedure's termination. The procedure also preserves traces:

Theorem 3. If $det(S)$ terminates then $Traces(det(S)) = Traces(S)$.

The determinisation procedure can be improved using approximate reachability analysis. Assume that an over-approximation $Reach^\alpha \supseteq Reach(S)$ of the reachable set of states is known (e.g., by abstract interpretation). Moreover, assume that this set is described using a formula in the same logic as the automaton's guards, which we have assumed to be decidable for satisfiability (cf. Section 2). Then, Definition 5 of a deterministic extended automaton can be weakened, by requiring that $Reach^\alpha \wedge G_{t_1} \wedge G_{t_2}$ be unsatisfiable (instead of $G_{t_1} \wedge G_{t_2}$ unsatisfiable). This new definition of determinism increases the subclass of extended automata on which the determinisation procedure terminates. The procedure now terminates for automata satisfying a modified definition of bounded lookahead, which, intuitively, requires only states in the set $Reach^\alpha$ (instead of $Q_{[\![S]\!]}$) to have bounded lookahead.

Checking for Bounded Lookahead

The bounded lookahead condition is clearly undecidable for extended automata. We now give a sufficient criterion for this condition. We need a notion of product of extended automata:

Definition 14 (Synchronous Product). For $j = 1,2$, the extended automata $S_j = (V_j, \Theta_j, L_j, l_j^0, \Sigma_j, T_j)$ are compatible if $V_1 \cap V_2 = \emptyset$ and $\Sigma_1 = \Sigma_2$. The synchronous product $S = S_1 \| S_2$ of two compatible automata $S_1, S_2$ is the automaton $(V, \Theta, L, l^0, \Sigma, T)$ with: $V = V_1 \cup V_2$, $\Theta = \Theta_1 \wedge \Theta_2$, $L = L_1 \times L_2$, $l^0 = (l_1^0, l_2^0)$, $\Sigma = \Sigma_1 = \Sigma_2$, and the set $T$ of transitions of the composed system is the smallest set defined by the rule:

$$\frac{t_1 : (l_1, a, G_1, A_1, l_1') \in T_1 \qquad t_2 : (l_2, a, G_2, A_2, l_2') \in T_2}{t : ((l_1, l_2), a, G_1 \wedge G_2, A_1 \cup A_2, (l_1', l_2')) \in T}$$


Then, the bounded lookahead condition for an extended automaton can be equivalently formulated as follows. Consider an extended automaton $S = (V, \Theta, L, l^0, \Sigma, T)$, and let the primed copy $S'$ of $S$ be the automaton obtained by "priming" all the components of $S$ except the alphabet $\Sigma$, i.e., $S' = (V', \Theta', L', l'^0, \Sigma, T')$, where $V' = \{v' \mid v \in V\}$, $L' = \{l' \mid l \in L\}$, and for states $q' = ((l,v))' = (l', v')$, where $v'$ is the same valuation as $v$, but for variables $V'$, i.e., $\forall x' \in V'$, $v'(x') = v(x)$.

Proposition 4 (checking for bounded lookahead). An extended automaton $S$ has bounded lookahead iff, for all $q, q_1, q_2 \in Q_{[\![S]\!]}$ and distinct transitions $t_1, t_2 \in T_S$ with $a_{t_1} = a_{t_2}$, if $q \rightarrow_S$
5 Applications of Determinisation

Verification

A standard verification problem is that of trace (or language) inclusion: given two systems $\mathcal{I}$ (the implementation) and $S$ (the specification), decide whether $Traces(\mathcal{I}) \subseteq Traces(S)$. When $\mathcal{I}, S$ are extended automata and $S$ is deterministic, the problem reduces to a reachability problem in the extended automaton $\mathcal{I} \| \overline{S}$, where $\overline{S}$ is obtained from $S$ by adding a new location $fail \notin L$, and for each $l \in L$ and $a \in \Sigma$, a new transition with origin $l$, destination $fail$, action $a$, identity assignments, and guard $\bigwedge_{t : (l, a, G_t, A_t, l') \in T} \neg G_t$. The new transitions allow actions in $\overline{S}$ whenever they are not allowed in $S$. Hence, when $S$ is deterministic, $Traces(\mathcal{I}) \subseteq Traces(S)$ iff no location in the set $\{(l, fail) \mid l \in L_{\mathcal{I}}\}$ is reachable in $\mathcal{I} \| \overline{S}$.

When $S$ is not deterministic, the above statement is incorrect. Let $S$ be the nondeterministic automaton in the left-hand side of Figure 3. A naive application of the completion operation on $S$ builds a transition labeled $b$ from $l_1$ to $fail$, suggesting that $a \cdot b$ is not a trace of $S$, which is obviously false. In particular, verification would wrongly declare erroneous an implementation that exhibits the trace $a \cdot b$. Hence, to be adequate for verification, $S$ has to be determinised before being completed.
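The pitfall described above can be reproduced on a small finite automaton without guards, where the guard of a `fail` transition is true exactly when the location has no transition for that action. The sketch below is an illustration under stated assumptions: the triple encoding of transitions, the function names, and the automaton `SPEC` (written to match the textual description of Figure 3, which is not reproduced here) are hypothetical, and the classical subset construction stands in for the symbolic procedure of this paper.

```python
FAIL = "fail"

def complete(locations, alphabet, transitions):
    """Naive completion without guards: add (l, a, fail) whenever
    location l has no outgoing a-transition."""
    enabled = {(o, a) for (o, a, d) in transitions}
    extra = {(l, a, FAIL) for l in locations for a in alphabet
             if (l, a) not in enabled}
    return set(transitions) | extra

def reachable_after(transitions, start, trace):
    """Set of locations reachable from `start` by the given trace."""
    current = {start}
    for action in trace:
        current = {d for (o, a, d) in transitions
                   if o in current and a == action}
    return current

def determinise(alphabet, transitions, start):
    """Classical subset construction for a finite automaton."""
    init = frozenset([start])
    seen, todo, result = {init}, [init], set()
    while todo:
        src = todo.pop()
        for a in alphabet:
            dst = frozenset(d for (o, b, d) in transitions
                            if o in src and b == a)
            if dst:
                result.add((src, a, dst))
                if dst not in seen:
                    seen.add(dst)
                    todo.append(dst)
    return seen, result, init

# Assumed specification: two a-transitions from l0, then e on one
# branch and b on the other.
ALPHABET = {"a", "b", "e"}
LOCATIONS = {"l0", "l1", "l2", "l3", "l4"}
SPEC = {("l0", "a", "l1"), ("l0", "a", "l2"),
        ("l1", "e", "l3"), ("l2", "b", "l4")}

naive = complete(LOCATIONS, ALPHABET, SPEC)
det_states, det_transitions, det_init = determinise(ALPHABET, SPEC, "l0")
sound = complete(det_states, ALPHABET, det_transitions)
```

Here `reachable_after(naive, "l0", ["a", "b"])` contains `fail` even though `a · b` is a trace of `SPEC`, whereas `reachable_after(sound, det_init, ["a", "b"])` does not: completing after determinisation avoids the false alarm.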


Conformance Testing

Conformance testing is a form of functional testing that compares a black-box implementation $\mathcal{I}$ to a formal specification $S$ according to a conformance relation. The implementation is a black box, i.e., only its interface (input and output alphabet) is known. In [5] we show that conformance of an implementation $\mathcal{I}$ to a specification according to the standard ioco relation [6] is equivalent to the fact that running a canonical tester in parallel with the implementation never reaches a certain set of locations. The tester can be automatically computed from the specification using operations similar to the completion operation defined above and, of course, determinisation. Without determinisation, the tester might wrongly declare non-conformant an implementation that is conformant to the specification (a phenomenon similar to that exhibited by the trace $a \cdot b$, noted in the previous paragraph).

Fault Diagnosis

The determinisation problem for extended automata also has a close relationship with diagnosis for discrete event systems [7]. For instance, an extended automaton with bounded lookahead can be seen as an automaton in which nondeterministic choices are diagnosable; and checking membership in the class of bounded lookahead automata can be reduced to a diagnosability problem in this model. Also, the sufficient criterion for bounded lookahead (around Proposition 4) was inspired by the algorithm used to check diagnosability [8], based on the search for specific cycles in a product of the specification with itself. Conversely, it could be profitable to re-define diagnosability in terms of our bounded lookahead condition, in order to capture a notion of diagnosability for richer, infinite-state models. Finally, the construction of a diagnoser from an automaton specifying a plant and a fault model is based on determinisation: one has to determinise the plant "decorated" with past occurrences of (unobservable) faults.
Our determinisation procedure then constitutes a basic block for the construction of diagnosers from plants specified as extended automata, thus extending the works on diagnosis to expressive, infinite-state models.

6 Conclusion, Related Work, and Future Work

In this paper we present a determinisation procedure for extended automata and prove that the procedure terminates exactly for the class of extended automata with bounded lookahead. The intuition behind this class is that in any location, for any trace, there exists a bounded number of steps after which the first transition taken is uniquely identified. Technical difficulties in proving termination arise from the fact that the order in which elementary determinisation steps are applied has a strong influence on termination. The main difficulty was to find an adequate order, for which the bounded lookahead provides a decreasing measure. The models of extended automata considered in this paper only have observable actions. One can also consider models with internal (unobservable) actions.


In this case, determinisation first consists of an extended $\varepsilon$-closure generalising that of finite automata. The extended $\varepsilon$-closure algorithm is then based on the propagation of guards and actions onto the next transitions labeled by observable actions [9], and terminates iff there are no cycles of transitions labeled by internal actions. The present work was initially motivated by conformance testing, more specifically, model-based testing based on the ioco theory [6]. In this framework, off-line test generation (computation of test cases from specifications) involves determinising the specification in order to compute the next possible observable actions after each trace, and, therefore, to obtain deterministic test cases [10]. In that work, we consider an extension of the model presented here (actions are either inputs or outputs and may carry communication parameters), which can be handled by a small modification of our determinisation procedure. The procedure also has potentially interesting applications in the verification and diagnosis of infinite-state systems.

An alternative approach, which is also used in conformance testing and in fault diagnosis, is on-the-fly determinisation of a bounded number of transitions of a (basic, symbolic, or timed) automaton, starting from the initial state [6, 11, 12]. In this case, the problems related to termination disappear, because the number of determinisation steps is finite and defined in advance by the bounded exploration depth. However, this approach cannot be used for constructing canonical testers, which we found to be a useful object, and cannot be used for proving trace inclusion.

References

1. John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2000.
2. Rajeev Alur and David L. Dill. A theory of timed automata. Theoretical Computer Science, 126(2):183-235, 1994.
3. T. Jeron, H. Marchand, and V. Rusu. Symbolic determinisation of extended automata. Technical Report 1176, IRISA, February 2006.
4. Greg Nelson and Derek C. Oppen. Simplification by cooperating decision procedures. ACM Trans. Program. Lang. Syst., 1(2):245-257, 1979.
5. Vlad Rusu, Herve Marchand, and Thierry Jeron. Automatic verification and conformance testing for validating safety properties of reactive systems. In Formal Methods 2005 (FM'05), volume 2805 of LNCS, pages 223-243, 2005.
6. Jan Tretmans. Test generation with inputs, outputs and repetitive quiescence. Software - Concepts and Tools, 17(3):103-120, 1996.
7. M. Sampath, R. Sengupta, S. Lafortune, K. Sinnamohideen, and D. Teneketzis. Failure diagnosis using discrete event models. Proceedings of the IEEE Transactions on Automatic Control, 4(2):105-124, 1996.
8. S. Jiang, Z. Huang, V. Chandra, and R. Kumar. A polynomial time algorithm for diagnosability of discrete event systems. IEEE Transactions on Automatic Control, 46(8):1318-1321, August 2001.
9. Elena Zinovieva. Methodes symboliques pour la generation de tests de systemes reactifs comportant des donnees. PhD thesis, Univ. of Rennes, Nov. 2004.
10. B. Jeannet, T. Jeron, V. Rusu, and E. Zinovieva. Symbolic test selection based on approximate analysis. In Tools and Algorithms for the Construction and Analysis of Systems (TACAS'05), volume 3440 of LNCS, 2005.
11. T. Jeron and P. Morel. Test generation derived from model-checking. In Computer-Aided Verification (CAV'99), volume 1633 of LNCS, pages 108-122, 1999.
12. Moez Krichen and Stavros Tripakis. Black-box conformance testing for real-time systems. In SPIN'04, volume 2989 of LNCS, pages 109-126, 2004.

Regular Hedge Model Checking

Julien d'Orso (1) and Tayssir Touili (2)

(1) University of Illinois at Chicago, dorso@liafa.jussieu.fr
(2) LIAFA, CNRS & Univ. of Paris 7, touili@liafa.jussieu.fr

Abstract. We extend the regular model checking framework so that it can handle systems with arbitrary width tree-like structures. Configurations of a system are represented by trees of arbitrary arities, sets of configurations are represented by regular hedge automata, and the dynamics of a system is modeled by a regular hedge transducer. We consider the problem of computing the transitive closure $T^+$ of a regular hedge transducer $T$. This construction is not possible in general. Therefore, we present a general acceleration technique for computing $T^+$. Our method consists of enhancing the termination of the iterative computation of the different compositions $T^i$ by merging the states of the hedge transducers according to an appropriate equivalence relation that preserves the traces of the transducers. We provide a methodology for effectively deriving equivalence relations that are appropriate. We have successfully applied our technique to compute transitive closures for some mutual exclusion protocols defined on arbitrary width tree topologies, as well as for an XML application.

1 Introduction

Regular Model Checking has been proposed as a general and uniform framework to analyse infinite-state systems [21, 28, 12, 7]. In this framework, configurations are represented by words or trees, sets of configurations by regular finite word/tree automata, and the transitions of the system by a regular relation described by a word/tree transducer. A central problem in regular model checking is to compute the transitive closure of a regular relation given by a finite-state transducer. Such a representation makes it possible to compute the set of reachable configurations of a system (thus enabling verification of safety properties) as well as to detect loops between configurations if the transformations are structure preserving (thus enabling verification of liveness properties) [12, 6]. However, computing the transitive closure of a transducer is not possible in general, since the transition relation of any Turing machine can be represented by a regular word transducer. In fact, the major problem in regular model checking is that a naive computation that consists of iteratively computing the different compositions $T^i$ of a transducer $T$ does not terminate in general. Therefore, a main issue in regular model checking is to define general acceleration techniques that will force the above iterative procedure to terminate for many practical applications.

Please use the following format when citing this chapter: d'Orso, J., Touili, T., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 213-230.


In recent years, several authors have addressed this issue. (1) First, in the case of regular word model checking, where configurations are encoded as words. These works have been successfully applied to reason about linear parametrized systems (i.e., parametrized systems where the processes are arranged in a linear topology) [12, 23, 19, 13, 25, 3, 4], as well as systems that operate on linear unbounded data structures such as lists, integers, reals, and even hybrid automata [5, 11, 6], and programs with pointers [9]. (2) Then, in the case of regular tree model checking, where configurations are represented by trees of arbitrary sizes (but fixed arities). These works have been applied to the analysis of parametrized systems with tree topologies [17, 19, 2, 1], and multithreaded programs [22, 17, 14, 8, 26].

In this paper, we develop the regular model checking paradigm further, and consider the more general case of regular hedge model checking, where configurations are represented by trees of arbitrary arities. Indeed, arbitrary width tree-like structures are very common and appear naturally in many modeling and verification contexts. We can mention at least three examples of such contexts:
- XML documents can be modeled by unranked trees whose nodes are labeled with the tags of the document [29, 24]. For example, a document having $n$ pages, where page $i$ has $k_i$ paragraphs, can be represented by a tree whose root has $n$ children, and where the $i$-th child has $k_i$ children. Since the number of pages and paragraphs in a document are arbitrary, unranked trees are necessary to represent such documents. Then, transformations on XML documents such as XSLT can be represented by relations on unranked trees.
- Configurations of multithreaded recursive programs can also be represented by unbounded width trees where the leaves are labeled with the control points of the program and the inner nodes with the sequential and the parallel operators $\cdot$ and $\|$. For example, a term $\|(t_1, \ldots, t_n)$ represents a configuration where the terms $t_1, \ldots, t_n$ are in parallel. Since the number of parallel processes can be arbitrarily large, we need unbounded width trees to accurately represent such configurations. Then, actions of the program such as procedure calls, launching of new threads, synchronisation statements, etc., can also be represented by relations on unranked trees [15, 16].
- Many parametrized protocols are defined on tree topologies with unbounded width. Indeed, in the case of tree networks, the number of processes and the topology of the network (including the arities of the different nodes) are not fixed. In this case, labeled trees of arbitrary width and height are needed to represent configurations of tree networks of arbitrary numbers of processes: each vertex in a tree corresponds to a process, and the label of a vertex is the current control state of its corresponding process. Typically, actions in such parametrized systems are communications between processes and their sons or fathers. These actions correspond in our framework to tree relabeling relations (transformations which preserve the structure of the trees). Examples of such systems are multicast protocols, leader election protocols, mutual exclusion protocols, etc.


We use hedge automata [18] to symbolically represent infinite sets of unranked trees, and hedge transducers to model transformations on these trees. Then, as in the case of regular word and tree model checking, the central problem is to compute the transitive closure of a hedge transducer $T$. Our aim is then to define general techniques which can deal with different classes of relations, and which can be applied uniformly in many verification and analysis contexts such as those mentioned above.

The main contribution of this work is the definition of a general acceleration technique on relabeling hedge transducers (transducers that preserve the structure of the trees). Our technique works as follows: to enhance the termination of the iterative computation of the different compositions $T^i$, we merge equivalent states using an appropriate equivalence relation, i.e., an equivalence relation that preserves the traces of the transducers (for which collapsing two states does not add new traces to the transducers). The main problem then amounts to defining and computing appropriate equivalences. We provide a methodology for deriving such equivalence relations. More precisely, we consider equivalence relations induced by two simulation relations, namely a downward and an upward simulation, both defined on hedge automata. We give sufficient conditions on the simulations that guarantee appropriateness of the induced equivalence. Furthermore, we define effectively computable downward and upward simulations for which the induced relation is guaranteed to be appropriate.

We have successfully applied our technique to compute transitive closures of some mutual exclusion protocols defined on arbitrary width tree topologies. We were also able to handle an XML application. This effort is reported in Section 6.

Related work. There are several works on efficient computation of transitive closures for word transducers [12, 19, 25, 5, 11, 6, 4] and tree transducers [17, 2, 1].
However, these works only consider trees where the arities are fixed, whereas our framework handles ranked as well as unranked trees. In fact, our technique can be seen as an extension of the approach used in [1] to hedge transducers. Note that arbitrary arities make this extension nontrivial. In particular, the transition rules of the collapsed hedge transducer under construction make use of regular languages over classes of tuples, these classes themselves being potentially regular languages. This nesting of languages is delicate to manipulate. More recently, hedge automata have been used to compute reachability sets of some classes of transformations, namely Process Rewrite Systems (PRS) [15] and Dynamic Pushdown Networks (DPN) [16]. Compared to our work, these algorithms compute the sets of the reachable states of the systems, whereas we consider the more general problem of computing the transitive closure of the system's transducer. Moreover, our technique is general and can be uniformly applied to all the classes of relabeling transformations, whereas the algorithms of [15, 16] can only be applied to the specific class of PRS or DPN.

Outline. In Section 2, we give the definitions of hedge automata and transducers, and show how the $i$-th iterations for a relabeling hedge transducer can be effectively computed. In Section 3, we describe our general semi-algorithm. In Section 4, we define relations $\simeq$ induced by downward and upward simulations, and give sufficient conditions ensuring that $\simeq$ is an appropriate equivalence relation. We provide in Section 5 an effectively computable example of such an equivalence. Finally, in Section 6, we show some examples on which we applied our technique.

2 Hedge automata and transducers

2.1 Terms

Let $\Sigma$ be an unranked alphabet and $\mathcal{X}$ be a fixed denumerable set of variables $\{x_1, x_2, \ldots\}$. The set $T_\Sigma[\mathcal{X}]$ of terms over $\Sigma \cup \mathcal{X}$ is the smallest set such that:
- $\Sigma \cup \mathcal{X} \subseteq T_\Sigma[\mathcal{X}]$,
- if $f \in \Sigma$ and $t_1, \ldots, t_n \in T_\Sigma[\mathcal{X}]$ for some $n \ge 1$, then $f(t_1, \ldots, t_n) \in T_\Sigma[\mathcal{X}]$.

Terms without variables are called ground terms. Let $T_\Sigma$ be the set of ground terms over $\Sigma$. A term $t$ in $T_\Sigma[\mathcal{X}]$ is linear if each variable occurs at most once in $t$. A context $C$ is a linear term of $T_\Sigma[\mathcal{X}]$. Let $t_1, \ldots, t_n$ be terms of $T_\Sigma$; then $C[t_1, \ldots, t_n]$ denotes the term obtained by replacing in the context $C$ the occurrence of the variable $x_i$ by the term $t_i$, for each $1 \le i \le n$. As usual, a term in $T_\Sigma[\mathcal{X}]$ can be viewed as a rooted labeled tree $u$ where the leaves are labeled with variables or elements in $\Sigma$, and every internal node $N$ with a symbol $\lambda(N) \in \Sigma$, where $\lambda$ is the labeling associated to $u$.

2.2 Hedge automata

To finitely represent infinite sets of terms, we use hedge automata [18]:

Definition 1. A Hedge automaton is a tuple $A = (Q, \Sigma, F, \delta)$ where $Q$ is a finite set of states, $\Sigma$ is an unranked alphabet, $F \subseteq Q$ is a set of final states, and $\delta$ is a set of rules of the form $f(L) \to q$, where $f \in \Sigma$, $q \in Q$, and $L \subseteq Q^*$ is a regular word language over $Q$. $A$ is deterministic if for every $f \in \Sigma$, if $\delta$ contains two rules $f(L_1) \to q_1$ and $f(L_2) \to q_2$, then $L_1 \cap L_2 = \emptyset$.

We define a move relation $\to_\delta$ between ground terms in $T_{\Sigma \cup Q}$ as follows: for every two terms $t$ and $t'$, we have $t \to_\delta t'$ iff there exist a context $C$ and a rule $r = f(L) \to q \in \delta$ such that $t = C[f(q_1(t_1), \ldots, q_n(t_n))]$, $q_1 \cdots q_n \in L$, and $t' = C[q(f(t_1, \ldots, t_n))]$.

Let $\xrightarrow{*}_\delta$ denote the reflexive-transitive closure of $\to_\delta$. A ground term $t \in T_\Sigma$ is accepted by a state $q$ if $t \xrightarrow{*}_\delta q(t)$. Let $L_q = \{t \mid t \xrightarrow{*}_\delta q(t)\}$.

The language of $A$, denoted by $L(A)$, is the set of all ground terms accepted by $A$. A set of terms $\mathcal{L}$ over $\Sigma$ is hedge regular if there exists a hedge automaton $A$ such that $\mathcal{L} = L(A)$. Intuitively, given an input term $t$, a run of $A$ on $t$ according to the move relation $\to_\delta$ can be done in a bottom-up manner as follows: first, we assign nondeterministically a state $q$ to each leaf labeled with symbol $f$ if there is in $\delta$ a rule of the form $f(L) \to q$ such that $\varepsilon \in L$. Then, for each node labeled with a symbol $g$, and having the terms $t_1, \ldots, t_n$ as children, we must collect the states $q_1, \ldots, q_n$ assigned to all its children, i.e., such that $t_i \xrightarrow{*}_\delta q_i(t_i)$, for $1 \le i \le n$, and then associate a state $q$ to the node itself if there exists in $\delta$ a rule $r = g(L) \to q$ such that $q_1 \cdots q_n \in L$. A term $t$ is accepted if $A$ reaches the root of $t$ in a final state.

Theorem 1. [18] The class of Hedge automata is effectively closed under determinization and under boolean operations. Moreover, the emptiness problem for Hedge automata is decidable.

2.3 Relabeling hedge transducers and relations

Definition 2. A Relabeling Hedge Transducer is a tuple $T = (Q, \Sigma, F, \Delta)$ where $Q$ is a finite set of states, $\Sigma$ is an unranked alphabet, $F \subseteq Q$ is a set of final states, and $\Delta$ is a set of rules of the form $f(L) \to q(g)$, where $f, g \in \Sigma$, $q \in Q$, and $L \subseteq Q^*$ is a regular word language over $Q$.

As for hedge automata, a relabeling hedge transducer defines a move relation $\to_\Delta$ between ground terms in $T_{\Sigma \cup Q}$ as follows: for every two terms $t$ and $t'$, we have $t \to_\Delta t'$ iff there exist a context $C$ and a rule $r = f(L) \to q(g) \in \Delta$ such that $t = C[f(q_1(t_1), \ldots, q_n(t_n))]$, $q_1 \cdots q_n \in L$, and $t' = C[q(g(t_1, \ldots, t_n))]$.

Let $\to^*_\Delta$ denote the reflexive-transitive closure of $\to_\Delta$. The transducer $T$ defines the following relation between unbounded width trees: $R_T = \{(t, t') \in T_\Sigma \times T_\Sigma \mid t \to^*_\Delta q(t'), \text{ for some } q \in F\}$. Note that $R_T$ is structure preserving, i.e., if $(t, t') \in R_T$, then $t$ and $t'$ correspond to two different labelings of the same skeleton tree.

Remark 1. Let $f$ and $g$ be two letters in $\Sigma$. We represent the pair $(f, g)$ by $f/g$. Let $t$ and $t'$ be two terms corresponding to different labelings $\lambda_1$ and $\lambda_2$ of the same underlying tree $u$. We define the term $t/t'$ as the labeling $\lambda_3$ of $u$ such that for every node $N$ of $u$, $\lambda_3(N) = \lambda_1(N)/\lambda_2(N)$. A relabeling hedge transducer $T = (Q, \Sigma, F, \Delta)$ can be seen as a hedge automaton $A = (Q, \Sigma \times \Sigma, F, \delta)$ over the alphabet $\Sigma \times \Sigma$, where $\delta$ is the set of rules $f/g(L) \to q$ s.t. $f(L) \to q(g) \in \Delta$. Then it is easy to see that $L(A) = \{t/t' \mid (t, t') \in R_T\}$.

A relation $R$ over $T_\Sigma$ is hedge regular if there exists a relabeling hedge transducer $T$ such that $R = R_T$. We denote by $R_T^n$ the composition of $R_T$, $n$ times. As usual, $R_T^+ = \bigcup_{n \ge 1} R_T^n$ denotes the transitive closure of $R_T$. Let $L \subseteq T_\Sigma$ be a hedge tree language. Then, we define the set $R_T(L) = \{t' \in T_\Sigma \mid \exists t \in L, (t, t') \in R_T\}$.

We show in what follows that hedge regular relations are closed under composition, and that they preserve regularity of hedge languages. First, we need to define the product of regular word languages as follows:

Definition 3. Let $L_1, \ldots, L_n$ be $n$ regular word languages over the alphabet $Q$. The product $L_1 \otimes \cdots \otimes L_n$ is defined by: $L_1 \otimes \cdots \otimes L_n = \{(q_1^1, \ldots, q_n^1) \cdots (q_1^m, \ldots, q_n^m) \mid q_i^1 \cdots q_i^m \in L_i,\ 1 \le i \le n\}$.

[...]
- $F = F_1 \times \cdots \times F_n$;
- $T = \{((p_1, \ldots, p_n), (q_1, \ldots, q_n), (p'_1, \ldots, p'_n)) \mid (p_i, q_i, p'_i) \in T_i\}$.

Let $A = (Q_1, \Sigma, F_1, \delta_1)$ be a hedge tree automaton and $T = (Q_2, \Sigma, F_2, \Delta_2)$ be a relabeling hedge transducer. Let $B = (Q, \Sigma, F, \delta)$ be the hedge tree automaton such that $Q = Q_1 \times Q_2$, $F = F_1 \times F_2$, and $\delta$ is the set of rules $g(L) \to (q_1, q_2)$ such that there exist two rules $f(L_1) \to q_1 \in \delta_1$ and $f(L_2) \to q_2(g) \in \Delta_2$ such that $L = L_1 \otimes L_2$. Then we have the following:

Lemma 1. $L(B) = R_T(L(A))$.
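To make the product of word languages concrete, the following sketch tests membership of a word of tuples in $L_1 \otimes \cdots \otimes L_n$ by checking each projection separately. The encoding is an assumption made for illustration only: states are single characters and each $L_i$ is given as a Python regular expression; the function name `in_product` is hypothetical.

```python
import re


def in_product(word, langs):
    """Check whether a word of n-tuples belongs to L_1 (x) ... (x) L_n.

    word  : list of n-tuples of one-character state names
    langs : list of n regular expressions, one per component language
    """
    n = len(langs)
    # Every letter of the product alphabet must be an n-tuple.
    if any(len(letter) != n for letter in word):
        return False
    # The i-th projection of the word must belong to L_i.
    return all(
        re.fullmatch(lang, "".join(letter[i] for letter in word)) is not None
        for i, lang in enumerate(langs)
    )
```

A construction by product automaton recognizes the same language symbolically; the membership test above is merely the pointwise reading of the definition.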

Let $T = (Q, \Sigma, F, \Delta)$, and let the relabeling hedge transducer $T_n = (Q_n, \Sigma, F_n, \Delta_n)$ be defined as follows: $Q_n = Q^n$, $F_n = F^n$, and $\Delta_n$ is the set of rules of the form $f(L) \to (q_1, \ldots, q_n)(g)$ such that there exist in $\Delta$ rules of the form $f_i(L_i) \to q_i(f_{i+1})$, $1 \le i \le n$, s.t. $f_1 = f$, $f_{n+1} = g$, and $L = L_1 \otimes \cdots \otimes L_n$. Then we can show that:

Lemma 2. $R_{T_n} = R_T^n$.

3 Computing transitive closures

Our goal in this work is to compute a relabeling hedge transducer that recognizes the transitive closure $R_T^+$ of a regular hedge relation $R_T$. Unfortunately, this is not possible in general since transitive closures are not necessarily hedge regular. Therefore, our purpose is to propose a semi-algorithm that, in case of termination, computes a relabeling hedge transducer that recognizes the transitive closure $R_T^+$. More precisely, starting from a relabeling hedge transducer $T$, we derive a transducer, called the history hedge transducer, that characterizes the transitive closure $R_T^+$. The set of states of the history transducer is infinite. To tackle this issue, we present a method (that is not guaranteed to terminate) for computing a finite-state transducer which is an abstraction of the history transducer, based on a notion of an equivalence relation on the states of the history transducer. The abstract transducer can be generated on-the-fly by a procedure which starts from the original transducer $T$, and then incrementally adds new states and transition rules, merging equivalent states. Let us first give the formal definition of the history hedge transducer:

Definition 4. The history hedge transducer of a relabeling hedge transducer $T = (Q, \Sigma, F, \Delta)$ is the (infinite) transducer given by the tuple $\mathcal{H} = (Q_H, \Sigma, F_H, \Delta_H)$ such that: $Q_H = \bigcup_{n \ge 1} Q_n$, $F_H = \bigcup_{n \ge 1} F_n$, and $\Delta_H = \bigcup_{n \ge 1} \Delta_n$.

Since $R_T^n = R_{T_n}$ (Lemma 2), and by definition $R_{\mathcal{H}} = \bigcup_{n \ge 1} R_{T_n}$, it follows that:

Theorem 2. $R_T^+ = R_{\mathcal{H}}$.

As mentioned previously, $\mathcal{H}$ cannot be computed in general since it has an infinite number of states. To sidestep this problem, we will compute an equivalent smaller transducer $\mathcal{H}_\sim$ (that might be finite), obtained by merging the states of $\mathcal{H}$ according to an equivalence $\sim$ on $Q_H$. This transducer is defined as $\mathcal{H}_\sim = (Q_\sim, \Sigma, F_\sim, \Delta_\sim)$ such that:
- $Q_\sim = \{q_\sim \mid q \in Q_H\}$, where $q_\sim$ denotes the equivalence class of the state $q$ w.r.t. $\sim$;
- $F_\sim = \{q_\sim \mid q \in F_H\}$ is the set of equivalence classes of $F_H$ w.r.t. $\sim$;
- $\Delta_\sim$ is the set of rules $f(L_\sim) \to s_\sim(g)$ such that $f(L) \to s(g)$ is a rule in $\Delta_H$, where $L_\sim$ is obtained from $L$ by substituting each state $q$ by its equivalence class $q_\sim$.

We compute $\mathcal{H}_\sim$ iteratively according to the following procedure:
1. We compute successive powers of $T$: $\mathcal{H}^{\le 1}, \mathcal{H}^{\le 2}, \mathcal{H}^{\le 3}, \ldots$ (where $\mathcal{H}^{\le i} = \bigcup_{j=1}^{i} T_j$) while collapsing states according to $\sim$. We obtain the sequence of transducers $\mathcal{H}_\sim^{\le 1}, \mathcal{H}_\sim^{\le 2}, \mathcal{H}_\sim^{\le 3}, \ldots$
2. If at step $i$ we obtain that $R$ [...]

4 Appropriate equivalences for hedge automata

Let $A = (Q, \Sigma, F, \delta)$ be a hedge automaton. We define in this section an appropriate equivalence $\sim$ on the set of states $Q$ such that $L(A_\sim) = L(A)$. To do so, we first define two simulation relations, namely a downward simulation $\preceq_{down}$ and an upward simulation $\preceq_{up}$ on $Q$, and then we show how to generate an appropriate equivalence $\sim$ from these simulations.

4.1 Downward and upward simulations

We introduce here the notion of downward and upward simulation for hedge automata:

Definition 5. [Downward Simulation] A binary relation $\preceq_{down}$ on $Q$ is a downward simulation iff for any symbol $f \in \Sigma$ and for all states $q, r \in Q$, we have: whenever $q \preceq_{down} r$ and $f(L) \to q \in \delta$, then for all states $q_1, \ldots, q_n \in Q$ s.t. $q_1 \cdots q_n \in L$, there exist states $r_1, \ldots, r_n \in Q$ and a rule $f(L') \to r$ in $\delta$ such that $q_1 \preceq_{down} r_1, \ldots, q_n \preceq_{down} r_n$, and $r_1 \cdots r_n \in L'$.

It is easy to see that if $q \preceq_{down} r$, then whenever a term $t$ is accepted by state $q$ (i.e., $t \to^*_\delta q(t)$), it is also accepted by state $r$.

Lemma 3. Let $\preceq_{down}$ be a downward simulation on $Q$. The reflexive closure and the transitive closure of $\preceq_{down}$ are both downward simulations. Furthermore, there is a unique maximal downward simulation on $Q$.

Definition 6. [Upward Simulation] Given a downward simulation $\preceq_{down}$ on $Q$, a binary relation $\preceq_{up}$ on $Q$ is an upward simulation w.r.t. $\preceq_{down}$ iff for any symbol $f \in \Sigma$ and for all states $q_i, r_i \in Q$, the following holds: whenever $q_i \preceq_{up} r_i$ and $f(L) \to q \in \delta$, then for all states $q_1, \ldots, q_n \in Q$ s.t. $q_1 \cdots q_i \cdots q_n \in L$, there exist states $r_1, \ldots, r_n, r \in Q$ and a rule $f(L') \to r$ in $\delta$ such that $q_j \preceq_{down} r_j$ for $j \ne i$, $r_1 \cdots r_n \in L'$, and $q \preceq_{up} r$.

It is easy to see that whenever $q \preceq_{up} r$, for every context $C$ and all terms $t_1, \ldots, t_n, t', t$ such that $t = C[t_1, \ldots, t_i, t', t_{i+1}, \ldots, t_n]$ and $C[t_1, \ldots, t_i, q(t'), t_{i+1}, \ldots, t_n] \to^*_\delta s(t)$ for a state $s$, there exists a state $s'$ with $s \preceq_{up} s'$ such that $C[t_1, \ldots, t_i, r(t'), t_{i+1}, \ldots, t_n] \to^*_\delta s'(t)$.

Lemma 4. Let $\preceq_{down}$ be a reflexive (transitive) downward simulation on $Q$, and let $\preceq_{up}$ be an upward simulation w.r.t. $\preceq_{down}$. The reflexive (transitive) closure of $\preceq_{up}$ is also an upward simulation w.r.t. $\preceq_{down}$. Furthermore, there is a unique maximal upward simulation on $Q$.


4.2 Induced equivalence

We define an equivalence relation derived from two binary relations:

Definition 7. Two binary relations $\preceq_1$ and $\preceq_2$ are said to be independent iff whenever $q \preceq_1 r$ and $q \preceq_2 r'$, there exists $s$ such that $r \preceq_2 s$ and $r' \preceq_1 s$. [...]

4.3 Defining an appropriate equivalence

Let $A = (Q, \Sigma, F, \delta)$ be a hedge automaton. Let $\preceq_{down}$ be a downward simulation, and let $\preceq_{up}$ be an upward simulation w.r.t. $\preceq_{down}$. Thanks to Lemmas 3 and 4, we suppose without loss of generality that $\preceq_{down}$ and $\preceq_{up}$ are reflexive and transitive. Let $\preceq$ [...]
5 An instance of an appropriate equivalence

Let us now come back to our relabeling hedge transducer $T = (Q, \Sigma, F, \Delta)$ and its corresponding history transducer $\mathcal{H} = (Q_H, \Sigma, F_H, \Delta_H)$. We suppose that $T$ is deterministic (this is not a restriction thanks to Theorem 1 and Remark 1). Recall that our purpose is to effectively compute an appropriate equivalence relation $\sim$ on $Q_H$ such that $L(\mathcal{H}_\sim) = L(\mathcal{H})$. We give in this section an example of a computable equivalence $\sim$ on $Q_H$ induced by a downward simulation $\preceq_{down}$, an upward simulation w.r.t. $\preceq_{down}$, and a relation $\preceq$ satisfying the conditions required in the previous section. First, we need to introduce the notion of copying states:

Definition 8 (Copying States). Let $q \in Q$ be a state:
- $q$ is a prefix copying state iff for every term $t$: $t \to^*_\Delta q(t')$ iff $t = t'$;
- $q$ is a suffix copying state iff for every term $t$, context $C$, and $q_F \in F$: $C[q(t)] \to^*_\Delta q_F(C'[t])$ iff $C = C'$.

Let $S$ be a set in $Q_H \times Q_H$. We define the relation $\preceq_S$ generated by $S$ as the smallest reflexive-transitive relation that contains $S$ and that is a congruence with respect to product, i.e., if $((q_1, \ldots, q_n), (q'_1, \ldots, q'_m)) \in\ \preceq_S$, then for any $s_1, \ldots, s_k, s'_1, \ldots, s'_l$ in $Q_H$, $((s_1, \ldots, s_k, q_1, \ldots, q_n, s'_1, \ldots, s'_l), (s_1, \ldots, s_k, q'_1, \ldots, q'_m, s'_1, \ldots, s'_l)) \in\ \preceq_S$.

Lemma 6. If the set $S$ is a downward (resp. upward) simulation on $Q_H$, then its generated relation $\preceq_S$ is also a downward (resp. upward) simulation.

Let $Q_{pref}$ be the set of prefix copying states of $T$, and $Q_{suff}$ be the set of suffix copying states of $T$ that are not in $Q_{pref}$. Let $\preceq_{down}$ be the binary relation on $Q_H \times Q_H$ generated by the set $\{((q, q), q), (q, (q, q)) \mid q \in Q_{pref}\}$. We show that $\preceq_{down}$ is a downward simulation:

Lemma 7. $\preceq_{down}$ is a downward simulation.

Let $\preceq_{up}$ be the binary relation on $Q_H \times Q_H$ generated by the set $\{((q, q), q), (q, (q, q)) \mid q \in Q_{suff}\}$. Then we have:

Lemma 8. $\preceq_{up}$ is an upward simulation w.r.t. $\preceq_{down}$.

Let $\preceq\ =\ \preceq_{up}$. Then, we can show that $\preceq$ and $\preceq_{down}$ are independent:

Lemma 9. $\preceq$ and $\preceq_{down}$ are independent.

Let then $\sim$ be the relation induced by $\preceq_{down}$ and $\preceq$. We show that the conditions of Theorem 3 are satisfied:

Lemma 10. Whenever $x \in F_H$ and $x \preceq_{up} y$, then $y \in F_H$. Moreover, if $X \in F_\sim$ and $x \in X$, then $x \in F_H$.

It follows then from Theorem 3 that:

Theorem 4. $L(\mathcal{H}_\sim) = L(\mathcal{H})$.

Remark 2. Note that both $\preceq_{down}$ and $\preceq_{up}$ are included in $\sim$ (this is due to the fact that these relations are reflexive and symmetric).

Now, it remains to show how the equivalence $\sim$ can be effectively computed. For this, we need to compute the sets of copying states $Q_{pref}$ and $Q_{suff}$. This is described next.
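As a rough illustration of how the generating pairs $((q,q),q)$ and $(q,(q,q))$ act on history states, the sketch below collapses adjacent repetitions of copying states inside a tuple. It is an assumption-laden simplification: it only captures the identifications that follow from these generating pairs and the congruence property, so it may merge fewer states than the full equivalence $\sim$; the function name `normalize` and the string encoding of states are hypothetical.

```python
def normalize(state, copying):
    """Return a canonical representative of a history state.

    state   : tuple of state names of the original transducer, e.g. ("q1", "q0", "q0")
    copying : set of copying states (prefix- or suffix-copying), assumed already computed
    """
    collapsed = []
    for q in state:
        # (q, q) is identified with q when q is a copying state.
        if collapsed and collapsed[-1] == q and q in copying:
            continue
        collapsed.append(q)
    return tuple(collapsed)
```

With copying states $\{q_0, q_2\}$, for instance, the tuple $(q_1, q_0, q_0)$ is represented by $(q_1, q_0)$, whereas $(q_1, q_1)$ is left unchanged because $q_1$ is not a copying state.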


Input: Hedge transducer $T = (Q, \Sigma, F, \Delta)$, and a state $q$.
Begin
    $d := \{q\}$
    Repeat
        for each $q_1 \in d$, and for each rule $r = f(L) \to q_1(g)$,
            add $\{q_2 \mid L \cap (Q^* q_2 Q^*) \ne \emptyset\}$ to $d$.
    Until no more additions can be made
End
Output: "Yes" if all rules $r$ encountered were copying (i.e. such that $f = g$). "No" otherwise.

Fig. 1. Determining whether a state is prefix copying.

5.1 Computing copying states

The algorithm for checking whether a state $q$ is prefix copying is shown in Figure 1. Intuitively, the algorithm works as follows: it tries to explore all rules $r$ useful for computing the language of $T$ with $q$ as the only accepting state. If all such rules $r$ are of the form $f(L_1) \to q_1(f)$, then $q$ is indeed prefix-copying.

Input: Hedge transducer $T = (Q, \Sigma, F, \Delta)$, and a state $q$.
Begin
    $up := \{q\}$, $side := \emptyset$
    Repeat
        for each $q_1 \in up$, and for each rule $r = f(L) \to q_2(g)$ such that $L \cap (Q^* q_1 Q^*) \ne \emptyset$,
            add $q_2$ to $up$, and add $\{q' \mid L \cap (Q^* q' Q^*) \ne \emptyset \wedge q' \ne q\}$ to $side$.
    Until no more additions can be made
End
Output: "Yes" if all rules $r$ encountered were copying (i.e. such that $f = g$) and all states in $side$ are prefix-copying and there is a final state in $up$. "No" otherwise.

Fig. 2. Determining whether a state is suffix copying.

The algorithm for checking whether a state $q$ is suffix copying is shown in Figure 2. Intuitively, the algorithm explores all rules $r$ leading from state $q$ to a final state according to the move relation for $T$. We must first check that all rules $r$ encountered are copying rules. However, this test only checks what lies on the path from $q$ up to the root of an accepted context. Therefore, we also need to check what happens to the child nodes along this root path. This is the purpose of the variable $side$. Any subtree attached to a child of the root path is accepted by some state in $side$. Hence, we require that all states in $side$ are prefix copying.
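The two checks of Figures 1 and 2 can be sketched in Python as follows. This is one possible reading of the figures, under explicit assumptions: each rule $f(L) \to q(g)$ is stored as a tuple `(f, occ, q, g)` where `occ` is the set of states occurring in at least one word of $L$ (which is all the two algorithms need to know about $L$), and in the suffix check the siblings added to `side` are the states of `occ` other than the state currently explored. The example data is the simple token protocol transducer of Section 6.1.

```python
def is_prefix_copying(rules, q):
    """Figure 1: explore all rules reachable downward from q; all must be copying."""
    d, todo, copying = {q}, [q], True
    while todo:
        q1 = todo.pop()
        for f, occ, target, g in rules:
            if target != q1:
                continue
            copying = copying and f == g
            for q2 in occ - d:          # states that may label children
                d.add(q2)
                todo.append(q2)
    return copying


def is_suffix_copying(rules, final, q):
    """Figure 2: explore rules upward from q, collecting sibling states in `side`."""
    up, side, todo, copying = {q}, set(), [q], True
    while todo:
        q1 = todo.pop()
        for f, occ, target, g in rules:
            if q1 not in occ:
                continue
            copying = copying and f == g
            side |= occ - {q1}          # assumption: siblings are the other states of L
            if target not in up:
                up.add(target)
                todo.append(target)
    return (copying and bool(up & final)
            and all(is_prefix_copying(rules, s) for s in side))


# Simple token protocol: rules (1)-(4) of Section 6.1, with F = {q2}.
RULES = [
    ("n", {"q0"}, "q0", "n"),
    ("t", {"q0"}, "q1", "n"),
    ("n", {"q0", "q1"}, "q2", "t"),
    ("n", {"q0", "q2"}, "q2", "n"),
]
STATES, FINAL = {"q0", "q1", "q2"}, {"q2"}

q_pref = {q for q in STATES if is_prefix_copying(RULES, q)}
q_suff = {q for q in STATES if is_suffix_copying(RULES, FINAL, q)} - q_pref
```

On this input the sketch yields `q_pref == {"q0"}` and `q_suff == {"q2"}`.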

6 Applications

In this section, we give the results of applying the procedure of Section 3 to the analysis of two mutual exclusion protocols defined on arbitrary width tree-like networks, and of an XML application.

6.1 The unranked simple token protocol

We consider the example of the unranked simple token protocol, which is a mutual exclusion protocol defined on arbitrary width tree-like networks. Each process stores a single bit which reflects whether the process has a token or not. The process that has the token has the right to enter the critical section. In this system, the token can move from a leaf upward to the root in the following fashion: any process that currently has the token can release it to its parent. Initially, the system contains exactly one token, located anywhere. More formally, the passing of the token up the tree can be represented by the following relabeling hedge transducer $T = (Q, \Sigma, F, \Delta)$ where $Q = \{q_0, q_1, q_2\}$, $\Sigma = \{n, t\}$, $F = \{q_2\}$, and $\Delta$ contains the rules:

$n(q_0^*) \to q_0(n)$ (1)
$t(q_0^*) \to q_1(n)$ (2)
$n(q_0^* q_1 q_0^*) \to q_2(t)$ (3)
$n(q_0^* q_2 q_0^*) \to q_2(n)$ (4)
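To see rules (1)-(4) at work, the following sketch computes the images of a concrete tree under one application of the transducer. It is an illustration only: states $q_0, q_1, q_2$ are encoded as the characters `"0"`, `"1"`, `"2"` so that horizontal languages can be written as Python regular expressions, trees are nested tuples `(label, children)`, and the helper names `runs` and `step` are hypothetical.

```python
import re
from itertools import product

# Rules f(L) -> q(g) stored as (f, regex for L, q, g).
RULES = [
    ("n", "0*", "0", "n"),      # (1)
    ("t", "0*", "1", "n"),      # (2)
    ("n", "0*10*", "2", "t"),   # (3)
    ("n", "0*20*", "2", "n"),   # (4)
]
FINAL = {"2"}


def runs(tree):
    """Return all pairs (q, tree') such that tree ->* q(tree')."""
    label, children = tree
    child_runs = [runs(child) for child in children]
    results = set()
    for combo in product(*child_runs):
        word = "".join(state for state, _ in combo)
        outputs = tuple(out for _, out in combo)
        for f, lang, q, g in RULES:
            if f == label and re.fullmatch(lang, word):
                results.add((q, (g, outputs)))
    return results


def step(tree):
    """Images of `tree` under the relation defined by the transducer."""
    return {out for q, out in runs(tree) if q in FINAL}
```

For the tree with root $n$ and two leaves $t$ and $n$, `step` returns the single tree with root $t$ and two leaves $n$: the token has moved from the left child to its parent. For a tree whose root already holds the token, `step` returns the empty set.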

The intuition behind the states of the transducer is the following.
- State $q_0$ is meant to accept all "pairs" of identical trees where the token doesn't appear. This is a prefix-copying state.
- State $q_1$ is an intermediate state meaning that the current node released the token. Its parent then acquires the token.
- State $q_2$ is the final state of the transducer. It accepts all "pairs" of trees in which the token has moved one step upward. This is a suffix-copying state.

According to the algorithm of Figure 1, we get $Q_{pref} = \{q_0\}$, and with the algorithm of Figure 2, we get $Q_{suff} = \{q_2\}$.

Let us now apply the algorithm described in Section 3. We will compute the different iterations $\mathcal{H}_\sim^{\le 1}, \ldots, \mathcal{H}_\sim^{\le i}$. We terminate at step $i$ if $R$ [...]

$t(q_{0\sim}^*) \to q_{1\sim}(n)$ (2)
$n(q_{0\sim}^* q_{1\sim} q_{0\sim}^*) \to q_{2\sim}(t)$ (3)
$n(q_{0\sim}^* q_{2\sim} q_{0\sim}^*) \to q_{2\sim}(n)$ (4)

Computing $\mathcal{H}_\sim^{\le 2}$: Take $\mathcal{H}_\sim^{\le 1}$ and add rules

$t(q_{0\sim}^*) \to (q_1, q_0)_\sim(n)$ (5) = (2)$\otimes$(1)
$n(q_{0\sim}^* (q_1, q_0)_\sim q_{0\sim}^*) \to (q_2, q_1)_\sim(n)$ (6) = (3)$\otimes$(2)
$n(q_{0\sim}^* (q_2, q_1)_\sim q_{0\sim}^*) \to q_{2\sim}(t)$ (7) = (4)$\otimes$(3)

For example, rule (5) is obtained by composing rules (2) and (1). The resulting product is the rule $t((q_0, q_0)^*) \to (q_1, q_0)(n)$ (denoted (2)$\otimes$(1) above). Since $(q_0, q_0) \preceq_{down} q_0$ ($q_0 \in Q_{pref}$) and $\preceq_{down}\ \subseteq\ \sim$ (Remark 2), we get that $(q_0, q_0) \sim q_0$. Therefore, merging w.r.t. $\sim$, we get rule (5). Note that rule (7) has been simplified. Indeed, performing the product of the rules (4) and (3) yields the rule $n(L) \to q_{2\sim}(t)$, where $L$ is the following regular word language: $(q_0, q_0)^* (q_2, q_0) (q_0, q_0)^* (q_0, q_1) (q_0, q_0)^* + (q_0, q_0)^* (q_0, q_1) (q_0, q_0)^* (q_2, q_0) (q_0, q_0)^* + (q_0, q_0)^* (q_2, q_1) (q_0, q_0)^*$. For the sake of brevity, we omit the first part of $L$ since the states $(q_2, q_0)$ and $(q_0, q_1)$ are not reachable.

Computing $\mathcal{H}_\sim^{\le 3}$: Take $\mathcal{H}_\sim^{\le 2}$ and add the following rules obtained as described previously:

$n(q_{0\sim}^* (q_1, q_0)_\sim q_{0\sim}^*) \to (q_2, q_1, q_0)_\sim(n)$ (8) = (3)$\otimes$(5) = (6)$\otimes$(1)

[...]

$n(q_0^*) \to q_0(n)$ (1)
$n(q_0^*) \to q_1(t)$ (2)
$t(q_0^*) \to q_2(n)$ (3)
$t(q_0^* q_1 q_0^*) \to q_3(n)$ (4)
$n(q_0^* q_2 q_0^*) \to q_3(t)$ (5)
$n(q_0^* q_3 q_0^*) \to q_3(n)$ (6)

The intuition behind the states of the transducer is as follows:
- State $q_0$ accepts all "pairs" of identical trees where the token never appears. This state is prefix-copying.
- State $q_1$ is the intermediate state denoting that the current node just acquired the token. Its parent neighbor releases the token.
- State $q_2$ is also an intermediate state. It means that the current node releases the token. The parent node acquires the token.
- State $q_3$ is the final state. It accepts all "pairs" of trees in which the token has moved one step upward or downward. This state is suffix-copying.

Computing $\mathcal{H}_\sim^{\le 1}$: Take $T$ and replace occurrences of a state in a rule of $\Delta$ with its equivalence class w.r.t. $\sim$.

$n(q_{0\sim}^*) \to q_{0\sim}(n)$ (1)
$n(q_{0\sim}^*) \to q_{1\sim}(t)$ (2)
$t(q_{0\sim}^*) \to q_{2\sim}(n)$ (3)
$t(q_{0\sim}^* q_{1\sim} q_{0\sim}^*) \to q_{3\sim}(n)$ (4)
$n(q_{0\sim}^* q_{2\sim} q_{0\sim}^*) \to q_{3\sim}(t)$ (5)
$n(q_{0\sim}^* q_{3\sim} q_{0\sim}^*) \to q_{3\sim}(n)$ (6)


Computing W^^. ^ake H^^ and add rules n{qU-^iqo,qi)^it) (7) n{qo^{qo,qi)r.qoJ -> {quq3)r~.{n) (8) t{q*o^{qi,q3)r.q*oJ - - 93~(n) (9) i(9o~)^fe,go)^(n)(10) n{qor.iq2,qo)r.qo^) ^ (g3,Q2)~(n) (11) n{q^^{q3,q2)r.qU ^ g3~(0 (12) t{qU^{'l2,qi)^{t) (13) ri(9o~(g2,gi)~9S-) ^ 93~(") (14) "(9o~(92,go)~go~(9o,9i)~gS~) -^ 93~(n) (14) ?^(9o~(9o,9i)~9o~(92,go)~9o^) ^ g3~(n) (14) n{qU^{qi,q2)^{n){l5) tiq^Uqi,q2)^qU -^ <73^(t) (16)

= ( 1 ®(2) = (2 0(4) = (4 ®(6) = (3: = (5 ®(3) = (6 0(5) = (3 0(2) = (5 0(4) = (5 0(4) = (5 = (2 0(3) = (4 0(5)

Computing Ti^^: Take H^^ and add rules (9o,9i,g3)~(n) n{qo^iqo,qi)^qoJ n{qo^{qo,qi,q3)r.qor.) (gi,93)~(n) (93,g2,9o)~(n) »^(9o~(92,9o)~go~) (?3,g2)~(n) "•(9S~fe>92,go)~9o-) 93-(n) n{qoM2,9o)~go~ (^o, 9i, g3)~Q5-) 93~(n) '^(95-(9o,gi,g3)~go~(Q'2,go)~9S~) 93-(n) '^(9o~(93,92, go)~g5~(9o, 9i)~9o~)

"(9o-(9o,9i)-go-(93,92,go)~gS-) -^ g3~(n)

(17 (18 (19 (20 (21 (21 (22

= = = = = = = (22 =

Computing Ti^^: Take ?i^^ and add rules n{qo^{qo,qi,q3)r^qo^) -» (go,gi,g3)-(n) ?^(Qo-(?3,g2,go)-go-) -^ (93,g2,?o)~(n) «(Q'S~(?3,?2,go)~gS-(9o,gi,g3)~g5-) ^ g3~(«) "(gS~(9o,«i,g3)~gS~(«3,g2,go)-gS-) ^ g3~(n)

23) 24) 25) 26)

(1)®(8) (2)0(9) (11)0(1) (12)0(3) (14)0(6) (14)0(6) (6)0(14) (6)0(14) = = = =

(1)0(18) (6)0(19) (6)0(21) (22)0(6)

The procedure terminates at step 4, since subsequent iterations have the same language. Note that some rules have been omitted if they contain unreachable states. Some redundant rules have been omitted as well, for the sake of simplicity.

6.3 An XML application

Figure 3 represents an XML document that stores information about the clients of a store and the items they bought. Each client has four fields: name, address, the different items that were bought, and the status of the order, i.e., whether the order is treated or not. The status is 1 if the order is being treated, 0 if it has not been treated yet, and 2 if its treatment is finished. Initially, the first client has status 1, and the others 0. This document can be represented by the tree of Figure 4. Note that we need arbitrary-width trees here since the number of clients and the number of bought items are arbitrary.


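<clients>
  <client>
    <name> Philipp </name>
    <address> ... </address>
    <status> 1 </status>
    <items> <item> bed </item> <item> chair </item> <item> fridge </item> </items>
  </client>
  <client>
    <name> Maria </name>
    <address> ... </address>
    <status> 0 </status>
    <items> <item> TV </item> <item> radio </item> <item> closet </item> </items>
  </client>
  ...
</clients>

Fig. 3. Part of a document containing information about the clients of a store

[Figure: tree with root "clients"; each client subtree has children name, address, status, and items, with the bought items as leaves.]

Fig. 4. The previous XML document as a tree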

The store has software that treats the clients in the order they appear in the XML document. The effect of one action of the software consists in changing the status of the current client (resp. the next one) to 2 (resp. to 1) to express that the treatment of the current client is over, and that we have now moved to the treatment of the next client. This transformation can be represented by the following relabeling hedge transducer $T = (Q, \Sigma, F, \Delta)$, where $Q = \{q, q_1, q'_1, q''_1, q_2, q'_2, q''_2, q_f\}$; $F = \{q_f\}$; $\Sigma = \Sigma' \cup \{name, address, item, items, status, client, clients, 1, 0, 2\}$, where $\Sigma'$ is a finite alphabet that corresponds to the names, addresses, etc., and that is not relevant for us in this application; and $\Delta$ contains the following rules:

- For every $f \in \Sigma$, $f(q^*) \to q(f)$;
- $1(\varepsilon) \to q_2(2)$: 1 is changed to 2;
- $status(q_2) \to q'_2(status)$;
- $client(q^* q'_2 q^*) \to q''_2(client)$;
- $0(\varepsilon) \to q_1(1)$: 0 is changed to 1;
- $status(q_1) \to q'_1(status)$;
- $client(q^* q'_1 q^*) \to q''_1(client)$;
- $clients(q^* q''_2 q''_1 q^*) \to q_f(clients)$: we make sure that the client whose "1" has been changed into "2" is adjacent in the document (and therefore in the tree) to the client whose "0" has been changed into "1".

In order to check the behavior of this software, we need to compute the transitive closure $T^+$. Our technique terminates in this example and computes $T^+$. We skip the details here since they are similar to the previous examples.
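For readers who prefer an operational view, one action of the software can also be written directly against the XML document. The sketch below is not the transducer itself but an imperative counterpart of the relabeling it encodes, written under assumptions: the document has the shape of Figure 3 (a `clients` root whose `client` children each contain a `status` element), the function name `treat_next_client` is hypothetical, and the third client in the sample data is invented for illustration.

```python
import xml.etree.ElementTree as ET


def treat_next_client(clients):
    """Change the status pair (1, 0) of two adjacent clients into (2, 1).

    Returns True if an action was performed, False if no adjacent pair
    of clients has statuses 1 and 0.
    """
    items = clients.findall("client")
    for current, following in zip(items, items[1:]):
        cur, nxt = current.find("status"), following.find("status")
        if cur.text.strip() == "1" and nxt.text.strip() == "0":
            cur.text, nxt.text = "2", "1"
            return True
    return False


# Sample document; the client named "Third" is a made-up placeholder.
DOC = """<clients>
  <client><name>Philipp</name><status> 1 </status></client>
  <client><name>Maria</name><status> 0 </status></client>
  <client><name>Third</name><status> 0 </status></client>
</clients>"""

root = ET.fromstring(DOC)
treat_next_client(root)
statuses_after_one_step = [c.find("status").text.strip() for c in root.findall("client")]
```

Iterating this function until it returns `False` enumerates the documents reachable from the initial one, which is the set that the transitive closure describes symbolically.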

7 Conclusion

In this paper, we have extended the regular model checking framework so that it can handle systems with arbitrary width tree-like structures. Since the central problem in regular model checking is the computation of transitive closures of transducers, the main contribution of this paper is a general acceleration technique that computes the transitive closure of a given hedge transducer. The technique is based on defining and effectively computing an equivalence relation used to collapse the states of the transitive closure of the hedge transducer. We have successfully applied our technique to compute transitive closures for (1) some mutual exclusion protocols defined on arbitrary width tree topologies; and (2) XML document transformations.

As future work, it would be interesting to see if one can extend our technique to handle non-structure-preserving transducers. It would also be of interest to see if we can combine our simulation-based technique with other regular model checking techniques such as abstraction [11, 10] or learning [27, 20].

References

1. P. A. Abdulla, A. Legay, J. d'Orso, and A. Rezine. Simulation-based iteration of tree transducers. In Proceedings of TACAS'05, 2005.
2. Parosh Aziz Abdulla, Bengt Jonsson, Pritha Mahata, and Julien d'Orso. Regular tree model checking. In Proc. 14th Int. Conf. on Computer Aided Verification, volume 2404 of Lecture Notes in Computer Science, pages 555-568, 2002.
3. Parosh Aziz Abdulla, Bengt Jonsson, Marcus Nilsson, and Julien d'Orso. Regular model checking made simple and efficient. In Proc. CONCUR 2002, 13th Int. Conf. on Concurrency Theory, volume 2421 of Lecture Notes in Computer Science, pages 116-130, 2002.
4. Parosh Aziz Abdulla, Bengt Jonsson, Marcus Nilsson, and Julien d'Orso. Algorithmic improvements in regular model checking. In Proc. 15th Int. Conf. on Computer Aided Verification, volume 2725 of Lecture Notes in Computer Science, pages 236-248, 2003.
5. Bernard Boigelot, Axel Legay, and Pierre Wolper. Iterating transducers in the large. In Proc. 15th Int. Conf. on Computer Aided Verification, volume 2725 of Lecture Notes in Computer Science, pages 223-235, 2003.
6. Bernard Boigelot, Axel Legay, and Pierre Wolper. Omega regular model checking. In Proc. TACAS'04, 10th Int. Conf. on Tools and Algorithms for the Construction and Analysis of Systems, Lecture Notes in Computer Science, pages 561-575, 2004.
7. A. Bouajjani. Languages, rewriting systems, and verification of infinite-state systems. In ICALP'01, LNCS 2076, 2001. Invited paper.
8. A. Bouajjani, J. Esparza, and T. Touili. Reachability analysis of synchronised PA systems. In INFINITY'04, ENTCS, 2004.
9. A. Bouajjani, P. Habermehl, P. Moro, and T. Vojnar. Verifying programs with dynamic 1-selector-linked structures in regular model checking. In Proceedings of TACAS'05, 2005.
10. A. Bouajjani, P. Habermehl, P. Moro, and T. Vojnar. Verifying programs with dynamic 1-selector-linked structures in regular model checking. In TACAS'05, Lecture Notes in Computer Science, pages 13-29. Springer, 2005.
11. A. Bouajjani, P. Habermehl, and T. Vojnar. Abstract regular model checking. In CAV'04, Lecture Notes in Computer Science, pages 372-386, Boston, July 2004. Springer-Verlag.
12. A. Bouajjani, B. Jonsson, M. Nilsson, and T. Touili. Regular model checking. In Emerson and Sistla, editors, Proc. 12th Int. Conf. on Computer Aided Verification, volume 1855 of Lecture Notes in Computer Science, pages 403-418. Springer-Verlag, 2000.
13. A. Bouajjani, A. Muscholl, and T. Touili. Permutation rewriting and algorithmic verification. In Proc. LICS'01, 17th IEEE Int. Symp. on Logic in Computer Science. IEEE, 2001.
14. A. Bouajjani and T. Touili. Reachability analysis of process rewrite systems. In FSTTCS'03, Lecture Notes in Computer Science, pages 73-87, 2003.
15. A. Bouajjani and T. Touili. On computing reachability sets of process rewrite systems. In Proc. 16th Int. Conf. on Rewriting Techniques and Applications (RTA'05), volume 3467 of Lecture Notes in Computer Science, April 2005.
16. Ahmed Bouajjani, Markus Müller-Olm, and Tayssir Touili. Regular symbolic analysis of dynamic networks of pushdown systems. In CONCUR'05, LNCS, 2005.
17. Ahmed Bouajjani and Tayssir Touili. Extrapolating tree transformations. In Proc. 14th Int. Conf. on Computer Aided Verification, volume 2404 of Lecture Notes in Computer Science, pages 539-554, 2002.
18. A. Brüggemann-Klein, M. Murata, and D. Wood. Regular tree and regular hedge languages over unranked alphabets. Research report, 2001.
19. D. Dams, Y. Lakhnech, and M. Steffen. Iterating transducers. In G. Berry, H. Comon, and A. Finkel, editors, Computer Aided Verification, volume 2102 of Lecture Notes in Computer Science, pages 286-297, 2001.
20. P. Habermehl and T. Vojnar. Regular model checking using inference of regular languages. In Proc. of 6th International Workshop on Verification of Infinite-State Systems (Infinity'04), pages 61-72, Sept. 2004.
21. Y. Kesten, O. Maler, M. Marcus, A. Pnueli, and E. Shahar. Symbolic model checking with rich assertional languages. Theoretical Computer Science, 256:93-112, 2001.
22. D. Lugiez and Ph. Schnoebelen. The regular viewpoint on PA-processes. In Proc. 9th Int. Conf. Concurrency Theory (CONCUR'98), Nice, France, Sep. 1998, volume 1466, pages 50-66. Springer, 1998.
23. A. Pnueli and E. Shahar. Liveness and acceleration in parametrized verification. In CAV'00, LNCS, 2000.
24. H. Seidl, Th. Schwentick, and A. Muscholl. Numerical document queries. In PODS'03. ACM Press, 2003.
25. T. Touili. Regular model checking using widening techniques. Electronic Notes in Theoretical Computer Science, 50(4), 2001. Proc. Workshop on Verification of Parametrized Systems (VEPAS'01), Crete, July 2001.
26. T. Touili. Dealing with communication for dynamic multithreaded recursive programs. In 1st VISSAS workshop, 2005. Invited paper.
27. A. Vardhan, K. Sen, M. Viswanathan, and G. Agha. Actively learning to verify safety for FIFO automata. In FSTTCS'04, Lecture Notes in Computer Science, pages 494-505, 2004.
28. Pierre Wolper and Bernard Boigelot. Verifying systems with infinite but regular state spaces. In Proc. 10th Int. Conf. on Computer Aided Verification, volume 1427 of Lecture Notes in Computer Science, pages 88-97, Vancouver, July 1998. Springer-Verlag.
29. Silvano Dal Zilio and Denis Lugiez. XML schema, tree logic and sheaves automata. In RTA'03, 2003.

Completing Categorical Algebras (Extended Abstract)

Stephen L. Bloom^1 and Zoltán Ésik^2,*

^1 Department of Computer Science, Stevens Institute of Technology, Hoboken, NJ 07030
^2 Institute for Informatics, University of Szeged, Szeged, Hungary, and GRLMC, Rovira i Virgili University, Tarragona, Spain

Abstract. Let $\Sigma$ be a ranked set. A categorical $\Sigma$-algebra, c$\Sigma$a for short, is a small category $C$ equipped with a functor $\sigma_C : C^n \to C$, for each $\sigma \in \Sigma_n$, $n \ge 0$. A continuous categorical $\Sigma$-algebra is a c$\Sigma$a which has an initial object and all colimits of $\omega$-chains, i.e., functors $N \to C$; each functor $\sigma_C$ preserves colimits of $\omega$-chains. ($N$ is the linearly ordered set of the nonnegative integers considered as a category as usual.) We prove that for any c$\Sigma$a $C$ there is an $\omega$-continuous c$\Sigma$a $C^\omega$, unique up to equivalence, which forms a "free continuous completion" of $C$. We generalize the notion of inequation (and equation) and show the inequations or equations that hold in $C$ also hold in $C^\omega$. We then find examples of this completion when
- $C$ is a c$\Sigma$a of finite $\Sigma$-trees
- $C$ is an ordered $\Sigma$ algebra
- $C$ is a c$\Sigma$a of finite $A$-synchronization trees
- $C$ is a c$\Sigma$a of finite words on $A$.

1 Introduction

Computer science is necessarily concerned with fixed point equations, and with finding settings in which fixed point equations may be solved. Such equations arise in well known ways, for example, in specifying both the syntax and semantics of programming languages. In many examples, the setting is some kind of ordered algebra $A$ with the properties that $A$ contains a least element $\bot$, and $\omega$-chains, i.e., increasing sequences $a_0 \le a_1 \le \ldots$, have least upper bounds. In this setting, the least solution of an equation $x = f(x)$,

* Partially supported by the National Foundation of Hungary for Scientific Research. Please use the following format when citing this chapter: Bloom, S.L., Ésik, Z., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 231-249.


when $f : A \to A$ preserves least upper bounds of $\omega$-chains may be found as the least upper bound of $\bot$ [...] 0, of pairwise disjoint sets, the collection of finite and infinite $\Sigma$-trees may be equipped with an ordering by adjoining a new label $\bot$ to $\Sigma_0$, and defining $s$ [...] $X$ is the totally undefined function, and $f_{n+1}(x) = $ if $(B?\ x)$ then $f_n(f(x))$ else $x$.

However, not all fixed point equations may be solved by means of least upper bounds. One example that plays an important role in the semantics of parallel computation is synchronization trees, see [Mil89, Win84]^1 or [BE93]. For a fixed alphabet $A$, an $A$-synchronization tree is a finite or countable rooted tree in which every edge is labeled by a letter in $A$; the collection of these trees forms a category $ST_A$, in which a morphism $f : s \to t$ is a function from the vertices of $s$ to the vertices of $t$ which preserves the root, the edge relation and the labeling. This category has an initial object $\bot$, the rooted tree with no edge, and is equipped with at least the operations of prefixing and sum. For each letter $a \in A$ and each synchronization tree $t$, $a : t$ is the tree obtained from $t$ by adding a new root $r$ and an edge labeled $a$ from $r$ to the root of $t$. When $s, t$ are synchronization trees, $s + t$ is the tree obtained from $s, t$ by identifying their roots, and otherwise keeping the vertices and edges of each. In this category, fixed point equations such as $x = (a : x) + x$ have solutions, but there is no canonical ordering on the category in which least solutions exist. However, this category has all colimits of $\omega$-diagrams; the right side of fixed point equations determines a continuous endofunctor $F : ST_A \to ST_A$. Further, the "initial fixed point" of the functor $F$ is determined up to isomorphism as a colimit of the $\omega$-diagram [...]

^1 In [Win84], two complete partial orders are defined on synchronization trees. However, the definition depends on the concrete representation of trees and is thus not fully abstract.


Thus, $ST_A$ is an example of a continuous c$\Sigma$a defined in the abstract (and immediately below). There are other examples which we will mention after stating our main results. Although there are many kinds of completions in the category-theory literature, we were not able to find this particular completion, except for the case of linear orders. In volume 2 of [Ele02], Johnstone describes an "Ind-completion" of a category, which is certainly related to this one. However, Johnstone does not study algebraic structures on the category and thus does not consider (in)equations. The notion of a c$\Sigma$a probably occurs to all those familiar with both universal algebra and category theory, and the outline of an $\omega$-completion result is probably obvious to many. Perhaps the "right" notion of the truth of an inequation in a c$\Sigma$a is not obvious, and the details of the construction have turned out to be more delicate than expected. We think they merit exposition in this paper. In this extended abstract, only a few proofs will be given. A version of this paper with full proofs may be found at www.cs.stevens.edu/~bloom/research/pubs2/ccafull.pdf.

2 Some notation

$N$ is the category whose objects are the nonnegative integers, in which there is a morphism $n \to p$ exactly when $n \le p$. If $f : X \to Y$ is either a function or functor, we write $if$, $f_i$, or $f(i)$ for the value of $f$ on the argument $i$. The composite of $f : x \to y$ and $g : y \to z$ is written $fg : x \to z$, where $f, g$ are functions or functors.

3 The completion and characterization theorems

Let $\Sigma$ be a ranked alphabet. A categorical $\Sigma$-algebra $C$ consists of a small category $C$, and, for each letter $\sigma \in \Sigma_n$, a functor $\sigma_C : C^n \to C$. A morphism $h : C \to D$ of categorical $\Sigma$-algebras is a functor $h : C \to D$ such that for each $n \ge 0$ and each $\sigma \in \Sigma_n$, the composites $C^n \xrightarrow{\sigma_C} C \xrightarrow{h} D$ and $C^n \xrightarrow{h^n} D^n \xrightarrow{\sigma_D} D$ are naturally isomorphic. A c$\Sigma$a-morphism $h$ is strict if the functors $\sigma_C \cdot h$ and $h^n \cdot \sigma_D$ are the same for all $\sigma \in \Sigma_n$. Recall that a functor $h : D \to D'$ is $\omega$-continuous, or just "continuous" for short, if whenever a functor $f : N \to D$ has a colimit $(\nu_n : f_n \to d)_n$ in $D$, then $(\nu_n h : f_n h \to dh)_n$ is a colimit of $f \cdot h : N \to D'$. A c$\Sigma$a $C$ is ($\omega$-)continuous if


S. Bloom and Z. Esik

- $C$ is $\omega$-complete, i.e., $C$ has an initial object $\bot$ and all functors $N \to C$ have colimits, and further,
- each functor $\sigma_C : C^n \to C$ is continuous.

A (strict) morphism of continuous c$\Sigma$a's is a continuous functor $F : C \to D$ which preserves initial objects and is a (strict) c$\Sigma$a morphism.

Remark 1. Categorical $\Sigma$-algebras are a generalization of ordered $\Sigma$-algebras and continuous c$\Sigma$a's are a generalization of (order) continuous $\Sigma$-algebras, see [Blo76, GTWW77, Gue81] or below.

Let $Tm_\Sigma(p)$ denote the collection of $\Sigma$-terms on $p$ variables $x_1, \ldots, x_p$. Suppose that $C$ is a c$\Sigma$a. Any term $t \in Tm_\Sigma(p)$ determines a functor $t_C : C^p \to C$ as follows:
- $(x_i)_C : C^p \to C$ is the $i$-th projection functor ($1 \le i \le p$);
- $(\sigma(t_1, \ldots, t_n))_C$ is the composite $C^p \xrightarrow{\langle (t_1)_C, \ldots, (t_n)_C \rangle} C^n \xrightarrow{\sigma_C} C$.
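To make the action of a term functor on objects concrete, here is a minimal Python sketch for the special case of an ordered algebra, where only the values on objects matter. The encoding of terms (an integer `i` for the variable x_i, a pair `(symbol, subterms)` otherwise) and the names `evaluate`, `OPS` and `TERM` are assumptions made for this illustration; the example algebra (natural numbers with `0` and `max`) is likewise just a convenient choice.

```python
def evaluate(term, ops, args):
    """Value of a term on a tuple of objects.

    term: an int i (the variable x_i, 1-based, i.e. the i-th projection)
          or a pair (symbol, subterms).
    ops:  dict mapping each symbol to a Python function of matching arity.
    args: tuple of objects substituted for x_1, ..., x_p.
    """
    if isinstance(term, int):
        return args[term - 1]
    symbol, subterms = term
    values = [evaluate(s, ops, args) for s in subterms]
    return ops[symbol](*values)


# Hypothetical ordered algebra: natural numbers with constant 0 and binary max.
OPS = {"0": lambda: 0, "+": max}

# The term x_1 + (x_2 + 0) on two variables.
TERM = ("+", (1, ("+", (2, ("0", ())))))
```

Because `max` is order preserving, `evaluate(TERM, OPS, args)` is monotone in `args`, which is the poset shadow of `TERM` determining a functor.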

A c$\Sigma$a inequality is an expression

$$s \le t$$

where $s, t$ are terms in $Tm_\Sigma(p)$, for some $p \ge 0$. If $C$ is a c$\Sigma$a, we say $C$ is a model for $s \le t$, in symbols,

$$C \models s \le t,$$

if there is a natural transformation $s_C \to t_C$; similarly, $C$ is a model for the equality $s = t$ if there is a natural isomorphism $s_C \to t_C$.

Our main results are about completions of c$\Sigma$a's.

Theorem 1 (Completion theorem). For any c$\Sigma$a $C$ having an initial object, there is a continuous c$\Sigma$a $C^\omega$, and a c$\Sigma$a morphism

$$\eta : C \to C^\omega,$$

with the following properties. If $D$ is a continuous c$\Sigma$a, and if $F : C \to D$ is any c$\Sigma$a-morphism which preserves initial objects, then there is a morphism $F^\omega : C^\omega \to D$ in the category of continuous c$\Sigma$a's, unique up to a natural isomorphism, such that the functors $\eta \cdot F^\omega$ and $F$ are naturally isomorphic.


It then follows that
- $C^\omega$ is unique up to categorical equivalence.
- $\eta$ is a full and faithful functor which is injective on objects, and which preserves initial objects.
- Any c$\Sigma$a inequality or equality which holds in $C$ also holds in $C^\omega$.

Our characterization of $C^\omega$ involves the following notion.

Definition 1. Suppose that $K$ is a full subcategory of the category $D$.
- $K$ is compact in $D$ if for each object $c$ in $K$, and each object $d$ of $D$, if there is a colimiting cone

$$(\tau_i : f_i \to d)_i \qquad (1)$$

where $f : N \to K$, then any map $c \to d$ factors through some $\tau_i$.
- $D$ is compactly generated by $K$ if $K$ is compact in $D$ and for every object $d$ of $D$, there is a functor $f : N \to K$ and a colimiting cone as in (1) in which each colimit morphism $\tau_i : f_i \to d$ is monic.

Using this notion, we describe those situations in which the induced functor $F^\omega$ in Theorem 1 is an equivalence.

Theorem 2 (Characterization theorem). Suppose that $D$ is a continuous c$\Sigma$a and $F : C \to D$ is a c$\Sigma$a morphism which preserves initial objects. Then the induced functor $F^\omega : C^\omega \to D$ is an equivalence iff $F$ is full, faithful, and $D$ is compactly generated by the image of $F$.

We will outline the proofs after discussing some examples.

3.1 Ordered $\Sigma$-algebras

When $\Sigma$ is a ranked set, an ordered $\Sigma$-algebra consists of a partially ordered set $(A, \le)$ equipped with a function

$$\sigma_A : A^n \to A$$

for each $\sigma \in \Sigma_n$, which is order preserving. Such algebras are categorical $\Sigma$-algebras, in which the objects are the elements of $A$ and in which there is a morphism $a \to b$ exactly when $a \le b$. In [Blo76], varieties of ordered algebras were considered, and it was shown that each variety $V$ was closed under the free $\omega$-completion of any algebra in $V$. Our main theorem is a significant generalization of this result.


3.2 $\Sigma$-trees

As formalized in [BET93], a $\Sigma$-tree $t$ is a partial function $t : N_+^* \to \Sigma$, with source the set $N_+^*$ of finite sequences of positive integers, and target $\Sigma$, with the following properties.
- The domain of $t$ is a nonempty, prefix-closed subset of $N_+^*$.
- If $u \in N_+^*$ is in the domain of $t$ and if $t(u) \in \Sigma_n$, and $i$ is a positive integer, then $ui$, the sequence obtained by putting $i$ at the end of the sequence $u$, is in the domain of $t$ iff $1 \le i \le n$.

Thus, the leaves of $t$ are those sequences $u$ such that $t(u) \in \Sigma_0$. We assume there is a distinguished letter $\bot \in \Sigma_0$. Then for trees $s, t$, we define $s \le t$ [...]

$$\sigma(t_1, \ldots, t_n)(u) = \begin{cases} \sigma & \text{if } u \text{ is the empty sequence,} \\ t_i(v) & \text{if } u = iv, \end{cases}$$

where $iv$ is the sequence obtained by putting $i$ on the front of the sequence $v$. $\Sigma tr$ is an ordered c$\Sigma$a, in that there is a morphism $s \to t$, for any trees $s, t$, iff $s \le t$. [...] □
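The two domain conditions on a Σ-tree, and the operation that places subtrees below a new root, can be sketched in Python for finite trees. The representation of a tree as a dict from tuples of positive integers to letters, the `rank` dict giving each letter its arity, and the names `is_sigma_tree` and `apply_symbol` are assumptions made for this illustration.

```python
def is_sigma_tree(t, rank):
    """Check the two domain conditions for a finite tree.

    t:    dict mapping tuples of positive integers (positions) to letters.
    rank: dict mapping each letter to its arity n.
    """
    if () not in t:
        return False  # a nonempty prefix-closed domain contains the empty sequence
    for u, letter in t.items():
        # every successor ui with 1 <= i <= n must be in the domain
        if any(u + (i,) not in t for i in range(1, rank[letter] + 1)):
            return False
        if u:
            parent = u[:-1]
            # the parent must be present, and u must be one of its allowed successors
            if parent not in t or not 1 <= u[-1] <= rank[t[parent]]:
                return False
    return True


def apply_symbol(sigma, subtrees):
    """sigma(t_1, ..., t_n): root labeled sigma, with t_i placed below position i."""
    t = {(): sigma}
    for i, ti in enumerate(subtrees, start=1):
        for v, letter in ti.items():
            t[(i,) + v] = letter
    return t
```

For example, with a binary letter `f` and constants `a` and `bot`, `apply_symbol("f", [{(): "a"}, {(): "bot"}])` yields a three-node tree that passes `is_sigma_tree`.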

Note that if $D$ is any c$\Sigma$a with an initial object $\bot_D$, there is a unique c$\Sigma$a morphism $\Sigma Ftr \to D$ taking $\bot$ to $\bot_D$. Thus,

Corollary 1. $\Sigma tr$ is the initial continuous c$\Sigma$a in the category of all continuous c$\Sigma$a's in which $\bot$ is the initial object: for any such continuous c$\Sigma$a $D$ there is a continuous c$\Sigma$a-morphism $\Sigma tr \to D$, unique up to an isomorphism. □

3.3 Synchronization trees

We have shown in [BE93] that $ST_A$, defined briefly in the introduction, is an $\omega$-continuous categorical $\Sigma$-algebra, where $\Sigma$ is the signature having a constant symbol 0, denoting the initial object $\bot$, a unary function symbol $a$ for each $a \in A$, denoting the prefixing operation, and a binary function symbol $+$, denoting the coproduct operation described above. See also [Mil89, Win84]. Let $\mathcal{F}ST_A$ denote the full subcategory of $ST_A$ determined by the finite synchronization trees. Note that $\mathcal{F}ST_A$ is also a c$\Sigma$a, a "categorical subalgebra" of $ST_A$.

Proposition 2. $ST_A$ is the completion of $\mathcal{F}ST_A$. □

Let $V$ be the collection of all c$\Sigma$a's $D$ in which 0 is an initial object which satisfy the following:

$$x + 0 \cong x, \qquad x + y \cong y + x, \qquad x + (y + z) \cong (x + y) + z.$$

Then it is not hard to show that the subcategory $\mathcal{F}ST_A(\mathrm{mon})$ of $\mathcal{F}ST_A$ with the same objects having only monics as morphisms is the initial c$\Sigma$a in $V$, in the following sense: for any c$\Sigma$a $D$ in $V$ there is a c$\Sigma$a-morphism $F : \mathcal{F}ST_A(\mathrm{mon}) \to D$, unique up to a natural isomorphism.

Corollary 2. $\mathcal{F}ST_A(\mathrm{mon})^\omega$ is initial in the category of all continuous c$\Sigma$a's in $V$.

Proof. Let $D$ be a continuous c$\Sigma$a in $V$. Then there is a c$\Sigma$a morphism $F : \mathcal{F}ST_A(\mathrm{mon}) \to D$, since $\mathcal{F}ST_A(\mathrm{mon})$ is initial in $V$. But then there is a continuous $F^\omega : \mathcal{F}ST_A(\mathrm{mon})^\omega \to D$, unique up to natural isomorphism, by the completion theorem. □

3.4 Words

We recall from [Cou78, BE05] that when $A$ is a finite or countable set, a word over $A$ (called an arrangement in [Cou78]) is a triple $u = (L_u, <_u, \lambda_u)$ consisting of a finite or countable linearly ordered set $(L_u, <_u)$ [...] $u \cdot v \to u' \cdot v'$ so that it agrees with $f$ on the elements of $L_u$ and with $g$ on the elements of $L_v$. Let $\Sigma$ be the signature with a constant symbol $a$, for each $a \in A$, denoting the constant functor $W_A^0 \to W_A$ whose value is the singleton word labeled


$a$, a symbol 0 in $\Sigma_0$ denoting the constant functor whose value is the empty word, and a binary function symbol ; denoting the concatenation functor. The following fact was essentially shown in [Cou78].

Proposition 3. $W_A$ is a continuous c$\Sigma$a.

In $W_A$, one can solve such equations as $x = a; x$ and $x = x; a; x$. The initial solution to the second is the word $a^\eta$ whose underlying order is isomorphic to the rationals, with every point labeled $a$. (There doesn't seem to be an ordering of $W_A$ such that $a^\eta$ is the least upper bound of a sequence of finite approximations.) Let $\mathcal{F}W_A$ be the full subcategory of $W_A$ determined by the finite words.

Proposition 4. $W_A$ is the completion of $\mathcal{F}W_A$.

Let $\mathcal{F}W_A(\mathrm{mon})$ be the subcategory of $\mathcal{F}W_A$ with the same objects, having only the monics as morphisms. Define the category $\mathcal{M}$ having as objects all c$\Sigma$a's with an initial object 0 which satisfy the monoid equations

$$0; x \cong x, \qquad x; 0 \cong x, \qquad x; (y; z) \cong (x; y); z.$$

It is not hard to show that $\mathcal{F}W_A(\mathrm{mon})$ is freely generated by $A$ in $\mathcal{M}$ in the sense that for any c$\Sigma$a $D$ in $\mathcal{M}$, and any function $f : A \to \mathrm{obj}(D)$, mapping 'letters' in $A$ to objects in $D$, there is a functor $F : \mathcal{F}W_A(\mathrm{mon}) \to D$, unique up to a natural isomorphism, such that $F(0)$ is initial and $F(a) = f(a)$, for each $a \in A$. Thus,

Corollary 3. $\mathcal{F}W_A(\mathrm{mon})^\omega$ is freely generated by $A$ in the category of all continuous c$\Sigma$a's in $\mathcal{M}$. □

4 Weak maps and compact generation

An endofunctor $m : N \to N$ is just a nondecreasing function. We say an endofunctor is unbounded if for each $i \in N$, $i < jm$, for some $j \in N$. When $m : N \to N$ is an endofunctor and $f : N \to C$ is a chain, we write $mf$ for the composite

$$N \xrightarrow{m} N \xrightarrow{f} C.$$

Thus, on the object $i \in N$, $(mf)_i = f_{im}$. When $f, g$ are chains, a weak map $\alpha : f \to g$ is a natural transformation

$$\alpha : f \to mg$$

for some unbounded endofunctor $m$ on $N$. We define the composite of weak maps $\alpha : f \to m_\alpha g$ and $\beta : g \to m_\beta h$ as

$$\alpha \circ \beta := f \xrightarrow{\alpha} m_\alpha g \xrightarrow{m_\alpha \beta} (m_\alpha m_\beta) h.$$
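Unwinding the definition of the composite of weak maps componentwise may help; the following is a short worked computation under the conventions stated above (arguments written on the left, $(mf)_i = f_{im}$), not a formula quoted from the paper.

```latex
% Components of the weak maps:
%   \alpha_i : f_i \to g_{i m_\alpha}, \qquad \beta_k : g_k \to h_{k m_\beta}.
% Whiskering \beta by m_\alpha selects the components of \beta at k = i m_\alpha:
\[
  (m_\alpha \beta)_i \;=\; \beta_{i m_\alpha} \;:\; g_{i m_\alpha} \to h_{(i m_\alpha) m_\beta}.
\]
% Hence the i-th component of the composite is
\[
  (\alpha \circ \beta)_i \;=\; \alpha_i \cdot \beta_{i m_\alpha}
  \;:\; f_i \to g_{i m_\alpha} \to h_{i m_\alpha m_\beta}
  \;=\; \bigl((m_\alpha m_\beta) h\bigr)_i .
\]
```

Since a composite of nondecreasing unbounded functions is again nondecreasing and unbounded, $m_\alpha m_\beta$ is an unbounded endofunctor, so the composite is again a weak map.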

Definition 2. For weak maps $\alpha : f \to m_\alpha g$ and $\beta : f \to m_\beta g$, define $\alpha \sim \beta$ by: for all $i \ge 0$ there is some $j \ge i m_\alpha, i m_\beta$ such that

$$\alpha_i \cdot g(i m_\alpha, j) = \beta_i \cdot g(i m_\beta, j). \qquad (2)$$

It is clear that $\sim$ is an equivalence relation on the weak maps with the same source and target. Let $[\alpha] : f \to g$ denote the $\sim$-equivalence class of the weak map $\alpha : f \to g$. This equivalence relation is compatible with composition.

Proposition 5. If $\alpha \sim \alpha' : f \to g$ and $\beta \sim \beta' : g \to h$, then $\alpha \circ \beta \sim \alpha' \circ \beta'$. □

We will need the following fact about $\alpha \sim \beta$.

Lemma 1 (Inflation Lemma). Suppose that $\alpha : f \to mg$ and that $m' : N \to N$ is any functor satisfying $km \le km'$, for all $k \ge 0$. Define the natural transformation $\alpha' : f \to m'g$ by

$$\alpha'_i := f_i \xrightarrow{\alpha_i} g_{im} \xrightarrow{g(im, im')} g_{im'}.$$

Then $\alpha \sim \alpha'$.

4.1 Compact generation

Recall Definition 1. Note the similarity of this notion to that of the definition in [CCL80] of a continuous lattice. The following lemma indicates where compact generation arises.

Lemma 2. Let $C$ be a full subcategory of $D$. Suppose that $f, f' : N \to C$ and that $(\tau_i : f_i \to d)_i$ and $(\tau'_i : f'_i \to d')_i$ are colimiting cones in $D$. Then

1. A weak map $\gamma : f \to mf'$ determines the map $\kappa(\gamma) : d \to d'$ as the unique morphism $d \to d'$ such that

$$\tau_i \cdot \kappa(\gamma) = \gamma_i \cdot \tau'_{im}$$

for all $i \ge 0$.


2. If $\gamma : f \to mf'$ and $\gamma' : f \to m'f'$ are weak maps such that $\gamma \sim \gamma'$, then $\kappa(\gamma) = \kappa(\gamma')$.

3. Suppose that $D$ is compactly generated by $C$ and that for $i \ge 0$, the morphisms $\tau_i$ and $\tau'_i$ are monics. Then, for any map $h : d \to d'$ in $D$ there is a weak map $\gamma : f \to mf'$ such that $\kappa(\gamma) = h$.

4. Suppose that $D$ is compactly generated by $C$ and for $i \ge 0$, the morphisms $\tau_i$ and $\tau'_i$ are monics. If $\gamma : f \to mf'$ and $\gamma' : f \to m'f'$ are weak maps, and $\kappa(\gamma) = \kappa(\gamma')$, then $\gamma \sim \gamma'$.

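Now, we give a condition sufficient to obtain a colimit of a functor $G : N \to D$.

Lemma 3. We assume the following hypotheses.
- For $i \ge 0$, $f^i : N \to D$ is a functor with colimiting cone $(\tau^i_j : f^i_j \to \kappa(f^i))_j$.
- For each $i \le j$, $\beta^{i,j} : f^i \to f^j$ is a natural transformation such that $\beta^{i,i} = 1_{f^i}$ and, when $i \le j \le k$, $\beta^{i,j} \cdot \beta^{j,k} = \beta^{i,k}$. Thus, $G : N \to D$ is a functor, where $G_i = \kappa(f^i)$, and $G(i,j) = \kappa(\beta^{i,j})$, for all $0 \le i \le j$.
- Let $g : N \to D$ be the diagonal functor, with $g_i := f^i_i$ and $g(i,j) := f^i(i,j) \cdot \beta^{i,j}_j$.
- Let $\mu_i(j) = \max(i,j)$, and let $\delta^i : f^i \to \mu_i g$ be the weak map [...]
- Suppose that $(\tau_i : g_i \to \kappa(g))_i$ is a colimiting cone.

Then, it follows that $(\kappa(\delta^i) : \kappa(f^i) \to \kappa(g))_i$ is a colimiting cone over $G$, where $\kappa(\delta^i)$ is the unique map satisfying the conditions that

$$\tau^i_j \cdot \kappa(\delta^i) = \delta^i_j \cdot \tau_{\max(i,j)} \qquad (3)$$

for all $j$.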


The following Proposition is quite useful.

Proposition 6. Suppose that $D$ is compactly generated by the full subcategory $C$. Then:
1. $C$ has an initial object iff $D$ has.
2. $D$ has colimits of all $\omega$-diagrams iff each functor $N \to C$ has a colimit in $D$.
3. A functor $F : D \to D'$ is continuous iff it preserves colimits of all functors $N \to C$.

Proof. We prove only the second two statements.

Proof of 2. Suppose that each functor $N \to C$ has a colimit in $D$. We show that if $G : N \to D$ is a functor, $G$ has a colimit in $D$. For each $n \ge 0$, let $f^n : N \to C$ be a functor such that $(\tau^n_i : f^n_i \to G_n)_i$ is a colimiting cone in $D$. By Lemma 2, for each $0 \le i \le j$, each morphism $G(i,j) : G_i \to G_j$ is determined by a weak map

$$\beta^{i,j} : f^i \to m_{i,j} f^j.$$

For ease of notation, let's assume that all functors $m_{i,j}$ are the identity, so that for each $0$ [...] $(\tau_i : g_i \to d)_i$ be a [...] $0$, there is a weak map $\delta^i : f^i \to \mu_i g$ defined by [...] (As above, $\mu_i(j) = \max(i,j)$.) Thus, there is a unique map

$$\kappa(\delta^i) : G_i \to d$$

such that for all $j \ge 0$, (3) holds. In particular, letting $j = i$,

$$\tau^i_i \cdot \kappa(\delta^i) = \delta^i_i \cdot \tau_i. \qquad (4)$$

Claim. $(\kappa(\delta^i) : G_i \to d)_i$ is a colimiting cone. Indeed, any cone $(\nu_i : G_i \to e)_i$ over $G$ determines the cone

$$(\tau^i_i \cdot \nu_i : g_i \to e)_i$$

over $g$, and hence, there is a unique map

$\nu^* : d \to e$ such that for all $i$,

$$\tau^i_i \cdot \nu_i = \tau_i \cdot \nu^*.$$

We show that for all $i \ge 0$,

$$\nu_i = \kappa(\delta^i) \cdot \nu^*. \qquad (5)$$

Indeed, for fixed $i$, the maps

$$\alpha_j := \tau^i_j \cdot \kappa(\delta^i) \cdot \nu^*$$

form a cone over $f^i : N \to C$, so that there is a unique map $\alpha^* : G_i \to e$ such that for all $j$,

$$\tau^i_j \cdot \alpha^* = \tau^i_j \cdot \kappa(\delta^i) \cdot \nu^*.$$

But $\nu_i$ is one such map. Hence $\alpha^* = \nu_i$. We now show $\nu^* : d \to e$ is the unique map such that for all $i$, (5) holds. Indeed, suppose

$$\nu_i = \kappa(\delta^i) \cdot \alpha,$$

all $i \ge 0$. Then, for each $i, j$,

$$\tau^i_j \cdot \nu_i = \tau^i_j \cdot \kappa(\delta^i) \cdot \alpha = \delta^i_j \cdot \tau_{\max(i,j)} \cdot \alpha.$$

But if $i = j$,

$$\tau^i_i \cdot \nu_i = \tau_i \cdot \alpha,$$

and $\nu^*$ is the unique such map. □

Proof of 3. Suppose that $F : D \to D'$ preserves the colimits of all functors $N \to C$. We show that $F$ preserves the colimits of all functors $N \to D$. We use Lemma 3. Indeed, suppose that $G : N \to D$ is a functor. Using the notation of the previous part, we have shown that

$$(\kappa(\delta^i) : G_i \to \kappa(g))_i$$

is a colimit of $G$, where, for each $i \ge 0$, $f^i : N \to C$ is a functor and

$$(\tau^i_j : f^i_j \to G_i)_j$$

is a colimit in $D$, and where $g$ is the diagonal functor, with colimiting cone

$$(\tau_i : g_i \to \kappa(g))_i.$$


But now, applying $F$, the assumptions imply that

$$(\tau^i_j F : f^i_j F \to G_i F)_j$$

is a colimiting cone, as is

$$(\tau_i F : g_i F \to \kappa(g) F)_i.$$

It then follows from Lemma 3 that

$$(\kappa(\delta^i) F : G_i F \to \kappa(g) F)_i$$

is a colimiting cone in $D'$. □

5 Construction of $C^\omega$

We now describe the c$\Sigma$a $C^\omega$ as a quotient of the functor category $C^N$.

5.1 Step 1.

We assume $C$ has an initial object (if necessary, we adjoin one freely). Let $C^N$ be the category whose objects are all functors $f : N \to C$; a morphism $\alpha : f \to g$ is a natural transformation. We usually denote the components of a natural transformation $\alpha : f \to g$ by

$$\alpha_n : f_n \to g_n,$$

for $n \ge 0$. We impose the structure of a c$\Sigma$a on $C^N$ by 'lifting' the functors $\sigma : C^n \to C$ to $C^N$. For example, if $\sigma \in \Sigma_2$, and $f, g : N \to C$, then $\sigma_{C^N}(f, g) : N \to C$ is the functor whose value on $n$ is

$$\sigma_C(f_n, g_n).$$

The value on the arrow $n \le p$ in $N$ is:

$$\sigma_C(f(n,p), g(n,p)) : \sigma_C(f_n, g_n) \to \sigma_C(f_p, g_p).$$

So, now, for every term $s$ in $Tm_\Sigma(p)$, $s_{C^N}$ is defined. (We usually will drop subscripts.) For example, if $p = 2$, and $\alpha : f \to f'$ and $\beta : g \to g'$ are arrows in $C^N$ (i.e., natural transformations),

$$s(\alpha, \beta) : s(f, g) \to s(f', g')$$

is the natural transformation with components

$$(s(\alpha, \beta))_n = s(\alpha_n, \beta_n) : s(f_n, g_n) \to s(f'_n, g'_n).$$


Definition 3 ($\eta_0$ defined). Let

$$\eta_0 : C \to C^N$$

be the functor taking the object $x$ in $C$ to the functor $\eta_0(x)$ with $\eta_0(x)_n = x$, and $\eta_0(x)(n,p) = 1_x$, the identity morphism $x \to x$, for all $0 \le n \le p$. On the morphism $g : x \to y$ in $C$, the value of $\eta_0(g)$ is the natural transformation $\eta_0(x) \to \eta_0(y)$, each of whose components is $g$.

Proposition 7. The functor $\eta_0 : C \to C^N$ is a strict c$\Sigma$a-morphism, which is full and faithful, and injective on objects. If $\bot$ is an initial object in $C$, $\eta_0(\bot)$ is initial in $C^N$. □

Now for the next step.

5.2 Step 2.

Definition 4. Let $C^\omega$ be the category whose objects are those of $C^N$, in which a morphism $[\alpha] : f \to g$ is a $\sim$-equivalence class of a weak map $\alpha : f \to mg$.

We define the canonical embedding of $C$ into $C^\omega$.

Definition 5 ($\eta$ defined). Let $\eta : C \to C^\omega$ be the functor taking $f : x \to y$ in $C$ to $[\eta_0(f)] : \eta_0(x) \to \eta_0(y)$ in $C^\omega$.

We would like to impose the structure of a c$\Sigma$a on $C^\omega$. The first problem is that if $\sigma \in Tm_\Sigma(2)$, say, and if $\alpha : f \to m_\alpha g$ and $\beta : f' \to m_\beta g'$, when $m_\alpha \ne m_\beta$, how should we define $\sigma([\alpha], [\beta]) : \sigma(f, f') \to \sigma(g, g')$, since $\sigma(\alpha, \beta)$ may not be a weak map! Indeed, for $i \in N$, if $i m_\alpha \ne i m_\beta$, we have

$$\sigma(\alpha, \beta)_i = \sigma(\alpha_i, \beta_i) : \sigma(f_i, f'_i) \to \sigma(g_{i m_\alpha}, g'_{i m_\beta}),$$

which is not a weak map. However, if $m_\alpha = m_\beta$, this equation does define a weak map $\sigma(\alpha, \beta) : \sigma(f, f') \to \sigma(g, g')$. We have a simple alternative, using the Inflation Lemma 1, above.

Lemma 4. Suppose that $m, m'$ are unbounded endofunctors on $N$ with $jm \le jm'$, for all $j \ge 0$. Suppose also that $\alpha_i : f_i \to m g_i$, $\beta_i : f_i \to m' g_i$ are natural transformations such that $\alpha_i \sim \beta_i$, for each $i = 1, \ldots, n$. Then if $\sigma \in \Sigma_n$, we have the natural transformations

$$\sigma(\alpha_1, \ldots, \alpha_n) := (\alpha_1, \ldots, \alpha_n) \cdot \sigma : \sigma(f_1, \ldots, f_n) \to m\sigma(g_1, \ldots, g_n)$$
$$\sigma(\beta_1, \ldots, \beta_n) := (\beta_1, \ldots, \beta_n) \cdot \sigma : \sigma(f_1, \ldots, f_n) \to m'\sigma(g_1, \ldots, g_n).$$

With these assumptions, $\sigma(\alpha_1, \ldots, \alpha_n) \sim \sigma(\beta_1, \ldots, \beta_n)$. □


Definition 6 ($C^\omega$ as c$\Sigma$a). Suppose $\sigma \in \Sigma_n$ and $n > 0$. For any $n$-tuple $[\alpha_1], \ldots, [\alpha_n]$, where $[\alpha_i]$ is an equivalence class of a weak map $\alpha_i : f_i \to g_i$, $i = 1, \ldots, n$, choose some $m : N \to N$ and some $\beta_i : f_i \to m g_i$, for $i = 1, \ldots, n$, such that
- $\alpha_i \sim \beta_i$, for each $i$;
- $\beta_i : f_i \to m g_i$, for each $i$.

The existence of such $\beta_i$ and $m$ follows by the Inflation Lemma. Now define

$$\sigma_{C^\omega}([\alpha_1], \ldots, [\alpha_n]) : \sigma(f_1, \ldots, f_n) \to \sigma(g_1, \ldots, g_n)$$

as $[\sigma(\beta_1, \ldots, \beta_n)]$, the equivalence class of the weak map $\sigma(\beta_1, \ldots, \beta_n)$. (We write just $\sigma$ for $\sigma_{C^\omega}$.) The fact that [...] $C^\omega$. But this is easy. We have thus constructed the c$\Sigma$a $C^\omega$. We omit the proof of the following fact.

Proposition 8. The functor $\eta$ is a strict c$\Sigma$a morphism which preserves the initial object, and is full, faithful and injective on objects. □

In the next section we will prove that $C^\omega$ is an $\omega$-continuous c$\Sigma$a.

6 $C^\omega$ has the required properties

In the previous section we defined the categorical $\Sigma$-algebra $C^\omega$ and the embedding $\eta : C \to C^\omega$. In this section, we prove that the construction satisfies all properties required in Theorem 1. We will show that $C^\omega$ is compactly generated by $\eta(C)$, and then apply Proposition 6.

Lemma 5. If $f : N \to C$ is any functor, then $f$ is the colimit object in $C^\omega$ of the diagram $f\eta$ via the colimit morphisms

$$[\tau^f_n] : \eta(f_n) \to f$$

where, for each $n$, $\tau^f_n$ has the components

$$\tau^f_n(i) := f(n, \max\{i, n\}). \qquad (6)$$

Further, each morphism $[\tau^f_n]$ is monic.


Proof. First, we show each morphism $[\tau^f_n]$ is monic. Suppose that $f, g : N \to C$ are objects in $C^\omega$, and $\alpha, \beta : g \to f_n\eta$ are weak maps such that

$$[\alpha] \cdot [\tau^f_n] = [\beta] \cdot [\tau^f_n].$$

By the Inflation Lemma, we may assume that $\alpha, \beta : g \to m f_n\eta$ for some unbounded endofunctor $m : N \to N$. Thus, [...] so that for each $i$ there is some $j \ge n + im$ such that

$$\alpha_i \cdot f(n, j) = \beta_i \cdot f(n, j).$$

But this implies $\alpha \sim \beta$, and hence $[\alpha] = [\beta]$. It is clear that for $n \le p$, the triangle formed by $\eta(f_n) \to \eta(f_p)$ and the two colimit morphisms into $f$ commutes. Now suppose that $g$ is any object in $C^\omega$, and $([\nu_i] : \eta(f_i) \to g)_i$ is a cocone over the diagram $f\eta$. But defining

$$\nu^* : f \to g$$

as the weak map with components $(\nu^*)_i := \nu_i$, we have

$$\nu_i = \tau^f_i \cdot \nu^*,$$

for each $i$. □

We now consider the factorization property.

Lemma 6 ($C^\omega$ has the factorization property). Suppose that $c$ is an object in $C$, $f : N \to C$ is an object in $C^\omega$, and $[\alpha] : c\eta \to f$ is a morphism in $C^\omega$. Then $[\alpha]$ factors as

$$[\alpha] = [g\eta] \cdot [\tau^f_n],$$

for some $n \ge 0$, and some morphism $g : c \to f_n$ in $C$.

Proof. If $\alpha : c\eta \to mf$ is any weak map, then, for any $i$, since $(c\eta)(0, i) = 1_c$,

$$\alpha_i = c \xrightarrow{\alpha_0} f_{0m} \xrightarrow{f(0m, im)} f_{im}.$$

If $g = \alpha_0 : c \to f_{0m}$ in $C$, we have

$$[\alpha] = [g\eta] \cdot [\tau^f_{0m}].$$ □

Proposition 9. $C^\omega$ is compactly generated by $\eta(C)$.

Proof. By Lemmas 5 and 6. □

Corollary 4. $C^\omega$ is $\omega$-complete.

Proof. By Proposition 6 and Proposition 9. □

We now show $C^\omega$ is a continuous c$\Sigma$a.

Proposition 10. For each $\sigma \in \Sigma$, the functor $\sigma_{C^\omega}$ is continuous.

Proof. For ease of notation, assume that $\sigma \in \Sigma_1$. We have to show that if $([\tau^i] : f^i \to g)_i$ is a colimit of the $\omega$-diagram $\Delta$, then $([\sigma(\tau^i)] : \sigma(f^i) \to \sigma(g))_i$ is a colimit of $\sigma(\Delta)$, i.e., the diagram

$$\sigma(f^0) \to \sigma(f^1) \to \sigma(f^2) \to \cdots$$

But this fact follows just as above, since the colimit of this diagram is the diagonal, which is $\sigma(g)$. There is an alternative argument using the fact that for each $n \ge 1$, $(C^\omega)^n$ is compactly generated by $C^n$. Then, by Proposition 6, we need show only that $\sigma$ preserves colimits of functors $N \to C^n$. □

Proposition 11. If $s, t$ are $\Sigma$-terms in $Tm_\Sigma(p)$, then $C \models s \le t$ iff $C^\omega \models s \le t$. □

We turn now to showing that $\eta : C \to C^\omega$ has the universal property stated in Theorem 1. Suppose that $D$ is an $\omega$-continuous c$\Sigma$a, and $F : C \to D$ is a c$\Sigma$a-morphism. We want to define $F^\omega : C^\omega \to D$. We use Proposition 6. For each chain $f : N \to C$, i.e., each object of $C^\omega$, choose a colimit cone

$$(\lambda^f_i : f_i F \to \kappa(fF))_i \qquad (7)$$

in $D$. On the object $f$ in $C^\omega$, we define $fF^\omega$ as the colimit object $\kappa(fF)$. Suppose $f, g : N \to C$ are objects in $C^\omega$ and $\alpha : f \to m_\alpha g$ is any weak map. Then $\alpha$ determines the weak map $\alpha F : fF \to m_\alpha(gF)$, which in turn determines the map

$$\alpha^* : \kappa(fF) \to \kappa(gF)$$

by the property that for each $i \ge 0$,

$$\lambda^f_i \cdot \alpha^* = \alpha_i F \cdot \lambda^g_{i m_\alpha}.$$


Lemma 7. If $\alpha, \beta : f \to g$ are weak maps, and if $\alpha \sim \beta$, then $\alpha^* = \beta^*$.

Proof. Since $\alpha \sim \beta \Rightarrow \alpha F \sim \beta F$. □

Definition 7. We define $[\alpha]F^\omega = \alpha^*$.

Proposition 12. Suppose that $\tau^f_n : f_n\eta \to f$ is the monic colimit morphism defined above. Then $[\tau^f_n]F^\omega = \lambda^f_n$.

Proof. By definition, $[\tau^f_n]F^\omega$ is $\alpha^*$, where $\alpha = \tau^f_n F$. Since $f_n\eta F$ is the constant chain whose object is $f_n F$, the morphisms $\lambda^{f_n\eta F}_i$ are all the identity map $1_{f_n F} : f_n F \to f_n F$. Thus, for any $i \ge 0$,

$$\alpha^* = \lambda^{f_n\eta F}_i \cdot \alpha^* = \tau^f_n(i) F \cdot \lambda^f_{n+i} = fF(n, n+i) \cdot \lambda^f_{n+i}.$$

Thus, $[\tau^f_n]F^\omega = \lambda^f_n$, showing that $F^\omega$ preserves colimit cocones of functors $f\eta$, for $f : N \to C$. □

Corollary 5. $F^\omega$ is continuous.

Proof. By Proposition 6, part 3. □

It remains to show $F^\omega$ is a c$\Sigma$a-morphism. When $\sigma \in \Sigma_2$, we want to show that for any objects $f, g \in C^\omega$,

$$F^\omega(\sigma(f, g)) = \sigma_D(F^\omega(f), F^\omega(g)),$$

at least up to isomorphism. The method is to show that each side is the colimit object of the same $\omega$-diagram in $D$. We omit the details. □

7 Conclusion

We have presented a completion theorem for categorical algebras that generalizes the well-known completion of ordered algebras from [Blo76]. We have shown that the completion $C^\omega$ is conservative in the sense that it satisfies all (in)equalities that hold in $C$. In addition to order completion, we have presented two main applications: synchronization trees and words, and thus found concrete descriptions of free continuous categorical algebras satisfying monoid and commutative monoid "equations". We believe that the Completion Theorem will find several more applications in Computer Science. For one example, the collection of countable labeled partial orders over an alphabet, sometimes called pomsets, equipped with the operations of series and parallel composition is a continuous categorical algebra in a natural way, cf. [Pra86, Ren96, LW00]. We expect that this algebra is equivalent to the completion of the categorical algebra determined by the finite pomsets. Further natural sources of applications are event structures (cf. [WN95]), or labeled transition systems with bisimulations, cf. [Mil89].


References

[Blo76] S.L. Bloom. Varieties of ordered algebras. J. Computer and System Sci., vol. 13, no. 2 (1976), 200-212.
[BE93] S.L. Bloom and Z. Esik. Iteration Theories. Springer, 1993.
[BE05] S.L. Bloom and Z. Esik. The equational theory of regular words. Information and Computation, 197/1-2 (2005), 55-89.
[BET93] S.L. Bloom, Z. Esik and D. Taubner. Iteration theories of synchronization trees. Information and Computation, 102/1 (1993), 1-55.
[Cou78] B. Courcelle. Frontiers of infinite trees. RAIRO Inform. Theor., 12/4 (1978), 319-337.
[EBT78] C. Elgot, S.L. Bloom, R. Tindell. The algebraic structure of rooted trees. J. Computer and System Sci., vol. 16, no. 3 (1978), 362-399.
[CCL80] G. Gierz, K.H. Hofmann, K. Keimel, J.D. Lawson, M. Mislove, D.S. Scott. A compendium of continuous lattices. Springer-Verlag, 1980.
[GTWW77] J.A. Goguen, J.W. Thatcher, E.G. Wagner and J.B. Wright. Initial algebra semantics and continuous algebras. J. Assoc. Comput. Mach., 24/1 (1977), 68-95.
[Gue81] I. Guessarian. Algebraic Semantics. Lecture Notes in Computer Science, 99. Springer-Verlag, Berlin-New York, 1981.
[Ele02] P.T. Johnstone. Sketches of an Elephant: A topos theory compendium. Oxford University Press, 2002.
[LW00] K. Lodaya and P. Weil. Series-parallel languages and the bounded-width property. Theoret. Comput. Sci. 237 (2000), 347-380.
[Mil89] R. Milner. Communication and Concurrency. Prentice-Hall, Englewood Cliffs, NJ, 1989.
[Pra86] V. Pratt. Modeling concurrency with partial orders. Internat. J. Parallel Programming, 15/1 (1986), 33-71.
[Ren96] A. Rensink. Algebra and theory of order-deterministic pomsets. Combining logics. Notre Dame J. Formal Logic 37/2 (1996), 283-320.
[Win84] G. Winskel. Synchronization trees. Theoretical Computer Science, 34 (1984), 33-82.
[WN95] G. Winskel and M. Nielsen. Models for concurrency, in: Handbook of Logic in Computer Science, Vol. 4, Oxford University Press, 1995, 1-148.

Reusing Optimal TSP Solutions for Locally Modified Input Instances* (Extended Abstract)

Hans-Joachim Bockenhauer¹, Luca Forlizzi², Juraj Hromkovic¹, Joachim Kneis³**, Joachim Kupke¹, Guido Proietti²,⁴, and Peter Widmayer¹

¹ Department of Computer Science, ETH Zurich, Switzerland, {hjb, juraj.hromkovic, jkupke, widmayer}@inf.ethz.ch
² Department of Computer Science, Universita di L'Aquila, Italy, {forlizzi,proietti}@di.univaq.it
³ Department of Computer Science, RWTH Aachen University, Germany, joachim.kneis@cs.rwth-aachen.de
⁴ Istituto di Analisi dei Sistemi ed Informatica "A. Ruberti", CNR, Roma, Italy

Abstract. Given an instance of an optimization problem together with an optimal solution, we consider the scenario in which this instance is modified locally. In graph problems, e.g., a singular edge might be removed or added, or an edge weight might be varied, etc. For a problem $U$ and such a local modification operation, let LM-$U$ (local-modification-$U$) denote the resulting problem. The question is whether it is possible to exploit the additional knowledge of an optimal solution to the original instance or not, i.e., whether LM-$U$ is computationally more tractable than $U$. Here, we give non-trivial examples both of problems where this is and problems where this is not the case. Our main results are these:
1. The local modification to change the cost of a singular edge turns the traveling salesperson problem (TSP) into a problem LM-TSP which is as hard as TSP itself, i.e., unless P = NP, there is no polynomial-time $p(n)$-approximation algorithm for LM-TSP for any polynomial $p$. Moreover, LM-TSP where inputs must satisfy the $\beta$-triangle inequality (LM-$\Delta_\beta$-TSP) remains NP-hard for all $\beta > \frac{1}{2}$.
2. For LM-$\Delta$-TSP (i.e., metric LM-TSP), an efficient 1.4-approximation algorithm is presented. In other words, the additional information enables us to do better than if we simply used Christofides' algorithm for the modified input.
3. Similarly, for all $1 < \beta < 3.34899$, we achieve a better approximation ratio for LM-$\Delta_\beta$-TSP than for $\Delta_\beta$-TSP.
4. Metric TSP with deadlines (time windows), if a single deadline or the cost of a single edge is modified, exhibits the same lower bounds on the approximability in these local-modification versions as those currently known for the original problem.

* This work was partially supported by SNF grant 200021-109252/1, by the research project GRID.IT, funded by the Italian Ministry of Education, University and Research, and by the COST 293 (GRAAL) project funded by the European Union.
** This author was staying at ETH Zurich when this work was done.

Please use the following format when citing this chapter: Bockenhauer, H.-J., Forlizzi, L., Hromkovic, J., Kneis, J., Kupke, J., Proietti, G., Widmayer, P., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 251-270.


H.-J. Bockenhauer et al.

1 Introduction

Traditionally, optimization theory has been concerned with the task of finding good feasible solutions to (practically relevant) input instances, little or nothing about which is known in advance. Many applications, however, demand good, sometimes optimal, solutions to a limited set of input instances which reflect a supposedly-constant environment (imagine, e.g., an existing railway system or communications network). When this environment does change, maybe only slightly and maybe only locally, do we have no choice but to recompute some good feasible solution, effectively forgetting about the old one? Here, we will analyze local modifications only. In a graph problem, for example, the cost of a single edge might change, an edge might be removed or added, or some other local parameter might be adjusted. Results related to this work pertain to the question by how much a given instance of an optimization problem may be varied if it is desired that optimal solutions to the original instance retain their optimality [12, 17, 18, 20, 21]. In contrast with this so-called "postoptimality analysis," our approach here is to ask: if we cannot avoid losing the optimality of a given solution when an instance is varied arbitrarily, what can we do to restore the quality of a solution, maybe in an approximative sense? Surely, for some problems, knowing an optimal solution to the original instance trivially makes their local-modification variants easy to solve because the given optimal solution is itself a very good solution to the modified instance. For example, adding an edge in the instance of a coloring problem will increase the cost of an optimal solution by at most the amount of one, which is an excellent approximation, but certainly not the object of our interest. Our goal is to present non-trivial examples of problems, some where the knowledge of an optimal solution to an instance close to the input is helpful and some where it is not.
To this end, we will study T S P , its restricted versions, and its generalizations such as T S P with deadlines (a special case of T S P with time windows). Let A-TSP denote metric TSP, and, for ah /? > i , let Afs-TSP denote the special case of TSP where all instances satisfy the /^-triangle inequality

cax,z})<(3-{c{{x,y})+ci{y,z})) for all vertices x, y, and 2;. If ^ < /? < 1, we call this the strengthened triangle inequality; and if /3 > 1, we call it the relaxed triangle inequality. For an optimization problem U, we denote our local-modification variant of U by LM-U. For the aforementioned TSP-based problems, we regard it as a local modification to change the cost of exactly one edge. For T S P with deadlines, we also regard it as a local modification to shift one deadline by the amount of at least one time unit. Our main results are as follows: (i) It is well-known that T S P is not approximable in polynomial time with a polynomial approximation ratio (unless P = NP). We show that this

Reusing Optimal TSP Solutions for Locally Modified Input Instances

253

holds for LM-TSP, too. Thus, in terms of a worst-case analysis, LM-TSP is as hard as TSP, and we do not have anything to gain from knowing an optimal solution to a close problem instance. By parameterizing TSP with respect to the $\beta$-triangle inequality [1, 2, 3, 4, 5] and by introducing the concept of stability of approximation [15, 5], it was shown that TSP is not as hard as it may look in the light of worst-case analyses. For any $\beta \geq \frac{1}{2}$, we have a constant polynomial-time approximation ratio, depending on $\beta$ only. Böckenhauer and Seibert [8] proved that $\Delta_\beta$-TSP is APX-hard for every $\beta > \frac{1}{2}$ (note that for $\beta = \frac{1}{2}$, the problem becomes trivially solvable in polynomial time). Here, we prove that LM-$\Delta_\beta$-TSP is NP-hard for every $\beta > \frac{1}{2}$. This implies in particular that LM-$\Delta$-TSP, too, is NP-hard. We conjecture that this problem is also APX-hard, which, so far, we have been unable to prove and thus leave as an open research problem.

(ii) For many years, Christofides' algorithm [9] with its approximation ratio of 1.5 has been the best known approximation algorithm for attacking $\Delta$-TSP. It remains a grand challenge to improve on Christofides' algorithm. We will show that, intriguingly enough, LM-$\Delta$-TSP admits an efficient 1.4-approximation algorithm. This result can be generalized to LM-$\Delta_\beta$-TSP, and the resulting approximation guarantee beats all previously known approximation algorithms for $\Delta_\beta$-TSP for all $1 < \beta < 3.34899$, which includes the practically most relevant TSP instances.

(iii) TSP with time windows is one of the fundamental problems in operations research [10]. Usually, only heuristic algorithms are used to attack it, although the question of how hard it is with respect to approximability has only been resolved in [6, 7], where even an $\Omega(n)$ lower bound on the polynomial-time approximability of $\Delta$-TSP with time windows was shown, in contrast to the constant approximability of $\Delta$-TSP. This lower bound already holds for the special case of this problem where all time windows are immediately open, a special case which we will call TSP with deadlines, or DLTSP for short. Here, we consider local-modification versions of $\Delta$-TSP with deadlines. We show that already if we only allow a single deadline to be changed, and only by an amount of one time unit, the resulting problem, LM-$\Delta$-DLTSP, has the same lower bound of $\Omega(n)$ on the approximation ratio as $\Delta$-DLTSP. Let us underscore the importance of this negative result: Not only does TSP with deadlines remain an intractable problem in its LM version, but the extra knowledge of an optimal solution to a related instance does not even help a single bit. Likewise, we will establish the lower bound of $(2 - \varepsilon)$, for any $\varepsilon > 0$, for LM-$\Delta$-DLTSP with a constant number of deadlines, the same as is known for $\Delta$-DLTSP with a constant number of deadlines [6, 7]. These results can also be obtained if, again, we modify the cost of an edge rather than a deadline.

So, on the one hand, additional information about an optimal solution to a related input instance may be useful to some extent, and on the other hand, the local-modification problem variant may remain exactly as hard as the original problem. Yet, the final aim of our paper is to call forth the investigation of


H.-J. Bockenhauer et al.

the hardness of local-modification optimization problems in order to develop approaches to handle situations where multiple (and, potentially, dynamically determined) local modifications may arise. The paper is subdivided into two main sections. In Section 2, we will analyze TSP with local modifications and present hardness results as well as approximation algorithms for the metric and near-metric case. Section 3 is devoted to inapproximability results for the local-modification version of TSP with deadlines.

2 Results for TSP

In this section, we will analyze the local-modification version of TSP. In Subsection 2.1, we will present our hardness results. In Subsection 2.2, we will present a 1.4-approximation algorithm for the local-modification metric TSP, and Subsection 2.3 is devoted to approximability results for the case of the relaxed triangle inequality. We start off with a formal definition of TSP and its local-modification variants.

Definition 1. Let $G = (V, E, c)$ be a weighted complete graph, and let $\beta \geq \frac{1}{2}$ be a real value. We say that $G$ obeys the $\Delta_\beta$-inequality iff for all vertices $x, y, z \in V$, we have

\[ c(\{x,y\}) \leq \beta \cdot \bigl( c(\{x,z\}) + c(\{y,z\}) \bigr). \tag{$\Delta_\beta$} \]

By TSP, we denote the following optimization problem. For a given weighted complete graph $G = (V, E, c)$, find a minimum-cost Hamiltonian cycle, i.e., a tour on all vertices of cost

\[ \mathrm{OPT}_G := \min \Bigl\{ \sum_{e \in C'} c(e) \Bigm| (V, C') \text{ is a Hamiltonian cycle} \Bigr\}. \]

Restricting, for some value of $\beta$, the set of admissible input instances to those which obey the $\Delta_\beta$-inequality yields the problem $\Delta_\beta$-TSP. Besides, define $\Delta$-TSP := $\Delta_1$-TSP.

Definition 2. Let $U \in \{\text{TSP}, \Delta\text{-TSP}, \Delta_\beta\text{-TSP}\}$. The problem LM-$U$ is defined as follows.

Input:
- two complete weighted graphs $G_O = (V, E, c_O)$, $G_N = (V, E, c_N)$ such that $G_O$ and $G_N$ are both admissible inputs for $U$ and such that $c_O$ and $c_N$ coincide, except for one edge;
- a Hamiltonian cycle $(V, C)$ such that $\sum_{e \in C} c_O(e) = \mathrm{OPT}_{G_O}$.

Problem: Find a Hamiltonian cycle $(V, \overline{C})$ that minimizes $\sum_{e \in \overline{C}} c_N(e)$.
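The two definitions above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names (`obeys_delta_beta`, `tour_cost`, `brute_force_opt`) and the representation of edge costs as a dictionary keyed by `frozenset` pairs are hypothetical choices, not taken from the paper, and the exhaustive search is exponential and meant for tiny instances only.

```python
from itertools import combinations, permutations


def obeys_delta_beta(vertices, cost, beta):
    """Check c({x,y}) <= beta * (c({x,z}) + c({y,z})) for all distinct x, y, z."""
    for x, y in combinations(vertices, 2):
        for z in vertices:
            if z in (x, y):
                continue
            bound = beta * (cost[frozenset((x, z))] + cost[frozenset((y, z))])
            if cost[frozenset((x, y))] > bound + 1e-12:  # small float tolerance
                return False
    return True


def tour_cost(tour, cost):
    """Cost of the Hamiltonian cycle that visits `tour` in order and returns to the start."""
    n = len(tour)
    return sum(cost[frozenset((tour[i], tour[(i + 1) % n]))] for i in range(n))


def brute_force_opt(vertices, cost):
    """Exhaustive search for a minimum-cost Hamiltonian cycle (tiny inputs only)."""
    first, rest = vertices[0], vertices[1:]
    best = min(permutations(rest), key=lambda p: tour_cost((first,) + p, cost))
    tour = (first,) + best
    return tour, tour_cost(tour, cost)
```

An LM instance in the sense of Definition 2 would then consist of two such cost dictionaries that differ in exactly one key, together with a tour returned by `brute_force_opt` for the old costs.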


2.1 Hardness Results

Before presenting approximation algorithms for LM-$\Delta$-TSP, we start by proving some hardness results. First, we will show that LM-TSP is as hard to approximate as "normal" (i.e., unaltered) TSP.

Theorem 1. There is no polynomial-time $p(n)$-approximation algorithm for LM-TSP for any polynomial $p$ (unless P = NP).

Proof idea. We will give a reduction from the Hamiltonian cycle problem (HC): Given an undirected, unweighted graph $G$, decide whether $G$ contains a Hamiltonian cycle or not. Let $G = (V, E)$ be an input instance for HC where $V = \{v_1, \ldots, v_n\}$. In order to construct an input instance $(G_O, G_N, C)$ for LM-TSP, we employ a graph construction due to Papadimitriou and Steiglitz [19], who used the same construction in order to give examples of TSP instances which are hard for local search strategies: For each vertex $v_i$, we construct a so-called diamond graph $D_i$ as shown in Figure 1 (a). These diamonds are connected as shown in Figure 1 (b). The edge costs in $G_O$ are set as follows. Let $M := n \cdot 2^n + 1$. All diamond edges shown in Figure 1 (a) and the connections from $E_i$ to $W_{i+1}$ and from $E_n$ to $W_1$ as shown in Figure 1 (b) are assigned a cost of 1 each. Edges $\{N_i, S_j\}$ are assigned a cost of 1 whenever $\{v_i, v_j\} \in E$ and a cost of $M$ otherwise. All other edges receive a cost of $M$ each. In $G_N$, the cost of the edge $\{E_n, W_1\}$ is changed from 1 to $M$. The given optimal Hamiltonian cycle $C$ is the one shown in Figure 1 (b). This optimal solution for $G_O$ has a cost of $8n$. It is easy to see that if there is a Hamiltonian cycle $H'$ in $G$, a corresponding Hamiltonian cycle $H$ in $G_N$ can traverse all diamonds from $N_i$ via $W_i$ via $E_i$ to $S_i$. Hence, $c_N(H) = 8n$. All Hamiltonian cycles in $G_N$ that do not correspond (in this way) to Hamiltonian cycles in $G$ cost at least $M + 8n - 1$. Thus, the approximation ratio of any non-optimal solution is at least as bad as $1 + 2^{n-3}$. For a more detailed description of diamond graph constructions, also see, for example, [16]. □
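The exponential gap at the end of the proof idea can be checked by a one-line calculation. The following worked equation only substitutes the value $M = n \cdot 2^n + 1$ from the construction into the ratio between the cheapest non-corresponding tour and the optimal cost $8n$:

```latex
\frac{M + 8n - 1}{8n}
  = \frac{n \cdot 2^{n} + 8n}{8n}
  = 1 + \frac{2^{n}}{8}
  = 1 + 2^{\,n-3}.
```

Since this ratio grows faster than any polynomial $p(n)$, a polynomial-time $p(n)$-approximation would have to return an optimal tour whenever $G$ is Hamiltonian, and thereby decide HC.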

Fig. 1. The diamond construction in the proof of Theorem 1.


Now, we will show that LM-$\Delta_\beta$-TSP remains a hard problem for any $\beta > \frac{1}{2}$.

Theorem 2. LM-$\Delta_\beta$-TSP is NP-hard for any $\beta > \frac{1}{2}$.

Proof. We will use a reduction from the restricted Hamiltonian cycle problem (RHC). The objective in RHC is, given an unweighted, undirected graph $G$ and a Hamiltonian path $P$ in $G$ which cannot be trivially extended to a Hamiltonian cycle by joining its end-points, to decide whether a Hamiltonian cycle in $G$ exists. This problem is well-known to be NP-complete (see, for example, [16]). The reduction uses an idea analogous to the standard reduction from the Hamiltonian cycle problem to TSP: Let $(G, P)$ be an instance of RHC where $G = (V, E)$, $V = \{v_1, \ldots, v_n\}$, and $P = (v_1, \ldots, v_n)$. From this, we construct an instance $(G_O, G_N, C)$ of LM-$\Delta_\beta$-TSP as follows: Let $G_O = (V, \tilde{E}, c_O)$ and $G_N = (V, \tilde{E}, c_N)$ where $(V, \tilde{E})$ is a complete graph, $c_O(e) = 1$ for all $e \in E \cup \{\{v_n, v_1\}\}$ and $c_O(e) = 2\beta$ otherwise, and $c_N(\{v_n, v_1\}) = 2\beta$. Let $C = (v_1, v_2, \ldots, v_n, v_1)$. Clearly, this reduction can be done in polynomial time, and it is easy to see that there is a Hamiltonian cycle in $G$ iff there is a Hamiltonian cycle of cost $n$ in $G_N$. □

2.2 The Metric Case

In what follows, we will show that LM-$\Delta$-TSP admits a $\frac{7}{5}$-approximation, which beats the naive approach of using Christofides' algorithm (which would yield a $\frac{3}{2}$-approximation), whereby the input cycle $(V, C)$ would be ignored altogether.

Theorem 3. There is a 1.4-approximation algorithm for LM-$\Delta$-TSP.

In order to prove Theorem 3, we will need the following few lemmas. Our crucial observation is that in a metric graph, all of the neighboring edges of short edges can only be modified by small amounts.

Lemma 1. Let $G_1 = (V, E, c_1)$ and $G_2 = (V, E, c_2)$ be metric graphs such that $c_1$ and $c_2$ coincide, except for one edge $e \in E$. Then, every edge adjacent to $e$ has a cost of at least $\frac{1}{2}|c_1(e) - c_2(e)|$.

Proof. We set $\{a, a'\} := \{c_1(e), c_2(e)\}$ such that $a' > a$ and $\delta := a' - a$. Let $f \in E$ be any edge adjacent to $e$, and for any such $f$, let $f' \in E$ be the one edge that is adjacent to both $e$ and $f$. Then, by the triangle inequality, we have $a' \leq c(f) + c(f')$ and $c(f') \leq c(f) + a$, and hence $a' - a \leq 2c(f)$. □

We will have to distinguish two cases. Either, an edge becomes more expensive, or it becomes less expensive. In either case, our strategy is to compare the input solution (to the old problem instance) with an approximate solution (to the new problem instance). Let us start with the latter case.


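Lemma 2. Let $(G_O, G_N, C)$ be an admissible input for LM-$\Delta$-TSP such that $\delta := c_O(e) - c_N(e) > 0$ for the edge $e$. If $\frac{\delta}{\mathrm{OPT}_{G_N}} \leq \frac{2}{5}$, it is a $\frac{7}{5}$-approximation to output the feasible solution $\overline{C} := C$ for LM-$\Delta$-TSP.

Proof.
\[ \frac{c_N(\overline{C})}{\mathrm{OPT}_{G_N}} \leq \frac{c_O(C)}{\mathrm{OPT}_{G_N}} = \frac{\mathrm{OPT}_{G_O}}{\mathrm{OPT}_{G_N}} \leq \frac{\mathrm{OPT}_{G_N} + \delta}{\mathrm{OPT}_{G_N}} = 1 + \frac{\delta}{\mathrm{OPT}_{G_N}} \leq \frac{7}{5}. \]
□

Lemma 3. Let $(G_O, G_N, C)$ be an admissible input for LM-$\Delta$-TSP such that $\delta := c_O(e) - c_N(e) > 0$ for the edge $e$. If $\frac{\delta}{\mathrm{OPT}_{G_N}} \geq \frac{2}{5}$, there is a $\frac{7}{5}$-approximation for LM-$\Delta$-TSP.

Proof. We may assume that optimal TSP tours in $G_N$ use the edge $e$. For if they did not, $C$ would already constitute an optimal solution. Fix one such optimal tour $C_{\mathrm{OPT}}$ in $G_N$. In $C_{\mathrm{OPT}}$, $e$ is adjacent to two edges $f$ and $f'$. Let $v$ be the vertex incident with $f$, but not with $e$, and let $v'$ be the vertex incident with $f'$, but not with $e$. By $P$, denote the path from $v$ to $v'$ in $C_{\mathrm{OPT}}$ that does not involve $e$.

Consider the following algorithm: For every pair $\tilde{f}, \tilde{f}'$ of disjoint edges, both of which are adjacent to $e$, compute an approximate solution to the TSP path problem on the subgraph of $G_N$ induced by the vertex set $V \setminus e$ (i.e., without two vertices) with start vertex $\tilde{v}$ and end vertex $\tilde{v}'$ where $\{\tilde{v}\} = \tilde{f} \setminus e$ and $\{\tilde{v}'\} = \tilde{f}' \setminus e$. It is known [13, 14] that this can be done with an approximation guarantee of $\frac{5}{3}$. Each of these paths is augmented by $\tilde{f}$, $e$, and $\tilde{f}'$ so as to yield a TSP tour. The algorithm concludes by outputting the least expensive of all of these tours.

Note that since all pairs $\tilde{f}, \tilde{f}'$ are taken into account, one of the considered tours uses exactly those edges $\tilde{f} = f$, $\tilde{f}' = f'$ that $C_{\mathrm{OPT}}$ uses. This is why the algorithm outputs a tour of cost at most

\[ c(f) + c(f') + c_N(e) + \tfrac{5}{3}c(P) = (\mathrm{OPT}_{G_N} - c(P)) + \tfrac{5}{3}c(P) = \mathrm{OPT}_{G_N} + \tfrac{2}{3}c(P) \]

(where $c$ is short-hand notation for $c_N$ wherever $c_O$ and $c_N$ coincide) and thus achieves an approximation guarantee of

\[ 1 + \frac{2}{3} \cdot \frac{c(P)}{\mathrm{OPT}_{G_N}}. \]

Since by Lemma 1, $\min\{c(f), c(f')\} \geq \frac{\delta}{2}$, we have $\mathrm{OPT}_{G_N} - c(P) \geq \delta$ and hence:

\[ \frac{c(P)}{\mathrm{OPT}_{G_N}} \leq \frac{\mathrm{OPT}_{G_N} - \delta}{\mathrm{OPT}_{G_N}} = 1 - \frac{\delta}{\mathrm{OPT}_{G_N}} \leq \frac{3}{5}. \]

So, we obtain an overall approximation guarantee of $1 + \frac{2}{5} = \frac{7}{5}$. □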
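A short calculation may help to see why $\frac{2}{5}$ is the natural threshold between reusing the old tour and recomputing a path. Writing $x := \delta / \mathrm{OPT}_{G_N}$ (a symbol introduced here only for this illustration), reusing $C$ guarantees a ratio of at most $1 + x$, while the path-based algorithm guarantees at most $1 + \frac{2}{3}(1 - x)$. The better of the two is never worse than $\frac{7}{5}$:

```latex
\min\Bigl\{\, 1 + x,\;\; 1 + \tfrac{2}{3}(1 - x) \Bigr\} \;\le\; \tfrac{7}{5}
\quad \text{for all } x \ge 0,
\qquad\text{since}\qquad
1 + x = 1 + \tfrac{2}{3}(1 - x) \iff x = \tfrac{2}{5}.
```

The first bound increases in $x$ and the second decreases, so the two curves cross exactly at $x = \frac{2}{5}$, where both equal $\frac{7}{5}$.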


Corollary 1. There is a $\frac{7}{5}$-approximation algorithm for the subproblem of LM-$\Delta$-TSP where edges may only become less expensive.

Proof. Compute, as laid out in Lemma 3, an approximate solution to LM-$\Delta$-TSP and compare it with the input solution $C$. Output the less expensive of the two solutions. Depending on whether the value of $\frac{\delta}{\mathrm{OPT}_{G_N}}$ (where $\delta := c_O(e) - c_N(e) > 0$) is less or greater than $\frac{2}{5}$ (which we cannot necessarily tell), one of the considered two feasible solutions is a $\frac{7}{5}$-approximation. □

We will now turn to the case where an edge becomes more expensive. We can state a lemma akin to Lemma 2, but notice that by reusing a formerly optimal solution, we incur a certain extra cost.

Lemma 4. Let $(G_O, G_N, C)$ be an admissible input for LM-$\Delta$-TSP such that $\delta := c_N(e) - c_O(e) > 0$ for the edge $e$. If $\frac{\delta}{\mathrm{OPT}_{G_N}} \leq \frac{2}{5}$, it is a $\frac{7}{5}$-approximation to output the feasible solution $\overline{C} := C$ for LM-$\Delta$-TSP.

Proof.
\[ \frac{c_N(\overline{C})}{\mathrm{OPT}_{G_N}} \leq \frac{c_O(C) + \delta}{\mathrm{OPT}_{G_N}} = \frac{\mathrm{OPT}_{G_O} + \delta}{\mathrm{OPT}_{G_N}} \leq \frac{\mathrm{OPT}_{G_N} + \delta}{\mathrm{OPT}_{G_N}} = 1 + \frac{\delta}{\mathrm{OPT}_{G_N}} \leq \frac{7}{5}. \]
□

When computing an approximate solution, things become slightly different from Lemma 3: We may assume that $e$ used to be a part of $C$ and that a new solution should no longer use it. Instead, it will use two edges $f$ and $f'$ such that $f$ and $f'$ are non-disjoint and both incident with the same vertex of $e$. This pair may be chosen at either end-point of $e$, a choice which is completely arbitrary. We conjecture that, if an improvement of the approximation guarantee is possible, this is precisely the point at which to start.

Lemma 5. Let $(G_O, G_N, C)$ be an admissible input for LM-$\Delta$-TSP such that $\delta := c_N(e) - c_O(e) > 0$ for the edge $e$. If $\frac{\delta}{\mathrm{OPT}_{G_N}} \geq \frac{2}{5}$, there is a $\frac{7}{5}$-approximation for LM-$\Delta$-TSP.

Proof. We may assume that optimal TSP tours in $G_N$ do not use the edge $e$. For if they did, $C$ would already constitute an optimal solution. Fix one such optimal tour $C_{\mathrm{OPT}}$, and fix one vertex $w$ incident with $e$. In $C_{\mathrm{OPT}}$, $w$ is incident with two edges $f$ and $f'$. Let $v$ be the vertex incident with $f$, but not with $e$, and let $v'$ be the vertex incident with $f'$, but not with $e$. By $P$, denote the path from $v$ to $v'$ in $C_{\mathrm{OPT}}$ that does not involve $w$.

Consider the following algorithm: For every pair $\tilde{f}, \tilde{f}'$ of edges incident with $w$, compute an approximate solution to the TSP path problem on the subgraph of $G_N$ induced by the vertex set $V \setminus \{w\}$ with start vertex $\tilde{v}$ and end vertex $\tilde{v}'$ where $\{\tilde{v}\} = \tilde{f} \setminus e$ and $\{\tilde{v}'\} = \tilde{f}' \setminus e$. It is known [13, 14] that this can be done


with an approximation guarantee of $\frac{5}{3}$. Each of these paths is augmented by $\tilde{f}$ and $\tilde{f}'$ so as to yield a TSP tour. The algorithm concludes by outputting the least expensive of all of these tours. Note that since all pairs $\tilde{f}, \tilde{f}'$ are taken into account, one of the considered tours uses exactly those edges $\tilde{f} = f$, $\tilde{f}' = f'$ that $C_{\mathrm{OPT}}$ uses. This is why the algorithm outputs a tour of cost at most

\[ c(f) + c(f') + \tfrac{5}{3}c(P) = (\mathrm{OPT}_{G_N} - c(P)) + \tfrac{5}{3}c(P) = \mathrm{OPT}_{G_N} + \tfrac{2}{3}c(P), \]

just as in the proof of Lemma 3. □

Using the same arguments as in the proof of Corollary 1, the preceding lemma yields the following corollary.

Corollary 2. There is a $\frac{7}{5}$-approximation algorithm for the subproblem of LM-$\Delta$-TSP where edges may only become more expensive. □

2.3 The Near-Metric Case

The algorithm outlined in Lemma 3 can be generalized to graphs which are not necessarily metric, but only near-metric, i.e., where the metricity constraint is relaxed by a factor of $\beta$. Since it will pay off later, let us pay extra attention to the fact that input instances for all the problems from Definition 2 contain two distinct graphs, potentially obeying relaxed triangle inequalities according to different values of $\beta$. Notice that the parameter $\beta$ need not be greater for the graph with the costlier edge. Under some circumstances, it might even decrease when we modify the cost of a single edge. In the following generalization of Lemma 1, the convention is therefore that $c_1$ is the cost function of the less expensive graph, $c_2$ that of the more expensive one, and both $c_i$ obey the $\Delta_{\beta_i}$-inequality, $i \in \{1, 2\}$.

Lemma 6. Let $G_1 = (V, E, c_1)$ and $G_2 = (V, E, c_2)$ be graphs such that $c_i$ obeys the $\Delta_{\beta_i}$-inequality for $i \in \{1, 2\}$ and some values $\beta_1, \beta_2 \geq 1$ and such that $c_1$ and $c_2$ coincide, except for one edge $e \in E$. By convention, let $c_1(e) < c_2(e)$. Then, every edge adjacent to $e$ has a cost of at least

\[ \frac{c_2(e) - \beta_1\beta_2 c_1(e)}{\beta_1\beta_2 + \beta_2}. \]

Proof. Analogous to Lemma 1.

D
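Since the proof is only stated as analogous to Lemma 1, the following worked derivation sketches one way to fill in the step. It is a reconstruction under the lemma's assumptions, not the authors' own text: $f$ is any edge adjacent to $e$, $f'$ is the edge adjacent to both $e$ and $f$, and $c(f)$, $c(f')$ denote the common costs of these edges in both graphs.

```latex
\begin{align*}
c_2(e) &\le \beta_2\bigl(c(f) + c(f')\bigr)
  && \text{($\Delta_{\beta_2}$-inequality in $G_2$)} \\
c(f')  &\le \beta_1\bigl(c_1(e) + c(f)\bigr)
  && \text{($\Delta_{\beta_1}$-inequality in $G_1$)} \\
\Longrightarrow\quad
c_2(e) &\le (\beta_1\beta_2 + \beta_2)\,c(f) + \beta_1\beta_2\,c_1(e) \\
\Longrightarrow\quad
c(f)   &\ge \frac{c_2(e) - \beta_1\beta_2\,c_1(e)}{\beta_1\beta_2 + \beta_2}.
\end{align*}
```

For $\beta_1 = \beta_2 = 1$, the bound reduces to $\frac{1}{2}(c_2(e) - c_1(e))$, which is the statement of Lemma 1.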

Note that for relatively small changes, the value $c_2(e) - \beta_1\beta_2 c_1(e)$ may well be non-positive, rendering Lemma 6 trivial in such a case. The algorithm from Lemmas 3 and 4 should be adjusted to accommodate the relaxation of the triangle inequality. More precisely, in order to find a Hamiltonian path between a given pair of vertices in a $\beta$-metric graph, we will employ the algorithm by Forlizzi et al. [11], a variation of the path-matching Christofides algorithm (PMCA, see [5]) for the path version of near-metric TSP, which yields an approximation guarantee of $\frac{5}{3}\beta^2$. This gives us Algorithm 1.


Algorithm 1

Input: An instance $(G_O, G_N, C)$ of LM-$\Delta_\beta$-TSP where $G_O = (V, E, c_O)$ and $G_N = (V, E, c_N)$.

1. Let $e \in E$ be the edge where $c_O(e) \neq c_N(e)$. Let $\mathcal{E}$ be the set of all unordered pairs $\{f, f'\} \subseteq E$ where $f \neq f'$ are edges adjacent to $e$ such that
   if $c_O(e) < c_N(e)$: $f \cap f' \cap e$ is a singleton; and
   if $c_O(e) > c_N(e)$: $f \cap f' = \emptyset$.
2. For all $\{f, f'\} \in \mathcal{E}$, compute a Hamiltonian path between the two vertices from $(f \cup f') \setminus e$ on the graph $G_N \setminus (e \cap (f \cup f'))$, using the PMCA path variant by Forlizzi et al. [11]. Augment this path by edges $f$, $f'$, and, if $c_O(e) > c_N(e)$, edge $e$ to obtain the cycle $C_{\{f,f'\}}$.
3. Let $\overline{C}$ be the least expensive of the cycles in the set $\{C\} \cup \{C_{\{f,f'\}} \mid \{f, f'\} \in \mathcal{E}\}$.

Output: The Hamiltonian cycle $\overline{C}$.
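The control flow of Algorithm 1 can be sketched as follows. This is a minimal illustration under explicit assumptions: all names (`edge`, `cycle_cost`, `exact_path`, `algorithm_1`) are hypothetical, edge costs are dictionaries keyed by `frozenset` pairs, at least four vertices are assumed, and, most importantly, the path subroutine `exact_path` is a brute-force stand-in. It is not the PMCA path variant of Forlizzi et al. [11], so the sketch shows the enumeration of the pairs $\{f, f'\}$ and the final comparison with $C$, not the polynomial running time or the stated guarantee.

```python
from itertools import combinations, permutations


def edge(u, v):
    return frozenset((u, v))


def cycle_cost(cycle, cost):
    n = len(cycle)
    return sum(cost[edge(cycle[i], cycle[(i + 1) % n])] for i in range(n))


def exact_path(start, end, inner, cost):
    """Stand-in path solver (assumption): cheapest Hamiltonian path from
    `start` to `end` through all vertices of `inner`, found by brute force."""
    best = None
    for perm in permutations(inner):
        path = (start,) + perm + (end,)
        c = sum(cost[edge(path[i], path[i + 1])] for i in range(len(path) - 1))
        if best is None or c < best[0]:
            best = (c, path)
    return best[1]


def algorithm_1(vertices, cost_old, cost_new, old_cycle, path_solver=exact_path):
    # Step 1: find the single modified edge e = {a, b}.
    changed = [e for e in cost_old if cost_old[e] != cost_new[e]]
    if not changed:
        return tuple(old_cycle)
    (e,) = changed
    a, b = tuple(e)
    others = [v for v in vertices if v not in e]
    candidates = [tuple(old_cycle)]  # Step 3 also considers the old tour C
    if cost_new[e] < cost_old[e]:
        # e became cheaper: f = {a, v} and f' = {b, v2} are disjoint, e is kept.
        for v, v2 in permutations(others, 2):
            inner = [x for x in others if x not in (v, v2)]
            path = path_solver(v, v2, inner, cost_new)
            candidates.append((a,) + path + (b,))  # closes via e = {b, a}
    else:
        # e became costlier: f and f' share one end-point w of e, e is avoided.
        for w in (a, b):
            rest = [x for x in vertices if x != w]
            for v, v2 in combinations(others, 2):
                inner = [x for x in rest if x not in (v, v2)]
                path = path_solver(v, v2, inner, cost_new)
                candidates.append((w,) + path)  # closes via f' = {v2, w}
    return min(candidates, key=lambda c: cycle_cost(c, cost_new))
```

With the exact stand-in solver, the returned tour is optimal for the new costs, which makes the sketch easy to check on small instances; replacing `path_solver` by an approximate path algorithm yields the trade-off analyzed in Lemma 7.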

Lemma 7. Algorithm 1 achieves an approximation guarantee of

\[ \beta_L\beta_H \cdot \frac{15\beta_L^2 + 5\beta_L - 6}{10\beta_L^2 + 3\beta_L\beta_H + 3\beta_H - 6} \tag{1} \]

for input graph pairs $(G_O, G_N)$ such that $G_O$ obeys the $\Delta_{\beta_O}$-inequality and $G_N$ obeys the $\Delta_{\beta_N}$-inequality and where $\beta_L := \min\{\beta_O, \beta_N\}$ and $\beta_H := \max\{\beta_O, \beta_N\}$.

Proof. Adhering to the convention of Lemma 6, set $\{c_1, c_2\} = \{c_O, c_N\}$ such that $c_1(e) \leq c_2(e)$ for all edges $e \in E$. In other words, we have $c_2 = c_N$ if an edge becomes more expensive and $c_1 = c_N$ otherwise. We may assume that optimal TSP tours in $G_N = (V, E, c_N)$ use the edge $e$ iff $c_N = c_1$; otherwise, $C$ is an optimal solution, and we are done. Fix one such optimal tour $C_{\mathrm{OPT}}$ in $G_N$, and let $\{f, f'\} \in \mathcal{E}$ be such that $C_{\mathrm{OPT}}$ uses both $f$ and $f'$. By $P$, denote the path that results from $C_{\mathrm{OPT}}$ by removing edges $f$, $f'$, and, potentially, $e$. Set

\[ \alpha := \frac{c(P)}{\mathrm{OPT}_{G_N}} \]

and let, for brevity,

\[ \vartheta := \beta_L\beta_H \cdot \frac{15\beta_L^2 + 5\beta_L - 6}{10\beta_L^2 + 3\beta_L\beta_H + 3\beta_H - 6} \]

denote the approximation guarantee claimed in (1). In terms of $\alpha$, Algorithm 1 always achieves an approximation guarantee of

\[ \underbrace{1 - \alpha}_{\text{edges } f,\, f',\, \text{(potentially) } e \text{ are chosen optimally}} + \underbrace{\tfrac{5}{3}\beta_L^2\,\alpha}_{P \text{ will be approximated}} \]

even if we did not have $C$ at our disposal. (Note that the strategy to approximate $P$ may rely on the $\Delta_{\beta_L}$-inequality, i.e., the less relaxed one of the two, because this strategy removes the edge $e$ from the graph.) Hence, unless


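\[ \alpha > \frac{\vartheta - 1}{\frac{5}{3}\beta_L^2 - 1}, \tag{2} \]

we are done. Let us therefore assume that (2) holds. By Lemma 6, we have

\[ \min\{c(f), c(f')\} \geq \frac{c_2(e) - \beta_1\beta_2 c_1(e)}{\beta_1\beta_2 + \beta_2} \geq \frac{c_2(e) - \beta_L\beta_H c_1(e)}{\beta_L\beta_H + \beta_H} \]

and hence

\[ \alpha \leq 1 - \frac{2 \cdot (c_2(e) - \beta_L\beta_H c_1(e))}{\mathrm{OPT}_{G_N} \cdot (\beta_L\beta_H + \beta_H)}. \]

Putting this together with (2), we know that

\[ \frac{\vartheta - 1}{\frac{5}{3}\beta_L^2 - 1} < 1 - \frac{2 \cdot (c_2(e) - \beta_L\beta_H c_1(e))}{\mathrm{OPT}_{G_N} \cdot (\beta_L\beta_H + \beta_H)}, \]

which yields

\[ \frac{c_2(e) - \beta_L\beta_H c_1(e)}{\mathrm{OPT}_{G_N}} < \frac{\beta_L\beta_H + \beta_H}{2} - \frac{(\vartheta - 1) \cdot (\beta_L\beta_H + \beta_H)}{\frac{10}{3}\beta_L^2 - 2}. \]

By adding $(\beta_L\beta_H - 1)\frac{c_1(e)}{\mathrm{OPT}_{G_N}}$ to both sides, we are given:

\[ \frac{c_2(e) - c_1(e)}{\mathrm{OPT}_{G_N}} < \frac{\beta_L\beta_H + \beta_H}{2} - \frac{(\vartheta - 1) \cdot (\beta_L\beta_H + \beta_H)}{\frac{10}{3}\beta_L^2 - 2} + (\beta_L\beta_H - 1) \cdot \underbrace{\frac{c_1(e)}{\mathrm{OPT}_{G_N}}}_{\leq 1} \]

and thus, substituting the value (1) for $\vartheta$,

\begin{align*}
\frac{c_2(e) - c_1(e)}{\mathrm{OPT}_{G_N}}
 &< \tfrac{3}{2}\beta_L\beta_H + \tfrac{1}{2}\beta_H - 1 - \frac{(\vartheta - 1) \cdot (\beta_L\beta_H + \beta_H)}{\frac{10}{3}\beta_L^2 - 2} \\
 &= \text{(tedious calculations)} = \cdots \\
 &= \beta_L\beta_H \cdot \frac{15\beta_L^2 + 5\beta_L - 6}{10\beta_L^2 + 3\beta_L\beta_H + 3\beta_H - 6} - 1 = \vartheta - 1.
\end{align*}

Since, by the same reasoning as that of Lemmas 2 and 4, reusing the input optimal solution $C$ inflicts a deviation from the new optimum by at most $c_2(e) - c_1(e) < (\vartheta - 1) \cdot \mathrm{OPT}_{G_N}$, Algorithm 1 is a $\vartheta$-approximation algorithm. □

Hence, whenever the $\beta$ values of $G_O$ and $G_N$ coincide, we have Theorem 4.

Theorem 4. There is a (polynomial-time) $\beta^2 \cdot \frac{15\beta^2 + 5\beta - 6}{13\beta^2 + 3\beta - 6}$-approximation algorithm for LM-$\Delta_\beta$-TSP.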


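Fig. 2. Approximation guarantees of various algorithms, depending on $\beta$. (Plot of the approximation guarantee against the parameter $\beta$ for $1 \leq \beta \leq 3.5$; the labeled curves include $\beta^2 + \beta$ and the guarantee of Corollary 3.)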

Interestingly, Algorithm 1 achieves a better approximation guarantee not just than PMCA [5], but also than Bender's and Chekuri's $4\beta$-approximation algorithm [3] for the most practically relevant values of $\beta$. The turning point is at about $\beta^* \approx 3.34899$. More to the point, Andreae's $(\beta^2 + \beta)$-approximation [1], which performs better than $4\beta$ only when $\beta < 3$, always performs worse than Algorithm 1 in the interval $\beta \in (1, \beta^*)$. These observations are illustrated in Figure 2. Another practical special case is that where $\beta_L = 1$, i.e., where we start with a metric graph, but changing the cost of an edge will violate the $\Delta$-inequality.

Corollary 3. LM-$\Delta_\beta$-TSP, restricted to those inputs where $G_O$ is metric, admits a $\frac{7\beta}{3\beta + 2}$-approximation. □
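The comparisons above can be reproduced numerically. The sketch below evaluates the guarantee (1) of Lemma 7 and locates, by bisection, the value of $\beta$ at which it meets the $4\beta$ guarantee; the function names `theta` and `crossover` and the bisection interval $[1, 5]$ are choices made for this illustration only.

```python
def theta(beta_l, beta_h):
    """Guarantee (1) of Lemma 7 for beta_L <= beta_H."""
    num = 15 * beta_l**2 + 5 * beta_l - 6
    den = 10 * beta_l**2 + 3 * beta_l * beta_h + 3 * beta_h - 6
    return beta_l * beta_h * num / den


def crossover(lo=1.0, hi=5.0, steps=60):
    """Bisection for the beta in [lo, hi] where theta(beta, beta) equals 4 * beta."""
    def f(b):
        return theta(b, b) - 4 * b
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid  # sign change lies in [lo, mid]
        else:
            lo = mid  # sign change lies in [mid, hi]
    return (lo + hi) / 2
```

For equal parameters, `theta(1, 1)` gives the metric guarantee of 1.4, and `theta(1, beta)` agrees with the closed form $\frac{7\beta}{3\beta + 2}$ of the metric-start special case; `crossover()` converges to a value close to the turning point $\beta^*$ quoted in the text.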

3 Deadline TSP

In this section, we will analyze the approximability of local-modification variants of TSP with deadlines. To begin with, let us define this problem formally.

Definition 3. Let $G = (V, E)$ be a complete graph weighted by $c \colon E \to \mathbb{N}^+$. We call $(s, D, d)$ a deadline set for $G$ if $s \in V$, $D \subseteq V \setminus \{s\}$ and $d \colon D \to \mathbb{N}^+$. A vertex $v \in D$ is called a deadline vertex. A path $(v_0, v_1, \ldots, v_n)$ satisfies the deadlines iff $s = v_0$ and, for all $v_i \in D$, we have $\sum_{j=1}^{i} c(\{v_{j-1}, v_j\}) \leq d(v_i)$. A cycle $(v_0, v_1, \ldots, v_n, v_0)$ satisfies the deadlines iff it contains a path $(v_0, v_1, \ldots, v_n)$ satisfying the deadlines.

Definition 4. The problem $\Delta_\beta$-DLTSP is defined as follows: For a given complete graph $G = (V, E)$ with edge weights $c \colon E \to \mathbb{N}^+$ satisfying the $\Delta_\beta$-inequality, deadlines $(s, D, d)$ for $G$, and a Hamiltonian cycle satisfying the deadlines¹, find a minimum-weight Hamiltonian cycle satisfying all deadlines. If $|D|$ is a constant $k$, the resulting subproblem is $k$-$\Delta_\beta$-DLTSP. We set $\Delta$-DLTSP := $\Delta_1$-DLTSP and $k$-$\Delta$-DLTSP := $k$-$\Delta_1$-DLTSP for all $k$.

In the case of TSP with deadlines, we will regard it as a local modification to change a single deadline although the LM operation from the previous section would let us obtain exactly the same results. The connection between these two LM operations will be presented in detail in the journal version of this paper.

Definition 5. The optimization problem LM-DLTSP is defined as:

Input: A complete weighted graph $G = (V, E, c)$, deadlines $O = (s, D, d_O)$ for $G$ with a minimal Hamiltonian cycle satisfying the deadlines $O$, new deadlines $N = (s, D, d_N)$ such that $d_O$ and $d_N$ differ in exactly one vertex, and a Hamiltonian cycle satisfying $N$.

Problem: Find a minimum-cost Hamiltonian cycle satisfying $N$.

By LM-$k$-DLTSP, LM-$\Delta$-DLTSP, LM-$k$-$\Delta$-DLTSP, LM-$\Delta_\beta$-DLTSP, LM-$k$-$\Delta_\beta$-DLTSP, we denote the canonical special cases of LM-DLTSP. For our proofs, we will need some reductions from the following problem, which can easily be shown to be NP-hard analogously to the proof of the NP-hardness of the restricted Hamiltonian cycle problem, as presented, e.g., in [16].

Definition 6. For a given graph $G = (V, E)$, $s, t \in V$ and a given Hamiltonian path $P$ from $s$ to $t$, the problem RHP is to decide whether $G$ contains a Hamiltonian path starting in $s$, but ending in some vertex $v \neq t$.

3.1 Bounded Number of Deadline Vertices

We start with the case where only few deadline vertices occur. Note that $k$-$\Delta$-DLTSP can be approximated within a ratio of 2.5 [6, 7]. Furthermore, a lower bound of $2 - \varepsilon$ on the approximability, for every $\varepsilon > 0$, can be proved [6, 7]. We will show that this lower bound also holds for LM-$k$-$\Delta$-DLTSP.

Theorem 5. Let $\varepsilon > 0$. There is no polynomial-time $(2 - \varepsilon)$-approximation algorithm for the subproblem of LM-$k$-$\Delta$-DLTSP where one deadline is increased by $\xi$ time units, $\xi \geq 1$, unless P = NP.

Proof. By means of a reduction, we will show that such an approximation algorithm could be used to solve RHP. Let $\varepsilon > 0$. Let $(G', P)$ be an input instance for RHP where $G' = (V', E')$, $|V'| = n + 1$, $s', t' \in V'$, and $P$ is a Hamiltonian path from $s'$ to $t'$. Pick a $\gamma$ large enough that $\frac{4\gamma + n - 1}{2\gamma + 3n + 1} > 2 - \varepsilon$.

¹ Requiring a feasible Hamiltonian cycle as part of the input ensures that the problem is in NPO. Otherwise, it would even be a hard problem to find a feasible solution. For details, see [6, 7].

Fig. 3. Increasing a deadline. All vertices $v' \in V' \setminus \{s', t'\}$ are connected like $v$.

We construct a complete weighted graph $G = (V, E, c)$ as part of an input for LM-$k$-$\Delta$-DLTSP as shown in Figure 3: We set $V := V' \cup \{s, D_1, D_2\}$, and, for any edge $e$ between two vertices $w_1, w_2 \in V'$, let $c(e) = 1$ if $e \in E'$ and $c(e) = 2$ otherwise. All edges depicted in Figure 3 have the indicated costs while non-depicted edges obtain maximal possible costs. For these deadlines, one optimal solution $C$ is the cycle $s, D_1, D_2, t', \ldots, s', s$, which uses the Hamiltonian path $P$ from $s'$ to $t'$ in $G'$. It costs exactly $\gamma - 1 + \gamma + \gamma + n + \gamma = 4\gamma + n - 1$. All other feasible solutions visit some vertices in $V'$ between $s$ and $D_1$, but cost at least the amount of 1 more.

Now, we increase $d(D_1)$ by $\xi$. If $G'$ contains a Hamiltonian path $P$ from $s'$ to some vertex $v \neq t'$, a new optimal solution is $s, P, D_1, D_2, s$, and it costs $\gamma + n + 1 + \gamma + 2n = 2\gamma + 3n + 1$. If $G'$ does not contain such a path, it is not possible to visit all vertices in $V'$ before reaching $D_1$ and $D_2$. As $c(\{t', D_1\}) \geq 2$, we cannot follow the given Hamiltonian path $P$ because this would violate the deadline $d(D_2)$. Similar arguments hold for every other possibility. Hence, $C$ remains an optimal solution in this case. Thus, we could use any approximation algorithm with an approximation guarantee better than

\[ \frac{4\gamma + n - 1}{2\gamma + 3n + 1} > 2 - \varepsilon \]

to solve RHP. This is why approximating this subproblem of LM-$k$-$\Delta$-DLTSP within $2 - \varepsilon$ is NP-hard for all $k \geq 2$. □

Theorem 6. Let $\varepsilon > 0$. There is no polynomial-time $(2 - \varepsilon)$-approximation algorithm for the subproblem of LM-$k$-$\Delta$-DLTSP where one deadline is decreased by $\xi$ time units, $\xi \geq 1$, unless P = NP.

Proof. Let $\varepsilon > 0$. Like in the preceding proof, we will use a reduction from RHP. Let $(G', P)$ be an input instance for RHP where $G' = (V', E')$, $|V'| = n + 1$, $s', t' \in V'$, and $P$ is a Hamiltonian path from $s'$ to $t'$. Pick some $\gamma$ such that $\frac{4\gamma}{2\gamma + 8n} > 2 - \varepsilon$.

We construct a complete weighted graph $G = (V, E, c)$ as part of an input for LM-$k$-$\Delta$-DLTSP as shown in Figure 4: We set $V := V' \cup \{s, D_1, D_2, D_3, D_4\}$, and, for any edge $e$ between two vertices $v_1, v_2 \in V'$, let $c(e) = 1$ if $e \in E'$ and

Fig. 4. Decreasing a deadline. All vertices $v' \in V' \setminus \{s', t'\}$ are connected like $v$.

$c(e) = 2$ otherwise. All edges depicted in Figure 4 have the indicated costs while non-depicted edges obtain maximal possible costs. The initial deadlines are depicted in Figure 4. In this setting, an optimal solution is the cycle $s, D_2, D_1, t', \ldots, s', D_3, D_4, s$, which contains the Hamiltonian path from $s'$ to $t'$. This path costs $2n + \gamma - 1$ on its way to $G'$, spends $n$ on the path from $t'$ to $s'$, and reaches $s$ at time $2\gamma + 8n - 1$.

Now, we decrease the deadline $d(D_1)$ by $\xi$, whereby the old optimal solution becomes infeasible. Any new solution must visit $D_1$ before $D_2$. If we try to reuse the Hamiltonian path from $t'$ to $s'$, we have to spend $2n + \gamma + 1$ on the way to $t'$. Therefore, we cannot reach $D_3$ if we follow the complete Hamiltonian path. Furthermore, we cannot visit any vertex $v \in V'$ between visiting $D_3$ and $D_4$ because $D_3$ is not reached before $4n + \gamma$, going back to $V'$ would cost another $2n$, and the cheapest path from $V'$ to $D_4$ costs more than $\gamma$. This is why any solution using a Hamiltonian path between $s'$ and $t'$ violates one of the deadlines $d(D_3)$, $d(D_4)$.

If $G'$ contains a Hamiltonian path $P$ from $s'$ to some $v \neq t'$, the new optimal solution contains this path in reverse on its way to $D_3$. The path $s, D_1, D_2, P, D_3$ visits all vertices in $V'$ between $v$ and $s'$ and reaches $D_3$ at time $\gamma + 5n$. Therefore, this new optimal solution costs $2\gamma + 8n$. If $G'$ does not contain such a Hamiltonian path, the optimal solution cannot visit all vertices in $V'$ before reaching $D_3$ or even $D_4$, and consequently, it is more expensive than $4\gamma$. Thus, we could use an approximation algorithm with an approximation guarantee better than

\[ \frac{4\gamma}{2\gamma + 8n} > 2 - \varepsilon \]

to solve RHP. Hence, approximating this subproblem of LM-$k$-$\Delta$-DLTSP within $2 - \varepsilon$ is NP-hard. □

3.2 Unbounded Number of Deadline Vertices

When the number of deadline vertices is unbounded, we can show a linear lower bound on the approximability of LM-$\Delta$-DLTSP. Our reduction from RHP involves two steps. A first construction will guarantee that an optimal path becomes shorter by a constant factor if a Hamiltonian path exists in the RHP


instance. A second construction inflates this advantage. Tours which start at time $X$, different from those that start between times $X + g$ and $X + \zeta g$, may spend some extra time to visit a group of vertices which, unless visited early, will cause belated tours to run $k$ times zigzag across a huge distance $\gamma$. The following lemma describes the construction in detail. See Figure 5 for an overview.

Lemma 8. Let $X, g, k, \gamma, \zeta \in \mathbb{N}$ such that $k$ is even, $\zeta \geq 1$ and $\gamma > g$. Let $G' = (V', E')$ be a graph with deadline set $(s, D', d')$ such that any Hamiltonian path in $G'$ respecting the deadlines ends in the same vertex $t$. Then, we can construct a complete graph $G \supseteq G'$ and deadlines $(s, D, d)$ such that $D \supseteq D'$, $d|_{D'} = d'$ and any path that reaches $t$ in time $X$ can be extended to a Hamiltonian cycle which costs at most $X + (k + 2\zeta - 4)g + 2\gamma$, while any path that reaches $t$ after $X + g$, but before $X + \zeta g$ can only be extended to a Hamiltonian cycle which costs at least k-2,

X

+ C 5 + ^7

Proof. We construct $G = (V, E)$ with $V = V' \cup \{E_1, \ldots, E_k\}$ and edge costs as depicted in Figure 5, where $b := g(\zeta - \frac{1}{2})$. To all other edges, we assign maximal possible costs. Note that the edge $\{t, E_1\}$ costs exactly the same as the path $E_{k-1}, E_{k-3}, \ldots, E_1$. We set the deadlines

Fig. 5. The zigzag construction for the proof of Lemma 8. The left-hand side shows the optimal path if $t$ is reached at time $X$. The right-hand side shows the optimal solution if $t$ is reached after $X + g$. We set $b := g(\zeta - \frac{1}{2})$ and $d(E_{i+1}) := d(E_i) + \gamma$.


d{Ei):=X

+ Cg+(^+c)g

d{Ei+i) :=^ diEi) + J


and

for alH e { 1 , . . . ,fc - 1} .

If a path reaches $t$ after $X + g$, it must proceed immediately to $E_1$. Note that it cannot use any other edge since it would then have to use an edge of an additional cost of at least $b = g(\zeta - \frac{1}{2}) > g(\zeta - 1)$. Together with even the shortest path to $E_1$, this would violate this deadline. But then, it is forced to follow the sequence $E_2, E_3, \ldots, E_k$ to reach every deadline since even if we visited $E_3$ before $E_2$, we would incur an extra cost of $b$, and this would violate the deadline of $E_2$. Hence, the Hamiltonian cycle costs at least X + g + ( ^ ^ + C,)g + ^7. A path that visits $t$ before $X$ can visit $E_{k-1}, E_{k-3}, \ldots, E_3$ before $E_1$ because this path to $E_1$ costs at most

X + b+{^-2)g

+ b- X +

Cg+{'^+C]9
Closing the cycle to s, we obtain a cost of at most X + C9 +

k

+ ( 9 +

1 U + 27 = X + (fc + 2C - 4)5 + 27 D

We will now employ Lemma 8 to prove the desired lower bound.

Theorem 7. Let $\varepsilon > 0$. There is no polynomial-time $((\frac{1}{2} - \varepsilon) \cdot |V|)$-approximation algorithm for the subproblem of LM-$\Delta$-DLTSP where one deadline is increased by $\xi \geq 1$, unless P = NP.

Fig. 6. Increasing a deadline: If the deadline for the vertex $D_1$ is increased, using a Hamiltonian path from $s$ to $v$ leads to a new optimal solution. (Recoverable figure labels: $d_O(D_1) = 3n - 1$ and $d_N(D_1) = 3n - 1 + \xi$.)


Proof. By means of a reduction, we will show that such an approximation algorithm could be used to solve RHP. Let $(G', P)$ be an input instance for RHP, where $G' = (V', E')$, $|V'| = n + 1$, $s, t \in V'$, and $P$ is a Hamiltonian path from $s$ to $t$. We construct a complete weighted graph $G = (V, E, c)$ as part of an input for LM-$\Delta$-DLTSP as shown in Figure 6: We set $V = V' \cup \{D_1, \ldots, D_6\}$ and, for any edge $e$ between two vertices $v_1, v_2 \in V'$, $c(e) = 1$ if $e \in E'$, and $c(e) = 2$ otherwise. To the other edges, we assign costs as depicted in Figure 6, and maximal possible costs to the non-depicted edges, and we set the deadlines $d_O(D_i)$ according to Figure 6. Pick some suitable $0 < \delta < 1$ and $0 < \alpha < 1$ such that $\frac{\alpha}{2 + \delta} \ge \frac{1}{2} - \varepsilon$. We use the zigzag construction defined in Lemma 8 with parameters $X = 10n$, $g = 2n$, $\zeta = 2$, $k > (n + 7) \cdot \frac{\alpha}{1 - \alpha}$, and $\gamma > \frac{2kn + 10n}{\delta}$ to obtain the graph $G_O$ of our input instance. This guarantees $2kn + 10n < \delta\gamma$ and $k > \alpha(k + n + 6)$.

The given optimal Hamiltonian tour $C$ in $G_O$ starts in $s$, uses the given Hamiltonian path in $G'$ to $t$, and afterwards follows the sequence $D_1, D_2, D_3, D_4, D_5, D_6$. Hence, it reaches $D_6$ in time $13n$. Following the zigzag construction, this leads to a cost of at least lOn + ( ^ ^ + C) 9 + ^7.

In $G_N$, we change the deadline for $D_1$ to $d_N(D_1) = 3n - 1 + \xi$ for some $\xi \ge 1$. $C$ remains a feasible solution. If $G'$ contains a Hamiltonian path from $s$ to some vertex $v \ne t$, an optimal solution uses this path and follows the sequence D2, Di, D^, D^, D4, DQ. This solution reaches $D_6$ in time $10n$. By Lemma 8, this cycle costs $10n + (k + 2\zeta - 4)g + 2\gamma$.

If $G'$ does not contain any Hamiltonian path to such a vertex $v$, $C$ remains the optimal solution in the case where $\xi = 1$. If $\xi \ge 2$, an optimal solution follows $P$ to $t$ and afterwards uses the sequence $D_2, D_1, D_3, D_4, D_5, D_6$. This solution reaches $D_6$ in time $12n + 1 > X + g$. By Lemma 8, we obtain a cost of lOn + ( ^ ^ -I- ()g + k^.

Since this cost is at least $k\gamma$, this leads to a ratio of at least
$$\frac{k\gamma}{10n + (k + 4 - 4) \cdot 2n + 2\gamma} = \frac{k\gamma}{2kn + 10n + 2\gamma}.$$

Hence, a polynomial-time $((\frac{1}{2} - \varepsilon)|V|)$-approximation algorithm could be used to solve RHP. □

Theorem 8. Let $\varepsilon > 0$. There is no polynomial-time $((\frac{1}{2} - \varepsilon)|V|)$-approximation algorithm for the subproblem of LM-$\Delta$-DLTSP where one deadline is decreased by $\xi \ge 1$ unless P = NP.

Proof idea. The proof can be done in a way similar to the proof of Theorem 7. The relevant construction is illustrated in Figure 7. Details will be given in a journal version of this paper. □

Corollary 4. Let $\varepsilon > 0$. There is no polynomial-time $((\frac{1}{2} - \varepsilon)|V|)$-approximation algorithm for LM-$\Delta$-DLTSP unless P = NP. □

Reusing Optimal TSP Solutions for Locally Modified Input Instances


Fig. 7. Decreasing a deadline: If the deadline for the vertex D2 is decreased, the old optimal solution (depicted on the left-hand side) becomes infeasible. If G' contains a Hamiltonian path from s to v, we obtain the depicted new optimal solution. If no such Hamiltonian path exists, the new optimal solution must follow D2,Di,D3,Dz,Di,De.

4 Conclusion

In this work, we have introduced and successfully applied the concept of reusing optimal solutions when input instances are locally modified. In the case of metric TSP, we are able to improve on the previously-known upper bound of 1.5, as achieved by Christofides' algorithm (applied to the new instance, ignoring the given optimal solution), with non-trivial extensions to the near-metric case. As for TSP with deadlines, which is remarkably hard [6], we have been able to reestablish almost all known lower bounds on the approximability of its variants in the setting of local modifications. As an open problem, we state the question of how hard it is to approximate LM-$k$-$\Delta_\beta$-DLTSP. Another open problem is whether the NP-hard LM-$\Delta$-TSP is also APX-hard.

References

1. T. Andreae: On the traveling salesman problem restricted to inputs satisfying a relaxed triangle inequality. Networks 38, 2001, pp. 59-67.
2. T. Andreae, H.-J. Bandelt: Performance guarantees for approximation algorithms depending on parameterized triangle inequalities. SIAM Journal on Discrete Mathematics 8, 1995, pp. 1-16.
3. M. Bender, C. Chekuri: Performance guarantees for TSP with a parameterized triangle inequality. Information Processing Letters 73, 2000, pp. 17-21.
4. H.-J. Böckenhauer, J. Hromkovič, R. Klasing, S. Seibert, W. Unger: Approximation algorithms for TSP with sharpened triangle inequality. Information Processing Letters 75, 2000, pp. 133-138.
5. H.-J. Böckenhauer, J. Hromkovič, R. Klasing, S. Seibert, W. Unger: Towards the notion of stability of approximation for hard optimization tasks and the traveling salesman problem. Theoretical Computer Science 285, 2002, pp. 3-24.
6. H.-J. Böckenhauer, J. Hromkovič, J. Kneis, J. Kupke: On the parameterized approximability of TSP with deadlines. Theory of Computing Systems, to appear.
7. H.-J. Böckenhauer, J. Hromkovič, J. Kneis, J. Kupke: On the approximation hardness of some generalizations of TSP. Proc. SWAT 2006, to appear.
8. H.-J. Böckenhauer, S. Seibert: Improved lower bounds on the approximability of the traveling salesman problem. RAIRO Theoretical Informatics and Applications 34, 2000, pp. 213-255.
9. N. Christofides: Worst-case analysis of a new heuristic for the travelling salesman problem. Technical Report 388, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh, 1976.
10. J.-F. Cordeau, G. Desaulniers, J. Desrosiers, M. M. Solomon, F. Soumis: VRP with time windows. In: P. Toth, D. Vigo (eds.): The Vehicle Routing Problem, SIAM 2001, pp. 157-193.
11. L. Forlizzi, J. Hromkovič, G. Proietti, S. Seibert: On the stability of approximation for Hamiltonian path problems. Algorithmic Operations Research 1(1), 2006, pp. 31-45.
12. H. Greenberg: An annotated bibliography for post-solution analysis in mixed integer and combinatorial optimization. In: D. L. Woodruff (ed.): Advances in Computational and Stochastic Optimization, Logic Programming, and Heuristic Search, Kluwer Academic Publishers, 1998, pp. 97-148.
13. N. Guttmann-Beck, R. Hassin, S. Khuller, B. Raghavachari: Approximation algorithms with bounded performance guarantees for the clustered traveling salesman problem. Algorithmica 28, 2000, pp. 422-437.
14. J. A. Hoogeveen: Analysis of Christofides' heuristic: Some paths are more difficult than cycles. Operations Research Letters 10, 1991, pp. 291-295.
15. J. Hromkovič: Stability of approximation algorithms for hard optimization problems. Proc. SOFSEM'99, Springer LNCS 1725, 1999, pp. 29-47.
16. J. Hromkovič: Algorithmics for Hard Problems. Introduction to Combinatorial Optimization, Randomization, Approximation, and Heuristics. Springer 2003.
17. M. Libura: Sensitivity analysis for minimum Hamiltonian path and traveling salesman problems. Discrete Applied Mathematics 30, 1991, pp. 197-211.
18. M. Libura, E. S. van der Poort, G. Sierksma, J. A. A. van der Veen: Stability aspects of the traveling salesman problem based on k-best solutions. Discrete Applied Mathematics 87, 1998, pp. 159-185.
19. Ch. Papadimitriou, K. Steiglitz: Some examples of difficult traveling salesman problems. Operations Research 26, 1978, pp. 434-443.
20. Y. N. Sotskov, V. K. Leontev, E. N. Gordeev: Some concepts of stability analysis in combinatorial optimization. Discrete Applied Mathematics 58, 1995, pp. 169-190.
21. S. Van Hoesel, A. Wagelmans: On the complexity of postoptimality analysis of 0/1 programs. Discrete Applied Mathematics 91, 1999, pp. 251-263.

Spectral Partitioning of Random Graphs with Given Expected Degrees

Amin Coja-Oghlan^1, Andreas Goerdt^2, and Andre Lanka^2

^1 Humboldt-Universität zu Berlin, Institut für Informatik, Unter den Linden 6, 10099 Berlin, Germany, coja@informatik.hu-berlin.de

^2 Fakultät für Informatik, Technische Universität Chemnitz, Straße der Nationen 62, 09107 Chemnitz, Germany, {goerdt, lanka}@informatik.tu-chemnitz.de

Abstract. It is a well established fact that, in the case of classical random graphs like (variants of) $G_{n,p}$ or random regular graphs, spectral methods yield efficient algorithms for clustering (e.g. colouring or bisection) problems. The theory of large networks emerging recently provides convincing evidence that such networks, albeit looking random in some sense, cannot sensibly be described by classical random graphs. A variety of new types of random graphs have been introduced. One of these types is characterized by the fact that we have a fixed expected degree sequence, that is, for each vertex its expected degree is given. Recent theoretical work confirms that spectral methods can be successfully applied to clustering problems for such random graphs, too, provided that the expected degrees are not too small, in fact $\ge \log^6 n$. In this case, however, the degree of each vertex is concentrated about its expectation. We show how to remove this restriction and apply spectral methods when the expected degrees are bounded below just by a suitable constant. Our results rely on the observation that techniques developed for the classical sparse $G_{n,p}$ random graph (that is $p = c/n$) can be transferred to the present situation when we consider a suitably normalized adjacency matrix: We divide each entry of the adjacency matrix by the product of the expected degrees of the incident vertices. Given the host of spectral techniques developed for $G_{n,p}$, this observation should be of independent interest.

1 Introduction

For definiteness we specify the model of random graphs to be considered first. This model is very similar to that considered and convincingly motivated in [9]. (In particular, we refer to Subsection 1.3 of that paper where the model is defined.)

Please use the following format when citing this chapter: Coja-Oghlan, A., Goerdt, A., Lanka, A., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 271-282.


1.1 The model

Our random graphs with planted partition and given expected degree sequence are generated as follows. Let $V = \{1, \ldots, n\}$ be the set of nodes. Partition $V$ into $k$ disjoint subsets $V_1, \ldots, V_k$, where $k$ is fixed. We assume that the size of each set satisfies $|V_j| \ge \delta n$ for some arbitrarily small but constant $\delta > 0$. For $i \in V$ we let $\psi(i)$ denote the number of the subset $i$ belongs to, that is $i \in V_{\psi(i)}$. We fix some symmetric $k \times k$-matrix $D = (d_{ij})$ with non-negative constants as entries. Moreover, we assign some weight $w_i$ to each node $i \in V$. We let $W = \sum_{i \in V} w_i$ and $\bar w = W/n$ be the arithmetic mean of the $w_i$'s. We construct the random graph $G = (V, E)$ by inserting each edge $\{i, j\}$ independently with probability $w_i \cdot w_j \cdot d_{\psi(i),\psi(j)} / W$. Of course the parameters should be chosen such that each probability is bounded above by 1. (It has some mild technical advantages to allow for loops as we do.)

Depending on the matrix $D$, we can model a variety of random instances of clustering problems. For example we can generate 3-colourable graphs, in which case the $V_j$ are the colour classes, or graphs having a small bisection, in which case the $V_j$ are the two sides of the bisection, or graphs with subsets of vertices which are very dense or sparse. The algorithmic problem is to efficiently reconstruct the $V_j$ (or large parts thereof) given such a random $G$. Note that the model from [9] allows for directed edges where the minimum expected in- and out-degree of each vertex is log n. We restrict our attention to undirected graphs. We denote the expected degree of vertex $i$ by $w'_i$, then
$$w'_i = w_i \cdot \sum_{j \in V} \frac{w_j \cdot d_{\psi(i),\psi(j)}}{W}.$$
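The generation process and the expected-degree formula can be sketched in a few lines. This is a minimal illustration in plain Python, not code from the paper: the function names, the toy weights, and the 0-based class labels in `psi` are assumptions made for the example.

```python
import random

def edge_probability(i, j, w, psi, D, W):
    """Probability w_i * w_j * d_{psi(i),psi(j)} / W of inserting the edge {i, j}."""
    return w[i] * w[j] * D[psi[i]][psi[j]] / W

def expected_degrees(w, psi, D):
    """Expected degree w'_i = w_i * sum_j w_j * d_{psi(i),psi(j)} / W (loops allowed)."""
    n, W = len(w), sum(w)
    return [sum(edge_probability(i, j, w, psi, D, W) for j in range(n))
            for i in range(n)]

def sample_graph(w, psi, D, rng):
    """Sample a symmetric 0/1 adjacency matrix; every pair i <= j is decided independently."""
    n, W = len(w), sum(w)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            p = edge_probability(i, j, w, psi, D, W)
            if not 0.0 <= p <= 1.0:
                raise ValueError("parameters must keep every probability in [0, 1]")
            if rng.random() < p:
                A[i][j] = A[j][i] = 1
    return A

# Toy instance (hypothetical values): two classes of two vertices each.
w = [2.0, 2.0, 2.0, 2.0]
psi = [0, 0, 1, 1]
D = [[1.0, 0.5], [0.5, 1.0]]
A = sample_graph(w, psi, D, random.Random(0))
```

For this toy instance every vertex has expected degree 0.5 + 0.5 + 0.25 + 0.25 = 1.5, which is what `expected_degrees(w, psi, D)` returns.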

In order for our algorithm to work properly we impose the following restrictions on the model's parameters:
1. The matrix $D$ has full rank.
2. We have $w_i \ge \varepsilon \cdot \bar w$ for all $i$, where $\varepsilon$ is some arbitrarily small constant.
3. $\bar w \ge d$, where $d = d(\varepsilon, D, \delta)$ is a sufficiently large constant.

Our asymptotics is such that $n$ gets large, while $D, k, \varepsilon, \delta, d$ are fixed. On the other hand the weights $w_i$ can be picked arbitrarily subject to our restrictions (in particular depending on $n$) and the subsets $V_j$ with $|V_j| \ge \delta n$ are arbitrary, too. Our restrictions 2. and 3. imply that $l \cdot w_i \le w'_i \le u \cdot w_i$ for constants $l = l(\varepsilon, D, \delta)$ and $u = u(\varepsilon, D, \delta)$, that is $w'_i = \Theta(w_i)$. This shows the extent to which we consider graphs with given expected degree sequence. Note that, depending on the weight $w_i$, 2. and 3. allow $w'_i$ among others to be constant, independent of $n$.

1.2 Motivation and related literature

The analysis of large real life networks, like the internet graph, social or bibliographical networks, is one of the current topics not only of Computer Science.


Clearly it is important to obtain efficient algorithms adapted to the characteristics of these networks. One particular problem of interest is the problem of detecting some kind of clusters, that is subsets of vertices having extraordinarily many or few edges. Such clusters are supposed to mirror some kind of relationship among their members (= vertices of the network). Heuristics based on the eigenvalues and eigenvectors of the adjacency matrix of the network provide one of the most flexible approaches to clustering problems applied in practice. See for example [15] or the review [19] or [18]. Note that the eigenvalues and eigenvectors of symmetric real valued matrices are, first, real valued and, second, can be approximated efficiently to arbitrary precision.

The relationship between spectral properties of the adjacency matrix of a graph on the one hand and clustering properties of the graph itself on the other hand is well established. Usually this relationship is based on some separation between the (absolute) values of the largest eigenvalues and the remaining eigenvalues. It has a long tradition of being exploited in practice, among others for numerical calculations. However, it is in general not easy to obtain convincing proofs certifying the quality of spectral methods in these cases, see [23] for a notable exception.

Theoretically convincing analyses of this phenomenon have been conducted in the area of random graphs. This leads to provably efficient algorithms for clustering problems in situations where purely combinatorial algorithms do not seem to work; just to cite some examples: [2], [3], or [4], or the recent [20] and subsequent work such as [14]. In particular [3] has led to further results [10], [11]. The reason for this may be that [3] is based on a rather flexible approach to obtain spectral information about random graphs [12]: Spectral information directly follows from clustering properties known to be typically present in a random graph by (inefficient) counting arguments. We apply this technique here, too.

In order to explain the success of spectral algorithms to detect clustering properties of large real life networks the preceding results do not seem to be readily applicable. As opposed to classical random graphs such networks are well known to have many vertices whose degree deviates considerably from the average degree, that is the degree distribution has a "heavy tail", or it seems to follow a "power law", see for example [1]. And in fact in [21] it is shown that the largest eigenvalues of a random graph with power law degree distribution are proportional to the square root of the largest degrees, and thus do not reveal any non-local information about the graph. This result looks somehow related to the fact that the largest eigenvalue of a sparse random graph $G_{n,p}$ where $p = c/n$ is always the square root of the largest degree of the graph and that there is an unbounded number of eigenvalues of this size, see [16]. In the case of classical random graphs it helps to delete the vertices of highest degree as observed by [3], leaving the clustering properties of the graph essentially unchanged. However, in the case of a degree distribution with a heavy tail this trick is not useful, because significant parts of the graph may just be ignored in this way. Thus, the adjacency matrix itself does not seem appropriate to represent graphs with heavy-tailed degree distributions.

To come to terms with varying degrees the Laplacian matrix is considered, see [5] for a nice exposition of the relationship of the Laplacian spectrum to clustering properties of general graphs. It is also used in practical applications, cf. [22]. However, for randomly generated graphs it is more difficult to handle theoretically than the adjacency matrix. As far as classical random graphs are concerned it is already a major difficulty to get insight into the Laplacian spectrum, at least in the interesting sparse case. The difficulty stems from the fact that in this case the graph is not asymptotically regular. See however [6] for very recent progress in this direction.

Clustering problems in the denser case can be treated with the help of the Laplacian even for random graphs modelling real networks as our model does (which allows for arbitrary, in particular heavy-tailed degree distributions): In [9] it is shown that the Laplacian eigenvalues allow us to find the partition in the model considered here, too (provided that the average degree is $\gg \ln n$). Laplacian eigenvalues of random graphs with given expected degree sequence are also investigated in [8]. Both papers rely on [13] and in part on [17] to obtain information about the spectrum. This makes it inevitable that the degree is $\ge \log n$, in fact $\ge \log^6 n$ in the case of [9]. The case of small expected degrees as considered here is interesting because the actual degree of a vertex is no longer concentrated at the expected degree. It is also mentioned in the concluding section of [9]. Independently of its applications to graph partitioning problems, we have also investigated the Laplacian eigenvalues of sparse graphs with given expected degrees in [7].

1.3 Techniques and result

We consider the following algorithm to reconstruct the $V_j$ for random graphs as generated by our model. Only for technical simplicity we restrict our attention to $k = 2$. It poses no substantial difficulties to extend the algorithm to arbitrary, yet constant $k$: Instead of the two eigenvectors $s_2, s_3$ we use $k$ eigenvectors $s_2, \ldots, s_{k+1}$. The sufficiently large constants $C_1, C_2, C_3$ depend on the actual partitioning problem. The values can be calculated with the knowledge of $D$, $\varepsilon$ and $\delta$.

Algorithm 1.
Input: The adjacency matrix $A$ of some graph $G = (V, E)$ generated in the above model and the expected degree sequence $w'_1, \ldots, w'_n$.
Output: A partition $V'_1, V'_2$ of $V$.
1. Calculate the expected average degree $\bar w' = \sum_{i=1}^{n} w'_i / n$.
2. Construct $R = (r_{ij})$ with $r_{ij} = (\bar w')^2 \cdot a_{ij} / (w'_i \cdot w'_j)$.
3. Let $s_1 = R \cdot 1$, where $1$ is the all-ones vector.
4. Let $U = \{ i \in V : \sum_{j=1}^{n} r_{ij} \le C_1 \cdot \bar w' \}$ for some sufficiently large constant $C_1$.
5. Construct $R^*$ from $R$ by setting all entries $r_{ij}$ with $i \notin U$ or $j \notin U$ to 0.


6. Calculate the eigenvectors of $R^*$.
7. Let $s_2, s_3$ be two eigenvectors of $R^*$ belonging to different occurrences of eigenvalues which are $\ge C_2 \cdot \bar w'$ in absolute value.
8. At least one of the $s_1, s_2, s_3$ turns out to have the property that all but $C_3 \cdot (n/\bar w')$ entries are close to two sufficiently different values $c_1, c_2$. Let $V'_i$ be all the entries close to $c_i$ for $i = 1, 2$. Distribute the remaining entries arbitrarily among the $V'_i$.

Some remarks are in order. First observe that the algorithm needs the expected degree sequence as additional information besides the graph. Note that the algorithm of [9] even gets the $w_i$ themselves. The main idea is to use the normalized adjacency matrix $R$, where we divide each entry of the adjacency matrix by the expected degrees of the incident vertices (the additional factor of $(\bar w')^2$ is only for technical convenience). It is this choice of the matrix which makes our analysis possible. Of course, a natural idea is to divide the entries by the actual degrees rather than the expected degrees, in order to remove the requirement that $w'_1, \ldots, w'_n$ are given at the input. In fact, it turns out that this approach can be carried out successfully, i.e., the resulting matrix is suitable to recover the planted partition as well. Nonetheless, since the analysis is technically significantly more involved, we omit the details from the present extended abstract (the complete analysis will be given in the full paper version of this work).

In fact, using $R$ we get a situation formally rather similar to the case of a classical sparse random graph with a planted partition and its adjacency matrix, the situation considered in [3] or [20]. Note that all entries $r_{ij}$ with the same $(\psi(i), \psi(j))$ have the same expected value, which makes the analogy possible. In particular we can apply [12]. The vector $s_1$ is necessary in order to recognize partitions which can be readily recognized just from the row sums of $R$. Step 5 has the analogous effect on the spectrum of $R$ as the deletion of high degree vertices has on the spectrum of the adjacency matrix in the case of sparse random graphs. Being eigenvectors of different occurrences of eigenvalues, $s_2$ and $s_3$ are orthogonal to each other. Notions that are "vague" up to now, like "close" or the $d$, $C_i$ in the algorithm, are made precise in the proof of the following theorem.

Theorem 2. Let $D$, $\varepsilon$, $\delta$ be as defined above. There exist constants $C_1, C_2, C_3$ with $C_i = C_i(D, \varepsilon, \delta)$ such that the following property holds: Let $G$ be some graph generated in the above model. With probability $1 - o(1)$ with respect to $G$, Algorithm 1 produces a partition which differs from the original partition $V_1, V_2$ only in $O(n/\bar w')$ vertices.

Note that the number of vertices not classified correctly is $O(n/\bar w') = O(n/\bar w)$ and thus decreases in inverse proportion to $\bar w$. We present the proof of Theorem 2 in the following two sections. The proof in section 3 is based on some notions and lemmas used throughout. These are presented in section 2.
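Steps 1 to 5 of Algorithm 1 amount to a normalization of the adjacency matrix followed by a trimming step. The following is a minimal sketch in plain Python under stated assumptions: the function names, the toy star graph, the all-ones expected degrees and the value `C1 = 2.0` are illustrative choices, not taken from the paper. The eigenvector computation of steps 6 to 8 is left to a numerical library.

```python
def normalized_matrix(A, w_exp):
    """Steps 1-2: r_ij = (mean expected degree)^2 * a_ij / (w'_i * w'_j)."""
    n = len(A)
    mean_deg = sum(w_exp) / n
    R = [[mean_deg ** 2 * A[i][j] / (w_exp[i] * w_exp[j]) for j in range(n)]
         for i in range(n)]
    return R, mean_deg

def row_sums(R):
    """Step 3: s_1 = R * 1, the vector of row sums of R."""
    return [sum(row) for row in R]

def trim(R, mean_deg, C1):
    """Steps 4-5: keep U = {i : row sum <= C1 * mean_deg} and zero all other rows and columns."""
    n = len(R)
    s1 = row_sums(R)
    U = {i for i in range(n) if s1[i] <= C1 * mean_deg}
    R_star = [[R[i][j] if (i in U and j in U) else 0.0 for j in range(n)]
              for i in range(n)]
    return U, R_star

# Toy input (hypothetical): a star with centre 0, all expected degrees set to 1.
A = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
w_exp = [1.0, 1.0, 1.0, 1.0]
R, mean_deg = normalized_matrix(A, w_exp)
U, R_star = trim(R, mean_deg, C1=2.0)
```

In this toy run the centre has row sum 3, which exceeds `C1 * mean_deg = 2`, so it is removed from `U` and every entry of `R_star` becomes 0. This mirrors the role of step 5 as a deletion of atypically heavy rows.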


2 Notation and basic facts

We use the following notation.
1. $\|\cdot\|$ denotes the $l_2$-norm of a vector or matrix.
2. The transpose of a matrix or vector $M$ is written as $M^t$.
3. For $U \subseteq \mathbb{N}$ and a vector $v$ we construct the vector $v|_U$ by setting the $i$th component of $v|_U$ to $v_i$ if $i \in U$ and to 0 if $i \notin U$. If $U$ is clear from the context, we write simply $v^*$. For a matrix $M$ we obtain $M^*$ by setting all entries $m_{ij} := 0$ if $i \notin U$ or $j \notin U$. For a set of vectors $S$ we define $S^* = \{v^* : v \in S\}$.
4. We abbreviate $(1, \ldots, 1)^t$ by $1$.
5. For a matrix $M = (m_{ij})$ we define
$$s_M(X, Y) = \sum_{x \in X,\ y \in Y} m_{xy}.$$

The Courant-Fischer characterization of eigenvalues reads

Fact 3. Let $A \in \mathbb{R}^{n \times n}$ be some symmetric matrix with eigenvalues $\lambda_1 \ge \ldots \ge \lambda_n$. Then
$$\lambda_{i+1} = \min_{\dim U = i}\ \max_{x \in U^\perp,\ \|x\| = 1} x^t A x \qquad \text{and} \qquad \lambda_{n-i} = \max_{\dim U = i}\ \min_{x \in U^\perp,\ \|x\| = 1} x^t A x,$$
where $U^\perp$ denotes the orthogonal complement to $U$.

The next two lemmas are slight generalizations of two lemmas from [3]. Lemma 1 is proved as Lemma 3.4 in that paper for 0-1 random variables. Our generalization can be derived analogously.

Lemma 1. Let $x_1, \ldots, x_n$ be independent random variables each having exactly two possible values from the interval $[0, b]$ and the same expectation $\mu$, such that for all $i$
$$\Pr[x_i = 0] = 1 - p_i \qquad \text{and} \qquad \Pr[x_i \ne 0] = \Pr[x_i = \mu/p_i] = p_i.$$
Let $a_1, \ldots, a_n$ be real numbers from $[-a, a]$ and $Z = \sum_{i=1}^{n} a_i \cdot x_i$. If for $S$, $D$ and some constant $c > 0$
$$\sum_{i=1}^{n} a_i^2 \le D \qquad \text{and} \qquad S \le c \cdot e^c \cdot D \cdot \mu / a$$
hold, then
$$\Pr[|Z - E[Z]| > S] \le 2 \exp\left(-\frac{S^2}{2 \mu \cdot e^c \cdot D \cdot b}\right).$$
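The random variables in Lemma 1 take only the two values 0 and $\mu/p_i$, which forces every $x_i$ to have mean $\mu$ regardless of $p_i$. A small sketch in plain Python makes this concrete; the function names and the numbers are hypothetical, and the variance formula is the standard one for independent two-valued variables rather than a statement from the paper.

```python
import random

def two_valued_sample(mu, p, rng):
    """Return mu / p with probability p and 0 otherwise, so that the mean is mu."""
    return mu / p if rng.random() < p else 0.0

def mean_of_Z(mu, coeffs):
    """E[Z] for Z = sum_i a_i * x_i when every x_i has expectation mu."""
    return mu * sum(coeffs)

def variance_of_Z(mu, probs, coeffs):
    """Var[Z] = sum_i a_i^2 * mu^2 * (1/p_i - 1) for independent x_i."""
    return sum(a * a * mu * mu * (1.0 / p - 1.0) for a, p in zip(coeffs, probs))

# Hypothetical parameters: mu = 0.5, two variables with p = 0.5 and p = 0.25.
mu, probs, coeffs = 0.5, [0.5, 0.25], [1.0, 2.0]
sample = two_valued_sample(mu, probs[0], random.Random(1))
```

With these numbers the nonzero values are 1.0 and 2.0, so the interval bound `b` of the lemma would have to be at least 2. Rare variables (small $p_i$) contribute large values and hence most of the variance, which is why the lemma needs a bound on the largest possible value.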

Let $R$ be some $n \times n$-matrix with random entries $r_{ij}$ and let $V = \{1, \ldots, n\}$ be the set of indices. We assume either that all $r_{ij}$ are independent or that the only dependence is due to symmetry. We assume that the collection of the $r_{ij}$'s otherwise has the same properties as the $x_i$'s in Lemma 1, in particular $E[r_{ij}] = \mu$. The subsequent Lemma 2 is as Lemma 3.6 in [3]. Its proof is analogous. A similar lemma occurs as Lemma 2.5 in [12].

Lemma 2. With probability $1 - o(1)$ for any pair $(A, B)$ of sets $A, B \subseteq V$ the following holds: If $m = \max\{|A|, |B|\} \le n/2$ then
1. $s_R(A, B) = O(E[s_R(A, B)])$ or
2. $s_R(A, B) \cdot \ln \frac{s_R(A, B)}{E[s_R(A, B)]} = O\left(m \cdot \ln \frac{n}{m}\right)$.

Let $R$ be a random matrix as above and $B \ge 1$ be some constant. For symmetric $R$ let $U \subseteq V$ be given by $u \in U$ if and only if
$$s_R(V, \{u\}) = s_R(\{u\}, V) \le B \cdot \mu \cdot n.$$
For non-symmetric $R$ we define
$$U = \{u \in V : \max(s_R(\{u\}, V),\ s_R(V, \{u\})) \le B \cdot \mu \cdot n\}.$$

The following lemma is at the heart of our results. It is a transfer of Lemma 3.3 in [3] and Theorem 2.2 in [12]. In contrast to [3] and [12] we require that only the vector $y$ is perpendicular to $1$. The proof is similar to [3] and [12]. In particular recall item 3. of the notation as introduced above.

Lemma 3. For $R$ and $U$ as above with probability $1 - o(1)$ we have for all unit vectors $x, y \in (\mathbb{R}^n)^*$ with $y \perp 1$ that $|x^t R y| = O(\sqrt{\mu \cdot n})$.

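3 The analysis of the algorithm

Let $G = (V, E)$, $D$, $V_1$, $V_2$ and $w_1, \ldots, w_n$ be as in Subsection 1.1. Let $d_i$ be the actual degree of $i$ in $G$. For $W \subseteq V$ we define $\phi(W) = \sum_{i \in W} w_i$ and abbreviate $\phi_i := \phi(V_i)/\phi(V)$. Since all $w_i \ge \varepsilon \cdot \bar w = \Omega(\bar w)$ and $|V_i| = \Omega(n)$ we have
$$\phi_i = \frac{\phi(V_i)}{\phi(V)} = \frac{\sum_{j \in V_i} w_j}{\bar w n} \ge \frac{\varepsilon \bar w \cdot |V_i|}{\bar w n} = \frac{\varepsilon \cdot |V_i|}{n} = \Omega(1),$$
so each $\phi_i$ is bounded away from 0 by some constant. For $i \in V_1$ we have
$$E[d_i] = w'_i = \sum_{j \in V_1} d_{11} \cdot \frac{w_i \cdot w_j}{\bar w \cdot n} + \sum_{j \in V_2} d_{12} \cdot \frac{w_i \cdot w_j}{\bar w \cdot n} = w_i \cdot (d_{11}\phi_1 + d_{12}\phi_2)$$
and for $i \in V_2$ we get
$$w'_i = w_i \cdot (d_{12}\phi_1 + d_{22}\phi_2).$$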


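Since $D$ is of full rank, we have no row containing only 0. So, each $w'_i$ is $\ne 0$ and $w'_i = \Theta(w_i)$. The expected average degree $\bar w'$ in $G$ is
$$\bar w' = \frac{1}{n} \sum_{i=1}^{n} w'_i = \frac{1}{n} \sum_{i \in V_1} w_i (d_{11}\phi_1 + d_{12}\phi_2) + \frac{1}{n} \sum_{i \in V_2} w_i (d_{12}\phi_1 + d_{22}\phi_2) = \bar w \cdot (d_{11}\phi_1^2 + 2 \cdot d_{12}\phi_1\phi_2 + d_{22}\phi_2^2) = \Theta(\bar w).$$

Let $A$ be the adjacency matrix of $G$. We construct $R$ by multiplying each entry $a_{ij}$ with $(\bar w')^2/(w'_i \cdot w'_j) = \Theta(\bar w^2/(w_i \cdot w_j)) = O(1/\varepsilon^2)$. So each entry in $R$ is bounded by some constant. We have for $i, j \in V_1$
$$E[r_{ij}] = d_{11} \cdot \frac{w_i \cdot w_j}{\bar w \cdot n} \cdot \frac{(\bar w')^2}{w'_i \cdot w'_j} = \frac{(\bar w')^2}{\bar w \cdot n} \cdot \frac{d_{11}}{(d_{11}\phi_1 + d_{12}\phi_2)^2},$$
for $i \in V_1$, $j \in V_2$ or the other way round
$$E[r_{ij}] = \frac{(\bar w')^2}{\bar w \cdot n} \cdot \frac{d_{12}}{(d_{11}\phi_1 + d_{12}\phi_2) \cdot (d_{12}\phi_1 + d_{22}\phi_2)},$$
and finally for $i, j \in V_2$
$$E[r_{ij}] = \frac{(\bar w')^2}{\bar w \cdot n} \cdot \frac{d_{22}}{(d_{12}\phi_1 + d_{22}\phi_2)^2}.$$

We obtain a symmetric $2 \times 2$-matrix $M = (m_{ij})$ of expectations such that $E[r_{ij}] = m_{\psi(i),\psi(j)}$. With
$$X = \begin{pmatrix} (d_{11}\phi_1 + d_{12}\phi_2)^{-1} & 0 \\ 0 & (d_{12}\phi_1 + d_{22}\phi_2)^{-1} \end{pmatrix}$$
we get
$$M = \frac{(\bar w')^2}{\bar w \cdot n} \cdot X \cdot \begin{pmatrix} d_{11} & d_{12} \\ d_{12} & d_{22} \end{pmatrix} \cdot X = \frac{(\bar w')^2}{\bar w \cdot n} \cdot X \cdot D \cdot X.$$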
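The factorization of the expectation matrix can be checked numerically on a small instance. The sketch below, in plain Python, computes the entries of $M$ once directly from the edge probabilities and the normalization factor, and once from the product form with the diagonal matrix $X$. All names and the toy weights are assumptions made for the illustration; classes are labelled 0 and 1, and $W = \bar w \cdot n$ is used for the common denominator.

```python
def expectation_matrix_direct(w, psi, D):
    """m_ab as E[r_ij] for one representative pair (i, j) with psi(i) = a, psi(j) = b."""
    n, W, k = len(w), sum(w), len(D)
    w_exp = [w[i] * sum(w[j] * D[psi[i]][psi[j]] for j in range(n)) / W
             for i in range(n)]
    mean_exp = sum(w_exp) / n
    rep = {psi[i]: i for i in range(n)}  # one representative vertex per class
    M = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            i, j = rep[a], rep[b]
            edge_prob = w[i] * w[j] * D[a][b] / W
            M[a][b] = mean_exp ** 2 / (w_exp[i] * w_exp[j]) * edge_prob
    return M

def expectation_matrix_factored(w, psi, D):
    """m_ab as (mean expected degree)^2 / W * x_aa * d_ab * x_bb."""
    n, W, k = len(w), sum(w), len(D)
    # Weight fraction of each class: total weight of the class divided by W.
    frac = [sum(w[i] for i in range(n) if psi[i] == a) / W for a in range(k)]
    x = [1.0 / sum(D[a][b] * frac[b] for b in range(k)) for a in range(k)]
    mean_exp = (W / n) * sum(D[a][b] * frac[a] * frac[b]
                             for a in range(k) for b in range(k))
    return [[mean_exp ** 2 / W * x[a] * D[a][b] * x[b] for b in range(k)]
            for a in range(k)]

# Hypothetical toy instance with unequal weights inside the first class.
w = [1.0, 3.0, 2.0, 2.0]
psi = [0, 0, 1, 1]
D = [[1.0, 0.5], [0.5, 2.0]]
M_direct = expectation_matrix_direct(w, psi, D)
M_factored = expectation_matrix_factored(w, psi, D)
```

Both computations agree up to rounding, and the result does not depend on which vertex of a class is chosen as representative: the individual weights cancel, which is exactly why all entries of $R$ within one block share the same expectation.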

If $e = (e_1\ e_2)$ is some eigenvector of $D$, then $(e_1/x_{11}\ \ e_2/x_{22})$ is an eigenvector of $X \cdot D \cdot X$ with the same eigenvalue. So, the eigenvalues of $X \cdot D \cdot X$ are determined only by $D$, and are $\ne 0$. We divided the entries of $e$ by the $x_{ii}$. This makes the entries larger, but at most by some constant factor independent of $\bar w'$ or $n$. So, the normalized eigenvectors of $X \cdot D \cdot X$ have entries that are bounded away from 0 by some constant. We need this fact later. We summarize: $M$ has 2 eigenvalues, whose absolute value is $\Theta((\bar w')^2/(\bar w n)) = \Theta(\bar w'/n)$, and all the entries of the normalized eigenvectors are $\Theta(1)$.

The expected row-sum $s_R(\{i\}, V)$ for some $i \in V_1$ is
$$|V_1| \cdot m_{11} + |V_2| \cdot m_{12} = \frac{(\bar w')^2}{\bar w \cdot n} \cdot \left(d_{11}|V_1| \cdot x_{11}^2 + d_{12}|V_2| \cdot x_{11} \cdot x_{22}\right) = \Theta(\bar w') \qquad (1)$$
and for $i \in V_2$
$$|V_1| \cdot m_{12} + |V_2| \cdot m_{22} = \frac{(\bar w')^2}{\bar w \cdot n} \cdot \left(d_{12}|V_1| \cdot x_{11} \cdot x_{22} + d_{22}|V_2| \cdot x_{22}^2\right) = \Theta(\bar w'). \qquad (2)$$

The number of rows with $s_R(\{i\}, V) > 5 \cdot E[s_R(\{i\}, V)]$ is with high probability $e^{-\Omega(\bar w')} \cdot n$. This can be shown as follows: Use Lemma 1 to calculate the probability that a fixed $i$ is such a row. This probability is $e^{-\Omega(\bar w')}$. So, we have an expected number of such rows bounded by $e^{-\Omega(\bar w')} \cdot n$. Since the dependence between any two rows is small, we have a relatively small variance and Chebyshev's inequality gives the result.

If (1) and (2) differ by a factor of at least 25, we can simply detect large parts of $V_1$ and $V_2$ by partitioning the rows by the value of $s_R(\{i\}, V)$. This is the reason for $s_1$ in the algorithm. If (1) and (2) are closer, then both are relatively near to the average row-sum, which is $\Theta(\bar w')$.

Now, let $U$ be the set of all $i$ with $s_R(\{i\}, V) \le C \cdot \bar w'$. The exact value of $C$ depends on $D$, $\varepsilon$ and the lower bound $\delta$ on $|V_i|/n$. A similar calculation as above shows that $|U| \ge (1 - e^{-\Omega(\bar w')}) \cdot n$.

Lemma 4. With high probability for any set $X \subseteq V$ with $|X| = e^{-\Omega(\bar w')} \cdot n$ we have $s_R(X, V) = e^{-\Omega(\bar w')} \cdot n$.

Proof. Let $X_i = X \cap V_i$. We have that
$$s_R(X, V) = \sum_{i,j=1}^{2} s_R(X_i, V_j).$$
If we can show that with high probability for each summand the bound $e^{-\Omega(\bar w')} \cdot n$ holds, then the assertion follows. We give the proof for $s_R(X_1, V_1)$ explicitly. The remaining cases follow analogously. Fix some set $X_1 \subseteq V_1$ with $|X_1| = \delta n = e^{-c_1 \bar w'} \cdot n$, where $c_1$ is some arbitrarily small constant. Then $E[s_R(X_1, V_1)] = \Theta(m_{11} \cdot |X_1| \cdot |V_1|) = \Theta(\bar w' \cdot |X_1|) = e^{-\Omega(\bar w')} \cdot n$. Let $t = |X_1| \cdot |V_1|$. We use Lemma 1. For $\{u, v\} \subseteq X_1$ we set $x_i$ in the lemma to $r_{uv}$ with $u < v$ and $a_i$ to 2, because such entries are counted twice in the sum. For the other terms in $s_R(X_1, V_1)$, namely $r_{uv}$ with $u \in X_1$ and $v \notin X_1$, we let $x_i = r_{uv}$ and $a_i = 1$. This gives for the lemma that $a = 2$, $D \le 2t$ and $\mu = m_{11}$. We choose $S = c \cdot e^c \cdot m_{11} \cdot t = c \cdot e^c \cdot \Theta(\bar w' \cdot \delta n) = e^{-\Omega(\bar w')} \cdot n$ for some constant $c$ determined later. Then
$$\Pr[|s_R(X_1, V_1) - m_{11} \cdot t| > S] \le 2 \cdot e^{-\Omega(S^2/(m_{11} \cdot e^c \cdot t))} = 2 \cdot e^{-\Omega(c^2 \cdot e^c \cdot m_{11} \cdot t)} = 2 \cdot e^{-\Omega(c^2 \cdot e^c \cdot \bar w' \cdot \delta n)}.$$


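The number of sets $X_1$ possible is bounded by
$$\binom{|V_1|}{\delta n} \le \binom{n}{\delta n} \le \left(\frac{e}{\delta}\right)^{\delta n} = e^{\delta n - \delta \ln \delta \cdot n} = e^{\delta n + O(\delta \cdot \bar w' \cdot n)}.$$
A union bound gives that the probability for the existence of a set $X_1$ contradicting the claim is at most
$$e^{\delta n + O(\delta \cdot \bar w' \cdot n)} \cdot 2 \cdot e^{-\Omega(c^2 \cdot e^c \cdot \bar w' \cdot \delta n)} = o(1)$$
if $c$ is large enough (but still constant). For sets $X_1$ with cardinality $< \delta n$ the same bounds for $s_R(X_1, V_1)$ and the probability hold, since we can fill them up until they contain exactly $\delta n$ elements without decreasing $s_R(X_1, V_1)$. □

By the above lemma we see that the sum of the entries we lose by building $R^*$ is bounded by $e^{-\Omega(\bar w')} \cdot n$. Thus, we have that $\|R - R^*\| \le e^{-\Omega(\bar w')} \cdot n$. And for all unit vectors $f, g$ we have $\max_{f,g} |f^t (R - R^*) g| \le \|R - R^*\| = e^{-\Omega(\bar w')} \cdot n$.

Let $e = (e_1\ e_2)$ be some normalized eigenvector of $M$ and $\chi_1, \chi_2$ be the characteristic vectors of $V_1, V_2$ ($\chi_i(j) = 1$ if $j \in V_i$ and 0 otherwise) and $\alpha = |V_1|/n$, $\beta = |V_2|/n$. Let $g = e_1 \cdot \beta \cdot \chi_1 + e_2 \cdot \alpha \cdot \chi_2$. Then $\|g\| = \sqrt{e_1^2 \alpha\beta^2 n + e_2^2 \alpha^2\beta n} = \Theta(\sqrt{n})$. We have with probability $1 - o(1)$ that asymptotically
$$g^t R g = e_1^2 \cdot \beta^2 s_R(V_1, V_1) + 2 e_1 e_2 \cdot \alpha\beta\, s_R(V_1, V_2) + e_2^2 \cdot \alpha^2 s_R(V_2, V_2)$$
$$= e_1^2 \cdot \alpha^2\beta^2 \cdot n^2 \cdot m_{11} + 2 e_1 e_2 \cdot \alpha^2\beta^2 \cdot n^2 \cdot m_{12} + e_2^2 \cdot \alpha^2\beta^2 \cdot n^2 \cdot m_{22}$$
$$= \alpha^2\beta^2 \cdot n^2 \cdot (e_1^2 \cdot m_{11} + 2 e_1 e_2 \cdot m_{12} + e_2^2 \cdot m_{22}) = \alpha^2\beta^2 \cdot n^2 \cdot (e_1\ e_2) \cdot M \cdot \binom{e_1}{e_2}.$$
Since all eigenvalues of $M$ are in absolute value $\Theta(\bar w'/n)$ we get
$$|g^t R^* g| \ge |g^t R g| - e^{-\Omega(\bar w')} \cdot n = \alpha^2\beta^2 \cdot n^2 \cdot \Theta(\bar w'/n) - e^{-\Omega(\bar w')} \cdot n = \Theta(\bar w' \cdot n),$$
by using the triangle inequality. Thus, using the 2 eigenvectors of $M$, we can construct 2 orthogonal vectors $g$ and $h$ for $R^*$ such that
$$\left|\frac{g^t}{\|g\|} R^* \frac{g}{\|g\|}\right| = \Omega(\bar w') \qquad \text{and} \qquad \left|\frac{h^t}{\|h\|} R^* \frac{h}{\|h\|}\right| = \Omega(\bar w').$$
By Fact 3 we obtain that at least two eigenvalues of $R^*$ are $\Omega(\bar w')$ in absolute value. It is important that all the other eigenvalues of $R^*$ are bounded by $O(\sqrt{\bar w'})$ in absolute value. Let $u$ and $v$ be some unit vectors with $u$ perpendicular to $g$ and $h$. Because both $g$ and $h$ are linear combinations of $\chi_1$ and $\chi_2$, $u$ is also perpendicular to $\chi_1$ and $\chi_2$. We partition $u$ into $u_1, u_2$ as $V$ is partitioned into $V_1, V_2$. By the same principle we construct $v_i$, $R_{i,j}$ and $R^*_{i,j}$. Then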

$$\max_{u \perp g,h} |v^t R^* u| = \max_{u \perp g,h} \left|\sum_{i,j=1}^{2} v_i^t R^*_{i,j} u_j\right| \le \max_{u \perp g,h} \sum_{i,j=1}^{2} |v_i^t R^*_{i,j} u_j|. \qquad (3)$$

If $u$ and $v$ maximize the above terms, we can assume that $u = u^*$ and $v = v^*$. Then the $u_j = u_j^*$ are perpendicular to $1$. In addition we have $v_i^t \cdot R^*_{i,j} \cdot u_j = (v_i^*)^t \cdot R_{i,j} \cdot u_j^*$. By the construction of $R$ we have for all $R_{i,j}$ that the entries are bounded by some constant and the expectation of each entry is the same, namely $\Theta(d_{ij} \cdot \bar w'/n)$. Lemma 3 allows us to bound each term in the above sum by $O(\sqrt{\bar w'})$. Fact 3 can be used to bound the remaining eigenvalues of $R^*$ by $O(\sqrt{\bar w'})$.

Finally we show that it is possible to obtain $V_1$ and $V_2$ by investigating the eigenvectors of $R^*$. For this let $v_1, v_2$ be two orthonormal eigenvectors of $R^*$ with eigenvalue $\Omega(\bar w')$ (in absolute value). Then $v_i$ can be written as $v_i = c_i \cdot m_i + d_i \cdot u_i$ with $\|m_i\| = \|u_i\| = 1$ and $c_i^2 + d_i^2 = 1$. Here $m_i$ comes from the space spanned by $g$ and $h$, and $u_i$ comes from the orthogonal complement. Then by the bound for (3)
$$|v_i^t R^* u_i| = \Omega(\bar w') \cdot |v_i^t \cdot u_i| = \Omega(\bar w') \cdot |d_i| = O(\sqrt{\bar w'}),$$
and $|d_i|$ must be $O(1/\sqrt{\bar w'})$. As $|c_i| + |d_i| \ge c_i^2 + d_i^2 = 1$, we have $|c_i| = 1 - O(1/\sqrt{\bar w'})$. Since
$$0 = v_1^t v_2 = c_1 c_2 m_1^t m_2 + c_1 d_2 m_1^t u_2 + c_2 d_1 u_1^t m_2 + d_1 d_2 u_1^t u_2$$
we have
$$|c_1 c_2 m_1^t m_2| = |c_1 d_2 m_1^t u_2 + c_2 d_1 u_1^t m_2 + d_1 d_2 u_1^t u_2| \le |c_1 d_2 m_1^t u_2| + |c_2 d_1 u_1^t m_2| + |d_1 d_2 u_1^t u_2| = |d_1 d_2 u_1^t u_2| \le |d_1 d_2| = O(1/\bar w').$$
Together with $c_i = 1 - O(1/\sqrt{\bar w'})$ we can conclude that $m_1$ and $m_2$ must be almost perpendicular. We write $m_i = \gamma_i \cdot \chi_1/\sqrt{n} + \delta_i \cdot \chi_2/\sqrt{n}$. For at least one $i$ we have $|\gamma_i - \delta_i| \ge \varepsilon$ for some small constant $\varepsilon$, otherwise $m_1$ and $m_2$ could not be almost perpendicular. Taking this $m_i$, we have that the entries belonging to $V_1$ differ from the other entries by at least $\varepsilon/\sqrt{n}$. This gives us the chance to identify $V_1, V_2$ by the entries of $m_i$. Unfortunately, we have only $v_i$ and not $m_i$. But we can assume that in $c_i \cdot m_i$ the distance of $\varepsilon/(2\sqrt{n})$ still holds, because $c_i \ge (1 - O(1/\bar w')) \ge 1/2$. It is possible that some entries $j$ in $u_i$ change the value of $c_i \cdot m_i(j)$ such that we put $j$ into the wrong partition. This may happen if the value is changed by at least $\varepsilon/(4\sqrt{n})$. But such entries are relatively rare. The entry in $u_i$ must have an absolute value of $\Omega(\sqrt{\bar w'}) \cdot \varepsilon/(4\sqrt{n})$, because $|d_i| = O(1/\sqrt{\bar w'})$ is small. The number of such entries is bounded by $O(n/\bar w')$ since $u_i$ has length 1. We obtain that we are able to partition at least $(1 - O(1/\bar w')) \cdot n$ vertices correctly by inspecting the eigenvector $v_i$ of $R^*$. This finishes our proof of Theorem 2.
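The last step of the argument reads the partition off the entries of a single eigenvector, whose entries cluster around two well-separated values. One simple way to do this in practice, shown here as an assumption rather than as the procedure analysed in the paper, is to sort the entries and cut at the largest gap. The function name and the example vector are hypothetical, and the sketch ignores the few outlier entries that the proof allows for.

```python
def split_by_largest_gap(vec):
    """Split the indices of vec into two groups at the largest gap between sorted entries.

    Assumes len(vec) >= 2 and that the entries cluster around two separated values.
    """
    order = sorted(range(len(vec)), key=lambda i: vec[i])
    gaps = [vec[order[t + 1]] - vec[order[t]] for t in range(len(order) - 1)]
    cut = max(range(len(gaps)), key=lambda t: gaps[t])
    low = set(order[:cut + 1])
    high = set(order[cut + 1:])
    return low, high

# Hypothetical eigenvector entries: vertices 0, 1, 3 on one side, 2 and 4 on the other.
v = [0.52, 0.49, -0.48, 0.50, -0.51]
low, high = split_by_largest_gap(v)
```

A more robust variant would first discard a small number of extreme entries, since up to $O(n/\bar w')$ entries may be perturbed by the component of the eigenvector outside the span of the two characteristic vectors.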


References

1. Aiello, W., Chung, F., Lu, L.: A random graph model for massive graphs. Proc. 33rd STOC (2001) 171-180.
2. Alon, N.: Spectral techniques in graph algorithms. Proc. LATIN (1998), LNCS 1380, Springer, 206-215.
3. Alon, N., Kahale, N.: A spectral technique for coloring random 3-colorable graphs. SIAM J. Comput. 26 (1997) 1733-1748.
4. Boppana, R.B.: Eigenvalues and graph bisection: An average case analysis. Proc. 28th FOCS (1987) 280-285.
5. Chung, F.K.R.: Spectral Graph Theory. American Mathematical Society (1997).
6. Coja-Oghlan, A.: On the Laplacian eigenvalues of $G_{n,p}$. Preprint (2005) http://www.informatik.hu-berlin.de/~coja/de/publikation.php.
7. Coja-Oghlan, A., Lanka, A.: The Spectral Gap of Random Graphs with Given Expected Degrees. Preprint (2006).
8. Chung, F.K.R., Lu, L., Vu, V.: The Spectra of Random Graphs with Given Expected Degrees. Internet Mathematics 1 (2003) 257-275.
9. Dasgupta, A., Hopcroft, J.E., McSherry, F.: Spectral Analysis of Random Graphs with Skewed Degree Distributions. Proc. 45th FOCS (2004) 602-610.
10. Feige, U., Ofek, E.: Spectral Techniques Applied to Sparse Random Graphs. Random Structures and Algorithms 27(2) (2005) 251-275.
11. Flaxman, A.: A spectral technique for random satisfiable 3CNF formulas. Proc. 14th SODA (2003) 357-363.
12. Friedman, J., Kahn, J., Szemerédi, E.: On the Second Eigenvalue in Random Regular Graphs. Proc. 21st STOC (1989) 587-598.
13. Füredi, Z., Komlós, J.: The eigenvalues of random symmetric matrices. Combinatorica 1 (1981) 233-241.
14. Giesen, J., Mitsche, D.: Reconstructing Many Partitions Using Spectral Techniques. Proc. 15th FCT (2005) 433-444.
15. Husbands, P., Simon, H., Ding, C.: On the use of the singular value decomposition for text retrieval. In 1st SIAM Computational Information Retrieval Workshop (2000), Raleigh, NC.
16. Krivelevich, M., Sudakov, B.: The largest eigenvalue of sparse random graphs. Combinatorics, Probability and Computing 12 (2003) 61-72.
17. Krivelevich, M., Vu, V.H.: On the concentration of eigenvalues of random symmetric matrices. Microsoft Technical Report 60 (2000).
18. Lempel, R., Moran, S.: Rank-stability and rank-similarity of link-based web ranking algorithms in authority-connected graphs. Information Retrieval, special issue on Advances in Mathematics/Formal Methods in Information Retrieval (2004) Kluwer.
19. Meila, M., Verma, D.: A comparison of spectral clustering algorithms. UW CSE Technical report 03-05-01.
20. McSherry, F.: Spectral Partitioning of Random Graphs. Proc. 42nd FOCS (2001) 529-537.
21. Mihail, M., Papadimitriou, C.H.: On the Eigenvalue Power Law. Proc. 6th RANDOM (2002) 254-262.
22. Pothen, A., Simon, H.D., Liou, K.-P.: Partitioning sparse matrices with eigenvectors of graphs. SIAM J. Matrix Anal. Appl. 11 (1990) 430-452.
23. Spielman, D.A., Teng, S.-H.: Spectral partitioning works: planar graphs and finite element meshes. Proc. 36th FOCS (1996) 96-105.

A Connectivity Rating for Vertices in Networks

Marco Abraham¹, Rolf Kötter²,³, Antje Krumnack¹, and Egon Wanke¹

¹ Institute of Computer Science, Heinrich-Heine-Universität Düsseldorf, D-40225 Düsseldorf, Germany
² C. & O. Vogt Brain Research Institute, Heinrich-Heine-Universität Düsseldorf, D-40225 Düsseldorf, Germany
³ Institute of Anatomy II, Heinrich-Heine-Universität, Düsseldorf, D-40225 Düsseldorf, Germany

Abstract. We compute the influence of a vertex on the connectivity structure of a directed network by using Shapley value theory. In general, the computation of such ratings is highly inefficient. We show how the computation can be managed for many practically interesting instances by a decomposition of large networks into smaller parts. For undirected networks, we introduce an algorithm that computes all vertex ratings in linear time, if the graph is cycle composed or chordal.

1 Motivation and Introduction

This work is originally motivated by the analysis of networks that represent neural connections in a brain. The cerebral cortical sheet can be divided into many different areas according to several parcellation schemes [4, 9, 20]. The primate cortex forms a network of considerable complexity depending on the degree of resolution. Information forwarding is usually accompanied by the possibility to respond. Thus, the corresponding networks are generally strongly connected. From a systems point of view, it is a great challenge to analyze the influence of a single area on the connectivity structure of the whole system. Such information could be helpful to understand the functional consequences of a lesion.

We measure the influence of a vertex on the connectivity structure of a directed graph $G = (V_G, E_G)$ by a function based on the Shapley value theory, which was originally developed within game theory¹, see [16]. Our function $\phi$ is parameterized by a so-called characteristic function denoted by $f_G$. It counts for a set of vertices $V' \subseteq V_G$ the number of strongly connected components in the subgraph of $G$ induced by the vertices of $V'$. In general, a characteristic function is a mapping from the subsets of a set of abstract objects $N$ to the real numbers $\mathbb{R}$. The application of Shapley value computations to graphs was first done by Myerson in [10], who considered only undirected graphs. For a characteristic function $h : 2^{V_G} \to \mathbb{R}$ defined on vertex sets, Myerson analyzed

¹ In game theory literature the argument of $\phi$ is a game (usually denoted by letter $v$) over an abstract set of players $N$ and the result is a vector of $\mathbb{R}^N$. Since we consider graphs, we prefer to use letter $v$ for vertices rather than for functions.
Please use the following format when citing this chapter: Abraham, M., Kotter, R., Krumnack, A., Wanke, E., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakwa, Y., (Boston: Springer), pp. 283-298.


the function that computes for a given vertex set $V'$ the sum of all $h(V'')$, where $V''$ is a vertex set of a connected component in the subgraph of $G$ induced by $V'$. That is, for undirected graphs, our function $\phi_{f_G}$ is equivalent to the function defined by Myerson (called Myerson value) for the case that $h(V'') = 1$ for all $V'' \subseteq V_G$.

Several authors have already analyzed the computation of Shapley values defined for vertices in graphs. Owen shows in [12] how to compute Myerson values for trees. Gomez et al. prove in [7] a simple separation property for undirected graphs that can be used to compute some Myerson values more efficiently. Van den Brink and Borm analyze in [18] a characteristic function for vertex sets of directed graphs and show that the Shapley values for this function can be computed efficiently. However, this characteristic function covers only a local property of the vertices. Deng and Papadimitriou consider in [2] a characteristic function that sums up the weights of all edges between two vertices of $V'$.

The paper is organized as follows. In Section 2, we recall the definitions we need from Shapley value theory [16]. In Section 3, we introduce a binary relation on vertices called strong separability. If two vertices $u, v$ are strongly separable then the rating $\phi_{f_G}(u)$ is independent of the existence of $v$ and vice versa, that is, $\phi_{f_G}(u) = \phi_{f_{G-\{v\}}}(u)$ and $\phi_{f_G}(v) = \phi_{f_{G-\{u\}}}(v)$, where $G - \{u\}$ is graph $G$ without vertex $u$ and $G - \{v\}$ is graph $G$ without vertex $v$. This allows us to decompose a directed graph into subgraphs such that the ratings of the vertices in the original graph are computable by the ratings of the vertices in the subgraphs (Theorem 1). We also show that deciding whether two vertices $u, v$ are not strongly separable is NP-complete (Theorem 2) and deciding $\phi_{f_G}(u) < \phi_{f_G}(v)$ for two given vertices $u, v$ is NP-hard. This implies that an algorithm for the computation of $\phi_{f_G}$ can be used to decide an NP-hard as well as a co-NP-hard decision problem.

In Section 4, we consider undirected graphs as a special case of directed graphs where undirected edges are represented by directed edges oriented against each other. Definition 1 applied to undirected graphs yields that two vertices are strongly separable if and only if there is no chordless cycle passing $u$ and $v$. The extension of Theorem 1 to undirected graphs (Theorem 4) allows us to compute the rating $\phi_{f_G}(u)$ for all vertices in linear time if $G$ is cycle composed (Theorem 5) or chordal (Theorem 6). Although some of the results shown in this paper can be extended to a much more general case, we restrict ourselves to the one characteristic function $f_G$. This reduces the mathematical notation and keeps the proofs as simple as possible.

2 The Shapley value

Let $N$ be any set of abstract objects. A characteristic function $f$ is a mapping from the subsets of $N$ to the real numbers $\mathbb{R}$ with $f(\emptyset) = 0$. A carrier of $f$ is a set $C \subseteq N$ such that $f(S) = f(S \cap C)$ for every $S \subseteq N$. Any superset of a carrier $C$ of $f$ is again a carrier of $f$. The objects outside a carrier do not contribute anything to the computations by $f$. The sum (superposition) of two characteristic functions $f$ and $g$, defined by $(f + g)(S) = f(S) + g(S)$, is again a characteristic function. Let $\pi$ be any permutation of $N$, that is, $\pi$ is a one to one mapping of $N$ to itself. For a set $S \subseteq N$ let $\pi(S) = \{\pi(x) \mid x \in S\}$ be the image of $S$ under $\pi$. Let $f_\pi$ be the characteristic function defined by $f_\pi(S) = f(\pi^{-1}(S))$.

To rate the objects of $N$ with respect to a characteristic function $f$, we use a function $\phi$ that associates with every characteristic function $f$ a rating function $\phi_f : N \to \mathbb{R}$ such that

(Axiom 1:) for every permutation $\pi$ of $N$ and all $x \in N$,

$$\phi_{f_\pi}(\pi(x)) = \phi_f(x),$$

(Axiom 2:) for every carrier $C$ of $f$,

$$\sum_{x \in C} \phi_f(x) = f(C),$$

and (Axiom 3:) for any two characteristic functions $f$ and $g$,

$$\phi_{f+g} = \phi_f + \phi_g.$$

Shapley has shown in [16] that function $\phi$ is uniquely determined by these three axioms and is given by

$$\phi_f(x) = \sum_{S \subseteq N,\ x \in S} \frac{(|S|-1)! \, (|N|-|S|)!}{|N|!} \, \bigl(f(S) - f(S - \{x\})\bigr), \tag{1}$$

where $|S|$ and $|N|$ denote the size of $S$ and $N$, respectively, or alternatively by

$$\phi_f(x) = \frac{1}{|N|!} \sum_{\pi \in \Pi_N} \bigl(f(m(\pi, x) \cup \{x\}) - f(m(\pi, x))\bigr), \tag{2}$$

where $\Pi_N$ is the set of all one to one mappings (enumerations) $\pi : N \to \{1, \ldots, |N|\}$ and $m(\pi, x) = \{y \in N \mid \pi(y) < \pi(x)\}$ is the set of all $y \in N$ arranged on the left side of $x$.
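The two formulas for the Shapley value can be compared on a small instance. The following is a minimal brute-force sketch, not taken from the paper: the function names `shapley_by_subsets` and `shapley_by_enumerations` and the example characteristic function `example_f` are illustrative assumptions. Both routines take exponential time and are only meant for very small sets $N$.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import factorial


def shapley_by_subsets(players, f):
    """Equation (1): weighted sum over all subsets S that contain x."""
    players = list(players)
    n = len(players)
    phi = {}
    for x in players:
        others = [y for y in players if y != x]
        total = Fraction(0)
        for k in range(n):  # k = |S| - 1
            weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
            for rest in combinations(others, k):
                s = frozenset(rest) | {x}
                total += weight * (f(s) - f(s - {x}))
        phi[x] = total
    return phi


def shapley_by_enumerations(players, f):
    """Equation (2): average marginal contribution over all enumerations."""
    players = list(players)
    phi = {x: Fraction(0) for x in players}
    for order in permutations(players):
        left = set()  # m(pi, x): the objects arranged to the left of x
        for x in order:
            phi[x] += f(frozenset(left | {x})) - f(frozenset(left))
            left.add(x)
    n_fact = factorial(len(players))
    return {x: value / n_fact for x, value in phi.items()}


# Hypothetical characteristic function: f(S) = 1 if S contains "a" and "b".
# The set {"a", "b"} is a carrier of f, so "c" contributes nothing.
def example_f(s):
    return 1 if {"a", "b"} <= s else 0


PLAYERS = ["a", "b", "c"]
phi1 = shapley_by_subsets(PLAYERS, example_f)
phi2 = shapley_by_enumerations(PLAYERS, example_f)
print(phi1)  # ratings 1/2, 1/2 and 0 for "a", "b" and "c"
```

In this example the ratings of the carrier $\{a, b\}$ sum to $f(\{a, b\}) = 1$, as Axiom 2 requires, and both formulas return the same values.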

3 A vertex rating for directed graphs

We now define a characteristic function $f_G$ to rate the vertices in directed graphs. The rating will measure the influence of a vertex on the connectivity structure. The smaller the rating of a vertex the greater its importance to the connectivity.

Let $G = (V_G, E_G)$ be a finite directed graph, where $V_G$ is a finite set of vertices and $E_G \subseteq V_G \times V_G$ is a finite set of directed edges. A path in $G$ is a sequence $p = (v_1, \ldots, v_k)$, $k \ge 1$, of distinct vertices such that $(v_i, v_{i+1}) \in E_G$ for $i = 1, \ldots, k-1$. We say $p$ is a path of length $k$ from $v_1$ to $v_k$. A path is called a cycle of $G$ if $G$ additionally has edge $(v_k, v_1)$. We will consider only simple paths and cycles in which all vertices are distinct. For a vertex set $V' \subseteq V_G$, let $G|_{V'}$ be the subgraph of $G$ induced by the vertices of $V'$, that is, $G|_{V'} = (V_{G'}, E_{G'})$ where $V_{G'} = V'$ and $E_{G'} = E_G \cap (V' \times V')$. $G$ is strongly connected if for every pair of vertices $u, v \in V_G$ there is a path from $u$ to $v$ in $G$. A strongly connected component of $G$ is a maximal strongly connected subgraph of $G$.

Let $\mathrm{SCC}(G)$ be the set of all strongly connected components of $G$, and $f_G$ be a function from the subsets of $V_G$ to the real numbers $\mathbb{R}$ (here we need only the set of non-negative integers) such that for every subset $V' \subseteq V_G$,

$$f_G(V') = |\mathrm{SCC}(G|_{V'})|.$$

That is, $f_G(V')$ is the number of strongly connected components in the subgraph of $G$ induced by the vertices of $V'$. Note that $f_G$ is a characteristic function, because $f_G(\emptyset)$ is always zero. The complete vertex set $V_G$ is always the only carrier of $f_G$ for every directed graph $G$. By Axiom 2, we have

$$\sum_{v \in V_G} \phi_{f_G}(v) = f_G(V_G) = |\mathrm{SCC}(G)|.$$
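The definition of $f_G$ and the resulting vertex rating can be tried out on a tiny graph. The sketch below is an illustration under stated assumptions, not the authors' implementation: `count_sccs` and `vertex_ratings` are hypothetical names, strongly connected components are found by plain reachability (adequate only for very small graphs), and the rating is evaluated by enumerating all orders as in Equation 2.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def count_sccs(vertices, edges):
    """f_G(V'): number of strongly connected components of G induced by V'."""
    vs = set(vertices)
    succ = {v: [b for (a, b) in edges if a == v and b in vs] for v in vs}

    def reach(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in succ[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reachable = {v: reach(v) for v in vs}
    # The component of v holds every w with paths v -> w and w -> v.
    components = {
        frozenset(w for w in reachable[v] if v in reachable[w]) for v in vs
    }
    return len(components)


def vertex_ratings(vertices, edges):
    """Rating of every vertex, by enumerating all orders (Equation 2)."""
    vertices = list(vertices)
    phi = {v: Fraction(0) for v in vertices}
    for order in permutations(vertices):
        left = []
        for v in order:
            phi[v] += count_sccs(left + [v], edges) - count_sccs(left, edges)
            left.append(v)
    n_fact = factorial(len(vertices))
    return {v: total / n_fact for v, total in phi.items()}


# Hypothetical example: the directed cycle 1 -> 2 -> 3 -> 1 plus the edge 3 -> 4.
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 1), (3, 4)]
ratings = vertex_ratings(V, E)
print(ratings)  # 1/3 for the three cycle vertices, 1 for vertex 4
```

The ratings sum to 2, the number of strongly connected components of this example graph, in line with Axiom 2.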

Figure 1 shows an example of the vertex rating $\phi_{f_G}$ for a directed graph $G$ with vertex set $V_G = \{v_1, v_2, v_3, v_4, v_5, v_6, v_7, v_8\}$. Since $G$ is strongly connected, we get $f_G(V_G) = \sum_{v \in V_G} \phi_{f_G}(v) = 1$. Following the computation of $\phi_{f_G}$ by Equation 2, vertex $v_8$ has rating $\phi_{f_G}(v_8) = \frac{1}{2}$, because $f_G(m(\pi, v_8) \cup \{v_8\}) - f_G(m(\pi, v_8)) = 0$ if and only if $\pi(v_6) < \pi(v_8)$. Otherwise, we have $f_G(m(\pi, v_8) \cup \{v_8\}) - f_G(m(\pi, v_8)) = 1$, which happens for half of all $8!$ enumerations $\pi$. Vertex $v_1$ has rating $\phi_{f_G}(v_1) = \frac{2}{3}$, because $f_G(m(\pi, v_1) \cup \{v_1\}) - f_G(m(\pi, v_1)) = 0$ if and only if $\pi(v_2) < \pi(v_1)$ and $\pi(v_3) < \pi(v_1)$. Otherwise, we have $f_G(m(\pi, v_1) \cup \{v_1\}) - f_G(m(\pi, v_1)) = 1$. Here the second case happens for two thirds of all $8!$ enumerations $\pi$.

Fig. 1. The vertex rating $\phi_{f_G}$ for a directed graph $G$ with 8 vertices. The smaller the rating of a vertex the greater its importance to the connectivity of the graph.

Let $G = (V_G, E_G)$ and $G' = (V_{G'}, E_{G'})$ be two directed graphs. We call $G$ and $G'$ isomorphic if there is a one to one mapping $b : V_G \to V_{G'}$ such that for every pair of vertices $v_1, v_2 \in V_G$,

$$(v_1, v_2) \in E_G \iff (b(v_1), b(v_2)) \in E_{G'}.$$

Such a mapping $b$ is called an isomorphism between $G$ and $G'$. If $G$ and $G'$ are isomorphic then $f_G(V') = f_{G'}(b(V'))$ for every vertex set $V' \subseteq V_G$. Here $b(V') = \{b(u) \mid u \in V'\}$ is the image of $V'$ under $b$. This implies $\phi_{f_G}(v) = \phi_{f_{G'}}(b(v))$ for all vertices $v \in V_G$.

Let $V' \subseteq V_G$ be any set of vertices of $G$. Graph $G$ is called $V'$-symmetric if for every pair of vertices $v_1, v_2 \in V'$ there is an isomorphism $b$ of $G$ to $G$ itself such that $b(v_1) = v_2$. In $V'$-symmetric graphs all vertices $v \in V'$ have the same rating. If two vertices $v_1, v_2$ have the same neighborhood, i.e., if $\{u \mid (u, v_1) \in E_G\} = \{u \mid (u, v_2) \in E_G\}$ and $\{u \mid (v_1, u) \in E_G\} = \{u \mid (v_2, u) \in E_G\}$, then $G$ obviously is $\{v_1, v_2\}$-symmetric. Figure 2 shows some examples of partially symmetric graphs.

Fig. 2. The graph to the left is $\{v_1, v_3, v_5, v_7\}$-symmetric and $\{v_2, v_4, v_6, v_8\}$-symmetric, the graph in the middle is $\{v_1, v_2, v_3\}$-symmetric, and the graph to the right is $\{v_1, v_2, v_3, v_4, v_5, v_6\}$-symmetric.

The computation of a vertex rating $\phi_{f_G}(v)$ by Equation 1 or Equation 2 is highly inefficient. The number of subsets and the number of enumerations increase exponentially in the number of vertices of $G$. To handle the computation of $\phi_{f_G}$ for many practically interesting instances we will introduce a method to decompose a large graph into smaller parts. This decomposition will allow us to compute efficiently the ratings of vertices of the original graph by using the ratings of the vertices of smaller subgraphs. Our decomposition method will be introduced by the following two lemmas and Theorem 1. The first lemma shows that the computation of a rating $\phi_{f_G}(v)$ for which the arguments of $f_G$ are restricted to vertices of a subset $V' \subseteq V_G$ yields the computation of $\phi_{f_{G|_{V'}}}(v)$.


Lemma 1. Let $G = (V_G, E_G)$ be a graph, $V' \subseteq V_G$, and $G' = G|_{V'}$. Let $\Pi_{V_G}$ be the set of all enumerations $\pi : V_G \to \{1, \ldots, |V_G|\}$. Then for every vertex $v \in V'$,

$$\phi_{f_{G'}}(v) = \frac{1}{|V_G|!} \sum_{\pi \in \Pi_{V_G}} \bigl(f_G((m(\pi, v) \cup \{v\}) \cap V') - f_G(m(\pi, v) \cap V')\bigr).$$

Proof. Let $\Pi_{V'}$ be the set of all enumerations $\pi' : V' \to \{1, \ldots, |V'|\}$. First we show that for every enumeration $\pi' \in \Pi_{V'}$ there are $(|V'| + 1) \cdot (|V'| + 2) \cdots |V_G|$ unique enumerations $\pi \in \Pi_{V_G}$ such that for every pair of vertices $v_1, v_2 \in V'$, $\pi'(v_1) < \pi'(v_2)$ if and only if $\pi(v_1) < \pi(v_2)$. Let $p = (v_{i_1}, \ldots, v_{i_{|V'|}})$ be the sequence of vertices of $V'$ in the order defined by $\pi'$, that is, $\pi'(v_{i_1}) < \pi'(v_{i_2}) < \cdots < \pi'(v_{i_{|V'|}})$.

If we consider the vertices of $V_G - V'$ in an arbitrary order, then the first vertex of $V_G - V'$ can be placed at $|V'| + 1$ positions in sequence $p$ to get a sequence with $|V'| + 1$ vertices. After that the next vertex can be placed at $|V'| + 2$ positions in the resulting sequence to get a sequence with $|V'| + 2$ vertices, and so on. The final vertex of $V_G - V'$ can be placed at $|V_G|$ positions in the sequence obtained by the preceding placement to get a sequence of all $|V_G|$ vertices of $G$. For all these $(|V'| + 1) \cdot (|V'| + 2) \cdots |V_G|$ enumerations $\pi$ defined for enumeration $\pi'$ we have

$$f_{G'}(m(\pi', v) \cup \{v\}) - f_{G'}(m(\pi', v)) = f_{G'}((m(\pi, v) \cup \{v\}) \cap V') - f_{G'}(m(\pi, v) \cap V')$$

for every vertex $v \in V'$, and thus

$$\phi_{f_{G'}}(v) = \frac{1}{|V'|!} \sum_{\pi' \in \Pi_{V'}} \bigl(f_{G'}(m(\pi', v) \cup \{v\}) - f_{G'}(m(\pi', v))\bigr)$$

$$= \frac{1}{|V'|!} \sum_{\pi \in \Pi_{V_G}} \frac{f_{G'}((m(\pi, v) \cup \{v\}) \cap V') - f_{G'}(m(\pi, v) \cap V')}{(|V'| + 1) \cdot (|V'| + 2) \cdots |V_G|}$$

$$= \frac{1}{|V_G|!} \sum_{\pi \in \Pi_{V_G}} \bigl(f_G((m(\pi, v) \cup \{v\}) \cap V') - f_G(m(\pi, v) \cap V')\bigr).$$

The last equality follows from the fact that $f_{G'}(V'' \cap V') = f_G(V'' \cap V')$ for every subset $V'' \subseteq V_G$. □

It is easy to see that the rating of a vertex in a graph $G$ depends only on the connectivity structure of the strongly connected component the vertex belongs to, as the following observation shows. If $G' = (V_{G'}, E_{G'})$ is a strongly connected component of $G = (V_G, E_G)$ then for every vertex $v \in V_{G'}$ and every vertex set $V'' \subseteq V_G$,

$$f_G((V'' \cup \{v\}) \cap V_{G'}) - f_G(V'' \cap V_{G'}) = f_G(V'' \cup \{v\}) - f_G(V''),$$

and thus by Lemma 1, $\phi_{f_{G'}}(v) = \phi_{f_G}(v)$.


We will now define a property of a vertex pair $u, v$ that allows us to compute independently the rating for two vertices $u$ and $v$. That is, the rating of $u$ in $G$ will be equal to the rating of $u$ in graph $G$ without $v$.

Definition 1. Let $G = (V_G, E_G)$ be a directed graph and $u, v \in V_G$ be two nonadjacent vertices, that is, neither $(u, v)$ nor $(v, u)$ is an edge of $G$. Vertex $u$ and vertex $v$ are strongly separable in $G$ if for every strongly connected induced subgraph $H = (V_H, E_H)$ of $G$ which contains $u$ and $v$ there is a strongly connected subgraph $J = (V_J, E_J)$ of $H$ without $u$ and $v$ such that $H|_{V_H - V_J}$ has no path from $u$ to $v$ and no path from $v$ to $u$.

For the proof of the next lemma we need the notion of an undirected graph. In an undirected graph $G = (V_G, E_G)$ the edge set is a subset of $\{\{u, v\} \mid u, v \in V_G, u \ne v\}$. Analogously to the definitions for directed graphs, an undirected path of length $k$, $k \ge 1$, is a sequence $p = (v_1, \ldots, v_k)$ of $k$ distinct vertices such that $\{v_i, v_{i+1}\} \in E_G$ for $i = 1, \ldots, k-1$. An undirected path is called an undirected cycle if $G$ additionally has edge $\{v_k, v_1\}$ and the path has at least three vertices. The subgraph of $G$ induced by a vertex set $V' \subseteq V_G$ has edge set $E_G \cap \{\{u, v\} \mid u, v \in V', u \ne v\}$. A graph is connected if there is a path between every pair of vertices, a connected component is a maximal connected subgraph, a forest is an undirected graph without cycles, and a tree is a connected forest.

Lemma 2. Let $G = (V_G, E_G)$ be a directed graph and $V_H, V_J \subseteq V_G$ be two vertex sets such that $V_H \cup V_J = V_G$ and for every edge $(v_1, v_2) \in E_G$ both vertices are in $V_H$ or in $V_J$, or in both sets. Let $H = G|_{V_H}$, $J = G|_{V_J}$, and $I = G|_{V_H \cap V_J}$. If every pair of vertices $u \in V_H - V_J$, $v \in V_J - V_H$ is strongly separable in $G$, then for every vertex set $V' \subseteq V_G$,

$$f_G(V') = f_H(V' \cap V_H) + f_J(V' \cap V_J) - f_I(V' \cap V_I).$$

Proof. Let $V' \subseteq V_G$ be any set of vertices of $G$. Consider the following undirected graph $T = (V_T, E_T)$ with vertex set $V_T = \mathrm{SCC}(H|_{V'}) \cup \mathrm{SCC}(J|_{V'})$ such that two vertices of $V_T$ are connected by an undirected edge if and only if the two strongly connected components have at least one common vertex. If two distinct strongly connected components of $V_T$ are connected by an undirected edge in $T$ then one of them has to be from $\mathrm{SCC}(H|_{V'})$ and the other has to be from $\mathrm{SCC}(J|_{V'})$. Furthermore, for every strongly connected component $C$ of $\mathrm{SCC}(I|_{V'})$, there is exactly one strongly connected component $C_1$ of $\mathrm{SCC}(H|_{V'})$ and exactly one strongly connected component $C_2$ of $\mathrm{SCC}(J|_{V'})$, and the common vertices of $C_1$ and $C_2$ are exactly the vertices of $C$.

Since every pair of vertices $u \in V_H - V_J$, $v \in V_J - V_H$ is strongly separable in $G$, the undirected graph $T$ has no cycles, that is, $T$ is a forest. The number of connected components of $T$ (the number of trees of forest $T$) is equal to the number of strongly connected components of $G|_{V'}$. The number of connected components in a forest is always equal to its number of vertices minus its number of edges. Since $T$ has exactly one edge for every strongly connected component of $\mathrm{SCC}(I|_{V'})$ and exactly one vertex for every strongly connected component of $\mathrm{SCC}(H|_{V'})$ and $\mathrm{SCC}(J|_{V'})$, we get

$$f_G(V') = f_H(V' \cap V_H) + f_J(V' \cap V_J) - f_I(V' \cap V_I).$$ □

The following theorem states how ratings of vertices of $G$ can be computed by the ratings of the same vertices in certain subgraphs of $G$.

Theorem 1. Let $G = (V_G, E_G)$ be a directed graph and $V_H, V_J \subseteq V_G$ be two vertex sets such that $V_H \cup V_J = V_G$ and for every edge $(v_1, v_2) \in E_G$ both vertices $v_1, v_2$ are in $V_H$ or in $V_J$, or in both sets. Let $H = G|_{V_H}$, $J = G|_{V_J}$, and $I = G|_{V_H \cap V_J}$. If every pair of vertices $u \in V_H - V_J$, $v \in V_J - V_H$ is strongly separable in $G$, then

1. for every vertex $w \in V_H \cap V_J$, $\phi_{f_G}(w) = \phi_{f_H}(w) + \phi_{f_J}(w) - \phi_{f_I}(w)$,
2. for every vertex $w \in V_H - V_J$, $\phi_{f_G}(w) = \phi_{f_H}(w)$, and
3. for every vertex $w \in V_J - V_H$, $\phi_{f_G}(w) = \phi_{f_J}(w)$.

Proof. Let $w$ be any vertex of $V_G$. By Lemma 2, for every vertex set $V' \subseteq V_G$,

$$f_G(V' \cup \{w\}) - f_G(V') = \bigl(f_H((V' \cup \{w\}) \cap V_H) - f_H(V' \cap V_H)\bigr) + \bigl(f_J((V' \cup \{w\}) \cap V_J) - f_J(V' \cap V_J)\bigr) - \bigl(f_I((V' \cup \{w\}) \cap V_I) - f_I(V' \cap V_I)\bigr).$$

If $w \in V_H \cap V_J$, then by Lemma 1 we get $\phi_{f_G}(w) = \phi_{f_H}(w) + \phi_{f_J}(w) - \phi_{f_I}(w)$.

If $w$ is a vertex of $V_H - V_J$, then $(V' \cup \{w\}) \cap V_J = V' \cap V_J$ and $(V' \cup \{w\}) \cap V_I = V' \cap V_I$, and thus

$$f_G(V' \cup \{w\}) - f_G(V') = f_H((V' \cup \{w\}) \cap V_H) - f_H(V' \cap V_H),$$

which implies by Lemma 1 $\phi_{f_G}(w) = \phi_{f_H}(w)$. If $w$ is a vertex of $V_J - V_H$, then an analogous argumentation yields $\phi_{f_G}(w) = \phi_{f_J}(w)$. □
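The decomposition stated in Theorem 1 can be checked by brute force on a small instance. The following sketch rests on assumptions that are not part of the paper: the helper names are hypothetical, the example graph (two triangles sharing the vertices 1 and 2, every undirected edge given as two opposite directed edges) was chosen by hand, and the pair 3, 4 was verified manually to be strongly separable according to Definition 1.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def count_sccs(vertices, edges):
    """Number of strongly connected components of the induced subgraph."""
    vs = set(vertices)
    succ = {v: [b for (a, b) in edges if a == v and b in vs] for v in vs}

    def reach(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in succ[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reachable = {v: reach(v) for v in vs}
    return len({frozenset(w for w in reachable[v] if v in reachable[w]) for v in vs})


def ratings(vertices, edges):
    """Ratings in the subgraph induced by `vertices`, via Equation 2."""
    vertices = list(vertices)
    phi = {v: Fraction(0) for v in vertices}
    for order in permutations(vertices):
        for i, v in enumerate(order):
            phi[v] += count_sccs(order[: i + 1], edges) - count_sccs(order[:i], edges)
    return {v: total / factorial(len(vertices)) for v, total in phi.items()}


# Hypothetical example graph; V_H and V_J overlap in the vertices 1 and 2.
undirected = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4)]
E = undirected + [(b, a) for (a, b) in undirected]
V_H, V_J = {1, 2, 3}, {1, 2, 4}
V_I = V_H & V_J

phi_G = ratings(V_H | V_J, E)
phi_H, phi_J, phi_I = ratings(V_H, E), ratings(V_J, E), ratings(V_I, E)

case_1 = all(phi_G[w] == phi_H[w] + phi_J[w] - phi_I[w] for w in V_I)
case_2 = all(phi_G[w] == phi_H[w] for w in V_H - V_J)
case_3 = all(phi_G[w] == phi_J[w] for w in V_J - V_H)
print(phi_G, case_1, case_2, case_3)
```

For this example the shared vertices 1 and 2 receive the rating $\frac{1}{3} + \frac{1}{3} - \frac{1}{2} = \frac{1}{6}$, and the vertices 3 and 4 keep the rating $\frac{1}{3}$ they have in their own triangle.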

Fig. 3. Four graphs $G = (V_G, E_G)$, $H = G|_{V_H}$, $J = G|_{V_J}$, and $I = G|_{V_H \cap V_J}$ such that $V_H, V_J \subseteq V_G$, $V_H \cup V_J = V_G$, and for every edge $(v_1, v_2) \in E_G$ both vertices are in $V_H$ or in $V_J$, or in both sets. Vertex pair $v_1 \in V_H - V_J$, $v_6 \in V_J - V_H$ is strongly separable in $G$.

There are no polynomial time algorithms which decide whether two vertices in a directed graph are not strongly separable, unless P = NP. The NP-hardness follows by a simple reduction from the satisfiability problem. The terms we use in describing this problem are the following. Let $X = \{x_1, \ldots, x_n\}$ be a set of Boolean variables. A truth assignment for $X$ is a function $t : X \to \{\text{true}, \text{false}\}$. If $t(x_i) = \text{true}$ we say variable $x_i$ is true under $t$; if $t(x_i) = \text{false}$ we say variable $x_i$ is false under $t$. If $x_i$ is a variable of $X$, then $x_i$ and $\overline{x_i}$ are literals over $X$. Literal $x_i$ is true under $t$ if and only if variable $x_i$ is true under $t$; literal $\overline{x_i}$ is true under $t$ if and only if variable $x_i$ is false under $t$. A clause over $X$ is a set of literals over $X$, for example $\{x_1, \overline{x_3}, x_4\}$. It represents the disjunction of literals which is satisfied by a truth assignment $t$ if and only if at least one of its literals is true under $t$. A collection $C$ of clauses over $X$ is satisfiable if and only if there is a truth assignment $t$ that simultaneously satisfies all clauses of $C$. The satisfiability problem, denoted by SAT, is specified as follows. Given a set $X$ of variables and a collection $C$ of clauses over $X$, is there a satisfying truth assignment for $C$? This problem is NP-complete even for the case that every clause of $C$ has exactly three distinct literals (3-SAT, for short).

Theorem 2. The problem to decide whether two vertices $u, v$ of a directed graph $G$ are not strongly separable is NP-complete.

Proof. Let us first illustrate that the problem belongs to NP. Two vertices $u$ and $v$ are not strongly separable in $G$ if and only if $G$ has a strongly connected induced subgraph $G' = (V_{G'}, E_{G'})$ that includes $u$ and $v$ such that $G'|_{V_{G'} - \{u, v\}}$ has no strongly connected subgraph $G'' = (V_{G''}, E_{G''})$ such that in $G'|_{V_{G'} - V_{G''}}$ there is no path from $u$ to $v$ and no path from $v$ to $u$. Without loss of generality we can assume that $G''$ is a strongly connected component of $G'|_{V_{G'} - \{u, v\}}$. So we can non-deterministically consider every strongly connected induced subgraph $G'$ of $G$ that includes $u$ and $v$. Then we can verify in polynomial time for every strongly connected component $G'' = (V_{G''}, E_{G''})$ of $G'|_{V_{G'} - \{u, v\}}$ whether $G'|_{V_{G'} - V_{G''}}$ has no path from $u$ to $v$ and no path from $v$ to $u$. Thus, the problem to decide whether two vertices $u, v$ are not strongly separable belongs to NP.


The NP-hardness follows by a simple transformation from 3-SAT. Let $X = \{x_1, \ldots, x_n\}$ be a set of $n$ Boolean variables and $C = \{C_1, \ldots, C_m\}$ be a collection of $m$ clauses. We define a graph $G(X, C)$ with two vertices $u, v$ such that there is a truth assignment $t$ for $X$ that satisfies every clause of $C$ if and only if $u$ and $v$ are not strongly separable in $G(X, C)$. Figure 4 shows an example of such a construction for four variables $x_1, x_2, x_3, x_4$ and four clauses

{ X 2 , X 3 , X i } , {xi,X^,X4}, {xl,X^,X^}, {xT,X2,Xs}.

Graph $G(X, C)$ has six vertices $u, a, b, v, c, d$, two literal vertices $x_i, \overline{x_i}$ for every variable $x_i$, $1 \le i \le n$, and three literal vertices $c_{j,1}, c_{j,2}, c_{j,3}$ for every clause $C_j = \{c_{j,1}, c_{j,2}, c_{j,3}\}$, $1 \le j \le m$. $G(X, C)$ has the edges $(u, a)$, $(a, x_1)$, $(a, \overline{x_1})$, the edges $(x_i, x_{i+1})$, $(x_i, \overline{x_{i+1}})$, $(\overline{x_i}, x_{i+1})$, $(\overline{x_i}, \overline{x_{i+1}})$ for $i = 1, \ldots, n-1$, the edges $(x_n, b)$, $(\overline{x_n}, b)$, $(b, v)$, $(v, c)$, $(c, c_{1,1})$, $(c, c_{1,2})$, $(c, c_{1,3})$, the edges $(c_{j,k}, c_{j+1,l})$ for $j = 1, \ldots, m-1$ and $k, l \in \{1, 2, 3\}$, and the edges $(c_{m,1}, d)$, $(c_{m,2}, d)$, $(c_{m,3}, d)$, $(d, u)$, and $(d, a)$. Additionally, there are so-called cross edges from every literal vertex $x_i$ ($\overline{x_i}$) for variable $x_i$ to every literal vertex $\overline{x_i}$ ($x_i$, respectively) for some clauses. In Figure 4, the cross edges are drawn as dotted arcs.

Fig. 4. The graph $G(X, C)$ for $X = \{x_1, x_2, x_3, x_4\}$ and the four clauses listed above. (Figure labels: literal vertices for variables; literal vertices for clauses.)

Every cycle of $G(X, C)$ that includes vertex $u$ and $v$ consists of two vertex disjoint paths $p_1 = (u, a, \ldots, b, v)$ and $p_2 = (v, c, \ldots, d, u)$. Path $p_1$ passes exactly one literal vertex for every variable, and defines in this way an assignment $t$ for the variables, whereas path $p_2$ passes exactly one literal vertex for every clause.

Assume there is a truth assignment $t$ for $X$ that satisfies every clause. Then there is an induced subgraph $G'$ of $G(X, C)$ that includes vertex $u$ and $v$ but no cross edge, for example the subgraph of $G(X, C)$ induced by $u, v, a, b, c, d$ and all true literal vertices. In this case, it is not possible to destroy all paths between $u$ and $v$ and all paths between $v$ and $u$ by removing a strongly connected subgraph of $G'$. Thus $u$ and $v$ are not strongly separable.

Assume there is no truth assignment for $X$ that satisfies every clause. Then every strongly connected induced subgraph $G'$ of $G(X, C)$ that includes $u$ and $v$ has at least one cross edge $(u', v')$. In this case it is easy to destroy all paths from $u$ to $v$ and all paths from $v$ to $u$ by removing a cycle that includes the edge $(d, a)$ and the cross edge $(u', v')$. Thus $u$ and $v$ are strongly separable. □

Theorem 2 can be used to prove that deciding whether two vertices have a different rating is NP-hard. Consider again the graph $G(X, C)$ with the two vertices $u$ and $v$ constructed for an instance $(X, C)$ of 3-SAT as in the proof of Theorem 2. Let $G'(X, C)$ be the graph $G(X, C)$ without the vertex $v$ and its incident edges. Then $\phi_{f_{G(X,C)}}(u) = \phi_{f_{G'(X,C)}}(u)$ if $u$ and $v$ are strongly separable in $G$, and $\phi_{f_{G(X,C)}}(u) < \phi_{f_{G'(X,C)}}(u)$ if $u$ and $v$ are not strongly separable in $G$.

Theorem 3. The problem to decide whether $\phi_{f_G}(u) < \phi_{f_G}(v)$ for two vertices $u, v$ of a directed graph $G$ is NP-hard.

Thus, an algorithm for the computation of $\phi_{f_G}$ can be used to decide an NP-hard as well as a co-NP-hard decision problem.

4 A vertex rating for undirected graphs

The vertex rating $\phi_{f_G}$ for directed graphs can simply be extended to undirected graphs. For an undirected graph $G$ let $\mathrm{dir}(G)$ be the directed graph we get if we replace every undirected edge $\{u, v\}$ by two directed edges $(u, v)$ and $(v, u)$. Let $f_G$ now be the function from the subsets of $V_G$ to the real numbers $\mathbb{R}$ such that for every $V' \subseteq V_G$, $f_G(V')$ is the number of connected components in the subgraph of $G$ induced by the vertices of $V'$. That is, the rating of a vertex $v$ in an undirected graph $G$ is equal to the rating of $v$ in the directed graph $\mathrm{dir}(G)$. Figure 5 shows an example of the vertex rating $\phi_{f_G}$ for an undirected graph $G$ with vertex set $V_G = \{v_1, v_2, v_3, v_4, v_5, v_6, v_7, v_8\}$.

Fig. 5. The vertex rating $\phi_{f_G}$ for an undirected graph $G$ with 8 vertices.

It is easy to verify that two vertices $u, v$ of $\mathrm{dir}(G)$ are not strongly separable if and only if $G$ has a chordless cycle that includes $u$ and $v$. A chord for a cycle $c = (u_1, \ldots, u_k)$ is an edge $\{u_i, u_j\}$ such that $2 \le |i - j| \le k - 2$. The problem of determining whether an undirected graph $G$ contains a chordless cycle can be solved in linear time [3, 15, 17]. This is the well-known chordal graph recognition problem. A graph $G$ is a chordal graph if any cycle of $G$ of length at least four has at least one chord, or alternatively, if $G$ has no chordless cycle, see [8]. The problem of determining whether $G$ contains a chordless cycle of length $k \ge 5$ can be solved in $O(|V_G| + |E_G|^2)$ time on $O(|V_G| \cdot |E_G|)$ space, see [11].

Theorem 1 applied to undirected graphs yields the following theorem which is a more general version of Proposition 2 of [7].

Theorem 4. Let $G = (V_G, E_G)$ be an undirected graph and $V_H, V_J \subseteq V_G$ be two vertex sets such that $V_H \cup V_J = V_G$ and for every edge $\{v_1, v_2\} \in E_G$ both vertices $v_1, v_2$ are in $V_H$ or in $V_J$, or in both sets. Let $H = G|_{V_H}$, $J = G|_{V_J}$, and $I = G|_{V_H \cap V_J}$. If $G$ has no chordless cycle with a vertex of $V_H - V_J$ and a vertex of $V_J - V_H$, then

1. for every vertex $w \in V_H \cap V_J$, $\phi_{f_G}(w) = \phi_{f_H}(w) + \phi_{f_J}(w) - \phi_{f_I}(w)$,
2. for every vertex $w \in V_H - V_J$, $\phi_{f_G}(w) = \phi_{f_H}(w)$, and
3. for every vertex $w \in V_J - V_H$, $\phi_{f_G}(w) = \phi_{f_J}(w)$.

Let $G$ be an undirected graph that is divided into $k > 1$ connected components $G_1, \ldots, G_k$ by removing a complete subgraph $I$ of $G$. Let $G'_i = G|_{V_{G_i} \cup V_I}$ for $i = 1, \ldots, k$ be the subgraphs of $G$ induced by the vertices of connected component $G_i$ and the vertices of the removed complete subgraph $I$. Then by Theorem 4 for every $i = 1, \ldots, k$ the vertex rating for a vertex $w$ of $G_i$ is $\phi_{f_G}(w) = \phi_{f_{G'_i}}(w)$ and the vertex rating for a vertex $w$ of $I$ is

$$\phi_{f_G}(w) = \sum_{i=1}^{k} \phi_{f_{G'_i}}(w) - (k - 1) \cdot \phi_{f_I}(w).$$

An example of a class of graphs for which the rating $\phi_{f_G}$ is efficiently computable is the class of cycle composed graphs which can recursively be defined as follows. The cycle $C_n$ with $n \ge 3$ vertices is cycle composed. Let $G = (V_G, E_G)$ be a cycle composed graph and $e_1 = \{u_1, v_1\}$ be an edge of $G$. Let $C_n = (V_{C_n}, E_{C_n})$ be a cycle with $n \ge 3$ vertices and $e_2 = \{u_2, v_2\}$ be an edge of $C_n$. Then the graph obtained by the vertex disjoint union of $G$ and $C_n$ and the identification of $u_2$ with $u_1$ and $v_2$ with $v_1$ is cycle composed. That is, the composed graph has vertex set $V_G \cup V_{C_n} - \{u_2, v_2\}$ and edge set $\{\{h(u), h(v)\} \mid \{u, v\} \in E_G \cup E_{C_n}\}$ where $h(u) = u$ for every $u \in V_G \cup V_{C_n} - \{u_2, v_2\}$, and $h(u_2) = u_1$ and $h(v_2) = v_1$.

Cycle composed graphs are biconnected and have tree-width at most 2, see [13, 1] for a definition of tree-width. Graphs of tree-width at most 2 can be recognized in linear time by removing vertices of degree at most 2. When a vertex $u$ of degree 2 is removed then the two neighbors of $u$ will be connected by an edge if they are not adjacent. A graph has tree-width at most 2 if and only if it can completely be reduced by removing vertices of degree at most 2 in the way described above, see for example [19].

Let $\mathcal{C}$ be the set of vertex sets of the cycles used to compose a cycle composed graph $G$, that is, $\mathcal{C}$ has a vertex set $C$ for every cycle used to compose $G$. The vertex rating $\phi_{f_G}(w)$ for all vertices $w$ of $G$ is computable in linear time by the following simple procedure.

1. for every $u \in V_G$ do {
2.   let $\phi_{f_G}(u) := (\deg(u) - 2) \cdot (-\frac{1}{2})$; }
3. for every $C \in \mathcal{C}$ do {
4.   for every $u \in C$ do {
5.     let $\phi_{f_G}(u) := \phi_{f_G}(u) + \frac{1}{|C|}$; } }

Since a vertex $u$ is involved in $\deg(u) - 2$ vertex identifications, the rating for $u$ can be initialized by $(\deg(u) - 2) \cdot (-\frac{1}{2})$. After that the algorithm adds for every cycle $C$ the fraction $\frac{1}{|C|}$ to the rating of every vertex of $C$. Since the number of vertices in the sets of $\mathcal{C}$ is $\sum_{C \in \mathcal{C}} |C| = 2|E_G| - |V_G|$, the rating for all vertices in cycle composed graphs can be computed in linear time, if $\mathcal{C}$ is given.

The vertex sets of the cycles can be computed by the following algorithm. We assume that an empty vertex list is initially assigned to every edge. That is, every edge $\{u, v\}$ is initially represented as a pair $(\{u, v\}, \emptyset)$. An edge $e = (\{u, v\}, L)$ with a non-empty vertex list $L$ represents a path between $u$ and $v$ passing the vertices of $L$. If $G$ has a vertex $u$ of degree 2 such that the two neighbors $v, w$ of $u$ are not adjacent, we remove vertex $u$ and its two incident edges $(\{u, v\}, L_1)$, $(\{u, w\}, L_2)$ and insert a new edge $(\{v, w\}, L_1 \cup L_2 \cup \{u\})$ between $v$ and $w$. If $G$ has a vertex $u$ of degree 2 such that the two neighbors $v, w$ of $u$ are adjacent, the vertices of a cycle can be reported. Let $(\{u, v\}, L_1)$, $(\{u, w\}, L_2)$, $(\{v, w\}, L_3)$ be the three edges between the vertices $u$, $v$ and $w$. The algorithm then reports vertex set $L_1 \cup L_2 \cup L_3 \cup \{u, v, w\}$. If graph $G$ has no further edges than the three edges above, then all cycles are reported and the algorithm finishes. If graph $G$ has some further edges and $L_3$ is non-empty, then the graph is not cycle composed. In any other case the algorithm removes the two edges $(\{u, v\}, L_1)$, $(\{u, w\}, L_2)$ and continues. If this processing ends because there are no further vertices of degree 2, then the graph is also not cycle composed. This algorithm computes the vertex sets of all cycles used to compose a cycle composed graph.
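The five-step rating procedure above translates almost directly into code. The sketch below is an illustration, not the authors' implementation: the function name `cycle_composed_ratings`, the edge-list input format and the example graph are assumptions, and the set of composing cycles is taken as given.

```python
from fractions import Fraction


def cycle_composed_ratings(edges, cycles):
    """Ratings of a cycle composed graph from its edges and composing cycles."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Steps 1-2: a vertex of degree d takes part in d - 2 identifications.
    phi = {u: Fraction(degree[u] - 2) * Fraction(-1, 2) for u in degree}
    # Steps 3-5: every cycle C adds 1/|C| to the rating of each of its vertices.
    for cycle in cycles:
        for u in cycle:
            phi[u] += Fraction(1, len(cycle))
    return phi


# Hypothetical example: the 4-cycle (1, 2, 3, 4) and the triangle (1, 2, 5)
# glued together at the edge {1, 2}.
EDGES = [(1, 2), (2, 3), (3, 4), (4, 1), (2, 5), (5, 1)]
CYCLES = [{1, 2, 3, 4}, {1, 2, 5}]
phi = cycle_composed_ratings(EDGES, CYCLES)
print(phi)  # 1/12 for vertices 1 and 2, 1/4 for 3 and 4, 1/3 for 5
```

The values agree with Theorem 4: the shared vertices get $\frac{1}{4} + \frac{1}{3} - \frac{1}{2} = \frac{1}{12}$, and all ratings sum to 1 because the example graph is connected.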
The running time of this algorithm is $O(|V_G|^2)$ because we have to check for every vertex whether its two neighbors are adjacent. However, this problem can be eliminated by a simple trick which is also used in [19] for the recognition of outerplanar graphs. The trick is to check whether the two neighbors $v, w$ of $u$ are adjacent at the time when one of these two vertices $v, w$ gets a degree of 2 or less. At that point the test can be done in a fixed number of steps and either a new edge is inserted or a cycle is reported. This modification yields a linear time algorithm for the computation of all cycles of a cycle composed graph. The following example shows a possible implementation.


create-new-edge (vertex $u$) {
  let $e_1 = (\{u,v\}, L_1)$, $e_2 = (\{u,w\}, L_2) \in E_G$ be the two edges incident to $u$;
  insert $(\{v,w\}, L_1 \cup L_2 \cup \{u\})$ into $E_{new}$;
  remove $e_1$ and $e_2$ from $E_G$;
  if ($\deg_{E_G}(v) = 2$) then insert $v$ into $M$;
  if ($\deg_{E_G}(w) = 2$) then insert $w$ into $M$;
}

move-new-edge (edge $e_{new} = (\{u,v\}, L_{new})$) {
  if there is an edge $e = (\{u,v\}, L) \in E_G$ then {
    output $L \cup L_{new} \cup \{u,v\}$;
    remove $e_{new}$ from $E_{new}$;
    if ($|E_G| = 1$) and ($|E_{new}| = 0$) then halt "all cycles reported";
    else if ($L \neq \emptyset$) then halt "G is not cycle composed";
  }
  else {
    remove $e_{new}$ from $E_{new}$;
    insert $e_{new}$ into $E_G$;
    if ($\deg_{E_G}(u) = 3$) then remove $u$ from $M$;
    if ($\deg_{E_G}(v) = 3$) then remove $v$ from $M$;
  }
}

compute-cycles (graph $G = (V_G, E_G)$) {
  let $M := \emptyset$;
  for every $u \in V_G$ do {
    if ($\deg_{E_G}(u) = 2$) then { insert $u$ into $M$; }
  }
  while ($M \neq \emptyset$) {
    let $u \in M$;
    if there is an edge $e_{new} \in E_{new}$ incident to $u$ then move-new-edge ($e_{new}$);
    else {
      if ($\deg_{E_G}(u) = 2$) then create-new-edge ($u$);
      remove $u$ from $M$;
    }
  }
  halt "G is not cycle composed";
}
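The pseudocode above can be transcribed into Python roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the graph is assumed to be simple and given as a vertex list and a list of edge pairs, $E_{new}$ is assumed to start empty and is kept as a multiset keyed by hypothetical integer ids, and hash-based dictionaries and sets stand in for the constant-time adjacency test.

```python
def compute_cycles(vertices, edges):
    """Return the reported cycle vertex sets, or None for 'not cycle composed'."""
    # E_G: endpoint pair -> vertex list L of the path that the edge represents.
    e_g = {frozenset(e): frozenset() for e in edges}
    adj = {u: set() for u in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    e_new = {}                               # id -> (a, b, L); a multiset of new edges
    pending = {u: set() for u in vertices}   # vertex -> ids of incident E_new edges
    next_id = 0
    cycles = []
    m = {u for u in vertices if len(adj[u]) == 2}

    while m:
        u = next(iter(m))
        if pending[u]:                       # move-new-edge
            eid = next(iter(pending[u]))
            a, b, l_new = e_new.pop(eid)
            pending[a].discard(eid)
            pending[b].discard(eid)
            key = frozenset((a, b))
            if key in e_g:
                cycles.append(e_g[key] | l_new | key)
                if len(e_g) == 1 and not e_new:
                    return cycles            # all cycles reported
                if e_g[key]:
                    return None              # G is not cycle composed
            else:
                e_g[key] = l_new
                adj[a].add(b)
                adj[b].add(a)
                for x in (a, b):
                    if len(adj[x]) == 3:
                        m.discard(x)
        else:
            if len(adj[u]) == 2:             # create-new-edge
                v, w = adj[u]
                path = e_g.pop(frozenset((u, v))) | e_g.pop(frozenset((u, w))) | {u}
                adj[u].clear()
                adj[v].discard(u)
                adj[w].discard(u)
                e_new[next_id] = (v, w, path)
                pending[v].add(next_id)
                pending[w].add(next_id)
                next_id += 1
                for x in (v, w):
                    if len(adj[x]) == 2:
                        m.add(x)
            m.discard(u)
    return None                              # no degree-2 vertices left
```

For two triangles sharing an edge, the sketch reports both triangles; for a path it returns `None`, matching the final halt of compute-cycles.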


The algorithm above stores in a set $M$ all vertices of degree 2. Note that the degree of a vertex is always determined by the edges of $E_G$. For every vertex $u$ adjacent to exactly two vertices $v, w$, a new edge is inserted into a set denoted by $E_{new}$, but not yet into the edge set $E_G$ of graph $G$. Whenever a vertex $u$ of $M$ is considered for processing, it is first checked whether there are edges incident to $u$ in set $E_{new}$. If $E_{new}$ has an edge $e$ incident to $u$, then $e$ will either be inserted into $E_G$ (if the two vertices of $e$ are not adjacent by some edge of $E_G$), or a cycle is reported (if the two vertices of $e$ are adjacent by some edge of $E_G$). The test whether the two vertices of $e$ are adjacent by some edge of $E_G$ can be done in time $O(1)$ because $u$ is one of the end vertices of $e$ and has vertex degree 2. This proves the following theorem.

Theorem 5. The vertex rating $\phi_G^{v_c}(u)$ for all vertices $u$ of a cycle composed graph $G$ is computable in linear time.

The vertex rating $\phi_G^{v_c}$ is also computable in linear time for chordal graphs. An interesting characterization of chordal graphs is the existence of a perfect elimination order. Let $p = (u_1, \ldots, u_n)$ be an order of the $|V_G| = n$ vertices of $G = (V_G, E_G)$, and let $N(G,p,i)$ for $i = 1, \ldots, n$ be the set of neighbors $u_j$ of vertex $u_i$ with $i < j$,

$N(G,p,i) := \{u_j \mid \{u_i,u_j\} \in E_G \wedge i < j\}.$

The vertex order $p = (u_1, \ldots, u_n)$ is called a perfect elimination order (PEO) if the vertices of $N(G,p,i)$ for $i = 1, \ldots, n-1$ induce a complete subgraph of $G$. Dirac [3], Fulkerson and Gross [5], and Rose [14] have shown that a graph $G$ is chordal if and only if it has a perfect elimination order. Rose, Tarjan, and Lueker have shown in [15] that a perfect elimination order can be found in linear time if one exists. If a perfect elimination order $p = (v_1, \ldots, v_n)$ of the vertices of $G = (V_G, E_G)$ is given, then the vertex rating $\phi_G^{v_c}$ can be computed with Theorem 4 by the following algorithm. Note that, in a complete graph $G$ with $n$ vertices, $\phi_G^{v_c}(v) = \frac{1}{n}$ for every vertex $v$ of $G$, because $G$ is $v_c$-symmetric.

1. let $\phi_G^{v_c}(v_n) := 1$;
2. for $i = n-1, \ldots, 1$ do {
3.   let $\phi_G^{v_c}(v_i) := \frac{1}{|N(G,p,i)| + 1}$;
4.   for all $v \in N(G,p,i)$ do {
5.     let $\phi_G^{v_c}(v) := \phi_G^{v_c}(v) + \frac{1}{|N(G,p,i)| + 1} - \frac{1}{|N(G,p,i)|}$; } }

The running time of this algorithm is linear in the size of $G$, because the assignment of line 3 is done exactly $|V_G| - 1$ times and the assignment of line 5 is done exactly $|E_G|$ times. Since the perfect elimination order can be found in linear time, we get the following theorem.

Theorem 6. The vertex rating $\phi_G^{v_c}(v)$ for all vertices $v$ of a chordal graph $G$ is computable in linear time.
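The PEO-based rating algorithm for chordal graphs can be sketched in Python as shown below. The sketch assumes a connected chordal graph (so that every $N(G,p,i)$ with $i < n$ is non-empty), a perfect elimination order that is already given, and the update $\frac{1}{|N(G,p,i)|+1} - \frac{1}{|N(G,p,i)|}$ in line 5; the function name and the adjacency-dictionary input are assumptions of this illustration.

```python
from fractions import Fraction


def chordal_rating(order, adj):
    """order: a perfect elimination order (list); adj: dict vertex -> set of neighbours."""
    pos = {v: i for i, v in enumerate(order)}
    rating = {order[-1]: Fraction(1)}                      # line 1
    for v in reversed(order[:-1]):                         # line 2: i = n-1, ..., 1
        later = [w for w in adj[v] if pos[w] > pos[v]]     # N(G, p, i)
        size = len(later)                                  # assumed >= 1 (G connected)
        rating[v] = Fraction(1, size + 1)                  # line 3
        for w in later:                                    # lines 4-5
            rating[w] += Fraction(1, size + 1) - Fraction(1, size)
    return rating


# Triangle: a complete graph on 3 vertices, every vertex should get 1/3.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
triangle_rating = chordal_rating([0, 1, 2], triangle)

# Two triangles sharing the edge {0, 1}; (2, 3, 0, 1) is a PEO.
diamond = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1}}
diamond_rating = chordal_rating([2, 3, 0, 1], diamond)
```

On the second example the values agree with those produced by the cycle-based procedure for the same graph, which is a useful sanity check since that graph is both chordal and cycle composed.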


References

1. H.L. Bodlaender. A partial k-arboretum of graphs with bounded treewidth. Theoretical Computer Science, 209:1-45, 1998.
2. X. Deng and C.H. Papadimitriou. On the complexity of cooperative solution concepts. Mathematics of Operations Research, 19(2):257-266, 1994.
3. G. Dirac. On rigid circuit graphs. Abh. Math. Sem. Univ. Hamburg, 25:71-76, 1961.
4. D.J. Felleman and D.C. Van Essen. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1:1-47, 1991.
5. D.R. Fulkerson and O.A. Gross. Incidence matrices and interval graphs. Pacific J. Math., 15:835-855, 1965.
6. M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, San Francisco, 1979.
7. D. Gomez, E. Gonzalez-Arangüena, C. Manuel, G. Owen, M. del Pozo, and J. Tejada. Splitting graphs when calculating Myerson value for pure overhead games. Mathematical Methods of Operations Research, 59:479-489, 2004.
8. A. Hajnal and J. Suranyi. Über die Auflösung von Graphen in vollständige Teilgraphen. Ann. Univ. Sci. Budapest, Eötvös Sect. Math., 1:113-121, 1958.
9. R. Kötter and E. Wanke. Mapping brains without coordinates. Philosophical Transactions of the Royal Society London, Biological Sciences, 360(1456):751-766, 2005.
10. R.B. Myerson. Graphs and cooperation in games. Mathematics of Operations Research, 2:225-229, 1977.
11. S.D. Nikolopoulos and L. Palios. Hole and antihole detection in graphs. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pages 850-859. ACM-SIAM, 2004.
12. G. Owen. Values of graph-restricted games. SIAM Journal on Algebraic and Discrete Methods, 7(2):210-220, 1986.
13. N. Robertson and P.D. Seymour. Graph minors II. Algorithmic aspects of tree width. Journal of Algorithms, 7:309-322, 1986.
14. D.J. Rose. Triangulated graphs and elimination process. J. Math. Analys. Appl., 32:597-609, 1970.
15. D.J. Rose, R.E. Tarjan, and G.S. Lueker. Algorithmic aspects of vertex elimination on graphs. SIAM Journal on Computing, 5:266-283, 1976.
16. L.S. Shapley. A value for n-person games. In H.W. Kuhn and A.W. Tucker, editors, Contributions to the Theory of Games II, pages 307-317, Princeton, 1953. Princeton University Press.
17. R.E. Tarjan and M. Yannakakis. Simple linear-time algorithms to test chordality of graphs, acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM Journal on Computing, 13:566-579, 1984.
18. R. van den Brink and P. Borm. Digraph competitions and cooperative games. Theory and Decision, 53:327-342, 2002.
19. M. Wiegers. Recognizing outerplanar graphs in linear time. In Proceedings of Graph-Theoretical Concepts in Computer Science, volume 246 of LNCS, pages 165-176. Springer-Verlag, 1987.
20. K. Zilles. Architecture of the Human Cerebral Cortex. Regional and Laminar Organization. In G. Paxinos and J.K. Mai, editors, The Human Nervous System, pages 997-1055, San Diego, CA, 2004. Elsevier. 2nd edition.

On PTAS for Planar Graph Problems

Xiuzhen Huang^1 and Jianer Chen^2

^1 Department of Computer Science, Arkansas State University, State University, Arkansas 72467. Email: [email protected]
^2 Department of Computer Science, Texas A&M University, College Station, TX 77843. Email: [email protected]**

Abstract. Approximation algorithms for a class of planar graph problems, including PLANAR INDEPENDENT SET, PLANAR VERTEX COVER and PLANAR DOMINATING SET, were intensively studied. The current upper bound on the running time of the polynomial time approximation schemes (PTAS) for these planar graph problems is $2^{O(1/\epsilon)} n^{O(1)}$. Here we study the lower bound on the running time of the PTAS for these planar graph problems. We prove that there is no PTAS of time $2^{o(\sqrt{1/\epsilon})} n^{O(1)}$ for PLANAR INDEPENDENT SET, PLANAR VERTEX COVER and PLANAR DOMINATING SET unless an unlikely collapse occurs in parameterized complexity theory. For the gap between our lower bound and the current known upper bound, we specifically show that to further improve the upper bound on the running time of the PTAS for PLANAR VERTEX COVER, we can concentrate on PLANAR VERTEX COVER on planar graphs of degree bounded by three.

1 Introduction

There is intensive research work on a class of planar graph NP-hard optimization problems, such as PLANAR INDEPENDENT SET, PLANAR VERTEX COVER and PLANAR DOMINATING SET. Approximation algorithms for these planar graph problems and related problems were studied by researchers such as Bar-Yehuda and Even [5], Lipton and Tarjan [25], Baker [4], Eppstein [16], Grohe [20], Khanna and Motwani [24], and Cai et al. [7]. The current upper bound on the running time of the polynomial time approximation scheme (PTAS) for these planar graph problems is $2^{O(1/\epsilon)} n^{O(1)}$ [4, 25]. In this paper, we study the lower bound on the running time of the PTAS algorithms for these planar graph problems. Our work follows some recent research progress in parameterized complexity theory [10, 11], where strong computational lower bound results on the running time of the algorithms for $W[t]$-hard problems are derived, $t \ge 1$.

** This research is supported in part by US NSF under Grants CCR-0311590 and CCF-0430683.

Please use the following format when citing this chapter: Huang, X., Chen, J., 2006, in International Federation for Information Processing, Volume 209, Fourth IFIP International Conference on Theoretical Computer Science-TCS 2006, eds. Navarro, G., Bertossi, L., Kohayakawa, Y., (Boston: Springer), pp. 299-313.


Our research work here is focused on the computational lower bounds on the running time of the algorithms for the parameterized problems that are fixed-parameter tractable (in FPT). We first give a brief review of parameterized complexity theory and the recent research results in [10, 11].

A parameterized problem $Q$ is a decision problem consisting of instances of the form $(x,k)$, where the integer $k \ge 0$ is called the parameter. The parameterized problem $Q$ is fixed-parameter tractable [15] if it can be solved in time $f(k)|x|^{O(1)}$, where $f$ is a recursive function^. Certain NP-hard parameterized problems, such as VERTEX COVER, are fixed-parameter tractable, and hence can be solved practically for small parameter values [12]. On the other hand, the inherent computational difficulty of solving many other NP-hard parameterized problems with even small parameter values has suggested that certain parameterized problems are not fixed-parameter tractable, which has motivated the theory of fixed-parameter intractability [15]. The W-hierarchy $\bigcup_{t \ge 0} W[t]$ has been introduced to characterize the inherent level of intractability for parameterized problems. A large number of parameterized problems have been proved to be hard or complete for various levels in the W-hierarchy [15]. Examples of $W[1]$-hard problems include many well-known NP-hard problems such as CLIQUE, DOMINATING SET, SET COVER, and WEIGHTED CNF SATISFIABILITY. The theory of parameterized intractability has found important applications in a variety of areas such as database systems and model checking [20, 27].

The $W[1]$-hardness of a parameterized problem provides strong evidence that the problem is not fixed-parameter tractable, or equivalently, cannot be solved in time $f(k) n^{O(1)}$ for any function $f$. Recent investigation has derived much stronger computational lower bounds on the running time of the algorithms for well-known NP-hard parameterized problems [10, 11]. For example, it has been shown that unless an unlikely collapse occurs in parameterized complexity theory, any algorithm solving the $W[1]$-hard CLIQUE problem takes time at least $n^{\Omega(k)}$. Note that this lower bound is asymptotically tight in the sense that the trivial algorithm that enumerates all subsets of $k$ vertices in a given graph to test the existence of a clique of size $k$ runs in time $O(n^k)$. Similar lower bound results could be shown for other $W[t]$-hard problems, $t \ge 1$.

A method for deriving lower bounds on the running time of approximation algorithms for NP-hard combinatorial optimization problems has also been designed. It was proved in [11] that unless an unlikely collapse occurs in parameterized complexity theory, the $W[1]$-hardness of the parameterized problem under the linear fpt-reduction implies the nonexistence of polynomial time approximation schemes of running time $f(1/\epsilon) n^{o(1/\epsilon)}$ for the original optimization problem, where $f$ is any recursive function.

^ In this paper, we always assume that complexity functions are "nice" with both domain and range being non-negative integers and the values of the functions and their inverses can be easily computed. For two functions $f$ and $g$, we write $f(n) = o(g(n))$ if there is a nondecreasing and unbounded function $\lambda$ such that $f(n) \le g(n)/\lambda(n)$. A function $f$ is subexponential if $f(n) = 2^{o(n)}$.
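The trivial CLIQUE algorithm mentioned above, which enumerates all $k$-subsets of the vertices, can be illustrated by a short Python sketch. The function name and the edge-list input are assumptions of this illustration; the point is only that the outer loop ranges over $\binom{n}{k} = O(n^k)$ subsets.

```python
from itertools import combinations


def has_clique(vertices, edges, k):
    """Brute-force test for a clique of size k by enumerating all k-subsets."""
    edge_set = {frozenset(e) for e in edges}
    for subset in combinations(vertices, k):
        # A subset is a clique if every pair inside it is joined by an edge.
        if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
            return True
    return False


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
sample_vertices = [0, 1, 2, 3]
sample_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
```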


2 Terminologies in Approximation

For a reference on the theory of approximation, readers are referred to the book [3]. In this section, we provide some basic terminologies for studying approximability and its relationship with parameterized complexity.

An NP optimization problem $Q$ is a four-tuple $(I_Q, S_Q, f_Q, opt_Q)$, where

1. $I_Q$ is the set of input instances. It is recognizable in polynomial time;
2. For each instance $x \in I_Q$, $S_Q(x)$ is the set of feasible solutions for $x$, which is defined by a polynomial $p$ and a polynomial time computable predicate $\pi$ ($p$ and $\pi$ only depend on $Q$) as $S_Q(x) = \{y : |y| \le p(|x|) \text{ and } \pi(x,y)\}$;
3. $f_Q(x,y)$ is the objective function mapping a pair $x \in I_Q$ and $y \in S_Q(x)$ to a non-negative integer. The function $f_Q$ is computable in polynomial time;
4. $opt_Q \in \{\max, \min\}$. $Q$ is called a maximization problem if $opt_Q = \max$, and a minimization problem if $opt_Q = \min$.

An optimal solution $y_0$ for an instance $x \in I_Q$ is a feasible solution in $S_Q(x)$ such that $f_Q(x,y_0) = opt_Q\{f_Q(x,z) \mid z \in S_Q(x)\}$. We will denote by $opt_Q(x)$ the value $opt_Q\{f_Q(x,z) \mid z \in S_Q(x)\}$.

An algorithm $A$ is an approximation algorithm for an NP optimization problem $Q = (I_Q, S_Q, f_Q, opt_Q)$ if, for each input instance $x$ in $I_Q$, $A$ returns a feasible solution $y_A(x)$ in $S_Q(x)$. The solution $y_A(x)$ has an approximation ratio $r(n)$ if it satisfies the following condition:

$opt_Q(x)/f_Q(x,y_A(x)) \le r(|x|)$ if $Q$ is a maximization problem,
$f_Q(x,y_A(x))/opt_Q(x) \le r(|x|)$ if $Q$ is a minimization problem.

The approximation algorithm $A$ has an approximation ratio $r(n)$ if for any instance $x$ in $I_Q$, the solution $y_A(x)$ constructed by the algorithm $A$ has an approximation ratio bounded by $r(|x|)$.

Definition 1. An NP optimization problem $Q$ has a polynomial-time approximation scheme (PTAS) if there is an algorithm $A_Q$ that takes a pair $(x,\epsilon)$ as input, where $x$ is an instance of $Q$ and $\epsilon > 0$ is a real number, and returns a feasible solution $y$ for $x$ such that the approximation ratio of the solution $y$ is bounded by $1 + \epsilon$, and for each fixed $\epsilon > 0$, the running time of the algorithm $A_Q$ is bounded by a polynomial of $|x|$.*

An NP optimization problem $Q$ has a fully polynomial-time approximation scheme (FPTAS) if it has a PTAS $A_Q$ such that the running time of $A_Q$ is bounded by a polynomial of $|x|$ and $1/\epsilon$.

* There is an alternative definition for PTAS in which each $\epsilon > 0$ may correspond to a different approximation algorithm $A_\epsilon$ for $Q$ [19]. The definition we adopt here may be called the uniform PTAS, by which a single approximation algorithm takes care of all values of $\epsilon$. Note that most PTAS developed in the literature are uniform PTAS.


Observe that the time complexity of a PTAS algorithm may be of the form $O(2^{1/\epsilon}|x|^c)$ for a fixed constant $c$ or of the form $O(|x|^{1/\epsilon})$. Obviously, the latter type of computations with small $\epsilon$ values will turn out to be practically infeasible. This leads to the following definition [9].

Definition 2. An NP optimization problem $Q$ has an efficient polynomial-time approximation scheme (EPTAS) if it admits a polynomial-time approximation scheme whose time complexity is bounded by $O(f(1/\epsilon)|x|^c)$, where $f$ is a recursive function and $c$ is a constant.

An NP optimization problem $Q$ can be parameterized in a natural way as follows.

Definition 3. Let $Q = (I_Q, S_Q, f_Q, opt_Q)$ be an NP optimization problem. The parameterized version of $Q$ is defined as follows: (1) If $Q$ is a maximization problem, then the parameterized version of $Q$ is defined as $Q_\ge = \{(x,k) \mid x \in I_Q \wedge opt_Q(x) \ge k\}$; (2) If $Q$ is a minimization problem, then the parameterized version of $Q$ is defined as $Q_\le = \{(x,k) \mid x \in I_Q \wedge opt_Q(x) \le k\}$.

The above definition offers the possibility to study the relationship between the approximability and the parameterized complexity of NP optimization problems. However, there is an essential difference between the two categories: an approximation algorithm for an NP optimization problem constructs a solution for a given instance of the problem, while a parameterized algorithm only provides a "yes/no" decision on an input. To make the comparison meaningful, we need to extend the definition of parameterized algorithms in a natural way so that when a parameterized algorithm returns a "yes" decision, it also provides an "evidence" to support the conclusion (see [6] for a similar treatment).

Definition 4. Let $Q = (I_Q, S_Q, f_Q, opt_Q)$ be an NP optimization problem. We say that a parameterized algorithm $A_Q$ solves the parameterized version of $Q$ if (1) in case $Q$ is a maximization problem, then on an input pair $(x,k)$ in $Q_\ge$, the algorithm $A_Q$ returns "yes" with a solution $y$ in $S_Q(x)$ such that $f_Q(x,y) \ge k$, and on any input not in $Q_\ge$, the algorithm $A_Q$ simply returns "no"; (2) in case $Q$ is a minimization problem, then on an input pair $(x,k)$ in $Q_\le$, the algorithm $A_Q$ returns "yes" with a solution $y$ in $S_Q(x)$ such that $f_Q(x,y) \le k$, and on any input not in $Q_\le$, the algorithm $A_Q$ simply returns "no".

3 Lower Bound on Running Time of PTAS for Planar Graph Problems

Suppose $\epsilon > 0$ is the given error bound, and $n$ is the number of vertices of a planar graph. Lipton and Tarjan [25] designed an EPTAS approximation


algorithm of time $O(2^{O(1/\epsilon)} n^{O(1)})$ for PLANAR INDEPENDENT SET, as an application of a separator theorem on planar graphs. Based on the outer-planarity of planar graphs, Baker [4] designed EPTAS algorithms of time $O(2^{O(1/\epsilon)} n)$ for several famous NP-hard optimization problems on planar graphs, such as PLANAR VERTEX COVER, PLANAR INDEPENDENT SET, and PLANAR DOMINATING SET.

In [6], Cai and Chen proved that if an optimization problem has a fully polynomial-time approximation scheme (FPTAS), then the corresponding parameterized problem is fixed-parameter tractable (in FPT). Later this result was extended in [9] by Cesati and Trevisan: all optimization problems that have efficient polynomial time approximation schemes (EPTAS) have their parameterized problems in FPT. Therefore, the parameterized versions of the aforementioned optimization problems, PLANAR VERTEX COVER, PLANAR INDEPENDENT SET, and PLANAR DOMINATING SET, are in FPT.

Alber et al. [2] designed parameterized algorithms of time $2^{O(\sqrt{k})} n^{O(1)}$ for the parameterized versions of the above NP-hard optimization problems. A lot of research has been done on these problems to try to further improve the time complexity of the parameterized algorithms. Interested readers are referred to [1, 23, 17, 18]. Cai et al. [8] proved the following lower bound result for the parameterized algorithms of these problems:

Lemma 1. (Lemma 5.1 in [8]) PLANAR VERTEX COVER, PLANAR INDEPENDENT SET, and PLANAR DOMINATING SET do not have parameterized algorithms of time $2^{o(\sqrt{k})} n^{O(1)}$, unless VERTEX COVER-3 has $2^{o(k)} n^{O(1)}$-time parameterized algorithms.

The class SNP introduced by Papadimitriou and Yannakakis [26] contains many well-known NP-hard problems including, for any fixed $q \ge 3$, CNF q-SAT, q-COLORABILITY, q-SET COVER, and VERTEX COVER, CLIQUE, and INDEPENDENT SET [22]. It is commonly believed that it is unlikely that all problems in SNP are solvable in subexponential time. Impagliazzo, Paturi and Zane [22] studied the class SNP and identified a group of SNP-complete problems under the serf-reduction, such that if any of these SNP-complete problems is solvable in subexponential time, then all problems in SNP are solvable in subexponential time. This group of SNP-complete problems under the serf-reduction includes the problems CNF q-SAT, q-COLORABILITY, q-SET COVER, and VERTEX COVER, CLIQUE, and INDEPENDENT SET.

We have:

Lemma 2. (Theorem 3.3 in [13]) The VERTEX COVER-3 problem can be solved in $2^{o(k)} n^{O(1)}$ time if and only if the VERTEX COVER problem can be solved in $2^{o(k)} n^{O(1)}$ time.

Therefore Lemma 1 can be restated as:


Lemma 3. PLANAR VERTEX COVER, PLANAR INDEPENDENT SET, and PLANAR DOMINATING SET do not have parameterized algorithms of time $2^{o(\sqrt{k})} n^{O(1)}$, unless all SNP problems are solvable in subexponential time.

We prove the following lower bound results on the running time of the EPTAS algorithms for those planar graph problems:

Theorem 1. PLANAR VERTEX COVER, PLANAR INDEPENDENT SET, and PLANAR DOMINATING SET have no EPTAS of running time $2^{o(\sqrt{1/\epsilon})} n^{O(1)}$, where $\epsilon > 0$ is the given error bound, unless all SNP problems are solvable in subexponential time.

Proof. We provide the proof for PLANAR VERTEX COVER. Let $Q$ be the minimization problem of PLANAR VERTEX COVER. From the EPTAS algorithm $A_Q$ for the PLANAR VERTEX COVER problem $Q$, we provide the parameterized algorithm $A_\le$ shown in Fig. 1 for the parameterized version $Q_\le$ of the PLANAR VERTEX COVER problem $Q$.

Algorithm $A_\le$:
Input: An instance $(G,k)$ of $Q_\le$, where $G$ is a planar graph.
Output: If the minimum vertex cover $C_0$ has the size $|C_0| \le k$, then output "yes"; otherwise output "no".
1. On the instance $(G,k)$ of $Q_\le$, call the EPTAS algorithm $A_Q$ on $G$ and $\epsilon = 1/(2k+1)$. Suppose that the algorithm $A_Q$ returns a vertex cover $C$.
2. If $|C| \le k$, then return "yes"; otherwise return "no".

Fig. 1. Algorithm $A_\le$.

We verify that the algorithm $A_\le$ solves the parameterized problem $Q_\le$. Since the PLANAR VERTEX COVER problem $Q$ is a minimization problem, if $|C| \le k$ then obviously $|C_0| \le k$. Thus, the algorithm $A_\le$ returns a correct decision in this case. On the other hand, suppose $|C| > k$. Since $|C|$ is an integer, we have $|C| \ge k+1$. Since $A_Q$ is an EPTAS for the PLANAR VERTEX COVER problem $Q$ and $\epsilon = 1/(2k+1)$, we must have $|C|/|C_0| \le 1 + 1/(2k+1)$, from which we get

$|C_0| \ge |C|/(1 + 1/(2k+1)) \ge (k+1)/(1 + 1/(2k+1)) = k + 1/2 > k.$

Thus, in this case the algorithm $A_\le$ also returns a correct decision. This proves that the algorithm $A_\le$ solves the parameterized version $Q_\le$ of the PLANAR


VERTEX COVER problem $Q$. The running time of the algorithm $A_\le$ is dominated by that of the algorithm $A_Q$, which is bounded by $2^{o(\sqrt{1/\epsilon})} n^{O(1)} = 2^{o(\sqrt{k})} n^{O(1)}$. Thus, the parameterized version $Q_\le$ of the PLANAR VERTEX COVER problem is solvable in time $2^{o(\sqrt{k})} n^{O(1)}$. Therefore, the result in the theorem follows from Lemma 3. The proofs for PLANAR INDEPENDENT SET and PLANAR DOMINATING SET are similar and hence are omitted.

Corollary 1. PLANAR VERTEX COVER, PLANAR INDEPENDENT SET, and PLANAR DOMINATING SET have no PTAS of running time $2^{o(\sqrt{1/\epsilon})} n^{O(1)}$, where $\epsilon > 0$ is the given error bound, unless all SNP problems are solvable in subexponential time.

Comparing with the upper bound on the running time of the EPTAS algorithms for these planar graph problems in Baker [4], which is $2^{O(1/\epsilon)} n^{O(1)}$ (also in Lipton and Tarjan [25]), we can see that there is a gap between the upper bound result and our lower bound result in Theorem 1. Coming up with new approaches to improve the upper bound on the running time of the EPTAS algorithms in [4] will be interesting research. To study this issue, we concentrate on the PLANAR VERTEX COVER problem in the next section.
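The reduction in the proof of Theorem 1, which calls an approximation scheme with $\epsilon = 1/(2k+1)$ and compares the returned cover with $k$, can be sketched in Python as follows. No real EPTAS is implemented here: `brute_force_cover` is a hypothetical stand-in that returns an optimal cover (ratio $1 \le 1 + \epsilon$), and all names are assumptions of this sketch. The helper `optimum_lower_bound` only replays the arithmetic $(k+1)/(1 + 1/(2k+1)) = k + 1/2$ with exact fractions.

```python
from fractions import Fraction
from itertools import combinations


def brute_force_cover(vertices, edges, eps):
    """Stand-in for the scheme A_Q: an optimal cover trivially has ratio <= 1 + eps."""
    for size in range(len(vertices) + 1):
        for cand in combinations(vertices, size):
            chosen = set(cand)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen


def decide_vertex_cover(vertices, edges, k, scheme=brute_force_cover):
    """Decision wrapper following the two steps of the algorithm in Fig. 1."""
    eps = Fraction(1, 2 * k + 1)          # step 1: eps = 1/(2k+1)
    cover = scheme(vertices, edges, eps)
    return len(cover) <= k                # step 2: compare |C| with k


def optimum_lower_bound(k):
    """Value of (k+1)/(1 + 1/(2k+1)), the bound on |C_0| when |C| >= k+1."""
    return Fraction(k + 1) / (1 + Fraction(1, 2 * k + 1))


# A path on four vertices has a minimum vertex cover of size 2.
path_vertices = [0, 1, 2, 3]
path_edges = [(0, 1), (1, 2), (2, 3)]
```

Any scheme with ratio at most $1 + \epsilon$ could be passed as `scheme`; the wrapper itself adds only constant overhead, which is why the running time of the decision procedure is dominated by the scheme.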

4 Upper Bound on Running Time of PTAS for Planar Vertex Cover

In this section, we study the PTAS algorithms for the VERTEX COVER problem on planar graphs of degree bounded by 3, abbreviated as P-VC-3. The VERTEX COVER problem on general planar graphs is abbreviated as P-VC. From the proof of Theorem 1, we get the following lemma:

Lemma 4. The P-VC-3 problem has no EPTAS of running time $2^{o(\sqrt{1/\epsilon})} n^{O(1)}$, where $\epsilon > 0$ is the given error bound, unless the P-VC-3 problem has a parameterized algorithm of time $2^{o(\sqrt{k})} n^{O(1)}$.

It is well known that a planar embedding of a planar graph can be constructed in linear time [21]. We define an operation, called the unfolding operation, based on a planar embedding of a planar graph.

Definition 5. Suppose that $G$ is a planar graph with a planar embedding $\pi(G)$, and that $v$ is a degree-$d$ vertex in $G$, where $d > 3$, with neighbors $v_1, v_2, \ldots, v_d$, such that when one traverses around the vertex $v$ on the embedding $\pi(G)$, the edges incident to $v$ are in the cyclic order $[v,v_1], [v,v_2], \ldots, [v,v_d]$. The unfolding operation on the vertex $v$ will do the following: remove the vertex $v$ from $\pi(G)$, and add a path on $2d - 5$ vertices:

$P_v = \{y_1, x_1, y_2, x_2, \ldots, y_{d-3}, x_{d-3}, y_{d-2}\}$


where each vertex $x_i$ is of degree 2 and adjacent to the vertices $y_i$ and $y_{i+1}$, and each vertex $y_i$ is of degree 3 such that $y_1$ is adjacent to $\{v_1, v_2, x_1\}$, $y_{d-2}$ is adjacent to $\{v_{d-1}, v_d, x_{d-3}\}$, and $y_i$ is adjacent to $\{v_{i+1}, x_{i-1}, x_i\}$, for $2 \le i \le d-3$.

Let $G_1 = (V_1, E_1)$ be a planar graph with $V_1 = V_{\le 3} \cup V_{>3}$, where $V_{\le 3}$ is the set of vertices whose degree is less than or equal to 3, and $V_{>3}$ is the set of vertices whose degree is greater than 3. We apply the unfolding operation on a vertex $v \in V_{>3}$. We get a new planar graph $G_2 = (V_2, E_2)$, where $G_2$ has one fewer vertex of degree larger than 3, compared with $G_1$. We first consider a vertex cover $C_2$ of the graph $G_2$.

- Suppose for some $i$, $1 \le i \le d-3$, the three vertices $x_i$, $y_i$, and $y_{i+1}$ are all in $C_2$. Then we simply remove $x_i$ from $C_2$. It is obvious that $C_2 - \{x_i\}$ is still a vertex cover of $G_2$, with one fewer vertex compared with $C_2$. Call this operation clean-one.
- Suppose for some $i$, $1 \le i \le d-3$, exactly two of the three vertices $x_i$, $y_i$, and $y_{i+1}$ are in $C_2$. If one of these two vertices is $x_i$, then we can replace the two vertices by $y_i$ and $y_{i+1}$, resulting in a new vertex cover of the same size. Call this operation clean-two.

Fig. 2. Unfolding operation on the vertex $v$ (with degree 6).

Note that at least one of the three vertices $x_i$, $y_i$, and $y_{i+1}$ must be in the vertex cover $C_2$ in order to cover the edges $[x_i,y_i]$ and $[x_i,y_{i+1}]$. Therefore, besides the above cases, the only remaining case is that for the three vertices $x_i$, $y_i$, and $y_{i+1}$, only one of them is in $C_2$. In this case, this vertex in $C_2$ must be $x_i$.
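The unfolding operation of Definition 5 can be sketched in Python as follows. The sketch assumes the embedding is given as a rotation system, that is, a dictionary mapping each vertex to the list of its neighbors in cyclic order; the tuple names `("y", v, i)` and `("x", v, i)` for the new path vertices and the brute-force `min_cover_size` helper are hypothetical and serve only to check small examples.

```python
from itertools import combinations


def unfold(rotation, v):
    """Return the adjacency (vertex -> set) after unfolding the degree-d vertex v."""
    nbrs = rotation[v]                      # v_1, ..., v_d in cyclic order
    d = len(nbrs)
    if d <= 3:
        raise ValueError("unfolding is defined for vertices of degree > 3")
    adj = {u: set(ns) for u, ns in rotation.items() if u != v}
    for u in nbrs:
        adj[u].discard(v)
    ys = [("y", v, i) for i in range(1, d - 1)]   # y_1 .. y_{d-2}
    xs = [("x", v, i) for i in range(1, d - 2)]   # x_1 .. x_{d-3}
    for node in ys + xs:
        adj[node] = set()

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for i, x in enumerate(xs):              # x_i is adjacent to y_i and y_{i+1}
        link(x, ys[i])
        link(x, ys[i + 1])
    link(ys[0], nbrs[0])                    # y_1 is adjacent to v_1 and v_2
    link(ys[0], nbrs[1])
    link(ys[-1], nbrs[-2])                  # y_{d-2} is adjacent to v_{d-1} and v_d
    link(ys[-1], nbrs[-1])
    for i in range(2, d - 2):               # y_i is adjacent to v_{i+1}, 2 <= i <= d-3
        link(ys[i - 1], nbrs[i])
    return adj


def min_cover_size(adj):
    """Brute-force minimum vertex cover size; only meant for tiny graphs."""
    edges = {frozenset((u, w)) for u in adj for w in adj[u]}
    for size in range(len(adj) + 1):
        for cand in combinations(list(adj), size):
            chosen = set(cand)
            if all(e & chosen for e in edges):
                return size


# A star whose centre 0 has degree 5.
star = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
star_adj = {u: set(ns) for u, ns in star.items()}
unfolded_star = unfold(star, 0)
```

On this star the minimum vertex cover grows from 1 to 3, that is, by $d - 3 = 2$, which is the amount used in the size arguments of this section.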


In the following discussion, cleaning a vertex cover $C_2$ means that we apply the processing of clean-one and clean-two on $C_2$. After the cleaning process, we say that the vertex cover $C_2$ is clean. By the above discussion, in a clean vertex cover $C_2$ of the graph $G_2$, we have

Claim. Either all $d-3$ vertices $x_i$, $1 \le i \le d-3$, are in $C_2$ and none of the $d-2$ vertices $y_j$, $1 \le j \le d-2$, is in $C_2$; or all $d-2$ vertices $y_j$, $1 \le j \le d-2$, are in $C_2$ and none of the $d-3$ vertices $x_i$, $1 \le i \le d-3$, is in $C_2$.

Let $C_1$ be any vertex cover of the graph $G_1$ such that $C_1$ has $k_1$ vertices. If $v \in C_1$ (so $v$ covers the $d$ edges $[v,v_1], \ldots, [v,v_d]$ in $G_1$), then by replacing $v$ in $C_1$ by the $d-2$ vertices $y_1, y_2, \ldots, y_{d-2}$ in $G_2$, we obviously get a clean vertex cover $C_2$ for the graph $G_2$. The vertex cover $C_2$ has $k_1 + (d-3)$ vertices. On the other hand, if $v$ is not in $C_1$ (so the edges $[v,v_1], \ldots, [v,v_d]$ must be covered by the vertices $v_1, \ldots, v_d$ in $C_1$), then by adding the $d-3$ vertices $x_1, x_2, \ldots, x_{d-3}$ to $C_1$, we get a clean vertex cover $C_2$ for the graph $G_2$ and $C_2$ contains $k_1 + (d-3)$ vertices. In conclusion, from a vertex cover of $k_1$ vertices for the graph $G_1$, we can always construct a (clean) vertex cover of $k_1 + (d-3)$ vertices for the graph $G_2$.

Conversely, suppose that we are given a clean vertex cover $C_2$ of the graph $G_2$, where $C_2$ has $k_2$ vertices. If $C_2$ contains the $d-2$ vertices $y_1, y_2, \ldots, y_{d-2}$, then replacing the $d-2$ vertices $y_1, y_2, \ldots, y_{d-2}$ in $C_2$ by a single vertex $v$ gives a vertex cover of $k_2 - (d-3)$ vertices for the graph $G_1$. On the other hand, if $C_2$ contains the $d-3$ vertices $x_1, x_2, \ldots, x_{d-3}$, then removing these $d-3$ vertices from $C_2$ gives a vertex cover of $k_2 - (d-3)$ vertices for the graph $G_1$. In conclusion, from a vertex cover of $k_2$ vertices for the graph $G_2$, we can always construct a vertex cover of $k_2 - (d-3)$ vertices for the graph $G_1$.

Now suppose that the set of vertices of degree larger than 3 in the graph $G_1$ is $V_{>3} = \{u_1, u_2, \ldots, u_r\}$.
Denote by $\deg(u)$ the degree of the vertex $u$. Inductively, suppose that the graph $G_{i+1}$ is obtained from the graph $G_i$ by unfolding the vertex $u_i$, for $1 \le i \le r$. Note that the graph $G_r$ has its degree bounded by 3, and we say that the graph $G_r$ is obtained from the graph $G_1$ by unfolding all vertices of degree larger than 3.

Let $C_1$ be a vertex cover for the graph $G_1$ with $|C_1| = k_1$. By the above discussion, we can construct from $C_1$ a vertex cover $C_2$ of $k_1 + (\deg(u_1) - 3)$ vertices for the graph $G_2$; then from $C_2$, we can construct a vertex cover $C_3$ of $k_1 + (\deg(u_1) - 3) + (\deg(u_2) - 3)$ vertices for the graph $G_3$, ..., and finally we construct a vertex cover $C_r$ of $k_1 + \sum_{i=1}^{r}(\deg(u_i) - 3)$ vertices for the graph $G_r$.

On the other hand, let $C_r$ be a vertex cover of $k_r$ vertices for the graph $G_r$. First we clean $C_r$ to get a clean vertex cover $C'_r$ for $G_r$. Since cleaning does not increase the size of the vertex cover, we have $|C'_r| \le |C_r| = k_r$. Now by the above discussion, we can get a vertex cover $C_{r-1}$ of $|C'_r| - (\deg(u_r) - 3)$ vertices for the graph $G_{r-1}$, then a vertex cover of at most $|C'_r| - (\deg(u_r) - 3) - (\deg(u_{r-1}) - 3)$ vertices for the graph $G_{r-2}$, ..., and finally, we will construct a vertex cover of at most $k_r - \sum_{i=1}^{r}(\deg(u_i) - 3)$ vertices for the graph $G_1$.

In particular, the above discussion enables us to derive a relation between the minimum vertex covers for the graphs $G_1$ and $G_r$. Let $k_1$ and $k_r$ be the sizes of minimum vertex covers of the graphs $G_1$ and $G_r$, respectively. By the above discussion, from a minimum vertex cover for the graph $G_1$, we can construct a vertex cover of $k_1 + \sum_{i=1}^{r}(\deg(u_i) - 3)$ vertices for the graph $G_r$. Therefore, $k_1 + \sum_{i=1}^{r}(\deg(u_i) - 3) \ge k_r$. On the other hand, from a minimum vertex cover of the graph $G_r$, we can construct a vertex cover of no more than $k_r - \sum_{i=1}^{r}(\deg(u_i) - 3)$ vertices for the graph $G_1$, thus $k_r - \sum_{i=1}^{r}(\deg(u_i) - 3) \ge k_1$. Combining these two relations, we get $k_1 + \sum_{i=1}^{r}(\deg(u_i) - 3) = k_r$.

Summarizing the above discussion, we get the following:

Claim. Let $G_1$ be a graph in which the set of vertices of degree larger than 3 is $V_{>3}$. Let $G_r$ be a graph obtained by unfolding all vertices of degree larger than 3 in $G_1$. Then from a vertex cover $C_1$ for the graph $G_1$, we can construct in polynomial time a vertex cover of $|C_1| + \sum_{u \in V_{>3}}(\deg(u) - 3)$ vertices for the graph $G_r$; and from a vertex cover $C_r$ for the graph $G_r$, we can construct in polynomial time a vertex cover of at most $|C_r| - \sum_{u \in V_{>3}}(\deg(u) - 3)$ vertices for the graph $G_1$. Moreover, the size of a minimum vertex cover of the graph $G_r$ is equal to the size of a minimum vertex cover of the graph $G_1$ plus $\sum_{u \in V_{>3}}(\deg(u) - 3)$.

Using the unfolding operations, we can prove

Lemma 5. The P-VC-3 problem has no parameterized algorithm of time $2^{o(\sqrt{k})} n^{O(1)}$, unless the P-VC problem has a parameterized algorithm of time $2^{o(\sqrt{k})} n^{O(1)}$.

Proof. Suppose the P-VC-3 problem has a parameterized algorithm $A$ of time $2^{o(\sqrt{k})} n^{O(1)}$. We have the following algorithm $A'$ shown in Fig. 3 for the P-VC problem. We prove the algorithm $A'$ is correct.
By Claim 4, $OPT_1$ is a vertex cover for the graph $G_1$ with $|OPT_2| - \sum_{u \in V_{>3}} (\deg(u) - 3)$ vertices, and $OPT_1$ is computable in time $n^{O(1)}$. Since $OPT_2$ is a minimum vertex cover for the graph $G_2$, by Claim 4 again, a minimum vertex cover for the graph $G_1$ contains $|OPT_2| - \sum_{u \in V_{>3}} (\deg(u) - 3)$ vertices. In conclusion, $OPT_1$ is a minimum vertex cover for the graph $G_1$.

We analyze the running time of $A'$ in the following. For the graph $G_1 = (V_1, E_1)$, $V_1 = V_{\le 3} \cup V_{>3}$, where $|V_1| = n$ and $|E_1| = m$, we can always assume $|OPT_1| \ge n/2$ by applying the NT-theorem [12]. That is, the parameter $k \ge n/2$. After applying the unfolding operation on each $v \in V_{>3}$, we get the new planar graph $G_2 = (V_2, E_2)$ with degree bounded by 3. The construction of $G_2$ can be done in polynomial time. For a planar graph with $n$ vertices and $m$ edges, we have [14]:

$$m \le 3n - 6. \tag{1}$$

Algorithm A'
Input: A planar graph $G_1 = (V_1, E_1)$, $V_1 = V_{\le 3} \cup V_{>3}$, and an integer $k > 0$.
Output: "Yes" if the size of the minimum vertex cover $OPT_1$ of $G_1$ satisfies $|OPT_1| \le k$; "No" otherwise.
1. Let $V_{>3}$ be the set of all vertices of degree larger than 3 in the graph $G_1$. Construct a planar graph $G_2$ by unfolding all vertices of degree larger than 3 in $G_1$.
2. Run the algorithm $A$ on the graph $G_2$ with the parameter $k_2 = 1, 2, \ldots, |V_2|$. We get a minimum vertex cover $OPT_2$ for the graph $G_2$.
3. Construct a vertex cover $OPT_1$ for the graph $G_1$ from $OPT_2$ such that $|OPT_1| = |OPT_2| - \sum_{u \in V_{>3}} (\deg(u) - 3)$.
4. If $|OPT_1| \le k$, return "Yes"; otherwise, return "No".

Fig. 3. Parameterized algorithm for PLANAR VERTEX COVER.
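Step 2 of Algorithm A' turns a decision procedure into an optimization procedure by trying the parameter values in increasing order. The following is a minimal Python sketch of that loop only. The brute-force `has_vertex_cover` is a hypothetical stand-in for the algorithm $A$ (it is exponential and not the authors' algorithm), the function names are illustrative, and the sketch returns only the size of a minimum cover rather than the cover itself. It starts from $k_2 = 0$ so that edgeless graphs are handled.

```python
from itertools import combinations


def has_vertex_cover(edges, vertices, k):
    """Hypothetical stand-in for algorithm A: is there a vertex cover of size <= k?

    Brute force over all vertex subsets of size at most k.
    """
    for size in range(min(k, len(vertices)) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            # A cover must contain at least one endpoint of every edge.
            if all(u in chosen or v in chosen for u, v in edges):
                return True
    return False


def minimum_cover_size(edges, vertices, decide=has_vertex_cover):
    """Smallest parameter accepted by the decision procedure (k = 0, 1, ..., |V|)."""
    for k in range(len(vertices) + 1):
        if decide(edges, vertices, k):
            return k
    raise AssertionError("the full vertex set is always a vertex cover")


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    print(minimum_cover_size(triangle, [0, 1, 2]))
```

The loop makes at most $|V| + 1$ calls to the decision procedure, which is why the running time of Algorithm A' is dominated by a polynomial number of calls to $A$.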

By Equation 1, for the graph $G_1$, the total degree of all its vertices satisfies:

$$\sum_{v \in V_1} \deg(v) = 2m \le 2(3n - 6) < 6n. \tag{2}$$

We have

$$|V_2| = |V_{\le 3}| + \sum_{v \in V_{>3}} \big((\deg(v) - 3) + (\deg(v) - 2)\big) \le |V_1| + 2 \sum_{v \in V_1} \deg(v) \le n + 12n = 13n = O(n).$$

Therefore, the calls to the algorithm $A$ on the graph $G_2$ take time $2^{o(\sqrt{k_2})} |V_2|^{O(1)} = 2^{o(\sqrt{n})} n^{O(1)} = 2^{o(\sqrt{k})} n^{O(1)}$, since $k_2 \le |V_2| \le 13n$ and $n \le 2k$. All the other steps of the algorithm $A'$ take polynomial time $n^{O(1)}$. Therefore the algorithm $A'$ has running time $2^{o(\sqrt{k})} n^{O(1)}$.

Therefore, from Lemma 4, Lemma 5 and Theorem 1, we have

Theorem 2. The P-VC-3 problem has no EPTAS of running time $2^{o(\sqrt{1/\epsilon})} n^{O(1)}$, where $\epsilon > 0$ is the given error bound, unless all SNP problems are solvable in subexponential time.
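The vertex-count bound used in the running-time analysis above can be checked numerically from a degree sequence alone. The sketch below is an illustration under stated assumptions: it takes the per-vertex count $(\deg(v) - 3) + (\deg(v) - 2)$ from the displayed formula for $|V_2|$, the function names are hypothetical, and the $13n$ bound is only meaningful when the degree sequence comes from a planar graph.

```python
def unfolding_offset(degrees):
    """Sum of (deg(u) - 3) over all vertices of degree larger than 3."""
    return sum(d - 3 for d in degrees if d > 3)


def unfolded_vertex_count(degrees):
    """|V_2| as counted in the text.

    Vertices of degree at most 3 are kept; each vertex of degree d > 3
    contributes (d - 3) + (d - 2) vertices after unfolding.
    """
    low = sum(1 for d in degrees if d <= 3)
    high = sum((d - 3) + (d - 2) for d in degrees if d > 3)
    return low + high


def vertex_count_bounds(degrees):
    """Return (|V_2|, n + 2 * total degree, 13 * n) for comparison."""
    n = len(degrees)
    return unfolded_vertex_count(degrees), n + 2 * sum(degrees), 13 * n


if __name__ == "__main__":
    # Wheel with a hub of degree 5 and five rim vertices of degree 3
    # (n = 6, m = 10, which satisfies m <= 3n - 6).
    wheel_degrees = [5, 3, 3, 3, 3, 3]
    print(vertex_count_bounds(wheel_degrees))  # (10, 46, 78)
    print(unfolding_offset(wheel_degrees))     # 2
```

For this wheel the chain $|V_2| \le n + 2 \sum \deg(v) \le 13n$ reads $10 \le 46 \le 78$, and a minimum vertex cover of the unfolded graph would be larger than that of the original graph by the offset 2.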

Theorem 2 implies the difficulty of improving the EPTAS algorithm for the P-VC-3 problem. Baker [4] provided an EPTAS algorithm of time $2^{O(1/\epsilon)} p(n)$ for the P-VC problem. By applying that algorithm, we get an EPTAS algorithm of time $2^{O(1/\epsilon)} p(n)$ for the P-VC-3 problem. Since the P-VC-3 problem seems simpler, one might suspect that we could have a better EPTAS algorithm for it than that for the P-VC problem. In the following we show that if we can improve the EPTAS algorithm for the P-VC-3 problem, then we can improve the EPTAS algorithm for the P-VC problem.

Theorem 3. If the P-VC-3 problem has an EPTAS of running time $f(1/\epsilon) n^{O(1)}$, then the P-VC problem has an EPTAS of running time $f(13/\epsilon) n^{O(1)}$, where $f$ is a recursive function and $\epsilon > 0$ is the given error bound.

Proof. Given an EPTAS algorithm $A$ of running time $f(1/\epsilon) n^{O(1)}$ for the P-VC-3 problem, we provide an EPTAS algorithm $B$ of running time $f(13/\epsilon) n^{O(1)}$ for the P-VC problem. The description of algorithm $B$ is given in Fig. 4.

Algorithm B
Input: A planar graph $G_1 = (V_1, E_1)$, and a constant $\epsilon > 0$.
Output: A vertex cover $C_1$ for $G_1$, such that $|C_1| \le (1 + \epsilon) \cdot |OPT_1|$.
1. Let $V_{>3}$ be the set of all vertices of degree larger than 3 in the graph $G_1$. Unfold all vertices of degree larger than 3 in $G_1$; let the resulting graph be $G_2 = (V_2, E_2)$, whose degree is bounded by 3.
2. Run the algorithm $A$ with $\epsilon' = \epsilon/13$ on the graph $G_2$. We get a vertex cover $C_2$ for the graph $G_2$.
3. From $C_2$ construct a vertex cover $C_1$ of at most $|C_2| - \sum_{u \in V_{>3}} (\deg(u) - 3)$ vertices for the graph $G_1$.
4. Return $C_1$.

Fig. 4. EPTAS algorithm for PLANAR VERTEX COVER.

We claim that the vertex set $C_1$ is the required vertex cover for the graph $G_1$. By Equation 1 and Claim 4, we have

$$|OPT_2| = |OPT_1| + \sum_{u \in V_{>3}} (\deg(u) - 3) \le |OPT_1| + \sum_{u \in V_1} \deg(u) \le |OPT_1| + 6n \le |OPT_1| + 12|OPT_1| = 13|OPT_1|.$$

Therefore,

$$|OPT_2| \le 13|OPT_1|. \tag{3}$$

By Claim 4, we have

$$|OPT_1| = |OPT_2| - \sum_{u \in V_{>3}} (\deg(u) - 3)$$

and

$$|C_1| \le |C_2| - \sum_{u \in V_{>3}} (\deg(u) - 3).$$

Therefore, we have $|C_2| - |C_1| \ge |OPT_2| - |OPT_1|$, or equivalently $|C_2| - |OPT_2| \ge |C_1| - |OPT_1|$. From this, we derive immediately

$$\frac{|C_1|}{|OPT_1|} - 1 = \frac{|C_1| - |OPT_1|}{|OPT_1|} \le \frac{|C_2| - |OPT_2|}{|OPT_1|} \le \frac{13(|C_2| - |OPT_2|)}{|OPT_2|} = 13\left(\frac{|C_2|}{|OPT_2|} - 1\right) \le 13 \cdot \frac{\epsilon}{13} = \epsilon.$$

Here we have used the assumption that $|C_2|/|OPT_2| \le 1 + \epsilon' = 1 + \epsilon/13$, and the fact that $|OPT_2| \le 13|OPT_1|$ (Equation 3). The call of the algorithm $A$ on the graph $G_2$ takes time $f(1/\epsilon') n^{O(1)}$. All the other steps of the algorithm $B$ take polynomial time $n^{O(1)}$. Therefore, the running time of the algorithm $B$ is $f(13/\epsilon) n^{O(1)}$, and the approximation ratio for the algorithm $B$ is $1 + \epsilon$.
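The ratio argument in the proof above can be traced on concrete numbers. The following Python sketch is only a worked illustration: the function name and the sample values are hypothetical and do not come from a particular graph. It uses exact rational arithmetic to compare the ratio achieved on $G_2$ with the worst-case ratio transferred back to $G_1$.

```python
from fractions import Fraction


def transferred_ratios(opt1, offset, c2):
    """Return (worst-case ratio on G_1, ratio on G_2).

    opt1   : size of a minimum vertex cover of G_1 (assumed known here)
    offset : sum of (deg(u) - 3) over vertices of degree larger than 3
    c2     : size of the cover C_2 returned for the unfolded graph G_2
    """
    opt2 = opt1 + offset      # |OPT_2| = |OPT_1| + offset (Claim 4)
    c1_upper = c2 - offset    # |C_1| <= |C_2| - offset (step 3 of Algorithm B)
    return Fraction(c1_upper, opt1), Fraction(c2, opt2)


if __name__ == "__main__":
    # Hypothetical sizes with |OPT_2| = 1000 <= 13 * |OPT_1| = 1300.
    ratio1, ratio2 = transferred_ratios(opt1=100, offset=900, c2=1020)
    print(ratio1, ratio2)                     # 6/5 51/50
    print(ratio1 - 1 <= 13 * (ratio2 - 1))    # True
```

With these sample values the cover for $G_2$ is within a factor $1.02$ of optimal, so $\epsilon' = 0.02$ corresponds to $\epsilon = 0.26$, and the cover for $G_1$ is within a factor $1.2 \le 1.26$ of optimal, as the inequality chain predicts.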

5 Summary

In this paper, we have proved lower bound results on the running time of the PTAS algorithms for a class of planar graph problems including PLANAR INDEPENDENT SET, PLANAR VERTEX COVER and PLANAR DOMINATING SET. We pointed out that there is a gap between our lower bound result and the currently known upper bound result on the running time of the PTAS algorithms for these planar graph problems. We then studied the PTAS algorithms for the PLANAR VERTEX COVER problem. Based on our study of the relationship between PLANAR VERTEX COVER and PLANAR VERTEX COVER on planar graphs of degree bounded by three, we showed that to further improve the upper bound on the running time of the PTAS algorithms for PLANAR VERTEX COVER, we could concentrate on PLANAR VERTEX COVER on planar graphs of degree bounded by three. Closing the gap and further improving the upper bound on the running time of the PTAS algorithms for these planar graph problems are nice open problems inviting further research.

References

1. Alber J, Bodlaender HL, Fernau H, Kloks T, Niedermeier R (2002) Fixed parameter algorithms for dominating set and related problems on planar graphs. Algorithmica 33:461-493
2. Alber J, Fernau H, Niedermeier R (2004) Parameterized complexity: exponential speed-up for planar graph problems. J. Algorithms 52:26-56
3. Ausiello G, Crescenzi P, Gambosi G, Kann V, Marchetti-Spaccamela A, Protasi M (1999) Complexity and Approximation, Combinatorial Optimization Problems and Their Approximability Properties. Springer-Verlag, New York
4. Baker BS (1994) Approximation algorithms for NP-complete problems on planar graphs. Journal of the ACM 41:153-180
5. Bar-Yehuda R, Even S (1982) On approximating a vertex cover for planar graphs. Proceedings of the Fourteenth Annual ACM Symposium on Theory of Computing, pp. 303-309
6. Cai L, Chen J (1997) On fixed-parameter tractability and approximability of NP optimization problems. Journal of Computer and System Sciences 54:465-474
7. Cai L, Fellows M, Juedes D, Rosamond F (2006) The complexity of polynomial-time approximation. Theory of Computing Systems, to appear
8. Cai L, Juedes DW (2003) On the existence of sub-exponential time parameterized algorithms. Journal of Computer and System Sciences 67:789-807
9. Cesati M, Trevisan L (1997) On the efficiency of polynomial time approximation schemes. Information Processing Letters 64:165-171
10. Chen J, Chor B, Fellows M, Huang X, Juedes DW, Kanj I, Xia G (2004) Tight lower bounds for parameterized NP-hard problems. Proc. of the 19th Annual IEEE Conference on Computational Complexity, pp. 150-160
11. Chen J, Huang X, Kanj I, Xia G (2004) Linear FPT reductions and computational lower bounds. Proc. of the 36th ACM Symposium on Theory of Computing, pp. 212-221
12. Chen J, Kanj I, Jia W (2001) Vertex cover: further observations and further improvements. Journal of Algorithms 41:280-301
13. Chen J, Kanj I, Xia G (2003) A note on parameterized exponential time complexity. Tech. Report, DePaul University
14. Diestel R (2000) Graph theory. Springer, New York
15. Downey RG, Fellows MR (1999) Parameterized complexity. Springer, New York
16. Eppstein D (2000) Diameter and treewidth in minor-closed graph families. Algorithmica 27:275-291
17. Fomin FV, Thilikos DM (2003) Dominating sets in planar graphs: branch-width and exponential speed-up. Proc. of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 168-177
18. Fomin FV, Thilikos DM (2004) A simple and fast approach for solving problems on planar graphs. Lecture Notes in Computer Science 2996:56-67
19. Garey M, Johnson D (1979) Computers and intractability: a guide to the theory of NP-completeness. W. H. Freeman, New York
20. Grohe M (2003) Local tree-width, excluded minors, and approximation algorithms. Combinatorica 23:613-632
21. Hopcroft JE, Tarjan RE (1974) Efficient planarity testing. Journal of the ACM 21:549-568
22. Impagliazzo R, Paturi R, Zane F (2001) Which problems have strongly exponential complexity? Journal of Computer and System Sciences 63:512-530
23. Kanj I, Perkovic L (2002) Improved parameterized algorithms for planar dominating set. Lecture Notes in Computer Science 2420:399-410
24. Khanna S, Motwani R (1996) Towards a syntactic characterization of PTAS. STOC 1996:329-337
25. Lipton RJ, Tarjan RE (1980) Applications of a planar separator theorem. SIAM J. Comput. 9:615-627
26. Papadimitriou CH, Yannakakis M (1991) Optimization, approximation, and complexity classes. Journal of Computer and System Sciences 43:425-440
27. Papadimitriou CH, Yannakakis M (1999) On the complexity of database queries. Journal of Computer and System Sciences 58:407-427

